## Purpose of Notebook
In this notebook, I test the questions in `./tutorialFAQs/babybonusTest.csv` against a Dialogflow agent's knowledge base (a beta feature). The knowledge base is built from the Baby Bonus FAQ in `./tutorialFAQs/baby_bonus_orignal.pkl`. `./tutorialFAQs/babybonusTest.csv` contains two parallel lists: reframed questions and their corresponding original questions. `./tutorialFAQs/baby_bonus_orignal.pkl` contains the original question-answer pairs with *no question augmentation performed*.

## How will it be tested
I will pass each question in `reframed_questions` to the agent and check whether the FAQ question returned by the agent matches the corresponding original question. A match is counted when the fuzzy string similarity between the two is at least 90 out of 100.
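
The check described above can be sketched as a small helper. This is a minimal illustration, not the notebook's actual scorer: it uses the standard library's `difflib.SequenceMatcher` as a stand-in for a fuzzy string ratio, so scores may differ slightly from other libraries. The names `similarity` and `is_match`, the whitespace stripping, and the example questions are assumptions made for this sketch. The threshold of 90 mirrors the one used in the test function later in the notebook.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score between two questions (case-insensitive)."""
    a_norm, b_norm = a.strip().lower(), b.strip().lower()
    return round(100 * SequenceMatcher(None, a_norm, b_norm).ratio())


def is_match(detected: str, original: str, threshold: int = 90) -> bool:
    """Treat the detected question as correct when it is close enough to the original."""
    return similarity(detected, original) >= threshold


# Hypothetical example questions, for illustration only
print(is_match("How do I apply for Baby Bonus?", "how do i apply for baby bonus?\n"))
print(is_match("How do I apply for Baby Bonus?", "What is the Child Development Account?"))
```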

## History of Test on Bani
In the earlier test on Bani, the accuracy was 0.9140893470790378 (about 91.4%). Three questions were omitted, bringing the total number of questions used to 291.

In [78]:
!pip install google-cloud-dialogflow

In [95]:
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/Users/james/Downloads/Chatbot_Key.json"
PROJECT_ID = "chatbot-evaluation-299303"
SESSION_ID = "123456789"
LANGUAGE_CODE = "en-US"
BABY_BONUS_KB_ID = "MTgxNjE2MDc3MjMyNzg0MDE1MzY"
COMCARE_KB_ID = "MTIzMzAwMDkxNTU3ODcwMzA1Mjg"

In [147]:
# import proto
#     print(proto.Message.to_json(response))
def detect_intent_knowledge(project_id, session_id, language_code,
                            knowledge_base_id, texts):
    """Returns the detect-intent response for one query against a Knowledge Connector.

    Args:
    project_id: The GCP project linked with the agent you are going to query.
    session_id: Id of the session, using the same `session_id` between requests
              allows continuation of the conversation.
    language_code: Language of the query.
    knowledge_base_id: The Knowledge base's id to query against.
    texts: A single text query (a string) to send; despite the plural name,
              it is passed directly as the `text` of one `TextInput`.
    """
    from google.cloud import dialogflow_v2beta1 as dialogflow
    session_client = dialogflow.SessionsClient()

    session_path = session_client.session_path(project_id, session_id)

    text_input = dialogflow.TextInput(
        text=texts, language_code=language_code)

    query_input = dialogflow.QueryInput(text=text_input)

    knowledge_base_path = dialogflow.KnowledgeBasesClient \
        .knowledge_base_path(project_id, knowledge_base_id)

    query_params = dialogflow.QueryParameters(
        knowledge_base_names=[knowledge_base_path])

    request = dialogflow.DetectIntentRequest(
        session=session_path,
        query_input=query_input,
        query_params=query_params
    )
    response = session_client.detect_intent(request=request)
    return response
#     print(proto.Message.to_json(response))
#     print('=' * 20)
#     print('Query text: {}'.format(response.query_result.query_text))
#     print('Detected intent: {} (confidence: {})\n'.format(
#         response.query_result.intent.display_name,
#         response.query_result.intent_detection_confidence))
#     print('Fulfillment text: {}\n'.format(
#         response.query_result.fulfillment_text))
#     print('Knowledge results:')
#     knowledge_answers = response.query_result.knowledge_answers
#     for answers in knowledge_answers.answers:
#         print(' - Answer: {}'.format(answers.answer))
#         print(' - Confidence: {}'.format(
#             answers.match_confidence))
# [END dialogflow_detect_intent_knowledge]
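
For orientation, the two helper calls in the function above (`session_path` and `knowledge_base_path`) only assemble resource-name strings. The sketch below rebuilds them by hand, assuming the formats `projects/<project>/agent/sessions/<session>` and `projects/<project>/knowledgeBases/<id>` as I understand them from the Dialogflow documentation. The function names `build_session_path` and `build_knowledge_base_path` are hypothetical, and the client library remains the authoritative source for these formats.

```python
def build_session_path(project_id: str, session_id: str) -> str:
    """Assumed format of the session resource name."""
    return f"projects/{project_id}/agent/sessions/{session_id}"


def build_knowledge_base_path(project_id: str, knowledge_base_id: str) -> str:
    """Assumed format of the knowledge base resource name."""
    return f"projects/{project_id}/knowledgeBases/{knowledge_base_id}"


# Values taken from the configuration cell of this notebook
print(build_session_path("chatbot-evaluation-299303", "123456789"))
print(build_knowledge_base_path("chatbot-evaluation-299303", "MTgxNjE2MDc3MjMyNzg0MDE1MzY"))
```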

In [220]:
response = detect_intent_knowledge(PROJECT_ID, SESSION_ID, LANGUAGE_CODE, BABY_BONUS_KB_ID, "More information about AI")
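
To show how a response like this is read later in the notebook, here is a small sketch that pulls out the top FAQ question. It runs on stand-in objects built with `SimpleNamespace` rather than a real Dialogflow response, so only the attribute path (`query_result.knowledge_answers.answers[0].faq_question`) is taken from the notebook; `top_faq_question` and `fake_response` are hypothetical helpers.

```python
from types import SimpleNamespace


def top_faq_question(response):
    """Return the FAQ question of the first knowledge answer, or None if there is none."""
    answers = response.query_result.knowledge_answers.answers
    if len(answers) == 0:
        return None
    return answers[0].faq_question


def fake_response(faq_questions):
    """Build a stand-in object with the same attribute path as the real response."""
    answers = [SimpleNamespace(faq_question=q) for q in faq_questions]
    knowledge_answers = SimpleNamespace(answers=answers)
    return SimpleNamespace(query_result=SimpleNamespace(knowledge_answers=knowledge_answers))


# Hypothetical data, for illustration only
print(top_faq_question(fake_response(["What is Baby Bonus?", "Who is eligible?"])))
print(top_faq_question(fake_response([])))
```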

In [92]:
import pandas as pd
df = pd.read_csv("./tutorialFAQs/babybonusTest.csv")
# print(df.columns)
original_questions = df["original"].to_list()
reframed_questions = df["reframed"].to_list()
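
Some questions in the CSV appear to carry a trailing newline (one is printed as `'More information about AI?\n'` further down), so it may be worth normalising whitespace before querying. The sketch below is one possible way to do that, written with the standard `csv` module so it runs without pandas. `load_question_pairs` and the sample rows are hypothetical; only the column names `original` and `reframed` come from this notebook.

```python
import csv
import io


def load_question_pairs(csv_file):
    """Read (reframed, original) pairs and strip surrounding whitespace from both."""
    pairs = []
    for row in csv.DictReader(csv_file):
        pairs.append((row["reframed"].strip(), row["original"].strip()))
    return pairs


# Hypothetical sample: the reframed question ends with a newline inside the quotes
sample = io.StringIO(
    'original,reframed\n'
    '"What is Baby Bonus?","Can you explain Baby Bonus?\n"\n'
)
pairs = load_question_pairs(sample)
print(pairs)
```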

In [135]:
!pip install fuzzywuzzy
!pip install python-Levenshtein

Collecting fuzzywuzzy
  Downloading fuzzywuzzy-0.18.0-py2.py3-none-any.whl (18 kB)
Installing collected packages: fuzzywuzzy
Successfully installed fuzzywuzzy-0.18.0
You should consider upgrading via the '/Library/Frameworks/Python.framework/Versions/3.7/bin/python3 -m pip install --upgrade pip' command.
Collecting python-Levenshtein
  Downloading python-Levenshtein-0.12.0.tar.gz (48 kB)
     |████████████████████████████████| 48 kB 312 kB/s eta 0:00:01
Building wheels for collected packages: python-Levenshtein
  Building wheel for python-Levenshtein (setup.py) ... done
  Created wheel for python-Levenshtein: filename=python_Levenshtein-0.12.0-cp37-cp37m-macosx_10_9_x86_64.whl size=80008 sha256=3106c2b4d29a507bec92577cffeea48753396899a8c8e0f5540c1ec25304deb2
  Stored in directory: /Users/james/Library/Caches/pip/wheels/f0/9b/13/49c281164c37be18343230d3cd0fca29efb23a493351db0009
Successfully built python-Levenshtein
Installing collected packages: python-Levenshtein
Su

In [240]:
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
import uuid
from tqdm import tqdm
from collections import defaultdict
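def test_detect_intent_knowledge(project_id, language_code,
                                 knowledge_base_id, reframed_questions, original_questions):
    """Queries the knowledge base with each reframed question and scores the match.

    Returns (dubious_questions, wrong_questions, invalid_questions); the first two
    are dicts keyed by fuzzy ratio, each holding [reframed, detected, original] lists.
    """
    # Use a fresh session for each test run
    session_id = f"session_{uuid.uuid4()}"

    correct = 0
    invalid = 0
    invalid_questions = []
    dubious_questions = defaultdict(list)
    wrong_questions = defaultdict(list)
    for reframed_q, original_q in tqdm(zip(reframed_questions, original_questions)):
        # Ask knowledge base the reframed question
        response = detect_intent_knowledge(project_id, session_id, language_code, knowledge_base_id, reframed_q)
        if not response:
            # Count an empty response as invalid instead of reading fields from it
            print(f"EMPTY RESPONSE for {reframed_q}")
            invalid += 1
            invalid_questions.append(reframed_q)
            continue
        # Verify whether faqQuestion detected matches the original question
        answers = response.query_result.knowledge_answers.answers
        if len(answers) > 0:
            # The first knowledge answer is taken as the agent's prediction
            detected_q = answers[0].faq_question
            ratio = fuzz.ratio(detected_q.lower(), original_q.lower())
            if ratio >= 90:
                correct += 1
                # Near-matches (90-99) still count as correct but are kept for review
                if ratio != 100:
                    dubious_questions[ratio].append([reframed_q, detected_q, original_q])
            else:
                wrong_questions[ratio].append([reframed_q, detected_q, original_q])
        else:
            # No knowledge answer was returned for this question
            invalid += 1
            invalid_questions.append(reframed_q)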
    print(f"Result: {correct/len(reframed_questions)} with {correct} out of {len(reframed_questions)} questions correct")
    # wrong_questions is keyed by ratio, so len(wrong_questions.keys()) counts distinct
    # ratio values, not questions; wrong matches that share a ratio are not counted here
    # and show up in outstand_count below.
    print(f"\n{len(wrong_questions.keys())} questions were WRONGLY MATCHED between detected question and original question")
    print(f"{invalid} questions were INVALID")
    outstand_count = len(reframed_questions) - len(wrong_questions.keys()) - invalid - correct
    if outstand_count != 0:
        print(f"You have {outstand_count} questions unaccounted for")
        # "New Result" removes the unaccounted questions from the denominator
        print(f"New Result: {correct/(len(reframed_questions) - outstand_count)} with {correct} out of {len(reframed_questions) - outstand_count} questions")
    return dubious_questions, wrong_questions, invalid_questions
        

In [241]:
# Print percentage of questions answered correctly
dubious_questions, wrong_questions, invalid_questions = test_detect_intent_knowledge(PROJECT_ID, LANGUAGE_CODE, BABY_BONUS_KB_ID,reframed_questions, original_questions)

294it [07:03,  1.44s/it]

Result: 0.9421768707482994 with 277 out of 294 questions correct

13 questions were WRONGLY MATCHED between detected question and original question
1 questions were INVALID
You have 3 questions unaccounted for
New Result: 0.9518900343642611 with 277 out of 291 questions
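
The "unaccounted for" figure above can be reconciled with a little arithmetic. Because `wrong_questions` is keyed by ratio, `len(wrong_questions.keys())` counts distinct ratio values. Assuming every question lands in exactly one of the three branches of the loop (correct, wrong, invalid), the 3 unaccounted questions would be wrong matches that share a ratio with another wrong match. The sketch below works through the numbers printed above; the variable names and the small demo dictionary are illustrative only.

```python
from collections import defaultdict

# Figures printed by the run above
total, correct, invalid = 294, 277, 1
distinct_wrong_ratios = 13

# Every question is either correct, invalid, or wrong
wrong_total = total - correct - invalid
unaccounted = total - distinct_wrong_ratios - invalid - correct
print(wrong_total)                          # 16
print(wrong_total - distinct_wrong_ratios)  # 3, same as `unaccounted`
print(correct / total)                      # accuracy over all 294 questions

# Demo with hypothetical entries: two wrong matches share the ratio 85
demo = defaultdict(list)
demo[85].append(["reframed A", "detected A", "original A"])
demo[85].append(["reframed B", "detected B", "original B"])
demo[37].append(["reframed C", "detected C", "original C"])
print(len(demo.keys()))                     # 2 distinct ratios
print(sum(len(v) for v in demo.values()))   # 3 questions
```

On this reading, the first figure (277 out of 294) is the accuracy over all questions, while the "New Result" leaves three wrong matches out of the denominator.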





In [229]:
print(dubious_questions.keys())

dict_keys([99, 97, 98, 93, 96])


In [224]:
print(len(invalid_questions))
print(invalid_questions)

1
['More information about AI?\n']


In [225]:
# wrong_questions is keyed by ratio, so this counts distinct ratio values, not questions
print(len(wrong_questions.items()))
print(wrong_questions.keys())

13
dict_keys([85, 37, 78, 43, 65, 46, 35, 58, 38, 45, 47, 89, 39])
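
To inspect the individual wrong matches rather than just their ratio keys, the dictionary can be flattened into one row per question. This is a sketch under the assumption that each value is a list of `[reframed, detected, original]` triples, as built in the test function; `flatten_by_ratio` and the sample entries are hypothetical.

```python
from collections import defaultdict


def flatten_by_ratio(questions_by_ratio):
    """Return (ratio, reframed, detected, original) rows, lowest ratio first."""
    rows = []
    for ratio in sorted(questions_by_ratio):
        for reframed_q, detected_q, original_q in questions_by_ratio[ratio]:
            rows.append((ratio, reframed_q, detected_q, original_q))
    return rows


# Hypothetical entries, for illustration only
sample = defaultdict(list)
sample[85].append(["reframed A", "detected A", "original A"])
sample[37].append(["reframed B", "detected B", "original B"])
sample[85].append(["reframed C", "detected C", "original C"])

rows = flatten_by_ratio(sample)
print(len(rows))  # number of wrong questions, not number of distinct ratios
for row in rows:
    print(row)
```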


In [226]:
print(len(reframed_questions))

294
