# Infobot Eval

This tool requires user input at several steps. Run the cells one by one (Shift+Enter) to ensure that every step completes successfully.

## Instructions:

1.  **Set-up**
    1. First cell: install and import dependencies
    2. Second cell: authentication, which requires following the manual steps shown in the Set-up section. Alternatively, use another [supported authentication method](https://github.com/GoogleCloudPlatform/dfcx-scrapi#authentication)
    3. Third cell: enter the values for project, location and agent in the right panel, then run the cell
    4. Fourth cell: run the examples to validate that the set-up is correct
2.  **Generate Questions & Answer**
    1. First cell: save a sample csv file with correct format
    2. Second cell: upload the csv file with the `user_query` and `ideal_answer` fields filled in for every example
    3. Third cell: bulk generation of `agent_answer`, which includes the text and the link
3.  **Rating**
    1. First cell: download csv and add the ratings offline
    2. Second cell: upload csv file with the ratings
4.  **Results**
    1. First cell: visualize distribution of ratings

This notebook calls `DetectIntent` using the [dfcx-scrapi library](https://github.com/GoogleCloudPlatform/dfcx-scrapi) for Dialogflow CX.


## Rating guidance:

For each sample (i.e. each row), the rater should evaluate the answer generated by the agent, including the link. Each answer is rated with a single integer from -1 to 3, as follows:
*   **+3** : Perfect answer > fully addresses the question with correct information and polite tone
*   **+2** : Good answer > may contain unnecessary info, may miss some info, or may not be perfectly articulated
*   **+1** : Slightly good answer > some truth to the answer
*   **0** : Neutral answer > no answer or answer contains irrelevant info
*   **-1** : Hurtful answer > wrong or misleading info, or inappropriate tone
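
Because ratings are typed by hand, it can help to check them against this scale before plotting. The following is a minimal sketch, not part of the original notebook: the helper name `find_invalid_ratings` and the example values are hypothetical. It flags entries that are missing, non-numeric, fractional, or outside the range -1 to 3.

```python
import pandas as pd

# Allowed values, taken from the rating guidance above.
ALLOWED_RATINGS = [-1, 0, 1, 2, 3]


def find_invalid_ratings(ratings: pd.Series) -> pd.Series:
    """Return the entries that do not match the rating scale (hypothetical helper)."""
    # Text that cannot be parsed as a number becomes NaN.
    numeric = pd.to_numeric(ratings, errors="coerce")
    # NaN, fractional and out-of-range values all fail the membership test.
    is_valid = numeric.isin(ALLOWED_RATINGS)
    return ratings[~is_valid]


# Hypothetical ratings: 4 is out of range, 1.5 is not an integer, None is missing.
example = pd.Series([3, 2, -1, 4, 1.5, None])
print(find_invalid_ratings(example))
```

In the notebook, the same check could be applied to `df["rating"]` after the rated csv is loaded, so that invalid rows can be corrected before the results are computed.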



## Set-up


In [None]:
# Dependencies
!pip install dfcx-scrapi --quiet

import io
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from dfcx_scrapi.core.sessions import Sessions

**<span style="color:red">ATTENTION</span>: MANUAL STEP**

Instruction: Run the following commands one by one in the Terminal in order to authenticate the notebook
```
gcloud auth login
gcloud auth application-default login
```


**<span style="color:red">ATTENTION</span>: MANUAL STEP**

Instruction: In the next cell, edit the values of the Agent config, then run the cell
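
An empty `project_id` or `agent_id` produces a malformed resource name that only fails later, when the agent is first called. As an optional safeguard, the fields can be checked before the name is assembled. A minimal sketch, assuming a hypothetical helper `build_agent_name` (it is not part of dfcx-scrapi) and placeholder values:

```python
def build_agent_name(project_id: str, location: str, agent_id: str) -> str:
    """Build the full agent resource name, rejecting empty fields (hypothetical helper)."""
    fields = {"project_id": project_id, "location": location, "agent_id": agent_id}
    # Collect the names of all fields that are empty or whitespace-only.
    empty = [name for name, value in fields.items() if not value.strip()]
    if empty:
        raise ValueError(f"Missing value(s) for: {', '.join(empty)}")
    return (
        f"projects/{project_id.strip()}"
        f"/locations/{location.strip()}"
        f"/agents/{agent_id.strip()}"
    )


# Placeholder values for illustration only.
print(build_agent_name("my-project", "global", "1234"))
# prints: projects/my-project/locations/global/agents/1234
```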


In [None]:
# Agent config
project_id = ''  #@param{type: 'string'}
location = 'global'  #@param{type: 'string'}
agent_id = ''  #@param{type: 'string'}

agent_id = f"projects/{project_id}/locations/{location}/agents/{agent_id}"
print(agent_id)

s = Sessions(agent_id=agent_id)

In [None]:
# Test
user_query = 'Hello World!'
agent_answer = s.get_agent_answer(user_query)
print(f" Q: {user_query}\n A: {agent_answer}")

user_query = 'Which is the cheapest plan?'
agent_answer = s.get_agent_answer(user_query)
print(f" Q: {user_query}\n A: {agent_answer}")

## Generate Questions & Answer

In [None]:
# Create sample csv

sample_df = pd.DataFrame({
  "user_query": [],
  "ideal_answer": [],
  "agent_answer": [],
  "rating": [],
  "comment": []
})

sample_df.loc[0] = ["Who are you?", "I am an assistant", "", 0, ""]
sample_df.loc[1] = ["Which is the cheapest plan?", "Basic plan", "", 0, ""]
sample_df.loc[2] = ["My device is not working", "Call 888-555", "", 0, ""]

# Export to local drive as csv file
file_name = 'data_sample.csv'
sample_df.to_csv(file_name, encoding='utf-8-sig', index=False)

**<span style="color:red">ATTENTION</span>: MANUAL STEP**

Instructions:

1. Download the file `data_sample.csv` to your local drive by right-clicking on the file
2. Open the csv file `data_sample.csv` and add the `user_query` and `ideal_answer` for each example
3. Upload the updated file from your local drive to the Jupyter File system by clicking 'Upload File'
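
For reference, the sketch below shows what a correctly formatted file might look like and how it can be checked before uploading. The rows and the `EXPECTED_COLUMNS` name are illustrative assumptions; the column names come from the sample file created above. Note that a query containing a comma must be wrapped in double quotes.

```python
import io

import pandas as pd

# Column names used by the sample csv in this notebook.
EXPECTED_COLUMNS = ["user_query", "ideal_answer", "agent_answer", "rating", "comment"]

# Illustrative file content; the second query is quoted because it contains a comma.
csv_text = """user_query,ideal_answer,agent_answer,rating,comment
Who are you?,I am an assistant,,0,
"My device is not working, what now?",Call 888-555,,0,
"""

# keep_default_na=False keeps empty cells as empty strings instead of NaN.
example_df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

# Columns that are required but absent from the file.
missing = [col for col in EXPECTED_COLUMNS if col not in example_df.columns]
# Rows whose query is empty or whitespace-only.
blank_queries = example_df[example_df["user_query"].str.strip() == ""]

print(f"missing columns: {missing}, rows without a query: {len(blank_queries)}")
```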


In [None]:

# Load the uploaded csv with the questions and ideal answers
file_name2 = file_name
df = pd.read_csv(file_name2)

assert df.shape[0] > 0, "The csv has zero rows"
assert set(df.columns) == set(sample_df.columns), f"The csv must have the following columns: {sample_df.columns.values}"

df

In [None]:
# Generate answers for each query
df['agent_answer'] = df['user_query'].apply(s.get_agent_answer)

# Export to local drive as csv file
file_name3 = file_name2
df.to_csv(file_name3, encoding='utf-8-sig', index=False)

df

## Rating

**<span style="color:red">ATTENTION</span>: MANUAL STEP**

Instructions:

1. Download the file `data_sample.csv` to your local drive by right-clicking on the file
2. Open the csv file `data_sample.csv` and add the `rating` and, optionally, a `comment` for each example
3. Upload the updated file from your local drive to the Jupyter File system by clicking 'Upload File'


In [None]:

# Load the uploaded csv with the ratings
df = pd.read_csv(file_name3)

assert df.shape[0] > 0, "The csv has zero rows"
assert set(df.columns) == set(sample_df.columns), f"The csv must have the following columns: {sample_df.columns.values}"

df

## Results


In [None]:
# Rating distribution
#df["rating"].describe()

# Histogram
ratings_set = [-1, 0, 1, 2, 3]
ratings_values = df['rating'].values
ratings_count = len(ratings_values)

bar_centers = np.linspace(min(ratings_set), max(ratings_set), len(ratings_set))
bar_edges = np.linspace(min(ratings_set)-0.5, max(ratings_set)+0.5, len(ratings_set)+1)
bar_heights, _ = np.histogram(ratings_values, bins=bar_edges, density=True)

for center, _h in zip(bar_centers, bar_heights):
  print(f"{center}: count={round(_h*ratings_count):.0f}, percentage={_h*100:.2f}%")

# Plot
height_sum = 100  # for percentage, use 100
fig, axs = plt.subplots(1, 1, figsize=(6, 4), tight_layout=True)

plt.bar(bar_centers, height_sum*bar_heights, width=0.8)
ratings_mean = np.mean(ratings_values)
plt.plot([ratings_mean, ratings_mean], [0, height_sum], '--', label=f"mean={ratings_mean:.2f}", color='red')
ratings_median = np.median(ratings_values)
plt.plot([ratings_median, ratings_median], [0, height_sum], '--', label=f"median={ratings_median:.2f}", color='green')

plt.axis((min(bar_edges), max(bar_edges), 0, round(1.2*max(height_sum*bar_heights), 1)))
plt.legend(loc='upper left')
plt.gca().grid(axis='y')
plt.xlabel('Rating')
plt.ylabel('Percentage [%]')
plt.title(f"Rating distribution (count={ratings_count})")
plt.show()
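
As a cross-check of the histogram, the same counts and percentages can be tabulated directly with pandas. A minimal sketch, assuming hypothetical ratings in `example_ratings`; in the notebook, `df["rating"]` would take its place.

```python
import pandas as pd

ratings_set = [-1, 0, 1, 2, 3]

# Hypothetical ratings for illustration only.
example_ratings = pd.Series([3, 2, 2, 0, -1, 3, 3])

# Count each rating; reindex so that ratings nobody used still appear with a count of 0.
counts = example_ratings.value_counts().reindex(ratings_set, fill_value=0)
percentages = 100 * counts / counts.sum()

summary = pd.DataFrame({"count": counts, "percentage": percentages.round(2)})
print(summary)
```

Reindexing on `ratings_set` keeps the table aligned with the bars of the plot, including ratings that were never given.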
