# Programmatic Evaluation of Watson Assistant Intent Recognition Performance

This notebook demonstrates a technique to programmatically train and evaluate the intent recognition performance for a workspace in <a href="https://www.ibm.com/watson/developercloud/assistant/api/v1/" target="_blank" rel="noopener noreferrer">Watson Assistant</a>.

At a high level, intents are purposes or goals expressed in a user's input, such as answering a question or processing a bill payment. By recognizing the intent expressed in a customer's input, the Assistant service can choose the correct dialog flow for responding to it.

This notebook will demonstrate how the Watson Assistant API can be directly accessed to programmatically train the workspace on intents. This is an alternative to the GUI tool typically used to train a workspace.

By managing the training process programmatically, the intent recognition performance can be reliably tested with a truly blind test set.

This notebook runs on Python 3.5.
Check the kernel shown at the top right. If it is not Python 3.5, go to the menu **Kernel > Change Kernel** and select **Python 3.5**.

Tips:
* Code cells are identifiable by their `In [ ]:` prefix in the margin
* To execute the cells in the notebook, select a cell and click the run button, or press Ctrl-Enter.
* Cells that have not been executed yet have empty brackets, while executed cells show a sequence number, e.g. `In [13]`
* The result of a cell execution is displayed below the cell
* To clear all execution statuses and outputs, use the `Cell/All Output/Clear` menu.

Then execute the cell (Ctrl-Enter or run button)

## Table of contents

1. [Install and import packages](#setup)
2. [Import the data as a pandas DataFrame](#import)
3. [Split the data set for training and testing](#scikit)
4. [Authenticate to the Watson Assistant Service](#authenticate)
5. [Test the connection to the Watson Assistant](#wcs1)
6. [Create unique intents from the training data](#wcs2)
7. [Add examples to each intent from the training data set](#wcs3)
8. [Evaluate the test set with the message function](#wcs4)<br>
[Summary and next steps](#Summary-and-next-steps)

## <a id="setup"></a> Step 1. Install and import packages

Install and import the necessary packages.


In [None]:
!pip install --upgrade watson-developer-cloud

In [None]:
!pip install --upgrade scikit-learn

In [None]:
import random
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

## <a id="import"></a>Step 2. Import the data as a pandas DataFrame

The data consists of sample user questions and the assigned intents. 

**For notebooks running on IBM Data Science Experience:**

To get the data and load it into a pandas DataFrame:

* Select the code cell below, and **delete all its content**
* Open the data panel on the right using the 1001 button icon (top right)
* Drop in the file that contains your intents and user examples.
* In the data panel on the right, open the context menu of the added file and choose **Insert to code > Insert Pandas DataFrame**

This generates code that creates a pandas DataFrame named `df_data_1`. If the generated name is different, change the variable name back to `df_data_1`.

**For Python notebook servers**
1. Uncomment and modify the code stub to load data from your server's filesystem. 


In [None]:

import sys
import types
import pandas as pd
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share your notebook.
client_93c7a4746f1e4132864e7ef0a2d31c48 = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='XXXXXXXXXXXXXXXXX',  # replace with your own API key; never share a real key
    ibm_auth_endpoint="https://iam.eu-gb.bluemix.net/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3.eu-geo.objectstorage.service.networklayer.com')

body = client_93c7a4746f1e4132864e7ef0a2d31c48.get_object(Bucket='assistantevaluationdcd3054a40ba441bb5ae5f75c631e25f',Key='WCS_Lab_Data_Set.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

df_data_1 = pd.read_csv(body)
df_data_1.head()
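
The later cells read two columns from this DataFrame, `intent` and `example`. When loading a file from a notebook server's filesystem, it can help to check the CSV header first. The sketch below does this with the standard library only; the helper name `missing_columns` and the sample file are hypothetical and written purely for illustration.

```python
import csv
import os
import tempfile

REQUIRED_COLUMNS = {"example", "intent"}  # column names used by the later cells

def missing_columns(csv_path, required=REQUIRED_COLUMNS):
    """Return the sorted list of required columns absent from the CSV header."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    return sorted(set(required) - {name.strip() for name in header})

# Hypothetical sample file, written to a temporary directory for illustration.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_intents.csv")
with open(sample_path, "w", newline="", encoding="utf-8") as handle:
    writer = csv.writer(handle)
    writer.writerow(["example", "intent"])
    writer.writerow(["Where is the nearest restroom?", "locate_amenity"])

# An empty list means both required columns are present.
print(missing_columns(sample_path))
```

If the list is empty, the same path can be passed to `pd.read_csv` and the result assigned to `df_data_1`.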



Rename the DataFrame to `df`:

In [None]:
# Make sure this uses the variable above. The number will vary in the inserted code.
try:
    df = df_data_1
except NameError as e:
    print('Error: Setup is incorrect or incomplete.\n')
    print('Follow the instructions to insert the pandas DataFrame above, and edit to')
    print('make the generated df_data_# variable match the variable used here.')
    raise

## <a id="scikit"></a>Step 3. Split the data set for training and testing 
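Using scikit-learn, split the data set into two separate sets, one for training and one for testing. The size of the testing data set is set to 20% of the original data set, but you can change the percentage if you like. Because `train_test_split` shuffles the rows at random, each run produces a different split unless you pass a fixed `random_state`.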

In [None]:
train, test = train_test_split(df, test_size = 0.2)
train.head()

In [None]:
train.groupby(by='intent').count()

In [None]:
test.head()

In [None]:
test.groupby(by='intent').count()
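
The per-intent counts above can be uneven, because a purely random split does not guarantee that every intent keeps the same proportion in both sets. The following is a minimal, dependency-free sketch of a stratified split, which samples the test fraction from each intent separately. The function name `stratified_split` and the toy rows are hypothetical; scikit-learn's `train_test_split` offers the `stratify` and `random_state` parameters for the same purpose.

```python
import random
from collections import defaultdict

def stratified_split(rows, label_key="intent", test_size=0.2, seed=42):
    """Split rows into (train, test), drawing test_size of each label group."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    groups = defaultdict(list)
    for row in rows:
        groups[row[label_key]].append(row)

    train, test = [], []
    for label in sorted(groups):
        members = list(groups[label])
        rng.shuffle(members)
        # Keep at least one test row per label when the group has more than one row
        n_test = max(1, round(len(members) * test_size)) if len(members) > 1 else 0
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

# Toy data: 10 rows for intent "a" and 5 rows for intent "b"
rows = [{"example": "q{}".format(i), "intent": "a" if i < 10 else "b"} for i in range(15)]
train_rows, test_rows = stratified_split(rows)
print(len(train_rows), len(test_rows))
```

With a pandas DataFrame, `df.to_dict('records')` produces rows in the shape this sketch expects.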

## <a id="authenticate"></a>Step 4. Authenticate to the Watson Assistant

Sign up for the Watson Assistant service (formerly named Watson Conversation) and enter your credentials.

1. Sign up for [Watson Assistant](https://console.bluemix.net/catalog/services/conversation) in IBM Cloud.
1. On your Watson Assistant service page, click **Launch Tool**. The Workspaces page appears in a separate tab.
1. On the Workspaces page, click **Create**.
1. Add a name, for example, `Intents example`, and click **Create**.
1. Find your workspace ID and credentials by clicking the **Deploy** button and then **Credentials**.
1. Add your workspace ID, username, and password to the next cell and run the cell.

Tips:
* Watson Studio and Watson Assistant must be in the same IBM Cloud region (US South, for instance)

In [None]:
CONVERSATION_USERNAME = 'XXXXXXXXXXXXXXXXX'
CONVERSATION_PASSWORD = 'XXXXXXXXXXXXXXXXX'
VERSION = '2018-02-16'
WORKSPACE_ID = 'XXXXXXXXXXXXXXXXXXXXXX'
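
Hard-coded credentials are easy to leak when a notebook is shared. As an alternative, the sketch below reads the same settings from environment variables. The variable names, the helper `load_credentials`, and the placeholder values are assumptions made for illustration; the default version string is the one used in the cell above.

```python
import os

# Hypothetical variable names; choose whatever fits your environment.
REQUIRED = ("CONVERSATION_USERNAME", "CONVERSATION_PASSWORD", "WORKSPACE_ID")

def load_credentials(env=os.environ):
    """Read the service settings from a mapping, failing on any missing value."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    # The API version is optional and falls back to the value used in this notebook
    settings["VERSION"] = env.get("CONVERSATION_VERSION", "2018-02-16")
    return settings

# Placeholder mapping so the sketch runs on its own; omit the argument to read os.environ.
demo_env = {
    "CONVERSATION_USERNAME": "XXXXXXXXXXXXXXXXX",
    "CONVERSATION_PASSWORD": "XXXXXXXXXXXXXXXXX",
    "WORKSPACE_ID": "XXXXXXXXXXXXXXXXXXXXXX",
}
settings = load_credentials(demo_env)
print(sorted(settings))
```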

Import the Watson Assistant package and set variables:

In [None]:
import json
from watson_developer_cloud import ConversationV1
conversation = ConversationV1(
    username=CONVERSATION_USERNAME,
    password=CONVERSATION_PASSWORD,
    version= VERSION
)

## <a id="wcs1"></a>Step 5. Test the connection to the Watson Assistant
Run the <a href="https://www.ibm.com/watson/developercloud/assistant/api/v1/" target="_blank" rel="noopener noreferrer">Watson Assistant API</a> functions to make sure you are properly connected to your Watson Assistant Workspace.

List the existing intents with the `list_intents` function. If this is the first time you're using the Watson Assistant, you won't have any intents.

In [None]:
intents = conversation.list_intents(WORKSPACE_ID)
print(json.dumps(intents, indent=2))

Create a sample intent with the `create_intent` function:

In [None]:
create = conversation.create_intent(WORKSPACE_ID,'sample','This is an example')
print(json.dumps(create, indent=2))


Now delete all intents with the `delete_intent` function:

In [None]:
#Clear the workspace of all existing intents
intents = conversation.list_intents(workspace_id=WORKSPACE_ID)['intents']
for intent in intents:
    conversation.delete_intent(workspace_id=WORKSPACE_ID, intent=intent['intent'])

## <a id="wcs2"></a>Step 6. Create unique intents from the training data

Use the values from the `intent` column in the training data set to create intents: `locate_amenity`, `capabilities`, and `interface_issues`.

In [None]:
for intent in train['intent'].unique():
    conversation.create_intent(workspace_id=WORKSPACE_ID, intent=intent, description=intent)

## <a id="wcs3"></a>Step 7. Add examples to each intent from the training data set
Add example text from the training data set for each intent so that the Watson Assistant can learn what sorts of questions to assign to each intent.

In [None]:
for _, training_data in train.iterrows():
    conversation.create_example(workspace_id=WORKSPACE_ID, intent=training_data.intent, text=training_data.example)
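
This loop makes one API call per training example, so a single transient network or service error stops the whole upload. One possible safeguard is a small retry wrapper, sketched below. The helper `call_with_retry` and the stand-in function `flaky_create_example` are hypothetical and are not part of the Watson SDK.

```python
import time

def call_with_retry(func, *args, retries=3, delay_seconds=1.0, **kwargs):
    """Call func(*args, **kwargs), retrying on any exception up to `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            if attempt == retries:
                raise  # give up after the last attempt
            print("Attempt {} failed ({}); retrying".format(attempt, exc))
            time.sleep(delay_seconds * attempt)  # wait a little longer each time

# Stand-in for an API call that fails once and then succeeds
calls = {"count": 0}

def flaky_create_example(text):
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("temporary failure")
    return {"text": text}

result = call_with_retry(flaky_create_example, "where is the pool", delay_seconds=0.01)
print(result, calls["count"])
```

In the loop above, the call would become `call_with_retry(conversation.create_example, workspace_id=WORKSPACE_ID, intent=training_data.intent, text=training_data.example)`.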

## <a id="wcs4"></a>Step 8. Evaluate the test set with the message function
Now test how accurately Watson Assistant assigns intents to the examples in the testing data set. By using the `message` function from the <a href="https://www.ibm.com/watson/developercloud/assistant/api/v1/" target="_blank" rel="noopener noreferrer">Watson Assistant API</a>, you can test all examples at once instead of examining each one individually in the Assistant Workspace tool. A more complete approach would store the results in a data source for further calculations and dashboards on the efficiency of the service; for now, you just calculate the accuracy.

In [None]:
results = []
# Send each test example to the workspace and record the top intent and its confidence
for _, test_data in test.iterrows():
    try:
        r = conversation.message(workspace_id=WORKSPACE_ID, input={"text": test_data.example})
        intent = r['intents'][0]['intent']
        confidence = r['intents'][0]['confidence']
        results.append({'test_example': test_data.example, 'test_intent': test_data.intent,
                        'res_intent': intent, 'confidence': confidence,
                        'test': 1 if intent == test_data.intent else 0})
    except Exception as exc:
        # A failed call (or an empty intent list) counts as a wrong answer.
        # Do not compare against `intent` here: it would hold the previous iteration's value.
        results.append({'test_example': test_data.example, 'test_intent': test_data.intent,
                        'res_intent': '', 'confidence': 0, 'test': 0})
        print({'error': test_data.example, 'exc': format(exc)})
        

Now display the results:

In [None]:
print(json.dumps(results, indent=1))
#print(results)
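
The full JSON dump can be long. A shorter view lists only the wrong answers, with the most confident mistakes first, since those are often the most informative to review. The sketch below assumes the result dictionaries built in the previous cell; the helper name `misclassified` and the sample rows are hypothetical.

```python
def misclassified(results):
    """Return the wrongly classified results, highest confidence first."""
    wrong = [r for r in results if r["test"] == 0]
    # float() tolerates confidences stored as strings, such as '0'
    return sorted(wrong, key=lambda r: float(r["confidence"]), reverse=True)

# Made-up rows in the same shape as `results`
sample_results = [
    {"test_example": "q1", "test_intent": "capabilities",
     "res_intent": "capabilities", "confidence": 0.93, "test": 1},
    {"test_example": "q2", "test_intent": "locate_amenity",
     "res_intent": "capabilities", "confidence": 0.81, "test": 0},
    {"test_example": "q3", "test_intent": "interface_issues",
     "res_intent": "", "confidence": "0", "test": 0},
]

for row in misclassified(sample_results):
    print(row["test_example"], row["test_intent"], "->", row["res_intent"], row["confidence"])
```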

In [None]:
# Collect the 0/1 correctness flag of each result
res = []
for rz in results:
    try:
        res.append(rz['test'])
    except KeyError:
        # A result without a 'test' flag counts as a wrong answer
        print(rz)
        res.append(0)

res = np.array(res)

# Accuracy = number of correct answers / total number of answers
print("Intent Recognizer Performance / accuracy: {:.2%}".format(np.sum(res) / res.size))
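
A single accuracy figure hides which intents are confused with each other. As one way to dig further, the sketch below computes accuracy per expected intent and counts each (expected, predicted) pair among the errors. It assumes the result dictionaries built earlier; the helper names and the sample rows are hypothetical.

```python
from collections import Counter

def per_intent_accuracy(results):
    """Map each expected intent to the fraction of its examples classified correctly."""
    totals = Counter(r["test_intent"] for r in results)
    correct = Counter(r["test_intent"] for r in results if r["test"] == 1)
    return {intent: correct[intent] / totals[intent] for intent in totals}

def confusion_pairs(results):
    """Count (expected intent, predicted intent) pairs among the wrong answers."""
    return Counter((r["test_intent"], r["res_intent"]) for r in results if r["test"] == 0)

# Made-up rows in the same shape as `results`
sample_results = [
    {"test_intent": "capabilities", "res_intent": "capabilities", "test": 1},
    {"test_intent": "capabilities", "res_intent": "locate_amenity", "test": 0},
    {"test_intent": "locate_amenity", "res_intent": "locate_amenity", "test": 1},
    {"test_intent": "locate_amenity", "res_intent": "locate_amenity", "test": 1},
]

print(per_intent_accuracy(sample_results))
print(confusion_pairs(sample_results).most_common())
```

Intents with low accuracy, or pairs that are confused frequently, are natural candidates for more or clearer training examples.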

## Summary and next steps
You've learned how to use the Watson Assistant API to train and evaluate the service. Try adding your own user questions and intents data and see how Watson does!

Learn more:
- <a href="https://www.ibm.com/watson/developercloud/assistant/api/v1/" target="_blank" rel="noopener noreferrer">Watson Assistant API reference</a>
- <a href="https://github.com/watson-developer-cloud/python-sdk" target="_blank" rel="noopener noreferrer">Watson Assistant Python SDK</a>

### Authors
Laurent Vincent.

Copyright &copy; IBM Corp. 2018. This notebook and its source code are released under the terms of the MIT License.