# Skill analysis for New Watson Assistant Experience

## Introduction
Dialog/Action Skill Analysis for [New Experience Watson Assistant (WA)](https://cloud.ibm.com/docs/watson-assistant?topic=watson-assistant-watson-assistant-faqs#:~:text=What%20is%20the%20new%20IBM,or%20not%2C%20to%20create%20assistants) is intended for use by chatbot designers, developers and data scientists who would like to experiment with and improve their existing dialog or action skill design. 
    

This notebook assumes familiarity with the Watson Assistant product as well as concepts involved in dialog/action skill design such as intent, entities, and utterances.   

<font color='red'>**Important:**</font> For the new experience, Dialog takes precedence over Action. To analyze a Dialog skill in the new experience, make sure the Dialog option is activated in the assistant settings and use the Dialog workspace for the subsequent analysis. To analyze an Action skill, make sure the Dialog option is deactivated in the assistant settings and use the Action workspace for the subsequent analysis.

### Environment
- Python version 3.9 or above is required. 
- Install dependencies with `pip install -r requirements.txt`; see `requirements.txt` for the full list of packages.


Install all the required packages and filter out any warnings.

### Import dependencies

In [None]:
# visualize

from IPython.display import Markdown, display, HTML
import warnings
warnings.filterwarnings('ignore')

# Standard python libraries
import sys, os
import json
import importlib
from collections import Counter

# External python libraries
import pandas as pd
import numpy as np
import nltk
nltk.download('stopwords')
nltk.download('punkt')
import ibm_watson

# Internal python libraries
from assistant_skill_analysis.utils import skills_util, lang_utils
from assistant_skill_analysis.highlighting import highlighter
from assistant_skill_analysis.data_analysis import summary_generator
from assistant_skill_analysis.data_analysis import divergence_analyzer
from assistant_skill_analysis.data_analysis import similarity_analyzer
from assistant_skill_analysis.term_analysis import chi2_analyzer
from assistant_skill_analysis.term_analysis import keyword_analyzer
from assistant_skill_analysis.term_analysis import entity_analyzer
from assistant_skill_analysis.confidence_analysis import confidence_analyzer
from assistant_skill_analysis.inferencing import inferencer
from assistant_skill_analysis.experimentation import data_manipulator

### Assistant Settings
Please set values for the variables in the cell below to configure this notebook. The notebook uses CloudPakForDataAuthenticator to authenticate the APIs.

- **LANGUAGE_CODE:** the language code corresponding to your workspace data. Supported languages: **en, fr, de, es, cs, it, pt, nl**

- **ASSISTANT_ID:** id of the Watson Assistant service instance

- **SKILL_FILENAME:** Depending on the type of analysis you want to conduct, download the dialog or action skill workspace JSON from the UI, and set this variable in the cell below to the path of the downloaded file.

- **username and password:** Replace them with your Cloud Pak for Data credentials
- **CP4D_URL:** This is the base url of your instance. It is in the format of `https://{cpd_cluster_host}{:port}/icp4d-api`
- **SERVICE_URL:** This is the service URL of your Watson Assistant. The URL follows this pattern: `https://{cpd_cluster_host}{:port}/assistant/{release}/instances/{instance_id}/api`. To find this URL, view the details for the service instance from the Cloud Pak for Data web client. For more information, see [Service Endpoint](https://cloud.ibm.com/apidocs/assistant-data-v1?code=python#service-endpoint)

Reference: https://cloud.ibm.com/apidocs/assistant-v2#authentication


In [None]:
# language

LANGUAGE_CODE = "en"

lang_util = lang_utils.LanguageUtility(LANGUAGE_CODE)

_, _, ASSISTANT_ID = skills_util.input_credentials(input_apikey=False,input_skill_id=False,input_assistant_id=True)

username = 'apikey'
password = '###'
SERVICE_URL = 'service_url'
CP4D_URL = 'cp4d_url'


# Path to the downloaded workspace JSON: use the dialog skill file to analyze Dialog, or the action skill file to analyze Action

SKILL_FILENAME = '###'

## Table of contents

1. [Part 1: Prepare the training data](#part1)<br>
2. [Part 2: Prepare the test data](#part2)<br>
3. [Part 3: Perform advanced analysis](#part3)<br>
4. [Part 4: Summary](#part4)<br>

<a id='part1'></a>
# Part 1: Prepare the training data
1.1 [Set up access to the training data](#part1.1)<br>
1.2 [Process the dialog/action skill training data](#part1.2)<br>
1.3 [Analyze data distribution](#part1.3)<br>
1.4 [Perform a correlation analysis](#part1.4)<br>
1.5 [Visualize terms using a heat map](#part1.5)<br>
1.6 [Ambiguity in the training data](#part1.6)<br>

<a id='part1.1'></a>
## 1.1 Set up access to the training data

In [None]:
# load workspace json from local disk
with open(SKILL_FILENAME, 'r') as json_file:
    workspace_json = json.load(json_file)

# Extract user workspace
workspace_pd, workspace_vocabulary, entity_details, intent_to_action_mapping = skills_util.extract_workspace_data(workspace_json, language_util=lang_util)
entities_list = [item['entity'] for item in entity_details]


display(Markdown("### Sample of Utterances & Intents"))
display(HTML(workspace_pd.sample(n=min(10, len(workspace_pd)))
             .to_html(index=False)))

<a id='part1.2'></a>
## 1.2 Process the dialog/action skill training data

Generate summary statistics related to the given skill and workspace.

In [None]:
summary_generator.generate_summary_statistics(workspace_pd, entities_list)

<a id='part1.3'></a>
## 1.3 Analyze the data distribution

- [Analyze class imbalance](#imbalance)
- [List the distribution of user examples by intent](#distribution)
- [Actions for class imbalance](#actionimbalance)

### Analyze class imbalance<a id='imbalance'></a>

Check whether the data set contains class imbalance. The data set is considered balanced when the largest intent contains less than double the number of user examples in the smallest intent. An imbalance does not necessarily indicate an issue, but you should review the [actions](#actionimbalance) section below.

In [None]:
class_imb_flag = summary_generator.class_imbalance_analysis(workspace_pd)
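
As a rough illustration of the check described above, the following sketch flags a data set as imbalanced when the largest intent has at least twice as many examples as the smallest one. The helper name `is_imbalanced`, the `max_ratio` parameter, and the sample labels are hypothetical; `summary_generator.class_imbalance_analysis` may apply additional logic.

```python
from collections import Counter


def is_imbalanced(intent_labels, max_ratio=2.0):
    """Return True when the largest intent has at least `max_ratio` times
    as many examples as the smallest intent (hypothetical helper)."""
    counts = Counter(intent_labels)
    if not counts:
        return False
    largest = max(counts.values())
    smallest = min(counts.values())
    return largest >= max_ratio * smallest


# Made-up label column: one entry per user example
labels = ["greetings"] * 5 + ["updateBankAccount"] * 12 + ["addNewAccountHolder"] * 8
print(is_imbalanced(labels))
```

Here the largest intent has 12 examples and the smallest has 5, so the ratio exceeds 2 and the data set is flagged.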

### List the distribution of user examples by intent<a id='distribution'></a>
Display the distribution of intents versus the number of examples per intent (sorted by the number of examples per intent) below. Ideally you should not have large variations in terms of number of user examples for various intents. 

In [None]:
summary_generator.scatter_plot_intent_dist(workspace_pd)

In [None]:
summary_generator.show_user_examples_per_intent(workspace_pd)

### Actions for class imbalance<a id='actionimbalance'></a>

Class imbalance does not always lead to lower accuracy, so not all intents (classes) need to have the same number of examples.

Given a hypothetical chatbot related to banking:<br>

- For intents like `updateBankAccount` and `addNewAccountHolder`, where the semantic difference between them is subtle, the number of examples per intent needs to be somewhat balanced; otherwise the classifier might favor the intent with the higher number of examples.
- For intents like `greetings` that are semantically distinct from other intents like `updateBankAccount`, it may be acceptable for it to have fewer examples per intent and still be easy for the intent detector to classify.



If the intent classification accuracy is lower than expected during testing, you should re-examine the distribution analysis.  

In the sorted distribution of examples per intent, a large variation in the number of user examples across intents is a potential source of bias for intent detection and can lead to lower accuracy. Large imbalances should generally be avoided; if your graph shows this characteristic, it could be a source of error.

For further guidance on adding more examples to help balance out your distribution, refer to 
<a href="https://cloud.ibm.com/docs/services/assistant?topic=assistant-intent-recommendations#intent-recommendations-get-example-recommendations" target="_blank" rel="noopener noreferrer">Intent Example Recommendation</a>.

<a id='part1.4'></a>
## 1.4 Perform a correlation analysis

- [Retrieve the most correlated unigrams and bigrams for each intent](#retrieve)
- [Actions for anomalous correlations](#anomalous)

### Retrieve the most correlated unigrams and bigrams for each intent<a id='retrieve'></a>

Perform a chi square significance test using count features to determine the terms that are most correlated with each intent in the data set. 

A `unigram` is a single word, while a `bigram` is two consecutive words from within the training data. For example, in the sentence `Thank you for your service`, each word is a unigram, while pairs of adjacent words such as `Thank you` and `your service` are bigrams.

Terms such as `hi`, `hello` correlated with a `greeting` intent are reasonable. But terms such as `table`, `chair` correlated with the `greeting` intent are anomalous. A scan of the most correlated unigrams & bigrams for each intent can help you spot potential anomalies within your training data.

**Note**: The following common words ("stop words") are ignored: `an, a, in, on, be, or, of, and, can, is, to, the, i`
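
To make the unigram and bigram definitions concrete, here is a minimal sketch that extracts both from a sentence. The helper name `extract_ngrams`, the whitespace tokenization, and the choice to drop stop words before forming bigrams are assumptions for illustration; the notebook's own tokenization is handled by `lang_util` and may differ.

```python
# Stop words copied from the note above
STOP_WORDS = {"an", "a", "in", "on", "be", "or", "of", "and", "can", "is", "to", "the", "i"}


def extract_ngrams(sentence, n):
    """Lowercase, split on whitespace, drop stop words, and return n-grams."""
    tokens = [t for t in sentence.lower().split() if t not in STOP_WORDS]
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = "Thank you for your service"
unigrams = extract_ngrams(sentence, 1)
bigrams = extract_ngrams(sentence, 2)
print(unigrams)  # ['thank', 'you', 'for', 'your', 'service']
print(bigrams)   # ['thank you', 'you for', 'for your', 'your service']
```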

In [None]:
unigram_intent_dict, bigram_intent_dict = chi2_analyzer.get_chi2_analysis(workspace_pd, lang_util=lang_util)
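
For intuition about what a chi-square score measures, the sketch below computes the statistic for a single term and a single intent from a 2x2 table of term presence versus intent membership. This is a simplified presence/absence variant written for illustration, not the count-feature computation performed by `chi2_analyzer.get_chi2_analysis`; the function name and the sample data are hypothetical.

```python
def chi2_term_intent(examples, term, intent):
    """Chi-square statistic for a 2x2 table: term present/absent vs. in/out of intent.

    `examples` is a list of (utterance, intent) pairs.
    """
    a = b = c = d = 0
    for utterance, label in examples:
        has_term = term in utterance.lower().split()
        if label == intent and has_term:
            a += 1  # in intent, term present
        elif label == intent:
            b += 1  # in intent, term absent
        elif has_term:
            c += 1  # other intent, term present
        else:
            d += 1  # other intent, term absent
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # the term or intent never varies, so no association can be measured
    return n * (a * d - b * c) ** 2 / denom


examples = [
    ("hi there", "greeting"),
    ("hello", "greeting"),
    ("hi", "greeting"),
    ("book a table", "booking"),
    ("reserve a table", "booking"),
    ("table for two", "booking"),
]
print(chi2_term_intent(examples, "table", "booking"))  # 6.0
print(chi2_term_intent(examples, "hi", "greeting"))    # 3.0
```

A higher score means the term's presence is more strongly tied to the intent; `table` appears in every `booking` example and nowhere else, so it scores higher than `hi` does for `greeting`.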

### Actions for anomalous correlations<a id='anomalous'></a>

If you identify unusual or anomalous correlated terms, such as numbers or names, that should not be correlated with an intent, consider the following:
  
- **Case 1** : If you see names appearing amongst correlated unigrams or bigrams, add more variations of names so that no specific name is correlated
- **Case 2** : If you see specific numbers like 1234 amongst correlated unigrams or bigrams and these are not helpful to the use case, remove or mask these numbers from the examples
- **Case 3** : If you see terms which should never be correlated with that specific intent, consider adding or removing terms/examples so that domain-specific terms are correlated with the correct intent

<a id='part1.5'></a>
## 1.5 Visualize terms using a heat map

- [Display term analysis for a custom intent list](#customintent)
- [Actions for anomalous terms in the heat map](#heatmap)

A heat map of terms is a method to visualize terms or words that frequently occur within each intent. Rows are the terms, and columns are the intents. 

The code below displays the top 30 intents with the highest number of user examples in the analysis. This number can be changed if needed.

In [None]:

INTENTS_TO_DISPLAY = 30  # Total number of intents for display
MAX_TERMS_DISPLAY = 30  # Total number of terms to display

intent_list = []
keyword_analyzer.seaborn_heatmap(workspace_pd, lang_util, INTENTS_TO_DISPLAY, MAX_TERMS_DISPLAY, intent_list)
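
The data behind such a heat map can be thought of as a term-by-intent count table. The following sketch builds one with the standard library only; it is an assumption about the general idea, not the implementation of `keyword_analyzer.seaborn_heatmap`, and the helper name and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def term_intent_counts(examples, max_terms=30):
    """Return (terms, intents, matrix), where matrix[i][j] is the number of
    times terms[i] occurs in the examples of intents[j]."""
    per_intent = defaultdict(Counter)
    totals = Counter()
    for utterance, intent in examples:
        tokens = utterance.lower().split()
        per_intent[intent].update(tokens)
        totals.update(tokens)
    terms = [term for term, _ in totals.most_common(max_terms)]  # rows
    intents = sorted(per_intent)                                 # columns
    matrix = [[per_intent[intent][term] for intent in intents] for term in terms]
    return terms, intents, matrix


examples = [
    ("hi there", "greeting"),
    ("hello there", "greeting"),
    ("book a table", "booking"),
    ("table for two", "booking"),
    ("reserve a table", "booking"),
]
terms, intents, matrix = term_intent_counts(examples)
row = matrix[terms.index("table")]
print(dict(zip(intents, row)))
```

Each row of `matrix` corresponds to a term and each column to an intent, matching the layout described above.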

### Display term analysis for a custom intent list<a id='customintent'></a>

If you wish to see term analysis for specific intents, feel free to add those intents to the intent list. This generates a custom term heatmap. The code below displays the top 20 terms, but this can be changed if needed.

In [None]:
intent_list = workspace_pd['intent'].unique().tolist()
# intent_list = []
MAX_TERMS_DISPLAY = 20  # Total number of terms to display

if intent_list: 
    keyword_analyzer.seaborn_heatmap(workspace_pd, lang_util, INTENTS_TO_DISPLAY, MAX_TERMS_DISPLAY, intent_list)

### Actions for anomalous terms in the heat map<a id='heatmap'></a>

If you notice any terms or words which should not be frequently present within an intent, consider modifying examples in that intent.

<a id='part1.6'></a>
## 1.6 Ambiguity in the training data

- [Uncover ambiguous utterances across intents](#uncover)
- [Actions for ambiguity in the training data](#ambiguityaction)

Run the code blocks below to uncover possibly ambiguous terms based on feature correlation.

Based on the chi-square analysis above, generate intent pairs which have overlapping correlated unigrams and bigrams.
This allows you to get a glimpse of which unigrams or bigrams might cause potential confusion with intent detection:

#### A. Top intent pairs with overlapping correlated unigrams

In [None]:
ambiguous_unigram_df = chi2_analyzer.get_confusing_key_terms(unigram_intent_dict)

#### B. Top intent pairs with overlapping correlated bigrams

In [None]:
ambiguous_bigram_df = chi2_analyzer.get_confusing_key_terms(bigram_intent_dict)

#### C. Overlap checker for specific intents

In [None]:
# Add specific intent or intent pairs for which you would like to see overlap
intent1 = 'Where are you located?'
intent2 = 'Fallback'
chi2_analyzer.chi2_overlap_check(ambiguous_unigram_df,ambiguous_bigram_df,intent1,intent2)
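
Conceptually, finding confusing intent pairs amounts to intersecting the sets of correlated terms for each pair of intents. The sketch below assumes a dictionary mapping each intent to its list of correlated terms; that shape, the helper name `overlapping_terms`, and the sample terms are assumptions, and the actual structure of `unigram_intent_dict` may differ.

```python
from itertools import combinations


def overlapping_terms(intent_terms):
    """Return (intent_a, intent_b, shared_terms) for every pair of intents that
    shares at least one correlated term, most overlap first."""
    overlaps = []
    for a, b in combinations(sorted(intent_terms), 2):
        shared = sorted(set(intent_terms[a]) & set(intent_terms[b]))
        if shared:
            overlaps.append((a, b, shared))
    overlaps.sort(key=lambda row: len(row[2]), reverse=True)
    return overlaps


# Hypothetical correlated terms per intent
intent_terms = {
    "updateBankAccount": ["account", "update", "bank"],
    "addNewAccountHolder": ["account", "holder", "bank"],
    "greetings": ["hi", "hello"],
}
overlaps = overlapping_terms(intent_terms)
print(overlaps)
```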

### Uncover ambiguous utterances across intents<a id='uncover'></a>
The following analysis shows user examples that are similar but fall under different intents.  

In [None]:
similar_utterance_diff_intent_pd = similarity_analyzer.ambiguous_examples_analysis(workspace_pd, lang_util)
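
One simple way to surface similar utterances that belong to different intents is token-level Jaccard similarity. The sketch below uses that measure purely as an illustration; the similarity measure inside `similarity_analyzer.ambiguous_examples_analysis` is not described here and may be different. The function names and the `threshold` value are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between the token sets of two utterances."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def similar_across_intents(examples, threshold=0.6):
    """Return pairs of utterances from different intents whose similarity
    is at or above `threshold`."""
    pairs = []
    for (u1, i1), (u2, i2) in combinations(examples, 2):
        if i1 != i2:
            score = jaccard(u1, u2)
            if score >= threshold:
                pairs.append((u1, i1, u2, i2, score))
    return pairs


examples = [
    ("i want to update my account", "updateBankAccount"),
    ("i want to update my account holder", "addNewAccountHolder"),
    ("hello there", "greetings"),
]
found = similar_across_intents(examples)
for u1, i1, u2, i2, score in found:
    print(f"{score:.2f}  [{i1}] {u1}  <->  [{i2}] {u2}")
```

This pairwise comparison is quadratic in the number of examples, so it is only practical for small to medium workspaces.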

### Actions for ambiguity in the training data<a id='ambiguityaction'></a>

**Ambiguous intent pairs**  
If you see terms which are correlated with more than one intent, review whether this seems anomalous based on the use case for that intent. If it seems reasonable, it is probably not an issue.  

**Ambiguous utterances across intents** 
- **Duplicate utterances**: For duplicate or almost identical utterances, remove those that seem unnecessary.
- **Similar utterances**: For similar utterances, review the use case for those intents and make sure that they are not accidental additions caused by human error when the training data was created.  

For more information about entities, refer to the <a href="https://cloud.ibm.com/docs/services/assistant/services/assistant?topic=assistant-entities" target="_blank" rel="noopener noreferrer">Entity Documentation</a>.

For more in-depth analysis related to possible conflicts in your training data across intents, try the conflict detection feature in Watson Assistant. Refer to <br> <a href="https://cloud.ibm.com/docs/services/assistant?topic=assistant-intents#intents-resolve-conflicts" target="_blank" rel="noopener noreferrer">Conflict Resolution Documentation</a>.

<a id='part2'></a>
# Part 2: Prepare the test data

Analyze your existing Watson Assistant dialog or action skill with the help of a test set.

2.1. [Obtain test data from Cloud Object Storage](#cos)<br>
2.2. [Evaluate the test data](#evaluate) <br>
2.3. [Analyze the test data](#testanalysis) <br>

## 2.1 Obtain test data from Cloud Object Storage<a id='cos'></a>

Upload a test set in TSV format. Each line in the file should contain only `User_Input<tab>Intent`.  

For example:
```
hello how are you<tab>Greeting  
I would like to talk to a human<tab>AgentHandoff  
```
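
As a rough sketch of how a file in this format could be parsed and validated using only the standard library, consider the following. The helper name `read_test_set` is hypothetical, and `skills_util.process_test_set` may perform additional processing, such as language-specific preprocessing.

```python
import csv
import io


def read_test_set(handle, separator="\t"):
    """Read (utterance, intent) pairs, rejecting lines that do not have exactly two columns."""
    rows = []
    reader = csv.reader(handle, delimiter=separator, quoting=csv.QUOTE_NONE)
    for line_no, row in enumerate(reader, start=1):
        if not row or all(not cell.strip() for cell in row):
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, found {len(row)}")
        utterance, intent = (cell.strip() for cell in row)
        rows.append((utterance, intent))
    return rows


# In-memory stand-in for a file such as ./test.tsv
sample = "hello how are you\tGreeting\nI would like to talk to a human\tAgentHandoff\n"
rows = read_test_set(io.StringIO(sample))
print(rows)
```

Failing early on malformed lines makes it easier to spot a wrong separator before any calls are made to the service.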

In [None]:
importlib.reload(skills_util)
#Separator: Use '\t' for tab separated data, ',' for comma separated data
separator = "\t"

test_set_path = "./test.tsv"
test_df = skills_util.process_test_set(test_set_path, lang_util,separator)

display(Markdown("### Random Test Sample"))
display(HTML(test_df.sample(n=min(10, len(test_df))).to_html(index=False)))

## 2.2 Evaluate the test data<a id='evaluate'></a>
These steps can take time if you have a large test set.  

**<font color=red>Note</font>**: You will be charged for calls made from this notebook based on your Watson Assistant plan. The user_id will be the same for all message calls.

In [None]:
# Change Assistant API version if needed
# Find Latest --> https://cloud.ibm.com/apidocs/assistant-v2#versioning
API_VERSION = '2021-11-27'

# For ICP(IBM Cloud Private), you can disable SSL verification by changing this to True
DISABLE_SSL_VERIFICATION = False

conversation = skills_util.retrieve_conversation(username=username,
                                             password=password,
                                             authenticator_url=CP4D_URL,
                                             url=SERVICE_URL,
                                             api_version=API_VERSION,
                                             sdk_version="V2", # DO NOT CHANGE THIS ARG
                                             cp4d_auth=True,
                                             )

# if you have a BEARER_TOKEN, you can uncomment the code below and directly use your token
# BEARER_TOKEN = 'bearer_token'
# conversation = skills_util.retrieve_conversation(bearer_token=BEARER_TOKEN, url=SERVICE_URL)


conversation.set_disable_ssl_verification(DISABLE_SSL_VERIFICATION)

In [None]:
THREAD_NUM = min(4, os.cpu_count() if os.cpu_count() else 1)
# Increase the timeout if you experience `TimeoutError`.
# Increasing `TIMEOUT` gives the process more breathing room to complete
TIMEOUT = 1 # `TIMEOUT` is set to 1 second
full_results = inferencer.inference(conversation,
                                    test_df,
                                    max_thread=THREAD_NUM, 
                                    assistant_id=ASSISTANT_ID,
                                    intent_to_action_mapping=intent_to_action_mapping,
                                    timeout=TIMEOUT
                                   )

<a id='part2.1'></a>
## 2.3 Analyze the test data<a id='testanalysis'></a>

- [Display an overview of the test data](#overview)
- [Compare the test data and the training data](#compare)
- [Determine the overall accuracy on the test set](#accuracy)
- [Analyze the errors](#errors)

### Display an overview of the test data<a id='overview'></a>

In [None]:
summary_generator.generate_summary_statistics(test_df)
summary_generator.show_user_examples_per_intent(test_df)

### Compare the test data and the training data<a id='compare'></a>

Ideally the test and training data distributions should be similar. The following metrics can help identify gaps between the test set and the training set:

**1.**  The distribution of user examples per intent for the test data should be comparable to the training data   
**2.**  The average length of user examples in the test data should be comparable to the training data <br>
**3.**  The vocabulary and phrasing of utterances in the test data should be comparable to the training data

If your test data comprises examples labelled from your logs, and the training data comprises examples created by human subject matter experts, there may be discrepancies between what the virtual assistant designers thought the end users would type and the way they actually type in production. Thus, if you find discrepancies in this section, consider changing your design to resemble more closely the way in which end users use your system.

**<font color=red>Note</font>**: You will be charged for calls made from this notebook  based on your WA plan. The user_id will be the same for all message calls.

In [None]:
divergence_analyzer.analyze_train_test_diff(workspace_pd, test_df, full_results)
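
The three comparison metrics listed above can be approximated with a few lines of standard-library code. This is a simplified sketch under stated assumptions (whitespace tokenization, non-empty inputs, hypothetical helper names and sample data); `divergence_analyzer.analyze_train_test_diff` may compute these differently or report additional measures.

```python
from collections import Counter


def intent_distribution(examples):
    """Fraction of examples per intent (assumes a non-empty list)."""
    counts = Counter(intent for _, intent in examples)
    total = sum(counts.values())
    return {intent: n / total for intent, n in counts.items()}


def average_length(examples):
    """Average number of whitespace-separated tokens per utterance."""
    return sum(len(utterance.split()) for utterance, _ in examples) / len(examples)


def vocabulary_overlap(train, test):
    """Fraction of the test vocabulary that also appears in the training data."""
    train_vocab = {t for utterance, _ in train for t in utterance.lower().split()}
    test_vocab = {t for utterance, _ in test for t in utterance.lower().split()}
    return len(train_vocab & test_vocab) / len(test_vocab)


train = [
    ("hi there", "greeting"),
    ("hello", "greeting"),
    ("book a table", "booking"),
    ("reserve a table", "booking"),
]
test = [("hi", "greeting"), ("book table tonight", "booking")]

print(intent_distribution(train), intent_distribution(test))
print(average_length(train), average_length(test))
print(vocabulary_overlap(train, test))
```

A low vocabulary overlap or a large gap in average length suggests that the test utterances are phrased differently from the training examples.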

### Determine the overall accuracy on the test set<a id='accuracy'></a>

In [None]:
results = full_results[['correct_intent', 'top_confidence','top_intent','utterance']]
accuracy = inferencer.calculate_accuracy(results)
display(Markdown("### Accuracy on Test Data: {} %".format(accuracy)))
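
For reference, accuracy on a test set is the share of utterances whose top predicted intent matches the labelled intent. The sketch below expresses it as a percentage using plain dictionaries with the same column names as `results`; the helper name and sample rows are hypothetical, and the exact rounding applied by `inferencer.calculate_accuracy` is not assumed.

```python
def accuracy_percent(rows):
    """Percentage of rows where the top predicted intent equals the labelled intent."""
    if not rows:
        return 0.0
    correct = sum(1 for row in rows if row["top_intent"] == row["correct_intent"])
    return 100.0 * correct / len(rows)


# Made-up predictions for illustration
rows = [
    {"utterance": "hello", "correct_intent": "Greeting", "top_intent": "Greeting"},
    {"utterance": "hi there", "correct_intent": "Greeting", "top_intent": "Greeting"},
    {"utterance": "talk to a human", "correct_intent": "AgentHandoff", "top_intent": "AgentHandoff"},
    {"utterance": "someone help", "correct_intent": "AgentHandoff", "top_intent": "Greeting"},
]
print(accuracy_percent(rows))  # 75.0
```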

### Analyze the errors<a id='errors'></a>

This section gives you an overview of the errors made by the intent classifier on the test set.  

**Note**: `System Out of Domain` labels are assigned to user examples which get classified with confidence scores less than 0.2 as Watson Assistant considers them to be irrelevant.

In [None]:
wrongs_df = inferencer.calculate_mistakes(results)
display(Markdown("### Intent Detection Mistakes"))
display(Markdown("Number of Test Errors: {}".format(len(wrongs_df))))

with pd.option_context('max_colwidth', 250):
    if not wrongs_df.empty:
        display(wrongs_df)
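
The out-of-domain rule mentioned in the note above can be sketched as a simple threshold on the top confidence score. The label string `SYSTEM_OUT_OF_DOMAIN` and the helper name are placeholders chosen for this example; the exact label used by the library may differ.

```python
OFFTOPIC_LABEL = "SYSTEM_OUT_OF_DOMAIN"  # placeholder label for this sketch


def apply_threshold(top_intent, top_confidence, threshold=0.2):
    """Replace the predicted intent with the off-topic label when the
    confidence is below the threshold."""
    return OFFTOPIC_LABEL if top_confidence < threshold else top_intent


print(apply_threshold("Greeting", 0.15))  # SYSTEM_OUT_OF_DOMAIN
print(apply_threshold("Greeting", 0.85))  # Greeting
```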

<a id='part3'></a>
# Part 3: Perform advanced analysis

3.1 [Perform analysis using confidence thresholds](#part3.1)<br>
3.2 [Highlight term importance](#part3.2)<br>
3.3 [Analyze abnormal confidence levels](#part3.3)<br>


<a id='part3.1'></a>
## 3.1 Perform analysis using confidence thresholds

This analysis illustrates how the confidence threshold determines which utterances are considered irrelevant or out of domain, and therefore which data remains available for analysis.

In [None]:
analysis_df= confidence_analyzer.analysis(results,None)
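
To see how the choice of threshold changes the picture, the following sketch sweeps over several thresholds and reports, for each, the fraction of utterances that stay above the threshold and the accuracy among them. It is an illustrative approximation with hypothetical names and data; the metrics reported by `confidence_analyzer.analysis` are not reproduced here.

```python
def sweep_thresholds(rows, thresholds):
    """For each threshold, report the fraction of utterances kept
    (confidence >= threshold) and the accuracy among the kept utterances."""
    report = []
    for threshold in thresholds:
        kept = [row for row in rows if row["top_confidence"] >= threshold]
        correct = sum(1 for row in kept if row["top_intent"] == row["correct_intent"])
        report.append({
            "threshold": threshold,
            "kept_fraction": len(kept) / len(rows) if rows else 0.0,
            "accuracy_on_kept": correct / len(kept) if kept else 0.0,
        })
    return report


# Made-up predictions for illustration
rows = [
    {"correct_intent": "Greeting", "top_intent": "Greeting", "top_confidence": 0.9},
    {"correct_intent": "AgentHandoff", "top_intent": "AgentHandoff", "top_confidence": 0.6},
    {"correct_intent": "AgentHandoff", "top_intent": "Greeting", "top_confidence": 0.4},
    {"correct_intent": "Greeting", "top_intent": "AgentHandoff", "top_confidence": 0.1},
]
report = sweep_thresholds(rows, [0.2, 0.5])
for entry in report:
    print(entry)
```

Raising the threshold typically keeps fewer utterances but makes the remaining predictions more reliable, which is the trade-off this kind of analysis helps to examine.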

<a id='part3.2'></a>
## 3.2 Highlight term importance

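This section highlights the terms that matter for a given utterance with respect to a chosen intent. The intent can be the ground-truth intent or an incorrectly predicted one. The highlighting provides term-level insights into which terms the classifier thought were important in relation to that specific intent.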

Even if the system predicts an intent correctly, the terms which the intent classifier thought were important may not be as expected by human insight. Human insight might suggest that the intent classifier is focusing on the wrong terms.  

The score of each term in the following highlighted images can be viewed as the importance factor of that term for that specific intent. The larger the score, the more important the term.

You can get the highlighted images for either wrongly-predicted utterances or utterances where the classifier returned a low confidence.   

**<font color=red>Note</font>**: You will be charged for calls made from this notebook  based on your WA plan. The user_id will be the same for all message calls.

In [None]:
# Pick an example from section 1 which was misclassified
# Add the example and correct intent for the example
utterance = "what can i do to talk to someone"  # input example
intent = "Schedule An Appointment"  # input an intent in your workspace which you are interested in.

# Increase the timeout if you experience `TimeoutError`.
# Increasing `TIMEOUT` gives the process more breathing room to complete
TIMEOUT = 1 # `TIMEOUT` is set to 1 second
inference_results = inferencer.inference(conversation=conversation, 
                                         test_data=pd.DataFrame({'utterance':[utterance], 
                                                                 'intent':[intent]}), 
                                         max_thread = 1, 
                                         assistant_id=ASSISTANT_ID,
                                         intent_to_action_mapping=intent_to_action_mapping,
                                         timeout=TIMEOUT
                                        )

highlighter.get_highlights_in_batch_multi_thread(conversation=conversation, 
                                                 full_results=inference_results, 
                                                 output_folder=None,
                                                 confidence_threshold=1,
                                                 show_worst_k=1,
                                                 lang_util=lang_util,
                                                 assistant_id=ASSISTANT_ID,
                                                 intent_to_action_mapping=intent_to_action_mapping,
                                                )

In the section below you analyze your test results and produce highlighting for the top 25 problematic utterances which were either mistakes or had confidences below the threshold that was set.    

**<font color=red>Note</font>**: You will be charged for calls made from this notebook  based on your WA plan. The user_id will be the same for all message calls.

In [None]:
# The output folder for generated images
# Note: modify this if you want the generated images to be stored in a different directory

highlighting_output_folder = './highlighting_images/'
if not os.path.exists(highlighting_output_folder):
    os.mkdir(highlighting_output_folder)

# Predictions with a confidence below this threshold are treated as
# `out of domain` or `offtopic` utterances.
threshold = 0.2

# Maximum number of test set examples whose highlighting analysis will be conducted
K=25
highlighter.get_highlights_in_batch_multi_thread(conversation=conversation, 
                                                 full_results=full_results, 
                                                 output_folder=highlighting_output_folder,
                                                 confidence_threshold=threshold,
                                                 show_worst_k=K,
                                                 lang_util=lang_util,
                                                 assistant_id=ASSISTANT_ID,
                                                 intent_to_action_mapping=intent_to_action_mapping,
                                                )

<a id='part3.3'></a>
## 3.3 Analyze abnormal confidence levels
Every test utterance is classified as a specific intent with a specific confidence by the Watson Assistant intent classifier. The model is expected to be confident when it predicts examples correctly and not highly confident when it predicts examples incorrectly.

This is not always the case, which can indicate anomalies in the design. Examples that are predicted correctly with low confidence and examples that are predicted incorrectly with high confidence need to be reviewed.

In [None]:
correct_thresh, wrong_thresh = 0.3, 0.7
correct_with_low_conf_list, incorrect_with_high_conf_list = confidence_analyzer.abnormal_conf(
    full_results, correct_thresh, wrong_thresh)

In [None]:
if len(correct_with_low_conf_list) > 0:
    display(Markdown("#### Examples correctly predicted with low confidence"))
    with pd.option_context('max_colwidth', 250):
        display(HTML(correct_with_low_conf_list.to_html(index=False)))

In [None]:
if len(incorrect_with_high_conf_list) > 0:
    display(Markdown("#### Examples incorrectly predicted with high confidence"))
    with pd.option_context('max_colwidth', 250):
        display(HTML(incorrect_with_high_conf_list.to_html(index=False)))
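
The two abnormal groups can be expressed as simple filters over the prediction results. The sketch below uses plain dictionaries and strict comparisons against the two thresholds; whether `confidence_analyzer.abnormal_conf` treats the boundaries as strict or inclusive is an assumption, and the helper name and sample rows are hypothetical.

```python
def abnormal_confidence(rows, correct_thresh=0.3, wrong_thresh=0.7):
    """Split out correct predictions with low confidence and
    incorrect predictions with high confidence."""
    correct_low = [
        row for row in rows
        if row["top_intent"] == row["correct_intent"] and row["top_confidence"] < correct_thresh
    ]
    wrong_high = [
        row for row in rows
        if row["top_intent"] != row["correct_intent"] and row["top_confidence"] > wrong_thresh
    ]
    return correct_low, wrong_high


# Made-up predictions for illustration
rows = [
    {"utterance": "hello", "correct_intent": "Greeting", "top_intent": "Greeting", "top_confidence": 0.95},
    {"utterance": "yo", "correct_intent": "Greeting", "top_intent": "Greeting", "top_confidence": 0.25},
    {"utterance": "get me a person", "correct_intent": "AgentHandoff", "top_intent": "Greeting", "top_confidence": 0.8},
    {"utterance": "help", "correct_intent": "AgentHandoff", "top_intent": "Greeting", "top_confidence": 0.4},
]
correct_low, wrong_high = abnormal_confidence(rows)
print([row["utterance"] for row in correct_low])
print([row["utterance"] for row in wrong_high])
```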

### Actions to take when you have examples of abnormal confidence

If there are examples which are incorrectly classified with high confidence for specific intents, it may indicate an issue in the design of those specific intents because the user examples provided for that intent may be overlapping with the design of other intents.

If intent A seems to always get misclassified as intent B with high confidence or gets correctly predicted with low confidence, consider using intent conflict detection. For more information, refer to the <a href="https://cloud.ibm.com/docs/services/assistant?topic=assistant-intents#intents-resolve-conflicts" target="_blank" rel="noopener noreferrer">Conflict Resolution Documentation</a>.

Also consider whether those two intents need to be two separate intents or whether they need to be merged. If they can't be merged, then consider adding more user examples which distinguish intent A specifically from intent B.

<a id='part4'></a>
## Part 4: Summary
Congratulations! You have successfully completed the dialog/action skill analysis training. <br>
This notebook is designed to help you improve your dialog/action skill in an iterative fashion. Use it to tackle one aspect of your dialog/action skill at a time and start over for another aspect later for continuous improvement.

##  Glossary

**True Positives (TP):** True Positive measures the number of correctly predicted positive values meaning that predicted class is the same as the actual class which is the target intent.

**True Negatives (TN):** True Negative measures the number of correctly predicted negative values meaning that the predicted class is the same as the actual class which is not the target intent.

**False Positives (FP):** False Positive measures the number of incorrectly predicted positive values meaning that the predicted class is the target intent but the actual class is not the target intent.  

**False Negatives (FN):** False Negative measures the number of incorrectly predicted negative values meaning that the predicted class is not the target intent but the actual class is the target intent. 

**Accuracy:** Accuracy measures the ratio of correctly predicted user examples out of all user examples.   
Accuracy = (TP + TN) / (TP + TN + FP + FN)  

**Precision:** Precision measures the ratio of correctly predicted positive observations out of total predicted positive observations.   
Precision = TP / (TP + FP)  

**Recall:** Recall measures the ratio of correctly predicted positive observations out of all observations of the target intent.  
Recall = TP / (TP + FN)

**F1 Score:** F1 Score is the harmonic mean of Precision and Recall.  
F1 = 2 \* (Precision \* Recall) / (Precision + Recall)
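
The glossary formulas can be checked with a small worked example. The sketch below counts TP, TN, FP, and FN for one target intent from a list of (actual, predicted) pairs and then applies the formulas above; the function name and the sample data are hypothetical.

```python
def intent_metrics(pairs, target):
    """Compute accuracy, precision, recall, and F1 for one target intent.

    `pairs` is a list of (actual_intent, predicted_intent) tuples.
    """
    tp = sum(1 for actual, predicted in pairs if actual == target and predicted == target)
    tn = sum(1 for actual, predicted in pairs if actual != target and predicted != target)
    fp = sum(1 for actual, predicted in pairs if actual != target and predicted == target)
    fn = sum(1 for actual, predicted in pairs if actual == target and predicted != target)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up (actual, predicted) pairs for illustration
pairs = [
    ("Greeting", "Greeting"),
    ("Greeting", "AgentHandoff"),
    ("AgentHandoff", "Greeting"),
    ("AgentHandoff", "AgentHandoff"),
    ("AgentHandoff", "AgentHandoff"),
]
metrics = intent_metrics(pairs, "Greeting")
print(metrics)
```

For the target intent `Greeting`, this data gives TP = 1, TN = 2, FP = 1, and FN = 1, so accuracy is 0.6 and precision, recall, and F1 are each 0.5.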

For more information related to Watson Assistant, refer to the <a href="https://cloud.ibm.com/docs/services/assistant" target="_blank" rel="noopener noreferrer">Watson Assistant Documentation</a>.

###  Authors

**Cheng Qian** is a data scientist at IBM Watson focusing on machine learning algorithms for IBM Watson Assistant conversational language understanding. Most recently he has been working on irrelevance detection algorithms for task oriented dialog systems. He is interested in applications of NLP/NLU and its theories.

**Haode Qi** is a data scientist at IBM Watson who delivers new machine learning algorithms into IBM Watson's market leading conversational AI service. He works with clients to help improve their conversational AI agents and helps them tackle complex challenges at scale with tools like Dialog Skill Analysis. His work primarily focuses on natural language technology, in particular intent recognition and multilingual technology.


### Previous Authors
**Navneet Rao** is an engineering lead at IBM Watson who believes in building unique AI-powered experiences which augment human capabilities. He currently works on AI innovation & research for IBM's award-winning conversational computing platform, the IBM Watson Assistant. His primary areas of interest include machine learning problems related to conversational AI, natural language understanding, semantic search & transfer learning.

**Ming Tan**, PhD, is a research scientist at IBM Watson who works on prototyping and productizing various algorithmic features for the IBM Watson Assistant. His research interests include a broad spectrum of problems related to conversational AI such as low-resource intent classification, out-of-domain detection, multi-user chat channels, passage-level semantic matching and entity detection. His work has been published at various top tier NLP conferences.

**Yang Yu**, PhD, is a research scientist at IBM Watson focusing on problems related to language understanding, question answering, deep learning and representation learning for various NLP tasks. He has been awarded by IBM for his contributions to several internal machine learning competitions which have included researchers from across the globe. Novel machine learning solutions designed by him have helped solve critical question answering and human-computer dialog problems for various IBM Watson products.

<hr>
Copyright &copy; IBM Corp. 2019. This notebook and its source code are released under the terms of the Apache License, Version 2.0.
