# Log Analysis

This notebook helps you gather log files, extract fields of interest, and use a variety of analytical recipes to learn about your log data.  Use these recipes as a starting point for developing your own analytics!

## Housekeeping
Run the next cell as-is to load the prerequisites used by this notebook.

In [None]:
!curl -O https://raw.githubusercontent.com/cognitive-catalyst/WA-Testing-Tool/master/log_analytics/extractConversations.py
!curl -O https://raw.githubusercontent.com/cognitive-catalyst/WA-Testing-Tool/master/log_analytics/getAllLogs.py
!curl -O https://raw.githubusercontent.com/cognitive-catalyst/WA-Testing-Tool/master/log_analytics/intent_heatmap.py

%load_ext autoreload
%autoreload 2

!pip install squarify
!pip install ibm-watson

import pandas as pd
import getAllLogs
import extractConversations
import intent_heatmap

## Configuration and log collection
The next few cells require some configuration.  Review the variables and update them for your specific assistant.  The comments in the cells guide you in the configuration.
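
As a small illustration of two of the settings configured below, this sketch builds a date-range filter string in the same `response_timestamp` syntax used in this notebook and works out the maximum number of log events a given paging configuration can return. The helper `build_log_filter` and the `example_` variables are hypothetical names introduced for this example only; they are not part of `getAllLogs`.

```python
from datetime import date

def build_log_filter(start: date, end: date) -> str:
    """Hypothetical helper: format a date range in the filter syntax used in this notebook."""
    return "response_timestamp>={},response_timestamp<{}".format(
        start.isoformat(), end.isoformat()
    )

example_filter = build_log_filter(date(2020, 1, 1), date(2020, 2, 15))
print(example_filter)

# Upper bound on retrieved log events: number of pages multiplied by page size
example_page_size_limit = 500
example_page_num_limit = 20
example_max_events = example_page_size_limit * example_page_num_limit
print("Maximum events retrieved:", example_max_events)
```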

In [None]:
# Extract logs from your assistant.

# API key, URL, and workspace ID are available on the "View API Details" page
iam_apikey="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
#url pattern depends on region and when it was created (update one to match your instance)
url="https://gateway-wdc.watsonplatform.net/assistant/api"
#url="https://api.us-east.assistant.watson.cloud.ibm.com"
# Workspace ID is found inside the legacy URL pattern: {url}/v1/workspaces/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/message
workspace_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

#Filter API is described at: https://cloud.ibm.com/docs/assistant?topic=assistant-filter-reference#filter-reference
# For a multi-skill assistant pass `workspace_id=None` and include "request.context.system.assistant_id" or "request.context.metadata.deployment" in the filter
log_filter="response_timestamp>=2012-02-01,response_timestamp<2020-02-15"

#Change the number of logs retrieved; the default settings return up to 10,000 logs (20 pages of 500)
page_size_limit=500
page_num_limit=20

#WA API version
version="2018-09-20" 

rawLogsJson = getAllLogs.getLogs(iam_apikey, url, workspace_id, log_filter, page_size_limit, page_num_limit, version)

In [None]:
#Optionally, save the records to disk by uncommenting these lines.  This lets you skip ahead the next time you load this notebook.
# rawLogsPath="watson_assistant_log_events.json"
# getAllLogs.writeLogs(rawLogsJson, rawLogsPath)

In [None]:
# Define the conversation correlation field name for your Watson Assistant records.
# Provide the field name as it appears in the log payload (default is 'response.context.conversation_id')
# For a single-skill assistant use 'response.context.conversation_id'
# For a Voice Gateway/Voice Agent assistant use 'request.context.vgwSessionID'
# For a multi-skill assistant you will need to provide your own key
primaryLogKey = "response.context.conversation_id"

# Name of the correlating key as it appears in the data frame columns (remove 'response.context.')
conversationKey='conversation_id'

# Optionally provide a comma-separated list of custom fields you want to extract, in addition to the default fields
#customFieldNames = "response.context.STT_CONFIDENCE,response.context.action,response.context.vgwBargeInOccurred"
customFieldNames = None

#If you have previously stored your logs on the file system, you can reload them here by uncommenting these lines
# rawLogsPath="watson_assistant_log_events.json"
# rawLogsJson = extractConversations.readLogs(rawLogsPath)

allLogsDF   = extractConversations.extractConversationData(rawLogsJson, primaryLogKey, customFieldNames)
conversationsGroup = allLogsDF.groupby(conversationKey,as_index=False)

print("Total log events:",len(allLogsDF))
allLogsDF.head()

# Recipes
The remainder of the notebook is a collection of analytic recipes to help you get started.

The recipes cover a number of common patterns, starting with very simple analytics and expanding to more complex analytics.  You can consider this an introduction to using common Pandas concepts to analyze Watson Assistant log events.  This notebook makes heavy use of both DataFrames and the Groupby-Apply-Combine pattern.

A few of the recipes include rudimentary visualizations; however, the focus of this notebook is to create analytic summaries.  A full treatment of analytic visualizations is beyond the scope of this notebook.
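
If the Groupby-Apply-Combine pattern is new to you, the following minimal sketch shows it on a toy DataFrame. The data and the names `toy_df`, `turns_per_conversation`, and `mean_confidence` are made up for illustration; the real `allLogsDF` has many more columns.

```python
import pandas as pd

# Toy data standing in for allLogsDF; the values are made up for illustration
toy_df = pd.DataFrame({
    "conversation_id": ["a", "a", "b", "b", "b"],
    "intent_confidence": [0.9, 0.5, 0.8, 0.6, 0.7],
})

# Split the rows by conversation, apply an aggregation to each group,
# and combine the per-group results into a single Series
turns_per_conversation = toy_df.groupby("conversation_id")["intent_confidence"].count()
mean_confidence = toy_df.groupby("conversation_id")["intent_confidence"].mean()

print(turns_per_conversation)
print(mean_confidence)
```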

## Recipe: Count unique conversations
A simple metric to start us off.

In [None]:
print("Total log events:",len(allLogsDF))
print("Total conversations:",len(allLogsDF[conversationKey].unique()))

## Recipe: Number of times a given node is visited
Dialog nodes are not created equal.  For a node of interest we will want to know how many times it is visited.  This example uses `node_5_1546894657426` but you should replace this value with a node from your own assistant.
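
The filter used in this recipe builds one True/False value per log event and uses that list as a row mask. The sketch below shows the idea on a toy DataFrame; `toy_df`, `toy_node`, and the node names are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Toy data: each row is one log event with the list of dialog nodes it visited
toy_df = pd.DataFrame({
    "conversation_id": ["a", "a", "b"],
    "nodes_visited": [["node_1"], ["node_2", "node_3"], ["node_3"]],
})
toy_node = "node_3"  # hypothetical node identifier

# One boolean per row: does this event's list of visited nodes contain the target?
mask = [toy_node in nodes for nodes in toy_df["nodes_visited"]]
toy_visits = toy_df[mask]

print(mask)
print("Total visits to target node:", len(toy_visits))
```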

In [None]:
node_to_search="node_5_1546894657426"
node_visits_as_frame = allLogsDF[[node_to_search in x for x in allLogsDF['nodes_visited']]]

print("Total visits to target node:",len(node_visits_as_frame))

## Recipe: Number of unique conversations visiting a node
Conversations may visit the same node twice.  We can build upon the previous cell and apply an additional filter to count unique conversations meeting this condition.
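
As a related sketch, you can also find which conversations visited the node more than once. The example uses a toy frame with one row per visit; `toy_visits` and its contents are made up, and `nunique()` is simply a shorter way of writing `len(... .unique())`.

```python
import pandas as pd

# Toy data: one row per visit to the target node
toy_visits = pd.DataFrame({"conversation_id": ["a", "a", "b", "c"]})

# Number of distinct conversations that reached the node
unique_visitors = toy_visits["conversation_id"].nunique()

# Visits per conversation, then keep only conversations with repeat visits
visits_per_conversation = toy_visits.groupby("conversation_id").size()
repeat_visitors = visits_per_conversation[visits_per_conversation > 1]

print("Unique visitors:", unique_visitors)
print(repeat_visitors)
```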

In [None]:
print("Unique visitors to target node:",len(node_visits_as_frame[conversationKey].unique()))

## Recipe: Percentage of conversations visiting a node
You may be interested in how many conversations start a particular type of dialog flow such as authentication or escalation and can easily count that with this pattern.

You may instead decide to alter this recipe to identify the specific conversations that do (or do not) reach a dialog node.  You can use set arithmetic on conversation identifiers, where each set is the group of conversations that reach a given dialog node.  The example below demonstrates finding conversations that did not visit a dialog node by getting all conversation identifiers and then removing the conversations that reached that node.
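
The set arithmetic described above can be sketched with plain Python sets. The identifiers here are made up; in the notebook they would come from `allLogsDF[conversationKey].unique()`. The sketch also shows how the counts turn into a percentage.

```python
# Toy conversation identifiers, made up for illustration
all_ids = {"a", "b", "c", "d"}
visited_ids = {"a", "c"}

# Set difference: conversations that never reached the node
not_visited_ids = all_ids - visited_ids

# Share of conversations that reached the node, as a percentage
percent_visited = 100.0 * len(visited_ids) / len(all_ids)

print("Did not visit:", sorted(not_visited_ids))
print("Percent visited: {:.1f}%".format(percent_visited))
```

The same approach extends to other questions, for example using `&` (intersection) on two sets to find conversations that reached both of two nodes.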

In [None]:
def did_conversation_visit_node(df:pd.DataFrame, conversationKey, node_to_search):
   visited_ids = df[[node_to_search in x for x in df['nodes_visited']]][conversationKey].unique()
   all_ids     = df[conversationKey].unique()
   not_visited_ids = list(set(all_ids) - set(visited_ids))

   return pd.Series({
       True: len(visited_ids),
       False: len(not_visited_ids)
   })

node_to_search="node_5_1546894657426"

visitsToNode = did_conversation_visit_node(allLogsDF, conversationKey, node_to_search)
visitsToNode

### Sub-recipe: Plotting a result
The Series returned by the previous cell may be easily plotted.
There are many, many visualizations possible in Python notebooks, and further plotting is out of the scope of this notebook.

In [None]:
visitsToNode.plot(kind='pie',figsize=(16,8),title="Did conversation visit {}?".format(node_to_search))

# Recipe Group: Collecting data for blind testing or new ground truth
Whether we want to assess the performance of our classifier via a blind test or gather new ground truth training data we need a quick way to extract what our users are saying to open-ended questions.  There are multiple ways to extract these utterances depending on the type of assistant.

Regardless of method the general recipe is:

1. Extract user utterances and intents assigned by Watson Assistant
2. Use SMEs to provide the actual intent of each utterance
3. Assess test performance and update training (e.g., via the [Dialog Skill Analysis notebook](https://medium.com/ibm-watson/announcing-dialog-skill-analysis-for-watson-assistant-83cdfb968178))
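
A minimal sketch of the assessment in step 3, assuming the SMEs have added their labels in a column alongside the predictions. The column name `Actual Intent`, the frame `reviewed_df`, and all of its rows are hypothetical; in practice you would load the reviewed file with `pd.read_csv`.

```python
import pandas as pd

# Hypothetical SME-reviewed results; "Actual Intent" is an assumed column name
reviewed_df = pd.DataFrame({
    "Utterance": ["reset my password", "where is my order", "talk to a person"],
    "Predicted Intent": ["reset_password", "order_status", "order_status"],
    "Actual Intent": ["reset_password", "order_status", "escalate"],
})

# A prediction is correct when it matches the SME label
is_correct = reviewed_df["Predicted Intent"] == reviewed_df["Actual Intent"]
accuracy = is_correct.mean()

# The misclassified rows are candidates for new training examples
misclassified = reviewed_df[~is_correct]

print("Blind test accuracy: {:.2f}".format(accuracy))
print(misclassified)
```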


## Recipe: Gathering initial user responses via hardcoded dialog turn number
For a single-skill assistant we can use the `dialog_turn_counter` field to extract utterances on a given turn.  This field uses a 1-based index, i.e., the first turn is index=1 (Python generally assumes a 0-based index).

If the user speaks first, set `USER_FIRST_TURN_COUNTER=1`.  If the assistant speaks first, set `USER_FIRST_TURN_COUNTER=2`.
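
If you are unsure which value applies, one possible check is to look at the user input on turn 1. This sketch rests on an assumption that is not stated in the notebook: when the assistant speaks first, the turn-1 log events carry an empty `input.text`. The toy data and the names `first_turns` and `suggested_counter` are illustrative only.

```python
import pandas as pd

# Toy data: two conversations in which the assistant opens with a greeting
toy_df = pd.DataFrame({
    "conversation_id": ["a", "a", "b", "b"],
    "dialog_turn_counter": [1, 2, 1, 2],
    "input.text": ["", "I need help", "", "check my balance"],
})

first_turns = toy_df[toy_df["dialog_turn_counter"] == 1]

# Assumption: an empty input on turn 1 means the assistant opened the conversation
assistant_speaks_first = bool((first_turns["input.text"] == "").all())
suggested_counter = 2 if assistant_speaks_first else 1

print("Suggested USER_FIRST_TURN_COUNTER:", suggested_counter)
```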


In [None]:
USER_FIRST_TURN_COUNTER=2
userFirstTurnView = allLogsDF[allLogsDF['dialog_turn_counter']==USER_FIRST_TURN_COUNTER]
userFirstTurnDF = userFirstTurnView[["input.text","intent","intent_confidence"]]

userFirstTurnDF.head()

## Recipe: Write out the user utterances to a file
Dataframes are easily exported to a comma-separated file which is easily imported into Excel and other tools.
For a blind test you need at least the user utterance and the predicted intent.
When you have SMEs review the intents you should mindfully select one of these two options:

1. Include the predicted intent and let the SME make corrections.  This is the fastest approach but may bias the SMEs towards what was already predicted.
2. Remove the predicted intent.  This is more time-consuming for SMEs but generates unbiased labels.

This file-writing code can be used with any of the "gather response patterns" in this notebook.

In [None]:
# Uncomment ONE of the patterns
# Pattern 1: Write out all utterances and predictions
#userFirstTurnDF.to_csv("utterances.csv",index=False,header=["Utterance","Predicted Intent", "Prediction Confidence"])

#Pattern 2: Write only the user utterance
# userFirstTurnDF = userFirstTurnView[["input.text"]]
# userFirstTurnDF.to_csv("utterances2.csv",index=False,header=["Utterance"])

## Recipe: Gathering user responses to a given dialog node
Our virtual assistant may have open-ended questions that use intents on turns other than the first user turn.  In this case we want to find conversations where a dialog node is visited and look at the next utterance from the user.

In this recipe we make use of the `prev_nodes_visited` field, which is a "shift" of the `nodes_visited` field.  The contents of `nodes_visited` for message `n` are available in `prev_nodes_visited` of message `n+1`.
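
To make the "shift" concrete, here is a sketch of how such a column could be derived with pandas. This is an illustration on toy data, not necessarily how `extractConversations` computes the field, and it assumes the rows are already in chronological order within each conversation.

```python
import pandas as pd

# Toy data: rows are assumed to be in chronological order within each conversation
toy_df = pd.DataFrame({
    "conversation_id": ["a", "a", "a", "b", "b"],
    "nodes_visited": [["welcome"], ["ask_name"], ["goodbye"], ["welcome"], ["ask_name"]],
})

# Shift nodes_visited down by one row, separately for each conversation
toy_df["prev_nodes_visited"] = toy_df.groupby("conversation_id")["nodes_visited"].shift(1)

# The first turn of each conversation has no predecessor (NaN);
# replace it with an empty list so that `in` checks keep working
toy_df["prev_nodes_visited"] = toy_df["prev_nodes_visited"].apply(
    lambda nodes: nodes if isinstance(nodes, list) else []
)

print(toy_df)
```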

In [None]:
def responses_to_node(df:pd.DataFrame, node_to_search:str):
    responses_df = df[[node_to_search in x for x in df['prev_nodes_visited']]]
    #Remove turns where the user provided no input text
    responses_df = responses_df[responses_df['input.text'] != '']
    return responses_df[['input.text','intent','intent_confidence']]

nodeResponsesDF = responses_to_node(allLogsDF, node_to_search)
nodeResponsesDF.head()

# Miscellaneous recipes
The remaining recipes demonstrate more advanced analytic patterns.

## Recipe: Summarizing the response distribution
It's helpful to know what kinds of responses are more common than others and how the system responds to them.

For instance, we can determine the number of times each intent is identified and its average confidence.

We will build this assessment using a summarization of one of the previous dataframes.
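
As an alternative sketch of the same summarization, pandas named aggregation (available since pandas 0.25) produces flat column names directly, so no flattening step is needed. The toy data below is made up for illustration.

```python
import pandas as pd

# Toy first-turn utterances with predicted intents, made up for illustration
toy_df = pd.DataFrame({
    "input.text": ["hi", "hello", "reset password", "bye"],
    "intent": ["greeting", "greeting", "reset_password", "goodbye"],
    "intent_confidence": [0.9, 0.7, 0.95, 0.6],
})

# Named aggregation: output_column=(input_column, aggregation)
toy_summary = toy_df.groupby("intent", as_index=False).agg(
    count=("input.text", "count"),
    confidence=("intent_confidence", "mean"),
)

print(toy_summary)
```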

In [None]:
# Using pandas aggregators to count how often each intent is selected and its average confidence
userIntentSummaryDF = userFirstTurnDF.groupby('intent',as_index=False).agg({
   'input.text': ['count'], 
   'intent_confidence': ['mean']
})

userIntentSummaryDF.columns=["intent","count","confidence"] #Flatten the column headers for ease of use

# Label the empty intent so that dashboard reports are easier to read
userIntentSummaryDF.loc[userIntentSummaryDF["intent"]=="", "intent"] = "(no intent found)"

userIntentSummaryDF.head()

#You can also write the summary to a CSV for external review
#userIntentSummaryDF.to_csv("intent_summary.csv",index=False,header=["Intent","Total Predictions", "Average Confidence"])

### Sub-recipe: visualizing intents summary in a tree map
Tree maps are one of my favorite visualizations because they let you look at two dimensions at once. One dimension is expressed in size and the other via color and/or placement.

For instance you can visualize intents using:
* Number of times intent appears as the size
* Average confidence of intent as color/placement (high confidence as green and lower-left, low confidence as red and upper-right)

Thus the largest boxes in the top-right quadrant are areas you should first focus improvement on.
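
The same prioritization can be approximated in tabular form. In this sketch, intents below a confidence cut-off are listed from most to least frequent. The cut-off `CONFIDENCE_THRESHOLD` is a hypothetical value chosen for illustration, and the toy summary mimics the columns of `userIntentSummaryDF`.

```python
import pandas as pd

# Toy intent summary with the same columns as userIntentSummaryDF
toy_summary = pd.DataFrame({
    "intent": ["greeting", "order_status", "escalate", "reset_password"],
    "count": [120, 300, 40, 80],
    "confidence": [0.95, 0.55, 0.45, 0.90],
})

CONFIDENCE_THRESHOLD = 0.7  # hypothetical cut-off; tune for your assistant

# Keep low-confidence intents and list the most frequent ones first
focus_df = toy_summary[toy_summary["confidence"] < CONFIDENCE_THRESHOLD]
focus_df = focus_df.sort_values("count", ascending=False)

print(focus_df)
```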

In [None]:
intent_heatmap.generateTreemap(userIntentSummaryDF, 'count', 'confidence', 'intent', 'Classifier confidence by intent')

## Recipe: Extract turns of interest with audio and transcription details

In a voice-enabled assistant it is useful to extract snippets of audio worth listening to.  A voice conversation may last several minutes but you may wish to quickly identify an audio segment of interest.

In this recipe we will identify responses to a turn of interest and note their start/end time within the audio file.  Optionally we will augment these responses with speech transcription confidence as `STT_CONFIDENCE` (your orchestration layer will have to pass this confidence from the speech engine to the voice assistant).

The output table includes `conversationKey`, `message_start`, and `message_end` so that audio segments can be located.  The `conversationKey` will help you find the relevant call recording.  The `message_start` is the HH:MM:SS time of the assistant's statement to the user (the time to fast-forward to).  The `message_end` is the HH:MM:SS time for the completion of the user's response.
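
Since both timestamps are described as HH:MM:SS values, the length of each audio segment can be derived from them. This is a sketch on toy data; it assumes the columns hold HH:MM:SS strings, and the column name `segment_seconds` is introduced only for this example.

```python
import pandas as pd

# Toy rows with HH:MM:SS offsets into the call recording
toy_df = pd.DataFrame({
    "conversation_id": ["a", "b"],
    "message_start": ["00:00:05", "00:01:30"],
    "message_end": ["00:00:09", "00:01:42"],
})

# Parse the HH:MM:SS strings as durations and subtract to get segment length
start = pd.to_timedelta(toy_df["message_start"])
end = pd.to_timedelta(toy_df["message_end"])
toy_df["segment_seconds"] = (end - start).dt.total_seconds()

print(toy_df)
```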

If you get an error `KeyError: "['STT_CONFIDENCE'] not in index"`, this indicates that your log records do not contain `STT_CONFIDENCE` or that you have not extracted it with `customFieldNames` in the "Configuration and log collection" section.
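
One way to avoid this error is to select only the columns that are actually present. The sketch below shows the idea on a toy frame that lacks `STT_CONFIDENCE`; the names `wanted_columns`, `available_columns`, and `missing_columns` are illustrative.

```python
import pandas as pd

# Toy frame that does not contain STT_CONFIDENCE
toy_df = pd.DataFrame({
    "conversation_id": ["a"],
    "message_start": ["00:00:05"],
    "message_end": ["00:00:09"],
    "input.text": ["yes"],
})

wanted_columns = ["conversation_id", "message_start", "message_end",
                  "input.text", "STT_CONFIDENCE"]

# Split the wanted columns into those present in the frame and those missing
available_columns = [c for c in wanted_columns if c in toy_df.columns]
missing_columns = [c for c in wanted_columns if c not in toy_df.columns]

subset_df = toy_df[available_columns]
print("Skipped missing columns:", missing_columns)
print(subset_df)
```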

In [None]:
def speech_responses_to_node(df:pd.DataFrame, conversationKey:str, node_to_search:str):
    responses_df = df[[node_to_search in x for x in df['prev_nodes_visited']]]
    #Remove turns where the user provided no input text
    responses_df = responses_df[responses_df['input.text'] != '']
    return responses_df[[conversationKey,'message_start','message_end', 'input.text','intent','intent_confidence','STT_CONFIDENCE']]

voiceResponsesDF = speech_responses_to_node(allLogsDF, conversationKey, node_to_search)
voiceResponsesDF.head()