# Lab5 Final assignment: putting things together

The final assignment is an individual assignment in which you have to put things together:

   1. **Create and annotate a conversation**
       1. **BONUS POINT**: adapt the **eliza_language.py** to get better Eliza responses that trigger Ekman emotions 
       2. create an emotional conversation between a **fake** human and your version of Eliza consisting of 50 human input prompts (a total of 100 turn ids including the prompts from Eliza): use **eliza-chat.ipynb**.
       3. annotate the conversation with the gold Ekman emotion labels and save the result to disk: use **eliza-chat.ipynb**.
   2. **Automatically classify the emotions using different classifiers**
       1. load the annotated conversation from disk in this notebook as a Pandas dataframe and apply the following emotion classifiers to your conversation:
       2. BoW SVM classifiers trained on 1) MELD, 2) Tweets and 3) MELD+Tweets data.
       3. BERT finetuned with GO_emotions (as shown below)
       4. **BONUS POINT**: Apply VADER to get sentiment scores for each utterance
   3. **Evaluate the classifiers against your gold annotations**
       1. Create a classification report and confusion matrix. Only evaluate the human input and ignore the Eliza prompts!
       2. Discuss their performance and results: 1) report on similarities and differences, 2) formulate what you expected from each model in terms of recall and precision and whether this is confirmed or not by the results, 3) try to explain what did NOT confirm your expectations.
       3. **BONUS POINT**: If you also applied VADER and mapped the Ekman labels to sentiment, you can evaluate all the classifiers at the sentiment level as well.
   4. **Formulate what could be done to improve the automatic classification**
       1. How can you improve the classifiers by 1) using different training data, 2) processing the training data differently?
       2. Reflect on using different emotion labels: sentiment, Ekman or GO.

To be able to run the experiment, you need to have an emotional conversation with Eliza using the notebook **eliza_chat.ipynb** with 50 human turns as input, annotate this conversation with the Ekman emotion labels and save it to disk. Try to make it a **fake** but emotional conversation: an emotional roller coaster that exhibits many different emotions, so that the classifiers are properly tested. You can earn at most **1 bonus point** by adapting Eliza's responses in an intelligent way so that they elicit Ekman emotions more effectively.

For the automatic classifications, you can use the BoW SVM classifiers that you created in Lab3. You should have saved these classifiers to disk and should load them in this notebook. You should also build a third BoW SVM classifier by combining the MELD and Tweet data into a single set of training data. Save the MELD+Tweets classifier to disk as well and load it in this notebook. You can look at **aggregating_results_across_systems.ipynb** to see how to load different classifiers from disk and how to aggregate the results in a single pandas dataframe.
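A minimal sketch of how the combined classifier could be trained, saved and loaded again, assuming the training data are available as lists of texts and labels. The toy sentences, the pipeline setup and the file name `bow_svm_meld_tweets.pkl` are illustrative assumptions and not the Lab3 code; reuse the vectorizer settings from your own Lab3 notebook and make sure both datasets use the same label names before you combine them.

```python
import os
import pickle
import tempfile

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-ins for the two training sets (replace with your MELD and Tweets data).
meld_texts = ["I am so happy for you", "This makes me furious", "Okay, see you later"]
meld_labels = ["joy", "anger", "neutral"]
tweet_texts = ["I feel so sad and lonely", "What a wonderful happy day", "I am furious about this"]
tweet_labels = ["sadness", "joy", "anger"]

# Combine both sets into a single training set.
train_texts = meld_texts + tweet_texts
train_labels = meld_labels + tweet_labels

# Bag-of-words features followed by a linear SVM, bundled in one pipeline
# so that the vectorizer is saved together with the classifier.
meld_tweets_clf = make_pipeline(CountVectorizer(), LinearSVC())
meld_tweets_clf.fit(train_texts, train_labels)

# Save to disk and load again (hypothetical file name in a temporary folder).
model_path = os.path.join(tempfile.gettempdir(), "bow_svm_meld_tweets.pkl")
with open(model_path, "wb") as f:
    pickle.dump(meld_tweets_clf, f)
with open(model_path, "rb") as f:
    loaded_clf = pickle.load(f)

print(loaded_clf.predict(["I am sad"]))
```

Saving the whole pipeline rather than only the SVM keeps the vocabulary and the classifier in sync when the model is loaded in another notebook.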

Below, we already show how you can apply the finetuned BERT classifier to the loaded conversation and add the result to the pandas dataframe. You can simply apply this to your own conversation. If, for some reason, you are not able to load and run the transformer model on your computer, do the following:

   1. Send me your conversation with the annotation saved in CSV
   2. I will apply the transformer model for you and send back the CSV with the GO annotations
   3. Load the CSV with the GO annotations and proceed from there

Once the pandas dataframe is complete with the annotations from all the classifiers, save it to disk for future use.

For the evaluation of the classifiers, you need to extract the gold labels and the predicted labels for each classifier separately. Use the **sklearn** functions to generate a classification report and a confusion matrix (you can use **seaborn** to make the image nicer). You can earn at most **1 bonus point** if you also apply VADER to the conversation and evaluate the systems at the sentiment level. Note that you need to provide a mapping from Ekman to sentiment for your BoW classifiers. You can use the functions in **lab5_util.py** to apply such a mapping to your results.
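The evaluation step could look roughly like the sketch below. It uses a small toy dataframe with the column names that appear in this notebook (`speaker`, `Gold`, `Go_Ekman`) and assumes that Eliza's turns are marked with the speaker name `Eliza`; the toy rows themselves are made up.

```python
import pandas as pd
from sklearn.metrics import classification_report, confusion_matrix

# Toy frame with the column names used in this notebook (rows are made up).
toy_df = pd.DataFrame({
    "utterance": ["Hello. How are you?", "I am sad", "Why?", "I won the lottery", "Tell me more", "Bad"],
    "speaker": ["Eliza", "Piek", "Eliza", "Piek", "Eliza", "Piek"],
    "Gold": ["neutral", "sadness", "neutral", "joy", "neutral", "sadness"],
    "Go_Ekman": ["neutral", "sadness", "neutral", "joy", "neutral", "neutral"],
})

# Keep only the human turns; the Eliza prompts are not evaluated.
human_df = toy_df[toy_df["speaker"] != "Eliza"]
gold = human_df["Gold"]
predicted = human_df["Go_Ekman"]

# Fix the label order so that the rows and columns of the matrix are interpretable.
labels = sorted(set(gold) | set(predicted))

report = classification_report(gold, predicted, labels=labels, zero_division=0)
matrix = confusion_matrix(gold, predicted, labels=labels)
print(report)
print(pd.DataFrame(matrix, index=labels, columns=labels))
```

In the confusion matrix the rows are the gold labels and the columns are the predicted labels. To plot it with seaborn, `sns.heatmap(matrix, annot=True, xticklabels=labels, yticklabels=labels)` is one option.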

When you report the results, it makes sense to combine them in a single table. Instead of four separate Ekman classification tables, you can give the recall, precision and F1 scores of the four classifiers side by side for each emotion, together with the averages, in one multicolumn table. This makes it easier to observe differences and similarities.
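One possible way to build such a combined table with pandas is sketched below. The gold labels, the system names and their predictions are made-up placeholders; in the notebook they would come from the columns of the dataframe.

```python
import pandas as pd
from sklearn.metrics import precision_recall_fscore_support

gold = ["sadness", "joy", "anger", "sadness", "joy", "neutral"]
# Hypothetical system outputs; in the notebook these come from the dataframe columns.
system_predictions = {
    "BoW-MELD": ["sadness", "joy", "neutral", "neutral", "joy", "neutral"],
    "GO-BERT": ["sadness", "joy", "anger", "sadness", "neutral", "neutral"],
}
labels = sorted(set(gold))

per_system = {}
for name, predicted in system_predictions.items():
    precision, recall, f1, _ = precision_recall_fscore_support(
        gold, predicted, labels=labels, zero_division=0
    )
    scores = pd.DataFrame({"P": precision, "R": recall, "F1": f1}, index=labels)
    # Macro average: the unweighted mean of the per-label scores.
    scores.loc["macro avg"] = scores.mean()
    per_system[name] = scores

# Concatenate side by side; the dictionary keys become the top column level.
results = pd.concat(per_system, axis=1).round(2)
print(results)
```

Each row is an emotion and each system contributes a group of three columns, so differences between systems can be read off per emotion.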

## 1. Submission

The assignment should be submitted individually on CANVAS as a zipfile and include the following:

   1. Optional: the **eliza_language.py** for the adapted Eliza 
   2. The notebook that creates the BoW classifier combining MELD and Tweets
   3. This notebook where you:
       1. load the emotional conversation with the gold labels
       2. load and apply the 4 classifiers (or 5 if VADER is included)
       3. add the classifier output to the pandas dataframe and save the result in a CSV file
       4. aggregate the human gold and system labels
       5. generate the classification report and the confusion matrix for Ekman (and possibly for sentiment)
   4. A CSV file containing the conversation with the gold and system labels
   5. A PDF report of max 5 pages (6 pages if you included VADER) in which you describe in three sections:
       1. what you have done and why: be explicit about changes you made, e.g. to Eliza or training on MELD+Tweets
       2. report on the results: use a single table for recall, precision and f-score and put the confusion matrices in the appendix.
       3. have a discussion on the results and how to improve the classifiers (see above)

If any of items 2, 3, 4 or 5 is missing from the zip file, your submission will not be graded. Note that item 1 is optional.

## 2. Loading the conversation saved on disk

Using the notebook **eliza-chat.ipynb**, you can create a conversation with Eliza and save it to disk. For this final assignment, we ask each group 

In [19]:
import pandas as pd

In [20]:
file = 'my_emotional_conversation.csv'
df = pd.read_csv(file)
df.head()

| Unnamed: 0.2 | Unnamed: 0.1 | Unnamed: 0 | utterance | speaker | turn_id | Gold |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | Hello Piek. How are you feeling today? | Eliza | 1 | neutral |
| 1 | 1 | 1 | I am sad | Piek | 1 | sadness |
| 2 | 2 | 2 | How do you feel about being sad? | Eliza | 1 | neutral |
| 3 | 3 | 3 | Bad | Piek | 1 | sadness |
| 4 | 4 | 4 | How do you feel when you say that? | Eliza | 1 | neutral |
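The `Unnamed: 0` columns in the output above look like index columns that were written to the CSV file each time it was saved (an assumption based on the column names). A small sketch of how such columns could be dropped after loading, using a made-up CSV snippet instead of the real file:

```python
import io

import pandas as pd

# Toy CSV text mimicking a file that was saved with its index more than once.
csv_text = """Unnamed: 0.1,Unnamed: 0,utterance,speaker,turn_id,Gold
0,0,Hello Piek. How are you feeling today?,Eliza,1,neutral
1,1,I am sad,Piek,1,sadness
"""

toy_df = pd.read_csv(io.StringIO(csv_text))

# Drop every leftover index column whose name starts with "Unnamed".
clean_df = toy_df.loc[:, ~toy_df.columns.str.startswith("Unnamed")]
print(clean_df.columns.tolist())
```

Saving with `df.to_csv(file, index=False)` prevents a new index column from being added the next time the file is written.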


## 2. BERT Finetuned for emotion detection with GO dataset

We will now load the language model BERT that is finetuned for emotion detection using the *go_emotions* data set. Go_emotions has 28 nuanced emotion labels including neutral, so many more than the basic Ekman emotions that we have seen before.

We will define a *sentiment-analysis* pipeline and load the BERT model that was finetuned to classify sentences with the 28 GO_EMOTION labels. It will return a score for all the labels when we set the parameter *return_all_scores* to True.

In [10]:
from transformers import pipeline

In [11]:
model_name = "bhadresh-savani/bert-base-go-emotion" 
emotion = pipeline('sentiment-analysis', 
                    model=model_name, return_all_scores=True, truncation=True)

We have now created an instance *emotion* of a transformer pipeline, by analogy with a sentiment-analysis classification task, that we can apply to any utterance. The pipeline uses the tokenizer of the finetuned model and feeds the sentence to the classifier as a sequence of contextualized token representations.

## 3. Applying emotion classification to Eliza conversations

In the next part, we will apply the GO_EMOTION classifier *emotion* to the conversation loaded in a pandas dataframe. We will also map the GO_EMOTIONS to the 6 basic Ekman emotions and to neutral, as well as to sentiment values. For the mappings, we defined a few simple utility functions in **lab5_util.py**. We also defined a sort function to list the emotions from the highest score down. We first need to import these functions.

In [9]:
import lab5_util as util
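To give an idea of what sorting and mapping the scores involves, here is a self-contained sketch on mock data. The functions `sort_by_score` and `averaged_mapped_scores` and the map `toy_ekman_map` are hypothetical stand-ins written for this illustration; they are not the code in **lab5_util.py**, whose implementation may differ. The mock output mimics the nested list that the pipeline returns for one utterance, with made-up scores and only four labels.

```python
# Mock pipeline output for one utterance: a list holding one list of label/score dicts.
mock_output = [[
    {"label": "sadness", "score": 0.80},
    {"label": "grief", "score": 0.10},
    {"label": "joy", "score": 0.04},
    {"label": "neutral", "score": 0.06},
]]

# Illustrative subset of a GO-to-Ekman mapping (NOT the full map from lab5_util.py).
toy_ekman_map = {
    "sadness": ["sadness", "grief"],
    "joy": ["joy"],
    "neutral": ["neutral"],
}


def sort_by_score(predictions):
    """Return the label/score dicts ordered from the highest score down."""
    return sorted(predictions, key=lambda p: p["score"], reverse=True)


def averaged_mapped_scores(label_map, predictions):
    """Average the scores of the fine-grained labels that map to each coarse label."""
    scores = {p["label"]: p["score"] for p in predictions}
    mapped = []
    for coarse, fine_labels in label_map.items():
        values = [scores[f] for f in fine_labels if f in scores]
        if values:
            mapped.append({"label": coarse, "score": sum(values) / len(values)})
    return sort_by_score(mapped)


top_go = sort_by_score(mock_output[0])[0]
top_ekman = averaged_mapped_scores(toy_ekman_map, mock_output[0])[0]
print(top_go["label"], top_ekman["label"])
```

Because the mapped score is an average over several fine-grained labels, it can be much lower than the top GO score. This may explain why the Ekman scores in the output table of this section are lower than the corresponding GO scores.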

In [23]:
sentiment_emotions = []
sentiment_scores = []
ekman_emotions = []
ekman_scores = []
go_emotions = []
go_scores = []

for utterance in df['utterance']:
    # Scores for all GO labels; the result for this utterance is at index 0
    emotion_labels = emotion(utterance)
    # Sort by score and keep the highest-scoring GO label
    sorted_emotion_labels = util.sort_predictions(emotion_labels[0])
    go_emotions.append(sorted_emotion_labels[0]['label'])
    go_scores.append(sorted_emotion_labels[0]['score'])

    # Map the GO scores to Ekman labels and keep the first entry
    ekman_labels = util.get_averaged_mapped_scores(util.ekman_map, emotion_labels)
    ekman_emotions.append(ekman_labels[0]['label'])
    ekman_scores.append(ekman_labels[0]['score'])

    # Map the GO scores to sentiment labels and keep the first entry
    sentiment_labels = util.get_averaged_mapped_scores(util.sentiment_map, emotion_labels)
    sentiment_emotions.append(sentiment_labels[0]['label'])
    sentiment_scores.append(sentiment_labels[0]['score'])

We collected the GO output for each utterance in separate lists. We can now add these lists to the pandas dataframe as separate columns, because the order of the values corresponds to the order of the rows.

In [24]:
df['Go_Sentiment']=sentiment_emotions
df['Go_SentimentScore']=sentiment_scores
df['Go_Ekman']=ekman_emotions
df['Go_EkmanScore']=ekman_scores
df['Go']=go_emotions
df['GoScore']=go_scores
df.head()

| Unnamed: 0.2 | Unnamed: 0.1 | Unnamed: 0 | utterance | speaker | turn_id | Gold | Go_Sentiment | Go_SentimentScore | Go_Ekman | Go_EkmanScore | Go | GoScore |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | Hello Piek. How are you feeling today? | Eliza | 1 | neutral | ambiguous | 0.126107 | neutral | 0.287125 | curiosity | 0.330824 |
| 1 | 1 | 1 | I am sad | Piek | 1 | sadness | negative | 0.084571 | sadness | 0.181961 | sadness | 0.843814 |
| 2 | 2 | 2 | How do you feel about being sad? | Eliza | 1 | neutral | ambiguous | 0.077716 | neutral | 0.115115 | sadness | 0.344749 |
| 3 | 3 | 3 | Bad | Piek | 1 | sadness | negative | 0.064023 | neutral | 0.200429 | neutral | 0.200429 |
| 4 | 4 | 4 | How do you feel when you say that? | Eliza | 1 | neutral | ambiguous | 0.173038 | neutral | 0.233936 | curiosity | 0.585612 |


## 4. Load and apply the BoW classifiers

In [25]:
# HERE COMES THE CODE TO LOAD THE 3 BOW SVM CLASSIFIERS FROM DISK AND TO APPLY THESE TO THE UTTERANCES

In [28]:
# HERE COMES THE CODE TO ADD THE OUTPUT TO THE PANDAS DATA FRAME AS WAS DONE FOR THE GO EMOTIONS
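As a starting point, the loading and applying steps might be organised as in the sketch below. It assumes that each classifier was pickled as a complete scikit-learn pipeline (vectorizer plus SVM); if you saved the vectorizer and the classifier separately in Lab3, load both and transform the utterances first. The helper names `load_classifier` and `add_predictions`, the column name `BoW_MELD` and the tiny demo model are hypothetical.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def load_classifier(path):
    """Load a pickled classifier (assumed to be a full sklearn pipeline) from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def add_predictions(frame, classifiers, text_column="utterance"):
    """Add one prediction column per classifier; returns a new dataframe."""
    result = frame.copy()
    for column_name, clf in classifiers.items():
        result[column_name] = clf.predict(result[text_column])
    return result


# Demo setup: train and save a tiny stand-in model so that the sketch runs on its own.
demo_clf = make_pipeline(CountVectorizer(), LinearSVC())
demo_clf.fit(["I am sad", "I am happy", "I am angry"], ["sadness", "joy", "anger"])
demo_path = os.path.join(tempfile.gettempdir(), "demo_bow_svm.pkl")
with open(demo_path, "wb") as f:
    pickle.dump(demo_clf, f)

toy_df = pd.DataFrame({"utterance": ["I am sad", "I am happy"], "speaker": ["Piek", "Piek"]})
# Add the Tweets and MELD+Tweets models to this dictionary in the same way.
classifiers = {"BoW_MELD": load_classifier(demo_path)}
toy_df = add_predictions(toy_df, classifiers)
print(toy_df)
```

Keeping the classifiers in a dictionary means that the column name for each system is set in one place, which helps when the columns are later selected for the evaluation.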

In [29]:
# OPTIONAL BONUS: HERE COMES THE CODE TO APPLY VADER TO THE UTTERANCES AND ADD IT TO THE PANDAS DATAFRAME
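For the VADER bonus, a sketch of turning compound scores into sentiment labels is given below. It assumes an analyzer object with a `polarity_scores(text)` method that returns a dictionary with a `compound` score between -1 and 1, as the `SentimentIntensityAnalyzer` of the `vaderSentiment` package does. The thresholds of plus and minus 0.05 are the commonly used convention. The `StubAnalyzer` with made-up scores only serves to make the sketch runnable without VADER installed.

```python
def compound_to_label(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def vader_sentiment(utterances, analyzer):
    """Return (labels, compound scores) for the utterances.

    `analyzer` is any object with a VADER-style `polarity_scores(text)` method
    that returns a dict containing a "compound" score.
    """
    scores = [analyzer.polarity_scores(text)["compound"] for text in utterances]
    labels = [compound_to_label(score) for score in scores]
    return labels, scores


class StubAnalyzer:
    """Stand-in with made-up scores so the sketch runs without VADER installed."""

    def polarity_scores(self, text):
        made_up = {"I am sad": -0.48, "I am happy": 0.57}
        return {"compound": made_up.get(text, 0.0)}


labels, scores = vader_sentiment(["I am sad", "I am happy", "Okay"], StubAnalyzer())
print(labels)
```

With the real package, you would pass `SentimentIntensityAnalyzer()` instead of the stub and store the two returned lists in new dataframe columns, for example `df['Vader']` and `df['VaderScore']` (hypothetical column names).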

In [27]:
# SAVE THE FINAL PANDAS FRAME TO A CSV FILE FOR YOUR SUBMISSION
file = "conversation_with_emotion.csv"
df.to_csv(file, index=False)  # index=False avoids writing an extra "Unnamed: 0" index column

## 5. Evaluate the classifiers

In [None]:
# HERE COMES THE CODE TO GENERATE THE CLASSIFICATION REPORT AND THE CONFUSION MATRIX FOR EKMAN EMOTIONS

In [None]:
# OPTIONAL BONUS: HERE COMES THE CODE TO GENERATE THE CLASSIFICATION REPORT AND THE CONFUSION MATRIX FOR SENTIMENT
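For the sentiment-level evaluation, the gold labels and the BoW predictions first have to be translated from Ekman to sentiment. The dictionary below is one hypothetical mapping written for illustration; it is not taken from **lab5_util.py**. Surprise is the debatable case: it is mapped to `ambiguous` here, a label that also appears in the GO sentiment output above, whereas a system with only positive, negative and neutral labels would need a different choice.

```python
# Hypothetical Ekman-to-sentiment mapping; adjust it to the one you defend in your report.
ekman_to_sentiment = {
    "joy": "positive",
    "anger": "negative",
    "disgust": "negative",
    "fear": "negative",
    "sadness": "negative",
    "surprise": "ambiguous",
    "neutral": "neutral",
}


def map_to_sentiment(ekman_labels, mapping=ekman_to_sentiment):
    """Translate Ekman labels to sentiment labels; unknown labels raise a KeyError."""
    return [mapping[label] for label in ekman_labels]


gold_ekman = ["sadness", "joy", "surprise", "neutral"]
gold_sentiment = map_to_sentiment(gold_ekman)
print(gold_sentiment)
```

Raising an error on unknown labels is a deliberate choice in this sketch: it makes spelling differences between label sets visible instead of silently dropping rows.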

## End of Notebook