# Lab5 Final assignment: putting it all together

The final assignment is an individual assignment in which you put things together. You can earn 10 points when you carry out the basic tasks and discuss the results properly.

   1. **Train a BoW SVM Ekman emotion classifier combining all MELD and Tweet data** [3 POINTS]
       1. Load the train, test and development data from MELD and WASSA
       2. Discuss the statistics on the training data for the Ekman emotions
       3. Create a BoW vector representation and train an SVM classifier
       4. Save the classifier to disk
   2. **Classify the turns from the Eliza conversation with emotion classifiers** [2 POINTS]
       1. Load the annotated conversation from disk in this notebook as a Pandas dataframe (see below)
       2. Apply the following emotion classifiers to your conversation:
           1. BoW SVM classifier trained on the data from MELD and WASSA Tweets (see task 1 above)
           2. BERT finetuned with GO_emotions, where GO emotions are mapped to Ekman emotions (given in this notebook as shown below)
   3. **Evaluate the classifiers against your gold annotations** [2 POINTS]
       1. Create a classification report and confusion matrix. Only evaluate the human input and ignore the Eliza prompts!
       2. Discuss the result:
           1. report on performance similarities and differences,
           2. formulate what you expected from each model in terms of recall and precision and whether this is confirmed or falsified by the results,
           3. try to explain what did NOT meet your expectations.
   4. **Formulate what could be done to improve the automatic classification** [2 POINTS]
       1. How can you improve the classifiers by:
           1. adapting the training data,
           2. processing the training data differently?
       2. Reflect on using different emotion labels: Ekman or GO.

For the assignment, you need to load the conversations between the students and Eliza, which are provided as a separate CSV file.

Below, we show how you can apply the BERT classifier finetuned on GO emotions to the conversations and add the results to the pandas dataframe. We also show how the GO emotions are mapped to Ekman emotions and sentiment values. You can proceed in the same way to get the results for the other classifiers.

If, for some reason, you are not able to load or run the BERT-GO transformer model on your computer, do the following:

   1. Send me your conversation with the annotations saved in CSV
   2. I will apply the transformer model for you and send back the CSV with the GO annotations
   3. Load the CSV with the GO annotations and proceed from there

After applying the BERT-GO classifier to your conversation, you will also apply a Bag-of-Words SVM classifier to it. For this, you should build a BoW SVM classifier in a separate notebook (use **lab5.meld-tweet-bow-svm-emotion-classifier.ipynb**) that combines the MELD and Tweet data into a single set of training data. Note that you can also include the test and development data for training, since we are applying the model to your conversation and not to the MELD and WASSA test sets.
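
As a rough illustration of combining two sources into one training set, the following minimal sketch trains a BoW SVM on toy data. The example sentences, the variable names and the choice of `LinearSVC` in a scikit-learn `Pipeline` are assumptions made for illustration; the guide notebook may organise these steps differently.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy stand-ins for the MELD and Tweet data (hypothetical examples, not the real sets).
meld_texts = ["I am so happy for you", "This makes me really angry", "I feel sad and alone"]
meld_labels = ["joy", "anger", "sadness"]
tweet_texts = ["what a happy wonderful day", "so angry about this delay", "sad news today"]
tweet_labels = ["joy", "anger", "sadness"]

# Combine both sources into a single training set.
train_texts = meld_texts + tweet_texts
train_labels = meld_labels + tweet_labels

# Word counts -> TF-IDF weighting -> linear SVM.
bow_svm = Pipeline([
    ("counts", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("svm", LinearSVC()),
])
bow_svm.fit(train_texts, train_labels)

predictions = bow_svm.predict(["I am sad", "I am happy"])
print(list(predictions))
```

Wrapping the vectorizer and the classifier in one pipeline means that new utterances are transformed with the same vocabulary that was fitted on the training data.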

Make a statistical analysis of the training data (you need this to understand the results) and save the classifier to disk so that you can use it in this notebook. The notebook **lab5.meld-tweet-bow-svm-emotion-classifier.ipynb** can be used as a guide for building the classifier. In this notebook, load the BoW SVM classifier that you built from disk and apply it to the conversation. Add the result to the Pandas dataframe of the conversation as well. Once the dataframe is complete with the annotations from all the classifiers, make sure you save it to disk as an XLS or CSV file so that you do not lose anything!
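
The sketch below shows one possible way to inspect the label distribution and to store and reload a classifier with `pickle`. The toy data, the file name `bow_svm_ekman.pkl` and the temporary directory are assumptions for illustration only.

```python
import os
import pickle
import tempfile
from collections import Counter

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy training data (hypothetical); replace with the combined MELD and Tweet data.
train_texts = ["so happy today", "I am furious", "this is sad", "happy happy joy",
               "ok", "fine", "see you"]
train_labels = ["joy", "anger", "sadness", "joy", "neutral", "neutral", "neutral"]

# Label statistics: absolute counts and relative share per emotion.
label_counts = Counter(train_labels)
label_share = {label: count / len(train_labels) for label, count in label_counts.items()}
for label, count in label_counts.most_common():
    print(f"{label:10s} {count:3d} {label_share[label]:6.1%}")

classifier = make_pipeline(CountVectorizer(), LinearSVC()).fit(train_texts, train_labels)

# Save the fitted pipeline (vectorizer and SVM together) and load it again.
model_path = os.path.join(tempfile.mkdtemp(), "bow_svm_ekman.pkl")  # hypothetical file name
with open(model_path, "wb") as outfile:
    pickle.dump(classifier, outfile)
with open(model_path, "rb") as infile:
    loaded_classifier = pickle.load(infile)

print(loaded_classifier.predict(["I am sad"]))
```

A skewed label distribution (for instance, many more neutral examples than fear examples) is worth noting, since it can help to explain differences in recall per emotion later on.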

For the evaluation of the classifiers, you need to extract the gold labels and the classifier labels for each classification model separately. When extracting the gold and system labels for the test, you should **ONLY** use the labels for the human responses and skip Eliza's responses. We provide a function in **lab5_util.py** to do this.

Once you have extracted the gold labels and the system labels for the human responses (check if they have the same length and are in the right order), you can do the evaluation. With the **sklearn** functions you can generate a classification report and a confusion matrix (you can use **seaborn** to make the image nicer).
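
A minimal sketch of this evaluation step with toy labels is given below. The label lists are invented for illustration; only the **sklearn** calls are meant to carry over.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Toy gold and system labels for six human turns (hypothetical values).
gold = ["sadness", "sadness", "joy", "neutral", "anger", "joy"]
predicted = ["sadness", "neutral", "joy", "neutral", "neutral", "joy"]

# Both lists must be aligned: same length, same order of turns.
assert len(gold) == len(predicted)

# Fix the label order so that the report and the matrix use the same rows and columns.
labels = sorted(set(gold) | set(predicted))

# zero_division=0 avoids warnings for labels that are never predicted.
report = classification_report(gold, predicted, labels=labels, zero_division=0)
print(report)

# Rows are gold labels, columns are predicted labels.
matrix = confusion_matrix(gold, predicted, labels=labels)
print(labels)
print(matrix)
```

To plot the matrix, `seaborn.heatmap(matrix, annot=True, xticklabels=labels, yticklabels=labels)` is one option.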

When you report the results, it makes sense to combine them in a single table. So instead of having separate classification report tables, try to merge these into one table so that you can easily compare them, as shown in the following example. Obviously, the confusion matrices cannot be combined. You can add these to the appendix of the report on a single page for comparison.
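
One possible way to build such a combined table is to request the classification report as a dictionary and collect the scores per system in a single pandas dataframe. The system names and toy labels below are assumptions for illustration.

```python
import pandas as pd
from sklearn.metrics import classification_report

# Toy gold labels and the output of two systems (hypothetical values).
gold = ["sadness", "sadness", "joy", "neutral", "anger", "joy"]
system_outputs = {
    "BoW-SVM": ["sadness", "neutral", "joy", "neutral", "neutral", "joy"],
    "BERT-GO": ["sadness", "sadness", "joy", "joy", "anger", "neutral"],
}

labels = sorted(set(gold))
rows = labels + ["macro avg"]

# One column per (system, metric) pair, one row per label plus the macro average.
columns = {}
for system, predicted in system_outputs.items():
    report = classification_report(gold, predicted, labels=labels,
                                   output_dict=True, zero_division=0)
    for metric in ("precision", "recall", "f1-score"):
        columns[(system, metric)] = {row: report[row][metric] for row in rows}

overview = pd.DataFrame(columns).round(2)
print(overview)
```

The resulting dataframe has two header levels (system and metric), so the scores of the systems appear side by side for each emotion.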

## Dummy example of an overview of EKMAN results

![Ekman classification of human responses in a conversation](ekman-results.png)


## 1. Submission

The assignment should be submitted individually on CANVAS as a zip file and include the following:

   1. The notebook to create a BoW-SVM combining MELD and Tweets for Ekman classification: **lab5.meld-tweet-bow-svm-emotion-classifier.ipynb**
   2. The current notebook **lab5.final_assignment.ipynb**
   3. A CSV file containing the conversation with all the gold and classifier outputs (use clear column names)
   4. A PDF report of max 5 pages:
       1. Section 1: what you have done and why: be explicit about changes you made e.g. to Eliza or training on MELD+Tweets
       2. Section 2: report on the Ekman classification results. Use a single table for recall, precision and f-score and put the confusion matrices in the appendix.
       3. Section 3: a discussion of the results and how to improve the classifiers

Use the notebooks that are given as a guide with the code and the output. You should NOT discuss the results in the notebooks but in the report. Use the notebooks to run the experiments and get the results. Include the tables and figures in the report.

## 2. Loading the conversation saved on disk

Using the notebook **eliza-chat.ipynb**, you should create a conversation with Eliza, annotate the conversation with Ekman labels and save it to disk as a CSV file. You should load the saved CSV file here.

In [1]:
import pandas as pd

In [2]:
file = 'my_emotional_conversation.csv'
df = pd.read_csv(file)
df.head()

Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,utterance,speaker,turn_id,Gold
0,0,0,Hello Piek. How are you feeling today?,Eliza,1,neutral
1,1,1,I am sad,Piek,1,sadness
2,2,2,How do you feel about being sad?,Eliza,1,neutral
3,3,3,Bad,Piek,1,sadness
4,4,4,How do you feel when you say that?,Eliza,1,neutral
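
The `Unnamed: 0` columns in the output above typically appear when a dataframe is saved together with its row index and then read back in. The following sketch, which uses an in-memory buffer and a toy dataframe as stand-ins for the real file, shows how this might be avoided or cleaned up.

```python
import io

import pandas as pd

df_toy = pd.DataFrame({
    "utterance": ["I am sad", "Bad"],
    "speaker": ["Piek", "Piek"],
    "Gold": ["sadness", "sadness"],
})

# Saving with the default index=True writes the row index as an extra unnamed column.
with_index = pd.read_csv(io.StringIO(df_toy.to_csv()))
print(list(with_index.columns))

# Saving with index=False keeps the index out of the file.
without_index = pd.read_csv(io.StringIO(df_toy.to_csv(index=False)))
print(list(without_index.columns))

# For a file that already contains such columns, they can be dropped after loading.
cleaned = with_index.loc[:, ~with_index.columns.str.startswith("Unnamed")]
print(list(cleaned.columns))
```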


In [3]:
import lab5_util as util

In [7]:
# THE CODE TO GET THE TEST TEXTS AND LABELS
test_instances = df['utterance']
test_labels = df['Gold']

In [8]:
# THE CODE TO LIMIT THE TEST LABELS TO THE HUMAN TEST LABELS
human_test_labels = util.remove_eliza_labels(df, test_labels)
print(len(test_labels), len(human_test_labels))

21 10
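
The implementation of `remove_eliza_labels` is in **lab5_util.py** and is not shown here. As a rough idea of what such a filter could look like, the sketch below keeps only the labels of rows whose `speaker` is not Eliza. The function name `keep_human_labels` and the toy dataframe are hypothetical.

```python
import pandas as pd


def keep_human_labels(frame, labels, bot_name="Eliza"):
    """Return the labels of all rows that were not spoken by bot_name (hypothetical helper)."""
    if len(frame) != len(labels):
        raise ValueError("labels must have one entry per row of the dataframe")
    return [label for speaker, label in zip(frame["speaker"], labels) if speaker != bot_name]


toy_df = pd.DataFrame({
    "utterance": ["How are you feeling today?", "I am sad", "How do you feel about being sad?", "Bad"],
    "speaker": ["Eliza", "Piek", "Eliza", "Piek"],
    "Gold": ["neutral", "sadness", "neutral", "sadness"],
})

human_gold = keep_human_labels(toy_df, toy_df["Gold"])
print(len(toy_df), len(human_gold))
print(human_gold)
```

Because the filter depends only on the `speaker` column, applying it to the gold labels and to each list of system labels keeps them aligned turn by turn.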


## 2. BERT Finetuned for emotion detection with GO dataset

We will now load the language model BERT that is finetuned for emotion detection using the *go_emotions* data set. Go_emotions has 28 nuanced emotion labels, including neutral, which is many more than the basic Ekman emotions that we have seen before.

We will define a *sentiment-analysis* pipeline and load the BERT model that was finetuned to classify sentences with the 28 GO_EMOTION labels. It will return a score for all the labels when we set the parameter *return_all_scores* to True.

In [9]:
# HERE COMES THE CODE TO LOAD THE BERT-BASE-GO-EMOTION transformer model and create a pipeline
from transformers import pipeline

In [10]:
model_name = "bhadresh-savani/bert-base-go-emotion" 
emotion_pipeline = pipeline('sentiment-analysis', 
                    model=model_name, return_all_scores=True, truncation=True)

We have now created a transformer pipeline, *emotion_pipeline*, set up in analogy to a sentiment analysis classification task, which we can apply to any utterance. The pipeline uses the tokenizer of the finetuned model and feeds the sentence representation to the classifier as a sequence of contextualized token representations.

### 2.1 Applying emotion classification to Eliza conversations

In the next part, you will apply the GO_EMOTION classifier *emotion_pipeline* to the conversation loaded in a Pandas frame. You will also map the GO_EMOTIONS to the 6 basic Ekman emotions and to neutral, as well as to sentiment values. For the mappings, we defined a few simple utility functions in **lab5_util.py**. We also defined a sort function to list the emotions from the highest score down. These functions are available through the module imported above as `util`.

In [11]:
threshold = 0.05

go_sentiment_emotions = []
go_sentiment_scores = []
go_ekman_emotions = []
go_ekman_scores = []
go_emotions = []
go_scores = []

for utterance in df['utterance']:
    emotion_labels = emotion_pipeline(utterance)
    sorted_emotion_labels = util.sort_predictions(emotion_labels[0])
    go_emotions.append(sorted_emotion_labels[0]['label'])
    go_scores.append(sorted_emotion_labels[0]['score'])

    ekman_labels = util.get_averaged_mapped_scores_by_threshold(util.ekman_map, emotion_labels, threshold)
    if ekman_labels:
        go_ekman_emotions.append(ekman_labels[0]['label'])
        go_ekman_scores.append(ekman_labels[0]['score'])
    else:
        #### none of the labels scored above the threshold
        go_ekman_emotions.append('None')
        go_ekman_scores.append(0)
        

    sentiment_labels = util.get_averaged_mapped_scores_by_threshold(util.sentiment_map, emotion_labels, threshold)
    if sentiment_labels:
        go_sentiment_emotions.append(sentiment_labels[0]['label'])
        go_sentiment_scores.append(sentiment_labels[0]['score'])
    else:
        #### none of the labels scored above the threshold
        go_sentiment_emotions.append('None')
        go_sentiment_scores.append(0)
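
The mapping functions themselves live in **lab5_util.py** and are not shown in this notebook. The sketch below illustrates the general idea under stated assumptions: GO labels scoring at or above the threshold are grouped by their target category, the scores per category are averaged, and the categories are sorted from the highest score down. The function name, the partial mapping and the mock scores are hypothetical, and the real utility functions may differ in detail.

```python
from collections import defaultdict

# Illustrative subset of a GO-to-Ekman mapping (assumption, not the mapping from lab5_util).
GO_TO_EKMAN = {
    "sadness": "sadness", "disappointment": "sadness", "grief": "sadness",
    "curiosity": "surprise", "confusion": "surprise", "surprise": "surprise",
    "joy": "joy", "gratitude": "joy",
    "neutral": "neutral",
}


def average_mapped_scores(label_to_category, predictions, threshold):
    """Average the scores of all labels at or above the threshold, per target category."""
    grouped = defaultdict(list)
    for prediction in predictions:
        category = label_to_category.get(prediction["label"])
        if category is not None and prediction["score"] >= threshold:
            grouped[category].append(prediction["score"])
    averaged = [{"label": category, "score": sum(scores) / len(scores)}
                for category, scores in grouped.items()]
    # Highest averaged score first.
    return sorted(averaged, key=lambda item: item["score"], reverse=True)


# Mock classifier output for one utterance (hypothetical scores).
mock_predictions = [
    {"label": "curiosity", "score": 0.58},
    {"label": "confusion", "score": 0.10},
    {"label": "neutral", "score": 0.20},
    {"label": "joy", "score": 0.01},
]

mapped = average_mapped_scores(GO_TO_EKMAN, mock_predictions, threshold=0.05)
print(mapped)
```

Averaging explains why a mapped score can be lower than the top GO score: a strong label is averaged with weaker labels from the same category that also pass the threshold.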


### 2.2 Adding the output to the Pandas frame

We collected the GO output in separate lists for each utterance. You can now add the output to the pandas frame as separate columns, assuming that the values correspond to the rows.

In [12]:
df['Go_Sentiment'] = go_sentiment_emotions
df['Go_SentimentScore'] = go_sentiment_scores
df['Go_Ekman'] = go_ekman_emotions
df['Go_EkmanScore'] = go_ekman_scores
df['Go'] = go_emotions
df['GoScore'] = go_scores
df.head()

Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,utterance,speaker,turn_id,Gold,Go_Sentiment,Go_SentimentScore,Go_Ekman,Go_EkmanScore,Go,GoScore
0,0,0,Hello Piek. How are you feeling today?,Eliza,1,neutral,ambiguous,0.237066,neutral,0.287125,curiosity,0.330824
1,1,1,I am sad,Piek,1,sadness,negative,0.843814,sadness,0.843814,sadness,0.843814
2,2,2,How do you feel about being sad?,Eliza,1,neutral,negative,0.344749,sadness,0.344749,sadness,0.344749
3,3,3,Bad,Piek,1,sadness,negative,0.107874,neutral,0.200429,neutral,0.200429
4,4,4,How do you feel when you say that?,Eliza,1,neutral,ambiguous,0.339241,surprise,0.339241,curiosity,0.585612


### 2.3 Evaluation of the human response labels

In [13]:
# HERE COMES THE CODE TO REDUCE THE SYSTEM LABELS TO THE HUMAN RESPONSES
human_response_prediction_labels = util.remove_eliza_labels(df, go_ekman_emotions)
print(len(go_ekman_emotions), len(human_response_prediction_labels))
print(human_response_prediction_labels)

21 10
['sadness', 'neutral', 'neutral', 'neutral', 'sadness', 'sadness', 'neutral', 'neutral', 'neutral', 'neutral']


In [15]:
# HERE COMES THE CODE TO GENERATE THE CLASSIFICATION REPORT AND CONFUSION MATRIX

## 4. Applying the BoW classifier

In [16]:
from collections import Counter
import nltk
from nltk.corpus import stopwords
import pickle
import sklearn
from sklearn import svm
from sklearn.metrics import classification_report
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer

### 4.1 Loading the BoW SVM classifier from MELD and TWEETS

In [17]:
# HERE COMES THE CODE TO LOAD THE BOW SVM CLASSIFIER (BUILT FROM MELD AND WASSA TWEETS) FROM DISK

### 4.2 Applying the classifier to the human part of the conversation

In [18]:
# HERE COMES THE CODE TO APPLY THE CLASSIFIER TO THE UTTERANCES AND GENERATE THE CLASSIFICATION REPORT AND THE CONFUSION MATRIX

In [19]:
# HERE COMES THE CODE TO ADD THE OUTPUT TO THE PANDAS DATA FRAME AS WAS DONE FOR THE GO EMOTIONS

### 4.3 Evaluation of the human response labels

In [21]:
# HERE COMES THE CODE TO REDUCE THE SYSTEM LABELS TO THE HUMAN RESPONSES

In [22]:
# HERE COMES THE CODE TO GENERATE THE CLASSIFICATION REPORT AND CONFUSION MATRIX

## 5. [2 BONUS POINTS] Sentiment classification and evaluation

In [26]:
# MAP EKMAN TO SENTIMENT
sentiment_map = {}

In [27]:
# HERE COMES THE CODE TO MAP THE EKMAN TEST LABELS TO SENTIMENT TEST LABELS

def ekman_to_sentiment(sentiment_map, test_labels):
    human_sentiment_test_labels = []
    return human_sentiment_test_labels

In [28]:
print(human_test_labels)
human_sentiment_test_labels = ekman_to_sentiment(sentiment_map, human_test_labels)
print(human_sentiment_test_labels)

[]
[]


### 5.1 Applying VADER for sentiment

In [29]:
# HERE COMES THE CODE TO APPLY VADER TO THE UTTERANCES AND ADD THE OUTPUT TO THE PANDAS DATAFRAME
vader_labels=[]
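
VADER returns a dictionary of scores per text, including a `compound` score between -1 and 1, so a rule is needed to turn this score into a sentiment label. The sketch below uses the commonly cited cut-offs of 0.05 and -0.05. The function name and the mock score dictionaries are hypothetical; in practice the dictionaries would come from `SentimentIntensityAnalyzer().polarity_scores(text)` in `nltk.sentiment.vader`, which requires the VADER lexicon to be downloaded.

```python
def compound_to_label(compound, threshold=0.05):
    """Map a VADER compound score to positive, negative or neutral."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Mock VADER output for three utterances (hypothetical scores).
mock_vader_scores = [
    {"neg": 0.0, "neu": 0.3, "pos": 0.7, "compound": 0.62},
    {"neg": 0.8, "neu": 0.2, "pos": 0.0, "compound": -0.54},
    {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0},
]

mock_vader_labels = [compound_to_label(scores["compound"]) for scores in mock_vader_scores]
print(mock_vader_labels)
```

Note that the label names produced here have to match the names used in the gold sentiment labels before the two can be compared.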

In [30]:
# HERE COMES THE CODE TO ADD THE VADER SENTIMENT TO THE DATAFRAME

In [31]:
# HERE COMES THE CODE TO REDUCE THE SYSTEM LABELS TO THE HUMAN RESPONSES

In [32]:
# HERE COMES THE CODE TO GENERATE THE CLASSIFICATION REPORT AND CONFUSION MATRIX

### 5.2 Evaluating the GO_emotion sentiment scores

In [33]:
# HERE COMES THE CODE TO REDUCE THE SENTIMENT LABELS TO THE HUMAN RESPONSES

In [34]:
# HERE COMES THE CODE FOR THE CLASSIFICATION REPORT AND CONFUSION MATRIX

### 5.3 Evaluating the BoW-SVM emotions mapped to sentiment values

In [35]:
# HERE COMES THE CODE TO MAP THE BOW EKMAN CODE TO SENTIMENT VALUES

In [36]:
# HERE COMES THE CODE TO ADD THE BoW-SVM sentiment to the dataframe

In [37]:
# HERE COMES THE CODE TO REDUCE THE PREDICTIONS TO THE HUMAN RESPONSES

In [38]:
# HERE COMES THE CODE TO GENERATE THE CLASSIFICATION REPORT AND CONFUSION MATRIX

### 5.4 Training BoW-SVM with sentiment values

In [39]:
# CREATE A SEPARATE NOTEBOOK lab5.meld-tweet-bow-svm-sentiment-classifier.ipynb TO TRAIN A BOW SVM CLASSIFIER FOR SENTIMENT FROM MELD AND WASSA-TWEETS
# SAVE THE CLASSIFIER TO DISK AND LOAD IT HERE

# HERE COMES THE CODE TO LOAD A BOW SVM CLASSIFIER FOR SENTIMENT FROM MELD AND WASSA-TWEETS

In [40]:
# HERE COMES THE CODE TO CLASSIFY THE CONVERSATION WITH SENTIMENT

In [41]:
# HERE COMES THE CODE TO ADD THE SENTIMENT TO THE DATAFRAME

In [42]:
# HERE COMES THE CODE TO REDUCE THE PREDICTIONS TO THE HUMAN RESPONSES

In [43]:
# HERE COMES THE CODE TO GENERATE THE CLASSIFICATION REPORT AND THE CONFUSION MATRIX FOR SENTIMENT

## 6. Saving the dataframe with all the predictions to disk

In [44]:
# SAVE THE FINAL PANDAS FRAME TO A CSV FILE FOR YOUR SUBMISSION

## End of the assignment