## **Performance Metrics to Evaluate Text-Generating LLMs:**

- **Sentiment Analysis**

This notebook focuses on a single metric, sentiment-analysis accuracy: how often a text-generating LLM assigns the correct Positive or Negative label to IMDb movie reviews (see `readme.md` for more detail).

In [8]:
!pip install datasets

In [9]:
# Import Libraries
import pandas as pd
from datasets import load_dataset
import random

In [None]:
# Load the IMDb dataset from Hugging Face datasets
dataset = load_dataset("imdb")

In [30]:
# Randomly sample 10 training reviews and their labels (unseeded, so every run draws a different sample)
random_indices = random.sample(range(len(dataset['train'])), 10)
random_reviews = [dataset['train'][i]['text'] for i in random_indices]
random_labels = [dataset['train'][i]['label'] for i in random_indices]

# Convert label indices to actual labels
label_mapping = {
    0: 'Negative',
    1: 'Positive',
}
random_labels = [label_mapping[label] for label in random_labels]
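
Because the sampling above is unseeded, each run evaluates a different set of reviews, which makes scores from different models hard to compare. A minimal sketch of reproducible sampling, assuming a fixed seed is acceptable; the helper name `sample_indices` and the seed value 42 are illustrative and not part of the original notebook, and `25000` stands in for `len(dataset['train'])`:

```python
import random

def sample_indices(n_items, k=10, seed=42):
    """Return k distinct indices from range(n_items), identical on every call."""
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    return rng.sample(range(n_items), k)

# Two calls with the same seed select the same reviews.
indices_a = sample_indices(25000)
indices_b = sample_indices(25000)
print(indices_a == indices_b)
```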

In [31]:
df = pd.DataFrame({'Review': random_reviews, 'Ground_Truth_Label': random_labels})

### **From here on, re-run the cells for each text-generating model:**

In [32]:
testing_array = df['Review'].values
print(testing_array)
print(len(testing_array))

["Really, it's nothing much. I only recommend watching it if; 1.) You're a big fan of any of the main stars. 2.) If you really want to check out the first time Lucille Ball was seen with red hair.<br /><br />4 out of 10 stars"
 'There are a lot of highly talented filmmakers/actors in Germany now. None of them are associated with this "movie".<br /><br />Why in the world do producers actually invest money in something like this this? You could have made 10 good films with the budget of this garbage! It\'s not entertaining to have seven grown men running around as dwarfs, pretending to be funny. What IS funny though is that the film\'s producer (who happens to be the oldest guy of the bunch) is playing the YOUNGEST dwarf.<br /><br />The film is filled with moments that scream for captions saying "You\'re supposed to laugh now!". It\'s hard to believe that this crap\'s supposed to be a comedy.<br /><br />Many people actually stood up and left the cinema 30 minutes into the movie. I should

**Query the text-generating LLM with the following prompt** (replace `PASTE_SENTENCES_HERE` with the reviews printed above):

```
Please classify the following 10 sentences: positive, negative. Here are the sentences: PASTE_SENTENCES_HERE. please return the answers as an array
```
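Instead of pasting the reviews by hand, the prompt can be assembled in code. The sketch below is one possible approach, not the procedure used in this notebook: the function name `build_prompt`, the numbering of reviews, and the removal of `<br />` tags are all assumptions added for illustration.

```python
def build_prompt(reviews):
    """Fill the classification prompt with a numbered list of reviews."""
    # Assumption: strip the HTML line breaks found in raw IMDb text.
    cleaned = [review.replace("<br />", " ").strip() for review in reviews]
    # Assumption: number the reviews so the answers can be matched by position.
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(cleaned, start=1))
    return (
        f"Please classify the following {len(cleaned)} sentences: positive, negative. "
        f"Here are the sentences:\n{numbered}\n"
        "please return the answers as an array"
    )

example_prompt = build_prompt(["Great film!<br /><br />Loved it.", "Dull and far too long."])
print(example_prompt)
```

In the notebook, `build_prompt(testing_array)` would produce the text to send to the model.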

In [33]:
# Labels returned by the model, in the same order as the reviews
predicted_labels = ['Positive', 'Negative', 'Negative', 'Negative', 'Negative', 'Negative', 'Positive', 'Negative', 'Negative', 'Positive']
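
Since the labels are copied manually from the model's reply, a quick check before scoring can catch a missing answer or an unexpected label. This is a hedged sketch; `validate_labels` is a hypothetical helper, and normalizing capitalization is an assumption about how replies may vary.

```python
def validate_labels(labels, expected_count, allowed=("Positive", "Negative")):
    """Normalize capitalization, then check the number and vocabulary of labels."""
    normalized = [str(label).strip().capitalize() for label in labels]
    if len(normalized) != expected_count:
        raise ValueError(f"expected {expected_count} labels, got {len(normalized)}")
    unknown = sorted(set(normalized) - set(allowed))
    if unknown:
        raise ValueError(f"unexpected labels: {unknown}")
    return normalized

checked = validate_labels(["positive", " Negative", "NEGATIVE"], expected_count=3)
print(checked)
```

In the notebook, this would be called as `validate_labels(predicted_labels, expected_count=len(df))`.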


In [34]:
# Compare the model's labels with the ground truth
df['Predicted_Labels'] = predicted_labels
correct_predictions = (df['Ground_Truth_Label'] == df['Predicted_Labels']).sum()
total_reviews = len(df)
accuracy = correct_predictions / total_reviews

# Map the accuracy to a letter grade
if accuracy >= 0.9:
    grade = 'A'
elif accuracy >= 0.8:
    grade = 'B'
elif accuracy >= 0.7:
    grade = 'C'
elif accuracy >= 0.6:
    grade = 'D'
else:
    grade = 'F'

print("Total Score:", accuracy)
print("Grade:", grade)
print("\nDataFrame with 10 random reviews:")

Total Score: 0.4
Grade: F

DataFrame with 10 random reviews:


In [35]:
df

| | Review | Ground_Truth_Label | Predicted_Labels |
|---|---|---|---|
| 0 | Really, it's nothing much. I only recommend wa... | Negative | Positive |
| 1 | There are a lot of highly talented filmmakers/... | Negative | Negative |
| 2 | Confounding melodrama taken from a William Gib... | Negative | Negative |
| 3 | I first heard about White Noise when I saw the... | Negative | Negative |
| 4 | Walter Matthau and George Burns were a famous ... | Positive | Negative |
| 5 | I'm really surprised this movie didn't get a h... | Positive | Negative |
| 6 | Body Slam (1987) is a flat out terrible movie.... | Negative | Positive |
| 7 | While most of the movie is very amateurish, th... | Negative | Negative |
| 8 | I got to see the movie " On Thin Ice" on the t... | Positive | Negative |
| 9 | WARNING SPOILERS***** A really stupid movie ab... | Negative | Positive |
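
A single accuracy figure hides which class the model gets wrong. As a rough sketch, the labels from the table above can be cross-tabulated with `pd.crosstab`; the variable names `truth`, `predicted`, `confusion`, and `recall` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Labels copied from the table above, rows 0 to 9.
truth = ["Negative", "Negative", "Negative", "Negative", "Positive",
         "Positive", "Negative", "Negative", "Positive", "Negative"]
predicted = ["Positive", "Negative", "Negative", "Negative", "Negative",
             "Negative", "Positive", "Negative", "Negative", "Positive"]

# Rows are the true labels, columns are the model's labels.
confusion = pd.crosstab(
    pd.Series(truth, name="Ground truth"),
    pd.Series(predicted, name="Predicted"),
)
print(confusion)

# Recall per class: share of each true class that the model labeled correctly.
recall = {
    label: confusion.loc[label, label] / confusion.loc[label].sum()
    for label in confusion.index
}
print(recall)
```

For this sample, 4 of the 7 negative reviews are labeled correctly and none of the 3 positive reviews are, which accounts for the 0.4 accuracy.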


In [36]:
model_name = "Chat GPT"
output_filename = "chat_gpt_sentiment.csv"

In [37]:
# Store this model's accuracy so it can be compared with other models later
new_data = {
    'model_name': [model_name],
    'sent_acc': [accuracy]
}
new_df = pd.DataFrame(new_data)
new_df.to_csv(output_filename, index=False)
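
Once the cells have been re-run for several models, the per-model CSV files can be merged into one comparison table. The sketch below assumes every file follows the `*_sentiment.csv` naming used above and has the columns `model_name` and `sent_acc`; `collect_results` is a hypothetical helper, and the "Model B" row with its 0.7 score is a made-up placeholder used only to make the example runnable, not a measured result.

```python
import glob
import os
import tempfile

import pandas as pd

def collect_results(directory):
    """Concatenate all *_sentiment.csv files in a directory, best accuracy first."""
    paths = sorted(glob.glob(os.path.join(directory, "*_sentiment.csv")))
    frames = [pd.read_csv(path) for path in paths]
    combined = pd.concat(frames, ignore_index=True)
    return combined.sort_values("sent_acc", ascending=False).reset_index(drop=True)

# Demonstration with temporary files; the second row is placeholder data.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"model_name": ["Chat GPT"], "sent_acc": [0.4]}).to_csv(
        os.path.join(tmp, "chat_gpt_sentiment.csv"), index=False)
    pd.DataFrame({"model_name": ["Model B"], "sent_acc": [0.7]}).to_csv(
        os.path.join(tmp, "model_b_sentiment.csv"), index=False)
    summary = collect_results(tmp)

print(summary)
```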