# LLM Sentiment Analysis

We'll explore using large language models for sentiment analysis and classification.

First, create an OpenAI API key at <https://platform.openai.com/api-keys>, store it in Colab Secrets under the name `OPENAI_API_KEY`, and retrieve it from there.

In [None]:
import requests
from google.colab import userdata

api_key = userdata.get("OPENAI_API_KEY")
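Outside Colab, `google.colab` cannot be imported, so the cell above fails. A minimal sketch of a fallback, assuming the key has also been exported as an environment variable called `OPENAI_API_KEY`. The helper name `load_api_key` is hypothetical and not part of the original notebook.

```python
import os

def load_api_key(name="OPENAI_API_KEY"):
    """Return the secret from Colab Secrets if available, else from the environment."""
    try:
        # Only importable inside a Colab runtime
        from google.colab import userdata
    except ImportError:
        # Assumption: the key was exported as an environment variable
        return os.environ.get(name)
    return userdata.get(name)
```

With this in place, `api_key = load_api_key()` could replace the direct `userdata.get(...)` call.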

We'll use the [Models API](https://platform.openai.com/docs/api-reference/models) to list the available models.

In [None]:
# List all models
r = requests.get("https://api.openai.com/v1/models", headers={
    "Authorization": f"Bearer {api_key}"
})

# Show the 10 most recent models
sorted(r.json()["data"], key=lambda x: x['created'], reverse=True)[:10]

[{'id': 'gpt-4o-2024-05-13',
  'object': 'model',
  'created': 1715368132,
  'owned_by': 'system'},
 {'id': 'gpt-4o',
  'object': 'model',
  'created': 1715367049,
  'owned_by': 'system'},
 {'id': 'gpt-4-turbo-2024-04-09',
  'object': 'model',
  'created': 1712601677,
  'owned_by': 'system'},
 {'id': 'gpt-4-turbo',
  'object': 'model',
  'created': 1712361441,
  'owned_by': 'system'},
 {'id': 'gpt-4-1106-vision-preview',
  'object': 'model',
  'created': 1711473033,
  'owned_by': 'system'},
 {'id': 'gpt-3.5-turbo-0125',
  'object': 'model',
  'created': 1706048358,
  'owned_by': 'system'},
 {'id': 'gpt-4-turbo-preview',
  'object': 'model',
  'created': 1706037777,
  'owned_by': 'system'},
 {'id': 'gpt-4-0125-preview',
  'object': 'model',
  'created': 1706037612,
  'owned_by': 'system'},
 {'id': 'text-embedding-3-large',
  'object': 'model',
  'created': 1705953180,
  'owned_by': 'system'},
 {'id': 'text-embedding-3-small',
  'object': 'model',
  'created': 1705948997,
  'owned_by': '

We'll use GPT-4o (`gpt-4o`), which at the time of writing is the most capable model in this list, but also an expensive one.
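The `created` values in the model listing above are Unix timestamps in seconds. A small sketch of how they could be turned into readable dates. The helper name `created_date` is hypothetical, and the example entry is copied from the listing.

```python
from datetime import datetime, timezone

def created_date(model):
    """Convert a model entry's Unix `created` timestamp to an ISO date (UTC)."""
    return datetime.fromtimestamp(model["created"], tz=timezone.utc).date().isoformat()

# Entry copied from the listing above
example = {"id": "gpt-4o-2024-05-13", "object": "model", "created": 1715368132, "owned_by": "system"}
print(example["id"], created_date(example))
```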

## Movies dataset

We have a small dataset of [movie reviews](https://drive.google.com/file/d/1X33ao8_PE17c3htkQ-1p2dmW2xKmOq8Q/view): about 20 movies, each with a review labeled positive or negative. We'll use a large language model to identify the sentiment automatically, without training it on this dataset. Additionally, we'll see if the model can determine the genre of each movie.

In [None]:
import pandas as pd

reviews = pd.read_csv("movie-reviews.csv")
reviews

Unnamed: 0,review,sentiment
0,One of the other reviewers has mentioned that ...,positive
1,A wonderful little production. <br /><br />The...,positive
2,I thought this was a wonderful way to spend ti...,positive
3,Basically there's a family where a little boy ...,negative
4,"Petter Mattei's ""Love in the Time of Money"" is...",positive
5,"Probably my all-time favorite movie, a story o...",positive
6,I sure would like to see a resurrection of a u...,positive
7,"This show was an amazing, fresh & innovative i...",negative
8,Encouraged by the positive comments about this...,negative
9,If you like original gut wrenching laughter yo...,positive
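Some reviews contain HTML line breaks (`<br />`), as row 1 of the preview shows. A minimal sketch of how they could be stripped before sending the text to the model. The helper name `strip_breaks` is hypothetical, and whether this cleanup changes the model's answers is not tested here.

```python
import re

def strip_breaks(text):
    """Replace HTML <br> tags with a space and collapse repeated whitespace."""
    no_tags = re.sub(r"<br\s*/?>", " ", text, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", no_tags).strip()

# Prefix of row 1 as shown in the preview above
sample = "A wonderful little production. <br /><br />The"
print(strip_breaks(sample))
```

On the dataframe this could be applied as `reviews["review"].apply(strip_breaks)`.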


Here's an example of what a review looks like:

In [None]:
reviews.review.iloc[0]

"One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fa

## LLM-based sentiment

We can use GPT-4o to identify the sentiment of a movie review.

In [None]:
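import json

def get_sentiment(review, debug=False):
    """Return the model's one-word sentiment label for a single review."""
    # NOTE: `debug` is accepted but unused; the label is printed on every call.
    response = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": "gpt-4o",
            "messages": [
                # The system message sets the task; the user message carries the review text
                {"role": "system", "content": "Identify the sentiment of the movie. JUST say positive / negative"},
                {"role": "user", "content": review}
            ]
        }
    )
    # Fail early on HTTP errors (bad key, rate limit) instead of a confusing KeyError below
    response.raise_for_status()
    result = response.json()
    # The label is the text of the first returned choice
    answer = result["choices"][0]["message"]["content"]
    print(answer)
    return answer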

print(reviews.review.iloc[3])
get_sentiment(reviews.review.iloc[3], debug=True)

Basically there's a family where a little boy (Jake) thinks there's a zombie in his closet & his parents are fighting all the time.<br /><br />This movie is slower than a soap opera... and suddenly, Jake decides to become Rambo and kill the zombie.<br /><br />OK, first of all when you're going to make a film you must Decide if its a thriller or a drama! As a drama the movie is watchable. Parents are divorcing & arguing like in real life. And then we have Jake with his closet which totally ruins all the film! I expected to see a BOOGEYMAN similar movie, and instead i watched a drama with some meaningless thriller spots.<br /><br />3 out of 10 just for the well playing parents & descent dialogs. As for the shots with Jake: just ignore them.
Negative


'Negative'
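The request body used in `get_sentiment` can also be built separately, which makes it easier to inspect and to pin sampling settings. A sketch, assuming we want `temperature` (a Chat Completions parameter) set to 0 to reduce run-to-run variation in the label. The function name `build_sentiment_payload` is hypothetical.

```python
def build_sentiment_payload(review, model="gpt-4o", temperature=0):
    """Build the JSON body for a Chat Completions sentiment request."""
    return {
        "model": model,
        # Lower temperature makes sampling less random (assumption: stable labels are wanted)
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": "Identify the sentiment of the movie. JUST say positive / negative"},
            {"role": "user", "content": review},
        ],
    }

payload = build_sentiment_payload("As a drama the movie is watchable.")
print(payload["messages"][1]["content"])
```

The resulting dictionary would be passed as `json=payload` to `requests.post`.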

Now we can apply this LLM-based sentiment to each review in the dataframe.

In [None]:
reviews["sentiment_llm"] = reviews["review"].apply(get_sentiment)
reviews

Positive
Positive
Positive
Negative
Positive
Positive
Positive
Negative
Negative
Positive
Negative
Negative
Negative
negative
Positive
Negative
Negative
Negative
Negative
Negative


Unnamed: 0,review,sentiment,sentiment_llm
0,One of the other reviewers has mentioned that ...,positive,Positive
1,A wonderful little production. <br /><br />The...,positive,Positive
2,I thought this was a wonderful way to spend ti...,positive,Positive
3,Basically there's a family where a little boy ...,negative,Negative
4,"Petter Mattei's ""Love in the Time of Money"" is...",positive,Positive
5,"Probably my all-time favorite movie, a story o...",positive,Positive
6,I sure would like to see a resurrection of a u...,positive,Positive
7,"This show was an amazing, fresh & innovative i...",negative,Negative
8,Encouraged by the positive comments about this...,negative,Negative
9,If you like original gut wrenching laughter yo...,positive,Positive


Let's see where the LLM's sentiment differs from the sentiment recorded in the dataset:

In [None]:
# Rows where the lowercased LLM label differs from the dataset label

reviews[reviews.sentiment_llm.str.lower() != reviews.sentiment]

Unnamed: 0,review,sentiment,sentiment_llm
16,Some films just simply should not be remade. T...,positive,Negative
18,"I remember this film,it was the first film i h...",positive,Negative
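Two of the 20 reviews disagree, so the LLM matches the dataset labels on 18 of 20 reviews (90%). The model's capitalization varies ("Positive", "negative"), so labels need normalizing before comparison. A minimal sketch of both steps in plain Python. The helper names are hypothetical, and the toy labels below are for illustration only, not the notebook's data.

```python
def normalize_label(label):
    """Lowercase a label and drop surrounding whitespace and trailing punctuation."""
    return label.strip().strip(".!").lower()

def agreement_rate(expected, predicted):
    """Fraction of positions where the normalized labels match."""
    pairs = list(zip(expected, predicted))
    matches = sum(normalize_label(e) == normalize_label(p) for e, p in pairs)
    return matches / len(pairs)

# Toy labels for illustration only (not the notebook's data)
expected = ["positive", "negative", "positive", "negative"]
predicted = ["Positive", "negative", "Negative", "Negative."]
print(agreement_rate(expected, predicted))
```

On the dataframe, the same function could be called as `agreement_rate(reviews.sentiment, reviews.sentiment_llm)`.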


## Aspect-based sentiment

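We can show an LLM examples of the kind of output we want and "train" it that way. No weights are updated: the examples simply become part of the prompt. This is called few-shot prompting (or one-shot prompting when there is a single example).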
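In the chat format, each example is a `user` message followed by the `assistant` reply we want imitated, and the new input comes last. A sketch of a helper that assembles such a message list. The function name `build_few_shot_messages` and the short example reviews are hypothetical.

```python
def build_few_shot_messages(instruction, examples, query):
    """Assemble chat messages: system instruction, worked examples, then the new input."""
    messages = [{"role": "system", "content": instruction}]
    for example_input, example_output in examples:
        # Each example is a user turn followed by the assistant answer we want imitated
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": query})
    return messages

# Hypothetical short examples for illustration
examples = [
    ("Great cast, dull plot.", "acting: good\nstoryline: poor"),
    ("Weak performances, gripping story.", "acting: poor\nstoryline: good"),
]
messages = build_few_shot_messages(
    "Rate these from the review:\nacting:\nstoryline:", examples, "Loved every minute."
)
print([m["role"] for m in messages])
```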

Let's teach the LLM to rate the acting, storyline, and direction using one worked example, then extract these ratings for another review:

In [None]:
import json

def get_sentiment(review):
    response = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": "gpt-4o",
            "messages": [
                {"role": "system", "content": "Identify how good these are from the review:\nacting:\nstoryline:\ndirection:"},
                {"role": "user", "content": "One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side."},
                {"role": "assistant", "content": "acting: maybe good\nstoryline: good\ndirection: good"},
                {"role": "user", "content": review}
            ]
        }
    )
    result = response.json()
    answer = result["choices"][0]["message"]["content"]
    return answer

print(reviews.review.iloc[3])
print(get_sentiment(reviews.review.iloc[3]))

Basically there's a family where a little boy (Jake) thinks there's a zombie in his closet & his parents are fighting all the time.<br /><br />This movie is slower than a soap opera... and suddenly, Jake decides to become Rambo and kill the zombie.<br /><br />OK, first of all when you're going to make a film you must Decide if its a thriller or a drama! As a drama the movie is watchable. Parents are divorcing & arguing like in real life. And then we have Jake with his closet which totally ruins all the film! I expected to see a BOOGEYMAN similar movie, and instead i watched a drama with some meaningless thriller spots.<br /><br />3 out of 10 just for the well playing parents & descent dialogs. As for the shots with Jake: just ignore them.
acting: good
storyline: poor
direction: poor
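The answer is plain text with one `name: value` pair per line, so it can be turned into a dictionary for further analysis. A minimal sketch, assuming the model keeps to this format (it is not guaranteed to). The helper name `parse_aspects` is hypothetical.

```python
def parse_aspects(answer):
    """Parse 'name: value' lines into a dict, skipping lines without a colon."""
    aspects = {}
    for line in answer.splitlines():
        if ":" not in line:
            continue
        name, value = line.split(":", 1)
        aspects[name.strip().lower()] = value.strip().lower()
    return aspects

# Output copied from the cell above
parsed = parse_aspects("acting: good\nstoryline: poor\ndirection: poor")
print(parsed)
```

If the model omits an aspect, the corresponding key is simply missing, so downstream code should use `parsed.get("acting")` rather than indexing directly.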
