# Sentiment Analysis with ChatGPT

Sentiment Analysis is sort of like the "Hello, world!" of Natural Language Processing (NLP), but luckily for us, it's a bit more fun than just echoing out a string.

This notebook walks you through analyzing sentiment with ChatGPT and discusses how approaching this problem with a generative AI like ChatGPT differs from the more traditional approaches you might have used in the past.

## What is sentiment analysis?

Sentiment analysis is a way of analyzing some text to determine whether the attitude it expresses is positive, negative, or neutral.

This is the kind of thing that's pretty easy for a human who understands the language the text is written in, but it can be hard for a computer to really understand the underlying meaning behind the text.

### Examples

1. "I saw that movie." - Neutral
2. "I love that movie." - Positive
3. "I hate that movie." - Negative

## How do we analyze sentiment?

We'll start with some housekeeping by making sure that our dependencies are ready.

For this demo, we'll start out by exploring a more traditional approach that uses the Python Natural Language Toolkit (NLTK) and then we'll see how our approach might change when we use ChatGPT via the OpenAI SDK instead.

In [1]:
%%capture

%pip install openai nltk ipywidgets

import os

import nltk
import ipywidgets as pywidgets

from widgets.simple import simpleAnalysisWidget
from widgets.config import modelDropdown, apiKeyInput, apiKeyUpdateButton
from utils.obfuscate import obfuscateKey

nltk.download('vader_lexicon')

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
DISPLAY_KEY = obfuscateKey(OPENAI_API_KEY)


## Simple sentiment analysis with NLTK

Let's take a look at a simple example of sentiment analysis with `nltk` and VADER.

The `SentimentIntensityAnalyzer` returns a dictionary with positive (`pos`), negative (`neg`), and neutral (`neu`) scores for the given text, as well as a `compound` score that summarizes the overall sentiment on a scale from -1 (most negative) to +1 (most positive).

For this basic example, we're going to rely on the `compound` score and use a naive rating scale.

In [2]:
# import the VADER sentiment analyzer
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# instantiate the sentiment analyzer
analyzer = SentimentIntensityAnalyzer()

# analyze the sentiment of a string of text
def analyzeSentiment(text):
  if not text:
    return ''

  # use VADER to get the compound (overall) sentiment score of the string
  compound = analyzer.polarity_scores(text)['compound']

  # map the score to a human readable label, checking the
  # strongest thresholds first so each range needs only one bound
  if compound >= 0.75:
    return 'Very Positive'
  elif compound >= 0.4:
    return 'Positive'
  elif compound >= 0.1:
    return 'Leaning Positive'
  elif compound <= -0.75:
    return 'Very Negative'
  elif compound <= -0.4:
    return 'Negative'
  elif compound <= -0.1:
    return 'Leaning Negative'
  else:
    return 'Neutral'

Now let's test this analyzer with some example strings.

In [3]:
# some simple test statements for our analyzer
statements = [
  'I love that movie.',
  'I hate that movie.',
  'I like that movie.',
  'I dislike that movie.',
  'I saw that movie.',
]

for statement in statements:
  print(f"{statement} ({analyzeSentiment(statement)})")

I love that movie. (Positive)
I hate that movie. (Negative)
I like that movie. (Leaning Positive)
I dislike that movie. (Leaning Negative)
I saw that movie. (Neutral)


We've wired the input below up to the same analyzer function from above. Type in some text and see how the analyzer responds.

In [7]:
# this code cell is just used to display a widget
# that uses the analyzeSentiment function we created
display(simpleAnalysisWidget)


## How sentiment analysis works

Sentiment analysis, like most text analysis, involves a multistep process:

1. **Tokenization**: breaks the text into individual units of meaning called tokens
2. **Stemming / Lemmatization**: reduces the words in the text to their root forms to simplify comparison between different forms of the same words
   1. **Stemming**: removes suffixes as an attempt to reduce words to their root forms
   2. **Lemmatization**: uses a morphological analysis of words to reduce them to their root forms
3. **Vectorization**: converts the tokens into numeric IDs that can be used for comparison
4. **Comparison**: compares the tokens to a known set of tokens to determine the sentiment
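
The steps above can be sketched as a toy pipeline in plain Python. This is only an illustration and not how VADER works internally: the lexicon, its scores, and the helper names (`tokenize`, `stem`, `vectorize`, `classify`) are all made up for this example, and the stemmer is a deliberately naive suffix stripper. The text is tokenized before stemming here because the stemmer operates on individual words.

```python
import re

# hypothetical toy lexicon: word stems mapped to made-up sentiment scores
LEXICON = {"lov": 1.0, "lik": 0.5, "hat": -1.0, "dislik": -0.5}

# vectorization tables: every known stem gets an integer id
VOCAB = {stem_: idx for idx, stem_ in enumerate(LEXICON)}
SCORE_BY_ID = {VOCAB[stem_]: score for stem_, score in LEXICON.items()}
UNKNOWN_ID = -1


def tokenize(text):
    # break the text into lowercase word tokens
    return re.findall(r"[a-z']+", text.lower())


def stem(token):
    # naive stemming: strip the first matching suffix, keeping at least 3 letters
    for suffix in ("ing", "ed", "es", "e", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def vectorize(stems):
    # convert each stem into its id, or UNKNOWN_ID if it is not in the vocabulary
    return [VOCAB.get(s, UNKNOWN_ID) for s in stems]


def classify(text):
    # compare the ids against the known set and add up their scores
    ids = vectorize(stem(t) for t in tokenize(text))
    total = sum(SCORE_BY_ID.get(i, 0.0) for i in ids)
    if total > 0:
        return "Positive"
    if total < 0:
        return "Negative"
    return "Neutral"


for sentence in ["I loved that movie.", "I hated that movie.", "I saw that movie."]:
    print(sentence, "->", classify(sentence))
```

A real analyzer handles much more than this sketch does, such as negation ("not good"), intensifiers ("very"), and punctuation.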

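In this case we're taking advantage of an existing tool: VADER is a lexicon- and rule-based analyzer that ships with a precompiled list of words and their sentiment scores, so we don't have to train anything ourselves. If we wanted to build our own model from scratch, it would be a more complicated process and would require training data to feed into the model.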

With the advent of Generative Pre-trained Transformer (GPT) models like those that power ChatGPT, and the other transformer models that have exploded in popularity since, we can leverage the inference and predictive capabilities of these models to perform sentiment analysis without having to train our own model. We can even use some prompting techniques to quickly teach the model how to perform more specialized analyses.

## How ChatGPT works

Break down how ChatGPT turns text into tokens and then predicts the most likely tokens to follow the given text so far.
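
As a rough intuition for "predict the most likely next token", here is a toy sketch. It is an assumption-heavy simplification: it splits a tiny made-up corpus on whitespace and counts which token most often follows each token (a bigram model), whereas GPT models use subword tokens and a neural network that considers the whole preceding context. The corpus and the helper names `predict_next` and `generate` are invented for this example.

```python
from collections import Counter, defaultdict

# tiny made-up corpus, already split into tokens by whitespace
corpus = "i love that movie . i love that song . i hate that movie ."
tokens = corpus.split()

# count how often each token follows each other token
following = defaultdict(Counter)
for current, nxt in zip(tokens, tokens[1:]):
    following[current][nxt] += 1


def predict_next(token):
    # return the most frequently observed next token, or None if the token is unseen
    if token not in following:
        return None
    return following[token].most_common(1)[0][0]


def generate(start, length=3):
    # repeatedly append the most likely next token to the text so far
    output = [start]
    for _ in range(length):
        nxt = predict_next(output[-1])
        if nxt is None:
            break
        output.append(nxt)
    return output


print(" ".join(generate("i")))
```

The loop in `generate` mirrors the general idea of producing text one token at a time, each choice conditioned on what has been produced so far.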

## Prompt engineering

Describe prompt engineering and break down system, user, and assistant prompts.
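
To make the three roles concrete, here is a minimal sketch of how a conversation is commonly laid out as a list of messages for a chat model. The `system`, `user`, and `assistant` role names are the ones used by chat-style APIs; the `build_messages` helper and the prompt wording are hypothetical and only meant for illustration.

```python
def build_messages(system_prompt, user_text, history=None):
    # the system message sets the overall behavior of the model
    messages = [{"role": "system", "content": system_prompt}]

    # earlier user/assistant turns give the model the conversation so far
    messages.extend(history or [])

    # the newest user message is what the model should respond to
    messages.append({"role": "user", "content": user_text})
    return messages


history = [
    {"role": "user", "content": "I love that movie."},
    {"role": "assistant", "content": "Positive"},
]

messages = build_messages(
    "You are a sentiment classifier. Reply with Positive, Negative, or Neutral.",
    "I saw that movie.",
    history,
)

for message in messages:
    print(f"{message['role']}: {message['content']}")
```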

## Zero-shot and few-shot prompting

Discuss the differences between few-shot and zero-shot prompting and give some examples.
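
As a starting point for those examples, the sketch below builds the same classification request two ways: a zero-shot prompt that only contains instructions, and a few-shot prompt that also includes a handful of labeled examples as prior user/assistant turns. The instruction text and the helper names `zero_shot` and `few_shot` are assumptions made for this illustration; the labeled examples reuse the movie statements from earlier in this notebook.

```python
INSTRUCTIONS = (
    "Classify the sentiment of the user's text as Positive, Negative, or Neutral. "
    "Reply with the label only."
)


def zero_shot(text):
    # zero-shot: instructions only, no worked examples
    return [
        {"role": "system", "content": INSTRUCTIONS},
        {"role": "user", "content": text},
    ]


def few_shot(text, examples):
    # few-shot: show the model labeled examples before the real input
    messages = [{"role": "system", "content": INSTRUCTIONS}]
    for example_text, label in examples:
        messages.append({"role": "user", "content": example_text})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": text})
    return messages


examples = [
    ("I saw that movie.", "Neutral"),
    ("I love that movie.", "Positive"),
    ("I hate that movie.", "Negative"),
]

print(len(zero_shot("That movie was fine, I guess.")), "messages in the zero-shot prompt")
print(len(few_shot("That movie was fine, I guess.", examples)), "messages in the few-shot prompt")
```

Few-shot examples are also a convenient way to show the model a custom label set or output format without any training.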

In [5]:
# this code cell is just used to display a widget
# for us to configure the OpenAI API and to update
# your API key if you need to change it
apiKeyInput.value = obfuscateKey(OPENAI_API_KEY)

def updateApiKey(event):
  global OPENAI_API_KEY

  # store the updated key in our global variable
  OPENAI_API_KEY = apiKeyInput.value

  # obfuscate the displayed key
  apiKeyInput.value = obfuscateKey(OPENAI_API_KEY)


openAiConfigWidget = pywidgets.Box([apiKeyInput, apiKeyUpdateButton, modelDropdown], layout=pywidgets.Layout(display='flex', flex_direction='column', align_items='center', width='100%'))

apiKeyUpdateButton.on_click(updateApiKey)

display(openAiConfigWidget)


In [6]:
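# NOTE: a sketch that assumes the openai>=1.0 Python SDK; the earlier
# draft of this cell used the legacy openai.ChatCompletion interface
from openai import OpenAI

def get_completion(prompt, model=None):
    # assumption: the model name comes from modelDropdown, read at call
    # time so that changing the dropdown takes effect
    model = model or modelDropdown.value
    client = OpenAI(api_key=OPENAI_API_KEY)
    messages = [{"role": "user", "content": prompt}]

    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )

    return response.choices[0].message.content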