In [None]:
%%capture

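# openai is pinned below 1.0 because this notebook uses the legacy openai.ChatCompletion interface
%pip install "openai<1" nltk ipywidgets numpy requests-cache backoff tiktoken nrclex pandas plotly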

# Sentiment Analysis with ChatGPT

Sentiment Analysis is sort of like the "Hello, world!" of Natural Language Processing (NLP), but luckily for us, it's a bit more fun than just echoing out a string - otherwise this workshop could be a bit bland.

This notebook will guide you through analyzing sentiment with ChatGPT and discuss some of the differences between how you can approach this problem with a generative AI like ChatGPT versus how you might have approached this problem in the past.

**Note**: For a better learning experience, this notebook purposely hides some implementation details like how interactive widgets are created and certain imports of notebook-specific utilities. Full details are available if you open this notebook in your editor of choice or expand the hidden cells.

## What is sentiment analysis?

Sentiment Analysis is a way of analyzing some text to determine if it's positive, negative, or neutral.

This is the kind of thing that's pretty easy for a human who understands the language the text is written in, but it can be hard for a computer to really understand the underlying meaning behind the text.

### Examples

1. "I saw that movie." - Neutral
2. "I love that movie." - Positive
3. "I hate that movie." - Negative

## How do we analyze sentiment?

We'll start with some housekeeping first by making sure that our dependencies are ready.

For this demo, we'll start out by exploring a more traditional approach that uses the Python Natural Language Toolkit (NLTK) and then we'll see how our approach might change when we use ChatGPT via the OpenAI SDK instead.

## Initial Setup

First, we'll import the relevant tools we'll be using in the notebook and configure some global variables.

In [None]:
import os

import nltk
import openai

# download nltk data
nltk.download('vader_lexicon')
nltk.download('punkt')

# globals
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
TEMPERATURE = 0.2
STORY_SAMPLE_SIZE = 5

In [None]:
# this cell focuses on some implementation details for this notebook
# that aren't actually important to the workshop
import ipywidgets as pywidgets
import requests as request
import requests_cache
import backoff

import plotly.express as px
import plotly.io as pio

# configuration widgets
from widgets.config import modelDropdown, apiKeyInput, apiKeyUpdateButton, temperatureSlider, sampleSizeSlider, sampleSizeWarningLabel, openAiHeader, hackerNewsHeader

# project-specific widgets
from widgets.simple import simpleAnalysisWidget
from widgets.advanced import advancedAnalysisWidget, configureOpenAi
from widgets.tokens import tokenAnalysisWidget, configureModel

# project-specific utilities
from utils.obfuscate import obfuscateKey

pio.templates.default = "plotly_white"
pio.renderers.default = "notebook"

# we'll use this session to cache our hacker news api requests
REQUEST_CACHE_EXPIRATION_SECONDS = 60 * 15
session = requests_cache.CachedSession('hackernews_cache', expire_after=REQUEST_CACHE_EXPIRATION_SECONDS)

## Configuration

The code cell below renders a configuration form that you can use to adjust some variables used by other cells in this notebook.

You can make changes to the configuration form at any time and rerun cells that make requests to the OpenAI API or Hacker News API to see how the results change.

You can configure the following values:

- **Open AI API Key**: Your [OpenAI API key](https://platform.openai.com/account/api-keys) is read from the `$OPENAI_API_KEY` environment variable if it's set, but you can override it in this notebook. When you click the **Update Key** button, the key you entered is stored in the `OPENAI_API_KEY` global variable and the value shown in the input is obfuscated.
- **Model**: You can choose between the `gpt-3.5-turbo` and `gpt-4` models for this demo. The `gpt-4` model is more powerful, but it's also slower and more expensive to use.
- **Temperature**: A model's temperature is a measure of how "creative" or "unique" its response will be. You can set this to `0` for something pretty close to deterministic responses to simple queries.
- **Sample Size**: We'll be gathering the top stories from the [Hacker News API](https://github.com/HackerNews/API) and then sending the titles of a sample of those stories to the model for analysis. For quicker, cheaper results you may want to set this to a lower number. The larger your sample, the more tokens will be consumed and the more likely you are to hit rate limits.
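
To build some intuition for temperature, here is a minimal sketch of the way temperature is commonly described: it rescales the scores a model assigns to candidate tokens before one is sampled. The scores and the `softmax_with_temperature` helper are made up for illustration, and this is not a description of OpenAI's actual implementation.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    # assumption: a temperature of 0 is treated as "always pick the best score"
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]

    scaled = [score / temperature for score in logits]
    # subtract the largest value before exponentiating for numerical stability
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

# hypothetical scores for three candidate next tokens
logits = [2.0, 1.0, 0.5]

for temperature in (0, 0.2, 1.0, 2.0):
    probabilities = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
```

Lower temperatures concentrate almost all of the probability on the highest-scoring token, while higher temperatures spread it across the alternatives.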

In [None]:
# this code cell is just used to display a widget
# for us to configure some settings that other cells
# in this notebook rely on
apiKeyInput.value = obfuscateKey(OPENAI_API_KEY)
sampleSizeSlider.value = STORY_SAMPLE_SIZE
temperatureSlider.value = TEMPERATURE

def updateApiKey(event):
  global OPENAI_API_KEY
  OPENAI_API_KEY = apiKeyInput.value
  apiKeyInput.value = obfuscateKey(OPENAI_API_KEY)

def updateSampleSize(change):
  global STORY_SAMPLE_SIZE
  STORY_SAMPLE_SIZE = change['new']

def updateTemperature(change):
  global TEMPERATURE
  TEMPERATURE = change['new']

temperatureSlider.observe(updateTemperature, names='value')
sampleSizeSlider.observe(updateSampleSize, names='value')
apiKeyUpdateButton.on_click(updateApiKey)

apiKeyConfigWidget = pywidgets.HBox([apiKeyInput, apiKeyUpdateButton])
openAiConfigWidget = pywidgets.VBox([openAiHeader, apiKeyConfigWidget, modelDropdown, temperatureSlider])
hackerNewsConfigWidget = pywidgets.VBox([hackerNewsHeader, sampleSizeSlider, sampleSizeWarningLabel])
configWidget = pywidgets.VBox([openAiConfigWidget, hackerNewsConfigWidget])

display(configWidget)

## Simple sentiment analysis with NLTK

Let's take a look at a simple example of sentiment analysis with `nltk` and VADER.

The `SentimentIntensityAnalyzer` returns a dictionary with positive (`pos`), negative (`neg`), and neutral (`neu`) scores for the given text as well as a `compound` score that summarizes the overall sentiment on a scale from -1 (most negative) to 1 (most positive).

For this basic example, we're going to rely on the `compound` score and use a naive rating scale.
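
For intuition about where the `compound` value comes from: as far as we can tell from the VADER source, it adds up a valence score for each word and then squashes that sum into the range -1 to 1. The sketch below reproduces only that squashing step on made-up sums. The function name is ours, the `alpha` value of 15 is an assumption based on our reading of the source, and all of VADER's rules for negation, punctuation, and capitalization are left out.

```python
import math

def normalize_valence(total, alpha=15):
    """Squash an unbounded valence sum into the range [-1, 1]."""
    score = total / math.sqrt(total * total + alpha)
    # clamp to guard against floating point drift at the extremes
    return max(-1.0, min(1.0, score))

# made-up valence sums, from strongly negative to strongly positive
for total in (-6.0, -1.5, 0.0, 1.5, 6.0):
    print(total, round(normalize_valence(total), 4))
```

Because the curve flattens out as the sum grows, a handful of strongly worded terms is enough to push a short sentence past the `0.75` threshold we use for "very positive".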

In [None]:
# import the VADER sentiment analyzer
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# instantiate the sentiment analyzer
analyzer = SentimentIntensityAnalyzer()

# analyze the sentiment of a string of text
def analyzeSentiment(text):
  if not text:
    return ''

  # use VADER to get the +/- sentiment of the string
  compound = analyzer.polarity_scores(text)['compound']

  # map the compound score to a human readable label
  if compound >= 0.75:
    return 'very positive'
  elif compound >= 0.4:
    return 'positive'
  elif compound >= 0.1:
    return 'leaning positive'
  elif compound <= -0.75:
    return 'very negative'
  elif compound <= -0.4:
    return 'negative'
  elif compound <= -0.1:
    return 'leaning negative'
  else:
    return 'neutral'
  
# some simple test statements for our analyzer
statements = [
  'I love that movie.',
  'I hate that movie.',
  'I like that movie.',
  'I dislike that movie.',
  'I saw that movie.',
]

for statement in statements:
  print(f"{statement} ({analyzeSentiment(statement)})")

We've wired the input below up to the same analyzer function from above. Type in some text and see how the analyzer responds.

In [None]:
# this code cell is just used to display a widget
# that uses the analyzeSentiment function we created
display(simpleAnalysisWidget)

## How Sentiment Analysis Works

Sentiment analysis, like most text analysis, involves a multistep process:

1. **Stemming / Lemmatization**: reduces the words in the text to their root forms to simplify comparison between different forms of the same words
   1. **Stemming**: removes suffixes as an attempt to reduce words to their root forms
   2. **Lemmatization**: uses a morphological analysis of words to reduce them to their root forms
2. **Tokenization**: breaks the text into individual units of meaning called tokens
3. **Vectorization**: converts the tokens into numeric IDs that can be used for comparison
4. **Comparison**: compares the tokens to a known set of tokens to determine the sentiment

In this case we're taking advantage of an existing tool: VADER pairs a human-rated sentiment lexicon with a set of rules, so we don't have to build or train anything ourselves. If we wanted to build our own analyzer from scratch, it would be a more complicated process, and a machine learning approach would require labeled training data to feed into the model.
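
The steps listed above can be sketched with a toy analyzer. Everything here is made up for illustration: the four-word lexicon, the suffix-stripping `stem` function, and the scores. It also splits the text into words before stemming, since the stemmer needs individual words to work on. Real tools are far more careful at every step.

```python
import re

# tiny, made-up lexicon mapping words to a sentiment score
LEXICON = {"love": 2.0, "like": 1.0, "hate": -2.0, "dislike": -1.0}

def tokenize_words(text):
    """Tokenization: split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def stem(token):
    """Stemming: naively strip a few common suffixes."""
    for suffix in ("ing", "ed", "es", "s", "e"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

# stem the lexicon too, so "love", "loved", and "loving" share one root
ROOT_SCORES = {stem(word): score for word, score in LEXICON.items()}

# vectorization: give every known root a numeric ID
VOCABULARY = {root: index for index, root in enumerate(sorted(ROOT_SCORES))}
SCORE_BY_ID = {index: ROOT_SCORES[root] for root, index in VOCABULARY.items()}

def toy_sentiment(text):
    """Comparison: look up each known token ID and add up the scores."""
    roots = [stem(token) for token in tokenize_words(text)]
    ids = [VOCABULARY[root] for root in roots if root in VOCABULARY]
    return sum(SCORE_BY_ID[token_id] for token_id in ids)

for statement in ("I loved that movie.", "I hated that movie.", "I saw that movie."):
    print(statement, toy_sentiment(statement))
```

A sentence with no words from the lexicon scores `0`, which is one reason short texts so often come back as neutral.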

With the advent of Generative Pre-trained Transformer (GPT) models like those that power ChatGPT, and the other transformer models that have exploded in popularity since, we can lean on a model's general language capabilities to perform sentiment analysis without training our own model. We can even use prompting techniques to quickly teach the model how to perform more specialized analyses.

## Real world example

So, let's see how this works with text written by other people who didn't know that someone would be analyzing its sentiment.

For this example, we'll pull in a random sample of the [top stories](https://github.com/HackerNews/API#new-top-and-best-stories) on [Hacker News](https://news.ycombinator.com/) and analyze the sentiment of each submission's title.

You can run the cell below a few times to generate different samples of the top stories until you find a collection you prefer and then rerun the cells below to use that sample for the rest of the notebook.

In [None]:
import numpy as np

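def getSampleStories(sampleSize=None):
  # read the sample size at call time so changes made in the configuration form are picked up
  if sampleSize is None:
    sampleSize = STORY_SAMPLE_SIZE

  topStoryIdsRequest = session.get('https://hacker-news.firebaseio.com/v0/topstories.json')

  if topStoryIdsRequest.status_code != 200:
    # raise instead of calling exit(), which would shut down the notebook kernel
    raise RuntimeError('There was a problem getting the top stories from Hacker News')

  topStoryIds = topStoryIdsRequest.json()

  # never ask for more stories than the API returned
  sampleSize = min(sampleSize, len(topStoryIds))
  storyIds = np.array(topStoryIds)[np.random.choice(len(topStoryIds), sampleSize, replace=False)]

  return storyIds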


def getStoryDetails(storyId):
  # we'll use the same request cache so that we don't have to request a story's details more than once
  storyRequest = session.get(f'https://hacker-news.firebaseio.com/v0/item/{storyId}.json')

  if storyRequest.status_code != 200:
    print(f'There was a problem getting story {storyId} from Hacker News')
    return None
  else:
    story = storyRequest.json()

  return story


def getStories(storyIds):
  stories = {}

  for storyId in storyIds:
    story = getStoryDetails(storyId)

    # skip stories that failed to load or that have no title
    if story and 'title' in story:
      stories[storyId] = {
        "title": story['title'],
        "time": story['time'],
        "sentiment": {
          "vader": '',
          "nrclex": {},
          "openai": {}
        }
      }
  
  return stories


stories = getStories(getSampleStories())

for storyId, story in stories.items():
  print(story['title'])

In [None]:
def analyzeStories(stories):
  for _, story in stories.items():
    story['sentiment']['vader'] = analyzeSentiment(story['title'])
    print(f"{story['title']} ({story['sentiment']['vader']})")


analyzeStories(stories)

## How ChatGPT works

In responding to our prompts, ChatGPT follows a similar process to the NLP workflow described above.

It breaks our prompts into tokens and then predicts which tokens should most logically follow the ones that we've provided.
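
To make "predicting which tokens should follow" a little more concrete, here is a toy sketch that counts which word follows which in a tiny, made-up corpus and then predicts the most frequent follower. This is only an analogy: the models behind ChatGPT use neural networks over subword tokens and take the whole conversation into account, not just the previous word.

```python
from collections import Counter, defaultdict

# a tiny, made-up corpus
corpus = [
    "i love that movie",
    "i love that song",
    "i hate that movie",
    "i love that movie",
]

# count how often each word follows each other word
following = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequently observed follower of a word, or None."""
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]

print(predict_next("i"))
print(predict_next("that"))
```

With this corpus, "i" is most often followed by "love" and "that" is most often followed by "movie", so those are the predictions.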

### Tokens

Tokenization breaks text down into units of meaning, and just like the stemming/lemmatization that we discussed earlier, you'll notice that words are often broken down into their roots and suffixes.

You can test out the tokenizer used by ChatGPT below. There's also a great visual breakdown available at [https://gpt-tokenizer.dev/](https://gpt-tokenizer.dev/).

#### Example

In [None]:
import tiktoken

def tokenize(text):
  tokens = []
  ids = []
  
  # To get the tokeniser corresponding to a specific model in the OpenAI API:
  encoding = tiktoken.encoding_for_model(modelDropdown.value)

  tokenized = encoding.encode(text)

  for tokenId in tokenized:
    ids.append(tokenId)
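    # a single token can hold part of a multi-byte character, so replace undecodable bytes instead of raising
    tokens.append(encoding.decode_single_token_bytes(tokenId).decode('utf-8', errors='replace'))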

  return (tokens, ids)

statements = [
  'I love that movie.',
  'I hate that movie.',
  'I like that movie.',
  'I dislike that movie.',
  'I saw that movie.',
]

for statement in statements:
  (statementTokens, statementIds) = tokenize(statement)
  print(f"{statementTokens} ({len(statementTokens)} tokens)")
  print(f"{statementIds}")
  print('---')

We've wired the input below up to the same tokenizer function from above. Type in some text and see how the tokenizer responds.

In [None]:
# this code cell is just used to display a widget
configureModel(modelDropdown.value)

display(tokenAnalysisWidget)

## Prompt engineering

[Prompt engineering](https://en.wikipedia.org/wiki/Prompt_engineering), prompt design, or simply "prompting" is the process of creating and testing instructions for the model (called "prompts") to find the right instructions that return your desired results as often as possible.

Generally, each message you send and each response that you receive become part of the overall prompt for the next message.

You can think of the overall conversation as a document of text - it can help to imagine it as a screenplay.

There are various types of messages that make up this screenplay:

- **System**: system messages are sort of like stage directions: they describe the overall parameters that the model should follow and provide any other context that the model might need to know about as the conversation continues
- **User**: user messages are the individual prompts that the user sends to the model
- **Assistant**: assistant messages are the responses the model generates to the user's prompts

The whole thing looks a bit like this:

```
[System]
Assistant is a large language model trained by OpenAI.
Knowledge cutoff: 2021-09
Current date: 2023-08-17
Browsing: disabled

[User]
Who won the Super Bowl in 2022?

[Assistant]
I'm sorry, but I don't have access to real-time information as my knowledge was last updated in September 2021. To find out the winner of the Super Bowl in 2022, I recommend checking a reliable sports news website, a sports app, or conducting a quick internet search for the most up-to-date information.
```

**Note**: There are also messages with the type `function` that indicate that the Assistant would like to execute the function with the given name and pass it the given parameters, but for this demo, we'll be ignoring those.
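
In code, this screenplay is just a list of messages that grows with every turn. The sketch below is a minimal illustration: `render_screenplay` is a hypothetical helper of ours, and the assistant reply is hard-coded where a real model response would normally go.

```python
def render_screenplay(messages):
    """Format a list of chat messages as a screenplay-style transcript."""
    sections = []
    for message in messages:
        sections.append(f"[{message['role'].capitalize()}]\n{message['content']}")
    return "\n\n".join(sections)

# the conversation so far, oldest message first
conversation = [
    {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
    {"role": "user", "content": "Who won the Super Bowl in 2022?"},
]

# a hard-coded reply; in practice this text would come from the model
conversation.append({
    "role": "assistant",
    "content": "I'm sorry, but I don't have access to real-time information.",
})

# a follow-up question is sent along with everything that came before it
conversation.append({"role": "user", "content": "Who won in 2020?"})

print(render_screenplay(conversation))
```

Because the whole list is sent each time, longer conversations consume more tokens per request.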

### Basic example

Here's an example of a basic prompt for sentiment analysis:

In [None]:
BASIC_SYSTEM_PROMPT = """
You are VibeCheck, an advanced AI system for detecting the sentiment conveyed in user-generated text.

The user will provide you with a prompt, and you will respond with the sentiment of that prompt.

Do not include any punctuation and only use lower case letters.
"""

@backoff.on_exception(backoff.expo, openai.error.RateLimitError)
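def basicChatGptSentiment(prompt, model=None):
    # look up the selected model at call time so changes to the dropdown take effect
    model = model or modelDropdown.value
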
    messages = [{ "role": "system", "content": BASIC_SYSTEM_PROMPT }]

    messages.append({"role": "user", "content": prompt})

    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=TEMPERATURE,
    )

    return response.choices[0].message["content"]

Let's apply this to our Hacker News example from earlier and compare the sentiment detected by ChatGPT to the sentiment we received from the VADER analysis.

In [None]:
if OPENAI_API_KEY:
  # make sure the SDK uses the key from the configuration form
  openai.api_key = OPENAI_API_KEY

  for storyId, story in stories.items():
    sentiment = basicChatGptSentiment(story['title'])
    
    if modelDropdown.value not in story['sentiment']['openai']:
      story['sentiment']['openai'][modelDropdown.value] = {}

    story['sentiment']['openai'][modelDropdown.value]['basic'] = sentiment

    print(f"{story['title']}\nVADER: {story['sentiment']['vader']}\n{modelDropdown.value}: {sentiment}\n---")
else:
  print('Please enter your OpenAI API key above and rerun this cell')

### Going further

What if we wanted to dig a bit deeper and consider the emotions that might be associated with some text rather than just a simple positive to negative spectrum?

There are tools like [NRCLex](https://pypi.org/project/nrclex/) that can help us with this, too.

In [None]:
from nrclex import NRCLex

def getNRCEmotion(text):
  emotion = NRCLex(text)

  return emotion.top_emotions

for storyId, story in stories.items():
  emotions = []

  emotionAnalysis = getNRCEmotion(story['title'])

  for (emotion, value) in emotionAnalysis:
    if value > 0.00:
      emotions.append(emotion)

  story['sentiment']['nrclex'] = ", ".join(emotions)

  print(f"{story['title']} {('(' + ', '.join(emotions) + ')') if len(emotions) else ''}")

With titles this short, though, NRCLex doesn't always get good results, and sometimes it disagrees with the VADER sentiment analysis, too.

We could try to train a model to do this for us, but that would be a lot of work and we'd need a lot of data to train it on.

Luckily, we can pretty easily adapt our initial prompt to get ChatGPT to do this for us, too.

In [None]:
ADVANCED_SYSTEM_PROMPT = """
You are VibeCheck, an advanced AI system for detecting the sentiment conveyed in user-generated text.

The user will provide you with a prompt, and you will analyze it following these steps:

1. Analyze the prompt for relevant emotion, tone, affinity, sarcasm, irony, etc.
2. Analyze the likely emotional state of the author based on those findings
3. Summarize the emotional state and sentiment of the prompt based on your findings using 5 or fewer names for emotions using lowercase letters and separating each emotional state with a comma

Only return the output from the final step to the user.
"""

@backoff.on_exception(backoff.expo, openai.error.RateLimitError)
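def advancedChatGptSentiment(prompt, model=None):
    # look up the selected model at call time so changes to the dropdown take effect
    model = model or modelDropdown.value
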
    messages = [{ "role": "system", "content": ADVANCED_SYSTEM_PROMPT }]

    messages.append({"role": "user", "content": prompt})

    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=TEMPERATURE,
    )

    return response.choices[0].message["content"]

In [None]:
if OPENAI_API_KEY:
  for storyId, story in stories.items():
    sentiment = advancedChatGptSentiment(story['title'])

    if modelDropdown.value not in story['sentiment']['openai']:
      story['sentiment']['openai'][modelDropdown.value] = {}

    story['sentiment']['openai'][modelDropdown.value]['advanced'] = sentiment

    print(f"{story['title']} ({sentiment})")
else:
  print('Please enter your OpenAI API key above and rerun this cell')
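
The prompt asks for a comma-separated list of lowercase emotion names, but a model won't always follow formatting instructions exactly. If we wanted to work with the emotions as data, a small cleanup step helps. `parse_emotions` below is a hypothetical helper, not part of the notebook's utilities, and the sample response is made up.

```python
def parse_emotions(response_text, limit=5):
    """Turn a response like 'Curious, skeptical.' into ['curious', 'skeptical']."""
    emotions = []
    for part in response_text.split(","):
        # trim whitespace and trailing periods, then normalize the case
        cleaned = part.strip().strip(".").lower()
        # drop empty entries and duplicates while preserving the order
        if cleaned and cleaned not in emotions:
            emotions.append(cleaned)
    return emotions[:limit]

# a made-up response with inconsistent spacing, casing, and a duplicate
print(parse_emotions("Curious, skeptical , Curious."))
```

Keeping the result as a list makes it easier to count or compare emotions across stories later on.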

In [None]:
# this code cell is just used to display a widget
# that uses the analyzeSentiment function we created
# as well as the advancedChatGptSentiment function
configureOpenAi(OPENAI_API_KEY, modelDropdown.value, TEMPERATURE)

display(advancedAnalysisWidget)

## Beyond sentiment

What if we were looking to do something a little more complicated than just sentiment analysis?

What if we wanted to describe the sentiment of some text via an emoji?

In [None]:
EMOJI_SYSTEM_PROMPT = """
You are VibeCheck, an advanced AI system for detecting the sentiment conveyed in user-generated text.

The user will provide you with a prompt, and you will analyze it following these steps:

1. Analyze the prompt for relevant emotion, tone, affinity, sarcasm, irony, etc.
2. Analyze the likely emotional state of the author based on those findings
3. Summarize the emotional state and sentiment of the prompt based on your findings into 5 or fewer names for emotional states
4. Convert each of the emotional states you identified into a representative emoji

Only return the output from the final step to the user.
"""

@backoff.on_exception(backoff.expo, openai.error.RateLimitError)
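def emojiChatGptSentiment(prompt, model=None):
    # look up the selected model at call time so changes to the dropdown take effect
    model = model or modelDropdown.value
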
    messages = [{ "role": "system", "content": EMOJI_SYSTEM_PROMPT }]

    messages.append({"role": "user", "content": prompt})

    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=TEMPERATURE,
    )

    return response.choices[0].message["content"]

In [None]:
if OPENAI_API_KEY:
  for storyId, story in stories.items():
    sentiment = emojiChatGptSentiment(story['title'])

    if modelDropdown.value not in story['sentiment']['openai']:
      story['sentiment']['openai'][modelDropdown.value] = {}

    story['sentiment']['openai'][modelDropdown.value]['emoji'] = sentiment

    print(f"{story['title']} ({sentiment})")
else:
  print('Please enter your OpenAI API key above and rerun this cell')

## Prompting strategies

In the previous examples we've been using "zero shot" prompting, which means we're asking the model to respond without giving it an example of the kind of response we'd like.

There are other prompting strategies we can employ, though:

- **One shot**: gives the model a single example of how we'd like it to respond to guide its output
- **Few shot**: gives the model a few examples of how we'd like it to respond to different prompts to help guide its output
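
One way to see how these strategies relate is that they all produce the same kind of message list and differ only in how many worked examples sit between the system prompt and the real prompt. The `build_messages` helper, the system prompt, and the example pairs below are placeholders for illustration, not model output.

```python
def build_messages(system_prompt, examples, prompt):
    """Build a chat message list with zero or more worked examples."""
    messages = [{"role": "system", "content": system_prompt}]

    # each example is a (user text, desired assistant reply) pair
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})

    # the real prompt always comes last
    messages.append({"role": "user", "content": prompt})
    return messages

# placeholder examples, not real model output
examples = [("I love that movie.", "😍"), ("I hate that movie.", "😡")]

zero_shot = build_messages("Reply with one emoji.", [], "I saw that movie.")
one_shot = build_messages("Reply with one emoji.", examples[:1], "I saw that movie.")
few_shot = build_messages("Reply with one emoji.", examples, "I saw that movie.")

print(len(zero_shot), len(one_shot), len(few_shot))
```

Each example adds two messages, so more examples mean more tokens sent with every request.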

### One shot prompting

In [None]:
# Grabbed from https://news.ycombinator.com/ at 2023-09-20 13:00 EDT
# Reference: https://news.ycombinator.com/item?id=37598299
ONE_SHOT_USER_EXAMPLE = "Cisco pulled out of the SentinelOne acquisition after due dilligence"

ONE_SHOT_BOT_EXAMPLE = "🤨"

@backoff.on_exception(backoff.expo, openai.error.RateLimitError)
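def oneShotChatGptSentiment(prompt, model=None):
    # look up the selected model at call time so changes to the dropdown take effect
    model = model or modelDropdown.value
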
    messages = [
        { "role": "system", "content": EMOJI_SYSTEM_PROMPT },
        {"role": "user", "content": ONE_SHOT_USER_EXAMPLE },
        {"role": "assistant", "content": ONE_SHOT_BOT_EXAMPLE }
    ]

    messages.append({"role": "user", "content": prompt})

    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=TEMPERATURE,
    )

    return response.choices[0].message["content"]

In [None]:
if OPENAI_API_KEY:
  for storyId, story in stories.items():
    sentiment = oneShotChatGptSentiment(story['title'])

    if modelDropdown.value not in story['sentiment']['openai']:
      story['sentiment']['openai'][modelDropdown.value] = {}

    story['sentiment']['openai'][modelDropdown.value]['oneshot'] = sentiment

    print(f"{story['title']} ({sentiment})")
else:
  print('Please enter your OpenAI API key above and rerun this cell')

### Few shot prompting

In [None]:
# Grabbed from https://news.ycombinator.com/ at 2023-09-20 13:10 EDT
FEW_SHOT_USER_EXAMPLES = [
  # Reference: https://news.ycombinator.com/item?id=37598299
  "Cisco pulled out of the SentinelOne acquisition after due dilligence",
  # Reference: https://news.ycombinator.com/item?id=37595898
  "Atlassian cripples Jira automation for all but enterprise customers",
  # Reference: https://news.ycombinator.com/item?id=37586264
  "Toyota Research claims breakthrough in teaching robots new behaviors"
]

FEW_SHOT_BOT_EXAMPLES = [
  "🤨",
  "😖",
  "👏",
]

@backoff.on_exception(backoff.expo, openai.error.RateLimitError)
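def fewShotChatGptSentiment(prompt, model=None):
    # look up the selected model at call time so changes to the dropdown take effect
    model = model or modelDropdown.value
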
    messages = [{ "role": "system", "content": EMOJI_SYSTEM_PROMPT }]

    for i, userExample in enumerate(FEW_SHOT_USER_EXAMPLES):
        messages.append({"role": "user", "content": userExample })
        messages.append({"role": "assistant", "content": FEW_SHOT_BOT_EXAMPLES[i] })

    messages.append({"role": "user", "content": prompt})

    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=TEMPERATURE,
    )

    return response.choices[0].message["content"]

In [None]:
if OPENAI_API_KEY:
  for storyId, story in stories.items():
    sentiment = fewShotChatGptSentiment(story['title'])

    if modelDropdown.value not in story['sentiment']['openai']:
      story['sentiment']['openai'][modelDropdown.value] = {}

    story['sentiment']['openai'][modelDropdown.value]['fewshot'] = sentiment

    print(f"{story['title']} ({sentiment})")
else:
  print('Please enter your OpenAI API key above and rerun this cell')

## Comparing methods

Let's compare the sentiment analysis results of the different approaches we've taken in this workshop.

In [None]:
print(stories)

## Comparing approaches

We've looked at various approaches to analyzing sentiment and explored some interesting and novel ways that we can work with AI models like ChatGPT to perform tasks that used to require large investments of time to gather and label data and then train a model.

Let's compare the results of each analysis.
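
If we wanted a rough number to go with the side-by-side view, one option is to collapse each label to a coarse polarity and count how often two analyzers agree. The helpers and the sample pairs below are made up for illustration. The sketch assumes the labels contain the word "positive" or "negative", and it treats anything else as neutral, which won't hold for every free-form response.

```python
def coarse_polarity(label):
    """Collapse a label like 'very positive' to positive, negative, or neutral."""
    label = label.lower()
    if "positive" in label:
        return "positive"
    if "negative" in label:
        return "negative"
    # assumption: anything we don't recognize counts as neutral
    return "neutral"

def agreement_rate(pairs):
    """Share of (label, label) pairs whose coarse polarity matches."""
    if not pairs:
        return 0.0
    matches = sum(1 for first, second in pairs
                  if coarse_polarity(first) == coarse_polarity(second))
    return matches / len(pairs)

# made-up (VADER label, ChatGPT label) pairs
sample = [
    ("very positive", "positive"),
    ("leaning negative", "neutral"),
    ("neutral", "neutral"),
    ("negative", "negative"),
]

print(agreement_rate(sample))
```

Three of the four made-up pairs agree, so this prints `0.75`.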

In [None]:
import pandas as pd

sentimentData = {
  "Story": [],
  "VADER": [],
  "NRC": [],
  "ChatGPT (Sentiment)": [],
  "ChatGPT (Emotion)": [],
  "Zero Shot": [],
  "One Shot": [],
  "Few Shot": [],
}

for storyId, story in stories.items():
  sentiment = story['sentiment']

  # results for the currently selected model; empty if the OpenAI cells haven't been run
  modelResults = sentiment.get('openai', {}).get(modelDropdown.value, {})

  # always append a value (blank when missing) so every column has the same length,
  # which pandas requires when building a DataFrame from a dict of lists
  sentimentData["Story"].append(story.get('title', ''))
  sentimentData["VADER"].append(sentiment.get('vader', ''))
  sentimentData["NRC"].append(sentiment.get('nrclex') or '')
  sentimentData["ChatGPT (Sentiment)"].append(modelResults.get('basic', ''))
  sentimentData["ChatGPT (Emotion)"].append(modelResults.get('advanced', ''))
  sentimentData["Zero Shot"].append(modelResults.get('emoji', ''))
  sentimentData["One Shot"].append(modelResults.get('oneshot', ''))
  sentimentData["Few Shot"].append(modelResults.get('fewshot', ''))

In [None]:
# this cell is only used to display a dataframe of our sentiment analysis results
sentimentDataFrame = pd.DataFrame(data=sentimentData, columns=["Story", "VADER", "ChatGPT (Sentiment)"])

display(sentimentDataFrame)

In [None]:
# this code cell is only used to display a dataframe with our emotional analysis results
emotionDataFrame = pd.DataFrame(data=sentimentData, columns=["Story", "NRC", "ChatGPT (Emotion)"])

# often NRCLex will not have data and instead of displaying NaN we'll leave it blank
emotionDataFrame = emotionDataFrame.fillna('')

display(emotionDataFrame)

In [None]:
# this cell is just used to display a dataframe with our emoji results
emojiDataFrame = pd.DataFrame(data=sentimentData, columns=["Story", "Zero Shot", "One Shot", "Few Shot"])

display(emojiDataFrame)