# Natural Language Processing < Sentiment Analysis />

In this module, we will work with both Hugging Face Transformers and OpenAI's GPT-3.5 for the task of sentiment analysis.

## Hugging Face

### 1) Install Python libraries

In [None]:
!pip install --no-cache-dir transformers sentencepiece

Collecting transformers
  Downloading transformers-4.35.0-py3-none-any.whl (7.9 MB)
Collecting sentencepiece
  Downloading sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Collecting huggingface-hub<1.0,>=0.16.4 (from transformers)
  Downloading huggingface_hub-0.18.0-py3-none-any.whl (301 kB)
Collecting tokenizers<0.15,>=0.14 (from transformers)
  Downloading tokenizers-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.8 MB)
Collecting safet

### 2) Analysis function

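Here we define our sentiment analysis function. It uses the `cardiffnlp/twitter-xlm-roberta-base-sentiment` model, replaces user mentions and links with placeholders, and prints a probability for each sentiment label.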

In [None]:
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer
import numpy as np
from scipy.special import softmax

# Preprocess text (username and link placeholders)
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)

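MODEL = "cardiffnlp/twitter-xlm-roberta-base-sentiment"

# use_fast=False loads the slow, SentencePiece-based tokenizer (hence the sentencepiece install)
tokenizer = AutoTokenizer.from_pretrained(MODEL, use_fast=False)
config = AutoConfig.from_pretrained(MODEL)

model = AutoModelForSequenceClassification.from_pretrained(MODEL)

# Save local copies in a directory named after MODEL
tokenizer.save_pretrained(MODEL)
model.save_pretrained(MODEL)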

def run_sentiment_analysis(text):
    """Print each sentiment label with its probability, highest first."""
    text = preprocess(text)
    encoded_input = tokenizer(text, return_tensors='pt')
    output = model(**encoded_input)
    # output[0] holds the logits; take the first (and only) row of the batch
    scores = output[0][0].detach().numpy()
    scores = softmax(scores)

    # Rank label indices from highest to lowest score
    ranking = np.argsort(scores)[::-1]
    print(text)
    for i in range(scores.shape[0]):
        label = config.id2label[ranking[i]]
        score = scores[ranking[i]]
        print(f"{i+1}) {label} {np.round(float(score), 4)}")
    print('')

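
To see what the preprocessing step does on its own, here is a small sketch that repeats the `preprocess` function from the cell above and applies it to a made-up tweet. The sample text and the `example.com` link are illustrative assumptions, not data from this module.

```python
def preprocess(text):
    """Replace user mentions with '@user' and links with 'http'."""
    new_text = []
    for t in text.split(" "):
        # A bare "@" (length 1) is left alone; "@name" becomes "@user"
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        # Any token starting with "http" collapses to the placeholder "http"
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)


# Hypothetical sample tweet, for illustration only
sample = "@alice loved it, see https://example.com @ noon"
cleaned = preprocess(sample)
print(cleaned)  # prints: @user loved it, see http @ noon
```

Because the function splits on single spaces only, a mention followed directly by punctuation (for example `@alice,`) is still replaced as a whole token.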

### 3) Run sentiment analysis

In [None]:
run_sentiment_analysis("I am not sure.")
run_sentiment_analysis("There's a great day ahead of us.")
run_sentiment_analysis("Hmm, the stock market did not do very well last week.")

I am not sure.
1) neutral 0.7032
2) negative 0.2462
3) positive 0.0507

There's a great day ahead of us.
1) positive 0.9257
2) neutral 0.0617
3) negative 0.0126

Hmm, the stock market did not do very well last week.
1) negative 0.7485
2) neutral 0.2038
3) positive 0.0477
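
The scores printed above are softmax probabilities computed from the model's raw logits and then sorted from highest to lowest. The sketch below walks through that step in plain Python. The logit values and the `id2label` mapping are assumptions chosen for illustration; they are not actual outputs of the model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits and label mapping, for illustration only
id2label = {0: "negative", 1: "neutral", 2: "positive"}
logits = [0.5, 1.5, -1.0]

probs = softmax(logits)
# Indices sorted by probability, highest first (same idea as np.argsort(scores)[::-1])
ranking = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
for rank, idx in enumerate(ranking, start=1):
    print(f"{rank}) {id2label[idx]} {round(probs[idx], 4)}")
```

The three probabilities always sum to 1, so a top score near 0.7 (as in the first example above) means the remaining 0.3 is shared between the other two labels.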



## OpenAI GPT-3.5

In this section we build a sentiment analysis function on top of OpenAI's GPT-3.5 chat model. You'll need an OpenAI developer API key.
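
One way to supply the key without writing it into the notebook is to read it from an environment variable and fail early if it is missing. The helper name `get_openai_api_key` below is hypothetical; it is a minimal sketch, not part of the `openai` library.

```python
import os


def get_openai_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a clear message if the variable is unset or empty.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before creating the client."
        )
    return key


# Example usage (assumes the variable was exported before starting the notebook):
# api_key = get_openai_api_key()
```

With version 1.x of the `openai` package, `OpenAI()` reads `OPENAI_API_KEY` from the environment by default, so a check like this mainly serves to produce an earlier and clearer error message.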

### 1) Install Python libraries

In [None]:
!pip install openai

Collecting openai
  Downloading openai-1.1.1-py3-none-any.whl (217 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.25.1-py3-none-any.whl (75 kB)
Collecting httpcore (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.1-py3-none-any.whl (76 kB)
Collecting h11<0.15,>=0.13 (from httpcore->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
Installing collected packages: h11, httpcore, httpx, openai
ERROR: pip's dependency resolver does not currently

### 2) Analysis function

In [None]:
import os
from openai import OpenAI

# Replace the placeholder with your own key; OpenAI() reads OPENAI_API_KEY from the environment
os.environ['OPENAI_API_KEY'] = 'YOUR_OPENAI_DEVELOPER_API_KEY'
client = OpenAI()


def run_sentiment_analysis_gpt(text):
    """Ask GPT-3.5 to classify the text and print its one-word answer."""
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            # The system message restricts the reply to one of three labels
            {"role": "system", "content": "You are going to help me determine the sentiment of a piece of text. You will simply respond with 'positive', 'negative', or 'neutral'."},
            {"role": "user", "content": text}
        ]
    )
    print(text)
    print(completion.choices[0].message.content)
    print('')
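
The system prompt asks for a one-word answer, but a chat model may still return variations such as `Positive.` or a short sentence. The sketch below shows one possible way to map a reply onto the three expected labels. The names `VALID_LABELS` and `normalize_label`, and the `"unknown"` fallback, are hypothetical choices for illustration.

```python
VALID_LABELS = ("positive", "negative", "neutral")


def normalize_label(reply, default="unknown"):
    """Map a free-form model reply onto one of the three expected labels."""
    # Lowercase, then trim whitespace and common trailing punctuation or quotes
    cleaned = reply.strip().lower().strip(".!'\"")
    if cleaned in VALID_LABELS:
        return cleaned
    # Otherwise return the first label in VALID_LABELS that appears in the reply
    for label in VALID_LABELS:
        if label in cleaned:
            return label
    return default


print(normalize_label("Positive."))                # prints: positive
print(normalize_label("The sentiment is neutral"))  # prints: neutral
print(normalize_label("I cannot tell"))            # prints: unknown
```

The substring fallback is deliberately simple: a reply such as "not positive" would still be mapped to `positive`, so stricter parsing may be needed for anything beyond a quick experiment.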

### 3) Run sentiment analysis

In [None]:
run_sentiment_analysis_gpt("I am not sure.")
run_sentiment_analysis_gpt("There's a great day ahead of us.")
run_sentiment_analysis_gpt("Hmm, the stock market did not do very well last week.")

I am not sure.
neutral

There's a great day ahead of us.
positive

Hmm, the stock market did not do very well last week.
negative
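
As a quick check, the top label from the Hugging Face model and the GPT-3.5 answer can be compared for the three example sentences. The labels below are copied from the outputs shown in this module; the variable names are illustrative.

```python
# Top-ranked labels from the Hugging Face outputs shown earlier
huggingface_top = ["neutral", "positive", "negative"]
# Labels returned by GPT-3.5 in the outputs shown above
gpt_labels = ["neutral", "positive", "negative"]

# Count the sentences on which both approaches agree
matches = sum(h == g for h, g in zip(huggingface_top, gpt_labels))
agreement = matches / len(gpt_labels)
print(f"Agreement: {matches}/{len(gpt_labels)} ({agreement:.0%})")  # prints: Agreement: 3/3 (100%)
```

Three sentences are far too few to draw conclusions, and GPT-3.5 replies can vary between runs, so this only illustrates how such a comparison could be set up.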

