# Sentiment Analysis

## Step 1 : Install and import dependencies
---
    Install torch (get the install command from the PyTorch website)
    Install transformers
    Install NumPy
    Install pandas

In [1]:
!pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/, https://download.pytorch.org/whl/cu113


In [44]:
!pip install transformers numpy pandas

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [45]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import pickle

## Step 2 : Instantiate model
---
### Tokenizer
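    A tokenizer converts raw text into the numeric token IDs that the model expects as input

### AutoModelFor
    The AutoModelFor classes load a pretrained model together with the head for a given task; AutoModelForSequenceClassification is used here because this checkpoint classifies text into 1 to 5 stars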



In [27]:
# Instantiate AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")

# Instantiate AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")

## Step 3 : Encode and Calculate Sentiment
---
    Create a function that takes an input string/text and returns its sentiment score

In [37]:
def sentiment_analysis(input_string):
  # encode the input string, truncating to the model's 512-token limit
  # (slicing with [:512] would cut characters, not tokens)
  tokens = tokenizer.encode(input_string, truncation=True, max_length=512, return_tensors="pt")

  # output the token IDs
  print(f"Tokenized text : {tokens}\n")

  # output the decoded text
  print(f"Decoded tokenized text : {tokenizer.decode(tokens[0])}\n")

  # run the model to get one logit per star rating (1 to 5)
  result = model(tokens)

  # output the logits
  print(f"Score : {result.logits}\n")

  # the index of the largest logit (0 to 4), plus 1, is the star rating
  ans = int(torch.argmax(result.logits)) + 1
  print(f"Final Score : {ans} stars (on a scale of 5)")

In [41]:
def sentiment_score(input_string):
  # encode the input string, truncating to the model's 512-token limit
  tokens = tokenizer.encode(input_string, truncation=True, max_length=512, return_tensors="pt")

  # run the model without tracking gradients, since this is inference only
  with torch.no_grad():
    result = model(tokens)

  # the index of the largest logit (0 to 4), plus 1, is the star rating
  ans = int(torch.argmax(result.logits)) + 1
  return ans
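
The model accepts at most 512 tokens per input, and that limit applies to the token ID sequence, not to the characters of the string. The sketch below illustrates the idea on a plain Python list. `truncate_ids` is a hypothetical helper written for this note, not part of transformers, and the IDs 101 and 102 are assumed to be the `[CLS]` and `[SEP]` markers because they open and close every tokenized output in this notebook.

```python
# Assumption: 101 and 102 are the [CLS] and [SEP] IDs, as seen in the
# tokenized outputs of this notebook.
CLS_ID, SEP_ID = 101, 102


def truncate_ids(token_ids, max_length):
    """Cut an encoded sequence to max_length tokens, keeping [CLS] and [SEP].

    Hypothetical helper for illustration only; it is not a transformers API.
    """
    if max_length < 2:
        raise ValueError("max_length must leave room for [CLS] and [SEP]")
    if len(token_ids) <= max_length:
        return list(token_ids)
    # Keep [CLS] plus the first content tokens, then close with [SEP]
    return list(token_ids[:max_length - 1]) + [SEP_ID]


# Token IDs for "I love you, I can't live without you." from this notebook
ids = [101, 151, 11157, 10855, 117, 151, 10743, 112, 162, 11343,
       13208, 10855, 119, 102]

short = truncate_ids(ids, 8)
print(len(short), short)
```

The sentence has 37 characters but 14 tokens, so a character count is not a reliable stand-in for a token count.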

In [38]:
sentiment_analysis("I love you, I can't live without you.")

Tokenized text : tensor([[  101,   151, 11157, 10855,   117,   151, 10743,   112,   162, 11343,
         13208, 10855,   119,   102]])

Decoded tokenized text : [CLS] i love you, i can't live without you. [SEP]

Score : tensor([[-1.8182, -2.1729, -1.1707,  0.8624,  3.3813]],
       grad_fn=<AddmmBackward0>)

Final Score : 5 stars (on a scale of 5)


In [40]:
sentiment_analysis("I hope we'll never meet again.")

Tokenized text : tensor([[  101,   151, 18763, 11312,   112, 17361, 13362, 19508, 12590,   119,
           102]])

Decoded tokenized text : [CLS] i hope we'll never meet again. [SEP]

Score : tensor([[-0.2748,  0.1177,  0.4230, -0.0332, -0.2835]],
       grad_fn=<AddmmBackward0>)

Final Score : 3 stars (on a scale of 5)


In [39]:
sentiment_analysis("Maybe you should have died that day.")

Tokenized text : tensor([[  101, 69557, 10855, 14693, 10574, 12677, 10203, 11111,   119,   102]])

Decoded tokenized text : [CLS] maybe you should have died that day. [SEP]

Score : tensor([[ 0.8251,  0.5463,  0.4002, -0.6007, -1.0558]],
       grad_fn=<AddmmBackward0>)

Final Score : 1 stars (on a scale of 5)
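
The star rating only reports which logit is largest, not how far ahead it is. A minimal sketch of one way to see that margin: pass the logits through a softmax to get probabilities. It uses plain Python and the logits copied from the first two outputs above; the `softmax` helper and the `examples` dictionary are written for this illustration and are not part of the notebook's code.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied from the outputs above
examples = {
    "I love you, I can't live without you.": [-1.8182, -2.1729, -1.1707, 0.8624, 3.3813],
    "I hope we'll never meet again.": [-0.2748, 0.1177, 0.4230, -0.0332, -0.2835],
}

for text, logits in examples.items():
    probs = softmax(logits)
    stars = probs.index(max(probs)) + 1  # same rule as argmax(logits) + 1
    print(f"{text!r}: {stars} stars, top probability {max(probs):.2f}")
```

The first sentence gets 5 stars with a top probability of about 0.91, while the second gets 3 stars with only about 0.30, so the "3 stars" there is closer to "the model is unsure" than to a confident neutral rating.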


In [42]:
sentiment_score("I love you, I can't live without you.")

5

## Step 4 : Save the function for later use
---
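    Use pickle to save the sentiment_score function. pickle stores a function by reference (its module and name), not its code, the tokenizer, or the model, so all three must already be defined wherever the file is loaded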

In [46]:
with open("sentiment_score.pkl", "wb") as f:
  pickle.dump(sentiment_score, f)
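
A short sketch of what pickle actually writes for a function. Here `json.dumps` from the standard library stands in for `sentiment_score` (an assumption made only so the snippet runs without the model); both are ordinary module-level functions, which pickle handles the same way.

```python
import json
import pickle

# json.dumps stands in for sentiment_score: both are module-level functions
payload = pickle.dumps(json.dumps)

# The payload is tiny: it holds a module name and a function name, no code
print(len(payload), payload)

# Loading imports the module and looks the name up again
restored = pickle.loads(payload)
print(restored is json.dumps)
```

Because only the names are stored, loading `sentiment_score.pkl` in a fresh session fails unless `sentiment_score`, `tokenizer`, and `model` have been defined there first.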