<a href="https://colab.research.google.com/github/ankitpakhale/NLPSentimentAnalysis/blob/master/NLPSentimentAnalysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Import Dependencies

In [15]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch  
import requests
from bs4 import BeautifulSoup
import re
import pandas as pd
import numpy as np

### Creating Main Dictionary

In [32]:
# mainDict = {
#     1:"Poor",
#     2:"Unsatisfactory",
#     3:"Satisfactory",
#     4:"Very Satisfactory",
#     5:"Outstanding"
# }

mainDict = {
    1:"😞",
    2:"😏",
    3:"🙂",
    4:"😎",
    5:"🤩"
}

### Instantiate Model

In [17]:
tokenizer = AutoTokenizer.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')

model = AutoModelForSequenceClassification.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')


### Encode and Calculate Sentiment

In [18]:
# text = "I hated this, absolutely the worst"
# text = "It is amazing, I loved it. Great!"
text = "It's a good company it's fabrics are genuine more comfort It's some what costly but the value for money will be there"
tokens = tokenizer.encode(text, return_tensors="pt")

In [19]:
encodedTokens = tokens[0]
encodedTokens

tensor([  101, 10197,   112,   161,   143, 12050, 11062, 10197,   112,   161,
        95431, 10107, 10320, 14242, 64934, 10772, 66493, 10197,   112,   161,
        10970, 11523, 18153, 10563, 10502, 10103, 18267, 10139, 15033, 11229,
        10346, 10768,   102])

In [20]:
decodedTokens = tokenizer.decode(tokens[0])
decodedTokens

"[CLS] it's a good company it's fabrics are genuine more comfort it's some what costly but the value for money will be there [SEP]"

In [21]:
result = model(tokens)
result

SequenceClassifierOutput(loss=None, logits=tensor([[-3.0541, -1.3997,  1.6624,  2.1341,  0.4352]],
       grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)

In [22]:
result.logits

tensor([[-3.0541, -1.3997,  1.6624,  2.1341,  0.4352]],
       grad_fn=<AddmmBackward0>)

In [23]:
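# argmax returns a 0-based class index, so add 1 to get the 1-5 star rating.
# The output below appears to have been recorded with the text-label mainDict
# (commented out above); with the emoji mainDict the same lookup returns "😎".
mainDict[int(torch.argmax(result.logits))+1]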

'Very Satisfactory'
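
The logits shown above are unnormalized scores, one per star class. As a minimal sketch (not part of the original notebook), a softmax turns them into probabilities, and the index of the largest value plus one gives the star rating. The snippet works in plain Python on the logits printed above rather than with `torch`, and the helper name `softmax` is only illustrative.

```python
import math

# Logits copied from the notebook output above (classes 1 to 5 stars).
logits = [-3.0541, -1.3997, 1.6624, 2.1341, 0.4352]

def softmax(values):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
# The class index is 0-based, so add 1 to get the star rating.
rating = probs.index(max(probs)) + 1
print(rating)  # 4
print([round(p, 3) for p in probs])
```

Softmax preserves the ordering of the logits, so taking the argmax of the raw logits, as the notebook does, selects the same class.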

### Collecting Reviews

In [24]:
# Fetch the Yelp business page and parse its HTML.
r = requests.get('https://www.yelp.com/biz/the-little-chihuahua-san-francisco?osq=Mexican')
soup = BeautifulSoup(r.text, 'html.parser')
# Review text sits in <p> tags whose class name contains "comment".
regex = re.compile('.*comment.*')
results = soup.find_all('p', {'class': regex})
reviews = [result.text for result in results]
reviews

["If I could give this place more than 5 stars I would! I have tried their amazingly spiced soyrizo, salmon, plantain and bean tacos, burrito bowls, burritos and everything is delicious! This review is valid for fresh food - I'm not a fan of takeout, a lot of the flavor is somehow compromised in takeouts from this place.Food:- My special favorite is the soyrizo - embracing a vegan option as a Mexican food joint already makes TLC rare, and then spicing up the soyrizo beautifully just makes TLC the unbeatable Mexican food choice for me.Unique selling points:- they have delicious vegan options- they use organic ingredients- have a wonderful salsa barAnother great thing about TLC is that it is situated next to an amazing shaved ice cream place called Powder! These are our go-to dinner/dessert spots!So lucky to be situated so close to TLC!",
 'Steep prices! Would eat here more often if the prices were even just a dollar cheaper than they currently are because the food is pretty good. I reco
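
To make the class filter concrete, here is a small sketch of which class names the pattern `.*comment.*` would accept. The class names in the list are hypothetical examples, not taken from Yelp's actual markup, which may change over time.

```python
import re

regex = re.compile('.*comment.*')

# Hypothetical class names, for illustration only.
class_names = [
    "comment__09f24__D0cxf",
    "raw__09f24__T4Ezm",
    "user-comment",
    "css-1p9ibgf",
]

# Keep only the names that contain the substring "comment".
matching = [name for name in class_names if regex.search(name)]
print(matching)  # ['comment__09f24__D0cxf', 'user-comment']
```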

### Load reviews into DataFrame and Score

In [25]:
df = pd.DataFrame(np.array(reviews), columns=['reviews'])
len(df)

11

In [26]:
df['reviews'].iloc[0]

"If I could give this place more than 5 stars I would! I have tried their amazingly spiced soyrizo, salmon, plantain and bean tacos, burrito bowls, burritos and everything is delicious! This review is valid for fresh food - I'm not a fan of takeout, a lot of the flavor is somehow compromised in takeouts from this place.Food:- My special favorite is the soyrizo - embracing a vegan option as a Mexican food joint already makes TLC rare, and then spicing up the soyrizo beautifully just makes TLC the unbeatable Mexican food choice for me.Unique selling points:- they have delicious vegan options- they use organic ingredients- have a wonderful salsa barAnother great thing about TLC is that it is situated next to an amazing shaved ice cream place called Powder! These are our go-to dinner/dessert spots!So lucky to be situated so close to TLC!"

In [27]:
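def sentiment_score(review):
    # Tokenize the review, run the model, and convert the
    # 0-based argmax class index into a 1-5 star rating.
    tokens = tokenizer.encode(review, return_tensors="pt")
    result = model(tokens)
    return int(torch.argmax(result.logits)) + 1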

In [28]:
for i in df['reviews']:
  print(sentiment_score(i))

4
4
3
2
2
3
4
5
2
4
4
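
As a quick sanity check on the scores printed above, the sketch below tallies how often each rating occurs and computes the mean. The list is copied from the output above; the variable names are illustrative.

```python
from collections import Counter

# Ratings copied from the printed output above, in order.
scores = [4, 4, 3, 2, 2, 3, 4, 5, 2, 4, 4]

# Count how many reviews received each star rating.
counts = Counter(scores)
for rating in sorted(counts):
    print(rating, counts[rating])

mean_rating = sum(scores) / len(scores)
print(round(mean_rating, 2))  # 3.36
```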


#### The line below keeps only the first 512 characters of each review, because the model accepts a limited number of tokens per input (512 for this BERT-based model). Character slicing is only a rough proxy for that token limit.

In [34]:
# Score each review once, then map the numeric rating to its emoji label.
df['rating'] = [sentiment_score(individual_review[:512]) for individual_review in df['reviews']]
df['sentiment'] = df['rating'].map(mainDict)
df

|   | reviews | sentiment | rating |
|---|---------|-----------|--------|
| 0 | If I could give this place more than 5 stars I... | 😎 | 4 |
| 1 | Steep prices! Would eat here more often if the... | 😎 | 4 |
| 2 | The place does have good flavor, unfortunately... | 🙂 | 3 |
| 3 | I didn't hate the salad as I'm a sucker for co... | 😏 | 2 |
| 4 | Hi Connie. Thanks for your feedback. The plant... | 😏 | 2 |
| 5 | Had lunch catered by employer. 3 locations and... | 🙂 | 3 |
| 6 | Long-time fans of this place. My kids LOVE it.... | 😎 | 4 |
| 7 | Delicious Vegan options available. Loved the b... | 🤩 | 5 |
| 8 | Just had my work lunch here, like literally 30... | 😏 | 2 |
| 9 | Cool spot for late night eats! I got the Fried... | 😎 | 4 |
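
Slicing to 512 characters is only a rough proxy, since the model's limit is counted in tokens. A possible alternative, sketched below, is to let the tokenizer cut the input itself. It assumes a Hugging Face tokenizer whose `encode` accepts `truncation` and `max_length`. The function name `sentiment_score_truncated` and its parameters are illustrative and not part of the original notebook.

```python
def sentiment_score_truncated(review, tokenizer, model, max_tokens=512):
    """Score a review, truncating at the token level instead of by characters.

    `tokenizer` and `model` are passed in explicitly (an assumption made
    here so the function does not rely on notebook globals).
    """
    # Ask the tokenizer to drop anything beyond max_tokens tokens.
    tokens = tokenizer.encode(
        review,
        return_tensors="pt",
        truncation=True,
        max_length=max_tokens,
    )
    result = model(tokens)
    # The class index is 0-based, so add 1 to get the 1-5 star rating.
    return int(result.logits.argmax()) + 1
```

With the notebook's objects this could be called as `sentiment_score_truncated(review, tokenizer, model)`, which would make the `[:512]` slice unnecessary.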

