## Install and Import Dependencies

In [1]:
!pip install transformers requests beautifulsoup4 pandas numpy

Collecting transformers
  Downloading transformers-4.21.1-py3-none-any.whl (4.7 MB)
Collecting filelock
  Downloading filelock-3.8.0-py3-none-any.whl (10 kB)
Collecting tokenizers!=0.11.3,<0.13,>=0.11.1
  Downloading tokenizers-0.12.1-cp39-cp39-macosx_10_11_x86_64.whl (3.6 MB)
Collecting huggingface-hub<1.0,>=0.1.0
  Downloading huggingface_hub-0.8.1-py3-none-any.whl (101 kB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp39-cp39-macosx_10_9_x86_64.whl (197 kB)
Installing collected packages: pyyaml, filelock, tokenizers, huggingface-hub, transformers
Successfully installed filelock-3.8.0 huggingface-hub-0.8.1 pyyaml-6.0 tokenizers-0.12.1 transformers-4.21.1


In [3]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import requests
from bs4 import BeautifulSoup
import re

## Instantiate Model

In [4]:
tokenizer = AutoTokenizer.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')

model = AutoModelForSequenceClassification.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')

Downloading tokenizer_config.json: 100%|██████████| 39.0/39.0 [00:00<00:00, 10.0kB/s]
Downloading config.json: 100%|██████████| 953/953 [00:00<00:00, 248kB/s]
Downloading vocab.txt: 100%|██████████| 851k/851k [00:00<00:00, 1.35MB/s]
Downloading special_tokens_map.json: 100%|██████████| 112/112 [00:00<00:00, 37.5kB/s]
Downloading pytorch_model.bin: 100%|██████████| 638M/638M [00:15<00:00, 44.6MB/s] 


## Encode and Calculate Sentiment


In [16]:
tokens = tokenizer.encode("Love it. Amazing", return_tensors='pt')

In [6]:
tokenizer.decode(tokens[0])

'[CLS] love it. amazing [SEP]'

In [17]:
# The model returns one logit per star rating (1 to 5); argmax is zero-based, so add 1
result = model(tokens)
int(torch.argmax(result.logits)) + 1

5
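
The `+1` in the cell above converts the zero-based index of the largest logit into a star rating from 1 to 5. A minimal, dependency-free sketch of that mapping follows. The logit values are made up for illustration and are not real model output, and the helper names `softmax` and `star_rating` are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def star_rating(logits):
    """Return the 1-based index of the largest logit (index 0 means 1 star)."""
    return max(range(len(logits)), key=lambda i: logits[i]) + 1

# Hypothetical logits for a strongly positive review (illustrative only).
example_logits = [-2.1, -1.8, -0.4, 1.3, 2.9]

probs = softmax(example_logits)
stars = star_rating(example_logits)
print(stars)
```

Looking at `probs` alongside `stars` can help judge how decisive a prediction is, since argmax alone hides whether the top two classes were close.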

## Collect Reviews

In [30]:
# Fetch the business page and parse the HTML
r = requests.get('https://www.yelp.com/biz/mejico-sydney-2')
soup = BeautifulSoup(r.text, 'html.parser')
# Review text sits in <p> tags whose class name contains "comment"
regex = re.compile('.*comment.*')
results = soup.find_all('p', {'class': regex})
reviews = [result.text for result in results]
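
The pattern `.*comment.*` is passed to `find_all` as a class filter, so a `<p>` tag is kept when its class name contains `comment`. The sketch below applies the same pattern to a few class names using only the standard library. The class names are hypothetical and are not taken from Yelp's markup.

```python
import re

regex = re.compile('.*comment.*')

# Hypothetical class names, for illustration only.
class_names = ['comment__abc123', 'css-xyz789', 'user-comment', 'photo-caption']

# Keep only the names that contain "comment" somewhere in the string.
matching = [name for name in class_names if regex.match(name)]
print(matching)
```

Because the filter depends on the site's current class names, an empty `reviews` list is a sign that the page markup has changed or the request was blocked.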

In [31]:
reviews

['The food is fresh and tasty. \xa0The scallop ceviche started the lunch. The scallops were tender with a great acidity and use of mango and peppers. The steak was tender and I got the hint of tequila in the sauce. I enjoyed a watermelon salad that complimented the the steak. The portions are good, but a stretch if you are sharing. My only down point is the service. They really only showed up to present my next plate and never checked to see if I wanted another drink (which I did).Enjoyed the food.',
 "Don't come here expecting legit Mexican food but a modern twist on some staples. Loud party area, fun drinks and friendly staff make this a hip meeting area for large groups. Drinks were better than the food. They stuff the families toward the back but lack any amenities (no changing table) except a high chair. Service started off friendly but it took a while to get someone to take our order and then they forgot our dish which came out cold when we asked for it. Then we had to flag someo

## Load Reviews into Data Frame and Score

In [32]:
import pandas as pd
import numpy as np

In [33]:
df = pd.DataFrame(np.array(reviews),columns=['review'])

In [34]:
df.tail()

|   | review |
|---|--------|
| 5 | Have been here twice and have absolutely loved... |
| 6 | Really nice (upmarket) Mexican restaurant. Goo... |
| 7 | If you're looking for a quiet little romantic ... |
| 8 | The service at this place was top notch - the ... |
| 9 | Ordered feed me for $59 along with that.. Food... |


In [35]:
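def sentiment_score(review):
    """Return the predicted star rating (1 to 5) for a review string."""
    tokens = tokenizer.encode(review, return_tensors='pt')
    result = model(tokens)
    # argmax gives a zero-based class index, so add 1 to get stars
    return int(torch.argmax(result.logits)) + 1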

In [37]:
sentiment_score(df['review'].iloc[1])

3

In [38]:
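# BERT accepts at most 512 tokens; slicing to 512 characters is a rough guard that keeps inputs short
df['sentiment'] = df['review'].apply(lambda x: sentiment_score(x[:512]))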
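
Slicing with `x[:512]` caps each review at 512 characters, whereas the model's limit is counted in tokens. One alternative, sketched here as an assumption rather than as the notebook's method, is to let the tokenizer truncate by passing `truncation=True` and `max_length=512` to `encode`. The helper name `sentiment_score_truncated` and its parameters are hypothetical.

```python
def sentiment_score_truncated(review, tokenizer, model, max_length=512):
    """Score a review from 1 to 5, truncating by tokens rather than characters.

    `tokenizer` and `model` are expected to be the Hugging Face objects
    created earlier in the notebook; they are passed in explicitly so the
    function does not rely on globals.
    """
    tokens = tokenizer.encode(
        review,
        return_tensors='pt',
        truncation=True,        # drop tokens beyond max_length
        max_length=max_length,  # BERT accepts at most 512 tokens
    )
    result = model(tokens)
    # Index of the largest logit is zero-based, so add 1 to get stars.
    return int(result.logits.argmax()) + 1

# Hypothetical usage with the notebook's objects:
# df['sentiment'] = df['review'].apply(
#     lambda x: sentiment_score_truncated(x, tokenizer, model))
```

Truncating at the token level keeps as much of each review as the model can accept, at the cost of a slightly longer call signature.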

In [41]:
df['review'].iloc[3]

'We came here on a Thursday night @ 5pm and by 6pm the place was packed. A lovely big restaurant with a bar at the front (which is a bit awkward to try and push past everyone to get to your table). Friendly, helpful staff which is always a good start. The menu is large so we went with the "feed me" selection. All you need to do is sit back and let the chef feed you. As the other reviewers have stated the corn is a highlight and the pulled pork tacos, the sangria wasn\'t bad either.Loved the Mexican tapas style food and will be back.'