## Hugging Face
https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment

# 1. Install and Import Dependencies

In [2]:
# Install pytorch
!pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117

Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu117
Collecting torch
  Downloading https://download.pytorch.org/whl/cu117/torch-1.13.1%2Bcu117-cp39-cp39-win_amd64.whl (2255.6 MB)
Collecting torchvision
  Downloading https://download.pytorch.org/whl/cu117/torchvision-0.14.1%2Bcu117-cp39-cp39-win_amd64.whl (4.8 MB)
Collecting torchaudio
  Downloading https://download.pytorch.org/whl/cu117/torchaudio-0.13.1%2Bcu117-cp39-cp39-win_amd64.whl (2.3 MB)
Installing collected packages: torch, torchvision, torchaudio
Successfully installed torch-1.13.1+cu117 torchaudio-0.13.1+cu117 torchvision-0.14.1+cu117


In [3]:
!pip install transformers requests beautifulsoup4 pandas numpy

Collecting transformers
  Downloading transformers-4.26.1-py3-none-any.whl (6.3 MB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.2-cp39-cp39-win_amd64.whl (3.3 MB)
Collecting huggingface-hub<1.0,>=0.11.0
  Downloading huggingface_hub-0.12.1-py3-none-any.whl (190 kB)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.12.1 tokenizers-0.13.2 transformers-4.26.1


In [5]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import requests
from bs4 import BeautifulSoup
import re

# 2. Instantiate Model

In [8]:
tokenizer = AutoTokenizer.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')
model = AutoModelForSequenceClassification.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')



# 3. Encode and Calculate Sentiment

In [9]:
tokens = tokenizer.encode('It was not that bad', return_tensors = 'pt')

In [10]:
result = model(tokens)

In [11]:
result

SequenceClassifierOutput(loss=None, logits=tensor([[-1.4933, -0.1706,  1.5934,  0.8356, -0.6145]],
       grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)

In [12]:
result.logits

tensor([[-1.4933, -0.1706,  1.5934,  0.8356, -0.6145]],
       grad_fn=<AddmmBackward0>)

In [16]:
# Star rating on a 1-5 scale: argmax returns a 0-4 class index, so add 1
int(torch.argmax(result.logits))+1

3
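The argmax only reports the most likely rating. To get a rough sense of how confident the model is, the logits printed above can be passed through a softmax. The sketch below hard-codes those logits and uses plain Python instead of `torch`; the helper name `softmax` is an assumption of this example and not part of the notebook.

```python
import math

# Logits copied from the model output above (one value per star rating, 1-5).
logits = [-1.4933, -0.1706, 1.5934, 0.8356, -0.6145]

def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
# The index of the largest probability, plus 1, is the predicted star rating.
stars = probs.index(max(probs)) + 1

for rating, p in enumerate(probs, start=1):
    print(f"{rating} stars: {p:.3f}")
print("predicted rating:", stars)
```

Three stars gets the largest share and four stars comes second, which fits the lukewarm wording of 'It was not that bad'. Inside the notebook, `torch.softmax(result.logits, dim=-1)` should give the same probabilities directly.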

# 4. Collect Reviews

In [17]:
# Fetch the Yelp page and parse its HTML
r = requests.get('https://www.yelp.com/biz/kothai-republic-san-francisco')
soup = BeautifulSoup(r.text, 'html.parser')
# Review text sits in <p> tags whose class name contains "comment"
regex = re.compile('.*comment.*')
results = soup.find_all('p', {'class': regex})
reviews = [result.text for result in results]
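The pattern `.*comment.*` selects any class name that contains the word `comment`. According to the Beautiful Soup documentation, a compiled pattern passed as a filter is applied with `search()`, so the leading and trailing `.*` are optional. A small sketch of which class names the pattern accepts, using class strings invented for illustration (they are hypothetical, not taken from Yelp's markup):

```python
import re

# Same pattern as in the scraping cell above.
regex = re.compile('.*comment.*')

# Hypothetical class names, made up for this example only.
candidate_classes = ['comment__abc123', 'css-heading', 'user-comment', 'raw__text']

# Keep the class names the pattern accepts.
matched = [name for name in candidate_classes if regex.search(name)]
print(matched)
```

If the site changes its class names, the scrape silently returns an empty list, so it is worth checking `len(reviews)` before scoring.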

In [18]:
reviews

['I fall in love this amazing food. Atmospare was very good. Staff excellent.',
 'Hi Shrader. Thanks for coming by to visit us. We will continue to give our best effort everyday. Thanks',
 'Amazing new family-run restaurant! Super delicious flavors (we had the crispy pork, bibimbap, noodle soup and wish we had room for more), friendly service, lovely modern space, nice ambiance. Excellent portion sizes and prices.',
 'Hi Colin. Our team really has no other words but to say thank you. We are sincerely grateful that you had a great experience with us.',
 "Super solid spot! Loved it!Came on Sat evening around 6:30. Fairly busy but there wasn't a line out the door or anything. We had a reservation and were seated promptly. Interior is not too big, some tables are quite close to each other. But it was bright and cozy. Pretty classy dishware and decor in my opinion. Kitchen is somewhat visible from the dining room which was interesting!Service was really good! Waiters were attentive, kind, a

# 5. Load Reviews into DataFrame and Score

In [19]:
import numpy as np
import pandas as pd

In [20]:
df = pd.DataFrame(np.array(reviews), columns = ['review'])

In [30]:
df['review'].iloc[1]

'Hi Shrader. Thanks for coming by to visit us. We will continue to give our best effort everyday. Thanks'

In [31]:
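def sentiment_score(review):
    # Tokenize, cutting the input at the model's 512-token limit
    tokens = tokenizer.encode(review, return_tensors='pt', truncation=True, max_length=512)
    # Inference only, so skip gradient tracking
    with torch.no_grad():
        result = model(tokens)
    # argmax returns a 0-4 class index; add 1 for the 1-5 star rating
    return int(torch.argmax(result.logits)) + 1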

In [32]:
sentiment_score(df['review'].iloc[1])

5

In [33]:
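# x[:512] keeps the first 512 characters (not tokens), a crude guard against the model's 512-token limit
df['sentiment'] = df['review'].apply(lambda x: sentiment_score(x[:512]))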

In [34]:
df

|   | review | sentiment |
|---|---|---|
| 0 | I fall in love this amazing food. Atmospare wa... | 5 |
| 1 | Hi Shrader. Thanks for coming by to visit us. ... | 5 |
| 2 | Amazing new family-run restaurant! Super delic... | 5 |
| 3 | Hi Colin. Our team really has no other words b... | 5 |
| 4 | Super solid spot! Loved it!Came on Sat evening... | 4 |
| 5 | Hello Stephanie! Thank you for coming in and a... | 5 |
| 6 | Fantastic addition to the Inner Sunset neighbo... | 5 |
| 7 | Hello Kyle. Thank you for your gracious words ... | 5 |
| 8 | I gave 3 stars for food 5 stars for service Ki... | 3 |
| 9 | A wonderful fusion between Thai and Korean tha... | 5 |
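
To summarise the scores, a quick tally can help. The sketch below hard-codes the ten sentiment values from the output above and uses only the standard library; inside the notebook, `df['sentiment'].value_counts()` and `df['sentiment'].mean()` should give the same figures.

```python
from collections import Counter
from statistics import mean

# Sentiment scores copied from the DataFrame output above (rows 0-9).
scores = [5, 5, 5, 5, 4, 5, 5, 5, 3, 5]

distribution = Counter(scores)  # how many rows received each rating
average = mean(scores)          # mean star rating across the ten rows

for rating, count in sorted(distribution.items()):
    print(f"{rating} stars: {count} rows")
print("average:", average)
```

Rows 1, 3, 5 and 7 read like replies from the business owner rather than customer reviews, so they probably inflate the average; filtering them out before scoring may give a fairer picture.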
