### BERT model using the transformers package
#### Project plan:
1. Install transformers.
2. Perform sentiment scoring using BERT.
3. Scrape reviews from Yelp and score them.

#### Workflow:
1. Download and install BERT from Hugging Face Transformers.
* Use a pre-trained BERT model from Hugging Face Transformers.

2. Run sentiment analysis on the reviews.
* The model rates each review on a scale from 1 to 5.

3. Scrape reviews from Yelp and score them.
* The results are stored in a pandas DataFrame.

### Practice:
#### 1. Install packages and import dependencies

In [1]:
!pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117

Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu117
Collecting torch
  Downloading https://download.pytorch.org/whl/cu117/torch-1.13.0%2Bcu117-cp38-cp38-win_amd64.whl (2258.2 MB)
Collecting torchvision
  Downloading https://download.pytorch.org/whl/cu117/torchvision-0.14.0%2Bcu117-cp38-cp38-win_amd64.whl (4.8 MB)
Collecting torchaudio
  Downloading https://download.pytorch.org/whl/cu117/torchaudio-0.13.0%2Bcu117-cp38-cp38-win_amd64.whl (2.3 MB)
Installing collected packages: torch, torchvision, torchaudio
Successfully installed torch-1.13.0+cu117 torchaudio-0.13.0+cu117 torchvision-0.14.0+cu117


In [2]:
!pip install transformers requests beautifulsoup4 pandas numpy

Collecting transformers

ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.

We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.

huggingface-hub 0.11.0 requires packaging>=20.9, but you'll have packaging 20.4 which is incompatible.



  Downloading transformers-4.24.0-py3-none-any.whl (5.5 MB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.2-cp38-cp38-win_amd64.whl (3.3 MB)
Collecting huggingface-hub<1.0,>=0.10.0
  Downloading huggingface_hub-0.11.0-py3-none-any.whl (182 kB)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.11.0 tokenizers-0.13.2 transformers-4.24.0
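
The pip output above reports that huggingface-hub 0.11.0 requires packaging>=20.9 while packaging 20.4 is installed. One way to see which versions actually ended up in the environment is to query the package metadata. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` and the list of packages checked are illustrative choices, not part of the notebook.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages mentioned in the install cells and in the pip conflict warning.
for name in ["torch", "transformers", "huggingface-hub", "tokenizers", "packaging"]:
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If `packaging` still reports a version below 20.9, upgrading it with `pip install --upgrade packaging` would be the usual remedy.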


#### 2. Download and instantiate the model

In [4]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import requests
from bs4 import BeautifulSoup
import re

#### 3. Encode and calculate sentiment

In [6]:
# Load the pre-trained tokenizer and sentiment model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')

model = AutoModelForSequenceClassification.from_pretrained('nlptown/bert-base-multilingual-uncased-sentiment')





In [27]:
# Encode a sample review into token IDs, returned as a PyTorch tensor
tokens = tokenizer.encode('To be fair, it’s at least a fun addition to the MCU. The special clocks in at less than an hour and is basically a feel-good Christmas story with some sci-fi shenanigans thrown in.', return_tensors='pt')

Note: the execution counts are out of order. The outputs shown for In [22] and In [23] were produced by an earlier run on a different sentence, which is why the decoded text does not match the string encoded in In [27].

In [22]:
tokens

tensor([[  101, 12818, 10103, 10889, 86606, 19803, 10108, 29642, 83643,   100,
           162, 37498, 10114, 10346, 29559, 28415,   143, 11975, 10151, 47338,
         10661, 53092, 10103, 13623, 24590, 17029, 10871, 10108,   143, 10564,
         71488, 10146,   143, 24731,   119,   102]])

In [23]:
tokenizer.decode(tokens[0])

'[CLS] even the most hardened of hearts couldn [UNK] t fail to be softened a little by drax beating the living crap out of a man dressed as a robot. [SEP]'
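
The decoded text contains `[UNK]` where the input had a typographic apostrophe (`’`), which suggests that this character is not in the model's vocabulary. The scraped reviews further down also contain non-breaking spaces (`\xa0`). A minimal sketch of a pre-processing step that maps such characters to plain ASCII before tokenising follows; the function name `normalize_text` and the character table are assumptions, and whether this changes the predicted rating has not been checked here.

```python
# Map typographic punctuation and non-breaking spaces to ASCII equivalents.
_REPLACEMENTS = str.maketrans({
    "\u2019": "'",   # right single quotation mark (apostrophe)
    "\u2018": "'",   # left single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u00a0": " ",   # non-breaking space
})


def normalize_text(text):
    """Return text with the characters above replaced by ASCII equivalents."""
    return text.translate(_REPLACEMENTS)


print(normalize_text("To be fair, it\u2019s at least a fun addition to the MCU."))
```

The result could then be passed to `tokenizer.encode` in place of the raw string.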

In [28]:
result = model(tokens)

In [25]:
result

SequenceClassifierOutput(loss=None, logits=tensor([[-0.0480, -0.0648, -0.4371, -0.0834,  0.2918]],
       grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)

In [29]:
# Index of the highest logit (0-4) plus 1 gives the star rating (1-5)
int(torch.argmax(result.logits))+1

3
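
`torch.argmax` picks the index of the largest logit, and adding 1 maps the model's five classes onto a 1-to-5 star rating. The following sketch redoes that step in plain Python on the logits printed under In [25] and also converts them to probabilities with a softmax, which gives a rough sense of how confident the prediction is. The names `softmax` and `stars` are illustrative, not part of the notebook. These particular logits peak at the last class, so the sketch yields 5; the 3 shown under In [29] presumably belongs to the newer input from In [27], whose logits are not printed.

```python
import math

# Logits copied from the output displayed under In [25] (one row, five classes).
logits = [-0.0480, -0.0648, -0.4371, -0.0834, 0.2918]


def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(logits)
# Index of the largest logit, shifted by 1 to get a 1-5 star rating.
stars = max(range(len(logits)), key=logits.__getitem__) + 1

print(stars)  # 5
print([round(p, 3) for p in probs])
```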

#### 4. Collect reviews from Yelp

In [77]:
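# Download the Yelp page for the restaurant and parse its HTML
r = requests.get('https://www.yelp.com/biz/mejico-sydney-2')
soup = BeautifulSoup(r.text, 'html.parser')
# Match any class name that contains "comment"
regex = re.compile('.*comment.*')
# Find the <p> tags whose class matches the pattern
results = soup.find_all('p', {'class':regex})
# Keep only the text of each matched tag
reviews = [result.text for result in results]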
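
BeautifulSoup applies a compiled regular expression to attribute values with `search()`, so the pattern matches any `<p>` whose class contains `comment`; the surrounding `.*` is not required. The sketch below demonstrates the same `find_all` filter on a small inline HTML snippet instead of the live page, so it runs offline. The HTML and its class names are invented for illustration and are not Yelp's actual markup.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page.
html = """
<div>
  <p class="comment__abc123">Great tacos.</p>
  <p class="other-text">Not a review.</p>
  <p class="user-comment-body">Service was slow.</p>
  <span class="comment__abc123">Wrong tag.</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
# search() semantics: "comment" matches anywhere inside the class name.
pattern = re.compile("comment")
# Only <p> tags are considered, so the <span> is ignored.
reviews = [p.get_text(strip=True) for p in soup.find_all("p", class_=pattern)]

print(reviews)  # ['Great tacos.', 'Service was slow.']
```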

In [80]:
reviews

['Great atmosphere, attentive service, solid margs, and a Tasty menu. The Brisket Tacos were substantial and delicious. The corn ribs??? \xa0Fawgetaboutit! \xa0Unreal. \xa0Wanted to order another plate.',
 "Don't come here expecting legit Mexican food but a modern twist on some staples. Loud party area, fun drinks and friendly staff make this a hip meeting area for large groups. Drinks were better than the food. They stuff the families toward the back but lack any amenities (no changing table) except a high chair. Service started off friendly but it took a while to get someone to take our order and then they forgot our dish which came out cold when we asked for it. Then we had to flag someone down to pay the bill. The watermelon salad was tasty but not complex, tossed with a few cucumbers and pistachios. The corn lollipops with spicy mayo sauce were probably the best dish. The beef empanadas were cold and average though the salsa was an interesting pickled onion. Definitely skip the oc

#### 5. Load the reviews into a DataFrame and calculate sentiment

In [81]:
import numpy as np
import pandas as pd

In [82]:
df = pd.DataFrame(np.array(reviews), columns=['review'])

In [84]:
df['review'].iloc[0]

'Great atmosphere, attentive service, solid margs, and a Tasty menu. The Brisket Tacos were substantial and delicious. The corn ribs??? \xa0Fawgetaboutit! \xa0Unreal. \xa0Wanted to order another plate.'

In [85]:
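def sentiment_score(review):
    """Return the predicted star rating (1-5) for a single review string."""
    # Encode the text into token IDs as a PyTorch tensor
    tokens = tokenizer.encode(review, return_tensors='pt')
    # Run the classifier; result.logits holds one score per rating class
    result = model(tokens)
    # Index of the highest logit (0-4) plus 1 gives the rating (1-5)
    return int(torch.argmax(result.logits))+1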

In [86]:
sentiment_score(df['review'].iloc[0])

3

In [87]:
# Score every review, using only its first 512 characters
df['sentiment'] = df['review'].apply(lambda x: sentiment_score(x[:512]))
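
The slice `x[:512]` cuts each review to 512 characters, whereas BERT's limit is 512 tokens; a token limit can be enforced through the tokenizer itself. The following is a hedged sketch, assuming the `tokenizer` and `model` objects loaded earlier are passed in. `truncation=True` and `max_length` are standard arguments of the Transformers tokenizer, while the function name `sentiment_score_truncated` and its `max_tokens` parameter are hypothetical.

```python
def sentiment_score_truncated(review, tokenizer, model, max_tokens=512):
    """Score a review from 1 to 5, truncating at the token level.

    tokenizer and model are expected to be the objects returned by
    AutoTokenizer.from_pretrained and
    AutoModelForSequenceClassification.from_pretrained.
    """
    # Let the tokenizer cut the sequence to at most max_tokens tokens
    # (the special [CLS] and [SEP] tokens count towards the limit).
    tokens = tokenizer.encode(
        review,
        return_tensors="pt",
        truncation=True,
        max_length=max_tokens,
    )
    result = model(tokens)
    # Index of the largest logit (0-4) plus 1 gives the star rating (1-5).
    return int(result.logits.argmax()) + 1
```

In the notebook this could replace the character slice: `df['sentiment'] = df['review'].apply(lambda x: sentiment_score_truncated(x, tokenizer, model))`.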

In [88]:
df

| | review | sentiment |
|---|---|---|
| 0 | Great atmosphere, attentive service, solid mar... | 3 |
| 1 | Don't come here expecting legit Mexican food b... | 3 |
| 2 | Out of all the restaurants that I tried in Syd... | 5 |
| 3 | The food is fresh and tasty. The scallop cevi... | 4 |
| 4 | We came here on a Thursday night @ 5pm and by ... | 4 |
| 5 | Have been here twice and have absolutely loved... | 5 |
| 6 | I was pleasantly surprised at what a great job... | 5 |
| 7 | If you're looking for a quiet little romantic ... | 2 |
| 8 | The service at this place was top notch - the ... | 5 |
| 9 | Really nice (upmarket) Mexican restaurant. Goo... | 4 |
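
With the scores in a DataFrame, a short summary shows how the ratings are distributed. The sketch below rebuilds only the `sentiment` column from the ten values displayed above, so it runs without the model; in the notebook the same calls would be applied to `df` directly. The variable names `scores`, `counts`, and `mean_rating` are illustrative.

```python
import pandas as pd

# Sentiment values copied from the displayed DataFrame (rows 0-9).
scores = pd.DataFrame({"sentiment": [3, 3, 5, 4, 4, 5, 5, 2, 5, 4]})

# Number of reviews per star rating, ordered from lowest to highest rating.
counts = scores["sentiment"].value_counts().sort_index()
# Average rating across all ten reviews.
mean_rating = float(scores["sentiment"].mean())

print(counts.to_dict())  # {2: 1, 3: 2, 4: 3, 5: 4}
print(mean_rating)       # 4.0
```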


In [92]:
df['review'].iloc[4]

'We came here on a Thursday night @ 5pm and by 6pm the place was packed. A lovely big restaurant with a bar at the front (which is a bit awkward to try and push past everyone to get to your table). Friendly, helpful staff which is always a good start. The menu is large so we went with the "feed me" selection. All you need to do is sit back and let the chef feed you. As the other reviewers have stated the corn is a highlight and the pulled pork tacos, the sangria wasn\'t bad either.Loved the Mexican tapas style food and will be back.'