In [1]:
! pip install -Uqq fastbook
import fastbook
fastbook.setup_book()



In [2]:
from fastai.text.all import *
import pandas as pd

## 1. Preparing the data

In [3]:
path = untar_data(URLs.YELP_REVIEWS_POLARITY)
path.ls()

(#3) [Path('/root/.fastai/data/yelp_review_polarity_csv/readme.txt'),Path('/root/.fastai/data/yelp_review_polarity_csv/train.csv'),Path('/root/.fastai/data/yelp_review_polarity_csv/test.csv')]

In [4]:
! cat {path}/'readme.txt'

Yelp Review Polarity Dataset

Version 1, Updated 09/09/2015

ORIGIN

The Yelp reviews dataset consists of reviews from Yelp. It is extracted from the Yelp Dataset Challenge 2015 data. For more information, please refer to http://www.yelp.com/dataset_challenge

The Yelp reviews polarity dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the above dataset. It is first used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).


DESCRIPTION

The Yelp reviews polarity dataset is constructed by considering stars 1 and 2 negative, and 3 and 4 positive. For each polarity 280,000 training samples and 19,000 testing samples are take randomly. In total there are 560,000 trainig samples and 38,000 testing samples. Negative polarity is class 1, and positive class 2.

The files train.csv and test.csv contain all the

In [8]:
column_names = ["label", "text"]

In [13]:
df = pd.read_csv(path/'train.csv', names=column_names, nrows=25000)

In [14]:
df.tail()

,label,text
24995,2,"I haven't flown in 5 or 6 years and wasn't really looking forward to experiencing the \""joy\"" that is an airport. I've always heard you have to be at the airport two hours in advance and with our flight being at 6:30am that means being there are 4:30am. Luckily, the traffic at this hour was minimal and getting in was a breeze. Additionally, the airport seemed pretty dead. Not too many other flyers going at this hour on a Friday morning. We were able to go through the full body scanners relatively quickly and then wait around for our plane. (We headed on over to La Grand Orange - see review..."
24996,2,"Love this airport. The airport itself is very nice, with friendly staff and volunteers (look for the people in purple before TSA) to help you get where you are going. Pre security is pretty nice. Good selection of restaurants. Also, this has to be one of the few large airports in the US to offer free and reasonably fast Wi-Fi. I personally enjoy the speed and efficiency of this airport.\nFor those of you who say that the airport is in need of some new carpet: I agree. Those of you who think the layout is stupid: you are wrong. Airports aren't supposed to be a work of art. If you were to kn..."
24997,2,Overall I really like this airport. Access to the airport from multiple roadways helps reduce the traffic coming in. I usually fly United or US Airways (star alliance) here and security is always quite fast. Once you are inside the concourses are laid out efficiently and there are plenty of people movers in between. There is free wifi here that is actually quite fast 3.89Mbps download and 31.48Mbps upload(speedtest.net). Pretty good selection of restuarants. As far as airports go this one is highly recommended.
24998,1,"If you're heading to Terminal 3, make sure to eat and get coffee BEFORE going through security. Food/shopping option in the terminal are terrible!"
24999,1,Crowded with overpriced food :(


In [15]:
dls_lm = DataBlock(
    blocks=TextBlock.from_df("text", is_lm=True),
    get_items=ColReader("text"),
    splitter=RandomSplitter(0.1)
).dataloaders(df, bs=128, seq_len=80)

In [16]:
dls_lm.show_batch(max_n=2)

,text,text_
0,"xxbos xxmaj we went to the xxmaj mad xxmaj mex in xxmaj cranberry xxmaj xxunk and wo n't make the mistake of going back . \n\n xxmaj the food was horribly hot , like they just toss in heat for heat 's sake , rather than for taste . i ordered the mole chicken enchiladas and that was xxup not any kind of mole xxmaj i 've ever had . xxmaj the sauce was more like a weak , thin","xxmaj we went to the xxmaj mad xxmaj mex in xxmaj cranberry xxmaj xxunk and wo n't make the mistake of going back . \n\n xxmaj the food was horribly hot , like they just toss in heat for heat 's sake , rather than for taste . i ordered the mole chicken enchiladas and that was xxup not any kind of mole xxmaj i 've ever had . xxmaj the sauce was more like a weak , thin ranchero"
1,"street from the xxmaj suns arena and i think this is really the last time xxmaj i""m going to subject myself to this place . xxmaj i 've been to xxup many hrc 's across the globe and this one is the worst one . \n\n xxmaj ca n't xxup someone put a decent place in downtown xxmaj phoenix ? xxup please ! xxbos xxmaj pane xxmaj bianco somehow got even cuter . xxmaj since the last time we were","from the xxmaj suns arena and i think this is really the last time xxmaj i""m going to subject myself to this place . xxmaj i 've been to xxup many hrc 's across the globe and this one is the worst one . \n\n xxmaj ca n't xxup someone put a decent place in downtown xxmaj phoenix ? xxup please ! xxbos xxmaj pane xxmaj bianco somehow got even cuter . xxmaj since the last time we were here"
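The two columns in the batch above are the same token stream offset by one position: `text` is the model input and `text_` is the next-token target. A minimal sketch of that offset in plain Python (the token list and variable names are illustrative, not fastai internals):

```python
# Illustrative token stream, copied from the start of the first batch row above.
tokens = ["xxbos", "xxmaj", "we", "went", "to", "the", "xxmaj", "mad", "xxmaj", "mex"]

# Input: every token except the last. Target: every token except the first.
inputs = tokens[:-1]
targets = tokens[1:]

# At each position the model is asked to predict the token that follows.
for x, y in zip(inputs[:3], targets[:3]):
    print(f"{x!r} -> {y!r}")
```

This is why the `text_` column starts one token later than `text` and ends with one extra word.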


## 2. Fine-tuning the language model on the downstream text corpus

In [17]:
learn = language_model_learner(dls_lm, AWD_LSTM, metrics=[accuracy, Perplexity()], drop_mult=0.3).to_fp16()

In [18]:
learn.fit_one_cycle(1, 2e-2)

epoch,train_loss,valid_loss,accuracy,perplexity,time
0,3.951237,3.831378,0.283394,46.126045,10:27
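The perplexity column is the exponential of the validation cross-entropy loss, which can be checked against the row above. A quick sanity check using only the numbers from that output:

```python
import math

valid_loss = 3.831378            # validation loss reported for the epoch above
reported_perplexity = 46.126045  # perplexity reported for the same epoch

# Perplexity is exp(mean cross-entropy loss).
perplexity = math.exp(valid_loss)
print(perplexity, reported_perplexity)
```

Lower perplexity means the model is, on average, less "surprised" by the next token in the validation reviews.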


In [19]:
learn.save("1epoch")

Path('models/1epoch.pth')

In [20]:
learn = learn.load("1epoch")

In [21]:
learn.unfreeze()
learn.fit_one_cycle(6, 2e-3)

epoch,train_loss,valid_loss,accuracy,perplexity,time
0,3.687118,3.706221,0.297525,40.699726,11:54
1,3.589683,3.621173,0.306904,37.381405,12:00
2,3.425974,3.578948,0.312289,35.835815,08:05
3,3.271107,3.56787,0.314391,35.44101,08:01
4,3.111527,3.585646,0.314162,36.07666,08:31
5,3.010373,3.606695,0.31285,36.844078,08:01


In [22]:
learn.save_encoder("finetuned")

With the fine-tuned language model, we can now "generate" review text from a short prompt!

In [23]:
TEXT = "This is utterly"
N_WORDS = 50
N_SENTENCES = 2
preds = [learn.predict(TEXT, N_WORDS, temperature=0.75) for _ in range(N_SENTENCES)]

In [24]:
print('\n'.join(preds))

This is utterly overrated and pretty overrated . i come here for a good deal of odd dim sum dishes , but it 's just not that good . Trendy places with a weird look , great prices , and great food . This place was pretty good , and the
This is utterly DISGUSTING Mexican food . The food is greasy and the service is awful . The rice was terrible , the rice was fried , the rice was hard and the potatoes were cold . The chips and salsa were the worst tasting I 've
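The `temperature` argument controls how adventurous the sampling is: values below 1 concentrate probability on the most likely next tokens, while values above 1 flatten the distribution. The sketch below illustrates the general idea with a temperature-scaled softmax over made-up logits. It is not fastai's internal implementation, and `softmax_with_temperature` is a hypothetical helper.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens

p_default = softmax_with_temperature(logits, temperature=1.0)
p_cool = softmax_with_temperature(logits, temperature=0.75)

print([round(p, 3) for p in p_default])
print([round(p, 3) for p in p_cool])
```

With `temperature=0.75`, as used in the cell above, the most likely token gets a larger share of the probability mass, so the generated text is somewhat more conservative than sampling at temperature 1.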


## 3. Preparing data for the downstream task

In [25]:
# Map the numeric classes to readable labels (1 = negative, 2 = positive).
df["label"] = df["label"].replace({1: "neg", 2: "pos"})

In [26]:
df.label

0        neg
1        pos
2        neg
3        neg
4        pos
        ... 
24995    pos
24996    pos
24997    pos
24998    neg
24999    neg
Name: label, Length: 25000, dtype: object

In [27]:
dls_clas = DataBlock(
    blocks=(TextBlock.from_df("text", vocab=dls_lm.vocab), CategoryBlock),
    get_x=ColReader("text"),
    get_y=ColReader("label"),
    splitter=RandomSplitter(0.25)
).dataloaders(df, bs=128, seq_len=72)

In [28]:
dls_clas.show_batch(max_n=2)

,text,category
0,"xxbos xxmaj let 's get right to it . xxmaj the smell of xxup grilled xxup onions hits you before you even touch the front door . xxmaj that is the definition of a xxmaj chicago dog / burger joint . xxmaj that xxup is xxmaj xxunk xxmaj street . xxmaj that xxup is xxmaj chicago . \n\n xxmaj the xxmaj food - i ordered the fire dog , a cheese slider , and fried zucchini . \n xxmaj the xxunk did n't let me down in the authenticity department . xxmaj good meat , the correct and required condiments ( neon relish , celery salt , sport peppers , tomato , mustard , onion . and seedy seed bun ) , cooked just right , and it smelled up the office when i opened it . \n xxmaj the slider was the ex - factor . i saw '",pos
1,"xxbos i tried . \n\n i really did . i really wanted this to be "" my watch guy "" for repair and servicing . \n\n xxmaj after reading all of the amazing reviews about ' the xxmaj watch xxmaj repair xxmaj company ' i decided to bring in a few of my old watches with the hope of giving them new life . i brought in older xxmaj seiko xxunk that needed a battery , cleaning , and some scratches buffed out . i also brought my xxmaj skagen xxmaj world xxup xxunk watch that needed a crown replacement , battery , and cleaning along with my vostok - europe xxmaj xxunk that needed the rotating bezel pin replaced or fixed ( to keep the bezel from falling off ) . \n\n i was certain this was going to be the beginning of a wonderful business relationship . xxmaj",neg


## 4. Training the classifier model

In [29]:
learn = text_classifier_learner(dls_clas, AWD_LSTM, drop_mult=0.5, metrics=accuracy).to_fp16()

In [30]:
learn = learn.load_encoder("finetuned")

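We now train the classifier using the gradual unfreezing technique introduced in the ULMFiT paper: first only the classifier head is trained, then progressively more layer groups are unfrozen, and finally the whole model is trained with lower learning rates for the earlier layers.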

In [31]:
learn.fit_one_cycle(1, 2e-2)

epoch,train_loss,valid_loss,accuracy,time
0,0.264936,0.175361,0.93072,00:41


In [32]:
learn.freeze_to(-2)
learn.fit_one_cycle(1, slice(1e-2/(2.6**4), 1e-2))

epoch,train_loss,valid_loss,accuracy,time
0,0.231054,0.164219,0.9352,00:45


In [33]:
learn.freeze_to(-3)
learn.fit_one_cycle(1, slice(5e-3/(2.6**4), 5e-3))

epoch,train_loss,valid_loss,accuracy,time
0,0.186952,0.160553,0.936,01:03


In [34]:
learn.unfreeze()
learn.fit_one_cycle(2, slice(1e-3/(2.6**4), 1e-3))

epoch,train_loss,valid_loss,accuracy,time
0,0.133989,0.142144,0.94496,01:18
1,0.118296,0.144714,0.94592,01:18
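The `slice(lr/(2.6**4), lr)` arguments above spread learning rates across the model's layer groups, with the lowest rate for the earliest group and the highest for the classifier head. Assuming five layer groups (an assumption about how the AWD-LSTM classifier is split, not something shown in this notebook) and geometric spacing, each group's rate is 2.6 times that of the group before it. `discriminative_lrs` is a hypothetical helper written for illustration:

```python
def discriminative_lrs(max_lr, n_groups=5, factor=2.6):
    """Geometrically spaced learning rates, lowest first, ending at max_lr."""
    return [max_lr / factor ** (n_groups - 1 - i) for i in range(n_groups)]

# Rates corresponding to slice(1e-2/(2.6**4), 1e-2), assuming five layer groups.
lrs = discriminative_lrs(1e-2)
for i, lr in enumerate(lrs):
    print(f"group {i}: {lr:.6f}")
```

Under that assumption, the exponent 4 in `2.6**4` is simply the number of steps between the first and last of the five groups.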


In [35]:
learn.predict("This is your standard Chick-Fil-A - consistently good food with friendly staff.\n\nThis Chick is located off of South Blvd in the same shopping center as a movie theater, Target, Old Navy, Bed Bath & Beyond, Kohl's, Michaels and a bunch of other stuff. The traffic pattern in the center can be a bit of a cluster, but isn't that true of most shopping centers? Coming through a little before 11:30 am on a Thursday, I drove straight up to the call box, ordered my #1, quickly paid and was on my way.")

('pos', tensor(1), tensor([0.0093, 0.9907]))

In [36]:
learn.predict("Highly overpriced!! Even after you haggle down the price is above average with a lower than average quality product. On top of that, their storage is tiny, so you have to come back later to pick your purchase up.\nEXAMPLE: Bought a TV stand. Waited THREE DAYS to be able to pick it up. Waited in line for the pickup for nearly an hour. Got home and SIX PIECES WERE MISSING!! Called and they were happy to provide replacements, but I had to WAIT SIX MORE DAYS for the replacement parts to arrive. When we went to pick up the parts they were reluctant to give them to us")

('neg', tensor(0), tensor([9.9971e-01, 2.8574e-04]))

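It is interesting to see how strongly individual words can influence the model's predictions. The review below contains only positive adjectives, yet the model labels it negative with a probability of about 0.89.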

In [37]:
learn.predict("The experience we had was very good but the waiters were also good")

('neg', tensor(0), tensor([0.8920, 0.1080]))

In [38]:
learn.export("reviews.pkl")