<a href="https://colab.research.google.com/github/iasad1/NLP/blob/main/imdb_reviews_TL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Building a sentiment analyser: Applying Transfer Learning

With neural networks, preprocessing steps such as stemming and lemmatisation are no longer necessary. In fact, a neural network may fail to learn the subtleties that stemming or lemmatisation strips away.
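
To make this concrete, here is a toy sketch of what a stemmer throws away. The `crude_stem` function is a deliberately naive suffix stripper written only for this illustration; it is not the Porter stemmer or any library's implementation.

```python
def crude_stem(word):
    """Strip one common English suffix (toy rule, for illustration only)."""
    for suffix in ("ing", "ed", "ly", "s"):
        # Only strip when at least three characters of stem would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["bored", "boring", "interested", "interesting"]
stems = [crude_stem(word) for word in words]

# Pairs of different surface forms end up as the same token.
print(dict(zip(words, stems)))
```

After stemming, "I was interested" and "it was interesting" share a token, so a model trained on the stems can no longer see the difference between them.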

In this notebook, we will build a sentiment classifier trained on the IMDb dataset and then apply transfer learning to detect sentiment in our own dataset.

There are four steps involved in the process:

1. Reading and viewing the IMDb data
2. Getting our own data for modelling
3. Fine-tuning a language model
4. Building the classifier

## About the language model

A language model is an example of a pre-trained, text-based model that is good at predicting the next word that is likely to be typed.

It has a good understanding of English, including which words are happy and which are sad.
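
As a rough intuition for "predicting the next word", the sketch below builds a tiny bigram model from a made-up corpus. It only counts which word follows which; the AWD_LSTM used later in this notebook is far more capable, so treat this as an analogy rather than how that model works. The corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for every word, which words follow it in the corpus."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


toy_corpus = [
    "this movie was great",
    "this movie was terrible",
    "this movie was great fun",
    "the acting was great",
]
bigram_model = train_bigram_model(toy_corpus)

print(predict_next(bigram_model, "movie"))
print(predict_next(bigram_model, "was"))
```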




## Steps

1. Get pre-processed data
2. Create a language model with pre-trained weights that you can fine-tune
3. Create other models, such as a classifier, on top of the encoder of the language model.

# Import libraries

In [3]:
from fastai.text import *

In [4]:
path = untar_data(URLs.IMDB_SAMPLE)

Downloading http://files.fast.ai/data/examples/imdb_sample.tgz


In [5]:
# path??

In [6]:
path

PosixPath('/root/.fastai/data/imdb_sample')

## Data preprocessing

Unlike images, text cannot be fed directly into a pre-trained model. It first has to be split into a list of words (tokens), and these tokens are then converted into numbers. These numbers are passed to an embedding layer, which converts them into arrays of floats before they pass through the rest of the model.
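
The three stages (tokens, numbers, arrays of floats) can be sketched in plain Python. This is a simplified illustration, not fastai's pipeline: the whitespace tokenizer, the helper names, and the random embedding values are all assumptions made for the example. Only the `xxunk` unknown-token name is borrowed from the fastai output shown later in this notebook.

```python
import random


def tokenize(text):
    """Very rough tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_vocab(token_lists, unk_token="xxunk"):
    """Give each distinct token an integer id; id 0 is reserved for unknown tokens."""
    itos = [unk_token]  # id -> token
    for tokens in token_lists:
        for token in tokens:
            if token not in itos:
                itos.append(token)
    stoi = {token: idx for idx, token in enumerate(itos)}  # token -> id
    return itos, stoi


def numericalize(tokens, stoi):
    """Replace each token with its id, falling back to 0 for unknown tokens."""
    return [stoi.get(token, 0) for token in tokens]


reviews = ["this movie was great", "this movie was dull"]
token_lists = [tokenize(review) for review in reviews]
itos, stoi = build_vocab(token_lists)
ids = [numericalize(tokens, stoi) for tokens in token_lists]

# A stand-in embedding table: one small vector of floats per vocabulary entry.
rng = random.Random(0)
embedding_dim = 3
embedding_table = [
    [rng.uniform(-1, 1) for _ in range(embedding_dim)] for _ in itos
]
embedded_review = [embedding_table[token_id] for token_id in ids[0]]

print(ids)
print(len(embedded_review), "tokens, each a vector of", embedding_dim, "floats")
```

In a real model the embedding values are learned during training rather than drawn at random.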

In [7]:
df =  pd.read_csv(path/'texts.csv')

In [8]:
df.head()

Unnamed: 0,label,text,is_valid
0,negative,Un-bleeping-believable! Meg Ryan doesn't even ...,False
1,positive,This is a extremely well-made film. The acting...,False
2,negative,Every once in a long while a movie will come a...,False
3,positive,Name just says it all. I watched this movie wi...,False
4,negative,This movie succeeds at being one of the most u...,False


In [9]:
df.shape

(1000, 3)

## Create a `TextLMDataBunch` suitable for training a language model

Get the data and create a 'databunch' object to use for a language model.

In [10]:
%%time

# Sometimes throws a `BrokenProcessPool` error. Keep trying until it works!
count = 0
error = True
while error:
    try: 
        # The following line throws `AttributeError: backwards` on the learning step, below
        # data_lm = TextDataBunch.from_csv(path, 'texts.csv')
        # This Fastai Forum post shows the solution:
        #      https://forums.fast.ai/t/backwards-attributes-not-found-in-nlp-text-learner/51340?u=jcatanza
        # We implement the solution on the following line:
        data_lm = TextLMDataBunch.from_csv(path, 'texts.csv')
        error = False
        print(f'failure count is {count}\n')    
    except Exception: # catch any ordinary error, but still allow a manual interrupt
        # accumulate failure count
        count = count + 1
        print(f'failure count is {count}')

failure count is 0

CPU times: user 411 ms, sys: 82.1 ms, total: 493 ms
Wall time: 35.8 s
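
The retry loop above runs forever if the error never goes away. One possible alternative is a small helper that gives up after a fixed number of attempts. The `retry` function, its parameters, and the `flaky_loader` stand-in below are hypothetical names written for this sketch; they are not part of fastai.

```python
import time


def retry(make_object, max_attempts=5, delay_seconds=0.0, exceptions=(Exception,)):
    """Call `make_object` until it succeeds; return (result, number_of_failures)."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return make_object(), attempt - 1
        except exceptions as error:
            last_error = error
            print(f"attempt {attempt} failed: {error!r}")
            time.sleep(delay_seconds)
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


# Stand-in for a loader that fails on its first two calls.
calls = {"count": 0}


def flaky_loader():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise RuntimeError("simulated BrokenProcessPool")
    return "databunch placeholder"


result, failure_count = retry(flaky_loader, max_attempts=5)
print(f"failure count is {failure_count}")
```

In this notebook the same helper could wrap the real call, for example `retry(lambda: TextLMDataBunch.from_csv(path, 'texts.csv'))`.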


## Create `databunch` object for use in a classifier object

In [11]:
%%time

# Sometimes throws a `BrokenProcessPool` error. Keep trying until it works!
count = 0
error = True
while error:
    try: 
        # Create the databunch for the classifier model
        data_clas = TextClasDataBunch.from_csv(path, 'texts.csv', vocab=data_lm.train_ds.vocab, bs=32)
        error = False
        print(f'failure count is {count}\n')    
    except Exception: # catch any ordinary error, but still allow a manual interrupt
        # accumulate failure count
        count = count + 1
        print(f'failure count is {count}')

failure count is 0

CPU times: user 396 ms, sys: 82.5 ms, total: 479 ms
Wall time: 35.7 s
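
Passing `vocab=data_lm.train_ds.vocab` makes the classifier use the same token-to-id mapping as the language model. This matters because the encoder's embedding rows are indexed by those ids. The toy sketch below, with made-up texts and a hypothetical `ids_by_first_appearance` helper, shows how the ids drift apart when the classifier builds a vocabulary of its own.

```python
def ids_by_first_appearance(texts):
    """Assign ids to tokens in the order they first appear (toy vocabulary)."""
    token_to_id = {}
    for text in texts:
        for token in text.lower().split():
            token_to_id.setdefault(token, len(token_to_id))
    return token_to_id


language_model_texts = ["great movie", "terrible plot"]
classifier_texts = ["terrible movie"]

shared_vocab = ids_by_first_appearance(language_model_texts)
separate_vocab = ids_by_first_appearance(classifier_texts)

review = "terrible movie".split()
ids_with_shared_vocab = [shared_vocab[token] for token in review]
ids_with_separate_vocab = [separate_vocab[token] for token in review]

print("shared vocab:  ", ids_with_shared_vocab)
print("separate vocab:", ids_with_separate_vocab)
```

With the separate vocabulary, "terrible" gets id 0, which in the language model's table is the row for "great". The encoder would then look up the wrong embedding.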


## Save the `databunch` objects for use later

In [12]:
data_lm.save('data_lm_export.pkl')
data_clas.save('data_cls_export.pkl')

In [13]:
type(data_lm)

fastai.text.data.TextLMDataBunch

## Load the `databunch` objects for use

In [14]:
data_lm = load_data(path,'data_lm_export.pkl',bs = 32)
data_clas = load_data(path, 'data_cls_export.pkl',bs =32)

# IMDb review 'writer': Build and fine-tune a language model

The idea is to create a learner from a pre-trained model and then pass our `data_lm` object through it. Later, we fine-tune the model.

Using fastai, we will download a model with an AWD_LSTM architecture along with its pretrained weights, train it on the available data, and then fine-tune it.

### Set up to use the GPU

In [15]:
torch.cuda.set_device(0)

In [16]:
# torch.cuda??

## Build the IMDb language model

This language model has been pre-trained on WikiText-103, and fastai downloads its pretrained weights.

In [17]:
learn = language_model_learner(data_lm,AWD_LSTM,drop_mult=.5)

Downloading https://s3.amazonaws.com/fast-ai-modelzoo/wt103-fwd.tgz


In [18]:
# language_model_learner??  

## Start training

Now that the language model learner has been created, we will train it on our own data. By default, the learner starts out frozen: only the final layer group is trainable, while the earlier layers keep their pre-trained weights. This first step therefore trains only the final layers.

Once that is done, we will unfreeze all the layers and train them with different learning rates, keeping the learning rate of the earliest layers low.

In [19]:
learn.fit_one_cycle(1,1e-2)

epoch,train_loss,valid_loss,accuracy,time
0,4.320063,3.903686,0.284298,00:08


## Unfreeze the weights and train all the layers
Here we unfreeze all the layers and continue training on the IMDb reviews. Because the model was pre-trained on WikiText, the earliest layers are already well optimised. When learning from new data, it is therefore a good idea to use a lower learning rate for the earlier layers.

In [20]:
learn.unfreeze()
learn.fit_one_cycle(5,slice(1e-4,1e-2))

epoch,train_loss,valid_loss,accuracy,time
0,3.933365,3.866877,0.287485,00:10
1,3.78528,3.905602,0.283836,00:10
2,3.367502,3.95769,0.281065,00:10
3,2.86502,4.035101,0.277602,00:10
4,2.506804,4.093227,0.27637,00:10
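
The `slice(1e-4, 1e-2)` passed to `fit_one_cycle` gives the earliest layers the lowest learning rate and the final layers the highest. As far as we understand, fastai v1 spaces the rates in between geometrically across its layer groups. The sketch below reproduces that idea in plain Python; the `spread_learning_rates` helper is hypothetical, and the three layer groups are an assumption chosen to keep the numbers readable, not the actual group count of the model.

```python
def spread_learning_rates(lowest, highest, n_groups):
    """Geometrically space `n_groups` learning rates from `lowest` to `highest`."""
    if n_groups == 1:
        return [highest]
    ratio = (highest / lowest) ** (1 / (n_groups - 1))
    return [lowest * ratio ** index for index in range(n_groups)]


# Assumed: three layer groups, earliest first.
rates = spread_learning_rates(1e-4, 1e-2, n_groups=3)
for group_index, rate in enumerate(rates):
    print(f"layer group {group_index}: lr = {rate:.1e}")
```

Each group here trains ten times faster than the one before it, so the well-optimised early layers change slowly while the task-specific final layers adapt quickly.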


In [21]:
# learn.fit_one_cycle??

In [22]:
learn.predict("This is a review about",n_words=10)

'This is a review about " The Answer " , a remake of'

In [25]:
learn.predict("When it comes to direction", n_words=100)

"When it comes to direction Mr. Head is a delicate . He works with the nuns who came to his office to handle the assassination attempt . He works for the dream bank and , despite some designs of art , not more than flawless . As an genius piece Hollywood 's co - funniest and every otherwise atmospheric film seems somewhat charming . i do n't know if this film was made for TV even though it was when it was shown on television . But if more than two other movies have been made for"

### Save the encoder

As seen above, the quality of the generated text isn't great. This is because the model has been fine-tuned on a very limited number of IMDb reviews. As the training corpus grows, the quality is expected to improve.

For now, we will save the encoder to be able to use it for the classification task.

In [26]:
learn.save('mini_imdb_language_model')
learn.save_encoder('mini_imdb_language_model_encoder')


## Building the review sentiment classifier

In [33]:
learn = text_classifier_learner(data_clas,AWD_LSTM,drop_mult=.5).to_fp16()

# We use mixed precision (`.to_fp16()`) for greater speed, a smaller memory footprint, and a regularizing effect.
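
To see where the smaller memory footprint comes from, and what is traded for it, the sketch below stores one number as a 16-bit and as a 32-bit float using Python's standard `struct` module. It only illustrates the two number formats. It does not show how fastai implements mixed precision, which typically involves more than casting values.

```python
import struct


def round_trip(value, fmt):
    """Store `value` in the given binary float format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


value = 0.1
as_half = round_trip(value, "<e")    # 16-bit float (half precision)
as_single = round_trip(value, "<f")  # 32-bit float (single precision)

print("bytes per value:", struct.calcsize("<e"), "vs", struct.calcsize("<f"))
print("half precision error:  ", abs(as_half - value))
print("single precision error:", abs(as_single - value))
```

Half precision uses half the bytes per value but represents the number less accurately.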

In [34]:
learn.load_encoder('mini_imdb_language_model_encoder')

RNNLearner(data=TextClasDataBunch;

Train: LabelList (799 items)
x: TextList
xxbos xxup the xxup shop xxup around xxup the xxup corner is one of the xxunk and most feel - good romantic comedies ever made . xxmaj there 's just no getting around that , and it 's hard to actually put one 's feeling for this film into words . xxmaj it 's not one of those films that tries too hard , nor does it come up with the xxunk possible scenarios to get the two protagonists together in the end . xxmaj in fact , all its charm is xxunk , contained within the characters and the setting and the plot ... which is highly believable to xxunk . xxmaj it 's easy to think that such a love story , as beautiful as any other ever told , * could * happen to you ... a feeling you do n't often get from other romantic comedies , however sweet and heart - warming they may be . 
 
  xxmaj alfred xxmaj xxunk ( xxmaj james xxmaj stewart ) and xxmaj xxunk xxmaj xxunk ( xxmaj margaret xxmaj xxunk ) do n't have the most xxun

In [56]:
data_clas.show_batch()


text,target
"xxbos xxmaj raising xxmaj victor xxmaj vargas : a xxmaj review \n \n xxmaj you know , xxmaj raising xxmaj victor xxmaj vargas is like sticking your hands into a big , steaming bowl of xxunk . xxmaj it 's warm and gooey , but you 're not sure if it feels right . xxmaj try as i might , no matter how warm and gooey xxmaj raising xxmaj",negative
"xxbos xxup the xxup shop xxup around xxup the xxup corner is one of the xxunk and most feel - good romantic comedies ever made . xxmaj there 's just no getting around that , and it 's hard to actually put one 's feeling for this film into words . xxmaj it 's not one of those films that tries too hard , nor does it come up with",positive
"xxbos xxmaj now that xxmaj che(2008 ) has finished its relatively short xxmaj australian cinema run ( extremely limited xxunk screen in xxmaj xxunk , after xxunk ) , i can xxunk join both xxunk of "" xxmaj at xxmaj the xxmaj movies "" in taking xxmaj steven xxmaj soderbergh to task . \n \n xxmaj it 's usually satisfying to watch a film director change his style /",negative
"xxbos xxmaj this film sat on my xxmaj xxunk for weeks before i watched it . i dreaded a self - indulgent xxunk flick about relationships gone bad . i was wrong ; this was an xxunk xxunk into the screwed - up xxunk of xxmaj new xxmaj yorkers . \n \n xxmaj the xxunk is the same as xxmaj max xxmaj xxunk ' "" xxmaj la xxmaj ronde",positive
"xxbos xxmaj many neglect that this is n't just a classic due to the fact that it 's the first xxup 3d game , or even the first xxunk - up . xxmaj it 's also one of the first stealth games , one of the xxunk definitely the first ) truly claustrophobic games , and just a pretty well - rounded gaming experience in general . xxmaj with graphics",positive


### Training final layer

In [35]:
learn.fit_one_cycle(1,1e-2)

epoch,train_loss,valid_loss,accuracy,time
0,0.57449,0.558551,0.741294,00:04


### Unfreeze all the layers

In [37]:
learn.unfreeze()
learn.fit_one_cycle(3,slice(1e-4,1e-2))

epoch,train_loss,valid_loss,accuracy,time
0,0.440874,0.591832,0.761194,00:10
1,0.406676,0.395373,0.825871,00:11
2,0.324262,0.385808,0.840796,00:11




### Test the classifier

In [38]:
learn.predict('Although Joker as a movie had it shock moments, much of it felt manufactured')

(Category tensor(1), tensor(1), tensor([0.4201, 0.5799]))

In [45]:
learn.unfreeze()
learn.fit_one_cycle(5, slice(1e-4, 1e-2))


epoch,train_loss,valid_loss,accuracy,time
0,0.094509,0.567058,0.825871,00:10
1,0.118626,0.757057,0.79602,00:11
2,0.13086,0.503329,0.835821,00:11
3,0.112839,0.547296,0.845771,00:10
4,0.082546,0.558863,0.845771,00:10




In [55]:
learn.predict('the joker is an incredible film')

(Category tensor(1), tensor(1), tensor([2.1867e-04, 9.9978e-01]))
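
The last element of each prediction is a tensor with one probability per class, and the predicted class is the one with the highest probability. The sketch below reads the two outputs above in plain Python. The class order `["negative", "positive"]` is an assumption (alphabetical order of the labels in `texts.csv`), and `interpret_prediction` is a hypothetical helper, not a fastai function.

```python
def interpret_prediction(probabilities, classes):
    """Return the most probable class name together with its probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return classes[best_index], probabilities[best_index]


classes = ["negative", "positive"]  # assumed class order

# Probabilities copied from the two `learn.predict` outputs above.
first_review = [0.4201, 0.5799]
second_review = [2.1867e-04, 9.9978e-01]

for probabilities in (first_review, second_review):
    label, confidence = interpret_prediction(probabilities, classes)
    print(f"{label} (probability {confidence:.4f})")
```

Under that assumption, both reviews are classified as positive, but the first one only narrowly, which fits its mixed wording.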