In [None]:
from fastai.basics import *
from fastai.gen_doc.nbdoc import *

In [None]:
import fastai
from fastai.version import __version__
print(__version__)

## Text

The next application is text, so let's start by importing everything we'll need.

In [None]:
from fastai.text import *

### Language modelling

First we'll fine-tune a pretrained language model on our subset of IMDB.

In [None]:
imdb = untar_data(URLs.IMDB_SAMPLE)

In [None]:
data_lm = (TextList.from_csv(imdb, 'texts.csv', cols='text')
                   .split_by_rand_pct()
                   .label_for_lm()  # Language model does not need labels
                   .databunch())
data_lm.save()

`data.show_batch()` will work here as well. For a language model, it shows us the beginning of each sequence of text along the batch dimension (the target being to guess the next word).

In [None]:
data_lm.show_batch()
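
To make "the target being to guess the next word" concrete, here is a minimal sketch in plain Python, independent of fastai. The helper `lm_pairs` and its `bptt` argument (the window length) are hypothetical names used only for illustration; the library's own batching is more involved.

```python
def lm_pairs(ids, bptt):
    """Yield (input, target) windows where the target is the input shifted by one token."""
    for i in range(0, len(ids) - 1, bptt):
        y = ids[i + 1:i + 1 + bptt]   # next-token targets
        x = ids[i:i + len(y)]         # inputs, trimmed to the same length as the targets
        yield x, y

# Toy stream of token ids standing in for a numericalized corpus.
stream = list(range(10))
for x, y in lm_pairs(stream, bptt=4):
    print(x, "->", y)
```

Each target window is simply the input window moved one position to the right, so the model is asked to predict token `i + 1` from the tokens up to `i`.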

In [None]:
# Special tokens
# xxbos: marks the beginning of a text.
# xxfld: separates the parts of a document (title, summary, etc.); each part gets its own numbered field (e.g. xxfld 1, xxfld 2).
# xxup: marks a word that was written in all caps; the word itself is lowercased. "I AM SHOUTING" is tokenized as "xxup i xxup am xxup shouting".
# xxunk: used in place of an uncommon word that is not in the vocab.
# xxmaj: marks a word that starts with a capital letter. "The" is tokenized as "xxmaj the".
# xxrep: marks a repeated character; 29 "!" in a row become "xxrep 29 !".
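
The capitalization and repetition rules can be imitated with a few lines of plain Python. This is a simplified sketch, not fastai's tokenizer: the function names are hypothetical, tokens are split on whitespace only, and `xxrep` is applied to runs of three or more identical characters (the threshold is an assumption).

```python
import re

def mark_repeats(text):
    """Replace runs of 3+ identical non-space characters with ' xxrep n c '."""
    def _repl(match):
        return f" xxrep {len(match.group(0))} {match.group(1)} "
    return re.sub(r"(\S)\1{2,}", _repl, text)

def mark_caps(tokens):
    """Lowercase capitalized words and prefix them with xxup or xxmaj."""
    out = []
    for tok in tokens:
        if tok.isalpha() and tok.isupper():
            out += ["xxup", tok.lower()]    # word written in all caps
        elif tok.isalpha() and tok[0].isupper():
            out += ["xxmaj", tok.lower()]   # word starting with a capital letter
        else:
            out.append(tok)
    return out

def toy_tokenize(text):
    """Whitespace tokenizer that applies the special-token rules above."""
    return ["xxbos"] + mark_caps(mark_repeats(text).split())

print(toy_tokenize("I AM SHOUTING"))
print(toy_tokenize("The end !!!!"))
```

Keeping the lowercased word and adding a marker token lets "The" and "the" share one vocab entry while the capitalization information is still available to the model.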

In [None]:
data_lm.vocab.itos[:20]

#### Numericalization
Finally, because it is easier for a machine to work with numbers, each token is replaced by its index in the vocab:

In [None]:
data_lm.train_ds[0][0].data[:10]

The default vocab size is 60,000 tokens, and a token must appear at least twice to be added to the vocab. Both limits keep the weight matrix from getting huge.
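
As a rough illustration of how a size cap and a minimum count shape the vocab, the sketch below builds one from tokenized texts and then numericalizes a sentence. It is plain Python rather than fastai's implementation; `build_vocab`, `numericalize`, and the choice of special tokens placed at the front are assumptions made for the example.

```python
from collections import Counter

def build_vocab(tokenized_texts, max_vocab=60000, min_freq=2, specials=("xxunk", "xxpad")):
    """Keep the most frequent tokens that occur at least min_freq times."""
    counts = Counter(tok for text in tokenized_texts for tok in text)
    itos = list(specials)                      # index -> token
    for tok, freq in counts.most_common(max_vocab):
        if freq >= min_freq and tok not in specials:
            itos.append(tok)
    stoi = {tok: i for i, tok in enumerate(itos)}  # token -> index
    return itos, stoi

def numericalize(tokens, stoi, unk="xxunk"):
    """Map each token to its vocab index; unknown tokens map to the xxunk index."""
    return [stoi.get(tok, stoi[unk]) for tok in tokens]

texts = [["the", "movie", "was", "good"], ["the", "movie", "was", "bad"]]
itos, stoi = build_vocab(texts)
print(itos)
print(numericalize(["the", "movie", "was", "great"], stoi))
```

Here "good" and "bad" each appear only once, so they are left out of the vocab, and the unseen word "great" falls back to the `xxunk` index.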

Now let's define a language model learner. `drop_mult` is a hyperparameter used for regularization: it sets the amount of dropout. If the model is overfitting, increase it; if it is underfitting, decrease it.

In [None]:
learn = language_model_learner(data_lm, AWD_LSTM)
learn.fit_one_cycle(4, 1e-2)
learn.save('mini_train_lm')
learn.save_encoder('mini_train_encoder')

Then we can have a look at the results. It shows a certain number of words (20 by default), then the next 20 target words and the ones that were predicted.

In [None]:
learn.show_results()

In [None]:
learn.predict('When I saw this movie the second time', 100)


In [None]:
learn.predict('As I was going up the stair I met a man who wasn\'t there.', 100)

The learning rate is one of the most important hyperparameters when training a model. fastai provides a convenient utility (`learn.lr_find`) that searches through a range of learning rates to find a good one for our dataset. The learning rate finder increases the learning rate after each mini-batch until it becomes so high that the loss gets worse. Look at the plot of loss against learning rate, find the lowest point (around 1e-1 in the plot), then go back by one order of magnitude and use that as the learning rate (something around 1e-2).
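
The selection rule described above can be written down in a few lines. The sketch below is plain Python with hypothetical helper names; it assumes the learning rate grows geometrically between a start and an end value, and the loss values are made-up toy numbers, not results from this notebook.

```python
def lr_schedule(start_lr, end_lr, num_it):
    """Geometrically increasing learning rates, one per mini-batch."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]

def suggest_lr(lrs, losses):
    """Find the learning rate with the lowest loss and step back one order of magnitude."""
    best = min(range(len(losses)), key=losses.__getitem__)
    return lrs[best] / 10

# Toy values: the loss bottoms out at 1e-1 and then blows up.
lrs = [1e-4, 1e-3, 1e-2, 1e-1, 1.0]
losses = [4.0, 3.5, 2.9, 2.4, 6.0]
print(suggest_lr(lrs, losses))
```

Stepping back from the minimum is a safety margin: at the lowest point of the curve the loss has stopped improving, so a slightly smaller rate is less likely to diverge.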

### Classification

Now let's see a classification example. We have to use the same vocabulary as for the language model if we want to be able to use the encoder we saved.

In [None]:
data_clas = (TextList.from_csv(imdb, 'texts.csv', cols='text', vocab=data_lm.vocab)
                   .split_from_df(col='is_valid')
                   .label_from_df(cols='label')
                   .databunch(bs=42))

Here show_batch shows the beginning of each review with its target.

In [None]:
data_clas.show_batch()

And we can train a classifier that uses our previous encoder.

In [None]:
learn_cl = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn_cl.load_encoder('mini_train_encoder')
learn_cl.fit_one_cycle(6, slice(1e-3,1e-2))
learn_cl.save('mini_train_clas')

**Momentum**  
There is one more argument, `moms=(0.8,0.7)`, which sets the momentum values to 0.8 and 0.7. For training recurrent neural networks (RNNs), it really helps to decrease the momentum a little. In the momentum plot, every time the learning rate is small, the momentum is high. Why is that? With a small learning rate, if you keep going in the same direction you may as well go faster (higher momentum). With a high learning rate, if you keep going in the same direction you may overshoot the target, so the momentum should be lowered. This trick can help you train 10 times faster.

In [None]:
learn_cl.fit_one_cycle(2, slice(1e-3,1e-2), moms=(0.8, 0.7))
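
The inverse relationship between learning rate and momentum can be seen in a small schedule sketch. This is an illustration in plain Python, not fastai's code: the function names, the cosine interpolation, and the `pct_start` and `div` values are assumptions chosen for the example.

```python
import math

def cos_interp(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)

def one_cycle(n_iter, lr_max=1e-2, moms=(0.8, 0.7), pct_start=0.3, div=25.0):
    """Return (lr, momentum) per iteration: lr rises then falls, momentum does the opposite."""
    warm = max(1, int(n_iter * pct_start))
    sched = []
    for i in range(n_iter):
        if i < warm:
            pct = i / warm                       # warm-up phase
            lr = cos_interp(lr_max / div, lr_max, pct)
            mom = cos_interp(moms[0], moms[1], pct)
        else:
            pct = (i - warm) / (n_iter - warm)   # annealing phase
            lr = cos_interp(lr_max, lr_max / (div * 1e4), pct)
            mom = cos_interp(moms[1], moms[0], pct)
        sched.append((lr, mom))
    return sched

sched = one_cycle(100)
for i in (0, 30, 99):
    lr, mom = sched[i]
    print(f"iter {i:3d}  lr={lr:.5f}  mom={mom:.3f}")
```

In this sketch the momentum reaches its lowest value (0.7) at the same iteration where the learning rate peaks, and returns toward 0.8 as the learning rate decays.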

In [None]:
learn_cl.recorder.plot_losses()

In [None]:
learn_cl.show_results()

In [None]:
preds, y, losses = learn_cl.get_preds(with_loss=True)
interp = ClassificationInterpretation(learn_cl, preds, y, losses)
interp.plot_confusion_matrix()

### Exercise
Make sentiment predictions for 5 movie reviews from the IMDB website.

In [None]:
learn_cl.predict("Very beautiful and cinematic movie with lots of classic scenes.Also extremely sad at times.Absolute 90's classic.")

In [None]:
learn_cl.predict("Worst movie of the century. A MUST see movie if you have a sleeping disorder.The money spent to make this movie could feed millions of starving children.John Cameron maybe an okay director but cannot hold a candle to Robert Wise.")

In [None]:
learn_cl.predict("I thought this movie extremely disappointing. The characters were not very motivating, and the acting was horrible. I was not convinced of their roles at all. The story line was also very poor and unrealistic. I would not recommend this movie to anyone, instead I suggest the older version made in 1953. Although it isn't recent, the story is very moving and the ending quite touching.")

In [None]:
learn_cl.predict("This one really draws you in. Keeps you thinking and emotionally engaged throughout the entire film. Great plot, relevant and terrific acting. Nuff said. Watch it.")

In [None]:
learn_cl.predict("Watch this movie, get a stimulation of sense of justice so that you keep quiet for another decade.")