<a href="https://colab.research.google.com/github/marianqian/Intro-to-ML-and-DL-Using-fast.ai/blob/master/notebooks/Lesson_7_Implementing_RNNs_with_fast_ai.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Welcome to the AI Academy! This is the seventh and final lesson, which focuses on using the fast.ai library to build recurrent neural networks (RNNs). We used [fastai](https://www.fast.ai/) in the Lesson 4, 5, and 6 Google Colab notebooks, so please make sure you understand how the data was loaded and processed and how normal, or fully connected, neural networks work, because we will build on that knowledge in this notebook.

You can learn more about fastai [here](https://docs.fast.ai/); the library is split into four parts: vision, text, tabular, and collab models. fastai focuses on neural networks, and in this lesson we continue exploring how to use the library.

The creator of fastai, Jeremy Howard, also teaches a course that explains how to use the library and introduces deep learning to people with no prior experience. We highly recommend watching his videos, linked [here](https://course.fast.ai/videos/?lesson=1), when you have the time.

NOTE: Educational use and distribution is permitted, but credit and attribution to AIM Academy is required. 

# Learning Objectives:
* Understand how to use recurrent neural networks (RNNs)
* Use the `fastai.text` section of the `fastai` library.


In this notebook, we base our code on the [fastai Lesson 7 Human numbers notebook](https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson7-human-numbers.ipynb). We will use the `fastai.text` section of the `fastai` library to train a recurrent neural network to count. Given three words, the model will predict the word that comes next. The training examples we give to our model are spelled-out sequences of the numbers from 1 to 9,999 (**"one , two , three , four , five ... eight thousand , eight thousand one ... nine thousand nine hundred ninety nine"**).

For example, if we give our model "ten , eleven , twelve", we want our model to predict "thirteen" as a result. 
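
To make the task concrete, here is a minimal plain-Python sketch of how a token sequence can be turned into (context, next word) pairs. The helper name `make_pairs` and the fixed window of three tokens are illustrative assumptions; the fastai pipeline used later in this notebook builds its batches differently.

```python
def make_pairs(tokens, context_size=3):
    """Slide a window over `tokens`, pairing each context with the token that follows it.

    Hypothetical helper for illustration only; not part of fastai.
    """
    pairs = []
    for i in range(len(tokens) - context_size):
        context = tokens[i:i + context_size]
        target = tokens[i + context_size]
        pairs.append((context, target))
    return pairs

# Whole numbers are treated as single tokens here to mirror the example above.
tokens = ["ten", "eleven", "twelve", "thirteen", "fourteen"]
pairs = make_pairs(tokens)
for context, target in pairs:
    print(context, "->", target)
```

The first pair maps the context "ten", "eleven", "twelve" to the target "thirteen", which is the prediction we want from the model.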


```
#CODE BELOW IS FULLY CREDITED TO FASTAI (fast.ai)
#USED ONLY FOR EDUCATIONAL PURPOSES UNDER FAIR USE
```


In [0]:
from fastai.text import *

Here, we pass `URLs.HUMAN_NUMBERS` to the `untar_data` function to download the data we want to use for the RNN. After calling `path.ls()`, we can see that the data is stored in the `human_numbers` folder, which contains two files, `train.txt` and `valid.txt`.

In [0]:
path = untar_data(URLs.HUMAN_NUMBERS)
path.ls()

Downloading http://files.fast.ai/data/examples/human_numbers


[PosixPath('/root/.fastai/data/human_numbers/train.txt'),
 PosixPath('/root/.fastai/data/human_numbers/valid.txt')]

We define the function `readnums` to read the lines of a `.txt` file and join them into a single comma-separated string, which is returned inside a one-element list.

In [0]:
def readnums(d): return [', '.join(o.strip() for o in open(path/d).readlines())]

In [0]:
train_txt = readnums('train.txt')
train_txt[0][:80]

'one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirt'

In [0]:
valid_txt = readnums('valid.txt')
valid_txt[0][-80:]

' nine thousand nine hundred ninety eight, nine thousand nine hundred ninety nine'

The lists (`train_txt` and `valid_txt`) each contain a single string with all the words from the `.txt` files. Slicing that string with `[:80]` or `[-80:]` gives the first 80 characters of `train_txt` and the last 80 characters of `valid_txt`, which are printed above.

The training dataset includes the numbers from 1 to 8,000, and the validation set includes the rest, 8,001 to 9,999.
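
The split described above can be reproduced with a small plain-Python sketch. The helper `to_words` is hypothetical and written only for illustration; the real files are downloaded by `untar_data`, and this sketch merely assumes they follow the same spelling convention (no hyphens, no "and").

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]

def to_words(n):
    """Spell out an integer in the range 1..9999 (hypothetical helper)."""
    parts = []
    if n >= 1000:
        parts += [ONES[n // 1000], "thousand"]
        n %= 1000
    if n >= 100:
        parts += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        parts.append(TENS[n // 10])
        n %= 10
    if n > 0:
        parts.append(ONES[n])
    return " ".join(parts)

# Same split as described above: 1..8000 for training, 8001..9999 for validation.
train_text = ", ".join(to_words(n) for n in range(1, 8001))
valid_text = ", ".join(to_words(n) for n in range(8001, 10000))

print(train_text[:80])
print(valid_text[-80:])
```

The two printed slices should match the 80-character excerpts shown earlier for `train_txt` and `valid_txt`.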

Using the `train_txt` and `valid_txt` lists, we can use the `TextList` class, a subclass of the `ItemList` class, to store the words.

We then create a `DataBunch` from the `train` and `valid` TextLists. The DataBunch can be passed directly into the model we create later.

Note how we set our batch size `bs` to be 64. 

In [0]:
train = TextList(train_txt, path=path)
valid = TextList(valid_txt, path=path)

data = ItemLists(path=path, train=train, valid=valid).label_for_lm().databunch(bs=64)

The validation set contains about 13,000 tokens in total. The backpropagation through time length `bptt` is 70, which is the number of tokens in each sequence of a batch, and the validation data loader yields 3 batches.

In [0]:
len(data.valid_ds[0][0].data)

13017

In [0]:
data.bptt, len(data.valid_dl)

(70, 3)
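
As a rough consistency check on these numbers, the sketch below assumes that each batch holds `bs` sequences of `bptt` tokens and that a final partial batch is kept. This is back-of-the-envelope arithmetic, not necessarily the exact formula fastai uses internally.

```python
import math

n_tokens = 13017   # length of the validation set (printed above)
bs = 64            # batch size
bptt = 70          # tokens per sequence in a batch

tokens_per_batch = bs * bptt
n_batches = math.ceil(n_tokens / tokens_per_batch)
print(tokens_per_batch, n_batches)
```

Under these assumptions a batch holds 4,480 tokens, so the validation set needs 3 batches, which agrees with `len(data.valid_dl)`.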

Run the following code blocks to see how the data and its labels are organized. 



In [0]:
v = data.valid_ds.vocab

In [0]:
it = iter(data.valid_dl)
x1,y1 = next(it)

In [0]:
v.textify(x1[0])

'xxbos eight thousand one , eight thousand two , eight thousand three , eight thousand four , eight thousand five , eight thousand six , eight thousand seven , eight thousand eight , eight thousand nine , eight thousand ten , eight thousand eleven , eight thousand twelve , eight thousand thirteen , eight thousand fourteen , eight thousand fifteen , eight thousand sixteen , eight thousand seventeen , eight'

In [0]:
v.textify(y1[0])

'eight thousand one , eight thousand two , eight thousand three , eight thousand four , eight thousand five , eight thousand six , eight thousand seven , eight thousand eight , eight thousand nine , eight thousand ten , eight thousand eleven , eight thousand twelve , eight thousand thirteen , eight thousand fourteen , eight thousand fifteen , eight thousand sixteen , eight thousand seventeen , eight thousand'

`x1` is one batch from the validation dataset, and `y1` holds the labels for that batch. Notice that `y1` is the same sequence of words as `x1`, shifted by one token: the correct answer for any given word is the word that follows it.
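
The relationship between inputs and labels can be illustrated in plain Python. This is a minimal sketch on a short, hand-written token list; it does not use the fastai data loader.

```python
tokens = "xxbos eight thousand one , eight thousand two ,".split()

x = tokens[:-1]   # inputs: every token except the last
y = tokens[1:]    # labels: the same sequence, starting one token later

for inp, label in zip(x, y):
    print(f"{inp:>10} -> {label}")
```

Each input token is paired with the token that comes immediately after it, just as `y1[0]` lines up with `x1[0]` above.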

Rather than relying on a prebuilt fastai learner, we build the RNN ourselves with PyTorch. We create a class called `Model`, and inside it we use the individual layers that PyTorch provides.

In [0]:
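nv = len(v.itos)   # vocabulary size
nh = 64            # embedding and hidden-state size
bs = 64            # batch size

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.i_h = nn.Embedding(nv, nh)              # token index -> embedding
        self.rnn = nn.RNN(nh, nh, batch_first=True)  # recurrent layer
        self.h_o = nn.Linear(nh, nv)                 # hidden state -> vocabulary scores
        self.bn = BatchNorm1dFlat(nh)                # batch norm applied to the RNN outputs
        self.h = torch.zeros(1, bs, nh).cuda()       # initial hidden state (requires a GPU)

    def forward(self, x):
        res, h = self.rnn(self.i_h(x), self.h)
        self.h = h.detach()   # carry the hidden state to the next batch without its gradient history
        return self.h_o(self.bn(res))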
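
To show what the recurrent layer does at each time step, here is a minimal plain-Python sketch of the recurrence used by a single-layer tanh RNN. It is an illustration rather than PyTorch's implementation: the names `rnn_step` and `rand_matrix` are hypothetical, the sizes are toy values, and the two bias vectors that `nn.RNN` keeps are folded into a single `b`.

```python
import math
import random

def rnn_step(x, h, W_ih, W_hh, b):
    """One RNN step: h_new = tanh(W_ih @ x + W_hh @ h + b)."""
    h_new = []
    for j in range(len(h)):
        total = b[j]
        total += sum(W_ih[j][k] * x[k] for k in range(len(x)))
        total += sum(W_hh[j][k] * h[k] for k in range(len(h)))
        h_new.append(math.tanh(total))
    return h_new

def rand_matrix(rows, cols):
    """Small random weight matrix as a list of lists."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
n_in, n_hidden = 3, 4                  # toy sizes; the model above uses 64 for both
W_ih = rand_matrix(n_hidden, n_in)     # input-to-hidden weights
W_hh = rand_matrix(n_hidden, n_hidden) # hidden-to-hidden weights
b = [0.0] * n_hidden

# Three toy "embeddings", one per time step.
sequence = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
h = [0.0] * n_hidden                   # initial hidden state of zeros
outputs = []
for x in sequence:
    h = rnn_step(x, h, W_ih, W_hh, b)  # the same weights are reused at every step
    outputs.append(h)

print(len(outputs), len(outputs[0]))
```

The loop produces one hidden state per time step, and each state depends on both the current input and the previous state. In `Model`, the linear layer `h_o` then turns each of these states into a score for every word in the vocabulary.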

Now we can create a `Learner` object called `learn` by passing in the data we want to use and the model. We include `accuracy` as one of the metrics to print out.

In [0]:
learn = Learner(data, Model(), metrics=accuracy)
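
For intuition about what the accuracy metric measures here, the following plain-Python sketch counts how often the highest-scoring word matches the target word. The function `accuracy_sketch` and the toy scores are made up for illustration; fastai's `accuracy` operates on tensors instead.

```python
def accuracy_sketch(scores, targets):
    """Fraction of positions where the highest-scoring class equals the target."""
    correct = 0
    for row, target in zip(scores, targets):
        predicted = max(range(len(row)), key=lambda i: row[i])  # argmax over the vocabulary
        correct += int(predicted == target)
    return correct / len(targets)

# Four positions, a toy vocabulary of three words.
scores = [
    [0.1, 2.0, 0.3],
    [1.5, 0.2, 0.1],
    [0.0, 0.4, 0.9],
    [0.2, 0.1, 0.05],
]
targets = [1, 0, 1, 0]
print(accuracy_sketch(scores, targets))
```

Three of the four predictions match their targets, so the sketch reports an accuracy of 0.75.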

We train our model for 30 epochs with a maximum learning rate of 3e-3. The actual learning rate changes during training according to the one cycle learning rate pattern.

In [0]:
learn.fit_one_cycle(30, 3e-3)

| epoch | train_loss | valid_loss | accuracy | time |
|---|---|---|---|---|
| 0 | 3.780158 | 3.664972 | 0.03058 | 00:00 |
| 1 | 3.637743 | 3.427254 | 0.180208 | 00:00 |
| 2 | 3.400543 | 2.94113 | 0.424033 | 00:00 |
| 3 | 3.07592 | 2.417015 | 0.461384 | 00:00 |
| 4 | 2.733226 | 2.175397 | 0.467411 | 00:00 |
| 5 | 2.435847 | 2.212817 | 0.314583 | 00:00 |
| 6 | 2.203952 | 2.214181 | 0.315253 | 00:00 |
| 7 | 2.028279 | 2.183154 | 0.316443 | 00:00 |
| 8 | 1.895301 | 2.160851 | 0.317039 | 00:00 |
| 9 | 1.793393 | 2.171996 | 0.317188 | 00:00 |
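
The one cycle learning rate pattern used for training can be sketched in plain Python. This is a simplified illustration, not fastai's implementation: the function names `cos_anneal` and `one_cycle_lr` are hypothetical, and the defaults (`pct_start=0.3`, `div_factor=25`, and a final rate of `max_lr / (div_factor * 1e4)`) are assumptions modeled on fastai's defaults rather than values taken from this notebook.

```python
import math

def cos_anneal(start, end, pct):
    """Cosine interpolation: returns `start` at pct=0 and `end` at pct=1."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)

def one_cycle_lr(step, total_steps, max_lr=3e-3, pct_start=0.3, div_factor=25.0):
    """Learning rate at `step` (0-based) for a hypothetical one cycle schedule."""
    warmup_steps = max(1, int(total_steps * pct_start))
    start_lr = max_lr / div_factor            # assumed starting rate
    final_lr = max_lr / (div_factor * 1e4)    # assumed final rate
    if step < warmup_steps:
        # Phase 1: ramp up from start_lr to max_lr.
        return cos_anneal(start_lr, max_lr, step / warmup_steps)
    # Phase 2: anneal from max_lr down to final_lr.
    pct = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return cos_anneal(max_lr, final_lr, pct)

total_steps = 100
schedule = [one_cycle_lr(s, total_steps) for s in range(total_steps)]
peak_step = max(range(total_steps), key=lambda s: schedule[s])
print(f"start={schedule[0]:.2e} peak={schedule[peak_step]:.2e} "
      f"at step {peak_step} end={schedule[-1]:.2e}")
```

The rate starts low, rises to the value passed to `fit_one_cycle` (3e-3 here), and then decays to a very small value by the end of training.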


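Using a simple RNN model, we were able to reach an accuracy of 60.8% at predicting the next word! Note that the training log shown above covers only epochs 0 to 9, where validation accuracy peaks at about 46.7% (epoch 4), so the 60.8% figure does not appear in that excerpt. Adding more layers to the model may increase the accuracy further.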