# How I served a simple proverb-generating neural network using PyTorch, fast.ai, and Flask

## The Problem: How could I share my friendly AI with the world?

I'd just finished training my shiny new neural network on about 10,000 proverbs, and it was working! My very first solo AI language model, ready to inspire humanity. I was feeling great!

But my neural net lived only in my Jupyter notebook, not ideal for sharing with friends & fam. How could I serve up my PyTorch model for the web? To the Google!

### Solution 1: Convert from PyTorch to TensorFlow via ONNX and then Serve (Boo!)
At first, I thought it would be easiest to export my model via ONNX and convert it to TensorFlow, after which I would serve it up using TensorFlow.js. It turned out (at the time, at least) that exporting PyTorch models with ONNX had some [limitations](https://pytorch.org/docs/stable/onnx.html#limitations) which applied to me. Okay, so I'd have to stick with PyTorch. (In the end, I'm glad I did!)

### Solution 2: Use Python to create an API for my model (Winner!)
I would just have to find a Python library that I could use to create a simple web API, and then a place to host it. I'd played around a bit with Django, but from what I'd read, it might be a bit heavy for the task at hand. I knew that Flask was a lightweight alternative, and some quick research turned up a trove of tutorials on how to create a REST API using Flask--***Bingo!***
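To give a sense of how little code a Flask endpoint needs, here is a minimal sketch. It is not the API I ended up with: the `/proverb` route, the `seed` query parameter, and the `generate_proverb` placeholder are all made up for illustration, and the placeholder just echoes the seed where a real version would call the model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def generate_proverb(seed: str) -> str:
    # Hypothetical placeholder: a real version would run the language model.
    return seed + " ..."


@app.route("/proverb")
def proverb():
    # Read an optional ?seed=... query parameter (name chosen for this sketch).
    seed = request.args.get("seed", "People are")
    return jsonify({"seed": seed, "proverb": generate_proverb(seed)})
```

Flask's built-in test client (`app.test_client()`) can exercise a route like this without starting a server.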

## Horse Before Cart
So I had my general solution mapped out:
`[Flask API w/ Model] <---> [Web App]`

Simple enough! But before I started crafting the Python file that would become my API, I needed to make sure that my language model could run inference on a **CPU-only** machine, for affordability's sake. Basically, any `cuda()` tensors would need to be converted to `cpu()` tensors. This proved to be my biggest headache, one where I had to baby-step my way toward the solution.

## Architecture & Weights: Gotta Keep 'em Separated

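Exporting the model came with a soft rule: **don't save & export the whole trained model.** It's possible to do so, but not recommended, because a whole-model save is pickled together with references to the exact class definitions and file layout used at training time. Instead, only the model weights, or *state dictionary*, should be saved and then loaded.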

In summary, three things were required before I could ask the model to deliver some sweet sweet wisdom:
1. On the GPU training machine: export *only* the weights--without the architecture--from the trained model
2. On the CPU inference machine: define the architecture of the language model
3. Load the weights I saved from the trained model into this architecture 
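The three steps above can be sketched end to end with a toy model. `TinyNet` and the temporary file path are stand-ins invented for this example, not my actual language model; the `torch.save`, `torch.load(..., map_location=...)`, and `load_state_dict` calls are the same ones used later in this post.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the real language model."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


# 1. Training machine: save only the weights (the state dictionary).
trained = TinyNet()
weights_path = os.path.join(tempfile.mkdtemp(), "tiny_dict")
torch.save(trained.state_dict(), weights_path)

# 2. Inference machine: re-create the same architecture.
fresh = TinyNet()

# 3. Load the weights, mapping any GPU tensors onto the CPU.
state = torch.load(weights_path, map_location="cpu")
fresh.load_state_dict(state)
fresh.eval()  # switch layers such as dropout to inference mode

# Both models should now produce identical outputs.
x = torch.ones(1, 4)
same = torch.equal(trained(x), fresh(x))
print(same)
```

Passing `map_location="cpu"` is what lets a file saved on a GPU machine load on a machine without CUDA.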

### Saving the State Dictionary, or 'Weights'
One of the problems I had at first was that I was trying to export the entire model, architecture and all, using `torch.save(model, "./model")` instead of exporting only the weights (what the model 'learned') via `torch.save(model.state_dict(), "./model")`.

So, for those who've got a PyTorch model trained and ready to be saved, you can do what I did and run:
`torch.save(my_trained_nlp_model.state_dict(), "./nlp_model_dict")`

My model's weights were saved. Glorious! But I wasn't quite ready to import them. First, I had to define the LSTM model architecture and prepare it to receive the weights.

## Calling in the Cavalry

Because I'd used them to train my neural net, I needed fastai and torchtext for inference. Installing Conda and fastai took a while, but I ended up with a nice environment ready to go.

fast.ai installation instructions here: [https://github.com/fastai/fastai](https://github.com/fastai/fastai)

With that out of the way, I created my `zeno.py` file and imported some tools:

In [None]:
from fastai.nlp import *
from fastai.lm_rnn import *
from fastai import sgdr
from torchtext import vocab, data

Next I defined the class for the LSTM model, exactly as it was defined when training the model.

*You can see the explanation for this in lessons 6 & 7 of fast.ai where you get to create various types of RNNs from scratch. LSTM is the final type of neural net in the lesson.*

In [27]:
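class CharSeqStatefulLSTM(nn.Module):
    def __init__(self, vocab_size, n_fac, bs, nl):
        super().__init__()
        self.vocab_size,self.nl = vocab_size,nl
        # Character embedding -> stacked LSTM -> linear layer over the vocabulary.
        # n_hidden is a module-level global that is set further down.
        self.e = nn.Embedding(vocab_size, n_fac)
        self.rnn = nn.LSTM(n_fac, n_hidden, nl, dropout=0.5)
        self.l_out = nn.Linear(n_hidden, vocab_size)
        self.init_hidden(bs)
        
    def forward(self, cs, **kwargs):
        bs = cs[0].size(0)
        # Re-create the hidden state if the batch size has changed
        if self.h[0].size(1) != bs: self.init_hidden(bs)
        self.rnn.flatten_parameters()
        # Make sure the hidden state lives on the CPU for inference
        self.h = (self.h[0].cpu(), self.h[1].cpu())
        ecs = self.e(cs)
        # The returned hidden state h is not stored, so every call starts from self.h
        outp,h = self.rnn(ecs, self.h)
        return F.log_softmax(self.l_out(outp), dim=-1).view(-1, self.vocab_size)
    
    def init_hidden(self, bs):
        # Zero-initialised (hidden, cell) state pair, one slice per LSTM layer
        self.h = (V(torch.zeros(self.nl, bs, n_hidden)),
                  V(torch.zeros(self.nl, bs, n_hidden)))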

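The paths to the training data need to be set before the language model data is defined below.

As far as I know, this is necessary because the vocabulary has to be rebuilt from the same text before the trained weights can be loaded: the vocabulary size (`md.nt`) sets the shape of the embedding and output layers, and the character-to-index mapping has to match the one used in training.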

In [None]:
PATH='data/proverbs/'
TRN_PATH = 'train/'
VAL_PATH = 'valid/'
TRN = PATH + TRN_PATH
VAL = PATH + VAL_PATH

In [None]:
PATH, TRN, VAL

In [None]:
TEXT = data.Field(lower=True, tokenize=list)
bs=64; bptt=8; n_fac=42; n_hidden=512

TEXT

In [None]:
FILES = dict(train=TRN_PATH, validation=VAL_PATH, test=VAL_PATH)
md = LanguageModelData.from_text_files(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=3)

In [32]:
md

<fastai.nlp.LanguageModelData at 0x2048f234048>

In [33]:
m = CharSeqStatefulLSTM(md.nt, n_fac, 256, 2)


In [34]:
m.load_state_dict(torch.load(f'{PATH}models/gen_2_dict', map_location=lambda storage, loc: storage))


In [35]:
m = m.cpu()


In [36]:
m.eval()

CharSeqStatefulLSTM(
  (e): Embedding(59, 42)
  (rnn): LSTM(42, 512, num_layers=2, dropout=0.5)
  (l_out): Linear(in_features=512, out_features=59, bias=True)
)

In [39]:
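def get_next(inp):
    # Convert the input characters to vocabulary indices on the CPU (device=-1)
    idxs = TEXT.numericalize(inp, device=-1)
    pid = idxs.transpose(0,1)
    pid = pid.cpu()
    # Wrap as a fastai variable that does not track gradients
    vpid = VV(pid)
    vpid = vpid.cpu()
    p = m(vpid)
    # p holds log-probabilities; exp() turns the last row back into probabilities to sample from
    r = torch.multinomial(p[-1].exp(), 1)
    return TEXT.vocab.itos[to_np(r)[0]]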

In [41]:
get_next('t')

't'
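The sampling step in `get_next` is worth a closer look: the model returns log-probabilities (from `log_softmax`), `exp()` turns them back into probabilities, and `torch.multinomial` draws one index in proportion to them. Here is a rough pure-Python sketch of the same idea, with `random.choices` standing in for `torch.multinomial`. The three-character vocabulary and the probabilities are made up for illustration.

```python
import math
import random


def sample_next(log_probs, itos, rng=random):
    """Sample one character given a list of log-probabilities."""
    # Undo the log to get ordinary probabilities
    probs = [math.exp(lp) for lp in log_probs]
    # Draw one index, weighted by probability
    idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return itos[idx]


itos = ["a", "b", "c"]  # hypothetical 3-character vocabulary
log_probs = [math.log(0.7), math.log(0.2), math.log(0.1)]

rng = random.Random(0)  # seeded so repeated runs give the same counts
counts = {ch: 0 for ch in itos}
for _ in range(1000):
    counts[sample_next(log_probs, itos, rng)] += 1
print(counts)
```

Because the next character is sampled rather than always taken as the most likely one, the same prompt can produce different proverbs on different runs.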

In [42]:
def get_next_n(inp, n):
    res = inp
    for i in range(n):
        c = get_next(inp)
        res += c
        inp = inp[1:]+c
        if c == '.': break
    return res

In [46]:
get_next_n('People are', 1000)

'People are good fools.'
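The loop in `get_next_n` is a sliding window: each new character is appended to the result, the oldest character is dropped from the context, and generation stops at the first full stop. The sketch below isolates that logic with the model swapped out for a scripted stand-in. `generate` and `make_scripted` are names invented for this example.

```python
from typing import Callable


def generate(seed: str, n: int, next_char: Callable[[str], str]) -> str:
    """Append up to n characters to seed, stopping after the first '.'."""
    result = seed
    context = seed
    for _ in range(n):
        c = next_char(context)
        result += c
        # Drop the oldest character and append the new one, so the
        # context handed to the model keeps a constant length.
        context = context[1:] + c
        if c == ".":
            break
    return result


def make_scripted(script: str) -> Callable[[str], str]:
    """Hypothetical stand-in for the model: replays a fixed script."""
    chars = iter(script)
    return lambda context: next(chars)


text = generate("People are", 1000, make_scripted(" good fools. ignored"))
print(text)  # People are good fools.
```

The `n` argument is only an upper bound; in practice the full stop usually ends the proverb long before 1000 characters.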