# State of the Art in NLP

|  | Tuesday 4-5:15pm | Friday  4-5:30pm |
|:------:|:-------------------------------------------:|:--------------------------------------------------------------------------:|
| **Week 1** | Introduction | Introduction |
| **Week 2** | Custom computer vision tasks | State of the art in Computer Vision |
| **Week 3** | Introduction to Tabular modeling and pandas | Pandas workshop and feature engineering |
| **Week 4** | Tabular and Image Regression | Feature importance and advanced feature  engineering |
| **Week 5** | Natural Language Processing | **State of the art in NLP** |
| **Week 6** | Segmentation and Kaggle | Audio |
| **Week 7** | Computer vision from scratch | NLP from scratch |
| **Week 8** | Callbacks | Optimizers |
| **Week 9** | Generative Adversarial Networks | Research time / presentations |
| **Week 10** | Putting models into production | Putting models into production |

Today will run just like last week: we will use the sample dataset so that everything fits within class time, but you should run the bottom part yourself to see the full extent.

Topics we will be looking at:

* Backwards models
* Combining models

# What is a backwards model?

It does exactly what you would think: it reads the language backwards. Let's look at a databunch for each direction.

In [0]:
from fastai.text import *

In [0]:
path = untar_data(URLs.IMDB_SAMPLE)

In [0]:
df = pd.read_csv(path/'texts.csv')

In [0]:
df_train = df[:800]
df_valid = df[800:]

In [0]:
data_lm_bwd = TextLMDataBunch.from_df(path, train_df = df_train, valid_df=df_valid, backwards=True)
data_lm_fwd = TextLMDataBunch.from_df(path, train_df = df_train, valid_df=df_valid, backwards=False)

In [0]:
data_lm_fwd.show_batch()

idx,text
0,"! ! ! xxmaj finally this was directed by the guy who did xxmaj big xxmaj xxunk ? xxmaj must be a replay of xxmaj jonestown - hollywood style . xxmaj xxunk ! xxbos xxmaj this is a extremely well - made film . xxmaj the acting , script and camera - work are all first - rate . xxmaj the music is good , too , though it is"
1,"first loved him and then later hated him because he was xxunk . xxmaj he tries to explain to us the reasons he did what he did , but it 's really really so hard to xxunk . xxmaj such sad and unusual self destruction . xxmaj was it supposed to be funny ? xxmaj what was it all about really ? xxbos ( aka : xxup blood xxup castle"
2,"sleeping was funny . xxmaj the xxunk bubble , and when xxmaj pumba leaves , the xxunk stop . xxmaj it 's all harmless fun , good for kids and some adults . i think this movie will last for a while because it is rather good for a straight to xxmaj video and xxup dvd movie . xxmaj while the movie does seem a little odd and kind of"
3,"most people have said was rooting for the homeless people to make it , specially the guy , he gave me a few cheap laughs here and there . i think this film could have really been something special instead it became what every other horror nowadays are ! xxmaj just boring and well not worth the money . \n \n if you are looking for a cheap scare"
4,"xxmaj the director is unable to resist showing the destruction of a major landmark ( xxmaj big xxmaj ben ) , but at least does n't dwell xxunk on the xxunk of xxmaj london . \n \n xxmaj the victory of the xxmaj martians is hardly a surprise , despite the destruction by xxunk of some of their machines . xxmaj the xxmaj narrator , traveling about to seek"


In [0]:
data_lm_bwd.show_batch()

idx,text
0,"convincing more make makeup without queens drag seen have i . completely fail they , now by guessed have will you as xxmaj . women white specific two as themselves disguise to required are brothers wayans xxmaj the however xxmaj . xxunk white to change a was script the by required was that all if enough bad be would it xxmaj . screen on put ever xxunk convincing least the"
1,"as ( xxunk xxmaj robert xxmaj xxunk which in ) 1932 ( ' zombie xxmaj white xxmaj ' film earlier , better brothers xxunk xxmaj the from themes basic its of lot a borrows story the xxmaj . ) 1958 ( ' creole xxmaj king xxmaj ' and ) 1956 ( ' unknown xxmaj the xxmaj x ' , ) xxunk ( ' christmas xxmaj white xxmaj ' in roles"
2,"late the in popular so became that formula horror teen stereotypical the takes , legend xxmaj urban xxmaj 's xxunk for responsible also was who , blanks xxmaj jamie xxmaj director xxmaj . horror / comedy a almost is valentine xxmaj . get to something is there but , this like film a get n't did i saying ridiculous sound might it xxmaj \n \n . "" a """
3,". end its toward moves it when it in quality redeeming some have does movie does enough surprisingly xxmaj xxbos . forever "" xxunk - xxunk "" of land the in lost being of consequences the suffer might she or fast , it find to needs she xxmaj . lost still is vampire xxmaj the with interview xxmaj in saw i that xxunk the , however character 's pegg xxmaj"
4,"boy every , beautiful is melissa xxmaj since xxmaj . week one in up coming is xxunk her and old years fifteen 's she , town in girl new a is melissa xxmaj = plot xxmaj xxbos . director the sack xxmaj . better been have could xxmaj . pity xxmaj . day rainy a on entertaining - xxunk is it : general xxmaj \n \n . watch to"
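
To make the reversal concrete, here is a minimal sketch in plain Python. It is not fastai's internal code: `reverse_tokens` is a hypothetical helper that assumes the text has already been tokenized and is separated by spaces, as in the batches above.

```python
def reverse_tokens(text: str) -> list:
    """Split already-tokenized text on whitespace and return the tokens in reverse order.

    Hypothetical helper for illustration only; fastai handles this internally
    when `backwards=True` is passed.
    """
    return text.split()[::-1]


forward = "xxbos xxmaj this is a well - made film ."
backward = reverse_tokens(forward)

print(" ".join(backward))
# → . film made - well a is this xxmaj xxbos
```

Notice that special tokens such as `xxmaj` now come after the word they modify, which matches what we see in the backwards batch.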


It's as simple as that! One model reads forwards, one model reads backwards. This is what Jeremy et al. used to achieve state-of-the-art results on the IMDB dataset. That's a lot of models to train, so let's automate it.

In [0]:
fwd = language_model_learner(data_lm_fwd, AWD_LSTM, drop_mult=0.3).to_fp16()
bwd = language_model_learner(data_lm_bwd, AWD_LSTM, drop_mult=0.3).to_fp16()

In [0]:
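def train_models(fwd:Learner, bwd:Learner):
  "Fine-tune the forward and backward language models, saving each model and its encoder"
  bs = fwd.data.batch_size
  lr = 1e-2 * bs/48  # scale the base learning rate with the batch size
  for name, model in zip(['forward', 'backward'], [fwd, bwd]):
    # one epoch with the pretrained body frozen, then one epoch on the whole model
    model.fit_one_cycle(1, lr, moms=(0.8, 0.7))
    model.unfreeze()
    model.fit_one_cycle(1, lr/10, moms=(0.8, 0.7))
    # the saved encoders are loaded into the classifiers later
    model.save(f'{name}_fine_tuned_10')
    model.save_encoder(f'{name}_fine_tuned_enc_10')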

We're going to overfit terribly here, but remember that it's a small dataset. Normally we would also train for many more epochs.
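
The `bs/48` factor in `train_models` scales the base learning rate linearly with the batch size. Here is a small sketch of that arithmetic on its own; `scaled_lr` and `reference_bs` are names made up for this example, and the reading of 48 as the batch size the base rate of `1e-2` was chosen for is an assumption.

```python
def scaled_lr(base_lr: float, batch_size: int, reference_bs: int = 48) -> float:
    """Scale a base learning rate linearly with the batch size.

    Hypothetical helper: mirrors `lr = 1e-2; lr *= bs/48` from train_models.
    """
    return base_lr * batch_size / reference_bs


# Half the reference batch size halves the rate, double the batch size doubles it
for bs in (24, 48, 96):
    print(bs, scaled_lr(1e-2, bs))
```

So if you shrink the batch size to fit on a smaller GPU, the learning rate shrinks with it automatically.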

In [0]:
train_models(fwd, bwd)

epoch,train_loss,valid_loss,accuracy,time
0,4.228386,3.871615,0.290223,00:04


epoch,train_loss,valid_loss,accuracy,time
0,3.755796,3.813086,0.295759,00:05


epoch,train_loss,valid_loss,accuracy,time
0,4.264483,3.918674,0.314375,00:04


epoch,train_loss,valid_loss,accuracy,time
0,3.787022,3.848752,0.32125,00:05


Great! Now we have our language models. Let's train the classifiers.

In [0]:
data_cls_bwd = TextClasDataBunch.from_df(path, train_df = df_train, valid_df=df_valid, backwards=True, 
                                         vocab=data_lm_bwd.vocab)
data_cls_fwd = TextClasDataBunch.from_df(path, train_df = df_train, valid_df=df_valid, backwards=False,
                                         vocab=data_lm_fwd.vocab)

In [0]:
data_cls_bwd.show_batch()

text,target
"\n \n . grits of bowl a into hands my stick go to going 'm i , me excuse 'll you if so xxmaj ? him about film a like i can how ultimately so , him like n't do i way either xxmaj ' ? xxunk xxmaj - xxunk xxmaj xxunk my into get to want ju , bee - bay hey xxmaj ` , car his from",negative
". could possibly very we that feeling it from away come we that is film this about thing best the ; film this in depicted romance the have to as lucky so be all should we xxmaj . role his in impeccable is stewart xxmaj and , charming , funny , sweet 's it xxmaj . instead christmas xxmaj this corner xxup the xxup around xxup shop xxup the xxup",positive
") 10 / xxunk the about xxunk a just even know who all of hearts the break to bound is it xxmaj . costs all at avoided be to , film soderbergh xxmaj a for , shockingly -- misery irritating , confusing just is xxunk cinema interminable this of rest the xxmaj \n \n . subject 's film the of execution and capture -- tragic yet -- dramatic the",negative
". end xxup the xxup . other each at look they xxmaj . window the down xxunk , up drives movie the in earlier her pay n't did who contractor the xxmaj . business does normally she where corner the at xxunk and , street the down walking prostitute the see we and , off goes gun the , eventually xxmaj . aim to where her telling , mouth his",positive
"10 / 8 . games 3d xxup of fans all to this recommended i . appeared first he after years fifteen nearly ) ? especially even maybe xxunk , recognition the deserves he ... shoes 's xxunk xxup into step and bunker the enter to door the open , xxunk xxmaj the up load so xxmaj . one this to existence their owe , genre the of rest the as",positive


In [0]:
data_cls_fwd.show_batch()

text,target
"xxbos xxmaj raising xxmaj victor xxmaj vargas : a xxmaj review \n \n xxmaj you know , xxmaj raising xxmaj victor xxmaj vargas is like sticking your hands into a big , steaming bowl of xxunk . xxmaj it 's warm and gooey , but you 're not sure if it feels right . xxmaj try as i might , no matter how warm and gooey xxmaj raising xxmaj",negative
"xxbos xxup the xxup shop xxup around xxup the xxup corner is one of the sweetest and most feel - good romantic comedies ever made . xxmaj there 's just no getting around that , and it 's hard to actually put one 's feeling for this film into words . xxmaj it 's not one of those films that tries too hard , nor does it come up with",positive
"xxbos xxmaj now that xxmaj che(2008 ) has finished its relatively short xxmaj australian cinema run ( extremely limited xxunk screen in xxmaj sydney , after xxunk ) , i can xxunk join both xxunk of "" xxmaj at xxmaj the xxmaj movies "" in taking xxmaj steven xxmaj soderbergh to task . \n \n xxmaj it 's usually satisfying to watch a film director change his style /",negative
"xxbos xxmaj this film sat on my xxmaj xxunk for weeks before i watched it . i dreaded a self - indulgent xxunk flick about relationships gone bad . i was wrong ; this was an xxunk xxunk into the screwed - up xxunk of xxmaj new xxmaj yorkers . \n \n xxmaj the format is the same as xxmaj max xxmaj xxunk ' "" xxmaj la xxmaj ronde",positive
"xxbos xxmaj many neglect that this is n't just a classic due to the fact that it 's the first xxup 3d game , or even the first xxunk - up . xxmaj it 's also one of the first xxunk games , one of the xxunk definitely the first ) truly claustrophobic games , and just a pretty well - xxunk gaming experience in general . xxmaj with graphics",positive


In [0]:
learn_cls_fwd = text_classifier_learner(data_cls_fwd, AWD_LSTM, drop_mult=0.5).to_fp16()
learn_cls_bwd = text_classifier_learner(data_cls_bwd, AWD_LSTM, drop_mult=0.5).to_fp16()

In [0]:
learn_cls_fwd.load_encoder('forward_fine_tuned_enc_10')
learn_cls_bwd.load_encoder('backward_fine_tuned_enc_10')
learn_cls_fwd.freeze()
learn_cls_bwd.freeze()

This time `train_models` returns the trained learners, so we can ensemble them afterwards.

In [0]:
def train_models(fwd:Learner, bwd:Learner):
  "Gradually unfreeze and train both classifiers, then return them for ensembling"
  bs = fwd.data.batch_size
  lr = 1e-2 * bs/48  # scale the base learning rate with the batch size
  for model in [fwd, bwd]:
    # last layer group only (both learners were frozen above)
    model.fit_one_cycle(1, lr, moms=(0.8, 0.7))

    # last two layer groups, with discriminative learning rates
    model.freeze_to(-2)
    model.fit_one_cycle(1, slice(lr/(2.6**4),lr), moms=(0.8,0.7))

    # last three layer groups
    model.freeze_to(-3)
    model.fit_one_cycle(1, slice(lr/2/(2.6**4),lr/2), moms=(0.8,0.7))

    # all layer groups
    model.unfreeze()
    model.fit_one_cycle(2, slice(lr/10/(2.6**4),lr/10), moms=(0.8,0.7))
  return fwd, bwd
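
The `slice(lr/(2.6**4), lr)` arguments give the earliest layers the smallest learning rate and the last layers the largest. Below is a rough sketch of how such a range can be spread across layer groups. It assumes the rates are spaced geometrically and that the classifier has 5 layer groups; `spread_lrs` is a hypothetical helper, not a fastai function, so treat it as an illustration rather than the library's exact behaviour.

```python
def spread_lrs(lo: float, hi: float, n_groups: int) -> list:
    """Return n_groups learning rates spaced geometrically from lo to hi.

    Hypothetical helper: sketches how a slice(lo, hi) could be expanded
    into one learning rate per layer group.
    """
    if n_groups == 1:
        return [hi]
    step = (hi / lo) ** (1 / (n_groups - 1))
    return [lo * step ** i for i in range(n_groups)]


lr = 1e-2
lrs = spread_lrs(lr / 2.6 ** 4, lr, n_groups=5)  # 5 groups is an assumption

for i, group_lr in enumerate(lrs):
    print(f"layer group {i}: {group_lr:.6f}")
```

Under these assumptions each layer group trains with a rate 2.6 times larger than the group before it, which is where the `2.6**4` comes from: four steps between five groups.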

In [0]:
learn_cls_fwd, learn_cls_bwd = train_models(learn_cls_fwd, learn_cls_bwd)

epoch,train_loss,valid_loss,accuracy,time
0,0.583756,0.64921,0.755,00:03


epoch,train_loss,valid_loss,accuracy,time
0,0.485301,0.597441,0.635,00:03


epoch,train_loss,valid_loss,accuracy,time
0,0.393136,0.465417,0.815,00:04


epoch,train_loss,valid_loss,accuracy,time
0,0.278782,0.414369,0.81,00:05
1,0.247707,0.40936,0.81,00:05


epoch,train_loss,valid_loss,accuracy,time
0,0.609281,0.652833,0.595,00:03


epoch,train_loss,valid_loss,accuracy,time
0,0.576736,0.583028,0.695,00:03


epoch,train_loss,valid_loss,accuracy,time
0,0.458296,0.520996,0.715,00:04


epoch,train_loss,valid_loss,accuracy,time
0,0.386745,0.531537,0.725,00:06
1,0.365691,0.496964,0.76,00:05


Now let's ensemble!

# Ensemble

In [0]:
preds_a, targs_a = learn_cls_fwd.get_preds(ordered=True)
preds_b, targs_b = learn_cls_bwd.get_preds(ordered=True)

In [0]:
accuracy(preds_a, targs_a), accuracy(preds_b, targs_b)

(tensor(0.8100), tensor(0.7600))

In [0]:
preds_avg = (preds_a + preds_b)/2

In [0]:
accuracy(preds_avg, targs_b)

tensor(0.8150)

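See? We got some improvement! The averaged predictions reach 81.5% accuracy, a little better than either the forward model (81%) or the backward model (76%) alone. On a validation set this small the gain is tiny, so try it on the full dataset too.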
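
Why can averaging help? Here is a toy sketch in plain Python with made-up probabilities (these numbers are invented for illustration and are not outputs of the models above). `average_probs` and `toy_accuracy` are hypothetical helpers, named so they don't clash with fastai's `accuracy`.

```python
def average_probs(a: list, b: list) -> list:
    """Average two lists of per-class probabilities, row by row."""
    return [[(x + y) / 2 for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def toy_accuracy(probs: list, targets: list) -> float:
    """Fraction of rows whose most probable class matches the target."""
    preds = [row.index(max(row)) for row in probs]
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)


# Made-up predictions for 4 reviews, columns are [negative, positive]
targets = [0, 0, 1, 1]
probs_fwd = [[0.9, 0.1], [0.45, 0.55], [0.1, 0.9], [0.2, 0.8]]  # wrong on review 2
probs_bwd = [[0.8, 0.2], [0.7, 0.3], [0.3, 0.7], [0.6, 0.4]]    # wrong on review 4
probs_avg = average_probs(probs_fwd, probs_bwd)

print(toy_accuracy(probs_fwd, targets))  # → 0.75
print(toy_accuracy(probs_bwd, targets))  # → 0.75
print(toy_accuracy(probs_avg, targets))  # → 1.0
```

Each model makes a mistake on a different review, and in both cases the other model is confident enough to pull the average back to the right answer. Ensembling works best when the models make different mistakes, which is why pairing a forward model with a backward one is a natural fit.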