### libraries

In [1]:
!pip install -Uqq fastai transformers

In [6]:
from fastai.text.all import *

### importing a transformers pretrained model

In [7]:
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

In [8]:
pretrained_weights = 'gpt2'

tokenizer = GPT2TokenizerFast.from_pretrained(pretrained_weights)
model = GPT2LMHeadModel.from_pretrained(pretrained_weights)

In [9]:
ids = tokenizer.encode("This is an example.")
ids

[1212, 318, 281, 1672, 13]

In [10]:
tokenizer.decode(ids)

'This is an example.'

In [11]:
t = torch.LongTensor(ids)[None]

preds = model.generate(t)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [12]:
preds.shape, preds[0]

(torch.Size([1, 20]),
 tensor([1212,  318,  281, 1672,   13,  198,  198,  464,  717, 1517,  284,  466,
          318,  284, 2251,  257,  649, 2393, 1444,  366]))

In [13]:
tokenizer.decode(preds[0].numpy())

'This is an example.\n\nThe first thing to do is to create a new file called "'

### Bridging the gap with fastai

In [14]:
path = untar_data(URLs.WIKITEXT_TINY)
path.ls()

(#2) [Path('/root/.fastai/data/wikitext-2/train.csv'),Path('/root/.fastai/data/wikitext-2/test.csv')]

In [15]:
df_train = pd.read_csv(path/'train.csv', header=None)
df_valid = pd.read_csv(path/'test.csv', header=None)

all_text = np.concatenate([df_train[0].values, df_valid[0].values])

In [16]:
class TransformersTokenizer(Transform):
  
  def __init__(self, tokenizer):
    self.tokenizer = tokenizer
  
  def encodes(self, x):
    toks = self.tokenizer.tokenize(x)
    toks = self.tokenizer.convert_tokens_to_ids(toks)
    return tensor(toks)

  def decodes(self, x):
    text = self.tokenizer.decode(x.cpu().numpy())
    text = TitledStr(text)
    return text

In [17]:
splits = [range_of(df_train), list(range(len(df_train), len(all_text)))]

In [18]:
tls = TfmdLists(all_text, 
                TransformersTokenizer(tokenizer), 
                splits = splits,
                dl_type = LMDataLoader
                )

Token indices sequence length is longer than the specified maximum sequence length for this model (4576 > 1024). Running this sequence through the model will result in indexing errors


In [19]:
tls.train[0]

tensor([220, 198, 796,  ..., 198, 220, 198])

In [20]:
tls.tfms(tls.train.items[0]).shape, tls.tfms(tls.valid.items[0]).shape

(torch.Size([4576]), torch.Size([1485]))

In [21]:
show_at(tls.train, 0)

 
 = 2013 – 14 York City F.C. season = 
 
 The 2013 – 14 season was the <unk> season of competitive association football and 77th season in the Football League played by York City Football Club, a professional football club based in York, North Yorkshire, England. Their 17th @-@ place finish in 2012 – 13 meant it was their second consecutive season in League Two. The season ran from 1 July 2013 to 30 June 2014. 
 Nigel Worthington, starting his first full season as York manager, made eight permanent summer signings. By the turn of the year York were only above the relegation zone on goal difference, before a 17 @-@ match unbeaten run saw the team finish in seventh @-@ place in the 24 @-@ team 2013 – 14 Football League Two. This meant York qualified for the play @-@ offs, and they were eliminated in the semi @-@ final by Fleetwood Town. York were knocked out of the 2013 – 14 FA Cup, Football League Cup and Football League Trophy in their opening round matches. 
 35 players made at least

In [34]:
bs, sl = 20, 500

dls = tls.dataloaders(bs = bs, seq_len = sl)

In [35]:
dls.show_batch(dataloader = dls, max_n = 1)

Unnamed: 0,text,text_
0,"\n = Polish culture during World War II = \n \n Polish culture during World War II was suppressed by the occupying powers of Nazi Germany and the Soviet Union, both of whom were hostile to Poland's people and cultural heritage. <unk> aimed at cultural <unk> resulted in the deaths of thousands of scholars and artists, and the theft and destruction of <unk> cultural artifacts. The "" <unk> of the Poles was one of many ways in which the Nazi and Soviet regimes had grown to resemble one another "", wrote British historian Niall Ferguson. \n The <unk> looted and destroyed much of Poland's cultural and historical heritage, while <unk> and murdering members of the Polish cultural elite. Most Polish schools were closed, and those that remained open saw their <unk> altered significantly. \n Nevertheless, underground organizations and individuals – in particular the Polish Underground State – saved much","\n = Polish culture during World War II = \n \n Polish culture during World War II was suppressed by the occupying powers of Nazi Germany and the Soviet Union, both of whom were hostile to Poland's people and cultural heritage. <unk> aimed at cultural <unk> resulted in the deaths of thousands of scholars and artists, and the theft and destruction of <unk> cultural artifacts. The "" <unk> of the Poles was one of many ways in which the Nazi and Soviet regimes had grown to resemble one another "", wrote British historian Niall Ferguson. \n The <unk> looted and destroyed much of Poland's cultural and historical heritage, while <unk> and murdering members of the Polish cultural elite. Most Polish schools were closed, and those that remained open saw their <unk> altered significantly. \n Nevertheless, underground organizations and individuals – in particular the Polish Underground State – saved much of"
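
In the batch shown above, the `text_` column is the `text` column shifted forward by one token (it ends with the extra ` of`), because a language model is trained to predict the next token at every position. The following is a minimal pure-Python sketch of that idea only. It is not fastai's `LMDataLoader` implementation, which also handles batching and shuffling, and `lm_pairs` is a hypothetical helper name introduced here for illustration.

```python
def lm_pairs(token_ids, seq_len):
    """Split a flat token stream into (input, target) chunks.

    Hypothetical helper for illustration: each target chunk is the
    input chunk shifted forward by one token.
    """
    pairs = []
    # The last token has no successor, so stop one short of the end.
    for start in range(0, len(token_ids) - 1, seq_len):
        x = token_ids[start:start + seq_len]
        y = token_ids[start + 1:start + seq_len + 1]
        # Trim the final input chunk so x and y always have equal length.
        x = x[:len(y)]
        pairs.append((x, y))
    return pairs


# Toy token stream standing in for the concatenated, tokenized articles.
stream = list(range(10))
pairs = lm_pairs(stream, seq_len=4)
for x, y in pairs:
    print(x, y)
```

With `seq_len = 500` as in the cell above, every input row would be paired with a target row holding the same tokens moved one position to the left.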


### Fine-tuning the model

In [26]:
# Hugging Face models return a tuple-like output: the first item is the actual prediction, the rest are additional activations

class FixHFOutput(Callback):
    def after_pred(self): self.learn.pred = self.pred[0]

In [36]:
learn = Learner(dls, model, 
                loss_func=CrossEntropyLossFlat(), 
                cbs = [FixHFOutput],
                metrics=Perplexity()).to_fp16()

In [31]:
learn.validate()

(#2) [4.217128276824951,67.83839416503906]
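
The two numbers returned by `learn.validate()` are the validation loss and the `Perplexity()` metric. Perplexity is the exponential of the average cross-entropy loss, so the second value can be recovered from the first. A small check using only the values printed above (the variable names are chosen here for illustration):

```python
import math

valid_loss = 4.217128276824951    # first value returned by learn.validate()
reported_ppl = 67.83839416503906  # second value returned by learn.validate()

# Perplexity is exp(cross-entropy loss).
perplexity = math.exp(valid_loss)
print(perplexity)

# Allow a small tolerance: the notebook computed its value in lower precision.
matches = math.isclose(perplexity, reported_ppl, rel_tol=1e-3)
print(matches)
```

A perplexity of about 68 before any fine-tuning means the pretrained model is, on average, roughly as uncertain as if it were choosing uniformly among 68 tokens at each step.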

In [37]:
x, y = learn.dls.valid.one_batch()
rslt = model(x[0])
print(f"Hugging Face model return a tuple {len(rslt)}, type: {type(rslt)}")
print(f"prediction result shape {rslt[0].shape}")

Hugging Face model return a tuple 2, type: <class 'transformers.modeling_outputs.CausalLMOutputWithCrossAttentions'>
prediction result shape torch.Size([500, 50257])


In [None]:
learn.lr_find(suggest_funcs=[minimum, steep, valley, slide])

In [None]:
learn.fit_one_cycle(1, 1e-4)