<a href="https://colab.research.google.com/github/tanuja-2/ML_OPs-main/blob/main/LanguageModel_Playground.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Generation from a language model

This exercise gives a very quick introduction to the Hugging Face Transformers library and shows how to use it to generate continuations for a given text.

First we install the library.


In [1]:
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.28.1-py3-none-any.whl (7.0 MB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Collecting huggingface-hub<1.0,>=0.11.0
  Downloading huggingface_hub-0.14.1-py3-none-any.whl (224 kB)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.14.1 tokenizers-0.13.3 transformers-4.28.1


Next, we import some classes and instantiate a Tokenizer and a Model. 

Remember, a tokenizer converts text into a sequence of integer IDs, one per token. For GPT-2, tokens are subword units rather than whole words.
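
To make the text-to-IDs idea concrete, here is a minimal toy sketch. The vocabulary and the helper names (`toy_vocab`, `toy_encode`, `toy_decode`) are made up for illustration and are not part of the Transformers library; the real GPT-2 tokenizer works on subword units, not on a whitespace split.

```python
# Toy illustration only: the vocabulary below is hand-written, and the real
# GPT-2 tokenizer uses byte-level BPE subwords instead of whitespace splitting.
toy_vocab = {"<unk>": 0, "the": 1, "woman": 2, "cried": 3, "for": 4, "help": 5}
toy_id_to_word = {idx: word for word, idx in toy_vocab.items()}


def toy_encode(text):
    """Map each lowercased, whitespace-separated word to its integer ID."""
    return [toy_vocab.get(word, toy_vocab["<unk>"]) for word in text.lower().split()]


def toy_decode(ids):
    """Map integer IDs back to words and join them with spaces."""
    return " ".join(toy_id_to_word[i] for i in ids)


toy_ids = toy_encode("The woman cried for")
print(toy_ids)              # [1, 2, 3, 4]
print(toy_decode(toy_ids))  # the woman cried for
```

Words missing from the toy vocabulary fall back to the `<unk>` ID, which is one reason real tokenizers prefer subword units: they can represent unseen words as a sequence of smaller known pieces.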

"gpt2-large" is a pretrained model that generates text continuations. You can try replacing it with other models such as "gpt2" or "gpt2-xl".

In [2]:
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")
model = AutoModelForCausalLM.from_pretrained("gpt2-large")



In [3]:
import torch
model.to("cuda:0" if torch.cuda.is_available() else "cpu")  # Move the model to the GPU when one is available; this enables faster execution.

GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 1280)
    (wpe): Embedding(1024, 1280)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-35): 36 x GPT2Block(
        (ln_1): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((1280,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=1280, out_features=50257, bias=False)
)

Let's now try to generate some text. 

`prompt`: the input text for which we want to generate continuations.

`input_ids`: the tensor of integer token IDs that the `prompt` is mapped to.

`outputs`: What does this contain? 

In [6]:
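import torch

prompt = "The woman cried for"
# Tokenize the prompt and move the token IDs to the same device as the model.
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
# The prompt contains no padding, so every position is attended to (all ones).
attention_mask = torch.ones_like(input_ids)
# Beam search with 5 beams, returning the 5 best sequences.
# max_length counts the prompt tokens as well as the generated ones.
outputs = model.generate(input_ids, attention_mask=attention_mask, num_beams=5, num_return_sequences=5, max_length=20)

# Convert each returned sequence of token IDs back into text.
tokenizer.batch_decode(outputs, skip_special_tokens=True)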


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


['The woman cried for help.\n\n"I\'m sorry, I\'m sorry, I\'m sorry',
 'The woman cried for help.\n\n"I\'m sorry," the man said.\n\n"',
 'The woman cried for help.\n\n"I\'m sorry, I didn\'t mean to hurt you',
 'The woman cried for help.\n\n"I\'m sorry, I\'m sorry," the man said',
 'The woman cried for help.\n\n"I\'m sorry," the man said. "I\'m']
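
The five continuations above share a long common prefix because `generate` was called with `num_beams=5`, which selects beam search. The sketch below illustrates the basic idea on a toy scale. Everything in it is an assumption made for illustration: the probability table `TOY_NEXT` is invented, the helper `toy_beam_search` is hypothetical, and the real implementation in Transformers is more involved (for example, it scores with a full language model and handles finished sequences and length penalties differently).

```python
import math

# Hypothetical next-token probabilities for a toy model that only looks at
# the previous token. A real language model conditions on the whole prefix.
TOY_NEXT = {
    "for": {"help": 0.6, "joy": 0.3, "hours": 0.1},
    "help": {".": 0.7, "again": 0.3},
    "joy": {".": 0.9, "again": 0.1},
    "hours": {".": 0.5, "again": 0.5},
    "again": {".": 1.0},
}


def toy_beam_search(prompt_tokens, num_beams, num_return_sequences, max_length):
    """Keep the num_beams highest-scoring partial sequences at every step."""
    beams = [(0.0, list(prompt_tokens))]  # (log-probability, tokens)
    finished = []
    while beams:
        candidates = []
        for score, tokens in beams:
            # A beam is finished at "." (our end marker) or at max_length.
            if tokens[-1] == "." or len(tokens) >= max_length:
                finished.append((score, tokens))
                continue
            # Otherwise extend it with every possible next token.
            for token, prob in TOY_NEXT[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Prune: only the best num_beams candidates survive to the next step.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:num_beams]
    finished.sort(key=lambda c: c[0], reverse=True)
    return [" ".join(tokens) for _, tokens in finished[:num_return_sequences]]


toy_results = toy_beam_search(
    ["The", "woman", "cried", "for"],
    num_beams=2, num_return_sequences=2, max_length=7,
)
print(toy_results)
# ['The woman cried for help .', 'The woman cried for joy .']
```

With `num_beams=1` the same function reduces to greedy decoding, which keeps only the single most probable token at each step.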