<a href="https://colab.research.google.com/github/jonkrohn/NLP-with-LLMs/blob/main/code/DistilGPT2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DistilGPT2

This notebook uses a compact LLM called *DistilGPT2* for a generative "Hello, World!"-type demonstration.

To make this quick and easy, we leverage `transformers`, a popular Python library for natural language processing (NLP) models (especially transformer architectures!) and tools.

In [None]:
# Install the transformers library if it isn't already installed
#!pip install transformers

The `pipeline` function allows you to easily use a pre-trained model for a specific NLP task, e.g.:
* `"sentiment-analysis"`
* `"ner"` (named-entity recognition)
* `"summarization"`
* `"translation_en_to_fr"`
* `"feature-extraction"`

In [1]:
from transformers import pipeline

We'll use [DistilGPT2](https://huggingface.co/distilbert/distilgpt2), a compact (82-million parameter) generative LLM, for `text-generation`.

More on **DistilGPT2**:
* English-language model
* Was pre-trained under the supervision of the smallest (124-million parameter) version of [OpenAI's GPT-2](https://openai.com/research/better-language-models), through a process called *knowledge distillation*
* Was designed to be a faster, lighter version of GPT-2
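To give a rough feel for knowledge distillation, here is a minimal sketch of a soft-target loss: the student is penalized according to how far its next-token distribution sits from the teacher's. The logits, the three-word vocabulary, and the `temperature` value are hypothetical, and this generic loss is an assumption for illustration, not necessarily the exact recipe used to train DistilGPT2.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical next-token scores over a three-word vocabulary
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # nearly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different token entirely

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student that nearly agrees with the teacher receives the smaller loss, and training pushes the student's parameters in that direction.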

In [2]:
model = pipeline("text-generation", model="distilgpt2")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Way back in the GPT-2 days, "few-shot" prompting (including a few solved examples in the prompt itself) was often required to nudge an LLM toward your particular task:

In [3]:
prompt = "The capital of China is Beijing. The capital of Germany is Berlin. The capital of France is"
output = model(prompt, max_new_tokens=2, num_return_sequences=1)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [4]:
print(output[0]['generated_text'])

The capital of China is Beijing. The capital of Germany is Berlin. The capital of France is Paris,
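A few-shot prompt like the one above follows a simple pattern: several solved examples, then one unsolved query in the same format. As a minimal sketch, it can be assembled programmatically; the helper name `build_few_shot_prompt` and its `template` parameter are hypothetical, not part of `transformers`.

```python
def build_few_shot_prompt(examples, query, template="The capital of {} is"):
    """Join the solved examples, then end with the unsolved query."""
    shots = [f"{template.format(country)} {capital}." for country, capital in examples]
    return " ".join(shots + [template.format(query)])

# The two solved "shots" from the prompt used in this notebook
examples = [("China", "Beijing"), ("Germany", "Berlin")]
prompt = build_few_shot_prompt(examples, "France")
print(prompt)
```

This reproduces the prompt string used in the generation cell, and swapping in other example pairs or another query country needs no manual string editing.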


N.B.: Generation involves random sampling, so the output varies between runs; you may need to re-run the generation cell several times before DistilGPT2 outputs `Paris`.
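Repeated runs can disagree because generation may draw each token at random rather than always taking the most likely one. The toy sketch below contrasts the two strategies using only the standard library; the probabilities in `next_token_probs` are invented for illustration and are not DistilGPT2's actual outputs.

```python
import random

# Hypothetical next-token probabilities, invented for illustration only
next_token_probs = {" Paris": 0.5, " the": 0.3, " London": 0.2}

def greedy_pick(probs):
    """Always return the single most probable token."""
    return max(probs, key=probs.get)

def sample_pick(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed so this sketch is repeatable
samples = [sample_pick(next_token_probs, rng) for _ in range(10)]

print(greedy_pick(next_token_probs))  # same answer on every call
print(samples)                        # a mix of tokens across draws
```

In `transformers`, calling `set_seed` before generating, or passing `do_sample=False` to the pipeline call, should have the analogous effect of making runs repeatable, although greedy decoding still only returns `Paris` if the model ranks it first.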