# 4.2 How to use GPT

GPT (Generative Pretrained Transformer) is a model trained to generate text given a preceding input (Brown et al. 2020). It can do this repeatedly, up to a certain length, and in this way generate, for example, short stories.
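The repeated generation step described above can be sketched with a toy example. This is only an illustration of the loop, not of GPT itself: the `NEXT_TOKENS` table is a hypothetical stand-in for a trained model, and it only looks at the last token, whereas GPT conditions on the whole preceding text.

```python
import random

# Hypothetical next-token table standing in for a trained language model.
NEXT_TOKENS = {
    "once": ["upon"],
    "upon": ["a"],
    "a": ["time", "dragon"],
    "time": ["there"],
    "there": ["was"],
    "was": ["a"],
    "dragon": ["<eos>"],
}


def generate(prompt, max_new_tokens=8, seed=0):
    """Extend the prompt one token at a time, up to max_new_tokens."""
    rng = random.Random(seed)
    tokens = prompt.lower().split()
    if not tokens:
        return ""
    for _ in range(max_new_tokens):
        candidates = NEXT_TOKENS.get(tokens[-1])
        if not candidates:
            break  # the toy table has no continuation for this token
        next_token = rng.choice(candidates)
        if next_token == "<eos>":
            break  # the "model" decided the text is finished
        tokens.append(next_token)
    return " ".join(tokens)


story = generate("Once upon")
print(story)
```

Each pass through the loop appends one token and feeds the longer text back in, which is why a length limit (`max_new_tokens` here) is needed to stop the generation.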

Another generative model is T5 (Text to Text Transfer Transformer), which models many tasks as text generation tasks (Raffel et al. 2019).

In this notebook, we look at an older model, GPT2, which is smaller and publicly available.

### References

Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al. Language models are few-shot learners. arXiv preprint arXiv:2005.14165, 2020.

Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., and Liu, P. J. Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683, 2019.

We use the **simpletransformers** package to download the model. This may take a while. While the model loads, you can read the documentation at:

https://simpletransformers.ai



In [4]:
from simpletransformers.language_generation import (
    LanguageGenerationModel,
)

model = LanguageGenerationModel(
    "gpt2", "gpt2", use_cuda=False
)

Once you have successfully downloaded the model, it is cached on disk for further use. The next time you load the model, it will load faster from disk.

Because we loaded the model through the **simpletransformers** class LanguageGenerationModel, we can call its **generate** function directly. Without a prompt argument, it will *prompt* you for input; with a prompt argument, it will respond by completing the text.

In [5]:
model.generate()

Model prompt >>>  Hi how are you?


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


["Hi how are you? Well, you're good… you're kind of like that old school big brother. And there's"]
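As the output above shows, the returned list contains the prompt followed by the generated text. A small helper can pass the prompt as an argument instead of typing it interactively, and return only the new text. This is a minimal sketch: `continuation_only` is a hypothetical helper name, and it assumes that `generate` accepts the prompt as its first argument and returns a list of strings, as in the output above.

```python
def continuation_only(model, prompt):
    """Generate from `prompt` and return only the newly generated text."""
    # Assumption: generate(prompt) returns a list of strings, as shown above.
    outputs = model.generate(prompt)
    text = outputs[0]
    # The returned text repeats the prompt; drop it if present.
    if text.startswith(prompt):
        return text[len(prompt):]
    return text
```

With the model loaded earlier in this notebook, `continuation_only(model, "Hi how are you?")` would return just the continuation, which differs from run to run.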

## GPT2 for other languages than English

Building a GPT model from scratch is costly. You need not only a lot of data but also a lot of computing power to create such a model. An interesting alternative is to retrain only the vocabulary part of a model for a new language and to keep the hidden layers of the English model, which hold the contextual attention relations and the capability to predict the next token embeddings.

This is what was done by de Vries and Nissim (2021) from the University of Groningen for Dutch and Italian. You can read the paper for more details.
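The idea of retraining only the vocabulary part while keeping the hidden layers fixed can be illustrated with a toy training step. This is a sketch of the freezing idea only, not the procedure from the paper: the parameter names, values, and gradients below are all hypothetical.

```python
# Toy parameter store: names and values are hypothetical, not real GPT2 weights.
params = {
    "embeddings.token": [0.10, -0.20, 0.30],  # vocabulary-specific, retrained
    "layer0.attention": [0.50, 0.40],         # reused from the English model
    "layer0.mlp": [-0.70, 0.25],              # reused from the English model
}

# Pretend gradients from one training batch in the new language.
grads = {name: [1.0] * len(values) for name, values in params.items()}


def sgd_step(params, grads, trainable_prefix="embeddings.", lr=0.1):
    """Return updated parameters, changing only those under trainable_prefix."""
    updated = {}
    for name, values in params.items():
        if name.startswith(trainable_prefix):
            # Trainable: apply a plain gradient descent update.
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
        else:
            # Frozen: copied unchanged, the gradient is ignored.
            updated[name] = list(values)
    return updated


new_params = sgd_step(params, grads)
for name in params:
    print(name, "changed" if new_params[name] != params[name] else "frozen")
```

Only the embedding parameters move during training, so the cost of adapting to a new language is much lower than training every layer from scratch.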

### References

de Vries, Wietse, and Malvina Nissim. "As good as new. How to successfully recycle English GPT-2 to make models for other languages." arXiv preprint arXiv:2012.05628 (2020). https://aclanthology.org/2021.findings-acl.74.pdf

See also: https://github.com/wietsedv/gpt2-recycle


We can download these models from Hugging Face, as we did for the English GPT2, this time through the **pipeline** function of the **transformers** package, and generate a Dutch and an Italian short story from a prompt.

In [2]:
from transformers import pipeline


dutchGpt2pipe = pipeline("text-generation", model="GroNLP/gpt2-small-dutch")
print(dutchGpt2pipe('Was ik maar een klein'))


[{'generated_text': "Was ik maar een kind geweest?'\nIk keek haar in de ogen. 'Wat is er met je gebeurd? Ik heb mijn ouders verloren.'\nZe schudde heftig haar hoofd. 'Nee, we hadden het niet kunnen hebben gedaan,' zei ze terwijl ze naar achteren liep om te voorkomen dat zijn gezicht zou gaan bevriezen. 'Je hebt geen idee waar jij mee bezig was en wat hij ervan had weerhouden me zo snel mogelijk hiernaartoe te komen...'\n'We wilden allebei weten wie die vent achter ons aan"}]


In [3]:
print(dutchGpt2pipe('Was ik maar een klein'))

[{'generated_text': 'Was ik maar een klein kind.\'\nHij knikte. Hij had haar niet verteld wat ze moest doen om hem te beschermen, en er was geen bewijs dat hij ooit in de buurt van zijn huis zou blijven rondhangen. \'En dan is het nog steeds zo moeilijk als je me wilt vragen waarom jij dit allemaal hebt gedaan?\'\nHaar ogen vulden zich met tranen omdat ze niets kon zeggen over die vreselijke waarheid. De woorden waren al bijna twee weken uit hun hoofd geweest: "Je kunt mijn leven ruïneren'}]


In [5]:
italianGpt2pipe = pipeline("text-generation", model="GroNLP/gpt2-small-italian")

print(italianGpt2pipe('Un bambino piccolo'))

## End of notebook