# GPT-2 Fine-Tuning for the PPP project
This is a simplified script for fine-tuning GPT-2 using Hugging Face's Transformers library, PyTorch, and the "eli5" dataset from Hugging Face's Datasets library.

### Setup: installing the required libraries

In [16]:
pip install torch torchvision --user

Collecting torchvision
  Using cached torchvision-0.15.1-cp310-cp310-win_amd64.whl (1.2 MB)
Installing collected packages: torchvision
Successfully installed torchvision-0.15.1
Note: you may need to restart the kernel to use updated packages.


In [1]:
# The Transformers library provides a wide range of pre-trained models, tokenizers, and utilities for NLP tasks such as text classification, question-answering, and language generation.
!pip install transformers



In [2]:
# The Datasets library provides access to a large collection of public datasets for NLP tasks.
!pip install datasets

Collecting datasets
  Downloading datasets-2.11.0-py3-none-any.whl (468 kB)
     ------------------------------------ 468.7/468.7 kB 863.4 kB/s eta 0:00:00
Collecting multiprocess
  Downloading multiprocess-0.70.14-py310-none-any.whl (134 kB)
     ------------------------------------ 134.3/134.3 kB 880.0 kB/s eta 0:00:00
Collecting responses<0.19
  Downloading responses-0.18.0-py3-none-any.whl (38 kB)
Collecting aiohttp
  Downloading aiohttp-3.8.4-cp310-cp310-win_amd64.whl (319 kB)
     ------------------------------------ 319.8/319.8 kB 793.9 kB/s eta 0:00:00
Collecting huggingface-hub<1.0.0,>=0.11.0
  Downloading huggingface_hub-0.13.4-py3-none-any.whl (200 kB)
     ------------------------------------ 200.1/200.1 kB 871.1 kB/s eta 0:00:00
Collecting pyarrow>=8.0.0
  Downloading pyarrow-11.0.0-cp310-cp310-win_amd64.whl (20.6 MB)
     -------------------------------------- 20.6/20.6 MB 861.4 kB/s eta 0:00:00
Collecting xxhash
  Downloading xxhash-3.2.0-cp310-cp310-win_amd64.whl (30 kB

In [2]:
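# Import the model, tokenizer, configuration, and scheduler from the Transformers library.
from transformers import GPT2LMHeadModel, GPT2Tokenizer, GPT2Config
from transformers import get_linear_schedule_with_warmup
# The AdamW class in transformers is deprecated; use the PyTorch implementation instead.
from torch.optim import AdamW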

GPT2LMHeadModel: the GPT-2 model with a language-modeling head, used for text generation and completion.
GPT2Tokenizer: the tokenizer for the GPT-2 model.
GPT2Config: the configuration class for the GPT-2 model.
AdamW: an optimizer (Adam with decoupled weight decay) for training neural networks.
get_linear_schedule_with_warmup: a function that creates a learning rate schedule that increases linearly during warmup and then decreases linearly to zero.
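
To make the warmup schedule concrete, here is a minimal pure-Python sketch of the multiplier such a schedule applies to the base learning rate. The helper name `linear_warmup_factor`, the step counts, and the `base_lr` value are all assumptions chosen for illustration; this shows the idea behind the schedule and is not the library's own code.

```python
def linear_warmup_factor(step, num_warmup_steps, num_training_steps):
    """Multiplier for the base learning rate at a given step (hypothetical helper)."""
    if step < num_warmup_steps:
        # Warmup phase: ramp linearly from 0 up to 1.
        return step / max(1, num_warmup_steps)
    # Decay phase: ramp linearly from 1 down to 0 at the final step.
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


base_lr = 5e-4  # assumed value, for illustration only
# Learning rate at each of 10 training steps, with 2 warmup steps.
schedule = [base_lr * linear_warmup_factor(s, 2, 10) for s in range(11)]
for step, lr in enumerate(schedule):
    print(step, lr)
```

The learning rate peaks at `base_lr` when warmup ends and reaches zero at the last training step.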

In [13]:
# Load the GPT-2 tokenizer and register the special tokens.
tokenizer = GPT2Tokenizer.from_pretrained('gpt2', bos_token='<|startoftext|>', eos_token='<|endoftext|>', pad_token='<|pad|>')
# Load the default configuration explicitly. This is optional: from_pretrained loads it automatically.
configuration = GPT2Config.from_pretrained('gpt2', output_hidden_states=False)
# Instantiate the pre-trained model.
model = GPT2LMHeadModel.from_pretrained('gpt2', config=configuration)

# Resize the model's token embedding matrix (one embedding vector per token ID) to match the size of the tokenizer's vocabulary, which now includes the new special tokens.
model.resize_token_embeddings(len(tokenizer))

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Embedding(50259, 768)
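
The first number in the output above can be checked by hand. The stock GPT-2 vocabulary has 50257 entries and already contains `<|endoftext|>`, so only two of the three special tokens passed to the tokenizer are new. The sketch below just reproduces that arithmetic; the variable names are made up for illustration.

```python
GPT2_BASE_VOCAB_SIZE = 50257  # size of the stock GPT-2 vocabulary

# Special tokens already present in the stock vocabulary.
existing_special_tokens = {"<|endoftext|>"}

# Special tokens requested when loading the tokenizer.
requested_tokens = {
    "bos_token": "<|startoftext|>",
    "eos_token": "<|endoftext|>",
    "pad_token": "<|pad|>",
}

# Only tokens missing from the vocabulary add new rows to the embedding matrix.
new_tokens = [t for t in requested_tokens.values() if t not in existing_special_tokens]
new_vocab_size = GPT2_BASE_VOCAB_SIZE + len(new_tokens)
print(new_tokens, new_vocab_size)
```

The second number, 768, is the hidden size of the base GPT-2 model, so each token ID maps to a 768-dimensional vector.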

In [22]:
import torch
# Tell PyTorch to run the model on the GPU for faster computation. This requires a CUDA-enabled PyTorch build and a GPU runtime (change the runtime type if needed).
device = torch.device("cuda")
model.cuda()

AssertionError: Torch not compiled with CUDA enabled

In [10]:
model.device

device(type='cpu')

In [21]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
try:
    model = model.to(device)
except RuntimeError as e:
    print(f"Failed to move the model to device {device}: {e}")