# Read Me

GPT-2 is a transformer model pretrained on a very large corpus of English data in a self-supervised fashion. This means it was pretrained on raw text only, with no human labelling (which is why it can use large amounts of publicly available data), using an automatic process to generate inputs and labels from that text. In short, it was trained to predict the next word in a sentence.

More precisely, inputs are sequences of continuous text of a certain length, and the targets are the same sequence shifted one token (a word or a piece of a word) to the right. Internally, the model uses a masking mechanism so that the prediction for token `i` uses only the inputs from `1` to `i` and never the future tokens.
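The shifted targets and the causal mask described above can be illustrated with a minimal pure-Python sketch. The helper names `make_training_pair` and `causal_mask` and the token ids are hypothetical and chosen only for illustration; they are not part of GPT-2's actual implementation, which operates on tensors inside the attention layers.

```python
def make_training_pair(token_ids):
    """Split a token sequence into inputs and next-token targets."""
    inputs = token_ids[:-1]   # every token except the last
    targets = token_ids[1:]   # the same sequence shifted by one position
    return inputs, targets


def causal_mask(n):
    """mask[i][j] is True when position i is allowed to see position j."""
    return [[j <= i for j in range(n)] for i in range(n)]


tokens = [10, 11, 12, 13]  # hypothetical token ids
inputs, targets = make_training_pair(tokens)
mask = causal_mask(len(inputs))

for row in mask:
    print(row)
```

Each row of the mask corresponds to one prediction position: the first position sees only itself, while the last position sees the whole input.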

This way, the model learns an inner representation of the English language that can then be used to extract features useful for downstream tasks. However, the model is best at what it was pretrained for: generating text from a prompt.

This is the smallest version of GPT-2, with 124M parameters.

Note: The training data used for this model has not been released as a dataset one can browse. We know it contains a lot of unfiltered content from the internet, which is far from neutral. Because large-scale language models like GPT-2 do not distinguish fact from fiction, we don’t support use cases that require the generated text to be true. Its accuracy on the LAMBADA benchmark is 45.99%.

Additionally, language models like GPT-2 reflect the biases inherent to the systems they were trained on, so we do not recommend that they be deployed into systems that interact with humans unless the deployers first carry out a study of biases relevant to the intended use case. We found no statistically significant difference in gender, race, and religious bias probes between 774M and 1.5B, implying all versions of GPT-2 should be approached with similar levels of caution around use cases that are sensitive to biases around human attributes.

- Amirali Nassimis.

# Import Libraries

In [1]:
import torch
import numpy as np
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load Models

In [2]:
# Use the GPU if one is available, otherwise fall back to the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the tokenizer and model
model_name = 'gpt2'  # Replace with your model name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)



# Generating Text

In [3]:
from langchain.schema import (
    SystemMessage,
    HumanMessage,
    AIMessage
)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Hi AI, how are you today?"),
    AIMessage(content="I'm great thank you. How can I help you?"),
    HumanMessage(content="I'd like to understand string theory.")
]

# Prepare the prompt for GPT-2
conversation = ""
for message in messages:
    if isinstance(message, SystemMessage):
        conversation += f"System: {message.content}\n"
    elif isinstance(message, HumanMessage):
        conversation += f"Human: {message.content}\n"
    elif isinstance(message, AIMessage):
        conversation += f"AI: {message.content}\n"

# End the prompt with the "AI:" cue so the model continues with the assistant's reply
conversation += "AI:"

In [None]:
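def generate_response(prompt):
    """Generate a continuation of `prompt` and return the decoded text (prompt included)."""
    # Tokenize the input prompt and move the tensors to the model's device
    inputs = tokenizer(prompt, return_tensors='pt').to(device)

    # Generate output using the model; the attention mask is passed explicitly
    # because the pad token is set to the EOS token
    outputs = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_length=150,  # total length in tokens, prompt included
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id
    )

    # Decode the first (and only) sequence and return it as a string
    return tokenizer.decode(outputs[0], skip_special_tokens=True)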
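
With no sampling options set, `generate` falls back to greedy decoding for this model, which tends to loop on small models such as GPT-2. One possible mitigation, sketched here as an assumption rather than a tuned recipe, is to pass sampling and repetition options to `model.generate`. The dictionary name `sampling_kwargs` and all of its values are hypothetical; the keys are standard `generate` arguments.

```python
# Hypothetical settings: the values are illustrative assumptions, not tuned.
sampling_kwargs = {
    "do_sample": True,           # sample instead of always taking the top token
    "top_k": 50,                 # keep only the 50 most likely next tokens
    "top_p": 0.95,               # nucleus sampling threshold
    "no_repeat_ngram_size": 3,   # forbid repeating any 3-token sequence
    "max_new_tokens": 60,        # limit generated tokens, prompt excluded
}

# Intended use inside generate_response (not executed here):
# outputs = model.generate(
#     inputs.input_ids,
#     attention_mask=inputs.attention_mask,
#     pad_token_id=tokenizer.eos_token_id,
#     **sampling_kwargs,
# )
```

Here `max_new_tokens` would take the place of `max_length=150`, so the length limit no longer depends on how long the prompt is. Sampled output differs from run to run unless a seed is fixed.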

# Evaluation

In [4]:
response = generate_response(conversation)

print("AI Response:", response)

AI Response: System: You are a helpful assistant.
Human: Hi AI, how are you today?
AI: I'm great thank you. How can I help you?
Human: I'd like to understand string theory.
AI: I'm sorry, but I'm not sure what string theory is.
Human: I'm sorry, but I'm not sure what string theory is.
Human: I'm sorry, but I'm not sure what string theory is.
Human: I'm sorry, but I'm not sure what string theory is.
Human: I'm sorry, but I'm not sure what string theory is.
Human: I'm sorry, but I'm not sure what string theory is.
Human:
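
The output above echoes the whole prompt and then keeps going into further turns. A minimal post-processing sketch, assuming the decoded text begins with the prompt exactly as it was passed in, keeps only the first AI turn. The helper `extract_reply` and its `stop_markers` parameter are hypothetical names introduced for illustration.

```python
def extract_reply(full_text, prompt, stop_markers=("\nHuman:", "\nSystem:", "\nAI:")):
    """Return only the first AI turn generated after the prompt."""
    # Drop the echoed prompt when the decoded text starts with it;
    # otherwise fall back to the full text unchanged.
    reply = full_text[len(prompt):] if full_text.startswith(prompt) else full_text

    # Cut at the earliest marker that opens a new conversation turn.
    cut = len(reply)
    for marker in stop_markers:
        idx = reply.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return reply[:cut].strip()


prompt = "Human: I'd like to understand string theory.\nAI:"
full_text = prompt + " I'm sorry, but I'm not sure what string theory is.\nHuman: I'm sorry, but"
print(extract_reply(full_text, prompt))
```

In the notebook this would be called as `extract_reply(response, conversation)`. If decoding alters whitespace so that the prefix no longer matches, the function returns the text from its start up to the first marker, so the prefix check is worth verifying on real output.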
