## Install Required Libraries

In [1]:
!pip install transformers torch

Defaulting to user installation because normal site-packages is not writeable



## Download and Save the GPT-2 Model

In [2]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer



In [3]:
# Specify the model name
model_name = "gpt2"

In [4]:
# Load the pre-trained GPT-2 model and tokenizer from Hugging Face
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)



In [5]:
# Save the model and tokenizer locally
model_save_path = "./gpt2/"
tokenizer.save_pretrained(model_save_path)
model.save_pretrained(model_save_path)

## Load the Model Locally and Run

In [6]:
# Load the saved model and tokenizer from the local path
local_tokenizer = GPT2Tokenizer.from_pretrained(model_save_path)
local_model = GPT2LMHeadModel.from_pretrained(model_save_path)

In [9]:
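def generate_text(prompt, max_length=50):
    """Generate a continuation of `prompt` with the locally saved GPT-2 model."""
    # Tokenize the prompt into input IDs plus an attention mask (PyTorch tensors)
    inputs = local_tokenizer(prompt, return_tensors="pt")

    # GPT-2 has no pad token, so reuse the EOS token ID to suppress the warning
    outputs = local_model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,  # Total length in tokens, prompt included
        num_return_sequences=1,
        pad_token_id=local_tokenizer.eos_token_id,
    )

    # Decode the single returned sequence back into text
    return local_tokenizer.decode(outputs[0], skip_special_tokens=True)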


In [10]:
# Test the model
prompt = "Once upon a time"
generated_text = generate_text(prompt)
print(generated_text)

Once upon a time, the world was a place of great beauty and great danger. The world was a place of great danger, and the world was a place of great danger. The world was a place of great danger, and the world was a


In [11]:
prompt = "What is the capital of France?"
generated_text = generate_text(prompt)
print(generated_text)

What is the capital of France?

The capital of France is Paris.

The capital of France is Paris.

The capital of France is Paris.

The capital of France is Paris.

The capital of France is


In [13]:
prompt = "Can you meet me tomorrow ? Yes I can meet"
generated_text = generate_text(prompt)
print(generated_text)

Can you meet me tomorrow? Yes I can meet you tomorrow. I am going to meet you tomorrow. I am going to meet you tomorrow. I am going to meet you tomorrow. I am going to meet you tomorrow. I am going to meet
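The output above begins with `tomorrow?` although the prompt was typed as `tomorrow ?`, most likely because decoding tidies the space before punctuation. Any later step that strips the prompt with an exact string match would therefore miss it. A minimal sketch of a whitespace-tolerant alternative follows; `strip_prompt` is a hypothetical helper name, not part of the notebook or of `transformers`.

```python
def strip_prompt(response: str, prompt: str) -> str:
    """Remove a leading copy of `prompt` from `response`, ignoring whitespace.

    Hypothetical helper for illustration: it compares only the
    non-whitespace characters, so "tomorrow ?" matches "tomorrow?".
    """
    target = "".join(prompt.split())  # Prompt with all whitespace removed
    matched = 0
    for index, char in enumerate(response):
        if matched == len(target):
            # Whole prompt consumed: return what follows it
            return response[index:].strip()
        if char.isspace():
            continue  # Whitespace differences are ignored
        if char != target[matched]:
            return response.strip()  # Not a prefix: leave the text unchanged
        matched += 1
    # Reached the end of the response while still matching
    return "" if matched == len(target) else response.strip()


decoded = "Can you meet me tomorrow? Yes I can meet you tomorrow."
print(strip_prompt(decoded, "Can you meet me tomorrow ? Yes I can meet"))
# prints: you tomorrow.
```

Another option, assuming the token tensors are still available, is to slice off the prompt tokens before decoding, which avoids string matching altogether.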


In [14]:
generate_text("Aslam o Alikm !")

'Aslam o Alikm!\n\nThe Prophet (peace and blessings of Allaah be upon him) said, "O you who believe! Do not believe in the Messenger of Allaah (peace and blessings of Allaah be upon him) and do not'

In [15]:
generate_text("Would you play CoD ?")

'Would you play CoD?\n\nI would play CoD. I would play CoD. I would play CoD. I would play CoD. I would play CoD. I would play CoD. I would play CoD.'
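The looping outputs above are typical of greedy decoding, which is what `generate` does with these arguments: it always picks the most likely next token, so the model can fall into a cycle. The sketch below collects `generate` keyword arguments that usually reduce this. `build_generation_kwargs` is a hypothetical helper, and the specific values (3, 1.2, 50, 0.95, 0.8) are illustrative assumptions rather than tuned settings.

```python
def build_generation_kwargs(max_length=50, pad_token_id=None, sample=False):
    """Collect `generate` keyword arguments that discourage repeated phrases.

    Hypothetical helper; the numeric values are illustrative assumptions.
    """
    kwargs = {
        "max_length": max_length,
        "num_return_sequences": 1,
        "no_repeat_ngram_size": 3,  # Never emit the same 3-token sequence twice
        "repetition_penalty": 1.2,  # Down-weight tokens that already appeared
    }
    if pad_token_id is not None:
        kwargs["pad_token_id"] = pad_token_id
    if sample:
        # Sample from the distribution instead of always taking the top token
        kwargs.update(do_sample=True, top_k=50, top_p=0.95, temperature=0.8)
    return kwargs


# Intended use inside generate_text (not executed here):
#   outputs = local_model.generate(
#       inputs,
#       **build_generation_kwargs(pad_token_id=local_tokenizer.eos_token_id),
#   )
print(build_generation_kwargs(sample=True))
```

With `sample=True` the output is no longer deterministic, so repeated runs of the same prompt will differ.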

In [17]:
import re

def question_answer(prompt):
    # Generate a response
    response = generate_text(prompt)

    # Remove the prompt from the response (only matches if the decoded text reproduces it exactly)
    cleaned_response = response.replace(prompt, "").strip()

    # Collapse runs of whitespace (spaces, newlines) into a single space
    cleaned_response = re.sub(r'\s+', ' ', cleaned_response)

    # Truncate the text at the first repeated 5-word chunk
    def remove_repeating_sequences(text):
        words = text.split()
        seen = set()
        result = []
        i = 0

        while i < len(words):
            seq = tuple(words[i:i+5])  # Next non-overlapping chunk of up to 5 words
            if seq in seen:
                break  # Stop at the first chunk that has already appeared
            seen.add(seq)
            result.extend(seq)
            i += len(seq)

        return ' '.join(result)

    return remove_repeating_sequences(cleaned_response)
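The helper above only compares chunks that start at multiples of five words, so a repeated phrase that begins at any other position slips through. A minimal sketch of a sliding-window variant is shown below; `trim_at_first_repeat` is a hypothetical name, and cutting at the first repeated window is one possible policy, not the notebook's method.

```python
def trim_at_first_repeat(text: str, n: int = 5) -> str:
    """Cut `text` just before the first n-word window that already occurred.

    Hypothetical helper: it slides the window one word at a time, so a
    repeat is detected no matter where it starts.
    """
    words = text.split()
    seen = set()
    for i in range(len(words) - n + 1):
        window = tuple(words[i:i + n])
        if window in seen:
            # Words from position i onward repeat earlier text
            return " ".join(words[:i])
        seen.add(window)
    return " ".join(words)  # No repeated window found


print(trim_at_first_repeat("I would play CoD. I would play CoD. I would play CoD."))
# prints: I would play CoD.
```

A smaller `n` trims more aggressively but risks cutting text that merely reuses a common phrase.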


In [18]:
question_answer("Would you play CoD?")

"I'm not sure. I'm not sure if I'd play it. I'm not sure if I'd play it. What's your favorite game of all time? I'm not sure"