# Problem Statement 2: Text Generation
Problem: Create a basic text generation model using a pre-trained transformer (e.g., GPT-3).

Requirements:
* Use the Hugging Face Transformers library.
* Generate coherent text based on a given prompt.

Evaluation Criteria:
* Ability to load and use pre-trained models.
* Quality and coherence of the generated text.
* Understanding and application of the transformer model.


# Installing Hugging Face Transformers and PyTorch

In [None]:
!pip install transformers
!pip install torch



## Loading the Model

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# "gpt2" is a public checkpoint, so no Hugging Face access token is needed.
# For gated models, read the token from an environment variable rather than
# hard-coding it in the notebook.

# Load the model and tokenizer
model_name = "gpt2"  # Small (124M-parameter) model, suitable for a demonstration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Move the model to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)




GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-11): 12 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2SdpaAttention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=768, out_features=50257, bias=False)
)
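
The printout above does not show the shapes of the `Conv1D` layers. As a rough sanity check, the sketch below estimates the parameter count from the printed dimensions. It assumes the standard GPT-2 small layout, which the printout does not confirm: the attention input projection maps 768 to 3 x 768, the MLP inner size is 4 x 768, every projection has a bias, and `lm_head` shares its weights with `wte` (so it adds no parameters).

```python
# Dimensions read from the model printout
vocab_size, n_positions, hidden, n_layers = 50257, 1024, 768, 12

# Assumed, not shown in the printout: standard GPT-2 projection sizes
qkv_dim = 3 * hidden   # query, key and value projections fused into c_attn
mlp_dim = 4 * hidden   # inner size of the feed-forward block


def projection_params(n_in, n_out):
    """Weights plus bias of one dense projection."""
    return n_in * n_out + n_out


layer_norm = 2 * hidden  # scale and shift vectors

block = (
    layer_norm                               # ln_1
    + projection_params(hidden, qkv_dim)     # attn.c_attn
    + projection_params(hidden, hidden)      # attn.c_proj
    + layer_norm                             # ln_2
    + projection_params(hidden, mlp_dim)     # mlp.c_fc
    + projection_params(mlp_dim, hidden)     # mlp.c_proj
)

embeddings = vocab_size * hidden + n_positions * hidden  # wte + wpe
total = embeddings + n_layers * block + layer_norm       # final term is ln_f

print(f"Parameters per block: {block:,}")
print(f"Estimated total:      {total:,}")
```

With the model loaded, the estimate can be compared against `sum(p.numel() for p in model.parameters())`.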

### Generating Text

In [None]:
#Generating Text
def generate_text(prompt, max_length=100, num_return_sequences=1):
    # Set the pad token to be the same as the eos token if not already set
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    # Tokenize the input prompt with padding and truncation
    inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True)

    # Move tensors to the correct device
    input_ids = inputs['input_ids'].to(device)
    attention_mask = inputs['attention_mask'].to(device)

    # Generate text using the model
    outputs = model.generate(
        input_ids,
        attention_mask=attention_mask,
        max_length=max_length,  # Total length in tokens, prompt included
        num_return_sequences=num_return_sequences,
        do_sample=True,  # Sample from the distribution instead of decoding greedily
        no_repeat_ngram_size=2,  # Never repeat a 2-gram within a sequence
        temperature=0.7,  # Values below 1 sharpen the distribution (less random output)
        top_k=50,  # Sample only from the 50 most likely tokens
        top_p=0.95  # Keep the smallest token set with cumulative probability >= 0.95
    )

    # Decode and return the generated text
    return [tokenizer.decode(output, skip_special_tokens=True) for output in outputs]

# Example prompt
prompt = "Once upon a time in a land far, far away"
generated_text = generate_text(prompt)
print(generated_text[0])


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Once upon a time in a land far, far away, the wind could blow all around you with just one strike. The first thing you would notice in this situation is that it seems to be the first time you've ever felt so powerless in your life. You feel so completely powerless to the world.

And you don't know what that feels like. It feels so much like being in the moment. I'm sitting in my chair, and I can't feel my body moving. My
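
To make the sampling parameters used in `generate_text` more concrete, here is a minimal pure-Python sketch of how temperature, top-k and top-p can narrow a next-token distribution before one token is drawn. It is a simplified illustration and not the library's implementation; `filter_distribution` and `sample_token` are hypothetical helper names, and the logits are made-up numbers.

```python
import math
import random


def filter_distribution(logits, temperature=0.7, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature: dividing by a value below 1 widens the gaps between logits
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring tokens
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]

    # Softmax over the surviving tokens (shifted by the maximum for stability)
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    weight_sum = sum(weights)
    probs = [w / weight_sum for w in weights]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p
    kept, cumulative = [], 0.0
    for index, prob in zip(ranked, probs):
        kept.append((index, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalise so the kept probabilities sum to 1
    kept_sum = sum(prob for _, prob in kept)
    return {index: prob / kept_sum for index, prob in kept}


def sample_token(logits, rng=random, **kwargs):
    """Draw one token index from the filtered distribution."""
    dist = filter_distribution(logits, **kwargs)
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]


toy_logits = [2.0, 1.0, 0.0, -1.0]  # made-up scores for a four-token vocabulary
demo = filter_distribution(toy_logits, temperature=1.0, top_k=3, top_p=0.9)
print(demo)
print(sample_token(toy_logits, temperature=0.7, top_k=3, top_p=0.95))
```

In the toy call, top-k drops the lowest-scoring token and top-p then drops the third, so only the two most likely tokens remain eligible for sampling.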


In [None]:
# Example prompt
prompt = "In a world where technology has evolved to control emotions, a scientist discovers a way to unlock hidden feelings. What happens next?"

# Generate text based on the new prompt
generated_text = generate_text(prompt)
print(generated_text[0])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In a world where technology has evolved to control emotions, a scientist discovers a way to unlock hidden feelings. What happens next?

The film, which is currently out in theaters, is set for release in 2016.
 (Vincent D'Onofrio, Screen Rant)


In [None]:
# Function to generate and print text for various prompts
def test_prompts(prompts, max_length=100, num_return_sequences=1):
    for prompt in prompts:
        generated_text = generate_text(prompt, max_length=max_length, num_return_sequences=num_return_sequences)
        print(f"Prompt: {prompt}")
        print(f"Generated Text: {generated_text[0]}")
        print("="*50)

# Example prompts
prompts = [
    "Describe a futuristic city where humans and robots coexist harmoniously.",
    "Imagine a conversation between Julius Caesar and Cleopatra discussing their strategies.",
    "Explain the concept of quantum entanglement in simple terms for a high school student."
]

test_prompts(prompts)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Prompt: Describe a futuristic city where humans and robots coexist harmoniously.
Generated Text: Describe a futuristic city where humans and robots coexist harmoniously.

A futuristic City of Death: The final film in the series, the film chronicles the history of the human race. Directed by Michael Arndt (Alien), the movie is based on the book, The Last of Us. The film's director, Peter Berg (The Martian), will direct.


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Prompt: Imagine a conversation between Julius Caesar and Cleopatra discussing their strategies.
Generated Text: Imagine a conversation between Julius Caesar and Cleopatra discussing their strategies. He has a long list of things that they want to do and are planning to accomplish and they are doing them to their advantage.

And so, as they talk, Julius and his friends, in their way, have a much better idea of what they're talking about. They say, "We want a plan." And they come up with it. And then they all get along and try to come to grips with some
Prompt: Explain the concept of quantum entanglement in simple terms for a high school student.
Generated Text: Explain the concept of quantum entanglement in simple terms for a high school student.

A very simple model that uses quantum teleportation as an example of how quantum information can be transmitted between two objects is called a quantum state machine. This model was developed by scientists at the University of Chicago in Chica

# Evaluating the Model
# 1. BLEU Score

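BLEU (Bilingual Evaluation Understudy) scores generated text by its n-gram overlap with one or more reference texts. It combines modified n-gram precisions (up to 4-grams with NLTK's default weights) with a brevity penalty. It was designed for machine translation and is also applied to other generation tasks for which reference texts exist. The cell below uses hand-written toy sentences to demonstrate the API; it does not score the GPT-2 output above.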




In [None]:
pip install nltk



In [None]:
from nltk.translate.bleu_score import corpus_bleu

# Example reference and candidate texts
references = [
    [['this', 'is', 'a', 'test'], ['this', 'is', 'test']],
    [['another', 'test']]
]
candidates = [
    ['this', 'is', 'a', 'test'],
    ['another', 'test']
]

# Compute BLEU score
bleu_score = corpus_bleu(references, candidates)
print(f"BLEU Score: {bleu_score:.4f}")

BLEU Score: 0.7598
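
Both candidates are identical to one of their references, yet the score is below 1.0. The sketch below recomputes corpus-level BLEU by hand to show where the loss comes from. It is a simplified re-implementation, not NLTK's code, and it assumes that a candidate too short to contain any n-gram of a given order still adds 1 to that order's denominator; this assumption reproduces the reported value. The helper names are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_matches(candidate, refs, n):
    """Return (matched, total) n-gram counts for one candidate."""
    cand = ngram_counts(candidate, n)
    best_ref = Counter()
    for ref in refs:
        best_ref |= ngram_counts(ref, n)  # element-wise maximum over references
    matched = sum((cand & best_ref).values())  # clip by the reference counts
    total = max(1, sum(cand.values()))  # assumption: denominator is at least 1
    return matched, total


def bleu_sketch(list_of_refs, candidates, max_n=4):
    """Corpus-level BLEU with uniform weights; returns (score, precisions)."""
    precisions = []
    for n in range(1, max_n + 1):
        pairs = [clipped_matches(c, refs, n) for refs, c in zip(list_of_refs, candidates)]
        precisions.append(sum(m for m, _ in pairs) / sum(t for _, t in pairs))

    # Brevity penalty uses the reference length closest to each candidate
    cand_len = sum(len(c) for c in candidates)
    ref_len = sum(
        min((len(r) for r in refs), key=lambda length: (abs(length - len(c)), length))
        for refs, c in zip(list_of_refs, candidates)
    )
    brevity = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)

    if min(precisions) == 0:
        return 0.0, precisions
    score = brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)
    return score, precisions


toy_references = [
    [['this', 'is', 'a', 'test'], ['this', 'is', 'test']],
    [['another', 'test']],
]
toy_candidates = [
    ['this', 'is', 'a', 'test'],
    ['another', 'test'],
]

score, precisions = bleu_sketch(toy_references, toy_candidates)
print(f"Precisions for n = 1..4: {precisions}")
print(f"BLEU: {score:.4f}")
```

The unigram and bigram precisions are perfect, but the two-token candidate cannot match any 3-gram or 4-gram, which pulls those precisions down to 2/3 and 1/2 and the geometric mean to about 0.76.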


# 2. ROUGE Score
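ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is used mainly for evaluating automatic summarization and is also applied to machine translation. ROUGE-1 and ROUGE-2 measure unigram and bigram overlap between the generated text and a reference, and ROUGE-L is based on their longest common subsequence. Each variant reports precision, recall, and F-measure.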

In [None]:
pip install rouge-score



In [None]:
from rouge_score import rouge_scorer

# Initialize the ROUGE scorer
scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)

# Example reference and candidate texts
reference = "The quick brown fox jumps over the lazy dog"
candidate = "The fast brown fox jumps over the lazy dog"

# Compute ROUGE scores
scores = scorer.score(reference, candidate)
print(f"ROUGE Scores: {scores}")

ROUGE Scores: {'rouge1': Score(precision=0.8888888888888888, recall=0.8888888888888888, fmeasure=0.8888888888888888), 'rouge2': Score(precision=0.75, recall=0.75, fmeasure=0.75), 'rougeL': Score(precision=0.8888888888888888, recall=0.8888888888888888, fmeasure=0.8888888888888888)}
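
The reported values can be reproduced by hand. The sketch below is a simplified version of ROUGE-N and ROUGE-L that lowercases and splits on whitespace; it skips the stemming that `rouge_score` applies with `use_stemmer=True`, which makes no difference for this sentence pair. The function names are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (no stemming, unlike rouge_score)."""
    return text.lower().split()


def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f_measure(overlap, cand_total, ref_total):
    """Return (precision, recall, F1) from overlap and the two totals."""
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def rouge_n(reference, candidate, n):
    ref = ngram_counts(tokenize(reference), n)
    cand = ngram_counts(tokenize(candidate), n)
    overlap = sum((ref & cand).values())  # n-grams shared by both texts
    return f_measure(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[-1][-1]


def rouge_l(reference, candidate):
    ref, cand = tokenize(reference), tokenize(candidate)
    return f_measure(lcs_length(ref, cand), len(cand), len(ref))


ref_text = "The quick brown fox jumps over the lazy dog"
cand_text = "The fast brown fox jumps over the lazy dog"

for name, result in [
    ("rouge1", rouge_n(ref_text, cand_text, 1)),
    ("rouge2", rouge_n(ref_text, cand_text, 2)),
    ("rougeL", rouge_l(ref_text, cand_text)),
]:
    print(name, [round(value, 4) for value in result])
```

Eight of the nine unigrams match, six of the eight bigrams match, and the longest common subsequence has eight words, which gives 8/9, 6/8 and 8/9 respectively.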


## Summary
The ROUGE-1 and ROUGE-L F-measures of 0.89 reflect that the candidate sentence differs from the reference by a single word ("fast" instead of "quick"): eight of the nine unigrams match, and the longest common subsequence has eight words.

ROUGE-2 is lower (0.75) because the one substituted word breaks two of the eight bigrams. These values describe the two hand-written example sentences, not text produced by the model.

# 3. Perplexity
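Perplexity measures how well a probability model predicts a sample. For a language model, it is the exponential of the average negative log-likelihood per token that the model assigns to a text, so a lower value means the model found the text more predictable.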

In [None]:
import torch

def compute_perplexity(text, tokenizer, model):
    # Tokenize and move the input tensors to the device the model is on
    inputs = tokenizer(text, return_tensors='pt').to(model.device)

    # Passing the input IDs as labels makes the model return the mean
    # cross-entropy loss over the predicted tokens
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs['input_ids'])

    # Perplexity is the exponential of the mean cross-entropy loss
    loss = outputs.loss
    perplexity = torch.exp(loss)

    return perplexity.item()

# Example text
text = "Once upon a time in a land far, far away"
perplexity = compute_perplexity(text, tokenizer, model)
print(f"Perplexity: {perplexity:.2f}")

Perplexity: 21.93
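
The relationship between per-token probabilities and perplexity can be illustrated without a model. The sketch below uses made-up log-probabilities and a hypothetical helper, `perplexity_from_log_probs`; it also converts the reported perplexity back into the mean loss it corresponds to.

```python
import math


def perplexity_from_log_probs(token_log_probs):
    """Exponential of the mean negative log-likelihood (natural log)."""
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that always spreads its probability evenly over 4 choices
# has a perplexity of (approximately, up to floating point) 4.
uniform_log_probs = [math.log(1 / 4)] * 5
print(perplexity_from_log_probs(uniform_log_probs))

# Going the other way: a perplexity of 21.93 corresponds to a mean
# cross-entropy loss of ln(21.93) nats per token.
mean_loss = math.log(21.93)
print(f"Mean loss for perplexity 21.93: {mean_loss:.2f} nats")
```

Read this way, a perplexity of 21.93 means the model was, on average, about as uncertain as if it were choosing uniformly among roughly 22 tokens at each step.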


# Evaluation Summary
## BLEU Score:

Score: 0.7598

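Interpretation: This value comes from the toy example, in which both candidates are identical to one of their references. It falls below 1.0 because the second candidate has only two tokens and therefore contributes no 3-gram or 4-gram matches to the corpus-level precisions. It says nothing about the quality of the GPT-2 output, which was not scored against any reference.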

## ROUGE Scores:

ROUGE-1:
Precision: 0.89,
Recall: 0.89,
F1 Score: 0.89

ROUGE-2:
Precision: 0.75,
Recall: 0.75,
F1 Score: 0.75

ROUGE-L:
Precision: 0.89,
Recall: 0.89,
F1 Score: 0.89

Interpretation: The example candidate differs from the reference by one word, so ROUGE-1 and ROUGE-L are high, while ROUGE-2 is somewhat lower because a single substitution breaks two bigrams. As with BLEU, these scores were computed on hand-written sentences rather than on generated text.

## Perplexity:

Score: 21.93

Interpretation: Perplexity is the exponential of the model's average per-token loss, so lower values mean the text was easier for the model to predict.

The perplexity of 21.93 was measured on a single short prompt, so it shows how predictable that one sentence is to GPT-2 rather than giving a general measure of model quality. A more representative estimate would average over a larger held-out text.


## Overall Assessment
* The BLEU and ROUGE cells demonstrate how to compute the metrics on toy data. They do not measure the quality of the generated text, because no reference texts exist for the open-ended prompts used here.
* The perplexity of 21.93 is specific to one short prompt. Evaluating on a larger held-out corpus would give a more reliable figure.
* Judged by reading the samples, GPT-2 continues each prompt fluently but often drifts off topic (for example, into film-release announcements), which is expected for a small model that has not been instruction-tuned.


---

