# 🧠 Assignment: GPT-2 Blog Generation + GPT-3.5 Critique
This notebook generates blog post ideas with GPT-2, then asks GPT-3.5 Turbo (through the OpenAI API) to critique them.

In [6]:
# Install required libraries (transformers for GPT-2, openai for the critique step)
!pip install transformers openai



In [7]:
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

In [8]:
# Load GPT-2 tokenizer and model
model_name = "gpt2"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-11): 12 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D(nf=2304, nx=768)
          (c_proj): Conv1D(nf=768, nx=768)
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D(nf=3072, nx=768)
          (c_proj): Conv1D(nf=768, nx=3072)
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=768, out_features=50257, bias=False)
)

In [9]:
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.eos_token_id

In [10]:
def generate_blog(prompt, max_length=50):
    """Sample one GPT-2 continuation of `prompt`; max_length counts the prompt tokens too."""
    # Calling the tokenizer directly returns both input_ids and attention_mask
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    with torch.no_grad():  # inference only, so no gradients are needed
        outputs = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_length=max_length,
            num_return_sequences=1,
            do_sample=True,  # sample instead of greedy decoding, so results vary per run
            pad_token_id=tokenizer.pad_token_id,
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
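
With `do_sample=True`, `generate` draws each next token at random from the model's predicted distribution (possibly after filters such as top-k) instead of always taking the most likely token, so repeated runs give different text. A minimal, library-free sketch of that idea on made-up scores follows; `softmax`, `sample_token`, and the toy vocabulary are illustrative assumptions, not part of `transformers`.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(vocab, logits, temperature=1.0, rng=random):
    """Draw one token at random, weighted by its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Toy example: hypothetical next-token scores, not real GPT-2 logits
vocab = ["healthcare", "robots", "the", "banana"]
logits = [2.0, 1.5, 1.0, -1.0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
greedy = vocab[logits.index(max(logits))]  # the choice greedy decoding would make
sampled = [sample_token(vocab, logits, rng=rng) for _ in range(5)]
print("greedy:", greedy)
print("sampled:", sampled)
```

Greedy decoding always returns the top-scoring token, while sampling can return any token with nonzero probability; that trade-off between diversity and coherence is one plausible reason the GPT-2 outputs drift off topic.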

In [11]:
# Define prompts
prompts = [
    "Top 5 AI trends in 2024 are",
    "The future of agriculture and AI involves",
    "How AI will transform healthcare in the next decade",
    "Why GPT models are revolutionizing content creation",
    "Top AI startup ideas to watch in 2025"
]

gpt2_outputs = []
for prompt in prompts:
    output = generate_blog(prompt)
    gpt2_outputs.append(output)
    print(f"Prompt: {prompt}\nGenerated: {output}\n")

Prompt: Top 5 AI trends in 2024 are
Generated: Top 5 AI trends in 2024 are:

Nominated: 2nd. This is the highest in the country. At 3rd place I think, a real race.

I think is 2nd lowest (it's not even close)!

Prompt: The future of agriculture and AI involves
Generated: The future of agriculture and AI involves bringing about a fundamental change in the course of history that does not merely have to occur, but only affects a portion of the country. The end goal, I suggest, is to give the government the full power of

Prompt: How AI will transform healthcare in the next decade
Generated: How AI will transform healthcare in the next decade

AI's potential in the 21st Century

A big problem

AI's potential to change healthcare by 2025

A huge change in medical care

AI's potential in the 21

Prompt: Why GPT models are revolutionizing content creation
Generated: Why GPT models are revolutionizing content creation as if its an entirely new way to be created in America.

They could be. And 

In [1]:
import os
import openai

In [4]:
# 🔐 Set your OpenAI API key
from google.colab import userdata
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')
openai.api_key = os.getenv("OPENAI_API_KEY")
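
If the key ends up missing or empty, for example when the notebook runs outside Colab and the value comes straight from the environment, the problem may only surface later as an authentication error. A small guard can fail early with a clearer message. The helper `require_env` below is hypothetical, not part of `openai` or `google.colab`, and the key shown is a placeholder.

```python
import os

def require_env(name, env=None):
    """Return the value of environment variable `name`, or raise a clear error."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it as a secret or environment variable first."
        )
    return value

# Demonstration with a stand-in mapping, so no real key is needed
fake_env = {"OPENAI_API_KEY": "sk-example-not-a-real-key"}
print(require_env("OPENAI_API_KEY", env=fake_env)[:10])
```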

In [12]:
# Format the GPT-2 outputs for GPT-3.5 critique
input_text = "\n".join(f"{i}. {text}" for i, text in enumerate(gpt2_outputs, start=1))

critique_prompt = f"""
You are an expert in AI writing analysis. Below are 5 outputs generated by GPT-2 for blog post ideas:

{input_text}

Your task is to write a professional and insightful markdown-formatted critique under the title:

## 🔍 Limitations Analysis: Critique of GPT-2 Blog Post Outputs

Discuss:
1. Relevance of each response to the prompt.
2. Coherence and fluency of ideas.
3. Repetition or generic content.
4. Any bias or shallowness observed.
5. Actionable suggestions to improve the outputs using better prompts, fine-tuning, or model alternatives.

Keep the markdown clean, structured with bullet points or subheadings where necessary, and usable directly in a Jupyter Notebook markdown cell.
"""

In [14]:
# Call GPT-3.5 Turbo to critique the GPT-2 outputs
client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": critique_prompt}],
    temperature=0.7
)

# Print critique in markdown format
print("\n---\n")
print(response.choices[0].message.content)


---

## 🔍 Limitations Analysis: Critique of GPT-2 Blog Post Outputs

### 1. Relevance of each response to the prompt:
- The responses show some relevance to the given prompts, but they lack depth and specificity. For example, the AI trends in 2024 output does not provide a comprehensive list of trends, and the healthcare transformation by AI lacks specific insights.

### 2. Coherence and fluency of ideas:
- The coherence and fluency of ideas in the outputs are lacking. The responses seem to jump between different topics and fail to provide a clear narrative or logical flow.

### 3. Repetition or generic content:
- There is a noticeable amount of repetition and generic content in the outputs. The use of vague language and lack of specific details make the responses seem like they have been generated without much context.

### 4. Any bias or shallowness observed:
- The outputs exhibit shallowness in their analysis, with surface-level insights and lack of in-depth exploration of the topi