# 📝 Text Generation Project — GPT-2 & LSTM

This notebook demonstrates text generation using:
- **GPT-2 (transformers)** for modern transformer-based generation
- **A simple LSTM** (Keras) for educational sequence modeling

---

In [None]:
# 🚀 Environment setup (auto-fix dependencies)
# This cell makes sure you have the correct versions of packages.
# Run this once at the start of your notebook.

!pip install --quiet --upgrade pip

# Core dependencies
!pip install --quiet torch transformers accelerate

# Web UI / Playground
!pip install --quiet gradio

# Fix compatibility issues
!pip install --quiet "markupsafe==2.0.1" "typing-extensions>=4.8.0" "urllib3>=2.0" "packaging>=24.0" "fsspec>=2023.5.0"


  Preparing metadata (setup.py) ... done
  DEPRECATION: Building 'markupsafe' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'markupsafe'. Discussion can be found at https://github.com/pypa/pip/issues/6334
  Building wheel for markupsafe (setup.py) ... done
ERROR: pip's dependency resolver doe

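Because the install log above ends in a resolver error, it can help to confirm what actually got installed before moving on. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package list simply repeats names from the setup cell.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names taken from the setup cell; adjust the list as needed.
report = installed_versions(["torch", "transformers", "accelerate", "gradio", "markupsafe"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Any package reported as `not installed` points to an install step that failed and is worth re-running before the later cells.
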
## 🚀 GPT-2 Demo
Load a pretrained GPT-2 and generate text from prompts.

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Device setup
device = 0 if torch.cuda.is_available() else -1
print('CUDA available:', torch.cuda.is_available())

# Load GPT-2 model and tokenizer
MODEL_NAME = 'gpt2'
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Pipeline for generation
gen = pipeline('text-generation', model=model, tokenizer=tokenizer, device=device)

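def generate_gpt2(prompt, max_new_tokens=200, temperature=0.8, top_p=0.9, top_k=50, num_return_sequences=1):
    # max_new_tokens counts generated tokens only; do_sample=True is needed for temperature/top_p/top_k to apply
    out = gen(prompt, max_new_tokens=max_new_tokens, do_sample=True, temperature=temperature, top_p=top_p, top_k=top_k,
              num_return_sequences=num_return_sequences, pad_token_id=tokenizer.eos_token_id)
    return [o['generated_text'] for o in out]

# 🔥 Test GPT-2 generation (256 new tokens is the limit that applied in the logged run)
print(generate_gpt2('internship', max_new_tokens=256)[0])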


CUDA available: False


Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=400) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


internship program. The program is intended to create the capacity to have a strong and effective leadership role within the community, to encourage collaboration, to develop partnerships, and to foster mutual understanding.

The students receive an education in English, History, Politics, and Literature, and are expected to complete their coursework in the second year. They will be responsible for meeting the academic standards of the program, which include the requirement for a bachelor's degree, master's degree, or equivalent in English from a university in which they are enrolled.

The students will also be expected to complete a master's degree in the Department of Economics, with the goal of completing their studies in economics. They will be expected to complete an equivalency degree and complete their master's degree in the Department of Economics.

The program is intended to provide a strong and effective leadership role within the community, to encourage collaboration, to dev
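
The `temperature`, `top_k`, and `top_p` arguments passed to `generate_gpt2` all reshape the next-token distribution before a token is drawn. The block below is a simplified, pure-Python sketch of that idea, not the transformers implementation. The function name `filter_distribution` and the toy scores are assumptions made for illustration.

```python
import math


def filter_distribution(logits, temperature=0.8, top_k=50, top_p=0.9):
    """Return {token: probability} after temperature, top-k and top-p filtering.

    `logits` is a hypothetical dict mapping each token to an unnormalised score.
    """
    # 1. Temperature: divide scores before the softmax; <1 sharpens, >1 flattens.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    # 2. Top-k: keep only the k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # 3. Top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        mass += p
        if mass >= top_p:
            break

    # 4. Renormalise the surviving tokens so they sum to 1.
    return {tok: p / mass for tok, p in kept}


# Toy scores (made up for this example).
toy_logits = {"program": 2.0, "at": 1.0, "for": 0.5, "banana": -3.0}
print(filter_distribution(toy_logits, temperature=0.8, top_k=3, top_p=0.9))
```

Lowering `top_p` or `top_k` shrinks the candidate set, which tends to make output safer but more repetitive. Raising `temperature` spreads probability toward less likely tokens.
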

## 🔤 LSTM Demo (toy)
A very small word-level LSTM trained on a tiny sample text for demonstration.
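
A word-level model of this kind learns from (context, next word) pairs cut from the text with a sliding window. Here is a minimal sketch of that step on a toy sentence; the helper name `make_windows` is hypothetical.

```python
def make_windows(tokens, seq_length):
    """Pair every run of seq_length tokens with the token that follows it."""
    inputs, targets = [], []
    for i in range(len(tokens) - seq_length):
        inputs.append(tokens[i:i + seq_length])
        targets.append(tokens[i + seq_length])
    return inputs, targets


toy_tokens = "these models are trained on large datasets".split()
windows, next_tokens = make_windows(toy_tokens, seq_length=5)
for window, target in zip(windows, next_tokens):
    print(window, "->", target)
```

A text of `n` tokens yields `n - seq_length` training pairs, so the three-sentence sample used in this demo gives the model very little to learn from.
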

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

# Sample dataset
sample_text = """Artificial intelligence is changing how people interact with technology.
Researchers build models that can write, translate, and understand languages.
These models are trained on large datasets and require compute resources."""

# Tokenize
tokens = sample_text.lower().split()
vocab = sorted(set(tokens))
word2idx = {w:i for i,w in enumerate(vocab)}
idx2word = {i:w for w,i in word2idx.items()}

# Build sequences
seq_length = 5
sequences = []
next_words = []
for i in range(len(tokens) - seq_length):
    sequences.append([word2idx[w] for w in tokens[i:i+seq_length]])
    next_words.append(word2idx[tokens[i+seq_length]])

X = np.array(sequences)
y = keras.utils.to_categorical(next_words, num_classes=len(vocab))

# Define model
model = keras.Sequential([
    layers.Embedding(input_dim=len(vocab), output_dim=16),  # input_length dropped: deprecated in Keras 3
    layers.LSTM(64),
    layers.Dense(len(vocab), activation='softmax')
])
model.compile(loss='categorical_crossentropy', optimizer='adam')

# Train
model.fit(X, y, epochs=100, verbose=0)

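# Sampling function (greedy decoding)
def sample_from_model(seed_text, gen_len=20):
    """Extend seed_text by gen_len words, always picking the most likely next word."""
    words = seed_text.lower().split()
    words = words[:seq_length]  # keep at most seq_length seed words
    for _ in range(gen_len):
        # Unknown words fall back to index 0 (the first vocabulary entry)
        seq = [word2idx.get(w, 0) for w in words[-seq_length:]]
        pred = model.predict(np.array([seq]), verbose=0)[0]
        ix = int(np.argmax(pred))  # greedy choice; prone to repeated words
        words.append(idx2word[ix])
    return ' '.join(words)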

# 🔥 Test LSTM
print(sample_from_model('artificial intelligence is changing'))




artificial intelligence is changing people people interact with technology. build build models that can translate, translate, and understand languages. models models are trained on
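
The repeated words in this output ("people people", "build build") are typical of greedy decoding, where `np.argmax` always picks the single most likely word. One common alternative, swapped in here as an illustration and not part of the original notebook, is to sample from the predicted distribution with a temperature. The helper `sample_index` below is a hypothetical, pure-Python sketch.

```python
import math
import random


def sample_index(probs, temperature=1.0, rng=random):
    """Draw an index from `probs` after reshaping it with a temperature."""
    # Work in log space; the small constant avoids log(0).
    logits = [math.log(p + 1e-9) / temperature for p in probs]
    peak = max(logits)
    weights = [math.exp(l - peak) for l in logits]
    # random.choices normalises the weights itself.
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# Toy distribution over three words (made up for this example).
rng = random.Random(0)
toy_probs = [0.6, 0.3, 0.1]
draws = [sample_index(toy_probs, temperature=0.7, rng=rng) for _ in range(10)]
print(draws)
```

To try it in `sample_from_model`, replace the `np.argmax(pred)` line with `ix = sample_index(pred, temperature=0.7)`. With so little training text, sampling may reduce the repetition but will not make the output coherent.
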
