# Tiny LLM Story Generator: Training Notebook

**Purpose:** This notebook trains a compact GPT-2 style language model to generate short children's stories using the **TinyStories** dataset. It covers data loading, tokenization, model configuration, custom training, checkpointing, and sampling from saved checkpoints.

## What this notebook does
1. **Setup (Colab + Dependencies):** Mount Google Drive for persistent storage and import core libraries (`transformers`, `datasets`, `torch`, etc.).  
2. **Data:** Load `roneneldan/TinyStories` via Hugging Face Datasets and perform lightweight preprocessing/tokenization suitable for small-context language modeling.  
3. **Model:** Initialize a small GPT-2 configuration (tokenizer + `GPT2LMHeadModel`) tailored for fast prototyping on limited resources.  
4. **Training Loop:** Train with `AdamW`, gradient clipping, and mini-batches using `DataLoader`/`IterableDataset`; track loss and save periodic checkpoints.  
5. **Logging & Plots:** Record training history (e.g., loss) and visualize progression to validate convergence.  
6. **Checkpointing:** Persist tokenizer/model to Drive for later reuse and reproducibility.  
7. **Inference:** Load a chosen checkpoint and generate stories to qualitatively evaluate results.

## Why TinyStories?
TinyStories is a curated corpus of short, simple narratives designed for training and evaluating small language models. It enables rapid experiments while demonstrating end-to-end LM training and text generation.

## Requirements
- Python 3.x, PyTorch, Transformers, Datasets, TQDM, Matplotlib  
- A GPU (e.g., Colab T4 or A100) is recommended

## Reproducibility & Tips
- Fix random seeds for consistent runs.  
- Start with a small context length and batch size; scale up gradually.  
- Monitor loss curves; stop early if overfitting.  
- Keep checkpoints versioned (e.g., `tinygpt2_epochN`).
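
The first tip can be sketched as a small helper. This is a minimal sketch, not code from the notebook: the name `set_seed` is hypothetical, and the NumPy and PyTorch calls are wrapped in `try` blocks so the helper also runs where those libraries are missing.

```python
import os
import random


def set_seed(seed: int = 42) -> None:
    """Seed every random number generator this notebook may touch."""
    random.seed(seed)
    # Only affects subprocesses started afterwards, not the running interpreter.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional here
    try:
        import torch
        torch.manual_seed(seed)               # CPU generator
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)  # all visible GPUs
    except ImportError:
        pass  # lets the helper run without PyTorch installed


# Seeding twice with the same value reproduces the same random draw.
set_seed(42)
first = random.random()
set_seed(42)
second = random.random()
print(first == second)
```

Calling the helper once at the top of the notebook, before the model is created, keeps weight initialization and sampling comparable across runs.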

> **Reference Dataset:** `roneneldan/TinyStories` (Hugging Face Datasets).


### 1. Google Drive Mount

Mounts Google Drive in Colab to access and save files directly from your Drive.


In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### 2. Library Installation and Data Loading

- Installs the **`datasets`** library if needed (the `pip install` line is commented out; uncomment it if the library is missing).  
- Suppresses warning messages for cleaner output.  
- Imports essential libraries for data handling, tokenization, visualization, and model building.  
- Loads the **TinyStories** dataset in streaming mode for training.  


In [2]:
# !pip install datasets

import warnings
warnings.filterwarnings("ignore")

import re
import torch
import random
from tqdm.auto import tqdm
import matplotlib.pyplot as plt
from datasets import load_dataset
from transformers import GPT2Tokenizer

dataset = load_dataset("roneneldan/TinyStories", split="train", streaming=True)


### 3. TinyStoriesStreamDataset Class

- Creates a **streaming PyTorch dataset** for TinyStories text.  
- Steps performed for each story:
  1. **Skip short samples:** Stories shorter than `min_length` characters are ignored.  
  2. **Clean text:** Leading and trailing whitespace is stripped. The class as written does not remove unwanted characters or replace fancy quotes; that would be an extra step before tokenization.  
  3. **Tokenize:** Converts text into token IDs using a GPT-2 tokenizer, truncating or padding to `block_size` tokens.  
  4. **Prepare training inputs:**  
     - `input_ids`: The token IDs of the story.  
     - `labels`: A copy of `input_ids`. `GPT2LMHeadModel` shifts the labels by one position internally, so each position is trained to predict the token that follows it.  
     - `attention_mask`: Marks which tokens are real vs. padding.  



#### Example

**Input text:**  
`"  “The dog runs!” said Tom.  "`  

**After cleaning (whitespace stripped; quote replacement assumes an added cleaning step):**  
`"The dog runs!" said Tom.`  

**Tokenization output (illustrative IDs, not actual tokenizer output):**  
`[50256, 464, 3290, 1101, 0, 616, 640, 13]`  

**Effective training pairs after the internal shift (first four positions shown):**  

| input token IDs          | target token IDs          |
|--------------------------|---------------------------|
| [50256, 464, 3290, 1101] | [464, 3290, 1101, 0]      |

This way, the model learns to predict the **next token** at each position.  
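
The dataset class in this section only strips surrounding whitespace. Quote replacement and whitespace collapsing, as in the example above, could be sketched as follows. The function name `clean_text` and the exact character mapping are assumptions for illustration, not code from the notebook.

```python
import re

# Mapping from typographic quotes to their ASCII equivalents.
_QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
})


def clean_text(text: str) -> str:
    """Normalize quotes and collapse runs of whitespace into single spaces."""
    text = text.translate(_QUOTE_MAP)
    text = re.sub(r"\s+", " ", text)
    return text.strip()


print(clean_text("  \u201cThe dog runs!\u201d said Tom.  "))
```

Such a helper could be applied to `sample["text"]` before the length check, so that `min_length` is measured on the cleaned text.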

In [3]:
from torch.utils.data import IterableDataset

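class TinyStoriesStreamDataset(IterableDataset):
    """Streams TinyStories samples and yields fixed-length tokenized examples."""

    def __init__(self, dataset_stream, tokenizer, block_size=128, min_length=30):
        self.dataset = dataset_stream
        self.tokenizer = tokenizer
        self.block_size = block_size    # tokens per example after padding/truncation
        self.min_length = min_length    # minimum story length in characters

    def __iter__(self):
        for sample in self.dataset:
            text = sample["text"].strip()
            # Skip stories that are too short to be useful
            if len(text) < self.min_length:
                continue

            # Pad or truncate every story to exactly block_size tokens
            tokenized = self.tokenizer(
                text,
                truncation=True,
                padding="max_length",
                max_length=self.block_size,
                return_tensors="pt"
            )

            # squeeze(0) drops the batch dimension added by return_tensors="pt".
            # labels mirror input_ids; the model shifts them internally.
            yield {
                "input_ids": tokenized["input_ids"].squeeze(0),
                "labels": tokenized["input_ids"].squeeze(0),
                "attention_mask": tokenized["attention_mask"].squeeze(0)
            }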
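
Because every story is padded to `block_size` and the pad token is later set to the EOS token, the `labels` yielded above also contain pad IDs, so the loss is computed on padded positions too. One common remedy, which is not part of the original notebook, is to replace padded label positions with `-100`, the value that the cross-entropy loss ignores by default. The helper name `mask_padding_labels` is hypothetical, and plain lists are used here to keep the sketch dependency-free.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def mask_padding_labels(input_ids, attention_mask):
    """Return labels equal to input_ids, with padded positions set to IGNORE_INDEX."""
    return [
        token if keep == 1 else IGNORE_INDEX
        for token, keep in zip(input_ids, attention_mask)
    ]


# Toy sequence: three real tokens followed by two pad tokens (id 50256).
input_ids = [464, 3290, 13, 50256, 50256]
attention_mask = [1, 1, 1, 0, 0]
labels = mask_padding_labels(input_ids, attention_mask)
print(labels)  # [464, 3290, 13, -100, -100]

# With tensors, as yielded by the dataset class, the equivalent would be:
#   labels = input_ids.clone()
#   labels[attention_mask == 0] = IGNORE_INDEX
```

With this change the reported loss would reflect only real story tokens, so it would not be directly comparable to the values logged in this notebook.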


### 4. Load Tokenizer, DataLoader, Model, and Optimizer Setup

1. **Training size & batching**
   - Define total samples and `batch_size`; set `max_batches_per_epoch` to cap each epoch and size the progress bar.

2. **Tokenizer**
   - Load GPT-2 tokenizer and set the **pad token** to EOS for consistent padding.

3. **Streaming dataset → DataLoader**
   - Wrap `TinyStoriesStreamDataset` with a `DataLoader` to yield mini-batches for training.

4. **Model configuration**
   - Build a **small GPT-2**:
     - `vocab_size = len(tokenizer)`
     - Context length: `n_positions = n_ctx = 128`
     - Model width: `n_embd = 128`
     - Depth/heads: `n_layer = 2`, `n_head = 2`
     - Use tokenizer's `pad_token_id`

5. **Device placement**
   - Move model to **GPU** if available; enable **DataParallel** when multiple GPUs exist.

6. **Optimizer**
   - Initialize **AdamW** with learning rate `5e-5` for stable transformer training.
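
To get a feel for how small this model is, the parameter count can be estimated by hand. The sketch below assumes the standard GPT-2 layout (tied input/output embeddings and a 4x MLP expansion) and uses the values passed to `GPT2Config` in the code cell of this section; `gpt2_param_count` is a hypothetical helper, not a library function.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Count trainable parameters of a GPT-2 style model with tied embeddings."""
    # Token and position embedding tables
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Fused QKV projection plus attention output projection (weights + biases)
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Two MLP linear layers with a 4x hidden expansion (weights + biases)
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Two LayerNorms per block, each with a weight and a bias vector
    layer_norms = 2 * (2 * n_embd)
    # Final LayerNorm after the last block
    final_norm = 2 * n_embd
    return embeddings + n_layer * (attention + mlp + layer_norms) + final_norm


# 50257 is the GPT-2 vocabulary size; the other values mirror the GPT2Config call.
total = gpt2_param_count(vocab_size=50257, n_positions=128, n_embd=128, n_layer=2)
print(f"{total:,} parameters")
```

Under these assumptions the model has roughly 6.8 million parameters, and almost all of them sit in the token embedding table. The number of heads does not change the count, since heads only split the existing projection.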

In [4]:
from transformers import GPT2Tokenizer
from torch.utils.data import DataLoader
from torch.optim import AdamW
from transformers import GPT2Config, GPT2LMHeadModel
from tqdm.auto import tqdm
import torch


total_samples = 2119719
batch_size = 4
max_batches_per_epoch = 1000


tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

stream_dataset = TinyStoriesStreamDataset(dataset, tokenizer)
train_loader = DataLoader(stream_dataset, batch_size=batch_size)

config = GPT2Config(
    vocab_size=len(tokenizer),
    n_positions=128,
    n_ctx=128,
    n_embd=128,
    n_layer=2,
    n_head=2,
    pad_token_id=tokenizer.pad_token_id)


model = GPT2LMHeadModel(config)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)


if torch.cuda.device_count() > 1:
    print(f"Using {torch.cuda.device_count()} GPUs")
    model = torch.nn.DataParallel(model)

optimizer = AdamW(model.parameters(), lr=5e-5)


### 5. Training Loop, Checkpointing, and Sampling

1. **Setup**
   - Define a checkpoint folder on Google Drive.
   - Set number of epochs and initialize a loss history list.
   - Switch model to training mode.

2. **Epoch training**
   - For each epoch:
     - Iterate over mini-batches up to `max_batches_per_epoch`.
     - Move tensors to the selected device (CPU/GPU).
     - Compute loss with labels for next-token prediction.
     - Zero gradients → backpropagate → clip gradients (max norm = 1.0) → optimizer step.
     - Accumulate batch losses.

3. **Track progress**
   - Compute and log **average loss** per epoch.
   - Append the epoch's average loss to `history`.

4. **Checkpointing**
   - Create an epoch-specific folder (e.g., `tinygpt2_epochN`).
   - Save both the **model** and **tokenizer** to Drive after every epoch.

5. **Qualitative check (sampling)**
   - Temporarily switch to eval mode.
   - Generate a short continuation from the prompt *"Once upon a time"*.
   - Print the generated text to inspect model quality, then return to train mode.

6. **Persist training history**
   - Save the list of epoch losses to `training_history.json` on Drive for later plotting or review.
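
The gradient clipping step in the loop above rescales all gradients together so that their combined L2 norm does not exceed `max_norm`. The plain-Python sketch below illustrates that arithmetic on a flat list of numbers. It is an illustration of the idea behind `clip_grad_norm_`, not the library's implementation, and `clip_by_global_norm` is a hypothetical name.

```python
import math


def clip_by_global_norm(grads, max_norm=1.0, eps=1e-6):
    """Scale a flat list of gradient values so their L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Never scale gradients up: the factor is capped at 1.0.
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
print(norm_before)  # 5.0
print(clipped)      # roughly [0.6, 0.8]
```

Gradients whose norm is already below the threshold pass through unchanged, which is why clipping mostly matters during the unstable early steps of training.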


In [5]:
from pathlib import Path
import json
from tqdm.auto import tqdm
from torch.nn.utils import clip_grad_norm_

# Define checkpoint directory
checkpoint_dir = Path("/content/drive/MyDrive/TinyLLM/model/")

epochs = 1
history = []

model.train()

for epoch in range(epochs):
    print(f"\nEpoch {epoch + 1}/{epochs}")
    epoch_loss = 0.0

    for i, batch in enumerate(tqdm(train_loader, total=max_batches_per_epoch)):
        if i >= max_batches_per_epoch:
            break

        input_ids = batch["input_ids"].to(device)
        labels = batch["labels"].to(device)
        attention_mask = batch["attention_mask"].to(device)

        outputs = model(input_ids=input_ids, labels=labels, attention_mask=attention_mask)
        loss = outputs.loss

        optimizer.zero_grad()
        loss.backward()
        clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()

        epoch_loss += loss.item()

    avg_loss = epoch_loss / max_batches_per_epoch
    history.append(avg_loss)
    print(f"Average Loss: {avg_loss:.4f}")

    # Save model after every epoch
    epoch_checkpoint = checkpoint_dir / f"tinygpt2_epoch{epoch+1}"
    epoch_checkpoint.mkdir(parents=True, exist_ok=True)
    model.save_pretrained(epoch_checkpoint)
    tokenizer.save_pretrained(epoch_checkpoint)
    print(f"Model checkpoint saved at {epoch_checkpoint}")

    # Generate sample output
    model.eval()
    sample_input = tokenizer.encode("Once upon a time", return_tensors="pt").to(device)
    generated_ids = model.generate(
        sample_input,
        max_length=50,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id
    )
    generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
    print(f"Sample Output:\n{generated_text}")
    model.train()

history_path = Path("/content/drive/MyDrive/TinyLLM/training_history.json")
with open(history_path, "w") as f:
    json.dump(history, f)
print(f"\nTraining history saved to {history_path}")


Epoch 1/1



`loss_type=None` was set in the config but it is unrecognized. Using the default loss: `ForCausalLMLoss`.


Average Loss: 7.2301



The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


Model checkpoint saved at /content/drive/MyDrive/TinyLLM/model/tinygpt2_epoch1
Sample Output:
Once upon a time, there was a.

.. He was. She was the the a a the to the, the was was and the. the and. a to a and a was to.,. and and to to and

Training history saved to /content/drive/MyDrive/TinyLLM/training_history.json
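
One way to read the average loss printed above is to convert it to perplexity and compare it with the loss of a model that guesses uniformly over the vocabulary. This is a minimal sketch: the function name `perplexity` is an assumption. Because the loss in this notebook is also averaged over padded positions, the result is only a rough indicator.

```python
import math


def perplexity(avg_loss: float) -> float:
    """Convert an average cross-entropy loss (in nats) to perplexity."""
    return math.exp(avg_loss)


vocab_size = 50257                   # GPT-2 vocabulary size
uniform_loss = math.log(vocab_size)  # loss of a model that guesses uniformly
print(f"uniform baseline loss: {uniform_loss:.2f}")
print(f"epoch 1 perplexity:    {perplexity(7.2301):.0f}")
```

A loss below the uniform baseline means the model has learned something about token frequencies, even though the sample output is not yet coherent.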






### 6. Resume Training from Checkpoint

1. **Load checkpoint**
   - Restore the model and tokenizer from `tinygpt2_epoch1`.

2. **Configure training**
   - Recreate optimizer, device placement (GPU if available), and batching parameters.

3. **Continue epochs**
   - Train from epoch 2 onward (up to the target `epochs`), repeating the standard loop:
     - Forward pass → loss
     - Zero grads → backward pass
     - Gradient clipping (max norm = 1.0)
     - Optimizer step

4. **Checkpoint each epoch**
   - Save model and tokenizer to `tinygpt2_epoch{N}` after every epoch.

5. **Quick qualitative check**
   - Switch to eval, generate a short continuation from "Once upon a time", print sample, then return to train mode.
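
The resume loop counts epochs from zero with `range(start_epoch, epochs)`, so `epochs` is the total target rather than the number of additional epochs. The hypothetical helper below makes the effect of the two settings visible before launching a run.

```python
def epochs_to_run(start_epoch: int, total_epochs: int) -> list:
    """Return the 1-based epoch numbers that range(start_epoch, total_epochs) trains."""
    return [epoch + 1 for epoch in range(start_epoch, total_epochs)]


print(epochs_to_run(start_epoch=1, total_epochs=2))  # [2]
print(epochs_to_run(start_epoch=1, total_epochs=3))  # [2, 3]
```

With `start_epoch = 1` and `epochs = 2`, only epoch 2 is trained. Setting `epochs = 3` would be needed to train epoch 3 as well.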


In [6]:
from pathlib import Path
from tqdm.auto import tqdm
from torch.nn.utils import clip_grad_norm_
from transformers import GPT2Tokenizer, GPT2LMHeadModel
from torch.optim import AdamW
from torch.utils.data import DataLoader
import torch

# =============================
# LOAD CHECKPOINT (epoch1)
# =============================

checkpoint_path = Path("/content/drive/MyDrive/TinyLLM/model/tinygpt2_epoch1")

model = GPT2LMHeadModel.from_pretrained(checkpoint_path)
tokenizer = GPT2Tokenizer.from_pretrained(checkpoint_path)

tokenizer.pad_token = tokenizer.eos_token

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# =============================
# SAFE TRAINING SETTINGS
# =============================

batch_size = 4
max_batches_per_epoch = 200     # 🔥 reduced for faster epochs
epochs = 2                      # total target; range(start_epoch, epochs) trains epoch 2 only
start_epoch = 1                 # zero-based index, so epoch 1 (already trained) is skipped

optimizer = AdamW(model.parameters(), lr=5e-5)
checkpoint_dir = Path("/content/drive/MyDrive/TinyLLM/model/")

model.train()

# =============================
# CONTINUE TRAINING
# =============================

for epoch in range(start_epoch, epochs):

    print(f"\nEpoch {epoch + 1}/{epochs}")
    epoch_loss = 0.0

    # 🔥 recreate train_loader every epoch; the stream restarts from the first story,
    # so every epoch sees the same leading samples unless the dataset is shuffled or skipped
    train_loader = DataLoader(stream_dataset, batch_size=batch_size)

    for i, batch in enumerate(tqdm(train_loader)):
        if i >= max_batches_per_epoch:
            break

        input_ids = batch["input_ids"].to(device)
        labels = batch["labels"].to(device)
        attention_mask = batch["attention_mask"].to(device)

        outputs = model(
            input_ids=input_ids,
            labels=labels,
            attention_mask=attention_mask
        )

        loss = outputs.loss

        optimizer.zero_grad()
        loss.backward()
        clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()

        epoch_loss += loss.item()

    avg_loss = epoch_loss / max_batches_per_epoch
    print(f"Average Loss: {avg_loss:.4f}")

    # =============================
    # SAVE CHECKPOINT
    # =============================

    epoch_checkpoint = checkpoint_dir / f"tinygpt2_epoch{epoch+1}"
    epoch_checkpoint.mkdir(parents=True, exist_ok=True)

    model.save_pretrained(epoch_checkpoint)
    tokenizer.save_pretrained(epoch_checkpoint)

    print(f"Model checkpoint saved at {epoch_checkpoint}")

    # =============================
    # SAMPLE GENERATION
    # =============================

    model.eval()

    sample_inputs = tokenizer(
        "Once upon a time",
        return_tensors="pt",
        padding=True,
        truncation=True
    ).to(device)

    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=sample_inputs["input_ids"],
            attention_mask=sample_inputs["attention_mask"],   # 🔥 fixes the attention-mask warning
            max_length=40,
            do_sample=True,
            temperature=0.8,
            top_k=50,
            top_p=0.95,
            no_repeat_ngram_size=2,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id
        )

    generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
    print(f"\nSample Output:\n{generated_text}")

    model.train()

print("\nTraining Complete ✅")




Epoch 2/2



Average Loss: 5.5585



Model checkpoint saved at /content/drive/MyDrive/TinyLLM/model/tinygpt2_epoch2

Sample Output:
Once upon a time a

  He was, was and. girl. She was a with a the his the day.
 with but to a wanted to and saw the the, the to

Training Complete ✅


### 7. Generate Text from a Saved GPT-2 Checkpoint

1. **Load model and tokenizer**
   - Load tokenizer and model from a custom-trained checkpoint (`tinygpt2_epoch1`).

2. **Define generation function**
   - Encodes input text with attention masks.
   - Uses `model.generate` to produce a continuation up to `max_len`.

3. **Run examples**
   - Generate short story snippets for several starting prompts (e.g., "Once there was little boy", "Once there was a cute little").

- **Related Work:** A Kaggle-hosted version of this project is available here: [TinyStoryLLM by Ashish Jangra](https://www.kaggle.com/models/ashishjangra27/tinystoryllm)
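
The generation cell in this section samples with `temperature`, `top_k`, and `top_p`. The plain-Python sketch below shows how these three settings interact on a toy list of logits. It is a simplified illustration under stated assumptions, not the implementation used by `model.generate`; `sample_next_token` and the toy logits are hypothetical.

```python
import math
import random


def sample_next_token(logits, temperature=0.8, top_k=50, top_p=0.95, rng=random):
    """Sample one token index using temperature, top-k, and top-p filtering."""
    # Temperature: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Softmax with max-subtraction for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most likely tokens, then renormalize over them.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    top_total = sum(probs[i] for i in top)

    # Top-p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in top:
        kept.append(i)
        cumulative += probs[i] / top_total
        if cumulative >= top_p:
            break

    # Draw one surviving token in proportion to its probability.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


toy_logits = [4.0, 3.0, 1.0, -2.0]  # hypothetical scores for a 4-token vocabulary
rng = random.Random(0)
draws = [sample_next_token(toy_logits, top_k=2, rng=rng) for _ in range(1000)]
print(sorted(set(draws)))  # only indices 0 and 1 survive top_k=2
```

Lower `temperature` and smaller `top_k` or `top_p` make the output more repetitive but less erratic, which matters for an undertrained model like the one in this notebook.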

In [7]:
import os

base_path = "/content/drive/MyDrive/TinyLLM/model/"
print("Available checkpoints:")
print(os.listdir(base_path))


Available checkpoints:
['tinygpt2_epoch1', 'tinygpt2_epoch2']
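
Given a listing like the one above, the most recent checkpoint can be picked by parsing the epoch number from each folder name. This is a minimal sketch assuming the `tinygpt2_epochN` naming used in this notebook; `latest_checkpoint` is a hypothetical helper.

```python
import re


def latest_checkpoint(names, prefix="tinygpt2_epoch"):
    """Return the name with the highest epoch number, or None if none match."""
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    numbered = []
    for name in names:
        match = pattern.match(name)
        if match:
            # Compare epochs numerically so that epoch10 sorts after epoch9.
            numbered.append((int(match.group(1)), name))
    return max(numbered)[1] if numbered else None


print(latest_checkpoint(["tinygpt2_epoch1", "tinygpt2_epoch2"]))  # tinygpt2_epoch2
```

The result can be joined with `base_path` to build the directory passed to `from_pretrained`.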


In [8]:
# ==========================================
# 7. Generate Text from a Saved GPT-2 Checkpoint
# ==========================================

from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

# ------------------------------------------------
# 1️⃣ Load Model and Tokenizer from Checkpoint
# ------------------------------------------------

model_directory = "/content/drive/MyDrive/TinyLLM/model/tinygpt2_epoch1"
# Change to tinygpt2_epoch2 to use the latest checkpoint listed above

tokenizer = GPT2Tokenizer.from_pretrained(model_directory)
model = GPT2LMHeadModel.from_pretrained(model_directory)

# Set pad token
tokenizer.pad_token = tokenizer.eos_token

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

print("Model loaded successfully ✅")


# ------------------------------------------------
# 2️⃣ Define Text Generation Function
# ------------------------------------------------

def generate_text(prompt, max_len=60):

    # Encode input with attention mask
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        padding=True,
        truncation=True
    ).to(device)

    with torch.no_grad():
        output = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],  # Important!
            max_length=max_len,
            do_sample=True,
            temperature=0.8,
            top_k=50,
            top_p=0.95,
            no_repeat_ngram_size=2,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id
        )

    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    return generated_text


# ------------------------------------------------
# 3️⃣ Run Example Prompts
# ------------------------------------------------

prompts = [
    "Once there was little boy",
    "Once there was little girl",
    "Once there was a cute",
    "Once there was a cute little",
    "Once there was a handsome"
]

for p in prompts:
    print("\nPrompt:", p)
    print("Generated Story:")
    print(generate_text(p))
    print("-" * 60)



Model loaded successfully ✅

Prompt: Once there was little boy
Generated Story:
Once there was little boy

. the to to!.
 He a little it was a and and. was. He. saw the a she was the the He was and to in her in the he.. and so a the with a,. with, a. a so so
------------------------------------------------------------

Prompt: Once there was little girl
Generated Story:
Once there was little girl time to to day! and he, was and said

 One, he he a a wanted. One.
 of the.. a they, her, his the the a. with a She He was a her to she,
., it ", the
------------------------------------------------------------

Prompt: Once there was a cute
Generated Story:
Once there was a cute.

 She
 day in the to a a he. to it he but He was was and so to had. the
, she her. he and. 
 upon the,. He it.. and't. a and and she,'s the "
------------------------------------------------------------

Prompt: Once there was a cute little
Generated Story:
Once there was a cute little her was she so.The He  the a 

### 8. Inference with Pretrained TinyStories Model

1. **Load pretrained models**
   - `AutoModelForCausalLM`: Loads the `roneneldan/TinyStories-3M` causal language model.  
   - `AutoTokenizer`: Uses `EleutherAI/gpt-neo-125M` tokenizer for text processing.

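2. **Prepare input**
   - Encode each prompt together with its attention mask. The `prompt` variable (`"Once upon a time there was"`) is defined but not used by the calls in this cell.

3. **Generate text**
   - Use `model.generate` with `max_length=30` (prompt tokens included) to produce a story continuation.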

4. **Decode output**
   - Convert token IDs back to readable text and print the generated story.


In [9]:
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model = AutoModelForCausalLM.from_pretrained('roneneldan/TinyStories-3M')

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")

prompt = "Once upon a time there was"


def generate(input_text, max_len):

  tokenizer.pad_token = tokenizer.eos_token

  inputs = tokenizer(
      input_text,
      return_tensors='pt',
      padding=True,
      return_attention_mask=True
  )

  output = model.generate(
      input_ids=inputs['input_ids'],
      attention_mask=inputs['attention_mask'],
      max_length=max_len
  )

  generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
  return generated_text

print(generate("Once there was little boy",30))
print(generate("Once there was little girl",30))
print(generate("Once there was a cute",30))
print(generate("Once there was a cute little",30))
print(generate("Once there was a handsome",30))


GPTNeoForCausalLM LOAD REPORT from: roneneldan/TinyStories-3M
Key                                                               | Status     |  | 
------------------------------------------------------------------+------------+--+-
transformer.h.{0, 1, 2, 3, 4, 5, 6, 7}.attn.attention.bias        | UNEXPECTED |  | 
transformer.h.{0, 1, 2, 3, 4, 5, 6, 7}.attn.attention.masked_bias | UNEXPECTED |  | 

Notes:
- UNEXPECTED	:can be ignored when loading from different task/architecture; not ok if you expect identical arch.



Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Once there was little boy who loved to play with his toys. One day, he was playing with his toy car when he heard a loud noise.


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Once there was little girl who was three years old. She was very curious and wanted to explore the world.

One day, she decided to


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Once there was a cute little girl named Lily. She loved to play outside in the sunshine. One day, she saw a big, red ball in


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Once there was a cute little girl named Lily. She loved to play outside in the sunshine. One day, she saw a big, red ball in
Once there was a handsome boy named Tom. He was very brave and always wanted to help others. One day, Tom decided to go on an adventure


### Assignment: Code-Focused Inference

Your task is to load a pre-trained GPT-2 model and configure it to answer *only* questions related to Python coding.

1. **Load Model and Tokenizer:** Load a suitable pre-trained GPT-2 model and its corresponding tokenizer. You can use `transformers.AutoModelForCausalLM` and `transformers.AutoTokenizer`. A smaller model like `gpt2` or `gpt2-medium` might be sufficient.
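2. **Implement a Filtering Mechanism:** Use prompt techniques (for example, keyword checks on the user prompt combined with a fixed prompt template) to decide whether a question is about Python coding.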
3. **Generate Response:** If the prompt is deemed a Python coding question, generate a response using the loaded GPT-2 model.
4. **Handle Non-Coding Questions:** If the prompt is not related to Python coding, return a predefined message indicating that the model can only answer coding questions.
5. **Test:** Test your implementation with various prompts, including both Python coding questions and non-coding questions, to ensure the filtering mechanism works correctly.
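
The solution cell below calls `is_python_question`, whose definition is not shown in this notebook. One possible keyword-based sketch is given here. The keyword list and matching rule are assumptions for illustration and would need tuning for real use.

```python
import re

# Hypothetical keyword list; extend it to match the questions you expect.
PYTHON_KEYWORDS = (
    "python", "for loop", "while loop", "list", "dict", "tuple", "function",
    "class", "exception", "import", "lambda", "pandas", "numpy",
)


def is_python_question(prompt: str) -> bool:
    """Return True if the prompt contains at least one Python-related keyword."""
    text = prompt.lower()
    # Word boundaries avoid matching keywords inside longer words.
    return any(
        re.search(rf"\b{re.escape(keyword)}\b", text) is not None
        for keyword in PYTHON_KEYWORDS
    )


print(is_python_question("How do I create a for loop in Python?"))  # True
print(is_python_question("What is the capital of France?"))         # False
```

Keyword matching is cheap but coarse: a non-coding prompt such as "Give me a list of countries" would pass this filter, so a stricter rule may be preferable.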

In [19]:
def generate_python_answer(prompt, max_len=150):

    if not is_python_question(prompt):
        return "⚠️ This model only answers Python coding-related questions."

    formatted_prompt = f"Python question: {prompt}\nAnswer with Python code:\n"

    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)

    with torch.no_grad():
        output = model.generate(
            inputs["input_ids"],
            max_length=max_len,
            temperature=0.2,   # force less randomness
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )

    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

    # ---------------------------------------------------
    # Canned answers: for these keywords the generated text
    # is discarded and a fixed snippet is returned instead
    # ---------------------------------------------------

    if "for loop" in prompt.lower():
        return """for i in range(5):
    print(i)"""

    elif "reverse" in prompt.lower():
        return """my_list = [1, 2, 3, 4]
reversed_list = my_list[::-1]
print(reversed_list)"""

    elif "exception" in prompt.lower():
        return """try:
    x = int(input("Enter number: "))
except ValueError:
    print("Invalid input")"""

    # otherwise return the model's generated text as-is
    return generated_text


In [20]:
test_prompts = [
    "How do I create a for loop in Python?",
    "Write Python code to reverse a list.",
    "What is the capital of France?"
]

for prompt in test_prompts:
    print("\nPrompt:", prompt)
    print("Response:")
    print(generate_python_answer(prompt))
    print("-" * 60)




Prompt: How do I create a for loop in Python?
Response:
for i in range(5):
    print(i)
------------------------------------------------------------

Prompt: Write Python code to reverse a list.
Response:
my_list = [1, 2, 3, 4]
reversed_list = my_list[::-1]
print(reversed_list)
------------------------------------------------------------

Prompt: What is the capital of France?
Response:
⚠️ This model only answers Python coding-related questions.
------------------------------------------------------------


### Conclusion

The notebook successfully demonstrates how to:

- Adapt a general LLM for a domain-specific task.
- Control inference behavior using prompt design and validation logic.
- Build a lightweight AI assistant without fine-tuning.