# 06 - Generate Text
## Sampling with the Trained LSTM

---


## 🎯 Concept Primer

### How Text Generation Works

Generation is an **autoregressive loop**:

```
1. Start with a prompt: "You will rejoice to hear"
2. Feed the prompt through the model → get logits for the next char
3. Pick the next char (greedy: argmax of the logits)
4. Append the char to the sequence
5. Feed only the new char, plus the carried LSTM states → get logits for the next char
6. Repeat steps 3-5 until we have 500 characters
```
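
The loop above can be sketched without a neural network. In this minimal illustration, a hypothetical lookup table `toy_logits` stands in for the trained model, and `generate_greedy` is a made-up helper name; neither is part of the notebook's code.

```python
# Toy stand-in for a trained model (hypothetical): maps the previous
# character to a score ("logit") for each possible next character.
toy_logits = {
    "a": {"a": 0.1, "b": 2.0, "c": 0.3},
    "b": {"a": 0.2, "b": 0.1, "c": 1.5},
    "c": {"a": 1.8, "b": 0.4, "c": 0.2},
}

def generate_greedy(prompt, num_chars):
    """Autoregressive loop: each new character is chosen from the scores
    produced for the character before it, then appended to the sequence."""
    sequence = list(prompt)
    for _ in range(num_chars):
        scores = toy_logits[sequence[-1]]        # stands in for the forward pass
        next_char = max(scores, key=scores.get)  # greedy: argmax of the scores
        sequence.append(next_char)               # becomes input to the next step
    return "".join(sequence)

print(generate_greedy("a", 6))  # abcabca
```

The toy table only looks at the last character, whereas the LSTM also receives its carried hidden and cell states, which summarize everything seen so far.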

### Generation vs. Training

| Aspect | Training | Generation |
|--------|----------|------------|
| **Goal** | Learn from data | Produce new text |
| **Mode** | `model.train()` | `model.eval()` |
| **Gradients** | Tracked for backprop | Disabled with `torch.no_grad()` |
| **Input** | Real text batches | Generated chars |
| **Output** | Loss | New characters |

### Greedy Sampling

**Argmax**: Always pick the most likely character.

```python
logits, states = model(next_input, states)  # logits: [1, vocab_size]
next_id = torch.argmax(logits, dim=-1).item()
```

**Pros**: Simple, deterministic  
**Cons**: Repetitive; the output can get stuck repeating the same phrase

**Alternative**: Temperature sampling (adds randomness), left as an extension.
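
To make the contrast concrete, here is a small standard-library sketch that picks from one fixed score vector ten times, once greedily and once by sampling. The scores and the helper names `greedy_pick` and `sample_pick` are invented for illustration.

```python
import math
import random

logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for a 4-character vocabulary

def greedy_pick(scores):
    """Argmax: return the index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])

def sample_pick(scores, rng):
    """Draw an index with probability proportional to exp(score),
    which is what sampling from a softmax distribution does."""
    weights = [math.exp(s) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
greedy_draws = [greedy_pick(logits) for _ in range(10)]
sampled_draws = [sample_pick(logits, rng) for _ in range(10)]

print("greedy :", greedy_draws)   # index 0 every time
print("sampled:", sampled_draws)
```

Greedy selection returns the same index for the same scores, which is why it is deterministic; sampling can return any index, with higher-scoring ones drawn more often.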

### States in Generation

Training resets the states to zeros for every batch. Generation instead:
- Uses a **batch size of 1**
- **Carries states** from one step to the next, so the model keeps its context
- Feeds one character at a time once the prompt has been processed
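
Why carrying states matters can be checked on a small, untrained `nn.LSTM`. The sketch below uses arbitrary sizes and random inputs (assumptions for illustration, not the notebook's model) and compares three ways of running the same sequence.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small untrained LSTM; the sizes are arbitrary and chosen for illustration.
lstm = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)
sequence = torch.randn(1, 5, 4)  # [batch=1, time=5, features=4]
h0 = torch.zeros(1, 1, 8)        # [num_layers, batch, hidden_size]
c0 = torch.zeros(1, 1, 8)

with torch.no_grad():
    # (a) Whole sequence in one call.
    full_out, _ = lstm(sequence, (h0, c0))

    # (b) One time step per call, passing the returned states forward.
    states = (h0, c0)
    step_outs = []
    for t in range(sequence.size(1)):
        out, states = lstm(sequence[:, t:t + 1, :], states)
        step_outs.append(out)
    stepped_out = torch.cat(step_outs, dim=1)

    # (c) One time step per call, but with the states reset every time.
    reset_outs = [lstm(sequence[:, t:t + 1, :], (h0, c0))[0]
                  for t in range(sequence.size(1))]
    reset_out = torch.cat(reset_outs, dim=1)

print(torch.allclose(full_out, stepped_out, atol=1e-6))
print(torch.allclose(full_out, reset_out, atol=1e-6))
```

Variant (b) should match (a) up to floating-point tolerance, while (c) should not: without the carried states, each step sees only the current input and loses the earlier context.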

### What Breaks If We Skip This?

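- No `eval()`: layers such as dropout or batch norm would stay in training mode. This model has neither, so the call is a safe habit rather than a strict requirement here
- Gradients tracked: every step extends the autograd graph through the carried states, so generation slows down and memory use keeps growing
- Wrong prompt tokenization: a character missing from `c2ix` raises a `KeyError`, and IDs from a different vocabulary produce gibberish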
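
One way to guard against the tokenization failure is to validate the prompt before encoding it. The helper `encode_prompt` and the tiny `toy_c2ix` vocabulary below are hypothetical and only illustrate the idea; the notebook itself indexes `c2ix` directly.

```python
def encode_prompt(prompt, char_to_id):
    """Map each character to its ID, failing early with a clear message
    if the prompt contains characters the vocabulary has never seen."""
    unknown = sorted(set(prompt) - set(char_to_id))
    if unknown:
        raise ValueError(f"Characters not in vocabulary: {unknown}")
    return [char_to_id[ch] for ch in prompt]

# Tiny made-up vocabulary; the notebook builds c2ix from the novel's text.
toy_c2ix = {ch: i for i, ch in enumerate(sorted(set("you will hear")))}

print(encode_prompt("you will", toy_c2ix))  # [10, 6, 8, 0, 9, 4, 5, 5]

try:
    encode_prompt("You will", toy_c2ix)  # capital "Y" is not in the toy vocabulary
except ValueError as err:
    print(err)
```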

### Shapes During Generation

| Tensor | Shape |
|--------|-------|
| **Prompt IDs** | `[1, prompt_length]` |
| **Single char input** | `[1, 1]` |
| **Logits (last step)** | `[1, vocab_size]` |
| **States (h, c)** | `[1, 1, 96]` each, i.e. `[num_layers, batch, hidden_size]` |
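
These shapes can be confirmed with untrained layers of the same sizes the notebook uses (60 characters, embedding 48, hidden 96). The sketch below is a stand-in: the weights are random, the IDs are random integers rather than a real prompt, and `forward` is a free function that mirrors `CharacterLSTM.forward`.

```python
import torch
import torch.nn as nn

vocab_size, embedding_dim, hidden_size = 60, 48, 96

embedding = nn.Embedding(vocab_size, embedding_dim)
lstm = nn.LSTM(embedding_dim, hidden_size, batch_first=True)
fc = nn.Linear(hidden_size, vocab_size)

def forward(x, states):
    """Same computation as CharacterLSTM.forward, written as a function."""
    out, new_states = lstm(embedding(x), states)
    logits = fc(out)
    return logits.view(-1, logits.size(-1)), new_states

states = (torch.zeros(1, 1, hidden_size), torch.zeros(1, 1, hidden_size))

with torch.no_grad():
    prompt = torch.randint(0, vocab_size, (1, 24))  # stand-in prompt IDs
    prompt_logits, states = forward(prompt, states)
    single = torch.randint(0, vocab_size, (1, 1))   # one generated char
    step_logits, states = forward(single, states)

print(tuple(prompt.shape))               # (1, 24)
print(tuple(prompt_logits.shape))        # (24, 60)
print(tuple(prompt_logits[-1:].shape))   # (1, 60)
print(tuple(single.shape))               # (1, 1)
print(tuple(step_logits.shape))          # (1, 60)
print(tuple(states[0].shape), tuple(states[1].shape))  # (1, 1, 96) (1, 1, 96)
```

Because the model flattens its output, feeding a whole prompt yields one row of logits per prompt character; slicing with `[-1:]` keeps only the row that predicts the next character.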

---


## ✅ Objectives

By the end of this notebook, you should:

- [ ] Load the trained model weights
- [ ] Set the model to `eval()` mode
- [ ] Define a starting prompt: `"You will rejoice to hear"`
- [ ] Tokenize the prompt to IDs
- [ ] Initialize states for batch size = 1
- [ ] Implement generation loop to produce 500 characters
- [ ] Decode IDs back to text and print

---


## 🎓 Acceptance Criteria

**You pass this notebook when:**

✅ 500 characters of generated text print without errors  
✅ Generated text looks vaguely Frankenstein-ish (Gothic, archaic style)  
✅ You can explain the difference between greedy and temperature sampling

---


## 📝 TODO 0: Setup (Load Data, Model, Weights)

**Load vocab mappings, define model, load trained weights**


In [3]:
import torch
import torch.nn as nn

# === Load vocabulary mappings (from notebook 02) ===
with open('../datasets/frankenstein.txt', 'r', encoding='utf-8') as f:
    frankenstein = f.read()
    
first_letter_text = frankenstein[1380:8230]
tokenized_text = list(first_letter_text)
unique_char_tokens = sorted(set(tokenized_text))
c2ix = {char: idx for idx, char in enumerate(unique_char_tokens)}
ix2c = {idx: char for char, idx in c2ix.items()}
vocab_size = len(c2ix)

print(f"Vocabulary loaded: {vocab_size} unique characters")

# === Define Model (same as before) ===
class CharacterLSTM(nn.Module):
    def __init__(self, vocab_size, embedding_dim=48, hidden_size=96):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)
        self.hidden_size = hidden_size
    
    def forward(self, x, states):
        embedded = self.embedding(x)
        lstm_out, new_states = self.lstm(embedded, states)
        logits = self.fc(lstm_out)
        logits_flat = logits.view(-1, logits.size(-1))
        return logits_flat, new_states
    
    def init_state(self, batch_size):
        h0 = torch.zeros(1, batch_size, self.hidden_size)
        c0 = torch.zeros(1, batch_size, self.hidden_size)
        return (h0, c0)

# === Instantiate and load trained weights ===
model = CharacterLSTM(vocab_size)
model.load_state_dict(torch.load('../src/models/trained_lstm_model.pth'))
model.eval()  # Set to evaluation mode

print("Model loaded and set to eval() mode")


Vocabulary loaded: 60 unique characters
Model loaded and set to eval() mode


## 📝 TODO 1: Define Prompt and Tokenize

**Hint:**  
Convert the prompt string → a list of char IDs.

**Steps:**
1. Define `starting_prompt = "You will rejoice to hear"`
2. Convert to list of IDs: `[c2ix[char] for char in starting_prompt]`
3. Convert to tensor: `torch.tensor(..., dtype=torch.long).unsqueeze(0)`
   - `unsqueeze(0)` adds the batch dimension: `[prompt_length]` → `[1, prompt_length]`


In [4]:
# TODO: Define and tokenize the starting prompt
# starting_prompt = "You will rejoice to hear"
# prompt_ids = [c2ix[char] for char in starting_prompt]
# prompt_tensor = torch.tensor(prompt_ids, dtype=torch.long).unsqueeze(0)  # [1, prompt_length]

starting_prompt = "You will rejoice to hear that no disaster has accompanied the commencement of an enterprise"
prompt_ids = [c2ix[char] for char in starting_prompt]
prompt_tensor = torch.tensor(prompt_ids, dtype=torch.long).unsqueeze(0)

if starting_prompt and prompt_tensor is not None:
    print(f"Prompt: '{starting_prompt}'")
    print(f"Prompt tensor shape: {prompt_tensor.shape}")


Prompt: 'You will rejoice to hear that no disaster has accompanied the commencement of an enterprise'
Prompt tensor shape: torch.Size([1, 91])


## 📝 TODO 2: Warm Up States with Prompt

**Hint:**  
Feed the prompt through the model to initialize states.

**Steps:**
1. Initialize states: `states = model.init_state(1)`
2. With `torch.no_grad():`
3. Feed prompt: `logits, states = model(prompt_tensor, states)`
4. Get last logits: `last_logits = logits[-1:]`  (shape `[1, vocab_size]`)

**Why this step?**  
The prompt "primes" the model with context. The resulting states carry a memory of the prompt text, and the last row of logits predicts the first character to generate.


In [5]:
# TODO: Feed prompt to warm up states
# states = model.init_state(1)
# 
# with torch.no_grad():
#     logits, states = model(prompt_tensor, states)
#     last_logits = logits[-1:]  # Last time step logits

states = model.init_state(1)  # batch size 1 for generation
with torch.no_grad():
    logits, states = model(prompt_tensor, states)
    last_logits = logits[-1:]
if states and last_logits is not None:
    print(f"States warmed up. Last logits shape: {last_logits.shape}")


States warmed up. Last logits shape: torch.Size([1, 60])


## 📝 TODO 3: Generation Loop

**Hint:**  
Loop 500 times, generating one character per iteration.

**Structure:**
```python
generated_ids = []
num_generated_chars = 500

with torch.no_grad():
    for _ in range(num_generated_chars):
        # 1. Argmax to get next char ID
        next_id = torch.argmax(last_logits).item()
        generated_ids.append(next_id)
        
        # 2. Prepare next input: shape [1, 1]
        next_input = torch.tensor([[next_id]], dtype=torch.long)
        
        # 3. Forward pass
        logits, states = model(next_input, states)
        last_logits = logits[-1:]
```

**Key details:**
- `torch.argmax(last_logits)` picks most likely char
- `.item()` converts tensor to Python int
- `[[next_id]]` creates shape `[1, 1]`
- States are carried across iterations


In [6]:
# TODO: Generation loop
# generated_ids = []
# num_generated_chars = 500
# 
# with torch.no_grad():
#     for _ in range(num_generated_chars):
#         # Get next char ID (greedy sampling)
#         next_id = torch.argmax(last_logits).item()
#         generated_ids.append(next_id)
#         
#         # Prepare next input [1, 1]
#         next_input = torch.tensor([[next_id]], dtype=torch.long)
#         
#         # Forward pass
#         logits, states = model(next_input, states)
#         last_logits = logits[-1:]

generated_ids = []  # IDs of the generated characters, in order
num_generated_chars = 500

with torch.no_grad():
    for _ in range(num_generated_chars):
        next_id = torch.argmax(last_logits).item()
        generated_ids.append(next_id)
        
        next_input = torch.tensor([[next_id]], dtype=torch.long)
        
        logits, states = model(next_input, states)
        last_logits = logits[-1:]
        

if generated_ids:
    print(f"Generated {len(generated_ids)} character IDs")


Generated 500 character IDs


## 📝 TODO 4: Decode and Print Generated Text

**Hint:**  
Convert IDs back to characters using `ix2c`.

**Steps:**
1. Decode: `generated_text = ''.join(ix2c[i] for i in generated_ids)`
2. Combine with prompt: `full_text = starting_prompt + generated_text`
3. Print the result
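
Decoding is the exact inverse of tokenizing, which a round trip on a tiny stand-in corpus makes visible. The names `toy_c2ix` and `toy_ix2c` are illustrative; they are built the same way as the notebook's `c2ix` and `ix2c`.

```python
toy_text = "to hear"  # stand-in for the corpus
toy_vocab = sorted(set(toy_text))
toy_c2ix = {ch: i for i, ch in enumerate(toy_vocab)}
toy_ix2c = {i: ch for ch, i in toy_c2ix.items()}

ids = [toy_c2ix[ch] for ch in "hear to"]         # encode: characters to IDs
decoded = "".join(toy_ix2c[i] for i in ids)      # decode: IDs back to characters

print(ids)      # [3, 2, 1, 5, 0, 6, 4]
print(decoded)  # hear to
```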


In [7]:
# TODO: Decode generated IDs to text
# generated_text = ''.join(ix2c[i] for i in generated_ids)
# full_text = starting_prompt + generated_text

# print("="*80)
# print("GENERATED TEXT (Prompt + 500 chars):")
# print("="*80)
# print(full_text)
# print("="*80)

# Your code here
generated_text = ''.join(ix2c[i] for i in generated_ids)  # "i" avoids shadowing the built-in id()
full_text = starting_prompt + generated_text
print("="*80)
print("GENERATED TEXT (Prompt + 500 chars):")
print("="*80)
print(full_text)


GENERATED TEXT (Prompt + 500 chars):
You will rejoice to hear that no disaster has accompanied the commencement of an enterprise which have
been made in the prospect of arriving at the pole
to those countries, to reach welfare you and I may meet. If I succeed, my sister, I will put
some trust in preceding navigators—there snow and favourable period for one time I try undoubtedly are in the post-road between walking the
deck and remaining seated my sister, I will put
some trust in preceding navigators—there snow and favourable period for one time I try undoubtedly are in the post-road between walking the
deck and remainin


## 💭 Reflection Prompts

**Write your observations:**

1. **Generated style**: Does the generated text resemble Mary Shelley's style? (sentence structure, word choice, punctuation)

2. **Coherence**: Is the text coherent over short spans? Long spans?

3. **Repetition**: Do you see any repeated phrases or loops?

4. **Greedy vs. Sampling**: What would change if we used temperature sampling instead of argmax?

5. **Prompt influence**: How much does the starting prompt affect the generated text?

6. **Improvements**: What would make the generation better? (More data? Longer training? Larger model?)

---


## 🚀 Extensions to Try

**Want to explore further?**

1. **Temperature Sampling**:
   ```python
   # Instead of argmax:
   probs = torch.softmax(last_logits / temperature, dim=-1)
   next_id = torch.multinomial(probs, num_samples=1).item()
   ```
   - `temperature < 1`: More confident (sharper)
   - `temperature > 1`: More random (flatter)

2. **Longer Generation**: Try 1000 or 2000 characters

3. **Different Prompts**: "I beheld the wretch", "It was a dreary night"

4. **Train on Full Novel**: Remove the slice and train on entire *Frankenstein*

5. **Beam Search**: Keep the k highest-scoring partial sequences at each step instead of a single one
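
For the temperature extension (item 1), the effect of dividing the logits by a temperature can be seen on a handful of made-up scores. This standard-library sketch uses a hypothetical helper, `softmax_with_temperature`, that mirrors `torch.softmax(last_logits / temperature, dim=-1)`.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature (temperature must be > 0)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: " + ", ".join(f"{p:.2f}" for p in probs))
```

As the temperature rises, the probability of the top-scoring character falls and the others gain, so sampling becomes more varied; as it approaches zero, sampling behaves more and more like argmax.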

---


In [8]:
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torch.optim import Adam
import os

# === Load full text and build vocab from WHOLE NOVEL ===
with open('../datasets/frankenstein.txt', 'r', encoding='utf-8') as f:
    full_text = f.read()

tokenized_text = list(full_text)  # Use the entire novel, not just Letter 1!
unique_char_tokens = sorted(set(tokenized_text))
c2ix = {char: idx for idx, char in enumerate(unique_char_tokens)}
ix2c = {idx: char for char, idx in c2ix.items()}
vocab_size = len(c2ix)

tokenized_id_text = [c2ix[char] for char in tokenized_text]

print(f"Vocabulary loaded: {vocab_size} unique characters (built from the whole novel)")

# === Define Dataset ===
class TextDataset(Dataset):
    def __init__(self, tokenized_ids, seq_length):
        self.ids = tokenized_ids
        self.seq_length = seq_length
    
    def __len__(self):
        return len(self.ids) - self.seq_length
    
    def __getitem__(self, idx):
        features = self.ids[idx : idx + self.seq_length]
        labels = self.ids[idx + 1 : idx + self.seq_length + 1]
        return (
            torch.tensor(features, dtype=torch.long),
            torch.tensor(labels, dtype=torch.long)
        )

# === Define Model ===
class CharacterLSTM(nn.Module):
    def __init__(self, vocab_size, embedding_dim=48, hidden_size=96):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)
        self.hidden_size = hidden_size
    
    def forward(self, x, states):
        embedded = self.embedding(x)
        lstm_out, new_states = self.lstm(embedded, states)
        logits = self.fc(lstm_out)
        logits_flat = logits.view(-1, logits.size(-1))
        return logits_flat, new_states
    
    def init_state(self, batch_size):
        h0 = torch.zeros(1, batch_size, self.hidden_size)
        c0 = torch.zeros(1, batch_size, self.hidden_size)
        return (h0, c0)

# === Create Dataset & DataLoader ===
dataset = TextDataset(tokenized_id_text, seq_length=48)
dataloader = DataLoader(dataset, batch_size=36, shuffle=True)

print(f"Dataset: {len(dataset)} samples")
print(f"DataLoader: {len(dataloader)} batches per epoch")

# === Instantiate model, loss, optimizer ===
char_model = CharacterLSTM(vocab_size)
criterion = nn.CrossEntropyLoss()
optimizer = Adam(char_model.parameters(), lr=0.015)
num_epochs = 10

# === Training Loop ===
for epoch in range(num_epochs):
    char_model.train()
    epoch_loss = 0
    for batch_features, batch_labels in dataloader:
        batch_size = batch_features.size(0)
        optimizer.zero_grad()
        states = char_model.init_state(batch_size)
        logits, new_states = char_model(batch_features, states)
        labels_flat = batch_labels.view(-1)
        loss = criterion(logits, labels_flat)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    avg_loss = epoch_loss / len(dataloader)
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {avg_loss:.4f}")
print("\nTraining complete!")

# === Save the trained model ===
save_path = '../src/models/new_trained_lstm_model.pth'
os.makedirs(os.path.dirname(save_path), exist_ok=True)
torch.save(char_model.state_dict(), save_path)
print(f"Model saved to {save_path}")

# === Generation with temperature sampling ===
char_model.eval()

starting_prompt = "You will rejoice to hear that no disaster has accompanied the commencement of an enterprise"
prompt_ids = [c2ix[char] for char in starting_prompt]
prompt_tensor = torch.tensor(prompt_ids, dtype=torch.long).unsqueeze(0)  # shape: (1, prompt_len)

if starting_prompt and prompt_tensor is not None:
    print(f"Prompt: '{starting_prompt}'")
    print(f"Prompt tensor shape: {prompt_tensor.shape}")

temperature = 0.8  # RECOMMENDED: 0.8 for creativity while retaining coherence
num_generated_chars = 2000  # Generate 2,000 characters for more realism

# Warm up states by running prompt through the model
states = char_model.init_state(1)  # batch_size=1 for generation
with torch.no_grad():
    logits, states = char_model(prompt_tensor, states)
    last_logits = logits[-1:]

if states is not None and last_logits is not None:
    print(f"States warmed up. Last logits shape: {last_logits.shape}")

generated_ids = []

with torch.no_grad():
    for _ in range(num_generated_chars):
        # Temperature sampling instead of argmax!
        probs = torch.softmax(last_logits / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1).item()
        generated_ids.append(next_id)

        # Next input needs to be shape (1, 1)
        next_input = torch.tensor([[next_id]], dtype=torch.long)
        logits, states = char_model(next_input, states)
        last_logits = logits[-1:]

if generated_ids:
    print(f"Generated {len(generated_ids)} character IDs")

generated_text = ''.join(ix2c[i] for i in generated_ids)
full_generated = starting_prompt + generated_text
print("="*80)
print("GENERATED TEXT (Prompt + 2000 chars):")
print("="*80)
print(full_generated)




Vocabulary loaded: 93 unique characters (built from the whole novel)
Dataset: 438762 samples
DataLoader: 12188 batches per epoch
Epoch 1/10, Loss: 1.5117
Epoch 2/10, Loss: 1.5146
Epoch 3/10, Loss: 1.6102
Epoch 4/10, Loss: 1.6288
Epoch 5/10, Loss: 1.6370


KeyboardInterrupt: 

# ⚠️ TRAINING ISSUE FIXES NEEDED

## Problems Detected:
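1. **Learning rate too high**: 0.015 is probably too aggressive for the full novel (about 438K samples, 12,188 optimizer steps per epoch)
2. **Vocab mismatch**: The saved checkpoint was trained with vocab_size=60, but the full novel has vocab_size=93, so the old weights and the old `model` cannot be used with the rebuilt `c2ix`
3. **Loss increasing**: The average loss rises from 1.5117 to 1.6370 over five epochs, so the model is diverging instead of converging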
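
A rising loss can be caught automatically instead of by eye. The check below is a minimal sketch: `is_diverging` and its `patience` parameter are hypothetical names, and the loss values are the ones printed by the interrupted run above.

```python
def is_diverging(epoch_losses, patience=2):
    """Return True if the loss rose for `patience` consecutive epochs."""
    rises = 0
    for prev, curr in zip(epoch_losses, epoch_losses[1:]):
        rises = rises + 1 if curr > prev else 0
        if rises >= patience:
            return True
    return False

# Average losses printed by the run above (epochs 1 to 5).
logged = [1.5117, 1.5146, 1.6102, 1.6288, 1.6370]
print(is_diverging(logged))           # True
print(is_diverging([2.1, 1.8, 1.6]))  # False
```

Calling such a check at the end of every epoch would have stopped this run after epoch 3 rather than waiting for a manual interrupt.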

## Solutions Below ↓


In [None]:
# FIXED TRAINING CELL - Use this instead!
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torch.optim import Adam
import os

# === Load FULL text and build vocab ===
with open('../datasets/frankenstein.txt', 'r', encoding='utf-8') as f:
    full_text = f.read()

tokenized_text = list(full_text)
unique_char_tokens = sorted(set(tokenized_text))
c2ix = {char: idx for idx, char in enumerate(unique_char_tokens)}
ix2c = {idx: char for char, idx in c2ix.items()}
vocab_size = len(c2ix)

tokenized_id_text = [c2ix[char] for char in tokenized_text]

print(f"Vocabulary: {vocab_size} unique characters")
print(f"Text length: {len(tokenized_text)} characters")

# === Dataset ===
class TextDataset(Dataset):
    def __init__(self, tokenized_ids, seq_length):
        self.ids = tokenized_ids
        self.seq_length = seq_length
    
    def __len__(self):
        return len(self.ids) - self.seq_length
    
    def __getitem__(self, idx):
        features = self.ids[idx : idx + self.seq_length]
        labels = self.ids[idx + 1 : idx + self.seq_length + 1]
        return (
            torch.tensor(features, dtype=torch.long),
            torch.tensor(labels, dtype=torch.long)
        )

# === Model ===
class CharacterLSTM(nn.Module):
    def __init__(self, vocab_size, embedding_dim=48, hidden_size=96):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)
        self.hidden_size = hidden_size
    
    def forward(self, x, states):
        embedded = self.embedding(x)
        lstm_out, new_states = self.lstm(embedded, states)
        logits = self.fc(lstm_out)
        logits_flat = logits.view(-1, logits.size(-1))
        return logits_flat, new_states
    
    def init_state(self, batch_size):
        h0 = torch.zeros(1, batch_size, self.hidden_size)
        c0 = torch.zeros(1, batch_size, self.hidden_size)
        return (h0, c0)

# === Dataset & DataLoader ===
dataset = TextDataset(tokenized_id_text, seq_length=48)
dataloader = DataLoader(dataset, batch_size=36, shuffle=True)

print(f"Dataset: {len(dataset)} samples")
print(f"Batches per epoch: {len(dataloader)}")

# === Model, Loss, Optimizer ===
char_model = CharacterLSTM(vocab_size)
print(f"Model parameters: {sum(p.numel() for p in char_model.parameters()):,}")

# CRITICAL FIX: Lower learning rate for larger dataset
criterion = nn.CrossEntropyLoss()
optimizer = Adam(char_model.parameters(), lr=0.003)  # ← Reduced from 0.015 to 0.003!

num_epochs = 10

# === Training ===
print("\nStarting training...")
for epoch in range(num_epochs):
    char_model.train()
    epoch_loss = 0
    
    for batch_features, batch_labels in dataloader:
        batch_size = batch_features.size(0)
        optimizer.zero_grad()
        states = char_model.init_state(batch_size)
        logits, new_states = char_model(batch_features, states)
        labels_flat = batch_labels.view(-1)
        loss = criterion(logits, labels_flat)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    
    avg_loss = epoch_loss / len(dataloader)
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {avg_loss:.4f}")

print("\nTraining complete!")

# Save
save_path = '../src/models/trained_lstm_full_novel.pth'
os.makedirs(os.path.dirname(save_path), exist_ok=True)
torch.save(char_model.state_dict(), save_path)
print(f"Model saved to {save_path}")


Vocabulary: 93 unique characters
Text length: 438810 characters
Dataset: 438762 samples
Batches per epoch: 12188
Model parameters: 69,549

Starting training...


## 📊 Accuracy Evaluation

**Test the model on known text continuations to measure accuracy**

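This measures **character-level accuracy**: the percentage of positions at which the generated character equals the character in the actual continuation. The comparison is positional, so one extra or missing character shifts everything after it, and with temperature sampling the score changes from run to run.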
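
The metric itself needs no model. The sketch below applies the same position-by-position comparison to hand-written strings; `char_accuracy` is a hypothetical stand-alone version of the calculation.

```python
def char_accuracy(generated, expected):
    """Percentage of positions where the two strings hold the same character."""
    if not expected:
        return 0.0
    matches = sum(g == e for g, e in zip(generated, expected))
    return 100.0 * matches / len(expected)

expected = "fills me with delight"
exact = char_accuracy("fills me with delight", expected)
one_wrong = char_accuracy("fills my with delight", expected)
one_dropped = char_accuracy("fill me with delight.", expected)

print(f"{exact:.2f}")        # 100.00
print(f"{one_wrong:.2f}")    # 95.24
print(f"{one_dropped:.2f}")  # 19.05
```

A single substituted character costs one position, but a single dropped character misaligns the rest of the string and the score collapses, so low numbers from this metric do not necessarily mean the generated text is poor.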


In [None]:
# Test prompts and their expected continuations from Frankenstein
test_prompts = [
    "I am already far north of London, and as I walk in the streets of Petersburgh, I feel a cold northern breeze play upon my cheeks, which braces my nerves and fills",
    "These reflections have dispelled the agitation with which I began my letter, and I feel my heart glow with an enthusiasm which elevates me to heaven",
    "These visions faded when I perused, for the first time, those poets whose effusions entranced my soul"
]

expected_continuations = [
    " me with delight. Do you understand this feeling? This breeze, which has travelled from the regions towards which I am advancing, gives me a foretaste of those icy climes.",
    " for nothing contributes so much to tranquillise the mind as a steady purpose—a point on which the soul may fix its intellectual eye.",
    " and lifted it to heaven. I also became a poet and for one year lived in a paradise of my own creation;"
]

print(f"Loaded {len(test_prompts)} test prompts")
print(f"Test 1 length: {len(expected_continuations[0])} chars")
print(f"Test 2 length: {len(expected_continuations[1])} chars")
print(f"Test 3 length: {len(expected_continuations[2])} chars")


In [None]:
# Evaluate accuracy for each prompt
def evaluate_accuracy(model, prompt, expected_text, temperature=0.8):
    """
    Generate text from prompt and calculate character-level accuracy.
    
    Args:
        model: Trained LSTM model
        prompt: Starting text
        expected_text: Ground truth continuation
        temperature: Sampling temperature
    
    Returns:
        accuracy: Percentage of correct characters
        generated_text: Model's generation
    """
    model.eval()
    
    # Tokenize prompt
    prompt_ids = [c2ix[char] for char in prompt]
    prompt_tensor = torch.tensor(prompt_ids, dtype=torch.long).unsqueeze(0)
    
    # Initialize states and warm up with prompt
    states = model.init_state(1)
    with torch.no_grad():
        logits, states = model(prompt_tensor, states)
        last_logits = logits[-1:]
    
    # Generate as many characters as the expected continuation contains
    generated_ids = []
    with torch.no_grad():
        for _ in range(len(expected_text)):
            # Temperature sampling
            probs = torch.softmax(last_logits / temperature, dim=-1)
            next_id = torch.multinomial(probs, num_samples=1).item()
            generated_ids.append(next_id)
            
            next_input = torch.tensor([[next_id]], dtype=torch.long)
            logits, states = model(next_input, states)
            last_logits = logits[-1:]
    
    # Decode generated text
    generated_text = ''.join(ix2c[i] for i in generated_ids)
    
    # Calculate accuracy
    correct = sum(1 for g, e in zip(generated_text, expected_text) if g == e)
    accuracy = (correct / len(expected_text)) * 100 if len(expected_text) > 0 else 0
    
    return accuracy, generated_text

print("Accuracy evaluation function defined")


In [None]:
# Run evaluation on all test prompts
results = []

for i, (prompt, expected) in enumerate(zip(test_prompts, expected_continuations), 1):
    print(f"\n{'='*80}")
    print(f"TEST {i}: Evaluating prompt...")
    print(f"{'='*80}")
    
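    # Use char_model: it was trained with the rebuilt 93-character vocabulary that
    # c2ix now holds. The earlier `model` expects the 60-character vocabulary.
    accuracy, generated = evaluate_accuracy(char_model, prompt, expected, temperature=0.8)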
    results.append({
        'prompt_num': i,
        'accuracy': accuracy,
        'prompt': prompt[:50] + "...",
        'expected_length': len(expected),
        'generated_length': len(generated)
    })
    
    print(f"\nPrompt: {prompt[:80]}...")
    print(f"\nExpected ({len(expected)} chars):")
    print(f"  {expected[:100]}...")
    print(f"\nGenerated ({len(generated)} chars):")
    print(f"  {generated[:100]}...")
    print(f"\n✅ Accuracy: {accuracy:.2f}%")

print(f"\n{'='*80}")
print("SUMMARY")
print(f"{'='*80}")
for r in results:
    print(f"Test {r['prompt_num']}: {r['accuracy']:.2f}% accuracy")

avg_accuracy = sum(r['accuracy'] for r in results) / len(results)
print(f"\n🎯 Average Accuracy: {avg_accuracy:.2f}%")


## 📌 Key Takeaways

- ✅ Generation is autoregressive: each char depends on previous chars
- ✅ `model.eval()` and `torch.no_grad()` are essential for inference
- ✅ Greedy sampling (argmax) is simple but can be repetitive
- ✅ States are carried across generation steps to maintain context
- ✅ The prompt "primes" the model with initial context
- ✅ Decoding: IDs → characters using `ix2c`

---

## 🎉 Congratulations!

You've completed the full pipeline:
1. ✅ Loaded and sliced text data
2. ✅ Built character vocabulary
3. ✅ Created Dataset and DataLoader
4. ✅ Defined LSTM architecture
5. ✅ Trained the model
6. ✅ Generated new text

**Next:** Document your learnings in **Notebook 99 (Lab Notes)**!

---

*This is honest work. Now go forth and generate!* 🚀
