<a href="https://colab.research.google.com/github/profliuhao/CSIT599/blob/main/CSIT599_Module4_Text_Generation_v1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Character-Level LSTM Text Generation - Shakespeare Style

## Exercise

Instructions for Students:

Fill in every blank (`___________`) to complete the code. Most blanks are marked with "# TODO: Student fills this".

You will need to specify:
1. Sequence length for training, and the slices that build each input/target pair
2. LSTM architecture parameters
3. Model compilation settings
4. Training parameters
5. The sampling function and the text generation temperatures

The model will learn to generate text character-by-character in Shakespeare's style!


In [None]:
import numpy as np
import matplotlib.pyplot as plt
import requests
import random
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.utils import to_categorical
import sys
import re

# Set random seeds for reproducibility
np.random.seed(42)
import tensorflow as tf
tf.random.set_seed(42)
random.seed(42)

print("TensorFlow version:", tf.__version__)
print(tf.config.list_physical_devices('GPU'))

In [None]:
# ============================================================================
# STEP 1: DOWNLOAD AND LOAD THE CORPUS
# ============================================================================

print("\n" + "="*70)
print("STEP 1: Downloading Shakespeare Corpus")
print("="*70)

def download_shakespeare():
    """
    Download the Tiny Shakespeare corpus from the karpathy/char-rnn repository.
    It is a roughly 1 MB public domain subset of Shakespeare's works, well suited to text generation exercises.
    """
    print("Downloading the Tiny Shakespeare corpus...")

    # URL to the Tiny Shakespeare corpus (public domain)
    url = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"

    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        text = response.text
        print(f"✓ Download successful!")
        print(f"✓ Corpus size: {len(text):,} characters")
        return text
    except Exception as e:
        print(f"Error downloading: {e}")
        print("Using a fallback sample text...")
        # Fallback sample if download fails
        return """To be, or not to be, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles
And by opposing end them. To die—to sleep,
No more; and by a sleep to say we end
The heart-ache and the thousand natural shocks
That flesh is heir to: 'tis a consummation
Devoutly to be wish'd.""" * 100

# Download the corpus
raw_text = download_shakespeare()

# Show a preview
print("\nCorpus preview (first 500 characters):")
print("-" * 70)
print(raw_text[:500])
print("-" * 70)

In [None]:
# ============================================================================
# STEP 2: PREPROCESS THE TEXT
# ============================================================================

print("\n" + "="*70)
print("STEP 2: Preprocessing Text Data")
print("="*70)

# Convert to lowercase for simplicity (optional - can keep original case)
text = raw_text.lower()
# text = text.replace('\n', ' ')    # Remove newlines
# text = text.replace('\r', ' ')    # Remove carriage returns
# text = text.replace('\t', ' ')    # Remove tabs
# text = re.sub(r' +', ' ', text)  # Collapse multiple spaces

# Get all unique characters in the text
chars = sorted(list(set(text)))
n_chars = len(chars)

print(f"Total characters in corpus: {len(text):,}")
print(f"Unique characters: {n_chars}")
print(f"Characters: {chars[:50]}...")  # Show first 50 characters

# Create character-to-integer mappings
char_to_int = {c: i for i, c in enumerate(chars)}
int_to_char = {i: c for i, c in enumerate(chars)}

print(f"\nExample mappings:")
print(f"  '{chars[0]}' -> {char_to_int[chars[0]]}")
print(f"  '{chars[1]}' -> {char_to_int[chars[1]]}")
print(f"  {char_to_int[chars[0]]} -> '{int_to_char[char_to_int[chars[0]]]}'")
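
The two lookup tables above are inverses of each other, which is what lets the model's integer predictions be turned back into text. A minimal sketch of that round trip on a toy string follows; `toy_text`, `encode`, and `decode` are hypothetical names used only for this illustration and are not part of the exercise code.

```python
# Toy corpus (hypothetical; the notebook uses the downloaded Shakespeare text).
toy_text = "to be or not to be"

# Same construction as in Step 2: sorted unique characters, then two lookup tables.
toy_chars = sorted(set(toy_text))
toy_char_to_int = {c: i for i, c in enumerate(toy_chars)}
toy_int_to_char = {i: c for i, c in enumerate(toy_chars)}


def encode(s):
    """Map a string to a list of integer codes."""
    return [toy_char_to_int[c] for c in s]


def decode(codes):
    """Map a list of integer codes back to a string."""
    return "".join(toy_int_to_char[i] for i in codes)


codes = encode("not to be")
print(toy_chars)
print(codes)
print(decode(codes))
```

Because the character list is sorted, the mapping is the same on every run, so a saved model can be reused as long as the corpus and preprocessing do not change.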

In [None]:
# ============================================================================
# STEP 3: CREATE TRAINING SEQUENCES
# ============================================================================

print("\n" + "="*70)
print("STEP 3: Creating Training Sequences")
print("="*70)

# TODO: Student fills this - Define sequence length
# This is how many characters the model looks at to predict the next character
# Hint: Try values like 40, 60, or 100
# Longer sequences = more context but slower training
seq_length = ___________

print(f"Sequence length: {seq_length}")

# Prepare the dataset
# We'll create overlapping sequences of length seq_length
# For each sequence, the target is the next character

X_data = []  # Input sequences (encoded as integers)
y_data = []  # Target characters (next character after each sequence)

print("Creating sequences...")
for i in range(len(text) - seq_length):
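    # TODO: Student fills this - Slice out the input window of seq_length characters
    seq_in = text[___________]
    # TODO: Student fills this - Index the single character that follows the window
    seq_out = text[___________]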

    # Convert characters to integers
    X_data.append([char_to_int[char] for char in seq_in])
    y_data.append(char_to_int[seq_out])

n_patterns = len(X_data)
print(f"✓ Total training sequences: {n_patterns:,}")

# Example of what we created
print(f"\nExample sequence:")
print(f"  Input:  '{text[0:seq_length]}'")
print(f"  Target: '{text[seq_length]}'")

# Convert to numpy arrays
X = np.array(X_data)
y = np.array(y_data)

print(f"\nData shapes before normalization:")
print(f"  X shape: {X.shape} (samples, sequence_length)")
print(f"  y shape: {y.shape} (samples,)")
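
To see what "overlapping sequences" means in practice, it can help to run the same idea on a string short enough to read in full. The sketch below is an illustration under that assumption; `make_windows` is a hypothetical helper, not a function used elsewhere in the notebook.

```python
def make_windows(text, seq_length):
    """Return (input, target) pairs: each input is seq_length characters,
    and the target is the single character that follows it."""
    pairs = []
    for i in range(len(text) - seq_length):
        window = text[i:i + seq_length]
        next_char = text[i + seq_length]
        pairs.append((window, next_char))
    return pairs


pairs = make_windows("hello world", seq_length=4)
for window, next_char in pairs[:3]:
    print(repr(window), "->", repr(next_char))
print(len(pairs))
```

Each window starts one character after the previous one, so a corpus of N characters yields N minus `seq_length` training pairs.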

In [None]:
# ============================================================================
# STEP 4: NORMALIZE AND RESHAPE DATA
# ============================================================================

print("\n" + "="*70)
print("STEP 4: Normalizing and Reshaping Data")
print("="*70)

# Normalize input to 0-1 range
# This helps with training stability
X = X / float(n_chars)

# Reshape X to be [samples, time steps, features]
# LSTM expects 3D input: (batch_size, sequence_length, input_dim)
X = np.reshape(X, (X.shape[0], X.shape[1], 1))

# Convert target to one-hot encoding
# Each character becomes a binary vector of length n_chars
y = to_categorical(y, num_classes=n_chars)

print(f"Data shapes after processing:")
print(f"  X shape: {X.shape} (samples, sequence_length, features)")
print(f"  y shape: {y.shape} (samples, num_classes)")
print(f"\nThe model will predict a probability distribution over {n_chars} possible characters")
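
The shape changes in this step are easier to follow on a tiny array. The sketch below repeats the scaling, reshaping, and one-hot steps with NumPy only; indexing an identity matrix stands in for `to_categorical` here, and `n_symbols`, `X_toy`, and `y_toy` are hypothetical values chosen for the illustration.

```python
import numpy as np

n_symbols = 5                      # hypothetical vocabulary size
X_toy = np.array([[0, 1, 2],       # 2 samples, 3 time steps each
                  [2, 3, 4]])
y_toy = np.array([3, 0])           # integer code of each target character

# Scale codes into [0, 1) and add a trailing feature axis of size 1.
X_scaled = (X_toy / float(n_symbols)).reshape(X_toy.shape[0], X_toy.shape[1], 1)

# One-hot encode the targets by picking rows of an identity matrix.
y_onehot = np.eye(n_symbols)[y_toy]

print(X_scaled.shape)
print(y_onehot.shape)
print(y_onehot[0])
```

The input keeps one number per time step, while each target becomes a vector with a single 1 at the position of the correct character.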

In [None]:
# ============================================================================
# STEP 5: BUILD THE LSTM MODEL
# ============================================================================

print("\n" + "="*70)
print("STEP 5: Building Character-Level LSTM Model")
print("="*70)

def build_lstm_model():
    """
    Build an LSTM model for character-level text generation.

    Architecture:
    1. Input layer: Defines input shape
    2. LSTM layer(s): Learn sequential patterns in text
    3. Dropout layer(s): Prevent overfitting
    4. Dense output layer: Predict next character

    The model learns to predict the probability distribution over all
    possible characters given a sequence of previous characters.
    """
    model = Sequential([
        # Input layer: shape is (seq_length, 1)
        # 1 feature because each time step holds a single scaled character code
        Input(shape=(seq_length, 1)),

        # TODO: Student fills this - Add first LSTM layer
        # Hint: Use 128 or 256 units for good performance
        # Set return_sequences=True if you want to stack another LSTM layer
        # Set return_sequences=False if this is the last LSTM layer
        LSTM(units=___________, return_sequences=___________),

        # Dropout to prevent overfitting
        Dropout(0.2),

        # TODO: Student fills this (OPTIONAL) - Add second LSTM layer
        # Hint: If you added return_sequences=True above, uncomment this
        # Use same or fewer units (128 or 256)
        # This layer should have return_sequences=False
        # LSTM(units=___________, return_sequences=False),
        # Dropout(0.2),

        # TODO: Student fills this - Add Dense output layer
        # Hint: Output size should be n_chars (number of unique characters)
        # What activation function gives us a probability distribution?
        Dense(n_chars, activation='___________')
    ])

    return model

# Build the model
model = build_lstm_model()

# Display model architecture
print("\nLSTM Text Generation Model:")
model.summary()

print(f"\nModel insights:")
print(f"  - Input: sequences of {seq_length} characters")
print(f"  - Output: probability distribution over {n_chars} characters")
print(f"  - Training will teach the model Shakespeare's writing patterns")
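
The parameter counts reported by `model.summary()` can be checked by hand. The sketch below assumes a single standard LSTM layer followed by the Dense output layer (Dropout adds no weights); `vocab_size = 39` is a hypothetical value, so substitute your own `n_chars` and unit count.

```python
def lstm_params(units, input_dim):
    """Trainable weights in a standard LSTM layer: 4 gates, each with
    input weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(inputs, outputs):
    """Weights plus biases of a fully connected layer."""
    return inputs * outputs + outputs


units = 256        # one of the sizes suggested in the hints
input_dim = 1      # one scaled character code per time step
vocab_size = 39    # hypothetical value; use your own n_chars

total = lstm_params(units, input_dim) + dense_params(units, vocab_size)
print(lstm_params(units, input_dim))
print(dense_params(units, vocab_size))
print(total)
```

Almost all of the weights sit in the recurrent connections, which is why doubling the number of units roughly quadruples the size of the LSTM layer.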

In [None]:
# ============================================================================
# STEP 6: COMPILE THE MODEL
# ============================================================================

print("\n" + "="*70)
print("STEP 6: Compiling Model")
print("="*70)

# TODO: Student fills these - Specify loss and optimizer
# Hints:
# - Loss: We're doing multi-class classification (use 'categorical_crossentropy')
# - Optimizer: 'adam' works well, or try 'rmsprop'

loss_function = '___________'
optimizer_name = '___________'

print(f"Loss function: {loss_function}")
print(f"Optimizer: {optimizer_name}")

# IMPORTANT: Gradient clipping prevents exploding gradients in RNNs
# This is crucial for stable training!
if optimizer_name.lower() == 'adam':
    from tensorflow.keras.optimizers import Adam
    optimizer = Adam(learning_rate=0.002, clipnorm=1.0)
    print("Using Adam optimizer with gradient clipping (clipnorm=1.0)")
elif optimizer_name.lower() == 'rmsprop':
    from tensorflow.keras.optimizers import RMSprop
    optimizer = RMSprop(learning_rate=0.002, clipnorm=1.0)
    print("Using RMSprop optimizer with gradient clipping (clipnorm=1.0)")
else:
    optimizer = optimizer_name
    print("Note: Consider using gradient clipping for stable RNN training!")

model.compile(
    loss=loss_function,
    optimizer=optimizer,
    metrics=['accuracy']
)

print("✓ Model compiled successfully")
print("\n💡 Key Training Settings:")
print("  - Gradient clipping (clipnorm=1.0) is enabled for 'adam' and 'rmsprop'")
print("  - This limits exploding gradients, which are common in RNNs")
print("  - Initial learning rate: 0.002 for 'adam' and 'rmsprop' (reduced automatically if the loss plateaus)")
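
Clipping by norm rescales a gradient whose length exceeds the threshold and leaves smaller gradients alone. The sketch below shows that rule on a plain Python list; it illustrates the idea only, since Keras applies `clipnorm` to each weight's gradient tensor internally, and `clip_by_norm` is a hypothetical helper.

```python
import math


def clip_by_norm(grad, clipnorm):
    """Rescale grad so its L2 norm is at most clipnorm.
    Gradients that are already small enough are returned unchanged."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= clipnorm:
        return list(grad)
    scale = clipnorm / norm
    return [g * scale for g in grad]


big = [3.0, 4.0]        # L2 norm 5.0, above the threshold
small = [0.3, 0.4]      # L2 norm 0.5, below the threshold

print(clip_by_norm(big, 1.0))
print(clip_by_norm(small, 1.0))
```

The direction of the gradient is preserved; only its length is capped, so a single unusually large step cannot throw the weights far off course.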

In [None]:
# ============================================================================
# STEP 7: SET UP CALLBACKS (Including Text Generation During Training)
# ============================================================================

print("\n" + "="*70)
print("STEP 7: Setting Up Training Callbacks")
print("="*70)

# Custom callback to generate text at specific epochs
class TextGenerationCallback(tf.keras.callbacks.Callback):
    """
    Custom callback to generate text during training at specific epochs.
    This lets us see how generation quality improves over time!
    """
    def __init__(self, seed_text, generate_at_epochs=(1, 2, 5, 10, 15, 20)):
        super().__init__()
        self.seed_text = seed_text
        # Copy into a list so instances never share a mutable default
        self.generate_at_epochs = list(generate_at_epochs)
        self.generation_history = {}

    def on_epoch_end(self, epoch, logs=None):
        """Generate text after specific epochs to show progress"""
        current_epoch = epoch + 1  # Keras uses 0-indexing

        if current_epoch in self.generate_at_epochs:
            print(f"\n{'='*70}")
            print(f"🎭 Text Generation Preview at Epoch {current_epoch}")
            print(f"{'='*70}")

            # Prepare seed
            seed = self.seed_text[-seq_length:] if len(self.seed_text) >= seq_length else self.seed_text
            generated = seed

            # Generate 200 characters
            for i in range(200):
                x_pred = np.zeros((1, seq_length, 1))
                for t, char in enumerate(generated[-seq_length:]):
                    if char in char_to_int:
                        x_pred[0, t, 0] = char_to_int[char] / float(n_chars)

                predictions = self.model.predict(x_pred, verbose=0)[0]

                # Use temperature = 0.2 for more deterministic generation
                predictions = np.asarray(predictions).astype('float64')
                predictions = np.log(predictions + 1e-10) / 0.2
                exp_preds = np.exp(predictions)
                predictions = exp_preds / np.sum(exp_preds)
                next_index = np.argmax(np.random.multinomial(1, predictions, 1))

                next_char = int_to_char[next_index]
                generated += next_char

            # Store and display
            self.generation_history[current_epoch] = generated
            print(f"Seed: '{seed}'")
            print(f"\nGenerated text:")
            print("-" * 70)
            print(generated)
            print("-" * 70)

# Initialize callbacks
print("Setting up training callbacks...")

# Choose a seed text for generation during training
generation_seed = text[1000:1000+seq_length]

# Text generation callback - shows progress during training!
text_gen_callback = TextGenerationCallback(
    seed_text=generation_seed,
    generate_at_epochs=[1, 2, 5, 10, 15, 20]
)

# Checkpoint: Save the best model during training
checkpoint = ModelCheckpoint(
    'best_model.keras',
    monitor='loss',
    verbose=1,
    save_best_only=True,
    mode='min'
)

# Reduce learning rate when loss plateaus
reduce_lr = ReduceLROnPlateau(
    monitor='loss',
    factor=0.5,
    patience=2,
    min_lr=0.000001,
    verbose=1
)

callbacks = [text_gen_callback, checkpoint, reduce_lr]
print("✓ Callbacks configured:")
print("  - Text generation: Shows progress at epochs 1, 2, 5, 10, 15, 20")
print("  - Model checkpoint: Saves best model")
print("  - Learning rate reduction: Adapts learning rate if loss plateaus")
print(f"\n📝 Will generate preview text at epochs: {text_gen_callback.generate_at_epochs}")
print("\n⚠️  Training Tips:")
print("  - If loss stops improving for 2 epochs, the learning rate is halved automatically")
print("  - Gradient clipping (clipnorm=1.0) limits exploding gradients")
print("  - Loss should steadily decrease; if it doesn't, stop and check settings")
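
The learning rate reduction can be traced by hand with the settings used above (`factor=0.5`, `patience=2`, `min_lr=0.000001`). The sketch below is a simplified simulation, not the Keras implementation: it ignores the callback's `min_delta` and `cooldown` options, and the loss values are hypothetical.

```python
def simulate_reduce_lr(losses, lr=0.002, factor=0.5, patience=2, min_lr=1e-6):
    """Return the learning rate in effect after each epoch under a
    simplified plateau rule: reduce after `patience` epochs without a new best loss."""
    best = float("inf")
    wait = 0
    schedule = []
    for loss in losses:
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule


losses = [2.0, 1.8, 1.8, 1.8, 1.7, 1.7, 1.7]   # hypothetical per-epoch losses
print(simulate_reduce_lr(losses))
```

In this trace the rate is halved twice, each time after two consecutive epochs without a new best loss.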

In [None]:
# ============================================================================
# STEP 8: TRAIN THE MODEL
# ============================================================================

print("\n" + "="*70)
print("STEP 8: Training the Model")
print("="*70)

# TODO: Student fills these - Set training hyperparameters
# Hints:
# - batch_size: Try 128, 256, or 512 (larger is faster but uses more memory)
# - epochs: Text generation needs more epochs, try 20-50
#   (Note: We'll use fewer for the exercise to save time)

batch_size = ___________
epochs = ___________

print(f"Batch size: {batch_size}")
print(f"Epochs: {epochs}")
print(f"Total training samples: {len(X):,}")
print(f"Steps per epoch: {int(np.ceil(len(X) / batch_size))}")

print("\n" + "-"*70)
print("Starting training... (This may take a while)")
print("-"*70)

import time
start_time = time.time()

history = model.fit(
    X, y,
    batch_size=batch_size,
    epochs=epochs,
    callbacks=callbacks,
    verbose=1
)

training_time = time.time() - start_time

print(f"\n✓ Training completed in {training_time:.2f} seconds ({training_time/60:.2f} minutes)")
print(f"✓ Final loss: {history.history['loss'][-1]:.4f}")
print(f"✓ Final accuracy: {history.history['accuracy'][-1]:.4f}")
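
Batch size and epoch count together determine how many weight updates the model receives. A small worked example follows, assuming Keras processes a final partial batch so the step count is rounded up; `n_samples` is a hypothetical figure, not the size of your dataset.

```python
import math

n_samples = 1_000_000    # hypothetical number of training sequences
batch_size = 128         # one of the sizes suggested in the hints
epochs = 20

steps_per_epoch = math.ceil(n_samples / batch_size)   # the last batch may be smaller
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
```

Doubling the batch size halves the number of updates per epoch, which is one reason larger batches often need more epochs to reach the same loss.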

In [None]:
# ============================================================================
# STEP 8A: REVIEW GENERATION PROGRESS DURING TRAINING
# ============================================================================

print("\n" + "="*70)
print("STEP 8A: Reviewing Text Generation Progress")
print("="*70)

print("\nLet's see how text generation quality improved during training!\n")

# Save generation history to file
with open('generation_progress.txt', 'w') as f:
    f.write("Text Generation Progress During Training\n")
    f.write("=" * 70 + "\n")
    f.write(f"Seed text: '{generation_seed}'\n")
    f.write("Temperature: 0.2 (fixed in the callback for consistent comparison)\n")
    f.write("=" * 70 + "\n\n")

    for epoch_num in sorted(text_gen_callback.generation_history.keys()):
        generated_text = text_gen_callback.generation_history[epoch_num]
        f.write(f"\n{'='*70}\n")
        f.write(f"EPOCH {epoch_num}\n")
        f.write(f"{'='*70}\n")
        f.write(generated_text + "\n")

print("✓ Generation progress saved to 'generation_progress.txt'")

# Display a nice comparison
print("\n" + "="*70)
print("GENERATION QUALITY COMPARISON")
print("="*70)

for epoch_num in sorted(text_gen_callback.generation_history.keys()):
    generated_text = text_gen_callback.generation_history[epoch_num]
    # Show first 150 characters
    preview = generated_text[:150] + "..." if len(generated_text) > 150 else generated_text

    print(f"\n📅 Epoch {epoch_num}:")
    print("-" * 70)
    print(preview)

print("\n" + "="*70)
print("OBSERVATION: Notice how the text becomes more coherent as training progresses!")
print("Early epochs: Random characters or repeated patterns")
print("Later epochs: Recognizable words, better grammar, Shakespeare-like style")
print("="*70)

In [None]:
# ============================================================================
# STEP 9: VISUALIZE TRAINING HISTORY
# ============================================================================

print("\n" + "="*70)
print("STEP 9: Visualizing Training Progress")
print("="*70)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Loss over epochs
axes[0].plot(history.history['loss'], marker='o', linewidth=2)
axes[0].set_title('Model Loss During Training', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss (Categorical Cross-Entropy)')
axes[0].grid(True, alpha=0.3)

# Plot 2: Accuracy over epochs
axes[1].plot(history.history['accuracy'], marker='o', linewidth=2, color='green')
axes[1].set_title('Model Accuracy During Training', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('training_history.png', dpi=300, bbox_inches='tight')
print("✓ Training history plot saved")
plt.show()

In [None]:
# ============================================================================
# STEP 10: TEXT GENERATION FUNCTIONS
# ============================================================================

print("\n" + "="*70)
print("STEP 10: Defining Text Generation Functions")
print("="*70)

def sample_with_temperature(predictions, temperature=1.0):
    """
    Sample a character from a probability distribution with temperature.

    Temperature controls randomness:
    - temperature = 1.0: Use the model's predicted probabilities as-is
    - temperature < 1.0: More conservative (picks high probability chars)
    - temperature > 1.0: More creative/random (explores low probability chars)

    Args:
        predictions: Array of probabilities for each character
        temperature: Controls randomness (default 1.0)

    Returns:
        Index of the sampled character
    """
    predictions = np.asarray(predictions).astype('float64')

    # Apply temperature
    predictions = np.log(predictions + 1e-10) / temperature
    exp_preds = np.exp(predictions)
    predictions = exp_preds / np.sum(exp_preds)

    # Sample from the distribution
    # TODO: Student fills this - Name the NumPy function that draws from a categorical distribution
    probas = np.random.___________(1, predictions, 1)
    return np.argmax(probas)


def generate_text(model, seed_text, length=400, temperature=1.0):
    """
    Generate text using the trained model.

    Args:
        model: Trained LSTM model
        seed_text: Starting text (padded with spaces if shorter than seq_length)
        length: Number of characters to generate
        temperature: Sampling temperature (controls creativity)

    Returns:
        Generated text string
    """
    # Left-pad short seeds with spaces so the seed stays next to the generated text
    if len(seed_text) < seq_length:
        seed_text = seed_text.rjust(seq_length)

    # Use only the last seq_length characters
    seed_text = seed_text[-seq_length:]
    generated = seed_text

    print(f"Generating {length} characters with temperature={temperature}...")
    print(f"Seed text: '{seed_text}'")
    print("-" * 70)

    # Generate characters one by one
    for i in range(length):
        # Prepare input sequence
        x_pred = np.zeros((1, seq_length, 1))
        for t, char in enumerate(generated[-seq_length:]):
            if char in char_to_int:
                x_pred[0, t, 0] = char_to_int[char] / float(n_chars)

        # Predict next character
        predictions = model.predict(x_pred, verbose=0)[0]

        # Sample next character using temperature
        next_index = sample_with_temperature(predictions, temperature)
        next_char = int_to_char[next_index]

        # Add to generated text
        generated += next_char

        # Show progress every 100 characters
        if (i + 1) % 100 == 0:
            print(f"Generated {i + 1}/{length} characters...")

    return generated

print("✓ Text generation functions defined")
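
The effect of temperature is easiest to see on a small distribution. The sketch below applies the same log, divide, and renormalize steps in plain Python; `apply_temperature` and the three-element `probs` list are hypothetical, and `random.choices` is used here as a standard-library stand-in for the NumPy sampling call in the exercise.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution: take logs, divide by the
    temperature, then renormalize with a softmax."""
    logits = [math.log(p + 1e-10) / temperature for p in probs]
    peak = max(logits)                               # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = [0.5, 0.3, 0.2]    # hypothetical model output over three characters

for t in (0.2, 1.0, 1.5):
    print(t, [round(p, 3) for p in apply_temperature(probs, t)])

# Draw one index from the sharpened distribution.
rng = random.Random(42)
index = rng.choices(range(len(probs)), weights=apply_temperature(probs, 0.2), k=1)[0]
print(index)
```

A temperature below 1.0 moves probability toward the most likely character, a temperature of 1.0 leaves the distribution essentially unchanged, and a temperature above 1.0 flattens it.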


In [None]:
# ============================================================================
# STEP 11: GENERATE TEXT WITH DIFFERENT TEMPERATURES
# ============================================================================

print("\n" + "="*70)
print("STEP 11: Generating Shakespeare-Style Text")
print("="*70)

# Choose a seed text from the corpus
seed_text = text[1000:1000+seq_length]

# TODO: Student fills this - Set temperatures for generation
# Hint: Try different values to see the effect
# - Low temperature (0.2-0.5): Conservative, repetitive
# - Medium temperature (0.7-1.0): Balanced
# - High temperature (1.2-1.5): Creative, chaotic

temperatures = [___________, ___________, ___________]

print(f"Seed text: '{seed_text}'")
print(f"\nWe'll generate text at different 'temperatures' to see the effect:")
print("  - Low temp: More conservative, follows patterns closely")
print("  - High temp: More creative, takes more risks")

# Generate text with different temperatures
generated_texts = {}

for temp in temperatures:
    print("\n" + "="*70)
    print(f"Temperature: {temp}")
    print("="*70)

    generated = generate_text(model, seed_text, length=400, temperature=temp)
    generated_texts[temp] = generated

    print("\nGenerated text:")
    print("-" * 70)
    print(generated)
    print("-" * 70)

In [None]:
# ============================================================================
# STEP 12: SAVE GENERATED TEXTS TO FILE
# ============================================================================

print("\n" + "="*70)
print("STEP 12: Saving Generated Texts")
print("="*70)

with open('generated_text.txt', 'w') as f:
    f.write("Shakespeare-Style Text Generation Results\n")
    f.write("=" * 70 + "\n\n")
    f.write(f"Training Details:\n")
    f.write(f"  - Corpus size: {len(text):,} characters\n")
    f.write(f"  - Vocabulary size: {n_chars} unique characters\n")
    f.write(f"  - Sequence length: {seq_length}\n")
    f.write(f"  - Training time: {training_time:.2f} seconds\n")
    f.write(f"  - Final loss: {history.history['loss'][-1]:.4f}\n")
    f.write(f"  - Final accuracy: {history.history['accuracy'][-1]:.4f}\n\n")

    for temp in temperatures:
        f.write("=" * 70 + "\n")
        f.write(f"Temperature: {temp}\n")
        f.write("=" * 70 + "\n")
        f.write(generated_texts[temp] + "\n\n")

print("✓ Generated texts saved to 'generated_text.txt'")

In [None]:
# ============================================================================
# STEP 13: SUMMARY AND KEY TAKEAWAYS
# ============================================================================

print("\n" + "="*70)
print("STEP 13: Summary and Key Takeaways")
print("="*70)

print(f"""
TRAINING SUMMARY:
{'='*70}
Corpus: Tiny Shakespeare (a subset of Shakespeare's works)
Total characters: {len(text):,}
Unique characters: {n_chars}
Sequence length: {seq_length}
Training samples: {len(X):,}

Model architecture: LSTM with {model.count_params():,} parameters
Training time: {training_time:.2f} seconds ({training_time/60:.2f} minutes)
Final loss: {history.history['loss'][-1]:.4f}
Final accuracy: {history.history['accuracy'][-1]:.4f}

KEY CONCEPTS LEARNED:
{'='*70}
1. CHARACTER-LEVEL MODELING:
   - Models text as a sequence of characters (not words)
   - Learns spelling, punctuation, and style patterns
   - Can generate novel words and names

2. SEQUENCE PREDICTION:
   - Uses previous characters to predict the next one
   - Sequence length determines how much context the model sees
   - Longer sequences = more context but slower training

3. ONE-HOT ENCODING:
   - Each target character is converted to a binary vector (inputs stay as scaled integer codes)
   - Output is a probability distribution over all characters
   - Categorical cross-entropy loss for multi-class prediction

4. LSTM FOR TEXT:
   - LSTM remembers long-term patterns in text
   - Learns grammar, style, and structure
   - Can generate coherent text in the training style

5. TEMPERATURE SAMPLING:
   - Controls randomness in generation
   - Low temp: Safe, predictable, repetitive
   - High temp: Creative, diverse, sometimes nonsensical
   - Balances between mimicry and creativity

APPLICATIONS:
{'='*70}
- Creative writing assistance
- Code generation
- Music composition (with notes as "characters")
- DNA sequence analysis
- Language modeling
- Auto-completion systems

NEXT STEPS:
{'='*70}
1. Try longer training (more epochs)
2. Experiment with different sequence lengths
3. Use word-level modeling instead of character-level
4. Try different texts (modern novels, code, poetry)
5. Implement beam search for better generation
6. Add attention mechanisms
7. Try transformer models (GPT-style)
""")

print("="*70)
print("Exercise completed! Review your generated text and discuss with your instructor.")
print("="*70)

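## The KEY Insight:

### 60% accuracy = 60% of INDIVIDUAL next-character predictions correct
The accuracy metric is measured one character at a time, with the true preceding text as input. Over 200 such predictions:

- ✅ ~120 correct characters
- ❌ ~80 wrong characters

During generation the model feeds on its own output, so an early mistake becomes part of the context for every later prediction. Errors compound and break the flow!

Plus, with temperature=1.0:

- Model samples from the FULL probability distribution
- Many picks are not the single most likely character
- Makes text look MORE random than the model's accuracy suggests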
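
The arithmetic behind these numbers can be checked directly. The sketch below also estimates how quickly a fully correct run becomes unlikely, under the simplifying assumption that errors are independent of one another; real errors are correlated, so treat the figures as a rough guide only.

```python
accuracy = 0.60      # per-character accuracy from the example above
length = 200         # number of characters generated

expected_correct = accuracy * length
print(round(expected_correct))

# Chance that a run of k characters is entirely correct,
# assuming (for illustration only) that errors are independent.
for k in (5, 10, 20):
    print(k, accuracy ** k)
```

Even a short word of five characters comes out fully correct less than one time in ten under this assumption, which is why 60% accuracy reads far worse than it sounds.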


### 🎭 What Your Model Learned vs Didn't Learn
✅ At around 60%, your model has typically learned:

- Common words (the, and, be, to, not)
- Basic grammar
- Letter patterns (th, er, ing)
- Word boundaries

❌ What it usually HASN'T learned yet:

- Shakespeare's archaic language (thou, 'tis, wherefore)
- Complex sentence structures
- Poetic rhythm
- Long-range coherence

### Why not? As a rough rule of thumb, style starts to emerge at around 70%+ accuracy!

### 🚀 How to Fix It (Ranked by Impact)
1. Train Much Longer 📈 (BIGGEST IMPACT)
   - `epochs = 100` instead of 20
   - Expected result: accuracy rises by several points (for example, from about 58% to about 66%) and Shakespeare's style starts to emerge
2. Lower Temperature ⚡ (IMMEDIATE FIX)
   - `generate_text(model, seed_text, temperature=0.6)` instead of 1.0
   - Result: noticeably more readable text, with no retraining
3. Remove Line Breaks
   - Uncomment the `text.replace(...)` and `re.sub(...)` lines in Step 2
   - Result: cleaner, faster learning
4. Larger Model (Optional)
   - `LSTM(units=512, ...)` instead of 256
   - Result: more capacity for complex patterns, at the cost of slower training
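
The "Remove Line Breaks" option corresponds to the commented-out preprocessing lines in Step 2. A minimal sketch of that cleanup on a short sample follows; `normalize_whitespace` is a hypothetical helper name, and the sample string is made up for the illustration.

```python
import re


def normalize_whitespace(text):
    """Replace newlines, carriage returns, and tabs with spaces,
    then collapse runs of spaces into a single space."""
    for ch in ("\n", "\r", "\t"):
        text = text.replace(ch, " ")
    return re.sub(r" +", " ", text)


sample = "to be,\nor not\tto  be"
cleaned = normalize_whitespace(sample.lower())
print(cleaned)
print(len(set(sample)), len(set(cleaned)))
```

The cleaned text has fewer distinct characters, so the output layer has fewer classes to choose between. The trade-off is that the generated text loses its verse line structure.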