# Look into the Shakespeare data

In [1]:
import numpy as np
import tensorflow as tf
from pathlib import Path

# Load the Shakespeare text
path = tf.keras.utils.get_file(
    "shakespeare.txt",
    "https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt",
)
text = Path(path).read_text(encoding="utf-8")

# Print some basic info
print("Total characters:", len(text))
print("Unique characters:", sorted(set(text)))
print("Sample text:\n", text[:1000])  # first 1000 chars

# Char to index and back
vocab = sorted(set(text))
char2idx = {u: i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

# Convert example string
example = "To be or not to be"
encoded = np.array([char2idx[c] for c in example])
decoded = ''.join([idx2char[i] for i in encoded])

print("\nExample encoding/decoding:")
print("Original:", example)
print("Encoded:", encoded)
print("Decoded:", decoded)


Total characters: 1115394
Unique characters: ['\n', ' ', '!', '$', '&', "'", ',', '-', '.', '3', ':', ';', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
Sample text:
 First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.

All:
We know't, we know't.

First Citizen:
Let us kill him, and we'll have corn at our own price.
Is't a verdict?

All:
No more talking on't; let it be done: away, away!

Second Citizen:
One word, good citizens.

First Citizen:
We are accounted poor citizens, the patricians good.
What authority surfeits on would relieve us: if they
would yield us but the superf
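
The `char2idx[c]` lookup in the cell above raises a `KeyError` for any character outside the 65-symbol vocabulary (for example `"` or most digits, which do not appear in the printed list). A minimal sketch of a stricter encoder follows; the helper names `build_vocab`, `encode`, and `decode` are hypothetical, and the tiny stand-in corpus replaces the real text purely for illustration.

```python
def build_vocab(text):
    """Return (char2idx, idx2char) for the sorted unique characters in text."""
    vocab = sorted(set(text))
    char2idx = {ch: i for i, ch in enumerate(vocab)}
    return char2idx, vocab


def encode(s, char2idx):
    """Map a string to a list of ids, failing loudly on unknown characters."""
    unknown = sorted(set(s) - set(char2idx))
    if unknown:
        raise ValueError(f"characters not in vocabulary: {unknown}")
    return [char2idx[ch] for ch in s]


def decode(ids, idx2char):
    """Map a sequence of ids back to a string."""
    return "".join(idx2char[i] for i in ids)


# Stand-in corpus (assumption); the notebook builds the vocabulary from the full text.
char2idx, idx2char = build_vocab("To be or not to be")
ids = encode("not to be", char2idx)
assert decode(ids, idx2char) == "not to be"
```

Validating the prompt up front gives a readable error before any model call, which may be convenient when prompts come from user input.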

# Pretrained model output vs. fine-tuned model output

In [3]:
# compare_models_output.py

import tensorflow as tf
import numpy as np
from pathlib import Path

# Load raw text to rebuild vocab
path = tf.keras.utils.get_file(
    "shakespeare.txt",
    "https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt",
)
text = Path(path).read_text(encoding="utf-8")
vocab = sorted(set(text))
char2idx = {u: i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

# Text generation function
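def generate_text(model, start_string, gen_len=400):
    """Sample gen_len characters from model, continuing start_string."""
    # Encode the prompt and add a batch dimension: shape (1, len(start_string))
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)

    output = []

    for _ in range(gen_len):
        preds = model(input_eval)
        preds = preds[:, -1, :]  # Get last timestep logits
        # Sample the next character id; cast to a Python int so the concat
        # below keeps the dtype of input_eval
        pred_id = int(tf.random.categorical(preds, num_samples=1)[-1, 0].numpy())
        # Append the sampled id so the model sees the full sequence next step
        input_eval = tf.concat([input_eval, [[pred_id]]], axis=-1)
        output.append(idx2char[pred_id])

    return start_string + ''.join(output)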

# Load both models
base_model = tf.keras.models.load_model("pretrained_best.keras", compile=False)
finetuned_model = tf.keras.models.load_model("finetuned_best.keras", compile=False)

# Run comparison
prompt = "To be or not to be,"

print("\n--- Output from Pretrained Base Model ---\n")
print(generate_text(base_model, prompt, gen_len=400))

print("\n--- Output from Fine-Tuned Model ---\n")
print(generate_text(finetuned_model, prompt, gen_len=400))



--- Output from Pretrained Base Model ---

To be or not to be, to prey;
What an the thrown made you?

MERCIUT:
Gentle, pale! yet you mean, sometime me to you,
hin Gay behelk; coze and looks: while heaven armity,
I make home from: there's to Rome, sir.

otherful:
Lady sake: no more.

MISTRESS OVERDONA:
What! justice she but but tender fronce.

HAMY BETHA:
Might hell loves her trut,--God, within this angle
Be come. Must abused proverbled to.

PERDITA:
Ay, ay, 

--- Output from Fine-Tuned Model ---

To be or not to be,
And this impletered will but at his pair.

First Condiner:
Yet mad thee and my good lord.

LEONTES:
It will me good gone!
Cfay was looks stabs to him, bedal our side.
Give me any let me to alack grievour.

DUKE OF YORK:
Sirt lose, thembereators so noteiff.
My tresship is done and place to a grave
And the soft
May seem him confessiple, with urge desperrate
Bohement hath Clifford's robe; and so it 
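
`generate_text` draws each next character with `tf.random.categorical`, which samples from the softmax of the last-timestep logits instead of always taking the most likely character. The sketch below reproduces that step in plain Python. The function name `sample_from_logits` and the `temperature` parameter are assumptions added for illustration and are not part of the script above; a temperature of 1.0 corresponds to sampling from the unscaled logits.

```python
import math
import random


def sample_from_logits(logits, temperature=1.0, rng=random):
    """Sample an index with probability softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so repeated runs draw the same samples
logits = [2.0, 1.0, 0.1]
draws = [sample_from_logits(logits, rng=rng) for _ in range(1000)]
counts = [draws.count(i) for i in range(len(logits))]
print(counts)  # index 0 has the largest logit, so it is drawn most often
```

Because sampling is random, two runs of the same model on the same prompt generally produce different text unless the random seed is fixed.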


# Autoregressive Fine-Tuning Comparison: Shakespeare Char-RNN

This experiment compares text generated by a character-level RNN **before and after fine-tuning** on a reserved portion of Shakespeare's corpus, using the same prompt for both models.

## Prompt

```text
To be or not to be,
```

---

### Pretrained Base Model Output

```
To be or not to be, to prey;
What an the thrown made you?

MERCIUT:
Gentle, pale! yet you mean, sometime me to you,
hin Gay behelk; coze and looks: while heaven armity,
I make home from: there's to Rome, sir.

otherful:
Lady sake: no more.

MISTRESS OVERDONA:
What! justice she but but tender fronce.

HAMY BETHA:
Might hell loves her trut,--God, within this angle
Be come. Must abused proverbled to.

PERDITA:
Ay, ay, 
```

**Observations:**

* Follows the play-script layout (speaker label, colon, dialogue lines), although several speaker names are invented (*“MERCIUT”*, *“HAMY BETHA”*) and one is not capitalized (*“otherful”*).
* Semantically chaotic and often nonsensical.
* High rate of invented words and grammar errors (e.g., *“abused proverbled”*, *“hin Gay behelk”*).
* Lacks dramatic coherence and thematic flow.

---

### Fine-Tuned Model Output

```
To be or not to be,
And this impletered will but at his pair.

First Condiner:
Yet mad thee and my good lord.

LEONTES:
It will me good gone!
Cfay was looks stabs to him, bedal our side.
Give me any let me to alack grievour.

DUKE OF YORK:
Sirt lose, thembereators so noteiff.
My tresship is done and place to a grave
And the soft
May seem him confessiple, with urge desperrate
Bohement hath Clifford's robe; and so it
```

**Observations:**

* Mimics Shakespearean syntax and rhythm more convincingly.
* Dramatic progression and emotion are better represented.
* Made-up words (*“confessiple”*, *“noteiff”*) still appear, but with contextual flavor.
* Character interactions are more coherent and structured.
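
The formatting observations in this write-up can be checked mechanically. A rough sketch, assuming a speaker label is any line made up of a name followed by a colon; the regular expression and the helper name `speaker_labels` are illustrative choices, not part of the original experiment.

```python
import re

# A line consisting only of a name followed by a colon, e.g. "LEONTES:".
SPEAKER_RE = re.compile(r"^([A-Za-z][A-Za-z' ]*):$", re.MULTILINE)


def speaker_labels(text):
    """Return the speaker names found on label lines, in order of appearance."""
    return SPEAKER_RE.findall(text)


sample = """To be or not to be,
And this impletered will but at his pair.

First Condiner:
Yet mad thee and my good lord.

LEONTES:
It will me good gone!
"""
print(speaker_labels(sample))  # ['First Condiner', 'LEONTES']
```

Comparing the result with labels extracted the same way from the training text would separate real speakers from invented ones.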

---

## Conclusion

In this sample pair, the **fine-tuned model** produces output that is more stylistically faithful to Shakespearean drama. While both models hallucinate words, the fine-tuned version demonstrates:

* Improved structure and scene formatting
* Sharper emotional tone
* More plausible pseudo-Elizabethan phrasing

Fine-tuning elevated the model from *"bard-core gibberish"* to *"plausible lost play fragment."*
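
One way to put a number on the invented-word observations is the share of generated words that also occur in the training text. The sketch below assumes a simple lowercase tokenization and a hypothetical helper `known_word_rate`; the one-line stand-in corpus is for illustration only, and no such rates were measured for the two models in this experiment.

```python
import re

WORD_RE = re.compile(r"[a-z']+")


def words(text):
    """Lowercase the text and split it into runs of letters and apostrophes."""
    return WORD_RE.findall(text.lower())


def known_word_rate(generated, corpus):
    """Fraction of words in `generated` that also appear in `corpus`."""
    corpus_words = set(words(corpus))
    gen_words = words(generated)
    if not gen_words:
        return 0.0
    known = sum(w in corpus_words for w in gen_words)
    return known / len(gen_words)


# Stand-in corpus (assumption); in practice this would be the full training text.
corpus = "My lord, it is done and I will seem to him a grave man."
generated = "My tresship is done and place to a grave"
print(round(known_word_rate(generated, corpus), 2))  # 0.78 (7 of 9 words)
```

Averaging this rate over many samples per model, with the same prompts, would give a fairer comparison than a single continuation.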

---

