## Using a GPT to generate song lyrics
This notebook trains a small character-level GPT model to generate song lyrics. It is based on material from the [GPTandChill YouTube channel](https://www.youtube.com/@GPTandChill).

In Google Colab, select Runtime -> Change runtime type -> T4 GPU for better performance.

In [None]:
import torch
import torch.nn as nn
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")

Create the GPT class.

In [None]:
class GPT(nn.Module):
    """
    A GPT-like transformer model.

    Parameters
    ----------
    vocab_size : int
        The size of the vocabulary.
    context_length : int
        The length of the input context.
    model_dim : int
        The dimensionality of the model.
    num_blocks : int
        The number of transformer blocks.
    num_heads : int
        The number of attention heads.
    """
    class TransformerBlock(nn.Module):
        """
        A single transformer block consisting of multi-headed self-attention
        and a feedforward neural network.

        Parameters
        ----------
        model_dim : int
            The dimensionality of the model.
        num_heads : int
            The number of attention heads.
        """
        class MultiHeadedSelfAttention(nn.Module):
            """
            Multi-headed self-attention mechanism.

            Parameters
            ----------
            model_dim : int
                The dimensionality of the model.
            num_heads : int
                The number of attention heads.
            """
            class SingleHeadAttention(nn.Module):
                """
                Single head attention mechanism.

                Parameters
                ----------
                model_dim : int
                    The dimensionality of the model.
                head_size : int
                    The size of each attention head.
                """
                def __init__(self, model_dim: int, head_size: int):
                    super().__init__()
                    self.key_layer = nn.Linear(model_dim, head_size, bias=False)
                    self.query_layer = nn.Linear(model_dim, head_size, bias=False)
                    self.value_layer = nn.Linear(model_dim, head_size, bias=False)

                def forward(self, embedded):
                    """
                    Forward pass for single-head self-attention.

                    Parameters
                    ----------
                    embedded : torch.Tensor
                        The input tensor of shape (batch_size, context_length, model_dim).

                    Returns
                    -------
                    torch.Tensor
                        The attention-weighted values.
                    """
                    k = self.key_layer(embedded)
                    q = self.query_layer(embedded)
                    v = self.value_layer(embedded)

                    scores = q @ torch.transpose(k, 1, 2)  # Compute attention scores
                    context_length, attention_dim = k.shape[1], k.shape[2]
                    scores = scores / (attention_dim ** 0.5)  # Scale scores

                    # Create a lower triangular mask for causal attention on the same device as the scores
                    lower_triangular = torch.tril(torch.ones(context_length, context_length, device=scores.device))
                    mask = lower_triangular == 0
                    scores = scores.masked_fill(mask, float('-inf'))
                    scores = nn.functional.softmax(scores, dim=2)

                    return scores @ v  # Weighted sum of values

            def __init__(self, model_dim: int, num_heads: int):
                super().__init__()
                self.attention_heads = nn.ModuleList()
                for _ in range(num_heads):
                    self.attention_heads.append(self.SingleHeadAttention(model_dim, model_dim // num_heads))
                self.compute = nn.Linear(model_dim, model_dim)
                self.dropout = nn.Dropout(0.2)

            def forward(self, embedded):
                """
                Forward pass for multi-headed self-attention.

                Parameters
                ----------
                embedded : torch.Tensor
                    The input tensor of shape (batch_size, context_length, model_dim).

                Returns
                -------
                torch.Tensor
                    The output tensor after multi-headed attention.
                """
                head_outputs = [head(embedded) for head in self.attention_heads]
                concatenated = torch.cat(head_outputs, dim=2)
                return self.dropout(self.compute(concatenated))

        class VanillaNeuralNetwork(nn.Module):
            """
            A simple feedforward neural network used within the transformer block.

            Parameters
            ----------
            model_dim : int
                The dimensionality of the model.
            """
            def __init__(self, model_dim: int):
                super().__init__()
                self.first_linear_layer = nn.Linear(model_dim, model_dim * 4)
                self.relu = nn.ReLU()
                self.second_linear_layer = nn.Linear(model_dim * 4, model_dim)
                self.dropout = nn.Dropout(0.2)

            def forward(self, x):
                """
                Forward pass for the feedforward network.

                Parameters
                ----------
                x : torch.Tensor
                    Input tensor.

                Returns
                -------
                torch.Tensor
                    Output tensor.
                """
                return self.dropout(self.second_linear_layer(self.relu(self.first_linear_layer(x))))

        def __init__(self, model_dim: int, num_heads: int):
            super().__init__()
            self.mhsa = self.MultiHeadedSelfAttention(model_dim, num_heads)
            self.vanilla_nn = self.VanillaNeuralNetwork(model_dim)
            self.layer_norm_one = nn.LayerNorm(model_dim)
            self.layer_norm_two = nn.LayerNorm(model_dim)

        def forward(self, embedded):
            """
            Forward pass for the transformer block.

            Parameters
            ----------
            embedded : torch.Tensor
                Input tensor.

            Returns
            -------
            torch.Tensor
                Processed tensor.
            """
            embedded = embedded + self.mhsa(self.layer_norm_one(embedded))  # Skip connection
            embedded = embedded + self.vanilla_nn(self.layer_norm_two(embedded))  # Another skip connection
            return embedded

    def __init__(self, vocab_size: int, context_length: int, model_dim: int, num_blocks: int, num_heads: int):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, model_dim)
        self.pos_embedding = nn.Embedding(context_length, model_dim)
        self.transformer_blocks = nn.Sequential(*[self.TransformerBlock(model_dim, num_heads) for _ in range(num_blocks)])
        self.layer_norm_three = nn.LayerNorm(model_dim)
        self.vocab_projection = nn.Linear(model_dim, vocab_size)

    def forward(self, context):
        """
        Forward pass for the GPT model.

        Parameters
        ----------
        context : torch.Tensor
            Input tensor of token indices.

        Returns
        -------
        torch.Tensor
            The logits for the next token prediction.
        """
        embedded = self.token_embedding(context)
        context_length = context.shape[1]
        positions = torch.arange(context_length, device=context.device)
        embedded = embedded + self.pos_embedding(positions)

        raw_output = self.vocab_projection(self.layer_norm_three(self.transformer_blocks(embedded)))
        return raw_output
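
The causal masking step inside `SingleHeadAttention` can be looked at in isolation. The sketch below is a minimal illustration under assumed toy sizes (one sequence of 4 tokens, random scores); the names `toy_scores`, `future_mask`, and `toy_weights` are hypothetical and not part of the notebook. It shows that filling future positions with `-inf` before the softmax leaves each token attending only to itself and earlier tokens.

```python
import torch

torch.manual_seed(0)

# Toy attention scores for a single sequence of 4 tokens (assumed sizes).
toy_length = 4
toy_scores = torch.randn(1, toy_length, toy_length)

# True above the diagonal: the "future" positions a token must not attend to.
future_mask = torch.triu(
    torch.ones(toy_length, toy_length, dtype=torch.bool), diagonal=1
)

# Masked entries become -inf, so the softmax gives them zero weight.
masked_scores = toy_scores.masked_fill(future_mask, float("-inf"))
toy_weights = torch.softmax(masked_scores, dim=-1)

print(toy_weights[0])
```

Each row of the printed matrix sums to 1, and every entry above the diagonal is zero, so the first token can only attend to itself.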

## Green Day Lyrics Generator

Load and process the lyrics data.

In [None]:
with open('/content/GreenDayLyrics.txt', 'r', encoding='utf-8') as f:
    lyrics = f.read()

Create character-level vocabulary.

In [None]:
unique_chars = sorted(set(lyrics))
char_to_int = {ch: i for i, ch in enumerate(unique_chars)}
int_to_char = {i: ch for ch, i in char_to_int.items()}
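
As a quick sanity check of the character-level mapping, the sketch below builds the same kind of vocabulary on a short stand-in string and verifies that encoding followed by decoding returns the original text. The string `"hello world"` and the `toy_`-prefixed names are assumptions chosen for illustration, so the notebook's own variables are left untouched.

```python
# Hypothetical stand-in for the lyrics file contents.
toy_text = "hello world"

# Same construction as above: one integer per distinct character.
toy_chars = sorted(set(toy_text))
toy_char_to_int = {ch: i for i, ch in enumerate(toy_chars)}
toy_int_to_char = {i: ch for ch, i in toy_char_to_int.items()}

# Encode to integers, then decode back to text.
toy_encoded = [toy_char_to_int[ch] for ch in toy_text]
toy_decoded = "".join(toy_int_to_char[i] for i in toy_encoded)

print(toy_chars)
print(toy_encoded)
print(toy_decoded)
```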

Encode the lyrics data.

In [None]:
encoded_lyrics = [char_to_int[ch] for ch in lyrics]

Prepare input-target sequences.

In [None]:
def create_sequences(data, seq_length):
    """Generates input-target sequences from a dataset for training.

    Parameters
    ----------
    data : list or numpy.ndarray
        Input data from which sequences are generated.
    seq_length : int
        Length of each input sequence.

    Returns
    -------
    tuple of torch.Tensor
        A tuple (inputs, targets), where:
        - inputs: Tensor of shape (num_samples, seq_length) representing input sequences.
        - targets: Tensor of shape (num_samples, seq_length) representing target sequences.

    Notes
    -----
    - Each input sequence consists of `seq_length` consecutive elements from the input data.
    - Each target sequence is the corresponding next `seq_length` elements, offset by one.
    - Useful for training sequence models like RNNs or Transformers.
    """
    inputs, targets = [], []  # Initialize lists to store input and target sequences.

    # Iterate through the data to extract sequences.
    for i in range(len(data) - seq_length):
        inputs.append(data[i:i + seq_length])  # Input sequence of length `seq_length`.
        targets.append(data[i + 1:i + seq_length + 1])  # Corresponding target sequence.

    # Convert lists to tensors for use with PyTorch models.
    return torch.tensor(inputs), torch.tensor(targets)

seq_length = 128
X_train, y_train = create_sequences(encoded_lyrics, seq_length)
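
To make the one-step offset between inputs and targets concrete, here is a small sketch on toy data. `make_pairs` is a hypothetical, compact restatement of the sliding-window logic in `create_sequences`, written only for this illustration, and the toy data is simply the integers 0 to 9.

```python
import torch

def make_pairs(data, seq_length):
    """Compact restatement of the sliding-window logic, for illustration only."""
    num_samples = len(data) - seq_length
    inputs = [data[i:i + seq_length] for i in range(num_samples)]
    targets = [data[i + 1:i + seq_length + 1] for i in range(num_samples)]
    return torch.tensor(inputs), torch.tensor(targets)

toy_data = list(range(10))  # assumed toy "encoded text"
toy_X, toy_y = make_pairs(toy_data, seq_length=4)

print(toy_X.shape, toy_y.shape)
print(toy_X[0].tolist(), "->", toy_y[0].tolist())
print(toy_X[-1].tolist(), "->", toy_y[-1].tolist())
```

With 10 elements and a window of 4, this yields 6 pairs; each target row is its input row shifted one position to the right, so position `t` of the target is the character that follows position `t` of the input.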

In [None]:
print(f' Number of sequences in training data: {len(X_train)}')

 Number of sequences in training data: 29744


Train the model.

In [None]:
subset_size = 10_000
X_train, y_train = X_train[:subset_size], y_train[:subset_size]

# Batch size and training epochs
batch_size = 128
epochs = 100

# Initialize model and optimizer
model = GPT(vocab_size=len(unique_chars), context_length=128, model_dim=252, num_blocks=6, num_heads=6).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
criterion = nn.CrossEntropyLoss()

# Gradient scaler for automatic mixed precision (AMP)
scaler = torch.amp.GradScaler()

# Training loop with batching and mixed precision
for epoch in range(epochs):
    model.train()
    total_loss = 0

    for i in range(0, len(X_train), batch_size):
        context = X_train[i:i + batch_size].to(device)
        target = y_train[i:i + batch_size].to(device)

        optimizer.zero_grad()

        with torch.amp.autocast("cuda"):
            output = model(context)
            loss = criterion(output.view(-1, len(unique_chars)), target.view(-1))

        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()

        total_loss += loss.item()

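    # Average over the number of batches actually processed (the last batch may be smaller).
    num_batches = (len(X_train) + batch_size - 1) // batch_size
    print(f"Epoch {epoch + 1}, Loss: {total_loss / num_batches}")

# Save the trained model weights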
torch.save(model.state_dict(), 'green_day_fine_tuned_weights.pt')

Epoch 1, Loss: 2.9024766439046616
Epoch 2, Loss: 2.6327119515492368
Epoch 3, Loss: 2.5627492788510446
Epoch 4, Loss: 2.5151719863598165
Epoch 5, Loss: 2.4764270048875074
Epoch 6, Loss: 2.4380006912427072
Epoch 7, Loss: 2.3940151868722377
Epoch 8, Loss: 2.346079939450973
Epoch 9, Loss: 2.2829126241879587
Epoch 10, Loss: 2.20493516096702
Epoch 11, Loss: 2.1052387906954837
Epoch 12, Loss: 1.9831893016130497
Epoch 13, Loss: 1.8581082668059912
Epoch 14, Loss: 1.7194860638716283
Epoch 15, Loss: 1.5789378713338802
Epoch 16, Loss: 1.4317210461848822
Epoch 17, Loss: 1.2891977261274288
Epoch 18, Loss: 1.164544566319539
Epoch 19, Loss: 1.0577828387419383
Epoch 20, Loss: 0.9674465885529151
Epoch 21, Loss: 0.8631298137016785
Epoch 22, Loss: 0.7554733126591413
Epoch 23, Loss: 0.6698395644242947
Epoch 24, Loss: 0.5768747604810275
Epoch 25, Loss: 0.4988027130946135
Epoch 26, Loss: 0.43181255555305725
Epoch 27, Loss: 0.37963918004280484
Epoch 28, Loss: 0.3327513058216144
Epoch 29, Loss: 0.2981765356201
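
The training loop flattens the model output with `view(-1, len(unique_chars))` before computing the loss. The sketch below illustrates, under assumed toy sizes and random tensors (all `toy_` names are hypothetical), that this treats every position of every sequence as one classification example, and that the result matches the mean negative log-probability of the target characters computed by hand.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed toy sizes: 2 sequences, 5 positions each, vocabulary of 7 characters.
toy_batch, toy_len, toy_vocab = 2, 5, 7
toy_logits = torch.randn(toy_batch, toy_len, toy_vocab)
toy_targets = torch.randint(0, toy_vocab, (toy_batch, toy_len))

toy_criterion = nn.CrossEntropyLoss()

# Flatten to (batch * length, vocab) and (batch * length,), as in the loop.
flat_loss = toy_criterion(toy_logits.view(-1, toy_vocab), toy_targets.view(-1))

# Same quantity by hand: mean negative log-probability of each target.
log_probs = torch.log_softmax(toy_logits, dim=-1)
manual_loss = -log_probs.gather(-1, toy_targets.unsqueeze(-1)).mean()

print(flat_loss.item(), manual_loss.item())
```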

Create a function to generate lyrics.

In [None]:
def generate_lyrics(model, new_chars, context, context_length, int_to_char, temperature=1.0):
    """Generates lyrics using a trained character-level language model.

    Parameters
    ----------
    model : torch.nn.Module
        The trained language model for character generation.
    new_chars : int
        Number of new characters to generate.
    context : torch.Tensor
        Input tensor representing the initial context (shape: [1, sequence_length]).
    context_length : int
        Maximum length of context to retain during generation.
    int_to_char : dict
        Mapping from integer indices to characters.
    temperature : float, optional
        Sampling temperature controlling randomness. Lower values make predictions more deterministic,
        while higher values increase diversity. Default is 1.0.

    Returns
    -------
    str
        Generated lyrics as a string.

    Notes
    -----
    - The function performs autoregressive generation by sampling one character at a time.
    - Uses softmax with a temperature parameter to control the randomness of predictions.
    - Context is truncated to `context_length` because the model's positional embedding only covers that many positions.
    """
    model.eval()  # Set the model to evaluation mode (disables dropout, etc.).
    res = []  # Store generated characters.

    with torch.no_grad():  # Disable gradient computation for faster inference.
        for _ in range(new_chars):
            # Keep only the last `context_length` characters.
            if context.shape[1] > context_length:
                context = context[:, -context_length:]

            # Forward pass: generate model output (logits).
            output = model(context)

            # Extract logits for the last time step and apply temperature scaling.
            logits = output[:, -1, :] / temperature

            # Convert logits to probabilities using softmax.
            probs = torch.nn.functional.softmax(logits, dim=-1)

            # Sample the next character index from the probability distribution.
            next_char = torch.multinomial(probs, 1)

            # Append the new character to the context.
            context = torch.cat((context, next_char), dim=-1)

            # Map the character index to its corresponding character and store it.
            res.append(int_to_char[next_char.item()])

    return ''.join(res)  # Return the generated lyrics as a string.
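
The effect of the `temperature` argument can be seen on a fixed set of logits. The sketch below uses three assumed toy logits (the names `toy_logits` and `toy_probs` are hypothetical) and applies the same divide-then-softmax step as `generate_lyrics` at three temperatures.

```python
import torch

# Assumed toy logits for a three-character vocabulary.
toy_logits = torch.tensor([2.0, 1.0, 0.0])

# Divide by the temperature, then apply softmax, as in generate_lyrics.
toy_probs = {
    t: torch.softmax(toy_logits / t, dim=-1) for t in (0.5, 1.0, 2.0)
}

for t, probs in toy_probs.items():
    print(t, [round(p, 3) for p in probs.tolist()])
```

Lower temperatures concentrate probability on the highest-scoring character, while higher temperatures flatten the distribution, which is why sampling at a high temperature tends to produce more varied text.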

Generate lyrics for new Green Day songs.

In [None]:
seed_text = "I walk alone"
start_context = torch.tensor([[char_to_int[c] for c in seed_text]], dtype=torch.int64).to(device)

new_lyrics = generate_lyrics(model, new_chars=2000, context=start_context, context_length=128, int_to_char=int_to_char, temperature=0.9)
print(new_lyrics)



I walk a

I walk this empty street
On the Boulevard of Broken Dreams
Where the city sleeps
And I'm the only one and I walk a

My shadow's the only one that walks beside me
My shallow heart's the only thing that's beating
Sometimes I wish someone out there will find me
'Til then I walk alone

Ah-ah, ah-ah, aaah-ah
Ah-ah, ah-ah

I walk a

I walk a

I walk this empty street
On the Boulevard of Broken Dreams
Where the city sleeps
And I'm the only one and I walk a

My shadow's the only one that walks beside me
My shallow heart's the only thing that's beating
Sometimes I wish someone out there will find me
'Til then I walk alone

Ah-ah, ah-ah, ah-ah, aaah-ah
Ah-ah, ah-ah

I walk a

I walk this empty street
On the Boulevard of Broken Dreams
Where the city sleeps
And I'm the only one and I walk alone

I walk alone
I walk a

My shadow's the only one that walks beside me
My shallow heart's the only thing that's beating
Sometimes I wish someone out there will find me
'Til then I walk alone 

Do

In [None]:
seed_text = "A throbbing tumor and a radiation high"
start_context = torch.tensor([[char_to_int[c] for c in seed_text]], dtype=torch.int64).to(device)

new_lyrics = generate_lyrics(model, new_chars=4000, context=start_context, context_length=128, int_to_char=int_to_char, temperature=1.2)
print(new_lyrics)

t
One my own... here we go

My eyes feel like they're gonna bleed
Dried up and bulging out my skull
My mouth is dry
My face is numb
Fucked up and spun out in my room
On my own... here we go 

I text a postcard sent to you
Did it go through?
Sending all my love to you
You are the moonlight of my life
Every night
Giving all my love to you

My beating heart belongs to you
I walked for miles 'til I found you
I'm here to honor you
If I lose everything in the fire
I'm sending something new

I'm having trouble trying to sleep
I'm counting sheep but running out
As time ticks by
Still I try
No rest for cross-tops in my mind
On my own... here we go

My eyes feel like they're gonna bleed
Dried up and bulging out my skull
My mouth is dry
My face is numb
Fucked up and spun out in my room
On my own... here we go 

I text a postcard sent to you
Did it go through?
Sending all my love to you
You are the moonlight of my life
Every night
Giving all my love to you

My beating heart belongs to you
I walked

## Taylor Swift Lyrics Generator

Load and process the lyrics data.

In [None]:
with open('/content/TaylorLyrics.txt', 'r', encoding='utf-8') as f:
    lyrics = f.read()

Create character-level vocabulary.

In [None]:
unique_chars = sorted(set(lyrics))
char_to_int = {ch: i for i, ch in enumerate(unique_chars)}
int_to_char = {i: ch for ch, i in char_to_int.items()}

Encode the lyrics data.

In [None]:
encoded_lyrics = [char_to_int[ch] for ch in lyrics]

Prepare input-target sequences.

In [None]:
seq_length = 128
X_train, y_train = create_sequences(encoded_lyrics, seq_length)

In [None]:
print(f' Number of sequences in training data: {len(X_train)}')

 Number of sequences in training data: 278642


Train the model.

In [None]:
subset_size = 10_000
X_train, y_train = X_train[:subset_size], y_train[:subset_size]

# Batch size and training epochs
batch_size = 128
epochs = 100

# Initialize model and optimizer
model = GPT(vocab_size=len(unique_chars), context_length=128, model_dim=252, num_blocks=6, num_heads=6).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
criterion = nn.CrossEntropyLoss()

# Gradient scaler for automatic mixed precision (AMP)
scaler = torch.amp.GradScaler()

# Training loop with batching and mixed precision
for epoch in range(epochs):
    model.train()
    total_loss = 0

    for i in range(0, len(X_train), batch_size):
        context = X_train[i:i + batch_size].to(device)
        target = y_train[i:i + batch_size].to(device)

        optimizer.zero_grad()

        with torch.amp.autocast("cuda"):
            output = model(context)
            loss = criterion(output.view(-1, len(unique_chars)), target.view(-1))

        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()

        total_loss += loss.item()

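    # Average over the number of batches actually processed (the last batch may be smaller).
    num_batches = (len(X_train) + batch_size - 1) // batch_size
    print(f"Epoch {epoch + 1}, Loss: {total_loss / num_batches}")

# Save the trained model weights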
torch.save(model.state_dict(), 'taylor_swift_tuned_weights.pt')

Epoch 1, Loss: 2.8535317243673863
Epoch 2, Loss: 2.522774983675052
Epoch 3, Loss: 2.461463885429578
Epoch 4, Loss: 2.4210038368518534
Epoch 5, Loss: 2.388933756412604
Epoch 6, Loss: 2.3499836371495175
Epoch 7, Loss: 2.29807387254177
Epoch 8, Loss: 2.2212006052335105
Epoch 9, Loss: 2.116103111169277
Epoch 10, Loss: 1.986960226144546
Epoch 11, Loss: 1.8597185061528132
Epoch 12, Loss: 1.7331275618993318
Epoch 13, Loss: 1.6040427852899601
Epoch 14, Loss: 1.4705819158982008
Epoch 15, Loss: 1.3372104045672295
Epoch 16, Loss: 1.2111956568864675
Epoch 17, Loss: 1.0967308642008367
Epoch 18, Loss: 0.9920510183542203
Epoch 19, Loss: 0.8839907928919181
Epoch 20, Loss: 0.7812798684224104
Epoch 21, Loss: 0.6896649396572357
Epoch 22, Loss: 0.6089869145399485
Epoch 23, Loss: 0.5374672458722041
Epoch 24, Loss: 0.47477339857663864
Epoch 25, Loss: 0.4162102102851256
Epoch 26, Loss: 0.3610234289215161
Epoch 27, Loss: 0.3221076520589682
Epoch 28, Loss: 0.2878520399905168
Epoch 29, Loss: 0.2557751651948843
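
Weights saved with `torch.save(model.state_dict(), ...)` can be restored later without retraining. The sketch below shows the round trip on a small `nn.Linear` stand-in and an in-memory buffer, both of which are assumptions made to keep the example self-contained. In the notebook, the same pattern would presumably apply to a `GPT` instance constructed with the same hyperparameters and the saved `.pt` file path.

```python
import io

import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a trained model (assumption: any nn.Module behaves the same way).
toy_trained = nn.Linear(4, 2)

# Save the state dict; an in-memory buffer replaces the .pt file here.
toy_buffer = io.BytesIO()
torch.save(toy_trained.state_dict(), toy_buffer)
toy_buffer.seek(0)

# Rebuild a model with the same architecture, then load the saved weights.
toy_restored = nn.Linear(4, 2)
toy_restored.load_state_dict(torch.load(toy_buffer, map_location="cpu"))
toy_restored.eval()  # switch to evaluation mode before generating

print(torch.equal(toy_trained.weight, toy_restored.weight))
```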


Generate lyrics for new Taylor Swift songs.

In [None]:
seed_text = "Out of the Woods"
start_context = torch.tensor([[char_to_int[c] for c in seed_text]], dtype=torch.int64).to(device)

new_lyrics = generate_lyrics(model, new_chars=2000, context=start_context, context_length=128, int_to_char=int_to_char, temperature=1.0)
print(new_lyrics)

 find I wishing star
He's the song in the car I keep singing. Don't know why I do

So, I drive home alone
As I turn out the light
I'll put his picture down
And maybe get some sleep tonight

'Cause he's the reason for the teardrops on my guitar
The only one who's got enough of me to break my heart
He's the song in the car I keep singing. Don't know why I do

He's the time taken up, but there's never enough
And he's all that I need to fall into

Drew looks at me
I fake a smile so he won't see

I don't know what I want, so don't ask me
Cause I'm still trying to figure ind a place in this world

Got the radio on, my old blue jeans
And I'm wearing my heart on my sleeve
Feeling lucky today, got the sunshine
Could you tell me what more do I need
And tomorrow's just a mystery, oh yeah
But that's ok

[Chorus:]

Maybe I'm just a girl on a mission
But I'm ready to fly

I'm alone, on my own, and that's all I know
I'll be strong, I'll be wrong, oh but life goes on
Oh I'm alone, on my own, and that'

In [None]:
seed_text = "In the middle of town"
start_context = torch.tensor([[char_to_int[c] for c in seed_text]], dtype=torch.int64).to(device)

new_lyrics = generate_lyrics(model, new_chars=4000, context=start_context, context_length=128, int_to_char=int_to_char, temperature=0.8)
print(new_lyrics)


I'm standin' on your street
And there's a letter left on your doorstep
And the first few times
I right?

So how can I ever try to be better?
Nobody ever lets me in
I can still see you.
This ain't the best view
On the outside looking in
And I've been a lot of lonely places
I've never been on the outside

Seems the only one who doesn't see your beauty
Is the face in the mirror looking back at you
You walk around here thinking you're not pretty
But that's not true 'cause I know you

Hold on, baby, you're losing it
The water's high, you're jumping into it
And letting go
And no one knows
That you cry, but you don't tell anyone
That you might not be the golden one
And you're tied together with a smile
But you're coming undone

You're tied together with a smile
But you're coming undone
Goodbye, baby
With a smile, baby, baby

Cory's eyes are like a jungle
He smiles, it's like the radio
He whispers songs into my window
In words that nobody knows
There's pretty girls on every corner
That watch 