<a href="https://colab.research.google.com/github/gamesMum/love_song_generator_rnn/blob/master/Love_Song_Generator.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Importing the data from Google Drive

In [3]:
from google.colab import drive 
drive.mount("/content/drive")


Mounted at /content/drive


In [0]:
# Import the libraries we need
import numpy as np
import torch
from torch import nn
import torch.nn.functional as F

In [4]:
#Let's open the text file and read it
with open('/content/drive/My Drive/data/love_songs.txt', 'r') as f:
  text = f.read()

text[:100]

"Doesn't take much to make me happy\nAnd make me smile with glee \nNever never will I feel discouraged "

## **Tokenization**

In [0]:
# Encode the text: map each character to an integer and vice versa
chars = tuple(set(text)) # the distinct characters in the text
int2char = dict(enumerate(chars)) # keys are integers, values are characters
char2int = {ch: ii for ii, ch in int2char.items()} # keys are characters, values are integers

#encode the text
encoded = np.array([char2int[ch] for ch in text])

In [6]:
encoded[:100]
print()
encoded.shape




(902550,)
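
To check that the two dictionaries really are inverses of each other, one can encode a short string and decode it back. The sketch below is only an illustration: `sample_text` and the `sample_*` names are made up so the block runs without the song file, and the characters are sorted so the mapping is the same on every run (the cell above uses the unordered `set` directly).

```python
import numpy as np

# Hypothetical stand-in for the song lyrics, so the block runs on its own.
sample_text = "love me do\nlove me true"

# Same construction as above, except the characters are sorted so that the
# integer assigned to each character is the same on every run.
sample_chars = tuple(sorted(set(sample_text)))
sample_int2char = dict(enumerate(sample_chars))
sample_char2int = {ch: ii for ii, ch in sample_int2char.items()}

# Encode: one integer per character.
sample_encoded = np.array([sample_char2int[ch] for ch in sample_text])

# Decode: map every integer back to its character.
decoded = "".join(sample_int2char[int(i)] for i in sample_encoded)

print(len(sample_chars), "distinct characters")
print(decoded == sample_text)
```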

## **Pre-processing the data**

In [0]:
def one_hot_encode(arr, n_labels):
  # initialize the encoded array with zeros
  one_hot = np.zeros((arr.size, n_labels), dtype=np.float32)

  # fill the appropriate elements with ones
  one_hot[np.arange(one_hot.shape[0]), arr.flatten()] = 1 # flatten() returns a 1D array

  # reshape it back to the shape of the original array, plus a label axis
  one_hot = one_hot.reshape((*arr.shape, n_labels)) # * unpacks the shape tuple

  return one_hot

In [24]:
#Test
# check that the function works as expected
test_seq = np.array([[3, 5, 1]])
one_hot = one_hot_encode(test_seq, 8)

print(one_hot)

[[[0. 0. 0. 1. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 1. 0. 0.]
  [0. 1. 0. 0. 0. 0. 0. 0.]]]
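
The `*arr.shape` in the final reshape unpacks the input's shape tuple, so an input of shape `(1, 3)` gives an output of shape `(1, 3, n_labels)`. As a cross-check, the same encoding can be obtained by indexing an identity matrix. This is an alternative sketch with a hypothetical helper name, not what the notebook uses.

```python
import numpy as np

def one_hot_via_eye(arr, n_labels):
    # Row i of the identity matrix is the one-hot vector for label i, so
    # indexing with an integer array of shape S returns shape S + (n_labels,).
    return np.eye(n_labels, dtype=np.float32)[arr]

test_seq = np.array([[3, 5, 1]])
alt = one_hot_via_eye(test_seq, 8)

print(alt.shape)             # (1, 3, 8)
print(alt.argmax(axis=-1))   # recovers the original labels
```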


## **Making mini-batches**

In [0]:
def get_batches(arr, batch_size, seq_length):
  # number of characters in a complete mini-batch
  batch_size_total = batch_size * seq_length

  # get the number of full batches
  n_batches = len(arr) // batch_size_total

  # keep only enough characters to make full batches
  arr = arr[: n_batches * batch_size_total]
  # reshape into batch_size rows
  arr = arr.reshape(batch_size, -1)
  # iterate through the batches using a window of size seq_length
  for n in range(0, arr.shape[1], seq_length):
    # the features (rows are the sequences, columns are the seq_length window)
    x = arr[:, n:n+seq_length]
    # the targets: x shifted by one character
    y = np.zeros_like(x)
    try:
      y[:, :-1], y[:, -1] = x[:, 1:], arr[:, n+seq_length]
    except IndexError:
      # last window: wrap around to the first column
      y[:, :-1], y[:, -1] = x[:, 1:], arr[:, 0]
    yield x, y
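
A toy array makes the windowing easier to see. The sketch below uses `get_batches_demo`, a compact restatement of the generator above written so that the block runs on its own (the name and the modulo-based wrap-around are choices made for this illustration), and `np.arange(20)` as a stand-in for the encoded text.

```python
import numpy as np

def get_batches_demo(arr, batch_size, seq_length):
    # Compact restatement of get_batches above so this block runs on its own.
    chars_per_batch = batch_size * seq_length
    n_batches = len(arr) // chars_per_batch
    arr = arr[: n_batches * chars_per_batch].reshape(batch_size, -1)
    for n in range(0, arr.shape[1], seq_length):
        x = arr[:, n:n + seq_length]
        y = np.zeros_like(x)
        y[:, :-1] = x[:, 1:]
        # The last target is the next character, wrapping to column 0 at the end.
        y[:, -1] = arr[:, (n + seq_length) % arr.shape[1]]
        yield x, y

toy = np.arange(20)  # stand-in for the encoded text
batches = list(get_batches_demo(toy, batch_size=2, seq_length=5))
x0, y0 = batches[0]
print(x0)  # two rows of five consecutive integers
print(y0)  # the same rows shifted left by one position
```

Each row of `y` is the corresponding row of `x` moved one character ahead, which is exactly the next-character target the network is trained to predict.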


## **Define the network**

In [26]:
# check if a GPU is available
train_on_gpu = torch.cuda.is_available()
if train_on_gpu:
  print('Training on GPU')
else:
  print('No GPU is available. Training on CPU')

Training on GPU


In [0]:
# Define our network
class CharRNN(nn.Module):
  def __init__(self, tokens, n_hidden=256, n_layers=2,
               drop_prob=0.25, lr=0.01):
    super().__init__()
    self.drop_prob = drop_prob
    self.n_layers = n_layers
    self.n_hidden = n_hidden
    self.lr = lr

    # create the character dictionaries
    self.chars = tokens
    self.int2char = dict(enumerate(self.chars))
    self.char2int = {ch: ii for ii, ch in self.int2char.items()}

    # define the layers of the model;
    # the LSTM input size is the number of distinct characters
    self.lstm = nn.LSTM(len(self.chars), n_hidden, n_layers,
                        dropout=drop_prob, batch_first=True)
    self.dropout = nn.Dropout(drop_prob)

    self.fc = nn.Linear(n_hidden, len(self.chars))

  def forward(self, x, hidden):
    # x: (batch_size, seq_length, input_size)
    # hidden: tuple of two tensors, each (n_layers, batch_size, n_hidden)
    # r_out: (batch_size, seq_length, n_hidden)

    # get the LSTM outputs and the new hidden state
    r_out, hidden = self.lstm(x, hidden)
    # pass the outputs through the dropout layer
    out = self.dropout(r_out)

    # stack up the LSTM outputs into shape (batch_size * seq_length, n_hidden);
    # contiguous() ensures the tensor can be reshaped with view()
    out = out.contiguous().view(-1, self.n_hidden)
    #now pass it through the fully connected layer
    out = self.fc(out)

    #return the final output and the hidden state
    return out, hidden

  def init_hidden(self, batch_size):
    # Create two new tensors with sizes n_layers x batch_size x n_hidden,
    # initialized to zero, for hidden state and cell state of LSTM
    weight = next(self.parameters()).data

    if train_on_gpu:
      hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_().cuda(),
                weight.new(self.n_layers, batch_size, self.n_hidden).zero_().cuda())
    else:
      hidden = (weight.new(self.n_layers, batch_size, self.n_hidden).zero_(),
                weight.new(self.n_layers, batch_size, self.n_hidden).zero_())

    return hidden
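
The shape comments in `forward` can be verified with a few dummy tensors. The sketch below rebuilds only the LSTM and the linear layer with small, hypothetical sizes (they are not the notebook's settings) and prints the shapes at each stage.

```python
import torch
from torch import nn

# Hypothetical small sizes, chosen only to make the shapes easy to read.
batch_size, seq_length, n_chars, n_hidden, n_layers = 4, 6, 10, 16, 2

lstm = nn.LSTM(n_chars, n_hidden, n_layers, dropout=0.25, batch_first=True)
fc = nn.Linear(n_hidden, n_chars)

x = torch.zeros(batch_size, seq_length, n_chars)   # one-hot-shaped input
h0 = torch.zeros(n_layers, batch_size, n_hidden)   # initial hidden state
c0 = torch.zeros(n_layers, batch_size, n_hidden)   # initial cell state

r_out, (h_n, c_n) = lstm(x, (h0, c0))
flat = r_out.contiguous().view(-1, n_hidden)       # stack all time steps
scores = fc(flat)                                  # one score per character

print(r_out.shape)    # torch.Size([4, 6, 16])
print(scores.shape)   # torch.Size([24, 10])
print(h_n.shape)      # torch.Size([2, 4, 16])
```

The flattened output has one row per character position in the batch, which is why the training loop also flattens the targets before computing the loss.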
        

## **Train our model**

In [0]:
def train(net, data, epochs=10, batch_size=10, seq_length=50, lr=0.001, clip=5, val_frac=0.1, print_every=10):
    ''' Training a network 
    
        Arguments
        ---------
        
        net: CharRNN network
        data: text data to train the network
        epochs: Number of epochs to train
        batch_size: Number of mini-sequences per mini-batch, aka batch size
        seq_length: Number of character steps per mini-batch
        lr: learning rate
        clip: gradient clipping
        val_frac: Fraction of data to hold out for validation
        print_every: Number of steps for printing training and validation loss
    
    '''
    net.train()
    
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    
    # create training and validation data
    val_idx = int(len(data)*(1-val_frac))
    data, val_data = data[:val_idx], data[val_idx:]
    
    if(train_on_gpu):
        net.cuda()
    
    counter = 0
    n_chars = len(net.chars)
    for e in range(epochs):
        # initialize hidden state
        h = net.init_hidden(batch_size)
        
        for x, y in get_batches(data, batch_size, seq_length):
            counter += 1
            
            # One-hot encode our data and make them Torch tensors
            x = one_hot_encode(x, n_chars)
            inputs, targets = torch.from_numpy(x), torch.from_numpy(y)
            
            if(train_on_gpu):
                inputs, targets = inputs.cuda(), targets.cuda()

            # Creating new variables for the hidden state, otherwise
            # we'd backprop through the entire training history
            h = tuple([each.data for each in h])

            # zero accumulated gradients
            net.zero_grad()
            
            # get the output from the model
            output, h = net(inputs, h) #h: contains all information from all the previous steps
            
            # calculate the loss and perform backprop
            loss = criterion(output, targets.view(batch_size*seq_length).long())
            loss.backward()
            # `clip_grad_norm_` helps prevent the exploding gradient problem in RNNs / LSTMs.
            nn.utils.clip_grad_norm_(net.parameters(), clip)
            opt.step()
            
            # loss stats
            if counter % print_every == 0:
                # Get validation loss
                val_h = net.init_hidden(batch_size)
                val_losses = []
                net.eval()
                for x, y in get_batches(val_data, batch_size, seq_length):
                    # One-hot encode our data and make them Torch tensors
                    x = one_hot_encode(x, n_chars)
                    x, y = torch.from_numpy(x), torch.from_numpy(y)
                    
                    # Creating new variables for the hidden state, otherwise
                    # we'd backprop through the entire training history
                    val_h = tuple([each.data for each in val_h])
                    
                    inputs, targets = x, y
                    if(train_on_gpu):
                        inputs, targets = inputs.cuda(), targets.cuda()

                    output, val_h = net(inputs, val_h)
                    val_loss = criterion(output, targets.view(batch_size*seq_length).long())
                
                    val_losses.append(val_loss.item())
                
                net.train() # reset to train mode after iterating through validation data
                
                print("Epoch: {}/{}...".format(e+1, epochs),
                      "Step: {}...".format(counter),
                      "Loss: {:.4f}...".format(loss.item()),
                      "Val Loss: {:.4f}".format(np.mean(val_losses)))

In [119]:
# define and print the net
n_hidden = 128
n_layers = 2

net = CharRNN(chars, n_hidden, n_layers, drop_prob=0.25)
print(net)

CharRNN(
  (lstm): LSTM(99, 128, num_layers=2, batch_first=True, dropout=0.25)
  (dropout): Dropout(p=0.25, inplace=False)
  (fc): Linear(in_features=128, out_features=99, bias=True)
)
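
Before running the training loop, it may help to see what the `clip` argument of `train` does. The sketch below is a standalone illustration: a small linear layer stands in for the network and the loss is made deliberately large, so both are assumptions of the demo rather than part of the notebook.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(8, 8)   # hypothetical stand-in for the network
clip = 5.0                # same value as the default `clip` in train()

# A deliberately large loss so that the gradient norm exceeds the threshold.
loss = (layer(torch.ones(4, 8)) * 1000.0).sum()
loss.backward()

# clip_grad_norm_ rescales the gradients in place and returns the total norm
# measured before clipping.
norm_before = nn.utils.clip_grad_norm_(layer.parameters(), clip)
norm_after = torch.sqrt(sum((p.grad ** 2).sum() for p in layer.parameters()))

print(float(norm_before), float(norm_after))
```

The direction of the gradient is preserved; only its overall length is scaled down to roughly `clip`, which keeps a single bad batch from producing a very large update.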


In [120]:
batch_size = 100
seq_length = 64
n_epochs = 50 # start small if you are just testing initial behavior

# train the model
train(net, encoded, epochs=n_epochs, batch_size=batch_size, seq_length=seq_length, lr=0.002, print_every=10)

Epoch: 1/50... Step: 10... Loss: 3.2930... Val Loss: 3.2831
Epoch: 1/50... Step: 20... Loss: 3.1826... Val Loss: 3.2226
Epoch: 1/50... Step: 30... Loss: 3.1601... Val Loss: 3.2035
Epoch: 1/50... Step: 40... Loss: 3.1499... Val Loss: 3.1960
Epoch: 1/50... Step: 50... Loss: 3.1475... Val Loss: 3.1958
Epoch: 1/50... Step: 60... Loss: 3.1381... Val Loss: 3.1930
Epoch: 1/50... Step: 70... Loss: 3.1485... Val Loss: 3.1934
Epoch: 1/50... Step: 80... Loss: 3.1378... Val Loss: 3.1924
Epoch: 1/50... Step: 90... Loss: 3.1709... Val Loss: 3.1922
Epoch: 1/50... Step: 100... Loss: 3.1276... Val Loss: 3.1898
Epoch: 1/50... Step: 110... Loss: 3.1533... Val Loss: 3.1868
Epoch: 1/50... Step: 120... Loss: 3.1318... Val Loss: 3.1787
Epoch: 2/50... Step: 130... Loss: 3.1249... Val Loss: 3.1647
Epoch: 2/50... Step: 140... Loss: 3.0818... Val Loss: 3.1340
Epoch: 2/50... Step: 150... Loss: 3.0097... Val Loss: 3.0830
Epoch: 2/50... Step: 160... Loss: 2.9631... Val Loss: 3.0176
Epoch: 2/50... Step: 170... Loss:
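
Two numbers in this log can be checked by hand. A model that guesses uniformly among the 99 characters would have a cross-entropy of ln(99), so that value is a useful baseline for the losses printed above. The step at which epoch 2 begins also follows from the data size, assuming the default `val_frac=0.1` applies, as it does in the call above.

```python
import math

n_total = 902550          # length of `encoded`, from the shape printed earlier
val_frac = 0.1            # default in train()
batch_size, seq_length = 100, 64
n_chars = 99              # output size shown when the net was printed

# Characters left for training after holding out the validation split.
n_train = int(n_total * (1 - val_frac))
# Full mini-batches per epoch, as computed inside get_batches.
steps_per_epoch = n_train // (batch_size * seq_length)
# Cross-entropy of a uniform guess over all characters.
uniform_loss = math.log(n_chars)

print(steps_per_epoch)          # 126
print(round(uniform_loss, 3))   # 4.595
```

With 126 steps per epoch, step 120 still belongs to epoch 1 and step 130 to epoch 2, which matches the log.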

In [0]:
# change the name, for saving multiple files
model_name = 'rnn_x_epoch.net'

checkpoint = {'n_hidden': net.n_hidden,
              'n_layers': net.n_layers,
              'state_dict': net.state_dict(),
              'tokens': net.chars}

with open(model_name, 'wb') as f:
    torch.save(checkpoint, f)

## **Making predictions**

In [0]:
def predict(net, char, h=None, top_k=None):
        ''' Given a character, predict the next character.
            Returns the predicted character and the hidden state.
        '''
        
        # tensor inputs
        x = np.array([[net.char2int[char]]])
        x = one_hot_encode(x, len(net.chars))
        inputs = torch.from_numpy(x)
        
        if(train_on_gpu):
            inputs = inputs.cuda()
        
        # start from a zero hidden state if none was given
        if h is None:
            h = net.init_hidden(1)
        # detach hidden state from history
        h = tuple([each.data for each in h])
        # get the output of the model
        out, h = net(inputs, h)

        # get the character probabilities
        p = F.softmax(out, dim=1).data
        if(train_on_gpu):
            p = p.cpu() # move to cpu
        
        # get top characters
        if top_k is None:
            top_ch = np.arange(len(net.chars))
        else:
            p, top_ch = p.topk(top_k)
            top_ch = top_ch.numpy().squeeze()
        
        # select the likely next character with some element of randomness
        p = p.numpy().squeeze()
        char = np.random.choice(top_ch, p=p/p.sum())
        
        # return the predicted character (decoded from its integer) and the hidden state
        return net.int2char[char], h
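
The top-k step in `predict` can be isolated in plain NumPy. The sketch below is a simplified illustration, not the notebook's code: `sample_top_k` is a hypothetical helper, the probability vector is made up, and the random generator is seeded only so the run is repeatable.

```python
import numpy as np

def sample_top_k(p, k, rng):
    # Indices of the k largest probabilities (order within the top k is arbitrary).
    top_idx = np.argsort(p)[-k:]
    top_p = p[top_idx]
    # Renormalise so the kept probabilities sum to 1, then draw one index.
    return int(rng.choice(top_idx, p=top_p / top_p.sum()))

rng = np.random.default_rng(0)                 # seeded only for repeatability
p = np.array([0.05, 0.40, 0.10, 0.30, 0.15])   # made-up distribution
draws = [sample_top_k(p, k=2, rng=rng) for _ in range(1000)]

print(sorted(set(draws)))  # only the two most likely indices appear
```

Restricting the draw to the k most likely characters removes the long tail of unlikely ones while keeping some randomness, which is the trade-off `top_k` controls when generating text.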

### Priming and generating text

In [0]:
def sample(net, size, prime='Line', top_k=None):
        
    if(train_on_gpu):
        net.cuda()
    else:
        net.cpu()
    
    net.eval() # eval mode
    
    # First off, run through the prime characters
    chars = [ch for ch in prime]
    h = net.init_hidden(1)
    for ch in prime:
        char, h = predict(net, ch, h, top_k=top_k)

    chars.append(char)
    
    # Now pass in the previous character and get a new one
    for ii in range(size):
        char, h = predict(net, chars[-1], h, top_k=top_k)
        chars.append(char)

    return ''.join(chars)

In [128]:
print(sample(net, 1000, prime='Line', top_k=5))

Line at all the door
I'd say it is that you can bral your love 
You know you know that I've looked to live tight
The same, way I can see you with you 
When I long to spend your love is the reatoors
And I can stop a turning far

When you know, that you do

Well in my breath to be

I keep trying it all I can stand to take a man tight
There's no man thit is I need

I wanna stop the world to you

Take this way I'm gonna do.
I'll never let you stop

There's a leather a little thing

You're my life, before the right
It came always tonight, they can leave to stand to be
Your love is the world)
An that's my looking for my best arally
I'd get another love so long 
I've got your love to my life to me 
I know is wait on you
And will the much a life

Well all the loving though 
And you're love away

I wanna get it in love, I know
I wanna leave me

All I love you

There's no minigh of you shaking around through

Your soul, then I'm a moneite is to me
I lost to be the mirds where I will be what wond

In [129]:
# Load the checkpoint saved above (`rnn_x_epoch.net`)
with open('rnn_x_epoch.net', 'rb') as f:
    checkpoint = torch.load(f)
    
loaded = CharRNN(checkpoint['tokens'], n_hidden=checkpoint['n_hidden'], n_layers=checkpoint['n_layers'])
loaded.load_state_dict(checkpoint['state_dict'])

<All keys matched successfully>

In [133]:
# Sample using a loaded model
print(sample(loaded, 2000, top_k=5, prime="The sky above"))

The sky above 
Into my late at your heart
And I was, when, I don't care you won't give me

When you call me
It's not no makes love
I was livin' is together.

You know you know you wanna know

I wanna sex you the semmly soul
We say if you wanna be all of my love

Though you want it too
I wanna know I do to mine
And it's the ong and my love that you say 
And I can stay takes you will not see
The way we was started it all
I'll go back it all of me
I'm gonna show you to show me the same
When we want the late in me on, I wanna stop the moment I leave you fall in love words
When what I dripping and I can see the right way

If I was world will stop in your hands

Tell me all my love, my life
This life time well you, love me all that me

I would stay that who I can break it mine

And I was love with everything wanna lay me 
And the way I'll share the only thing with you

You're the one I can't dravees you to me 

I can't lose it)
I'll stand by you 
When I see you and you're the life and when I