<a href="https://colab.research.google.com/github/urmilapol/urmilapolprojects/blob/master/recurrent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Description:
An implementation of [Karpathy's char-rnn model](http://karpathy.github.io/2015/05/21/rnn-effectiveness/). It is a recurrent neural network trained to predict the next character in a sequence; sampling from its predictions one character at a time then generates new text that resembles the training data. For background reading, see [Understanding LSTM Networks](http://colah.github.io/posts/2015-08-Understanding-LSTMs/).



## Results
* Wiring up a basic sequence-to-sequence computation graph.
* Implementing a GRU cell.


An example of my final samples is shown below, after 150 passes through the Lord of the Rings text dataset.

<code>
eide and the cece the eviled understade and Shire. 
Them. And the rider his allove. 
It he hape
 eer was need to of more blown to still new rithed to have collong to not the our to the 
mucker abou
</code>


---

## Data loading and high level training

---






In [1]:
! wget -O ./text_files.tar.gz 'https://piazza.com/redirect/s3?bucket=uploads&prefix=attach%2Fjlifkda6h0x5bk%2Fhzosotq4zil49m%2Fjn13x09arfeb%2Ftext_files.tar.gz' 
! tar -xzf text_files.tar.gz
! pip install unidecode
! pip install torch

import unidecode
import string
import random
import re
from tqdm import tqdm
import pdb
 
all_characters = string.printable
n_characters = len(all_characters)
file = unidecode.unidecode(open('./text_files/lotr.txt').read())
file_len = len(file)
print('file_len =', file_len)

--2022-01-07 10:41:19--  https://piazza.com/redirect/s3?bucket=uploads&prefix=attach%2Fjlifkda6h0x5bk%2Fhzosotq4zil49m%2Fjn13x09arfeb%2Ftext_files.tar.gz
Resolving piazza.com (piazza.com)... 3.221.126.233, 18.214.211.171, 52.7.218.200, ...
Connecting to piazza.com (piazza.com)|3.221.126.233|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn-uploads.piazza.com/attach/jlifkda6h0x5bk/hzosotq4zil49m/jn13x09arfeb/text_files.tar.gz [following]
--2022-01-07 10:41:19--  https://cdn-uploads.piazza.com/attach/jlifkda6h0x5bk/hzosotq4zil49m/jn13x09arfeb/text_files.tar.gz
Resolving cdn-uploads.piazza.com (cdn-uploads.piazza.com)... 52.222.138.21, 52.222.138.65, 52.222.138.20, ...
Connecting to cdn-uploads.piazza.com (cdn-uploads.piazza.com)|52.222.138.21|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1533290 (1.5M) [application/x-gzip]
Saving to: ‘./text_files.tar.gz’


2022-01-07 10:41:20 (2.68 MB/s) - ‘./text_files.tar.gz’ saved [

In [2]:
chunk_len = 200
 
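def random_chunk():
  # randint is inclusive on both ends; subtracting 1 guarantees the slice
  # below always yields chunk_len + 1 characters
  start_index = random.randint(0, file_len - chunk_len - 1)
  end_index = start_index + chunk_len + 1
  return file[start_index:end_index]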
  
print(random_chunk())

 falling to the ground, rising, and falling again. And all 




the while he hissed but spoke no words. 

The fires below awoke in anger, the red light blazed, and all the 
cavern was filled with a gre


In [3]:
import torch

# Turn a string into a LongTensor of character indices
def char_tensor(string):
  tensor = torch.zeros(len(string)).long()
  for c in range(len(string)):
      tensor[c] = all_characters.index(string[c])
  # Variable has been a no-op wrapper since PyTorch 0.4, so a plain tensor suffices
  return tensor

print(char_tensor('abcDEF'))

tensor([10, 11, 12, 39, 40, 41])


---

## Creating a GRU cell 

---

The cell I used previously was a pre-defined PyTorch layer. I will now write a GRU class that takes the same parameters as the built-in PyTorch class.


In [4]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
from torch.nn.parameter import Parameter

class GRU(nn.Module):
  def __init__(self, input_size, hidden_size, num_layers):
    super(GRU, self).__init__()
    self.sigmoid = nn.Sigmoid()
    self.tanh = nn.Tanh()
    
    self.W_ir = nn.Linear(input_size, hidden_size)
    self.W_hr = nn.Linear(hidden_size, hidden_size)
    
    self.W_iz = nn.Linear(input_size, hidden_size)
    self.W_hz = nn.Linear(hidden_size, hidden_size)
    
    self.W_in = nn.Linear(input_size, hidden_size)
    self.W_hn = nn.Linear(hidden_size, hidden_size)
      
  def forward(self, inputs, hidden):
    # hidden : (n_layers, batch, hidden_size)
    # Note: num_layers is accepted to mirror the built-in signature, but only
    # a single layer is computed here.
    
    # The layer does the following:
    # r_t = sigmoid(W_ir @ x_t + b_ir + W_hr @ h_(t-1) + b_hr)
    # z_t = sigmoid(W_iz @ x_t + b_iz + W_hz @ h_(t-1) + b_hz)
    # n_t = tanh(W_in @ x_t + b_in + r_t * (W_hn @ h_(t-1) + b_hn))
    # h_t = (1 - z_t) * n_t + z_t * h_(t-1)
    # where @ is a matrix-vector product and * is the Hadamard (elementwise) product
    r_t = self.sigmoid(self.W_ir(inputs) + self.W_hr(hidden))
    z_t = self.sigmoid(self.W_iz(inputs) + self.W_hz(hidden))
    n_t = self.tanh(self.W_in(inputs) + r_t * (self.W_hn(hidden)))
    hiddens = (1 - z_t) * n_t + z_t * hidden
    
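    # For a single layer, the output at each step is the new hidden state h_t
    # (the candidate state n_t is only an intermediate value)
    return hiddens, hiddens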
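
One way to sanity-check hand-written gate equations like the ones above is to compare them against PyTorch's built-in `nn.GRUCell` using the same weights. The sketch below is illustrative and not part of the original notebook: `manual_gru_step` is a hypothetical helper, and the sizes are arbitrary. It relies on `nn.GRUCell` stacking its gate weights in the order reset, update, new.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
input_size, hidden_size = 4, 3  # arbitrary sizes chosen for the check
cell = nn.GRUCell(input_size, hidden_size)

def manual_gru_step(x, h, cell):
    # Hypothetical helper: one GRU step written out gate by gate.
    # nn.GRUCell stores the gate parameters stacked as (reset | update | new).
    W_ir, W_iz, W_in = cell.weight_ih.chunk(3, dim=0)
    W_hr, W_hz, W_hn = cell.weight_hh.chunk(3, dim=0)
    b_ir, b_iz, b_in = cell.bias_ih.chunk(3, dim=0)
    b_hr, b_hz, b_hn = cell.bias_hh.chunk(3, dim=0)

    r = torch.sigmoid(x @ W_ir.T + b_ir + h @ W_hr.T + b_hr)     # reset gate
    z = torch.sigmoid(x @ W_iz.T + b_iz + h @ W_hz.T + b_hz)     # update gate
    n = torch.tanh(x @ W_in.T + b_in + r * (h @ W_hn.T + b_hn))  # candidate state
    return (1 - z) * n + z * h                                   # new hidden state

x = torch.randn(1, input_size)
h = torch.randn(1, hidden_size)
with torch.no_grad():
    expected = cell(x, h)
    actual = manual_gru_step(x, h, cell)

print("matches nn.GRUCell:", torch.allclose(expected, actual, atol=1e-6))
```

The same comparison could be adapted to the `GRU` class in this notebook by copying weights into its `nn.Linear` layers, which is left out here to keep the sketch short.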

---

##  Building a sequence to sequence model

---


In [5]:
# The final linear layer maps the hidden state (hidden_size) to one logit per character
class RNN(nn.Module):
  def __init__(self, input_size, hidden_size, output_size, n_layers=1):
    super(RNN, self).__init__()
    self.input_size = input_size # number of characters in the vocabulary
    self.hidden_size = hidden_size # 200 in run()
    self.output_size = output_size # number of characters in the vocabulary
    self.n_layers = n_layers # 1 in run()
  
    self.embedding = nn.Embedding(input_size, hidden_size)
    self.gru = GRU(hidden_size, hidden_size, n_layers)
    self.out = nn.Linear(hidden_size, output_size)

  def forward(self, input_char, hidden):
    output = self.embedding(input_char).view(1, 1, -1)
    
    output = F.relu(output)
    
    output, hidden = self.gru(output, hidden)
    
    output = self.out(output[0])
    
    return output, hidden

  def init_hidden(self):
    return torch.zeros(self.n_layers, 1, self.hidden_size)

In [6]:
def random_training_set():    
  chunk = random_chunk()
  inp = char_tensor(chunk[:-1])
  target = char_tensor(chunk[1:])
  return inp, target
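
To make the input/target relationship concrete, here is a small standalone sketch of the same one-character shift, using plain Python lists instead of tensors. The word "Frodo" is just an illustrative chunk; the indices come from `string.printable`, the same vocabulary used above.

```python
import string

all_chars = string.printable
chunk = "Frodo"  # illustrative stand-in for a 201-character chunk

# The input is every character but the last; the target is every character
# but the first, so target[i] is the character that follows inp[i].
inp = [all_chars.index(c) for c in chunk[:-1]]
target = [all_chars.index(c) for c in chunk[1:]]

for x, y in zip(inp, target):
    print(repr(all_chars[x]), "->", repr(all_chars[y]))
```

At every position the network sees one character and is scored on how well it predicts the next one.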

In [7]:
def train(decoder, decoder_optimizer, criterion, inp, target):
  # Reset gradients and start each chunk from a zeroed hidden state
  decoder_optimizer.zero_grad()
  hidden = decoder.init_hidden()
  loss = 0
  
  # Feed one character at a time and score the prediction of the next one
  for x, y in zip(inp, target):
    y_hat, hidden = decoder(x, hidden)
    loss += criterion(y_hat, y.unsqueeze(0))
   
  # Backpropagate through the whole chunk, then take one optimizer step
  loss.backward()
  decoder_optimizer.step()
  # Report the average loss per character
  return loss.item() / target.shape[0]
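
A useful reference point when reading the per-character loss returned by `train` is the loss of a model that guesses uniformly at random. The sketch below works that out for the 100-character `string.printable` vocabulary; the `perplexity` helper is an illustrative addition, not something the notebook defines.

```python
import math
import string

n_characters = len(string.printable)   # vocabulary size for the first dataset
uniform_loss = math.log(n_characters)  # cross-entropy (in nats) of a uniform guess

def perplexity(avg_loss):
    # Hypothetical helper: exp(loss) is the effective number of equally
    # likely characters the model is choosing between at each step.
    return math.exp(avg_loss)

print(f"vocabulary size: {n_characters}")
print(f"uniform baseline loss: {uniform_loss:.3f}")
print(f"perplexity at baseline: {perplexity(uniform_loss):.1f}")
```

A freshly initialized model should report a loss near this baseline, and any value well below it indicates the network has learned something about the character statistics.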

---

## Sampling text and Training information

---

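This function takes a decoder, primes it with `prime_str`, and samples a string of length `predict_len` one character at a time. The `temperature` argument scales the predictions before sampling: lower values make the output more conservative, higher values make it more random.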


In [8]:
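def evaluate(decoder, prime_str='A', predict_len=100, temperature=0.8):
  # Encode the priming string and start from a zeroed hidden state
  prime_input = char_tensor(prime_str)
  hidden = decoder.init_hidden()
  output_str = ""
  
  with torch.no_grad():
    # Warm up the hidden state on all but the last priming character
    for char in prime_input[:-1]:
      _, hidden = decoder(char, hidden)
    inp = prime_input[-1]
    
    for _ in range(predict_len):
      prediction, hidden = decoder(inp, hidden)
      # Temperature-scaled logits become unnormalized sampling weights
      weights = torch.exp(prediction.view(-1) / temperature)
      sample_index = torch.multinomial(weights, 1)[0]
      output_str += all_characters[sample_index.item()]
      # Feed the sampled character back in as the next input
      inp = sample_index
  
  # Only the generated characters are returned; the priming string is not included
  return output_str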
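
The effect of dividing the logits by `temperature` before exponentiating can be seen on a toy example. The sketch below uses made-up logits for three characters and plain Python; `temperature_probs` is an illustrative helper, not part of the notebook.

```python
import math

def temperature_probs(logits, temperature):
    # Scale the logits, then normalize exp(scaled) into probabilities.
    # Subtracting the max first keeps exp() from overflowing.
    scaled = [value / temperature for value in logits]
    largest = max(scaled)
    weights = [math.exp(value - largest) for value in scaled]
    total = sum(weights)
    return [w / total for w in weights]

logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate characters
for t in (0.5, 0.8, 1.5):
    probs = temperature_probs(logits, t)
    print(f"temperature={t}:", [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring character, while higher temperatures flatten the distribution and produce more varied (and more error-prone) text.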


---

## Running it and generating some text!

---
Now it is time to run the model. The `run` function trains the decoder and collects sample strings along the way.

In [9]:
import time
import gc
from IPython.core.ultratb import AutoFormattedTB
__ITB__ = AutoFormattedTB(mode = 'Verbose',color_scheme='LightBg', tb_offset = 1)


def print_strings(strings):
  for i in range(len(strings)):
    print("\n\t--------- output string ", i+1, " -----------\n", strings[i])


def run():
  try:
    gc.collect()
    
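    n_epochs = 2000     # training steps; each step uses one random chunk
    print_every = 130   # sample an output string every 130 steps
    plot_every = 200    # record the average loss every 200 steps
    hidden_size = 200
    n_layers = 1
    lr = 0.001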

    decoder = RNN(n_characters, hidden_size, n_characters, n_layers)
    decoder_optimizer = torch.optim.Adam(decoder.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    start = time.time()
    all_losses = []
    output_strings = []
    loss_avg = 0

    loop = tqdm(total=n_epochs, position=0, leave=False)
    for epoch in range(1, n_epochs + 1):
      loss_ = train(decoder, decoder_optimizer, criterion, *random_training_set())       
      loss_avg += loss_

      if epoch % print_every == 0:
          output_strings.append(evaluate(decoder, 'Wh', 100))

      if epoch % plot_every == 0:
          all_losses.append(loss_avg / plot_every)
          loss_avg = 0

      loop.set_description('loss:{:.4f}'.format(loss_))
      loop.update(1)

    return output_strings, all_losses, decoder

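  except Exception:
      # Print a verbose traceback, then re-raise so the tuple unpacking below
      # does not fail with an unrelated error about unpacking None
      __ITB__()
      raise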

output_strings, all_losses, decoder = run()



In [10]:
print_strings(output_strings)


	--------- output string  1  -----------
 eei wane and athe fats Nheerelld to win! ghes 'of dong 
'oft bbe hit fod the lo+ Dast no Af of theg 

	--------- output string  2  -----------
  er the lind pars the he borte the and pote sor we done and upland tith aing the the the soner 
the 

	--------- output string  3  -----------
 ee no, his. Them deen it ons 
for of ros sill outs wito se the sood of tine and swith not enico shin

	--------- output string  4  -----------
 hild tur there with wast, of the Sould trome hor hig the horigh byer uilod alit be ill their wit the

	--------- output string  5  -----------
 eing, and his were manking in the going wereh. 

'We 
then cralk whe was the Enfing andy grounn free

	--------- output string  6  -----------
 eat ba-bethaally out on 
the meght they 

Frodo light the Harst and shoutaled cag's on the singed th

	--------- output string  7  -----------
 hed and greatt!' 

That the now tho6 the swiated to lowerss, thated. Hach some of the Nort? T

In [11]:
start_strings = [" Th", " wh", " he", " I ", " ca", " G", " lo", " ra"]
for i in range(10):
  # Pick a random priming string, print it, then print the sampled continuation
  start = random.choice(start_strings)
  print(start)
  print(evaluate(decoder, start, 200), '\n')

 lo
aiught to the could man like the Land you, the sone a read 
whide the slow, munt under fer the plans, I do to jomer it thought his had should way back and 
this life had 
could sile leaped the 
black  

 G
sandalf 
place was seaken on the morn's and long littless and 
the tolmil had such to shid, on his is a fell fell. He was into Midor? For he Sam hall reaty stwards the passess and Shorget up to that r 

 I 
s cowly, and he and falitil wind the blace unour hobe, was to the 
water stallly into hard 
fars my reet were the 
right, and the hobbits hosened to 
the inder 
the mace 
still, and such a recom the E 

 ra
tesh abod is of and 
said, at in the tearn the slow it arm the homper had least the sen upon, you soud, shadol he many. But the Hound of the surmer and the clowed, the sheld by a blawn, the may and be 

 he
ci peary upon the leves, and shot ex, Touthered they and all may the clumped than the long with went, and was and palled to the Sarown indee, steary 
pind. 

'I scold wi

---

## Generating output on a different dataset

---
I will now generate output from a Shakespeare dataset.

In [12]:
try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass
import tensorflow as tf

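# TensorFlow is used here only for its download-and-cache helper
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')
file = unidecode.unidecode(open(path_to_file).read())
file_len = len(file)
# Rebuild the vocabulary from the new text. char_tensor, random_chunk, and run
# read these globals, so the new model is sized to the Shakespeare character set.
all_characters = sorted(set(file))
n_characters = len(all_characters)

output_strings2, all_losses2, _ = run()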

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt
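
Building the vocabulary from the text itself, as done with `sorted(set(file))`, can be illustrated on a short string. The sketch below is a standalone example with a made-up sentence; the `index_of` dictionary is an assumption added here for constant-time lookups, whereas the notebook uses `all_characters.index`.

```python
text = "To be, or not to be"  # made-up stand-in for the downloaded file

# One entry per distinct character, in a stable (sorted) order
vocab = sorted(set(text))
index_of = {char: i for i, char in enumerate(vocab)}

# Encode to indices and decode back to check the mapping is lossless
encoded = [index_of[char] for char in text]
decoded = "".join(vocab[i] for i in encoded)

print("vocabulary:", vocab)
print("round trip ok:", decoded == text)
```

Because the vocabulary only contains characters that actually occur, the embedding and output layers are no larger than they need to be for the dataset.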




In [13]:
print_strings(output_strings2)


	--------- output string  1  -----------
 hord oigenst wea chel con

ITDAiW boMuracre ay thesthell.

I:
WrRDGNNOERIO:
Mind?
Mow hate the thitu

	--------- output string  2  -----------
 hat thay he wentiss ther aneressbe,
Dere derd beithed,
Cat and apreentere ow surd tham the Wo cablef

	--------- output string  3  -----------
  or denker.
is me it 'leek ar cpon, weaciner, of you?
I has on the this in ofes

COUCIUCE RINIENT:
A

	--------- output string  4  -----------
 eer, hairs, door lighall,
Whee leens to thy Whath ste diving.

AOUS:
Peer are it siess the with dice

	--------- output string  5  -----------
  ere welpight with will so plotst cand
Benot no, well them bivatice caman struth that to me simme?



	--------- output string  6  -----------
 e the with lay!

KIOND MINUS:
Now chrevery neseld the oforest the monss
Comread, may amelam the to h

	--------- output string  7  -----------
 he, and, it sight,
Condenterfed so that lards will for chall the pracem,
And her shou my not 

**Performance evaluation**

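I'd say the model did reasonably well for a character-level network with a single GRU layer. It adapted its style to each dataset: the Lord of the Rings samples read as prose and include names such as Frodo and Shire, while the Shakespeare samples use short lines and capitalized speaker labels. Both contain many real English words alongside invented ones.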