<a href="https://colab.research.google.com/github/pjohnst5/Char-nn/blob/master/Char_nn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Description:
An implementation of [Karpathy's char-rnn model](http://karpathy.github.io/2015/05/21/rnn-effectiveness/). This is a recurrent neural network trained to predict the next character in a sequence; once trained, it can be sampled to generate new text that resembles the original. For further reading, see [Understanding LSTM Networks](http://colah.github.io/posts/2015-08-Understanding-LSTMs/).



## Results
* Wiring up a basic sequence-to-sequence computation graph
* Implementing a GRU cell


An example of my final samples is shown below, taken after 150 passes through the Lord of the Rings text dataset.

<code>
eide and the cece the eviled understade and Shire. 
Them. And the rider his allove. 
It he hape
 eer was need to of more blown to still new rithed to have collong to not the our to the 
mucker abou
</code>


---

## Data loading and high level training

---






In [0]:
! wget -O ./text_files.tar.gz 'https://piazza.com/redirect/s3?bucket=uploads&prefix=attach%2Fjlifkda6h0x5bk%2Fhzosotq4zil49m%2Fjn13x09arfeb%2Ftext_files.tar.gz' 
! tar -xzf text_files.tar.gz
! pip install unidecode
! pip install torch

import unidecode
import string
import random
import re
from tqdm import tqdm
import pdb
 
all_characters = string.printable
n_characters = len(all_characters)
file = unidecode.unidecode(open('./text_files/lotr.txt').read())
file_len = len(file)
print('file_len =', file_len)


In [0]:
chunk_len = 200
 
def random_chunk():
  # randint is inclusive on both ends; the -1 keeps the chunk_len + 1 slice inside the file
  start_index = random.randint(0, file_len - chunk_len - 1)
  end_index = start_index + chunk_len + 1
  return file[start_index:end_index]
  
print(random_chunk())

swer, but he took the other's eye and held it, 
and for a moment they strove thus; but soon, though Aragorn did not stir nor 
move hand to weapon, the other quailed and gave back as if menaced with a 



In [0]:
import torch
# Turn a string into a 1-D LongTensor of vocabulary indices
def char_tensor(text):
  indices = [all_characters.index(ch) for ch in text]
  return torch.tensor(indices, dtype=torch.long)

print(char_tensor('abcDEF'))

tensor([10, 11, 12, 39, 40, 41])
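
To make the mapping concrete, here is a minimal sketch of the same encoding in plain Python, together with a hypothetical inverse helper `decode` (not part of the notebook). It assumes the vocabulary is `string.printable`, as in the cell above, where the ten digits come first, so `'a'` lands at index 10 and `'D'` at index 39.

```python
import string

all_characters = string.printable  # same vocabulary as the notebook

def encode(text):
    """Map each character to its position in the vocabulary."""
    return [all_characters.index(ch) for ch in text]

def decode(indices):
    """Hypothetical inverse of encode: map positions back to characters."""
    return "".join(all_characters[i] for i in indices)

indices = encode("abcDEF")
print(indices)          # [10, 11, 12, 39, 40, 41]
print(decode(indices))  # abcDEF
```

The sampling code later in the notebook performs the `decode` direction implicitly when it indexes `all_characters` with a sampled index.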


---

## Creating a GRU cell 

---

Previously I used PyTorch's built-in GRU layer. Here I write my own `GRU` class that takes the same constructor parameters as the built-in PyTorch class.


In [0]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
from torch.nn.parameter import Parameter

class GRU(nn.Module):
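  def __init__(self, input_size, hidden_size, num_layers):
    # num_layers is accepted to mirror nn.GRU's signature but is not used;
    # this class implements a single GRU layer.
    super(GRU, self).__init__()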
    self.sigmoid = nn.Sigmoid()
    self.tanh = nn.Tanh()
    
    self.W_ir = nn.Linear(input_size, hidden_size)
    self.W_hr = nn.Linear(hidden_size, hidden_size)
    
    self.W_iz = nn.Linear(input_size, hidden_size)
    self.W_hz = nn.Linear(hidden_size, hidden_size)
    
    self.W_in = nn.Linear(input_size, hidden_size)
    self.W_hn = nn.Linear(hidden_size, hidden_size)
      
  def forward(self, inputs, hidden):
    # hidden : (n_layers, batch, hidden_size)
    
    # A GRU layer computes the following (each nn.Linear carries its own bias):
    # r_t = sigmoid(W_ir x_t + b_ir + W_hr h_(t-1) + b_hr)        reset gate
    # z_t = sigmoid(W_iz x_t + b_iz + W_hz h_(t-1) + b_hz)        update gate
    # n_t = tanh(W_in x_t + b_in + r_t * (W_hn h_(t-1) + b_hn))   candidate state
    # h_t = (1 - z_t) * n_t + z_t * h_(t-1)
    # where * is the Hadamard (elementwise) product, not matrix multiplication.
    r_t = self.sigmoid(self.W_ir(inputs) + self.W_hr(hidden))
    z_t = self.sigmoid(self.W_iz(inputs) + self.W_hz(hidden))
    n_t = self.tanh(self.W_in(inputs) + r_t * (self.W_hn(hidden)))
    hiddens = (1 - z_t) * n_t + z_t * hidden
    
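    # The candidate state n_t is returned as the step output; note that
    # torch.nn.GRU returns the new hidden state h_t in that position.
    return n_t, hiddens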
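
The gate equations in the comments above can be traced by hand in a scalar setting. The following is a minimal sketch in plain Python, assuming a hidden size of 1 and made-up weights (the `params` dict and the `gru_step` helper are illustrative and not part of the notebook). It shows how the update gate `z_t` decides between keeping the previous state and replacing it with the candidate `n_t`.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    """One GRU update for a scalar input x and a scalar hidden state h.

    p is a dict of hypothetical scalar weights and biases, named after
    the terms in the GRU equations.
    """
    r = sigmoid(p["w_ir"] * x + p["b_ir"] + p["w_hr"] * h + p["b_hr"])  # reset gate
    z = sigmoid(p["w_iz"] * x + p["b_iz"] + p["w_hz"] * h + p["b_hz"])  # update gate
    n = math.tanh(p["w_in"] * x + p["b_in"] + r * (p["w_hn"] * h + p["b_hn"]))  # candidate
    return (1.0 - z) * n + z * h  # blend candidate and previous state

# Illustrative values only: every weight is 1 and every bias is 0.
params = {k: 1.0 for k in ("w_ir", "w_hr", "w_iz", "w_hz", "w_in", "w_hn")}
params.update({k: 0.0 for k in ("b_ir", "b_hr", "b_iz", "b_hz", "b_in", "b_hn")})

# A strongly positive update-gate bias drives z_t toward 1: the old state is kept.
keep = dict(params, b_iz=20.0)
# A strongly negative one drives z_t toward 0: the candidate replaces it.
overwrite = dict(params, b_iz=-20.0)

h_prev, x = 0.5, 1.0
print(gru_step(x, h_prev, keep))       # very close to h_prev
print(gru_step(x, h_prev, overwrite))  # very close to the candidate n_t
```

In the vector case the same arithmetic is applied per hidden unit, which is why the Hadamard product appears in the equations.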

---

##  Building a sequence to sequence model

---


In [0]:
# Character-level model: embedding -> GRU -> linear layer that maps the
# hidden state to one score per vocabulary character
class RNN(nn.Module):
  def __init__(self, input_size, hidden_size, output_size, n_layers=1):
    super(RNN, self).__init__()
    self.input_size = input_size # vocabulary size
    self.hidden_size = hidden_size # 200 in run()
    self.output_size = output_size # vocabulary size
    self.n_layers = n_layers # 1 in run()
  
    self.embedding = nn.Embedding(input_size, hidden_size)
    self.gru = GRU(hidden_size, hidden_size, n_layers)
    self.out = nn.Linear(hidden_size, output_size)

  def forward(self, input_char, hidden):
    output = self.embedding(input_char).view(1, 1, -1)
    
    output = F.relu(output)
    
    output, hidden = self.gru(output, hidden)
    
    output = self.out(output[0])
    
    return output, hidden

  def init_hidden(self):
    # Variable is a no-op wrapper in current PyTorch; a plain tensor suffices
    return torch.zeros(self.n_layers, 1, self.hidden_size)

In [0]:
def random_training_set():    
  chunk = random_chunk()
  inp = char_tensor(chunk[:-1])
  target = char_tensor(chunk[1:])
  return inp, target
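
The input and target are the same chunk offset by one character, so each input character is paired with the character that follows it. A minimal sketch on a short string (the helper name `make_pair` is hypothetical, and plain strings stand in for the tensors):

```python
def make_pair(chunk):
    """Split a chunk into (input, target), offset by one character."""
    return chunk[:-1], chunk[1:]

inp, target = make_pair("Frodo")
for x, y in zip(inp, target):
    print(repr(x), "->", repr(y))
# 'F' -> 'r'
# 'r' -> 'o'
# 'o' -> 'd'
# 'd' -> 'o'
```

This is why `random_chunk` slices `chunk_len + 1` characters: a chunk of that length yields exactly `chunk_len` training pairs.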

In [0]:
def train(decoder, decoder_optimizer, criterion, inp, target):
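  # Reset gradients, start from a zeroed hidden state, and accumulate the loss
  decoder_optimizer.zero_grad()
  hidden = decoder.init_hidden()
  loss = 0

  # Feed one character at a time; the target is the next character in the chunk
  for x, y in zip(inp, target):
    y_hat, hidden = decoder(x, hidden)
    loss += criterion(y_hat, y.unsqueeze(0))

  # Backpropagate through the whole chunk, then report the mean per-character loss
  loss.backward()
  decoder_optimizer.step()
  return loss.item() / target.shape[0]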
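
For a sense of scale, the per-character loss returned by `train` can be compared with what an uninformed model would score. The sketch below is plain Python and assumes the 100-character `string.printable` vocabulary; `cross_entropy` is a hand-written stand-in for what `nn.CrossEntropyLoss` computes on a single example, not the PyTorch implementation itself.

```python
import math

def cross_entropy(logits, target_index):
    """Negative log-probability of the target under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target_index]

n_characters = 100  # len(string.printable)

# A model with no preference assigns equal scores to every character.
uniform_loss = cross_entropy([0.0] * n_characters, target_index=3)
print(round(uniform_loss, 3))  # 4.605, i.e. ln(100)

# A per-character loss L corresponds to a perplexity of exp(L).
print(math.exp(uniform_loss))
```

A loss well below `ln(100)` therefore means the model has learned to concentrate probability on a small number of plausible next characters.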

---

## Sampling text and Training information

---

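This function takes a trained decoder and a priming string and returns a sampled string of length `predict_len`. The `temperature` argument scales the scores before sampling: lower values make the output more conservative, higher values make it more varied.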


In [0]:
def evaluate(decoder, prime_str='A', predict_len=100, temperature=0.8):
  # Encode the priming string and start from a zeroed hidden state
  prime_str = char_tensor(prime_str)
  hidden = decoder.init_hidden()
  output_str = ""
  
  with torch.no_grad():
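    while len(output_str) < predict_len:
      # First pass: step through the priming characters.
      # Later passes: prime_str holds only the last sampled character.
      for char in prime_str:
        prediction, hidden = decoder(char, hidden)

        # Unnormalised weights exp(score / temperature); torch.multinomial normalises them
        prediction = torch.exp(prediction / temperature)
        sample_index = torch.multinomial(prediction, 1)

        output_str += all_characters[sample_index]

      # Feed the most recent sample back in as the next input
      prime_str = sample_index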
  
  return output_str
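
The effect of `temperature` is easier to see on a small set of scores. The following is a minimal sketch in plain Python rather than PyTorch; the helper names `temperature_probs` and `sample_index` and the three example scores are assumptions made for illustration. It normalises `exp(score / temperature)` explicitly, which is the distribution `torch.multinomial` draws from when given those unnormalised weights.

```python
import math
import random

def temperature_probs(logits, temperature):
    """Normalise exp(logit / temperature) into a probability distribution."""
    scaled = [v / temperature for v in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(v - m) for v in scaled]
    total = sum(weights)
    return [w / total for w in weights]

def sample_index(logits, temperature, rng=random):
    """Draw one index with probability proportional to exp(logit / temperature)."""
    probs = temperature_probs(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.0]  # illustrative scores for three characters
for t in (0.2, 0.8, 2.0):
    print(t, [round(p, 3) for p in temperature_probs(logits, t)])

print(sample_index(logits, temperature=0.8))
```

At low temperature nearly all of the probability mass sits on the highest-scoring character, while at high temperature the distribution flattens and rarer characters are sampled more often.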


---

## Running it and generating some text!

---
Now it is time to run the model. The cell below trains it and collects sample strings along the way.

In [0]:
import time
import gc
from IPython.core.ultratb import AutoFormattedTB
__ITB__ = AutoFormattedTB(mode = 'Verbose',color_scheme='LightBg', tb_offset = 1)


def print_strings(strings):
  for i in range(len(strings)):
    print("\n\t--------- output string ", i+1, " -----------\n", strings[i])


def run():
  try:
    gc.collect()
    
    n_epochs = 2000
    print_every = 130
    plot_every = 200
    hidden_size = 200
    n_layers = 1
    lr = 0.001

    decoder = RNN(n_characters, hidden_size, n_characters, n_layers)
    decoder_optimizer = torch.optim.Adam(decoder.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()

    start = time.time()
    all_losses = []
    output_strings = []
    loss_avg = 0

    loop = tqdm(total=n_epochs, position=0, leave=False)
    for epoch in range(1, n_epochs + 1):
      loss_ = train(decoder, decoder_optimizer, criterion, *random_training_set())       
      loss_avg += loss_

      if epoch % print_every == 0:
          output_strings.append(evaluate(decoder, 'Wh', 100))

      if epoch % plot_every == 0:
          all_losses.append(loss_avg / plot_every)
          loss_avg = 0

      loop.set_description('loss:{:.4f}'.format(loss_))
      loop.update(1)

    return output_strings, all_losses, decoder

  except:
      __ITB__()

output_strings, all_losses, decoder = run()

loss:1.7562: 100%|██████████| 2000/2000 [09:02<00:00,  3.66it/s]

In [0]:
print_strings(output_strings)


	--------- output string  1  -----------
 eed and kees Ipat 
re sthist deay gepaprher thaind rooly nore ad 
aid heeg heed thit chid uro had he

	--------- output string  2  -----------
  alg stound ar of the wisch wite ting was wared siesteri thes the sewith cocli!, stor theven ard 
'n

	--------- output string  3  -----------
  our speat pleds it up frentherst that in to now It surep na 
old the sseade lodsthed deard and the 

	--------- output string  4  -----------
 eed liges they are mame had ouck with that they fille gow Polver 
pilk ta had 'med Gavary sund mill 

	--------- output string  5  -----------
  en the shalls freat and fured to to houg. I wat rompoon the done it we ball, of that were stree fir

	--------- output string  6  -----------
  en Jo dere, and land ley for hears for 





'Wat hought love silly high. But to 
lowe his were the

	--------- output string  7  -----------
 ee limany his come sider, and caney; by the Sing to we 
ince with 
with the free!' said 
he a

In [0]:
start_strings = [" Th", " wh", " he", " I ", " ca", " G", " lo", " ra"]
for i in range(10):
  # Pick a random priming string and sample 200 characters from it
  prime = random.choice(start_strings)
  print(prime)
  print(evaluate(decoder, prime, 200), '\n')

 Th
loen reat be and do 
diswering bread heart nawel, 
he were, and the toads now, and he heave he sleat he say hobbits of the tiner the 
groms head all wisthing, you said and they holder and their pather 

 wh
teat all coupting of mengn and dalk 
the ridden of Ores for he sungen to to ace one for shadows felt and were in the what the was not when arry that I walls seen goting of the choom anaver may 

chang 

 ca
aon ring trurd. Then is a was inter the nor the Roantine rooming 
of enen and an the slowed not them thought they find them,' said Gandalf of the farbbight in the Earths begone of ambing his feery of  

 wh
aaile here the 




Fromm tuer fating of the Itried and rest I lat me 
cound upon a land him side now and that reas and 
ane and 










and been that they hom's be seep and you frim and think and  

 Th
shings and be not hell in the that less uppeary water of an 
manes lants that id no beed amponike then we dee than of 
comporouned we find weRtentel before fall, and sh

---

## Generating output on a different dataset

---
I will now generate output from a Shakespeare dataset.

In [0]:
try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass
import tensorflow as tf

path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')
file = unidecode.unidecode(open(path_to_file).read())
file_len = len(file)
all_characters = sorted(set(file))
n_characters = len(all_characters)

output_strings2, all_losses2, _ = run()

TensorFlow 2.x selected.


loss:1.5133: 100%|██████████| 2000/2000 [09:17<00:00,  3.63it/s]
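
Unlike the first run, which used the fixed `string.printable` vocabulary, the cell above derives the vocabulary from the dataset itself, so the embedding and output layers are sized to the distinct characters in the corpus. A minimal sketch of that step on a short string (the `char_to_index` dict is a hypothetical convenience; the notebook looks indices up with `all_characters.index`):

```python
text = "to be or not to be"

vocab = sorted(set(text))  # one entry per distinct character
char_to_index = {ch: i for i, ch in enumerate(vocab)}

print(vocab)       # [' ', 'b', 'e', 'n', 'o', 'r', 't']
print(len(vocab))  # 7
print([char_to_index[ch] for ch in "bee"])  # [1, 2, 2]
```

Sorting the set makes the index assignment deterministic, so the same text always yields the same mapping.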

In [0]:
print_strings(output_strings2)


	--------- output string  1  -----------
 ee thes mand ay dhelat keo ou ound the hive, er, o cithoon whteabe,
Ak I hour bed shit wegse tonde r

	--------- output string  2  -----------
 Iith ave tr'ull mistiand un mede
Hard of and ind irs ack ond tnouce thar fersty wonds,
Whe the ince 

	--------- output string  3  -----------
 ie, wrist him, mares rovesis?

PERINGRONTIE:
No to cyelisktinall pore! grese soust Ifows fare, herd,

	--------- output string  4  -----------
 hor the siir aid of and for,
And sherm nat hit come; indon ence id cood houch my consime of hightet 

	--------- output string  5  -----------
 eich will steald ruseds onsame, of them,
Incousefrerss with that your theminr of insush a moness upp

	--------- output string  6  -----------
 iat she dost's live's hims.

POMKENTES:
Araid A farines own to the slime,
A so home, then his peath 

	--------- output string  7  -----------
 eibe sourth as mestard,
Whace not so light, to heaked a puse.

PRING:
I me be menter sue thou

**Performance evaluation**

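I'd say my model did pretty well for 2000 training iterations per dataset. It adapted its writing style to each corpus: the Lord of the Rings samples read as prose and mention names such as Gandalf and hobbits, while the Shakespeare samples use verse-like line breaks and speaker labels such as `PRING:`. It also produced many real English words, although the sentences are not yet coherent.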