<a href="https://colab.research.google.com/github/purvasingh96/Deep-learning-with-neural-networks/blob/master/Deep-learning-with-pytorch/3.%20Recurrent%20Neural%20Networks/Char_RNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Text generation with an RNN


This notebook demonstrates how to generate text using a **character-level LSTM with PyTorch**, trained on the text of the book **Anna Karenina**. Given a sequence of characters from this book, the model generates longer sequences of text by being called repeatedly.

While some of the sentences are grammatical, most do not make sense. The model has not learned the meaning of words, but consider:

* The model is character-based. When training started, the model did not know how to spell an English word, or that words were even a unit of text.

* The structure of the output resembles a play—blocks of text generally begin with a speaker name, in all capital letters similar to the dataset.

* As demonstrated below, the model is trained on small batches of text (100 characters each), and is still able to generate a longer sequence of text with coherent structure. Below is the **general architecture of the character-wise RNN.**<br>
<img src="https://github.com/purvasingh96/Deep-learning-with-neural-networks/blob/master/Deep-learning-with-pytorch/3.%20Recurrent%20Neural%20Networks/images/lstm_rnn_architecture.png?raw=1"></img> 




# Set Up
### Import PyTorch and other libraries


In [0]:
import torch
from torch import nn
import torch.nn.functional as F
import numpy as np


# Download the Anna Karenina data
 

### Read the data

In [0]:
with open('sample_data/anna.txt', 'r') as f:
  text = f.read()

### First look at the text

In [0]:
print(text[:100])

Chapter 1


Happy families are all alike; every unhappy family is unhappy in its own
way.

Everythin


### GPU Usage
Enable GPU acceleration to execute this notebook faster. In Colab: *Runtime > Change runtime type > Hardware accelerator > GPU*. If running locally, make sure your PyTorch installation was built with CUDA support.
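
A quick way to confirm that a GPU is visible to PyTorch is sketched below. The names `train_on_gpu` and `device` are illustrative assumptions and are not defined elsewhere in this notebook.

```python
import torch

# True when this PyTorch build has CUDA support and a GPU is visible.
train_on_gpu = torch.cuda.is_available()

# Illustrative name: the device that tensors and the model would be moved to.
device = torch.device("cuda" if train_on_gpu else "cpu")
print(f"Training on: {device}")
```

If this prints `cpu` in Colab, the runtime type has most likely not been switched to GPU yet.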

# Process the text
### Vectorize the text (Tokenization)
Before training, we need to map strings to a numerical representation. Create two lookup tables: one mapping characters to numbers, and another for numbers to characters.

In [0]:
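# unique characters that occur in the text
chars = tuple(set(text))
# lookup table: integer index -> character
int2char = dict(enumerate(chars))
# lookup table: character -> integer index
char2int = {ch: ii for ii, ch in int2char.items()}

# encode the whole text as an array of integer indices
encoded = np.array([char2int[ch] for ch in text])
encoded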

array([51, 65, 10, ..., 13,  1, 65])
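
As a sanity check, the two lookup tables should invert each other. The sketch below runs the round trip on a short stand-in string (`sample` is a hypothetical placeholder for the full text). It also sorts the character set, which is an assumption added here: `tuple(set(...))` can produce a different ordering between Python sessions, so the integer codes are not guaranteed to be reproducible without sorting.

```python
import numpy as np

sample = "happy families"  # hypothetical stand-in for the full text

# sorted() makes the mapping reproducible across runs.
chars = tuple(sorted(set(sample)))
int2char = dict(enumerate(chars))
char2int = {ch: ii for ii, ch in int2char.items()}

# Encode to integers, then decode back to characters.
encoded = np.array([char2int[ch] for ch in sample])
decoded = "".join(int2char[int(i)] for i in encoded)

print(encoded[:5])
print(decoded)
```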

# Pre-processing the data
Our LSTM expects an input that is **one-hot encoded**, meaning that each character is first converted into an integer (via the dictionary created above) and then into a vector in which only the position at its integer index has the value 1 and every other position is 0.

In [0]:
def one_hot_encode(arr, n_labels):
  # one row of zeros per element of arr, e.g. arr.shape = (1, 3) => 3 rows
  one_hot = np.zeros((arr.size, n_labels), dtype=np.float32)
  # np.arange(3) = [0, 1, 2] selects each row; arr.flatten() = [3, 4, 5]
  # selects the column to set to 1 in that row
  one_hot[np.arange(one_hot.shape[0]), arr.flatten()] = 1
  # restore the original shape of arr with a trailing one-hot dimension
  one_hot = one_hot.reshape((*arr.shape, n_labels))
  return one_hot

In [0]:
test_seq = np.array([[3, 4, 5]])
one_hot_encode(test_seq, 8)

array([[[0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0.]]], dtype=float32)
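
For cross-checking, the same encoding can be sketched by indexing into an identity matrix, since row `i` of the identity is the one-hot vector for label `i`. The helper name `one_hot_eye` is hypothetical, and this is an alternative formulation rather than the function used in this notebook. Taking `argmax` over the last axis recovers the original integers.

```python
import numpy as np

def one_hot_eye(arr, n_labels):
    # Row i of the identity matrix is the one-hot vector for label i.
    return np.eye(n_labels, dtype=np.float32)[arr]

test_seq = np.array([[3, 4, 5]])
one_hot = one_hot_eye(test_seq, 8)

# Decoding: the index of the 1 in each vector is the original integer.
recovered = one_hot.argmax(axis=-1)

print(one_hot.shape)
print(recovered)
```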

# Making training mini-batches

To train on this data, we create mini-batches. Each batch should contain multiple sequences of some desired number of sequence steps, as shown below:<br>
<img src="https://github.com/purvasingh96/Deep-learning-with-neural-networks/blob/master/Deep-learning-with-pytorch/3.%20Recurrent%20Neural%20Networks/images/mini_batch_1.png?raw=1"></img><br><br>
In this example, we'll take the encoded characters (passed in as the `arr` parameter) and split them into multiple sequences, given by `batch_size`. Each of our sequences will be `seq_length` long.

# Creating Batches

### 1. Discard text to accommodate completely full mini-batches

* batch_size = `N (2)`
* seq_length = `M (3)`
* Number of characters in one batch = `N * M (2 * 3 = 6)`
* Total number of full batches `(K)` that can be made out of the given array:

`len(arr) // (number of characters per batch) = 12 // 6 = 2`

* Characters to keep so that every mini-batch is completely full:

`arr[:N * M * K] = arr[:12]`, i.e. `arr[0]` through `arr[11]`. With 12 characters nothing is discarded; any characters beyond index `N * M * K - 1` would be dropped.
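
The trimming step can be sketched as follows, assuming a toy array of 13 integers so that one trailing element is actually dropped. The variable names `chars_per_batch`, `n_batches`, and `trimmed` are illustrative.

```python
import numpy as np

arr = np.arange(1, 14)           # 13 toy "characters": 1..13
batch_size, seq_length = 2, 3    # N and M

chars_per_batch = batch_size * seq_length     # N * M = 6
n_batches = len(arr) // chars_per_batch       # K = 13 // 6 = 2
trimmed = arr[:n_batches * chars_per_batch]   # keeps arr[0]..arr[11], drops arr[12]

print(n_batches, len(trimmed))
```

Integer division (`//`) matters here: it rounds down, so `K` only counts completely full batches.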




### 2. Split the array into N batches
You can do this with:<br>`arr.reshape((batch_size, -1))`<br>
After this, the shape of the array should be:<br>
`N * (M * K)`, that is, `N` rows of `M * K` characters each.
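
A minimal sketch of the reshape, assuming the array has already been trimmed to 12 toy values and using `N = 2`, `M = 3`:

```python
import numpy as np

batch_size, seq_length = 2, 3           # N and M
arr = np.arange(1, 13)                  # 12 values, already a multiple of N * M

# -1 lets NumPy infer the second dimension (M * K = 6 here).
rows = arr.reshape((batch_size, -1))

print(rows.shape)
print(rows)
```

Each row is one contiguous stretch of the original sequence, so character order is preserved within a row.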




### 3. Iterate through mini-batches
The idea is that each mini-batch is an `N * M` window on the `N * (M * K)` array, and this window slides over by `seq_length` at each step. We also want both input and target arrays.
<br>
Target arrays are the input arrays shifted over by one character.
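
The three steps can be combined into a generator, sketched below under stated assumptions. The function name `get_batches` and its signature are illustrative. Wrapping the last target of the final window around to the first column is one possible choice made here for the sketch, not something the text above specifies.

```python
import numpy as np

def get_batches(arr, batch_size, seq_length):
    """Yield (input, target) windows of shape (batch_size, seq_length)."""
    # Steps 1 and 2: trim to full batches, then reshape into N rows.
    chars_per_batch = batch_size * seq_length
    n_batches = len(arr) // chars_per_batch
    arr = arr[:n_batches * chars_per_batch].reshape((batch_size, -1))

    # Step 3: slide a window of width seq_length across the columns.
    for n in range(0, arr.shape[1], seq_length):
        x = arr[:, n:n + seq_length]
        # Targets are the inputs shifted over by one character.
        y = np.zeros_like(x)
        y[:, :-1] = x[:, 1:]
        # Assumption: at the end of the array, wrap around to column 0.
        next_col = (n + seq_length) % arr.shape[1]
        y[:, -1] = arr[:, next_col]
        yield x, y

batches = list(get_batches(np.arange(1, 13), batch_size=2, seq_length=3))
x, y = batches[0]
print(x)
print(y)
```

With 12 toy values, `N = 2`, and `M = 3`, the generator yields two windows, and each target row is its input row advanced by one position.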