Alan Turing's famous Turing test helps evaluate whether a machine's intelligence matches a human's. Turing called this test the imitation game: a machine has to try to fool a human into thinking it is also human.

Recurrent Neural Networks (RNNs) are a common approach to language tasks, and they come in several variants with different use cases:

- Character RNN, used to predict the next character in a sentence, first as a stateless RNN and then as a stateful RNN.
- Sentiment Analysis, which extracts the feeling expressed in a sentence.
- Neural Machine Translation (NMT), capable of translating between languages.

We will also look at how to boost RNN performance with an Encoder-Decoder architecture and Attention Mechanisms, which allow the network to focus on a select part of the inputs at each time step.

Finally, we will look at the Transformer, a very successful NLP architecture, before discussing GPT-2 and BERT.

In [1]:
import sys
assert sys.version_info >= (3, 5)  # fail early on an unsupported Python version

import numpy as np
import tensorflow as tf
# compare the major version as an integer; comparing version strings is unreliable
assert int(tf.__version__.split(".")[0]) >= 2
from tensorflow import keras
import matplotlib.pyplot as plt

# Shakespeare Dataset

Below is an example of how to work with text data: we convert it to character IDs with a tokenizer, and we split it sequentially because, unlike tabular data, text cannot be shuffled.

In [2]:
shakespeare_url = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"
filepath = keras.utils.get_file("shakespeare.txt", shakespeare_url)
with open(filepath) as f:
    shakespeare_text = f.read()

In [3]:
print(shakespeare_text[60:250])



All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



In [4]:
"".join(sorted(set(shakespeare_text.lower()))) # list of characters within dataset

"\n !$&',-.3:;?abcdefghijklmnopqrstuvwxyz"

In [5]:
# convert all characters into a unique character ID
tokenizer = keras.preprocessing.text.Tokenizer(char_level=True)
tokenizer.fit_on_texts(shakespeare_text)

In [6]:
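# a bare string is iterated character by character, so each character becomes its own one-element sequence
tokenizer.texts_to_sequences('Romeo')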

[[9], [4], [15], [2], [4]]

In [7]:
"".join(tokenizer.sequences_to_texts([[9], [4], [15], [2], [4]]))

'romeo'

In [8]:
max_id = len(tokenizer.word_index) # number of distinct characters
dataset_size = tokenizer.document_count # total number of characters (each character was counted as one document)

Note that the tokenizer assigns IDs from 1 to 39, so when we convert the entire text to IDs we subtract 1 to get IDs from 0 to 38.

In [9]:
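# texts_to_sequences returns one list of IDs per text; texts_to_matrix would return a bag-of-words matrix instead
[encoded] = np.array(tokenizer.texts_to_sequences([shakespeare_text])) - 1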
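
To make the ID shift concrete, here is a minimal sketch in plain Python and NumPy rather than the Keras `Tokenizer` itself. The helper `build_char_index` and the `sample_text` string are hypothetical; the sketch assumes IDs start at 1 and are assigned by descending character frequency.

```python
from collections import Counter

import numpy as np


def build_char_index(text):
    """Hypothetical helper: map each lowercased character to an ID starting at 1,
    most frequent character first."""
    counts = Counter(text.lower())
    return {ch: i for i, (ch, _) in enumerate(counts.most_common(), start=1)}


sample_text = "to be or not to be"  # illustrative text, not from the dataset
char_index = build_char_index(sample_text)

# IDs run from 1 to the number of distinct characters
ids_from_one = np.array([char_index[ch] for ch in sample_text.lower()])

# subtracting 1 shifts them to 0 .. n_chars - 1, so they can be used as class indices
ids_from_zero = ids_from_one - 1

print(len(char_index), ids_from_one.min(), ids_from_zero.min(), ids_from_zero.max())
```

Zero-based IDs are convenient because they can index directly into a one-hot vector or an output layer with one unit per character.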

--- 
#TO DO

## Split data
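
Because the text is one long sequence, a split that respects character order keeps the earlier portion for training and holds out the later portions. A minimal sketch, assuming a 90/5/5 split; the proportions and the stand-in `encoded_demo` array are illustrative choices, not values taken from this notebook.

```python
import numpy as np

# stand-in for the encoded text: 1,000 character IDs in their original order
encoded_demo = np.arange(1000) % 39

n_total = len(encoded_demo)
train_end = n_total * 90 // 100   # first 90% for training (assumed proportion)
valid_end = n_total * 95 // 100   # next 5% for validation, last 5% for testing

# slice instead of shuffling so that each subset stays a contiguous run of text
train_ids = encoded_demo[:train_end]
valid_ids = encoded_demo[train_end:valid_end]
test_ids = encoded_demo[valid_end:]

print(len(train_ids), len(valid_ids), len(test_ids))
```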

# Char RNN

We can train a model on all of Shakespeare's work and then use it to predict the next character in a sentence. This can be used to generate novel text, which is pretty fun to read about.

Read this blog post by Andrej Karpathy: https://karpathy.github.io/2015/05/21/rnn-effectiveness/
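
To train a model to predict the next character, one common setup pairs each input window with a target that is the same window shifted one character ahead. A minimal sketch with NumPy, assuming a window length `n_steps` of 5 (an arbitrary value chosen for illustration) and a hypothetical helper `make_windows`:

```python
import numpy as np


def make_windows(ids, n_steps):
    """Hypothetical helper: slice a 1-D array of character IDs into overlapping
    (input, target) pairs, with each target shifted by one character."""
    window_length = n_steps + 1  # one extra ID so the target can look one step ahead
    n_windows = len(ids) - window_length + 1
    windows = np.stack([ids[i:i + window_length] for i in range(n_windows)])
    return windows[:, :-1], windows[:, 1:]


ids = np.arange(10)  # stand-in for a short run of encoded characters
X, Y = make_windows(ids, n_steps=5)

print(X.shape, Y.shape)
print(X[0], Y[0])
```

At every position the target is the character that follows the corresponding input character, so the model gets a training signal at each time step rather than only at the end of the window.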

# Stateful RNN

# Sentiment Analysis


# Bidirectional RNNs

# Attention Mechanisms