# Text I: Working with Text and Sequences, and TensorBoard Visualization

We will cover recurrent neural networks (RNNs) and long short-term memory (LSTM) networks, and show how to handle sequences of variable length.

### Introduction to Recurrent Neural Networks

The basic idea behind RNN models is that each new element in the
sequence contributes some new information, which updates the current state of the model.

A fundamental mathematical construct in statistics and probability, which is often
used as a building block for modeling sequential patterns via
machine learning, is the Markov chain model. We tend to view our
data sequences as "chains", with each node in the chain dependent in some way on the
previous node, so that "history" is not erased but carried on.
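
As a minimal sketch of the chain idea, the snippet below samples from a small Markov chain in which each new state is drawn using only the state before it. The two weather states, their transition probabilities, and the helper name `sample_chain` are all hypothetical, chosen purely for illustration.

```python
import random

# Hypothetical two-state chain; states and probabilities are illustrative only.
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def sample_chain(start, num_steps, seed=0):
    """Sample a sequence in which each state depends only on the previous one."""
    rng = random.Random(seed)  # fixed seed so repeated calls give the same chain
    state = start
    chain = [state]
    for _ in range(num_steps):
        next_states = list(TRANSITIONS[state])
        weights = [TRANSITIONS[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights, k=1)[0]
        chain.append(state)
    return chain

chain = sample_chain("sunny", num_steps=10)
print(chain)
```

Each step looks only at the current state, yet the sequence as a whole still carries its history forward one link at a time.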


RNN models are based on this notion of chain structure. As the name
implies, recurrent neural nets apply some form of "loop." At some point in time t,
the network observes an input x(t) (for example, a word in a sentence) and updates its "state vector" from the
previous vector h(t-1) to h(t). When we process the next input (the next word), the processing depends on h(t) and thus on
the history of the sequence (the previous words we've seen affect our understanding of the current word).

The recurrent structure can simply be viewed as one long unrolled chain, with each node in the chain performing the same
kind of processing "step" based on the "message" it obtains from the output of the previous node.

# Vanilla RNN Implementation 

We introduce some powerful, fairly low-level tools that TensorFlow provides for working
with sequence data, which you can use to implement your own systems.

We begin by defining our basic model mathematically. This mainly consists of defining the
recurrence structure: the RNN update step.

The update step for our simple vanilla RNN is

$$h(t) = \tanh\left(W_x x(t) + W_h h(t-1) + b\right)$$

where $W_h$, $W_x$, and $b$ are the weight and bias variables, and $\tanh(\cdot)$ is the hyperbolic tangent function,
whose values lie in the range [-1, 1].
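
To make the update step concrete, here is a small NumPy sketch (not the TensorFlow implementation) that applies the same rule to every element of a toy sequence. The sizes, the random weights, and the helper name `rnn_step` are assumptions made for illustration.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One vanilla RNN update: h(t) = tanh(W_x x(t) + W_h h(t-1) + b)."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# Illustrative sizes and random weights (assumptions, not values from the text).
input_size, hidden_size, num_steps = 4, 3, 5
rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

sequence = rng.normal(size=(num_steps, input_size))  # x(1), ..., x(5)
h = np.zeros(hidden_size)                            # initial state h(0)
states = []
for x_t in sequence:
    # The same weights are reused at every step; only the state changes.
    h = rnn_step(x_t, h, W_x, W_h, b)
    states.append(h)
states = np.stack(states)

print(states.shape)  # (5, 3): one state vector per time step
```

Because each state is computed from the previous one, the final row of `states` depends on every input in the sequence, not just the last.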


### MNIST images as sequences

In the previous chapter we saw that the architecture of convolutional neural networks makes
use of the spatial structure of images. It is revealing to look at the structure of
images from a different angle, by trying to capture in some sense the "generative process" that
created each image. Intuitively, this all comes down to the notion that nearby areas in
images are somehow related, and trying to model this structure.

In our MNIST data, this just means that each 28×28 pixel image can be viewed as a sequence of length 28,
with each element in the sequence a vector of 28 pixels. The temporal dependencies in the RNN can then be imagined as a scanner
head, scanning the image from top to bottom (rows) or left to right (columns).
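
The two scanning directions can be sketched with NumPy as follows. The array below is a stand-in rather than real MNIST data, and the sketch assumes the 784 pixel values are stored row by row, as they appear to be in the flattened MNIST arrays.

```python
import numpy as np

time_steps, element_size = 28, 28

# Stand-in for one flattened image (784 values); real pixel data is not loaded here.
flat_image = np.arange(time_steps * element_size, dtype=np.float32)

# Scanning top to bottom: each sequence element is one row of 28 pixels.
rows_as_sequence = flat_image.reshape(time_steps, element_size)

# Scanning left to right: each sequence element is one column of 28 pixels.
columns_as_sequence = rows_as_sequence.T

print(rows_as_sequence.shape)      # (28, 28)
print(rows_as_sequence[0][:3])     # first three pixels of the top row
print(columns_as_sequence[0][:3])  # first three pixels of the leftmost column
```

Either way the RNN receives 28 vectors of 28 values; only the direction in which the "scanner head" moves differs.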

We start by loading data, defining some parameters, and creating placeholders for
our data:
    

In [None]:
# Note: this code targets the TensorFlow 1.x API (tf.placeholder, tensorflow.examples.tutorials)
import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data", one_hot=True)

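# Define some parameters
element_size = 28        # size of each vector in the sequence (28 pixels)
time_steps = 28          # number of elements in each sequence (28 per image)
num_classes = 10         # digits 0-9
batch_size = 128         # examples per training batch
hidden_layer_size = 128  # size of the RNN state vector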

# Where to save TensorBoard model summaries
LOG_DIR = "logs/RNN_with_summaries"

# Create placeholders for inputs, labels
_inputs = tf.placeholder(tf.float32, shape=[None, time_steps, element_size], name="inputs")

y = tf.placeholder(tf.float32, shape=[None, num_classes], name="labels")
