# Table of Contents
 <p><div class="lev1"><a href="#Necessary-installs"><span class="toc-item-num">1&nbsp;&nbsp;</span>Necessary installs</a></div><div class="lev1"><a href="#Context"><span class="toc-item-num">2&nbsp;&nbsp;</span>Context</a></div><div class="lev1"><a href="#Let's-code-!"><span class="toc-item-num">3&nbsp;&nbsp;</span>Let's code !</a></div>

# Necessary installs

In [1]:
import os, sys
print(sys.version)

3.5.2 |Continuum Analytics, Inc.| (default, Jul  5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]


In [3]:
try:
    import tensorflow
    print("tensorflow available in python 3.5 environment !")
except ImportError:
    # Only a failed import means the library is missing.
    print("You need python 3.5 and to install the tensorflow library to continue !")

tensorflow available in python 3.5 environment !


If you got the error, you can set things up with Conda as follows:

- Create a Python 3.5 environment and a Jupyter kernel with conda:
```
conda create -n py35 python=3.5
source activate py35
conda install notebook ipykernel
ipython kernel install --user --name=python3.5
```

- Install TensorFlow with pip:
```
pip install tensorflow
```

# Context

This notebook follows the [TensorFlow tutorial](https://www.tensorflow.org/versions/master/tutorials/mnist/beginners/index.html#mnist-for-ml-beginners), which uses the MNIST dataset to recognize handwritten digits automatically.

The MNIST data is hosted on [Yann LeCun's website](http://yann.lecun.com/exdb/mnist/).

We can download it easily (or read it from disk if it has already been downloaded) as follows:

In [6]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


The MNIST data is split into three parts:

- A training set (**mnist.train**, 55,000 examples), so that our net architecture can learn from this data,
- A validation set (**mnist.validation**, 5,000 examples) to tune its hyperparameters,
- A test set (**mnist.test**, 10,000 examples) to check how well it predicts digit labels from images it has never seen.

**mnist.train.images** is a tensor (an n-dimensional array) with a shape of [55000, 784]. The first dimension is an index into the list of images and the second dimension is the index for each pixel in each image. Each entry in the tensor is a pixel intensity between 0 and 1, for a particular pixel in a particular image.
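
To make these shapes concrete, here is a minimal pure-Python sketch (not the TensorFlow loader) of how one image and its label could be laid out. It assumes the standard 28x28 MNIST image size, which is where 784 comes from, and the one-hot label format requested by `one_hot=True`. The helper names `flatten` and `one_hot` are made up for illustration.

```python
def flatten(image):
    """Flatten a 2-D image (list of rows) into one row-major list of pixels."""
    return [pixel for row in image for pixel in row]


def one_hot(label, num_classes=10):
    """Encode a digit label as a vector with a single 1.0 at index `label`."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


# A blank 28x28 image with one bright pixel at row 1, column 2 (illustrative data).
image = [[0.0] * 28 for _ in range(28)]
image[1][2] = 1.0

row = flatten(image)
print(len(row))        # 784 pixels per image
print(row.index(1.0))  # 30, i.e. 1 * 28 + 2
print(one_hot(3))      # 1.0 at index 3, 0.0 elsewhere
```

Stacking 55,000 such rows would give the [55000, 784] layout described above.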

# Let's code !

In [17]:
import tensorflow as tf

TensorFlow lets us describe interacting operations by manipulating symbolic variables, like this:

In [18]:
x = tf.placeholder(tf.float32, [None, 784])

**x** isn't a specific value. It's a **placeholder**, a value that we'll input when we ask TensorFlow to run a computation. We want to be able to input any number of MNIST images, each flattened into a 784-dimensional vector. We represent this as a 2-D tensor of floating-point numbers, with a shape [None, 784]. (Here **None** means that a dimension can be of any length.)

We also need the **weights** and **biases** for our model. We could imagine treating these like additional inputs, but TensorFlow has an even better way to handle it: **Variable**. A Variable is a modifiable tensor that lives in TensorFlow's graph of interacting operations. It can be used and even modified by the computation. For machine learning applications, one generally has the model parameters be Variables.

In [19]:
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

W has a shape of [784, 10] because we want to multiply the 784-dimensional image vectors by it to produce 10-dimensional vectors of evidence for the different classes. b has a shape of [10] so we can add it to the output.
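
The shape arithmetic can be checked without TensorFlow. The sketch below uses plain Python lists and two hypothetical helpers, `matmul` and `add_bias`, to show that a batch of any size with 784 columns turns into rows of 10 evidence values. It only illustrates the shapes and is not how TensorFlow performs the computation.

```python
def matmul(left, right):
    """Multiply an [n, k] matrix by a [k, m] matrix, both given as lists of rows."""
    columns = list(zip(*right))
    return [[sum(a * c for a, c in zip(row, col)) for col in columns] for row in left]


def add_bias(rows, bias):
    """Add the same bias vector to every row, as `+ b` does through broadcasting."""
    return [[value + offset for value, offset in zip(row, bias)] for row in rows]


n_pixels, n_classes = 784, 10
W = [[0.0] * n_classes for _ in range(n_pixels)]  # zeros, like the cell above
b = [0.0] * n_classes

for batch_size in (1, 3):  # any number of images works
    x = [[0.5] * n_pixels for _ in range(batch_size)]
    evidence = add_bias(matmul(x, W), b)
    print(len(evidence), len(evidence[0]))  # batch_size, then 10
```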

Our model can now be implemented in only one line.

In [20]:
y = tf.nn.softmax(tf.matmul(x, W) + b)
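
For readers who have not met softmax before: it exponentiates each evidence value and divides by the sum, so every row becomes a probability distribution. The function below is a minimal pure-Python stand-in for a single row, written for illustration only. It is not TensorFlow's implementation.

```python
import math


def softmax(evidence):
    """Turn a list of evidence scores into probabilities that sum to 1."""
    exps = [math.exp(e) for e in evidence]
    total = sum(exps)
    return [e / total for e in exps]


# With W and b initialised to zeros, every class gets the same evidence,
# so the untrained model predicts a uniform distribution over the 10 digits.
uniform = softmax([0.0] * 10)
print(uniform[0])  # 0.1

# Larger evidence gets a larger share of the probability mass.
probs = softmax([1.0, 2.0, 3.0])
print(probs)
```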

Now, this model needs to be trained.

To train our model, we need a way to measure how good or bad it is. One way to do this is with the **cross-entropy** function, which describes how inefficient our predictions are for describing the truth.

To compute it, we first need a placeholder for the correct labels.

In [21]:
y_ = tf.placeholder(tf.float32, [None, 10])

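Then we can implement the cross-entropy function, $H_{y'}(y) = -\sum_i y'_i \log(y_i)$, where $y$ is the predicted probability distribution and $y'$ is the true one-hot distribution (**y_** in the code).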

In [22]:
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

First, tf.log computes the logarithm of each element of y. Next, we multiply each element of y_ with the corresponding element of tf.log(y). Then tf.reduce_sum adds the elements in the second dimension of y, due to the reduction_indices=[1] parameter. Finally, tf.reduce_mean computes the mean over all the examples in the batch.
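
The same four steps can be followed by hand on a tiny batch. This is a pure-Python sketch with made-up numbers (two examples, three classes). The function name `cross_entropy` is chosen for illustration, and the comments map each line to the TensorFlow call it mimics.

```python
import math


def cross_entropy(predictions, labels):
    """Mean over the batch of -sum(label * log(prediction)) for each example."""
    per_example = []
    for y_row, label_row in zip(predictions, labels):
        logs = [math.log(p) for p in y_row]                   # tf.log(y)
        products = [t * lg for t, lg in zip(label_row, logs)]  # y_ * tf.log(y)
        per_example.append(-sum(products))                    # -tf.reduce_sum over classes
    return sum(per_example) / len(per_example)                # tf.reduce_mean over the batch


# Illustrative predictions and one-hot labels.
y = [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1]]
y_true = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]

loss = cross_entropy(y, y_true)
print(loss)  # equals (-log(0.7) - log(0.8)) / 2
```

Because the labels are one-hot, only the log-probability assigned to the correct class contributes to each example's loss.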

Note that the tutorial's source code doesn't use this formulation, because it is numerically unstable. Instead, it applies tf.nn.softmax_cross_entropy_with_logits to the unnormalized logits (that is, to tf.matmul(x, W) + b), because this more numerically stable function computes the softmax activation internally. In your code, consider using tf.nn.softmax_cross_entropy_with_logits instead.
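
To see why the direct formulation is fragile, the sketch below compares it with a version that works on the logits using the log-sum-exp trick. It is pure Python with hypothetical helper names (`naive_loss`, `stable_loss`) and illustrates the general idea only. It does not describe how TensorFlow implements the function internally.

```python
import math


def naive_loss(logits, labels):
    """Softmax followed by -sum(label * log(prob)); breaks on extreme logits."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(t * math.log(p) for t, p in zip(labels, probs))


def stable_loss(logits, labels):
    """Same quantity computed directly from the logits with log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_sum_exp) for t, z in zip(labels, logits))


# The true class has a very low logit, so its probability underflows to 0.0.
logits = [-1000.0, 0.0]
labels = [1.0, 0.0]

try:
    naive_loss(logits, labels)
except ValueError as err:  # math.log(0.0) is undefined
    print("naive version failed:", err)

print(stable_loss(logits, labels))  # 1000.0
```

On moderate logits the two functions agree. They differ only when the exponentials underflow or overflow.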