# Deep MNIST for Experts

TensorFlow is a powerful library for doing large-scale numerical computation. One of the tasks at which it excels is implementing and training deep neural networks.

## Set up

Before we create our model, we will first load the MNIST dataset and start a TensorFlow session.

### Load MNIST data

In [7]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


### Start TensorFlow InteractiveSession

TensorFlow relies on a highly efficient C++ backend to do its computation. The connection to this backend is called a `session`. The common usage for TensorFlow programs is to first create a graph and then launch it in a session.

Here we instead use the convenient `InteractiveSession` class, which makes TensorFlow more flexible about how you structure your code. It allows you to interleave operations that build a `computation graph` with ones that run the graph. This is particularly convenient when working in interactive contexts like IPython. **If you are not using an `InteractiveSession`, then you should build the entire computation graph before starting a session and launching the graph.**

In [13]:
import tensorflow as tf
sess = tf.InteractiveSession()

### Computation Graph

To do efficient numerical computing in Python, we typically use libraries like `NumPy` that do expensive operations such as matrix multiplication outside Python, using highly efficient code implemented in another language. Unfortunately, there can still be a lot of overhead from switching back to Python after every operation. This overhead is especially bad if you want to run computations on GPUs or in a distributed manner, where there can be a high cost to transferring data.

TensorFlow also does its heavy lifting outside Python, but it takes things a step further to avoid this overhead. Instead of running a single expensive operation independently from Python, TensorFlow lets us describe a graph of interacting operations that run entirely outside Python. This approach is similar to that used in `Theano` or `Torch`.

The role of the Python code is therefore to build this external computation graph, and to dictate which parts of the computation graph should be run. 
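To make the build-then-run split concrete, here is a toy sketch in plain Python. It is not how TensorFlow is implemented: the `Node` class and the `const`, `add`, `mul`, and `run` helpers are hypothetical names invented for this illustration, and everything here still executes inside Python. The point is only the shape of the workflow: first describe the operations, then ask for a specific part of the graph to be evaluated.

```python
class Node:
    """A deferred operation: records what to compute, not the result."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "const", "add", or "mul"
        self.inputs = inputs  # upstream Node objects
        self.value = value    # only used by "const" nodes


def const(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", inputs=(a, b))


def mul(a, b):
    return Node("mul", inputs=(a, b))


def run(node):
    """Evaluate only the part of the graph that `node` depends on."""
    if node.op == "const":
        return node.value
    left, right = (run(n) for n in node.inputs)
    if node.op == "add":
        return left + right
    if node.op == "mul":
        return left * right
    raise ValueError("unknown op: " + node.op)


# Phase 1: build the graph. Nothing is computed yet.
a = const(2.0)
b = const(3.0)
total = add(a, b)
product = mul(total, b)

# Phase 2: choose which part of the graph to run.
result_total = run(total)      # evaluates only a, b, and the add
result_product = run(product)  # evaluates the whole graph
```

In this sketch, building `product` costs almost nothing, and the work happens only when `run` is called. A real backend can use the same separation to execute the whole requested subgraph outside Python in one go.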


## Build a Softmax Regression Model

In this section we will build a softmax regression model with a single linear layer. In the next section, we will extend this to the case of softmax regression with a multilayer convolutional network.
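As a rough picture of what "softmax regression with a single linear layer" computes, the sketch below evaluates `softmax(x W + b)` for one input in plain Python. It is an illustration under assumptions, not the TensorFlow model itself: the function names are hypothetical, and the toy sizes (2 features, 3 classes) and zero-valued parameters are chosen only to keep the example small, whereas MNIST images have 784 pixels and 10 classes.

```python
import math


def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_regression(x, W, b):
    """Single linear layer followed by softmax: softmax(x W + b).

    x: list of n_features input values
    W: n_features x n_classes weights, as a list of rows
    b: list of n_classes biases
    """
    n_classes = len(b)
    logits = [
        sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        for j in range(n_classes)
    ]
    return softmax(logits)


# Toy values for illustration only.
x = [1.0, 2.0]
W = [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
b = [0.0, 0.0, 0.0]
probs = softmax_regression(x, W, b)  # all-zero parameters give equal probabilities
```

With all-zero parameters every class receives the same score, so the output is uniform. Training adjusts `W` and `b` so that the correct class receives the highest probability.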

### Placeholders

We start building the computation graph by creating nodes for the input images and target output classes.

