# TensorFlow

In [33]:
import matplotlib.pyplot as plt
%matplotlib inline

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


The MNIST data is split into three parts: **55,000 data points of training data (mnist.train), 10,000 points of test data (mnist.test), and 5,000 points of validation data (mnist.validation)**. This split is very important: it's essential in machine learning that we have separate data which we don't learn from so that we can make sure that what we've learned actually generalizes!
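As a rough illustration of what a disjoint three-way split looks like, here is a minimal sketch in plain Python. Integer indices stand in for the images, and the helper name `split_indices` is hypothetical. This is not how `input_data.read_data_sets` builds its splits; it only shows that the three parts never share an example.

```python
import random

def split_indices(n_total, n_validation, n_test, seed=0):
    """Shuffle indices 0..n_total-1 and cut them into three disjoint parts."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    test = indices[:n_test]
    validation = indices[n_test:n_test + n_validation]
    train = indices[n_test + n_validation:]
    return train, validation, test

# 55,000 + 5,000 + 10,000 = 70,000 examples in total
train_idx, validation_idx, test_idx = split_indices(70000, 5000, 10000)
print(len(train_idx), len(validation_idx), len(test_idx))  # 55000 5000 10000
```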

**Training Set:** this data set is used to adjust the weights of the neural network.

**Validation Set:** this data set is used to detect overfitting. You don't adjust the weights of the network with it; you only verify that an increase in accuracy on the training data set is matched by an increase in accuracy on data the network has not trained on. If accuracy on the training data set increases while accuracy on the validation data set stays the same or decreases, the network is overfitting and you should stop training.

**Testing Set:** this data set is used only for testing the final solution in order to confirm the actual predictive power of the network.
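The stopping rule described for the validation set can be sketched as follows. The function name `early_stopping_epoch`, the `patience` parameter, and the accuracy values are all made up for illustration; the notebook itself trains for a fixed number of steps and does not apply this rule.

```python
def early_stopping_epoch(validation_accuracy, patience=2):
    """Return the index of the epoch at which training would stop.

    Training stops once validation accuracy has failed to beat its best
    value for `patience` consecutive epochs.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, acc in enumerate(validation_accuracy):
        if acc > best:
            best = acc
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # Never stalled long enough: train through the last epoch
    return len(validation_accuracy) - 1

# Hypothetical validation accuracies, one per epoch
history = [0.80, 0.86, 0.89, 0.90, 0.90, 0.89, 0.88]
stop_epoch = early_stopping_epoch(history, patience=2)
print(stop_epoch)  # 5
```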

Each image is 28 pixels by 28 pixels. We can interpret this as a big array of numbers:

![28](https://www.tensorflow.org/images/MNIST-Matrix.png)

We can flatten this array into a vector of 28x28 = 784 numbers. It doesn't matter how we flatten the array, as long as we're consistent between images. From this perspective, the MNIST images are just a bunch of points in a 784-dimensional vector space, with a [very rich structure](https://colah.github.io/posts/2014-10-Visualizing-MNIST/) (warning: computationally intensive visualizations).
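A minimal sketch of one consistent flattening order (row-major, i.e. row after row), using nested Python lists rather than the arrays TensorFlow works with. The image contents and the helper name `flatten_row_major` are made up for illustration.

```python
ROWS, COLS = 28, 28

def flatten_row_major(image):
    """Flatten a 2-D list of pixel rows into one list, row after row."""
    return [pixel for row in image for pixel in row]

# Made-up image: the pixel at (r, c) stores r * COLS + c so positions are easy to check
image = [[r * COLS + c for c in range(COLS)] for r in range(ROWS)]
vector = flatten_row_major(image)
print(len(vector))  # 784

# With row-major order, pixel (r, c) always lands at index r * COLS + c
r, c = 3, 5
print(vector[r * COLS + c] == image[r][c])  # True
```

Any other fixed order (for example column after column) would work equally well, provided every image is flattened the same way.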

In [34]:
# Input placeholder: None leaves the batch size open; each row is one flattened 784-pixel image.
x = tf.placeholder(tf.float32, [None, 784])

In [35]:
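# Model parameters, initialized to zeros: one weight per (pixel, class) pair and one bias per class.
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))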

Weights
![weights](https://www.tensorflow.org/images/softmax-weights.png)

# x holds one image per row, so the evidence is computed as xW + b rather than Wx + b.
y = tf.nn.softmax(tf.matmul(x, W) + b)

![Graph](https://www.tensorflow.org/images/softmax-regression-scalargraph.png)

# y = softmax(Wx + b)
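To make the formula concrete, here is a small plain-Python sketch of softmax applied to the evidence for a single input. It follows the row-vector form `x W + b` used in the code; the helper names and the toy numbers are assumptions for illustration, not TensorFlow's implementation.

```python
import math

def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def evidence(x, W, b):
    """Compute x W + b for one row vector x, with W indexed as W[feature][class]."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

# Toy model: 3 input features, 2 classes, arbitrary values
x_row = [1.0, 2.0, 3.0]
W_toy = [[0.1, -0.1], [0.2, 0.0], [0.0, 0.3]]
b_toy = [0.0, 0.5]
probs = softmax(evidence(x_row, W_toy, b_toy))
print(probs, sum(probs))

# With all-zero weights and biases, every one of 10 classes is equally likely
uniform = softmax([0.0] * 10)
```

The last line mirrors the zero initialization of `W` and `b`: before any training, the model assigns each digit the same probability.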

In [5]:
# Placeholder for the true labels, given as one-hot vectors over the 10 digit classes.
y_ = tf.placeholder(tf.float32, [None, 10])
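Because the data was loaded with `one_hot=True`, each label fed into `y_` is a 10-element vector with a single 1. A minimal sketch of that encoding, where the helper name `one_hot` is hypothetical:

```python
def one_hot(label, n_classes=10):
    """Encode an integer class label as a vector with a single 1.0."""
    vec = [0.0] * n_classes
    vec[label] = 1.0
    return vec

encoded = one_hot(3)
print(encoded)  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```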

In [6]:
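# Batch mean of -sum(y_ * log(y)); tf.log(y) is unstable when y hits 0, and tf.nn.softmax_cross_entropy_with_logits on the raw logits is the stable alternative.
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))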
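The quantity computed here can be worked through by hand on a tiny batch. The sketch below is plain Python with made-up labels and predictions; the `eps` clamp is an assumption added to guard against `log(0)` and is not something `tf.log` does.

```python
import math

def cross_entropy_one(y_true, y_pred, eps=1e-12):
    """-sum(y_true * log(y_pred)) for one example; eps guards against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def mean_cross_entropy(batch_true, batch_pred):
    """Average the per-example cross-entropy over the batch."""
    losses = [cross_entropy_one(t, p) for t, p in zip(batch_true, batch_pred)]
    return sum(losses) / len(losses)

# Two examples, three classes, one-hot labels
labels = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
preds = [[0.1, 0.8, 0.1], [0.3, 0.4, 0.3]]
loss = mean_cross_entropy(labels, preds)
print(loss)
```

With one-hot labels only the probability assigned to the true class contributes, so the loss here is the mean of `-log(0.8)` and `-log(0.3)`: the confident correct prediction costs little, the hesitant one costs more.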

In [7]:
# Minimize cross_entropy by gradient descent with a learning rate of 0.5.
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
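The update rule behind gradient descent can be sketched on a one-dimensional toy problem. TensorFlow derives the gradients of `cross_entropy` automatically; in this sketch the gradient is written by hand, and the function `gradient_descent`, the objective, and the learning rate of 0.1 are all assumptions chosen for illustration.

```python
def gradient_descent(grad, w0, learning_rate, steps):
    """Repeatedly move w a small step against the gradient."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0),
                           w0=0.0, learning_rate=0.1, steps=100)
print(w_final)
```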

In [8]:
sess = tf.InteractiveSession()

In [9]:
tf.global_variables_initializer().run()

In [10]:
# Run 1000 training steps, each on a batch of 100 examples drawn from the training set.
for _ in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
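The loop above trains on small batches instead of the full training set at every step. A simplified stand-in for that sampling, working on indices only, might look like the sketch below. The generator `iterate_minibatches` is hypothetical and is not the implementation of `mnist.train.next_batch`.

```python
import random

def iterate_minibatches(n_examples, batch_size, n_steps, seed=0):
    """Yield n_steps lists of batch_size distinct example indices."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        yield rng.sample(range(n_examples), batch_size)

# Same sizes as the training loop: 1000 steps of 100 examples out of 55,000
batches = list(iterate_minibatches(55000, 100, 1000))
print(len(batches), len(batches[0]))  # 1000 100
```

Each step touches only 100 of the 55,000 training examples, which keeps a step cheap while still giving a usable estimate of the gradient.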

In [11]:
# One boolean per example: does the most probable predicted class match the true class?
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

In [12]:
# Cast the booleans to 0.0/1.0 and average them to get the fraction classified correctly.
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
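The two steps above (compare argmax positions, then average the matches) can be traced on a tiny made-up batch in plain Python. The helper names `argmax_index` and `batch_accuracy` are hypothetical.

```python
def argmax_index(values):
    """Index of the largest entry, like tf.argmax along one row."""
    return max(range(len(values)), key=lambda i: values[i])

def batch_accuracy(predictions, labels):
    """Fraction of rows whose predicted class matches the one-hot label."""
    correct = [argmax_index(p) == argmax_index(t)
               for p, t in zip(predictions, labels)]
    return sum(correct) / len(correct)

# Four examples, three classes; the third prediction is wrong
toy_preds = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.2, 0.6], [0.5, 0.4, 0.1]]
toy_labels = [[0, 1, 0], [1, 0, 0], [0, 1, 0], [1, 0, 0]]
acc = batch_accuracy(toy_preds, toy_labels)
print(acc)  # 0.75
```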

In [14]:
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

0.9151
