# Solving MNIST with different python packages

There are many different machine learning packages out there. We want to have a look at the most popular ones and see how we can use them to solve a well known problem. MNIST has become the "hello world" of machine learning. It is a dataset containing images of handwritten digits and labels and the goal is to train a model that can classify the images correctly. In this notebook we use scikit-learn and Tensor Flow and we are especially interested in the necessary steps to get from 'I have data' to 'I have a trained model'.

## Scikit-learn

Scikit-learn is a python packge built on NumPy and SciPy. It contains easy to use implementations of many machine learning algorithms and is one of the most used machine learning packages in python.
To solve our problem we use a support vector machine implementation from scikit-learn and train it on MNIST data.
The following MNIST example is an abbreviated version taken from Scikit-learns documentation.
For the full example look here: http://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html      

In [6]:
from sklearn import datasets, svm, metrics, model_selection

# Prepare data
data, target = datasets.load_digits(return_X_y=True)
train_data, test_data, train_target, expected = model_selection.train_test_split(data, target, random_state=1)

# create and train classifier
classifier = svm.SVC(gamma=0.01)
classifier.fit(train_data, train_target)

predicted = classifier.predict(test_data)
print("Accuracy score", metrics.accuracy_score(expected, predicted))

Accuracy score 0.735555555556


## Auto-sklearn

When using scikit-learn (and machine learning in general) we often have to make decisions that influence how good our models can be. We have to decide what algorithm to use, what model to use or what model parameters to use. These problems have lead to researchers looking into how to automate machine learning itself. Think of it as 'We want to learn how to learn". There is an interesting package called auto-sklearn which is build on top of scikit-learn. In our first example we decided to use a support vector machine with an input parameter. When we run the same example again, but with auto-sklearn the auto classifier will choose an algorithm and input parameters for us. The example is from https://automl.github.io/auto-sklearn/stable/.

In [14]:
import autosklearn.classification

automl = autosklearn.classification.AutoSklearnClassifier()
automl.fit(train_data, train_target)
predicted = automl.predict(test_data)
print("Accuracy score", metrics.accuracy_score(expected, predicted))

Accuracy score 0.988888888889


## Tensor Flow

Last but not least, we want to have a look at Tensor Flow. Tensor Flow is an open source software that was originally developed by Google. Its popularity has many reasons one of which is, that it is suited for both research as well as production environments. We wont go into detail since it is a very large and powerfull software library. Instead we define a simple model for MNIST training and look at the necessary steps and to train it. 
In contrast to scikit-learn, where we simply wrote "model.fit" we now have to declare our model and train it explicitly using gradient descent in a for loop. 
The code we use is again an abbreviated version from a tutorial we found on their documentation. For the full tutorial look here: http://www.tensorflow.org/get_started/mnist/beginners

In [3]:
from tensorflow.examples.tutorials.mnist import input_data

data = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [8]:
import tensorflow as tf

# init variables
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# implement model
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))

# Train model
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

for i in range(1000):
    batch_xs, batch_ys = data.train.next_batch(100)
    sess.run(tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy), feed_dict={x: batch_xs, y_: batch_ys})
    
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print("Accuracy score", sess.run(accuracy, feed_dict={x: data.test.images, y_: data.test.labels}))

Accuracy score 0.9065
