# Identifying Handwritten Digits
### Use Labelbox and TensorFlow to train a model to identify handwritten digits

**Optional**: [Label 120 MNIST images using Labelbox](https://medium.com/labelbox/identifying-hand-written-digits-beta-51d71f5c883f)

### Build Training and Testing Input Data Sets for TensorFlow

**Import required libraries**

In [1]:
import json
from PIL import Image
import requests
import numpy as np
import tensorflow as tf

### Set Up Training and Testing Data Sets from the Labelbox Export

**Read Labelbox JSON Export of Labeled Data**

In [2]:
# set this to the path of your Labelbox JSON export
labeled_data = 'data/lb-mnist-120.json'

# read the Labelbox JSON export into a list of labeled records
with open(labeled_data, 'r') as f:
    label_data = json.load(f)
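
The rest of the notebook reads only two keys from each export record: `'Label'` (the digit) and `'Labeled Data'` (the image URL). As a minimal sketch of checking an export before any images are downloaded, the block below parses a small inline sample. The sample records, the placeholder URLs, and the `validate` helper are all hypothetical and are not part of the Labelbox export or API.

```python
import json

# Hypothetical two-record export; the URLs are placeholders, not real Labelbox links.
sample_export = '''[
  {"Label": 7, "Labeled Data": "https://example.com/img-0.png"},
  {"Label": 2, "Labeled Data": "https://example.com/img-1.png"}
]'''

records = json.loads(sample_export)


def validate(records):
    """Return (image_url, digit) pairs, raising ValueError on a malformed record."""
    pairs = []
    for n, rec in enumerate(records):
        if 'Label' not in rec or 'Labeled Data' not in rec:
            raise ValueError('record {} is missing a required key'.format(n))
        digit = int(rec['Label'])
        if not 0 <= digit <= 9:
            raise ValueError('record {} has label {} outside 0-9'.format(n, digit))
        pairs.append((rec['Labeled Data'], digit))
    return pairs


pairs = validate(records)
```

Running a check like this first means a bad record fails fast instead of partway through the download loop.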

**Convert Labeled Data to Tensors**

In [3]:
# one-hot encode each 0-9 label as a length-10 row and store each 28x28 image as 784 flattened pixel values
cls = np.zeros(shape=len(label_data), dtype=float)
labels = np.zeros(shape=(len(label_data), 10), dtype=float)
images = np.zeros(shape=(len(label_data), 784), dtype=float)
for i, data in enumerate(label_data):
    # load labels and classes into array
    labels[i, data['Label']] = 1
    cls[i] = data['Label']
    
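    # download the image, convert it to 8-bit grayscale, and flatten it to 784 pixel values (0-255)
    response = requests.get(data['Labeled Data'], stream=True)
    response.raise_for_status()  # stop early if the download failed
    response.raw.decode_content = True
    img = Image.open(response.raw).convert('L')
    images[i] = np.array(img.getdata(), dtype=np.uint8)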
    
# hold out the first 20 examples for testing and train on the rest
test_images = images[:20]
train_images = images[20:]
test_labels = labels[:20]
train_labels = labels[20:]
test_cls = cls[:20]
train_cls = cls[20:]
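
The conversion above can also be written without an explicit loop. The sketch below shows a vectorized one-hot encoding plus an optional pixel-scaling step. Both helper names (`one_hot`, `scale_pixels`) are hypothetical. The scaling is not something the notebook does: its training images keep raw 0-255 values, while the MNIST loader used in the evaluation step returns, by default, pixels scaled to [0, 1], so scaling the Labelbox images the same way may be worth trying.

```python
import numpy as np


def one_hot(digits, num_classes=10):
    """Encode integer class labels as rows containing a single 1."""
    digits = np.asarray(digits, dtype=int)
    encoded = np.zeros((digits.size, num_classes), dtype=np.float32)
    encoded[np.arange(digits.size), digits] = 1.0
    return encoded


def scale_pixels(images):
    """Map 8-bit pixel values (0-255) to floats in [0, 1]."""
    return np.asarray(images, dtype=np.float32) / 255.0


example_labels = one_hot([3, 0, 9])
example_pixels = scale_pixels([[0, 128, 255]])
```

Applied to this notebook, `scale_pixels(images)` would replace `images` before the train/test split. The reported accuracy below was produced without it.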

### Set Up TensorFlow

In [4]:
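# input placeholder: any number of flattened 28x28 images
x = tf.placeholder(tf.float32, [None, 784])

# model parameters: weights and biases, initialized to zero
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# softmax regression: predicted class probabilities for each image
y = tf.nn.softmax(tf.matmul(x, W) + b)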
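
To make the graph above concrete, here is a rough NumPy equivalent of the same forward pass, `softmax(xW + b)`. It is an illustration only, not the TensorFlow implementation; the `softmax` and `predict` helpers and the random input batch are assumptions made for the example.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def predict(x, W, b):
    """Class probabilities for a batch of flattened images."""
    return softmax(x @ W + b)


# Same shapes and zero initialization as the notebook's W and b.
W0 = np.zeros((784, 10))
b0 = np.zeros(10)

# Stand-in batch of 4 random "images" (not real data).
batch = np.random.default_rng(0).random((4, 784))
probs = predict(batch, W0, b0)
```

With all-zero parameters every logit is 0, so each of the 10 classes gets probability 0.1 regardless of the input. The model only becomes informative after training updates `W` and `b`.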

### Set Up Model Training

In [5]:
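# placeholder for the true one-hot labels
y_ = tf.placeholder(tf.float32, [None, 10])

# mean cross-entropy between the true labels and the predicted probabilities
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

# one gradient descent update with a learning rate of 0.5
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)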
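
The loss and the update that `train_step` performs can be sketched by hand in NumPy. For softmax regression, the gradient of the mean cross-entropy with respect to the logits is `(probabilities - targets) / batch_size`, which gives the weight and bias gradients below. The helper names, the small `eps` guard inside the log, and the tiny 3-class demo data are assumptions for illustration; the notebook itself uses 10 classes and lets TensorFlow derive the gradients.

```python
import numpy as np


def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(probs, targets, eps=1e-12):
    """Mean over the batch of -sum(targets * log(probs)); eps guards against log(0)."""
    return float(np.mean(-np.sum(targets * np.log(probs + eps), axis=1)))


def gradient_step(x, targets, W, b, learning_rate=0.5):
    """Return updated (W, b) after one gradient descent step on the mean cross-entropy."""
    probs = softmax(x @ W + b)
    error = (probs - targets) / x.shape[0]
    return W - learning_rate * (x.T @ error), b - learning_rate * error.sum(axis=0)


# Tiny demo: 3 samples, 3 features, 3 classes (each sample is its own class).
demo_x = np.eye(3)
demo_targets = np.eye(3)
W, b = np.zeros((3, 3)), np.zeros(3)

loss_before = cross_entropy(softmax(demo_x @ W + b), demo_targets)
W, b = gradient_step(demo_x, demo_targets, W, b)
loss_after = cross_entropy(softmax(demo_x @ W + b), demo_targets)
```

Before the step the model predicts a uniform distribution, so `loss_before` is about `ln(3)`; after one step the probability of each correct class rises and the loss drops.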

### Initialize TensorFlow

In [6]:
init = tf.global_variables_initializer()

### Train the Model

In [7]:
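# start a session and initialize W and b
sess = tf.Session()
sess.run(init)

# run a single gradient descent step over the full training set
sess.run(train_step, feed_dict={x: train_images, y_: train_labels})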
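
The cell above applies exactly one update. Gradient descent is normally repeated for many steps, and the sketch below shows what such a loop could look like in plain NumPy. It is not the notebook's method: the `train` helper, the step count of 50, and the synthetic one-hot "images" are all assumptions chosen so the example runs without downloads.

```python
import numpy as np


def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def train(x, targets, num_steps=50, learning_rate=0.5):
    """Run full-batch gradient descent; return W, b, and the loss measured before each step."""
    W = np.zeros((x.shape[1], targets.shape[1]))
    b = np.zeros(targets.shape[1])
    losses = []
    for _ in range(num_steps):
        probs = softmax(x @ W + b)
        losses.append(float(np.mean(-np.sum(targets * np.log(probs), axis=1))))
        error = (probs - targets) / x.shape[0]
        W -= learning_rate * (x.T @ error)
        b -= learning_rate * error.sum(axis=0)
    return W, b, losses


# Synthetic stand-in data: ten one-hot "images", one per class.
toy_x = np.eye(10)
toy_targets = np.eye(10)
W, b, losses = train(toy_x, toy_targets)
predicted = np.argmax(toy_x @ W + b, axis=1)
```

In the TensorFlow version, the analogous change would be to call `sess.run(train_step, ...)` inside a `for` loop. Whether that improves accuracy on this 100-image training set is something to measure rather than assume.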

### Evaluate the Model Against the Official MNIST Test Data

In [8]:
# Download and read the official MNIST data (this loader ships with TensorFlow 1.x only)
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("data/", one_hot=True)

# A prediction is correct when the most probable class matches the true class
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))

# Accuracy is the fraction of correct predictions
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

test_accuracy = sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})
print('\n\nModel Accuracy: {}'.format(test_accuracy))

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/t10k-labels-idx1-ubyte.gz


Model Accuracy: 0.4742000102996826
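
The accuracy figure is the fraction of test images whose most probable predicted class matches the true class. As a small sketch of that calculation outside TensorFlow, the block below uses four made-up predictions over three classes; the numbers are illustrative only and are not taken from the model above.

```python
import numpy as np


def accuracy(probs, one_hot_targets):
    """Fraction of rows where the argmax of the prediction equals the argmax of the target."""
    predicted = np.argmax(probs, axis=1)
    actual = np.argmax(one_hot_targets, axis=1)
    return float(np.mean(predicted == actual))


# Made-up predictions for 4 samples over 3 classes.
example_probs = np.array([
    [0.7, 0.2, 0.1],   # predicts class 0, true class 0: correct
    [0.1, 0.8, 0.1],   # predicts class 1, true class 1: correct
    [0.3, 0.3, 0.4],   # predicts class 2, true class 0: wrong
    [0.6, 0.3, 0.1],   # predicts class 0, true class 1: wrong
])
example_targets = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
])

score = accuracy(example_probs, example_targets)  # 2 of 4 correct -> 0.5
```

For reference, guessing uniformly at random over 10 digits would score about 0.1, so the 0.47 reported above is well above chance for a model trained with a single update on 100 images.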
