# Chicken, Dog or Muffin?
In this notebook, we use a convolutional neural network (CNN) in TensorFlow to classify the chicken-dog-muffin dataset.

## Part 1: Step by step

### Step 1: Load the Modules

In [13]:
import os
import random
import math
import csv
import tensorflow as tf
import numpy as np
import glob
# we use The Python Imaging Library (PIL) to preprocess image data
# you can install it by: pip install Pillow
from PIL import Image

### Step 2: Choose the Parameters
We will use a three-layer convolutional neural network (two convolutional layers followed by one fully connected layer) to classify the images. In this part we choose the image size, the batch size and the number of features in each layer. The default parameters chosen here come from the official TensorFlow [tutorial](https://www.tensorflow.org/get_started/mnist/pros).

In [14]:
img_size = 64
batch_size = 50
# number of features on each layer
fea_conv1 = 32
fea_conv2 = 64
fea_fc1 = 1024

### Step 3: Read and Transform the Data

Our dataset contains two parts: a training set and a test set. Each set contains 3000 images along with their labels, 1000 of each class. To run the following code, you need to download the [dataset](https://www.dropbox.com/s/9elzjdtdsp3emy8/data.zip?dl=0), unzip it, and put it in the same folder as this notebook.

This part may take up to 2 minutes. Is there a way to load all 6000 images in parallel?
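
One possible answer to the question above is a thread pool from the standard library. The sketch below is only an illustration: `load_in_parallel` and `fake_load` are hypothetical names that do not appear in this notebook, and `fake_load` stands in for the real per-image work so that the example runs without any image files. Whether threads actually shorten the loading time depends on the machine and on how much of the time is spent reading from disk.

```python
from concurrent.futures import ThreadPoolExecutor


def load_in_parallel(paths, load_one, max_workers=8):
    """Apply load_one to every path with a thread pool.

    The results are returned in the same order as paths.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_one, paths))


def fake_load(path):
    # Stand-in for "open, convert, resize, turn into an array";
    # it only returns the length of the file name.
    return len(path)


results = load_in_parallel(["a.jpg", "bb.jpg", "ccc.jpg"], fake_load)
print(results)
```

In this notebook, `load_one` would wrap the three lines inside the image loop (open and convert, resize, convert to an array), and the list returned by `load_in_parallel` would be passed to `np.array` as before.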

After preprocessing, we will get two numpy arrays of dimension [3000, img_size, img_size, 3] for the images, `train_img` and `test_img`, and two numpy arrays of dimension [3000, 3] for the labels, `train_label` and `test_label`.

In [15]:
# read images and transform them into numpy array
train_path = glob.glob("./data/training_data/images/*.jpg")
test_path = glob.glob("./data/test_data/images/*.jpg")

train_img = []
test_img = []

for file_name in train_path:
    pil_im = Image.open(file_name).convert('RGB')
    pil_resize = pil_im.resize((img_size,img_size))
    pil_array = np.array(pil_resize.getdata()).reshape(img_size, img_size, 3)
    train_img.append(pil_array)

for file_name in test_path:
    pil_im = Image.open(file_name).convert('RGB')
    pil_resize = pil_im.resize((img_size,img_size))
    pil_array = np.array(pil_resize.getdata()).reshape(img_size, img_size, 3)
    test_img.append(pil_array)

train_img = np.array(train_img)
test_img = np.array(test_img)
print('Images load is done!')

# read labels and transform them into 3000x3 numpy array
train_label = np.zeros((3000, 3))
test_label = np.zeros((3000, 3))
with open('./data/training_data/label_train.csv') as csvfile:
    label_train = csv.reader(csvfile, delimiter=',')
    ## delete row names
    next(label_train)
    train_list = list(label_train)

with open('./data/test_data/label_test.csv') as csvfile:
    label_test = csv.reader(csvfile, delimiter=',')
    next(label_test)
    test_list = list(label_test)
    
for i in range(3000):
    j_1 = int(train_list[i][1])
    j_2 = int(test_list[i][1])
    train_label[i, j_1] = 1
    test_label[i, j_2] = 1
print('Label loading and transform is done!')

Images load is done!
Label loading and transform is done!
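
The label loop above turns each integer class index into a row with a single 1 (a one-hot row). As a minimal sketch of the same transformation on a toy input, assuming the class indices are integers in 0..2 as the loop implies, the helper below does it in one vectorized step. The name `one_hot` is hypothetical and not part of the notebook.

```python
import numpy as np


def one_hot(class_ids, num_classes):
    """Return an array of shape [len(class_ids), num_classes] with one 1 per row."""
    class_ids = np.asarray(class_ids, dtype=int)
    encoded = np.zeros((class_ids.size, num_classes))
    # Row i gets a 1 in the column given by class_ids[i].
    encoded[np.arange(class_ids.size), class_ids] = 1
    return encoded


labels = one_hot([0, 2, 1], num_classes=3)
print(labels)
```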


### Step 4: Weight Initialization

To create this model, we're going to need to create a lot of weights and biases. One should generally initialize weights with a small amount of noise for symmetry breaking, and to prevent 0 gradients. Since we're using `ReLU` neurons, it is also good practice to initialize them with a slightly positive initial bias to avoid "dead neurons". Instead of doing this repeatedly while we build the model, let's create two handy functions to do it for us.

In [16]:
## values whose magnitude is larger than 2 standard
## deviations are dropped and re-picked
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

## initialize them with a slightly positive initial bias 
## to avoid "dead neurons"
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)
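
To make the "dropped and re-picked" behaviour concrete, here is a rough NumPy sketch of a truncated normal sampler. It is not TensorFlow's implementation; `truncated_normal_np` is a hypothetical name, and the simple resampling loop is just one way to get values that all lie within two standard deviations of zero.

```python
import numpy as np


def truncated_normal_np(shape, stddev, rng):
    """Sample N(0, stddev^2) and redraw every value farther than 2*stddev from 0."""
    values = rng.normal(0.0, stddev, size=shape)
    out_of_range = np.abs(values) > 2 * stddev
    while out_of_range.any():
        # Redraw only the entries that fell outside the allowed range.
        values[out_of_range] = rng.normal(0.0, stddev, size=out_of_range.sum())
        out_of_range = np.abs(values) > 2 * stddev
    return values


rng = np.random.default_rng(0)
w = truncated_normal_np((5, 5, 3, 32), stddev=0.1, rng=rng)
print(w.shape, np.abs(w).max())
```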


### Step 5: Convolution and Pooling

TensorFlow also gives us a lot of flexibility in convolution and pooling operations. Our convolutions use a stride of one and are zero padded so that the output is the same size as the input. Our pooling is plain old max pooling over 2x2 blocks. To keep our code cleaner, let's also abstract those operations into functions.

In [17]:
# convolution and pooling
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1],
                       padding='SAME')
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], 
                          strides=[1, 2, 2, 1], padding='SAME')
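
With `padding='SAME'`, the spatial output size depends only on the input size and the stride: it is the input size divided by the stride, rounded up. The short sketch below applies that rule to the sizes used in this notebook; `same_output_size` is a hypothetical helper name.

```python
import math


def same_output_size(input_size, stride):
    """Spatial output size of a 'SAME'-padded convolution or pooling op."""
    return math.ceil(input_size / stride)


after_conv = same_output_size(64, 1)            # convolution, stride 1
after_pool1 = same_output_size(after_conv, 2)   # first 2x2 max pooling, stride 2
after_pool2 = same_output_size(after_pool1, 2)  # second 2x2 max pooling, stride 2
print(after_conv, after_pool1, after_pool2)
```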


### Step 6: First Convolutional Layer

We can now implement our first layer. It will consist of convolution, followed by max pooling. The convolution will compute `fea_conv1` features for each 5x5 patch. Its weight tensor will have a shape of [5, 5, 3, fea_conv1]. The first two dimensions are the patch size, the next is the number of input channels (3 for RGB images), and the last is the number of output channels. We will also have a bias vector with a component for each output channel.

In [18]:
# first convolutional layer
W_conv1 = weight_variable([5, 5, 3, fea_conv1])
b_conv1 = bias_variable([fea_conv1])



To apply the layer, the input must be a 4D tensor of shape [batch, height, width, channels], where the final dimension is the number of color channels. The placeholder `x` defined below already has this shape, so the `tf.reshape` call leaves it unchanged and only makes the expected layout explicit.

In [19]:
x = tf.placeholder(tf.float32, shape=[None, img_size, img_size, 3])
y_ = tf.placeholder(tf.float32, shape=[None, 3])

x_image = tf.reshape(x, [-1, img_size, img_size, 3])

We then convolve `x_image` with the weight tensor, add the bias, apply the `ReLU` function, and finally max pool. The `max_pool_2x2` function reduces the image size to img_size/2 x img_size/2.

In [20]:
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
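
To see what 2x2 max pooling does to the values, here is a small NumPy sketch on a single image. It assumes an input of shape [height, width, channels] with even height and width; `max_pool_2x2_np` is a hypothetical name and is not how TensorFlow implements the op.

```python
import numpy as np


def max_pool_2x2_np(a):
    """2x2 max pooling with stride 2 on an array of shape [h, w, c] (h and w even)."""
    h, w, c = a.shape
    # Split each spatial axis into (blocks, 2) and take the maximum inside each block.
    return a.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))


image = np.arange(16).reshape(4, 4, 1)
pooled = max_pool_2x2_np(image)
print(pooled[:, :, 0])
```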

### Step 7: Second Convolutional Layer

To build a deeper network, we stack several layers of this type. The second layer will have `fea_conv2` features for each 5x5 patch. More convolutional layers can be added afterwards in the same way.

In [21]:
# second convolutional layer
W_conv2 = weight_variable([5, 5, fea_conv1, fea_conv2])
b_conv2 = bias_variable([fea_conv2])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

### Step 8: Densely Connected Layer

Now that the image size has been reduced to img_size/4 x img_size/4, we add a fully connected layer with `fea_fc1` neurons to allow processing on the entire image. We reshape the tensor from the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply a ReLU.

In [22]:
# densely connected layer
size_conv2 = int(img_size/4)
W_fc1 = weight_variable([size_conv2 * size_conv2 * fea_conv2, fea_fc1])
b_fc1 = bias_variable([fea_fc1])

h_pool2_flat = tf.reshape(h_pool2, [-1, size_conv2 * size_conv2 * fea_conv2])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
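
As a sanity check on the shapes, the following plain-Python sketch walks through the layer sizes with the default parameters from Step 2 and counts the weights and biases of each layer. It also includes the 3-class readout layer added in Step 10. The variable names `flat` and `params` are chosen here for illustration only.

```python
img_size, channels = 64, 3
fea_conv1, fea_conv2, fea_fc1, n_classes = 32, 64, 1024, 3
patch = 5

# Each 2x2 max pooling halves the height and width.
size_pool1 = img_size // 2
size_pool2 = size_pool1 // 2

# Length of each flattened vector fed into the densely connected layer.
flat = size_pool2 * size_pool2 * fea_conv2

# Number of trainable values per layer: weights plus biases.
params = {
    "conv1": patch * patch * channels * fea_conv1 + fea_conv1,
    "conv2": patch * patch * fea_conv1 * fea_conv2 + fea_conv2,
    "fc1": flat * fea_fc1 + fea_fc1,
    "fc2": fea_fc1 * n_classes + n_classes,
}
total = sum(params.values())

print("flattened size:", flat)
for name, count in params.items():
    print(name, count)
print("total:", total)
```

Almost all of the parameters sit in the densely connected layer, which is why its input size matters so much when `img_size` is changed.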


### Step 9: Dropout

To reduce overfitting, we will apply dropout before the readout layer. We create a placeholder for the probability that a neuron's output is kept during dropout. This allows us to turn dropout on during training, and turn it off during testing. TensorFlow's tf.nn.dropout op automatically handles scaling neuron outputs in addition to masking them, so dropout just works without any additional scaling.

In [23]:
# dropout
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
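
The scaling mentioned above can be illustrated with a few lines of NumPy. This is a sketch of the idea, not TensorFlow's code: each value is kept with probability `keep_prob` and the kept values are divided by `keep_prob`, so the expected output equals the input. `dropout_np` is a hypothetical name.

```python
import numpy as np


def dropout_np(x, keep_prob, rng):
    """Keep each entry with probability keep_prob and scale kept entries by 1/keep_prob."""
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)


rng = np.random.default_rng(0)
x = np.ones(1000)
train_out = dropout_np(x, 0.5, rng)  # training: about half the entries become 0, the rest 2
eval_out = dropout_np(x, 1.0, rng)   # evaluation: nothing is dropped
print(train_out.mean(), eval_out.mean())
```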

### Step 10: Readout Layer

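Finally, we add a readout layer that maps the `fea_fc1` features to 3 outputs, one per class. These outputs are unnormalized scores (logits); the softmax is applied inside the loss function in the next step.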

In [24]:
# readout layer
W_fc2 = weight_variable([fea_fc1, 3])
b_fc2 = bias_variable([3])

y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

### Step 11: Train and Evaluate the Model

In this part, we will use the Adam optimizer and pass the additional parameter `keep_prob` in `feed_dict` to control dropout: 0.5 during training and 1.0 during evaluation.

In large-scale machine learning, we usually adopt stochastic gradient descent (SGD) on mini-batches to reduce the memory burden and the running time per iteration. Here, each step feeds only `batch_size` images to the optimizer, taken in order from a shuffled list of indices. After each epoch (one pass over all the training samples), we reshuffle the data set and continue the optimization.
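
The batching scheme described above can be sketched independently of TensorFlow. The generator below is a hypothetical helper (`minibatch_indices` is not part of the notebook) that yields one array of sample indices per step and draws a new random order at the start of every epoch. It assumes the number of samples is a multiple of the batch size, as it is here (3000 and 50).

```python
import numpy as np


def minibatch_indices(n_samples, batch_size, n_steps, rng):
    """Yield index arrays of length batch_size, reshuffling at the start of each epoch."""
    steps_per_epoch = n_samples // batch_size
    for step in range(n_steps):
        pos = step % steps_per_epoch
        if pos == 0:
            # A new epoch starts: draw a fresh random order of all samples.
            order = rng.permutation(n_samples)
        yield order[pos * batch_size:(pos + 1) * batch_size]


rng = np.random.default_rng(0)
batches = list(minibatch_indices(n_samples=10, batch_size=5, n_steps=4, rng=rng))
for b in batches:
    print(b)
```

Within one epoch every sample index appears exactly once, which is the property the reshuffling in the training loop relies on.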

In [25]:
# train and evaluate the model
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))

train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# mini-batch bookkeeping: sample indices and number of iterations per epoch
# (the loss, optimizer and accuracy ops are already defined above)
id_raw = np.arange(3000)
it_per_epo = math.floor(3000/batch_size)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(1000):
        num_in_epo = i % it_per_epo
        if num_in_epo == 0:
            # start of a new epoch: reshuffle the training data
            id_shuf = random.sample(list(id_raw), 3000)
            print('Data reshuffled!')
        # take the next batch_size indices from the shuffled order
        start = num_in_epo * batch_size
        batch_id = id_shuf[start:start + batch_size]
        batch_img = train_img[batch_id]
        batch_label = train_label[batch_id]
        # one optimization step with dropout on (keep_prob = 0.5)
        train_step.run(feed_dict={x: batch_img, y_: batch_label, keep_prob: 0.5})
        # accuracy on the same batch with dropout off (keep_prob = 1.0)
        train_accuracy = accuracy.eval(feed_dict={
            x: batch_img, y_: batch_label, keep_prob: 1.0})
        if i % 10 == 0:
            print('step %d, training accuracy %g' % (i, train_accuracy))
            print('test accuracy %g' % accuracy.eval(feed_dict={
                x: test_img,
                y_: test_label, keep_prob: 1.0}))

Data reshuffled!
step 0, training accuracy 0.58
test accuracy 0.360333
step 10, training accuracy 0.48
test accuracy 0.519333
step 20, training accuracy 0.56


KeyboardInterrupt: 
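
For readers who want to see what the loss and the accuracy in this step compute, here is a NumPy sketch of both on a toy batch. It is meant as an illustration of the formulas, not as TensorFlow's implementation; `softmax_cross_entropy` and `batch_accuracy` are hypothetical names, and the logits and labels are made up for the example.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Per-sample cross-entropy between one-hot labels and softmax(logits)."""
    # Subtract the row maximum first so that the exponentials cannot overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)


def batch_accuracy(logits, labels):
    """Fraction of samples whose largest logit matches the labelled class."""
    return np.mean(logits.argmax(axis=1) == labels.argmax(axis=1))


logits = np.array([[0.0, 0.0, 0.0], [4.0, 1.0, 0.0], [0.0, 1.0, 3.0]])
labels = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

per_sample = softmax_cross_entropy(logits, labels)
loss = per_sample.mean()
acc = batch_accuracy(logits, labels)
print(loss, acc)
```

With all logits equal, as in the first row, the softmax is uniform over the 3 classes, so the loss for that sample is log(3), which is roughly the value to expect from an untrained classifier.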

## Part 2: Complete Code

In [26]:
import os
import random
import math
import csv
import tensorflow as tf
import numpy as np
import glob
# we use The Python Imaging Library (PIL) to preprocess image data
# you can install it by: pip install Pillow
from PIL import Image

img_size = 64
batch_size = 50
# number of features on each layer
fea_conv1 = 32
fea_conv2 = 64
fea_fc1 = 1024

# read images and transform them into numpy array
train_path = glob.glob("./data/training_data/images/*.jpg")
test_path = glob.glob("./data/test_data/images/*.jpg")

train_img = []
test_img = []

for file_name in train_path:
    pil_im = Image.open(file_name).convert('RGB')
    pil_resize = pil_im.resize((img_size,img_size))
    pil_array = np.array(pil_resize.getdata()).reshape(img_size, img_size, 3)
    train_img.append(pil_array)

for file_name in test_path:
    pil_im = Image.open(file_name).convert('RGB')
    pil_resize = pil_im.resize((img_size,img_size))
    pil_array = np.array(pil_resize.getdata()).reshape(img_size, img_size, 3)
    test_img.append(pil_array)

train_img = np.array(train_img)
test_img = np.array(test_img)
print('Images load is done!')

# read labels and transform them into 3000x3 numpy array
train_label = np.zeros((3000, 3))
test_label = np.zeros((3000, 3))
with open('./data/training_data/label_train.csv') as csvfile:
    label_train = csv.reader(csvfile, delimiter=',')
    ## delete row names
    next(label_train)
    train_list = list(label_train)

with open('./data/test_data/label_test.csv') as csvfile:
    label_test = csv.reader(csvfile, delimiter=',')
    next(label_test)
    test_list = list(label_test)
    
for i in range(3000):
    j_1 = int(train_list[i][1])
    j_2 = int(test_list[i][1])
    train_label[i, j_1] = 1
    test_label[i, j_2] = 1
print('Label loading and transform is done!')

# weight initialization

## values whose magnitude is larger than 2 standard
## deviations are dropped and re-picked
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

## initialize them with a slightly positive initial bias 
## to avoid "dead neurons"
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# convolution and pooling
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1],
                       padding='SAME')
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], 
                          strides=[1, 2, 2, 1], padding='SAME')


# first convolutional layer
W_conv1 = weight_variable([5, 5, 3, fea_conv1])
b_conv1 = bias_variable([fea_conv1])


x = tf.placeholder(tf.float32, shape=[None, img_size, img_size, 3])
y_ = tf.placeholder(tf.float32, shape=[None, 3])

x_image = x

h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# second convolutional layer
W_conv2 = weight_variable([5, 5, fea_conv1, fea_conv2])
b_conv2 = bias_variable([fea_conv2])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# densely connected layer
size_conv2 = int(img_size/4)
W_fc1 = weight_variable([size_conv2 * size_conv2 * fea_conv2, fea_fc1])
b_fc1 = bias_variable([fea_fc1])

h_pool2_flat = tf.reshape(h_pool2, [-1, size_conv2 * size_conv2 * fea_conv2])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# dropout
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# readout layer
W_fc2 = weight_variable([fea_fc1, 3])
b_fc2 = bias_variable([3])

y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

# train and evaluate the model
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))

train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# mini-batch bookkeeping: sample indices and number of iterations per epoch
# (the loss, optimizer and accuracy ops are already defined above)
id_raw = np.arange(3000)
it_per_epo = math.floor(3000/batch_size)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(1000):
        num_in_epo = i % it_per_epo
        if num_in_epo == 0:
            # start of a new epoch: reshuffle the training data
            id_shuf = random.sample(list(id_raw), 3000)
            print('Data reshuffled!')
        # take the next batch_size indices from the shuffled order
        start = num_in_epo * batch_size
        batch_id = id_shuf[start:start + batch_size]
        batch_img = train_img[batch_id]
        batch_label = train_label[batch_id]
        # one optimization step with dropout on (keep_prob = 0.5)
        train_step.run(feed_dict={x: batch_img, y_: batch_label, keep_prob: 0.5})
        # accuracy on the same batch with dropout off (keep_prob = 1.0)
        train_accuracy = accuracy.eval(feed_dict={
            x: batch_img, y_: batch_label, keep_prob: 1.0})
        if i % 10 == 0:
            print('step %d, training accuracy %g' % (i, train_accuracy))
            print('test accuracy %g' % accuracy.eval(feed_dict={
                x: test_img,
                y_: test_label, keep_prob: 1.0}))




Images load is done!
Label loading and transform is done!
Data reshuffled!
step 0, training accuracy 0.32
test accuracy 0.372667
step 10, training accuracy 0.48
test accuracy 0.473333
step 20, training accuracy 0.5
test accuracy 0.487333
step 30, training accuracy 0.58
test accuracy 0.592333
step 40, training accuracy 0.5
test accuracy 0.630667
step 50, training accuracy 0.76
test accuracy 0.637667
Data reshuffled!
step 60, training accuracy 0.72
test accuracy 0.632333
step 70, training accuracy 0.74
test accuracy 0.662667
step 80, training accuracy 0.76
test accuracy 0.660667
step 90, training accuracy 0.76
test accuracy 0.661333
step 100, training accuracy 0.52
test accuracy 0.646
step 110, training accuracy 0.78
test accuracy 0.654
Data reshuffled!
step 120, training accuracy 0.72
test accuracy 0.685333
step 130, training accuracy 0.78
test accuracy 0.674667
step 140, training accuracy 0.7
test accuracy 0.657
step 150, training accuracy 0.7
test accuracy 0.672
step 160, training acc

KeyboardInterrupt: 