# ECBM E4040 - Assignment 2- Task 5: Kaggle Open-ended Competition

Kaggle is a platform for predictive modelling and analytics competitions in which companies and researchers post data and statisticians and data miners compete to produce the best models for predicting and describing the data.

If you don't have a Kaggle account, feel free to join at [www.kaggle.com](https://www.kaggle.com). To let the TAs do the grading more conveniently, please use Lionmail to join Kaggle and use UNI as your username.

Visit the website for this competition to join: 
[https://www.kaggle.com/t/8dd419892b1c49a3afb0cea385a7e677](https://www.kaggle.com/t/8dd419892b1c49a3afb0cea385a7e677)

Details about this in-class competition is shown on the website above. Please read carefully.

<span style="color:red">__TODO__:</span>
1. Train a custom model for the bottle dataset classification problem. You are free to use any methods taught in the class or found by yourself on the Internet (ALWAYS provide reference to the source). General training methods include:
    * Dropout
    * Batch normalization
    * Early stopping
    * l1-norm & l2-norm penalization
2. You'll be given the test set to generate your predictions (70% public + 30% private, but you don't know which ones are public/private). Achieve 70% accuracy on the public test set. The accuracy will be shown on the public leaderboard once you submit your prediction .csv file. 
3. (A) Report your results on the Kaggle, for comparison with other students' optimization results (you should do this several times). (C) Save your best model, using BitBucket, at the same time when you (B) submit the homework files into Courseworks. See instructions below. 

__Hint__: You can start from what you implemented in task 4. Another classic classification model named 'VGG16' can also be easily implemented.

## HW Submission Details:
There are three components to reporting the results of this task: 

**(A) Submission (possible several) of the .csv prediction file throught the Kaggle platform;**. You should start doing this VARY early, so that students can compare their work as they are making progress with model optimization.

**(B) Editing and submitting the content of this Jupyter notebook, through Courseworks; **
(i) The code for your CNN model and for the training function. The code should be stored in __./ecbm4040/neuralnets/kaggle.py__;
(ii) Print out your training process and accuracy __within this notebook__;

**(C) Submitting your best CNN model through instructor-owned private BitBucket repo.**

**Description of (C):** 
For this task, you will be utilizing bitbucket to save your model for submission. Bitbucket provides Git code managment. For those who are not familiar with git operations, please check [Learn Git with Bitbucket Cloud](https://www.atlassian.com/git/tutorials/learn-git-with-bitbucket-cloud) as reference.
**TAs will create a private Bitbucket repository for each student, with the write access. This repo will be owned by the instructors. Make sure to properly submit your model to that exact repository (submissions to your own private repository will not count)** Students need to populate the following file to provide instructors with bitbucket account information: https://docs.google.com/spreadsheets/d/1_7cZjyr34I2y-AD_0N5UaJ3ZnqdhYcvrdoTsYvOSd-g/edit#gid=0.

<span style="color:red">__Submission content:__ :</span>
(i) Upload your best model with all the data output (for example, __MODEL.data-00000-of-00001, MODEL.meta, MODEL.index__) into the  BitBucket. Store your model in the folder named "__KaggleModel__" within the BitBucket repository. 
Remember to delete any intermediate results, **we only want your best model. Do not upload any data files**. The instructors will rerun the uploaded best model and verify against the score which you reported on the Kaggle.



In [1]:

%matplotlib inline
%load_ext autoreload
%autoreload 2

# Import modules
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image


In [17]:
import glob

paths = [ "kaggle/train_128/0/*.png" , "kaggle/train_128/1/*.png" ,\
         "kaggle/train_128/2/*.png" ,  "kaggle/train_128/3/*.png" , \
         "kaggle/train_128/4/*.png"]   


X_train = np.empty((15000,64,64,3))
y_train = np.empty(15000)

for i, path in enumerate(paths):
    filenames = glob.glob(path)
    temp = np.empty((len(filenames), 64, 64,3))
    print(str(len(filenames)) + ' images in batch'  )
    for j, fname in enumerate(filenames):
        img = Image.open(fname)
        img = img.resize((64,64),Image.NEAREST)
        arr = np.array(img)
        X_train[(3000*i) + j] = arr / 255
        y_train[(3000*i) + j] = i
    print(path[-7])
    



3000 images in batch
0
3000 images in batch
1
3000 images in batch
2
3000 images in batch
3
3000 images in batch
4


In [18]:
num_train = 14200 
num_val = 800
num_dev = 100

reshape = np.random.permutation(X_train.shape[0])

X_train = X_train[reshape]
y_train = y_train[reshape]

# The development set is used for augmentation practices.
mask = np.random.choice(num_train, num_dev, replace=False)
X_dev = X_train[mask]
y_dev = y_train[mask]

# Seperate Training set into a training set and a validation set
X_val = X_train[num_train:]
y_val = y_train[num_train:]
X_train = X_train[:num_train]
y_train = y_train[:num_train]



# Preprocessing: subtract the mean value across every dimension for training data, and reshape it to be RGB size
mean_image = np.mean(X_train, axis=0)
X_train = X_train.astype(np.float32) - mean_image.astype(np.float32)
X_val = X_val.astype(np.float32) - mean_image




print(X_train.shape, X_val.shape, X_dev.shape)

(14200, 64, 64, 3) (800, 64, 64, 3) (100, 64, 64, 3)


## Train your model here

In [None]:
from ecbm4040.image_generator import ImageGenerator

dev_gen = ImageGenerator(X_dev , y_dev)
dev_gen.show()

In [None]:
del dev_gen


Train_X_Gen = ImageGenerator(X_train , y_train)

In [49]:
from ecbm4040.neuralnets.cnn_kaggle import my_training
import tensorflow as tf




tf.reset_default_graph()
result , cache = my_training(X_train, y_train, X_val, y_val, 
             conv_featmap=[10],
             fc_units=[84, 84],
             conv_kernel_size=[5],
             pooling_size=[2],
             l2_norm= .0001,
             seed=235,
             use_adam = True,
             learning_rate= .001,
             epoch=20,
             batch_size=245,
             verbose=False,
             pre_trained_model= None)


Building my LeNet. Parameters: 
conv_featmap=[10]
fc_units=[84, 84]
conv_kernel_size=[5]
pooling_size=[2]
l2_norm=0.0001
seed=235
learning_rate=0.001
number of batches for training: 57
epoch 1 
epoch 2 
Best validation accuracy! iteration:100 accuracy: 69.75%
epoch 3 
epoch 4 
Best validation accuracy! iteration:200 accuracy: 76.375%
epoch 5 
epoch 6 
Best validation accuracy! iteration:300 accuracy: 79.875%
epoch 7 
epoch 8 
Best validation accuracy! iteration:400 accuracy: 83.625%
epoch 9 
epoch 10 
epoch 11 
epoch 12 
epoch 13 
epoch 14 
epoch 15 
epoch 16 
epoch 17 
epoch 18 
epoch 19 
epoch 20 
Best validation accuracy! iteration:1100 accuracy: 84.125%
Traning ends. The best valid accuracy is 84.125. Model named lenet_1509848889.


In [20]:
from ecbm4040.neuralnets.cnn_kaggle import my_training
import tensorflow as tf

kernel_sizes = [2, 5 ,10]
l2_penalties = [.00001, .0001 , .001, .01]
learning_rates = [1e-4, 1e-2, 1e-1]
batch_sizes    = [75, 150, 250, 400]
pooling_size   = [1, 2, 5, 7, 10]
conv_features  = [ 3, 6, 10 ]

results = list()
parameters = list()

for i in range(20):
    
    tf.reset_default_graph()
    k = np.random.choice(kernel_sizes)
    l = np.random.choice(l2_penalties)
    r = np.random.choice(learning_rates)
    b = np.random.choice(batch_sizes)
    p = np.random.choice(pooling_size)
    f = np.random.choice(conv_features)

    print('iteration ' + str(i) + ': \n')
    print('kernel size: ' + str(k)+ '\n')
    print('penalty: ' + str(l)+ '\n')
    print('learning rate: ' + str(r)+ '\n')
    print('batch size: ' + str(b)+ '\n')


    result , cache = my_training(X_train, y_train, X_val, y_val, 
             conv_featmap=[f],
             fc_units=[128, 84],
             conv_kernel_size=[k],
             pooling_size=[p],
             l2_norm= r,
             seed=235,
             use_adam = True,
             learning_rate= l,
             epoch=20,
             batch_size=b,
             verbose=False,
             pre_trained_model=None)
    
    results.append(result)
    parameters.append(cache)

iteration 0: 

kernel size: 5

penalty: 1e-05

learning rate: 0.1

batch size: 75

Building my LeNet. Parameters: 
conv_featmap=[10]
fc_units=[128, 84]
conv_kernel_size=[5]
pooling_size=[2]
l2_norm=0.1
seed=235
learning_rate=1e-05
number of batches for training: 189
epoch 1 
Best validation accuracy! iteration:100 accuracy: 35.625%
epoch 2 
Best validation accuracy! iteration:200 accuracy: 44.625%
Best validation accuracy! iteration:300 accuracy: 48.875%
epoch 3 
Best validation accuracy! iteration:400 accuracy: 53.75%
Best validation accuracy! iteration:500 accuracy: 56.125%
epoch 4 
Best validation accuracy! iteration:600 accuracy: 56.875%
Best validation accuracy! iteration:700 accuracy: 57.625%
epoch 5 
Best validation accuracy! iteration:800 accuracy: 59.25%
Best validation accuracy! iteration:900 accuracy: 59.625%
epoch 6 
Best validation accuracy! iteration:1000 accuracy: 60.875%
epoch 7 
Best validation accuracy! iteration:1200 accuracy: 61.375%
epoch 8 
Best validation accurac

epoch 1 
Best validation accuracy! iteration:100 accuracy: 61.375%
epoch 2 
Best validation accuracy! iteration:200 accuracy: 70.25%
epoch 3 
epoch 4 
Best validation accuracy! iteration:600 accuracy: 71.625%
Best validation accuracy! iteration:700 accuracy: 73.75%
epoch 5 
epoch 6 
epoch 7 
epoch 8 
Best validation accuracy! iteration:1500 accuracy: 76.375%
epoch 9 
epoch 10 
epoch 11 
epoch 12 
epoch 13 
epoch 14 
epoch 15 
epoch 16 
epoch 17 
epoch 18 
epoch 19 
epoch 20 
Traning ends. The best valid accuracy is 76.375. Model named lenet_1509843932.
iteration 8: 

kernel size: 2

penalty: 0.001

learning rate: 0.0001

batch size: 400

Building my LeNet. Parameters: 
conv_featmap=[10]
fc_units=[128, 84]
conv_kernel_size=[2]
pooling_size=[10]
l2_norm=0.0001
seed=235
learning_rate=0.001
number of batches for training: 35
epoch 1 
epoch 2 
epoch 3 
Best validation accuracy! iteration:100 accuracy: 59.125%
epoch 4 
epoch 5 
epoch 6 
Best validation accuracy! iteration:200 accuracy: 65.62

epoch 1 
epoch 2 
Best validation accuracy! iteration:100 accuracy: 13.375%
epoch 3 
Best validation accuracy! iteration:200 accuracy: 16.125%
epoch 4 
Best validation accuracy! iteration:300 accuracy: 17.25%
epoch 5 
Best validation accuracy! iteration:400 accuracy: 17.625%
epoch 6 
Best validation accuracy! iteration:500 accuracy: 18.25%
epoch 7 
Best validation accuracy! iteration:600 accuracy: 19.0%
epoch 8 
Best validation accuracy! iteration:700 accuracy: 19.75%
epoch 9 
epoch 10 
Best validation accuracy! iteration:900 accuracy: 20.25%
epoch 11 
Best validation accuracy! iteration:1000 accuracy: 20.875%
epoch 12 
Best validation accuracy! iteration:1100 accuracy: 22.25%
epoch 13 
Best validation accuracy! iteration:1200 accuracy: 23.625%
epoch 14 
Best validation accuracy! iteration:1300 accuracy: 25.0%
epoch 15 
Best validation accuracy! iteration:1400 accuracy: 25.875%
epoch 16 
Best validation accuracy! iteration:1500 accuracy: 27.75%
epoch 17 
epoch 18 
Best validation accur

In [24]:
results

[68.75,
 18.375,
 84.0,
 29.75,
 68.375,
 50.25,
 84.25,
 76.375,
 72.5,
 51.625,
 69.625,
 71.25,
 78.875,
 54.0,
 79.5,
 77.375,
 30.875,
 68.375,
 66.75,
 26.625]

## Save your best model

In [61]:
tf.reset_default_graph()
from ecbm4040.neuralnets.cnn_kaggle import evaluate


with tf.Session() as sess:
    saver = tf.train.import_meta_graph('model/lenet_1509844345.meta')
    saver.restore(sess, tf.train.latest_checkpoint('model/'))
    ys = tf.placeholder(shape=[None, ], dtype=tf.int64)

    
    eve = evaluate(saver, ys)

    sess.run(tf.global_variables_initializer())
    
    valid_eve = sess.run([eve], feed_dict={xs: X_val, ys: y_val})
    valid_acc = 100 - valid_eve * 100 / y_val.shape[0]


INFO:tensorflow:Restoring parameters from model/lenet_1509848889


InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [128] rhs shape= [84]
	 [[Node: save/Assign_8 = Assign[T=DT_FLOAT, _class=["loc:@fc_layer_0/fc_kernel/fc_bias_0"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](fc_layer_0/fc_kernel/fc_bias_0/Adam_1, save/RestoreV2_8/_9)]]

Caused by op 'save/Assign_8', defined at:
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 477, in start
    ioloop.IOLoop.instance().start()
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/zmq/eventloop/ioloop.py", line 177, in start
    super(ZMQIOLoop, self).start()
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
    self._handle_recv()
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
    self._run_callback(callback, msg)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
    callback(*args, **kwargs)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
    handler(stream, idents, msg)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/ipykernel/ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/ipykernel/zmqshell.py", line 533, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2698, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2802, in run_ast_nodes
    if self.run_code(code, result):
  File "/home/ecbm4040/miniconda2/envs/dlenv/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2862, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-61-10747f79722a>", line 6, in <module>
    saver = tf.train.import_meta_graph('model/lenet_1509844345.meta')
  File "/home/ecbm4040/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1698, in import_meta_graph
    **kwargs)
  File "/home/ecbm4040/.local/lib/python3.5/site-packages/tensorflow/python/framework/meta_graph.py", line 656, in import_scoped_meta_graph
    producer_op_list=producer_op_list)
  File "/home/ecbm4040/.local/lib/python3.5/site-packages/tensorflow/python/framework/importer.py", line 313, in import_graph_def
    op_def=op_def)
  File "/home/ecbm4040/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/ecbm4040/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [128] rhs shape= [84]
	 [[Node: save/Assign_8 = Assign[T=DT_FLOAT, _class=["loc:@fc_layer_0/fc_kernel/fc_bias_0"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](fc_layer_0/fc_kernel/fc_bias_0/Adam_1, save/RestoreV2_8/_9)]]


In [60]:
with tf.Session() as session:
    saver = tf.train.import_meta_graph('model/lenet_1509844345.meta')
    saver.restore(session, tf.train.latest_checkpoint())

    feed_dict = {tf_train_dataset : batch_data}
    predictions = session.run([test_prediction], feed_dict)

TypeError: latest_checkpoint() missing 1 required positional argument: 'checkpoint_dir'

In [None]:
# YOUR CODE HERE


## Generate .csv file for Kaggle

In [None]:
# The following code snippet can be used to generate your prediction .csv file.

# import csv
# with open('predicted.csv','w') as csvfile:
#     fieldnames = ['Id','label']
#     writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
#     writer.writeheader()    
#     for index,l in enumerate(predicted_values_generated_by_your_model):
#         filename = str(index)+'.png'
#         label = str(l)
#         writer.writerow({'Id': filename, 'label': label})