<a href="https://colab.research.google.com/github/yngvib/DeepLearningCourse/blob/master/Deep_Learning_Lab_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Deep Learning LAB 2**

The objective of this lab is to further familiarize students with the basics of Keras and deep learning. In particular, we will learn how to manipulate images, create and parameterize learning models, observe the effects of different activation functions and regularization terms, and visualize learning progress.

This lab is slightly adapted from an online tutorial by Adrian Rosebrock.

Read the code in each step carefully, with the aim of fully understanding what is going on (the instructor will help as needed). Then run each step.

Once you have finished going through all the steps, try to improve the test accuracy of the ANN, for example, by:

*   using a different activation function in the hidden layers
*   increasing the number of layers
*   adding regularizers (see the Keras documentation)

Were you able to improve the test accuracy of the network? By how much? Which enhancements worked the best? Show your result to the lab instructor.
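
As a starting point for the enhancements listed above, the following is a minimal sketch (one possible configuration, not necessarily the best one) that swaps the sigmoid activations for ReLU and adds an L2 weight penalty to the hidden layers. The helper name `build_model` and the penalty strength `1e-4` are assumptions chosen for illustration; the layer sizes match the 3072-1024-512-3 network built later in this lab.

```python
from keras import regularizers
from keras.layers import Dense, Input
from keras.models import Sequential


def build_model(num_classes=3, l2_strength=1e-4):
    """Hypothetical helper: 3072-1024-512-num_classes net with ReLU and L2."""
    model = Sequential()
    model.add(Input(shape=(3072,)))
    # kernel_regularizer adds an L2 penalty on the layer weights to the loss
    model.add(Dense(1024, activation="relu",
                    kernel_regularizer=regularizers.l2(l2_strength)))
    model.add(Dense(512, activation="relu",
                    kernel_regularizer=regularizers.l2(l2_strength)))
    model.add(Dense(num_classes, activation="softmax"))
    return model


model = build_model()
print(model.count_params())
```

Regularizers only add a term to the loss; they do not add weights, so the parameter count is the same as for the unregularized network of the same shape. Whether this variant actually improves test accuracy is something to measure, not assume.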






# Step 1:  "Upload" the images in Colaboratory


In [14]:
# Three thousand images of cats, dogs, and pandas (1000 each)
!wget https://www.ru.is/~yngvi/ML/lab2.tgz
!tar -xvzf lab2.tgz  

--2018-11-30 00:04:25--  https://www.ru.is/~yngvi/ML/lab2.tgz
Resolving www.ru.is (www.ru.is)... 54.229.5.10, 34.241.248.147, 2a05:d018:6f6:8704:3791:b3c0:a966:bbaa, ...
Connecting to www.ru.is (www.ru.is)|54.229.5.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 196221281 (187M) [application/x-compressed]
Saving to: ‘lab2.tgz.1’


2018-11-30 00:04:42 (11.3 MB/s) - ‘lab2.tgz.1’ saved [196221281/196221281]

lab2/
lab2/._images
lab2/images/
lab2/output/
lab2/._animals
lab2/animals/
lab2/animals/._dogs
lab2/animals/dogs/
lab2/animals/._cats
lab2/animals/cats/
lab2/animals/._panda
lab2/animals/panda/
lab2/animals/panda/._panda_00927.jpg
lab2/animals/panda/panda_00927.jpg
lab2/animals/panda/._panda_00099.jpg
lab2/animals/panda/panda_00099.jpg
lab2/animals/panda/._panda_00933.jpg
lab2/animals/panda/panda_00933.jpg
lab2/animals/panda/._panda_00700.jpg
lab2/animals/panda/panda_00700.jpg
lab2/animals/panda/._panda_00066.jpg
lab2/animals/panda/panda_00066.jpg
lab2/ani

# Step 2: Import necessary Python packages

Apart from the necessary Keras packages, we will use several other support libraries to make our lives easier: for example, OpenCV for reading (and manipulating) images, and scikit-learn for splitting the data, transforming labels, and reporting metrics.


In [0]:
import os  # miscellaneous operating-system operations, e.g., reading directories
import random

import cv2
import numpy as np

from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from sklearn.metrics import classification_report

import matplotlib.pyplot as plt

#from keras import regularizers

random_seed = 42   # include for reproducibility

# Step 3: Read in the filesystem paths of the images

In [15]:
print("[INFO] loading images...")

# Specify locations of input and output files.
tutorial_dir = "./lab2/"
args = {}
args["dataset"] = tutorial_dir + "animals/"
args["model"]   = tutorial_dir + "output/simple_nn.model"
args["plot"]    = tutorial_dir + "output/simple_nn_plot.png"

# Read in the file paths of the images to use for the training.
image_paths = list()
for (dirpath, dirnames, filenames) in os.walk(args["dataset"]):
  for file in filenames:
    if file.endswith('.jpg') and not file.startswith('.'):
      image_paths.append(os.path.join(dirpath, file))
random.seed(random_seed)
random.shuffle(image_paths)
print(image_paths)

[INFO] loading images...
['./lab2/animals/cats/cats_00673.jpg', './lab2/animals/dogs/dogs_00898.jpg', './lab2/animals/cats/cats_00210.jpg', './lab2/animals/panda/panda_00191.jpg', './lab2/animals/panda/panda_00378.jpg', './lab2/animals/panda/panda_00729.jpg', './lab2/animals/dogs/dogs_00422.jpg', './lab2/animals/panda/panda_00587.jpg', './lab2/animals/cats/cats_00346.jpg', './lab2/animals/dogs/dogs_00520.jpg', './lab2/animals/dogs/dogs_00405.jpg', './lab2/animals/cats/cats_00423.jpg', './lab2/animals/cats/cats_00600.jpg', './lab2/animals/dogs/dogs_00003.jpg', './lab2/animals/panda/panda_00060.jpg', './lab2/animals/cats/cats_00016.jpg', './lab2/animals/panda/panda_00748.jpg', './lab2/animals/cats/cats_00931.jpg', './lab2/animals/panda/panda_00218.jpg', './lab2/animals/panda/panda_00181.jpg', './lab2/animals/panda/panda_00727.jpg', './lab2/animals/cats/cats_00588.jpg', './lab2/animals/cats/cats_00996.jpg', './lab2/animals/cats/cats_00076.jpg', './lab2/animals/panda/panda_00965.jpg', './l
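
The effect of seeding before shuffling can be checked in isolation. The sketch below uses placeholder file names (not the real dataset listing) and a hypothetical helper `shuffled_copy`. Unlike the cell above, which seeds the global generator with `random.seed`, it uses a private `random.Random` instance; the point is the same, namely that a fixed seed yields the same order on every run.

```python
import random


def shuffled_copy(items, seed):
    """Return a shuffled copy of items using a generator seeded with seed."""
    rng = random.Random(seed)   # private generator, leaves global state alone
    result = list(items)
    rng.shuffle(result)
    return result


# Placeholder names for illustration only
paths = ["cats_%05d.jpg" % i for i in range(5)]

first = shuffled_copy(paths, seed=42)
second = shuffled_copy(paths, seed=42)
print(first == second)           # same seed, same order
print(sorted(first) == paths)    # shuffling only reorders the items
```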

# Step 4: Read in and preprocess the images

In [0]:
input_data   = []
input_labels = []
for image_path in image_paths:
  # Load an image
  image = cv2.imread(image_path)
  # Resize it to 32x32 pixels (ignoring aspect ratio), and
  # flatten it into a one-dimensional vector of 32x32x3=3072 values.
  image = cv2.resize(image, (32, 32)).flatten()
  # Store image in a list
  input_data.append(image)  
  # Extract the class label from the image path
  label = image_path.split(os.path.sep)[-2]
  # Store image label in a list
  input_labels.append(label)

# Normalize the pixel values to be in the range [0,1], and store as NumPy arrays.
input_data   = np.array(input_data, dtype="float") / 255.0
input_labels = np.array(input_labels)
print(input_data)
print(input_labels)
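
To see what the preprocessing does to a single image without needing the dataset, here is a small sketch that uses a synthetic NumPy array as a stand-in for the output of `cv2.resize` (an assumption for illustration; no real photo is loaded). It shows the flattening, the scaling to [0, 1], and how the class label falls out of the directory name in the path.

```python
import os

import numpy as np

# Stand-in for a resized 32x32 three-channel image (synthetic values)
image = np.full((32, 32, 3), 255, dtype=np.uint8)
image[0, 0] = (0, 128, 255)          # change one pixel so the values differ

flat = image.flatten()               # shape (3072,), still uint8
scaled = flat.astype("float") / 255.0

# Build a path in the dataset's layout and take the parent directory name
path = os.path.join("lab2", "animals", "cats", "cats_00673.jpg")
label = path.split(os.path.sep)[-2]

print(flat.shape, scaled.min(), scaled.max(), label)
```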

# Step 5: Split the data into training and test sets, and reformat target values

In [0]:
# Split the data into training and testing sets
(trainX, testX, trainY, testY) = train_test_split(
            input_data, input_labels, test_size=0.25, random_state=random_seed)

# Convert the categorical target labels into binary (one-hot) vectors
# (for 2-class classification, use Keras' to_categorical function instead,
#  because scikit-learn's LabelBinarizer then returns a single 0/1 column)
lb = LabelBinarizer()    # ... from scikit-learn
trainY = lb.fit_transform(trainY)  # ... from scikit-learn
testY  = lb.transform(testY)       # ... from scikit-learn
print(trainY)
print(testY)
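
The behaviour of `LabelBinarizer` is easiest to see on a handful of labels. The sketch below uses a few hand-written labels (not the lab's data) to show the one-hot encoding for three classes, how `argmax` maps each row back to a class, and the single-column result for two classes.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

three = np.array(["dogs", "cats", "panda", "cats"])
lb3 = LabelBinarizer()
onehot = lb3.fit_transform(three)
print(lb3.classes_)        # classes are stored in sorted order
print(onehot.shape)        # one column per class

# argmax recovers the class index of each row
decoded = lb3.classes_[onehot.argmax(axis=1)]
print(decoded)

# With only two classes the result is a single 0/1 column
two = np.array(["dogs", "cats", "cats"])
binary = LabelBinarizer().fit_transform(two)
print(binary.shape)
```

The column order follows `classes_`, which is why the evaluation step can pass `lb.classes_` as the target names for the argmax indices.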


# Step 6: Create the ANN model, train it, and then evaluate it

In [0]:
# Define a 3072-1024-512-3 architecture using Keras
model = Sequential()
model.add(Dense(1024, input_shape=(3072,), activation="sigmoid"))
model.add(Dense(512, activation="sigmoid"))
model.add(Dense(len(lb.classes_), activation="softmax"))

# Set initial learning rate and number of epochs to train for
INIT_LR = 0.01
EPOCHS = 5

# Compile the model using stochastic gradient descent (SGD) as our optimizer
# and categorical cross-entropy as the loss function
# (for 2-class classification, you would instead use binary_crossentropy)
print("[INFO] training network...")
opt = SGD(learning_rate=INIT_LR)  # `lr` was renamed `learning_rate` in newer Keras
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])

# Now train the ANN ...
H = model.fit(trainX, trainY, validation_data=(testX, testY), epochs=EPOCHS, batch_size=32)

# ... and then evaluate it.
print("[INFO] evaluating network...")
predictions = model.predict(testX, batch_size=32)
print(classification_report(testY.argmax(axis=1),
                            predictions.argmax(axis=1),
                            target_names=lb.classes_))

# Store the model on disk.
print("[INFO] serializing and storing the model ...")
model.save(args["model"])
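
To make the softmax output layer and the categorical cross-entropy loss less of a black box, here is a NumPy sketch of what they compute. It is an illustration under simple assumptions (made-up input values, a hypothetical clipping constant `eps`), not Keras's internal implementation.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))


# Illustrative pre-activation outputs for two samples (made-up numbers)
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
y_true = np.array([[1, 0, 0],
                   [0, 0, 1]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=1))       # each row sums to 1
print(probs.argmax(axis=1))    # predicted class index per sample
print(loss)
```

The `argmax` call mirrors what the evaluation code does with `predictions.argmax(axis=1)`: the class with the highest probability is taken as the prediction.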


# Step 7: Output a graph with information about learning progress


In [0]:
# Plot the training loss and accuracy
N = np.arange(0, EPOCHS)
plt.style.use("ggplot")
plt.figure()
plt.plot(N, H.history["loss"], label="train_loss")
plt.plot(N, H.history["val_loss"], label="val_loss")
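# Newer Keras versions name these keys "accuracy"/"val_accuracy"; older ones use "acc"/"val_acc"
acc_key = "accuracy" if "accuracy" in H.history else "acc"
plt.plot(N, H.history[acc_key], label="train_acc")
plt.plot(N, H.history["val_" + acc_key], label="val_acc")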
plt.title("Training Loss and Accuracy (Simple NN)")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend()
plt.savefig(args["plot"])