This notebook trains an image classification model and improves its performance step by step by debugging it. **GPUs are encouraged.** In Colab, you can add a GPU by opening the `Runtime` menu, selecting `Change runtime type`, and choosing `GPU` as the hardware accelerator.

Below we download and `unzip` the tiny ImageNet dataset.

Tiny ImageNet is a dataset of 100,000 images based on ImageNet. It has 200 categories instead of ImageNet's full 1,000. Each image is a 64x64 pixel color image, whereas ImageNet images are typically resized or cropped to 224x224 pixels for training, so each tiny image has about one-twelfth as many pixels. There are 500 images in each category instead of ImageNet's roughly 1,000 images per category.
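
The figures in the paragraph above can be checked with a few lines of arithmetic. This is only a sanity check on the quoted numbers; the variable names are illustrative and not part of the notebook.

```python
# Dataset size: 200 categories with 500 images each.
n_categories = 200
images_per_category = 500
total_images = n_categories * images_per_category

# Pixel counts of a 64x64 tiny image and a 224x224 ImageNet-sized input.
tiny_pixels = 64 * 64
imagenet_pixels = 224 * 224
pixel_ratio = imagenet_pixels / tiny_pixels

print(total_images)  # 100000
print(pixel_ratio)   # 12.25, i.e. roughly one-twelfth as many pixels
```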

In [1]:
!wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
!unzip -q tiny-imagenet-200.zip

--2023-09-25 19:20:57--  http://cs231n.stanford.edu/tiny-imagenet-200.zip
Resolving cs231n.stanford.edu (cs231n.stanford.edu)... 171.64.68.10
Connecting to cs231n.stanford.edu (cs231n.stanford.edu)|171.64.68.10|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 248100043 (237M) [application/zip]
Saving to: ‘tiny-imagenet-200.zip’


2023-09-25 19:21:00 (92.1 MB/s) - ‘tiny-imagenet-200.zip’ saved [248100043/248100043]



We’ll import some code for image processing. We’ll use `keras` for building our model and `numpy` for working with vectors. We’ll also use a function from `sklearn`, the scikit-learn library, for splitting the data into training and testing sets.

From the 200 categories in the set, we select a list of 36 categories that correspond to animals, `cats_0`. The list includes lions, boa constrictors, and king penguins.

We also select a list of 14 categories from Tiny ImageNet that correspond to bugs, `cats_1`. The list includes roaches, grasshoppers, scorpions, tarantulas, and dragonflies.

In [2]:
!pip install keras=='2.13.1'
from PIL import Image
from keras.preprocessing import image
import numpy as np
from sklearn.model_selection import train_test_split
import random
from keras.applications.vgg16 import preprocess_input
cats_0 = ['n01443537','n01629819','n01641577','n01644900','n01698640','n01742172',
          'n01855672','n01882714','n02002724','n02056570','n02058221','n02074367',
          'n02085620','n02094433','n02099601','n02099712','n02106662','n02113799',
          'n02123045','n02123394','n02124075','n02125311','n02129165','n02132136',
          'n02364673','n02395406','n02403003','n02410509','n02415577','n02423022',
          'n02437312','n02480495','n02481823','n02486410','n02504458','n02509815']
cats_1 = ['n01770393','n01774384','n01774750','n01784675','n02165456','n02190166',
          'n02206856','n02226429','n02231487','n02233338','n02236044','n02268443',
          'n02279972','n02281406']
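
The category identifiers above are WordNet IDs, which are hard to read on their own. A minimal sketch of turning them into names follows, assuming the archive ships a tab-separated `words.txt` file with one `wnid<TAB>names` entry per line. The helper `parse_words` and the file path in the comment are hypothetical and not part of the notebook.

```python
def parse_words(lines):
    """Map WordNet IDs to names from tab-separated 'wnid<TAB>names' lines."""
    names = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        wnid, _, label = line.partition("\t")
        names[wnid] = label
    return names

# One sample line in the assumed format. In the notebook one would instead
# pass open("tiny-imagenet-200/words.txt"), which is an assumed path.
sample = ["n01443537\tgoldfish, Carassius auratus\n"]
wnid_to_name = parse_words(sample)
print(wnid_to_name["n01443537"])
```

With such a mapping, `[wnid_to_name[c] for c in cats_0]` would list the animal categories by name.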



The `read_cats` subroutine reads the images for the categories `cats` it is given and it associates their vectors with the label `lab`. Initially, the lists of vectors, `vecs`, and labels, `labs`, are empty. We also pass in the desired training and testing set sizes, `train_size` and `test_size`.



We loop through each category, `c`, and image index, `i`. For each pair, we construct a filename in the training portion of the dataset. Specifically, the directory that stores the images is called `tiny-imagenet-200/train`. It has a subdirectory for each of the Tiny ImageNet categories, each of which contains a subdirectory called `images`. Within that directory, there are 500 JPEG files, each named with the category and a number from 0 to 499.
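
As a small illustration of the naming scheme, the sketch below builds the same path that `read_cats` constructs by string concatenation. The helper `image_path` is hypothetical and only mirrors the layout used in the notebook's code.

```python
def image_path(category, index, root="tiny-imagenet-200/train"):
    """Return the path of image number `index` in `category`."""
    return f"{root}/{category}/images/{category}_{index}.JPEG"

print(image_path("n01443537", 0))
# tiny-imagenet-200/train/n01443537/images/n01443537_0.JPEG
```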

We load the image and store it in `img`.
From the image object, we extract an array, apply VGG-16's `preprocess_input`, flatten the result into a vector, and rescale its values. Then, we reshape it to 64x64 pixels x 3 color channels.
We collect all the images and labels into two lists, `vecs` and `labs`.
Once the lists are constructed, we turn the list of vectors into a numpy array.
We split up this array and the labels with the desired train/test sizes and return the result.

In [3]:
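# Redefine read_cats to preprocess inputs
def read_cats(cats, lab, train_size, test_size):
  vecs = []
  labs = []
  for c in cats:
    for i in range(500):
      # Each category directory holds 500 images named <category>_<i>.JPEG
      img = image.load_img("tiny-imagenet-200/train/"+c+"/images/"+c+"_"+str(i)+".JPEG")
      img_arr = image.img_to_array(img)
      # VGG-16 preprocessing: RGB -> BGR and per-channel ImageNet mean subtraction
      img_arr = preprocess_input(img_arr)
      img_arr = img_arr.flatten()
      # Rescale on top of the mean-subtracted values
      img_arr = img_arr / 255. - 1
      img_arr = img_arr.reshape(64,64,3)
      vecs.append(img_arr)
      labs.append(lab)
  # Stack the per-image arrays into one array of shape (n_images, 64, 64, 3)
  vecs = np.asarray(vecs)
  # Returns X_train, X_test, y_train, y_test
  return train_test_split(vecs, labs, train_size=train_size, test_size=test_size)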

For both `cats_1` and `cats_0`, we use 10 percent of the available data for training and 20 percent for testing. Combining the data from each category will give us our training data, `X_train` and `y_train`, and our test data, `X_test` and `y_test`.

In [4]:
X0_train, X0_test, y0_train, y0_test = read_cats(cats_0, 0, .1, .2)
X1_train, X1_test, y1_train, y1_test = read_cats(cats_1, 1, .1, .2)
X_train = np.concatenate((X0_train, X1_train))
X_test = np.concatenate((X0_test, X1_test))
y_train = np.concatenate((y0_train, y1_train))
y_test = np.concatenate((y0_test, y1_test))
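
To see how many examples these fractions produce, the sketch below works out the expected split sizes by hand. It follows the documented convention of `train_test_split` that a float is a fraction of the data and an integer is an absolute count. The helper `split_sizes` is hypothetical, and it uses simple rounding, so scikit-learn's own result could differ by one when a product is not a whole number.

```python
def split_sizes(n_samples, train_size, test_size):
    """Resolve train/test sizes: floats are fractions, ints are counts."""
    def resolve(size):
        if isinstance(size, float):
            return int(round(size * n_samples))
        return size
    return resolve(train_size), resolve(test_size)

n_animals = 36 * 500  # 36 animal categories, 500 images each
n_bugs = 14 * 500     # 14 bug categories, 500 images each

animal_train, animal_test = split_sizes(n_animals, .1, .2)
bug_train, bug_test = split_sizes(n_bugs, .1, .2)

print(animal_train, animal_test)  # 1800 3600
print(bug_train, bug_test)        # 700 1400
print(animal_train + bug_train, animal_test + bug_test)  # 2500 5000
```

Under these assumptions the combined training set has 2,500 examples, of which 72 percent are animals, so the two classes are not balanced at this stage.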

`build_network` builds a VGG-16 model. It consists of a series of 13 convolutional layers, `Conv2D`, interspersed with max pooling layers, `MaxPooling2D`. After these layers, it includes 3 fully connected, `Dense`, layers ending with the output layer.

In the code to build and return the untrained neural network, the main changes are in the very first layer and the very last layer. VGG-16 is designed to recognize ImageNet images of size 224x224. The “tiny” images in this set are only 64x64, so `input_shape` is different.

The output shape is also different. In the ImageNet challenge, the learners have to assign an image to one of 1,000 categories. Here, we only have two: animal and bug. So, we use a single output unit with `sigmoid` activation. A sigmoid unit outputs a number near 1 if its weighted input sum is a large positive number and near 0 if that sum is a large negative number, with a smooth transition between them around zero. Thus, the outputs look like probabilities. The unit is trained to produce a number close to 1 for bugs and close to 0 for animals.
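
The behavior of a sigmoid unit is easy to check numerically. The sketch below uses the standard logistic function, 1 / (1 + e^(-z)); the function name `sigmoid` is illustrative and separate from the Keras activation of the same name.

```python
import math

def sigmoid(z):
    """Standard logistic function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-6, -1, 0, 1, 6):
    print(z, round(sigmoid(z), 4))

# sigmoid(0) is exactly 0.5, the midpoint between the two classes.
```

A common convention, assumed here rather than taken from the notebook, is to read an output above 0.5 as "bug" and below 0.5 as "animal".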

In [5]:
def build_network():
  import keras
  from keras.models import Sequential
  from keras.layers import Dense, Activation, Dropout, Flatten
  from keras.layers import Conv2D
  from keras.layers import MaxPooling2D
  input_shape = (64, 64, 3)
  #Instantiate an empty model
  model = Sequential()
  model.add(Conv2D(64, (3, 3), input_shape=input_shape, padding='same', activation='relu'))
  model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
  model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
  model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
  model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
  model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
  model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
  model.add(Conv2D(512, (3, 3), padding='same', activation='relu'))
  model.add(Conv2D(512, (3, 3), padding='same', activation='relu'))
  model.add(Conv2D(512, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
  model.add(Conv2D(512, (3, 3), padding='same', activation='relu'))
  model.add(Conv2D(512, (3, 3), padding='same', activation='relu'))
  model.add(Conv2D(512, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
  model.add(Flatten())
  model.add(Dense(4096, activation='relu'))
  model.add(Dense(4096, activation='relu'))
  model.add(Dense(1, activation='sigmoid'))
  return(model)
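
To see what the smaller input does to the network, the sketch below traces the spatial size through the five blocks of `build_network`. It assumes, as in the code above, that `padding='same'` convolutions keep the width and height unchanged and that each 2x2 max pooling with stride 2 halves them. The variable names are illustrative.

```python
side = 64                                  # input is 64x64x3
filters_per_block = [64, 128, 256, 512, 512]
convs_per_block = [2, 2, 3, 3, 3]

for filters, n_convs in zip(filters_per_block, convs_per_block):
    side //= 2                             # one max pooling layer per block
    print(f"{n_convs} convs with {filters} filters -> {side}x{side}x{filters}")

flattened = side * side * filters_per_block[-1]
n_conv_layers = sum(convs_per_block)

print(flattened)          # 2048 inputs to the first Dense layer
print(n_conv_layers + 3)  # 16 weight layers: 13 convolutional + 3 dense
```

With a 224x224 input the same trace would end at 7x7x512, so the first `Dense` layer here receives far fewer inputs than in the original VGG-16.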

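Now we dramatically increase the size of the training set to 10,000 examples, 5,000 of each class, and use 1,000 test examples, 500 of each class. Because `train_size` and `test_size` are now integers, `train_test_split` treats them as absolute counts rather than fractions.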

In [6]:
# running again with maximum training data
X0_train, X0_test, y0_train, y0_test = read_cats(cats_0, 0, 5000, 500)
X1_train, X1_test, y1_train, y1_test = read_cats(cats_1, 1, 5000, 500)
X_train = np.concatenate((X0_train, X1_train))
X_test = np.concatenate((X0_test, X1_test))
y_train = np.concatenate((y0_train, y1_train))
y_test = np.concatenate((y0_test, y1_test))
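
When reading the accuracy reported later, it helps to know what a trivial classifier would score. The sketch below computes the accuracy of always predicting the larger class. The helper `majority_baseline` is hypothetical, and the counts for the earlier fractional split (3,600 animal and 1,400 bug test images) assume that 20 percent of 18,000 and of 7,000 images were drawn.

```python
def majority_baseline(n_class0, n_class1):
    """Accuracy obtained by always predicting the larger class."""
    return max(n_class0, n_class1) / (n_class0 + n_class1)

# Test set from the cell above: 500 animals and 500 bugs.
print(majority_baseline(500, 500))    # 0.5

# Earlier fractional split, under the assumption stated above.
print(majority_baseline(3600, 1400))  # 0.72
```

With balanced classes, any test accuracy clearly above 0.5 indicates that the model has learned something beyond the class frequencies.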

We build and compile a fresh model with `SGD` for each of several learning rates, train it for 100 epochs, and evaluate it on the test set.
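
Why the learning rate matters can be seen on a toy problem. The sketch below runs plain gradient descent on f(x) = x², whose gradient is 2x, for 100 steps at several step sizes. It is only an illustration: the function `descend` is hypothetical, the value 1.1 is chosen to show divergence, and a deep network's loss surface is far less well behaved than this quadratic.

```python
def descend(alpha, steps=100, x0=1.0):
    """Run gradient descent on f(x) = x**2 and return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * x       # derivative of x**2
        x -= alpha * grad    # gradient descent update
    return x

for alpha in (0.001, 0.01, 0.1, 0.5, 1.1):
    print(alpha, abs(descend(alpha)))
```

The smallest step size barely moves from the starting point in 100 steps, the middle values get very close to the minimum at 0, and 1.1 overshoots further on every step and diverges. Sweeping several values, as the next cell does, is a simple way to find a workable range.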

In [8]:
from keras import optimizers

for alpha in (0.001, 0.005, 0.01, 0.05, 0.1, 0.5):
  solver = optimizers.SGD(learning_rate=alpha)
  model = build_network()
  model.compile(loss='mean_squared_error', optimizer=solver, metrics=['accuracy'])
  model.fit(X_train,y_train,epochs=100)
  print(model.evaluate(X_test,y_test))

Epoch 1/100
...
Epoch 78