<a href="https://colab.research.google.com/github/srodney/diffimageml/blob/main/diffimageml/examples/vgg16_keras_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Demo of a Keras VGG16 implementation

ML warm-up: a demo that builds a VGG16 network in Keras and trains it to classify images of cats and dogs from the Kaggle Dogs vs. Cats dataset. See, for example:

https://www.kaggle.com/shaochuanwang/keras-warm-up-cats-vs-dogs-cnn-with-vgg16

https://machinelearningmastery.com/how-to-develop-a-convolutional-neural-network-to-classify-photos-of-dogs-and-cats/


## Get the data from Kaggle to Colab

To set up the training data on Google Colab:

1. Log in to Kaggle (register for an account if needed).
2. Visit the Dogs vs. Cats competition: https://www.kaggle.com/c/dogs-vs-cats/data
3. Accept the competition rules (you must do this before you can download the data using the API).
4. Follow the steps described here: https://www.kaggle.com/general/74235
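
The last step mostly amounts to putting your Kaggle API token where the command-line tool expects it. A minimal sketch, assuming you have already downloaded `kaggle.json` from your Kaggle account page; the helper name `install_kaggle_token` is hypothetical and not part of any Kaggle package:

```python
import shutil
import stat
from pathlib import Path


def install_kaggle_token(token_path, kaggle_dir=None):
    """Copy a Kaggle API token into place with owner-only permissions.

    token_path: the kaggle.json file downloaded from your Kaggle account page
    kaggle_dir: destination directory (defaults to ~/.kaggle, where the
                Kaggle command-line tool looks for the token)
    """
    kaggle_dir = Path(kaggle_dir) if kaggle_dir else Path.home() / ".kaggle"
    kaggle_dir.mkdir(parents=True, exist_ok=True)
    dest = kaggle_dir / "kaggle.json"
    shutil.copyfile(token_path, dest)
    # Owner read/write only (equivalent to `chmod 600`), so that other
    # users on the machine cannot read the API key.
    dest.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return dest


# In a Colab cell, after uploading kaggle.json to the working directory:
# install_kaggle_token("kaggle.json")
# !pip install kaggle
# !kaggle competitions download -c dogs-vs-cats
```

The download command only succeeds once the competition rules have been accepted in the browser, which is why step 3 comes first.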

In [113]:
import os
import glob
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

import tensorflow as tf
from tensorflow import keras

In [114]:
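# Import everything from tensorflow.keras so that it matches the `keras`
# module imported above; mixing it with the standalone `keras` package
# can lead to incompatible layer and optimizer objects.
from tensorflow.keras import backend as K
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten
from tensorflow.keras.preprocessing.image import ImageDataGenerator

from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping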


In [115]:
%matplotlib inline

## Set up the data

In [120]:
# NOTE: data must be downloaded and unzipped into these directories.
# training and validation data are labeled.  Test data are not.
# Assumes classes are defined by subdirs within.
traindatadir = "dogs_vs_cats_data/train/"
validatedatadir = "dogs_vs_cats_data/validate/"
testdatadir = "dogs_vs_cats_data/test1/"
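assert os.path.isdir(traindatadir), f"missing directory: {traindatadir}"
assert os.path.isdir(validatedatadir), f"missing directory: {validatedatadir}"
assert os.path.isdir(testdatadir), f"missing directory: {testdatadir}"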

Ntrain = len(glob.glob(os.path.join(traindatadir,"*/*jpg")))
Nvalidate = len(glob.glob(os.path.join(validatedatadir,"*/*jpg")))
Ntest = len(glob.glob(os.path.join(testdatadir,"*jpg")))
print(f"found {Ntrain} images for training, {Nvalidate} for validation, {Ntest} for blind testing")

found 20000 images for training, 5000 for validation, 12500 for blind testing
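
The Kaggle training archive unpacks to a single flat directory of files named like `cat.0.jpg` and `dog.0.jpg`, so the class subdirectories assumed above have to be created by hand. The notebook does not show how its split was made; the following is one possible sketch, with a hypothetical helper `split_into_class_dirs` and an assumed 20% hold-out per class (which would give the 20000/5000 counts above if applied to the 25,000 labeled images).

```python
import random
import shutil
from pathlib import Path


def split_into_class_dirs(flat_dir, train_dir, validate_dir,
                          validate_fraction=0.2, seed=0):
    """Sort files named '<class>.<number>.jpg' into per-class subdirectories.

    A random validate_fraction of each class is copied under validate_dir,
    the rest under train_dir. Returns {(split, class): count}.
    """
    rng = random.Random(seed)
    by_class = {}
    for path in sorted(Path(flat_dir).glob("*.jpg")):
        label = path.name.split(".")[0]  # 'cat.0.jpg' -> 'cat'
        by_class.setdefault(label, []).append(path)

    counts = {}
    for label, paths in by_class.items():
        rng.shuffle(paths)
        n_validate = int(round(len(paths) * validate_fraction))
        splits = {"validate": (validate_dir, paths[:n_validate]),
                  "train": (train_dir, paths[n_validate:])}
        for split, (root, members) in splits.items():
            dest = Path(root) / label
            dest.mkdir(parents=True, exist_ok=True)
            for p in members:
                shutil.copy(p, dest / p.name)
            counts[(split, label)] = len(members)
    return counts


# Example (the source directory name "train" is an assumption):
# split_into_class_dirs("train", "dogs_vs_cats_data/train/",
#                       "dogs_vs_cats_data/validate/")
```

Files are copied rather than moved so the original download stays intact; the unlabeled test images need no such treatment.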


In [126]:
datagen = ImageDataGenerator()
traindata = datagen.flow_from_directory(directory=traindatadir,target_size=(224,224))

Found 20000 images belonging to 2 classes.


In [127]:
validatedata = datagen.flow_from_directory(directory=validatedatadir,target_size=(224,224))

Found 5000 images belonging to 2 classes.
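
With its default `class_mode="categorical"`, `flow_from_directory` indexes the classes in alphanumeric order of their subdirectory names and yields one-hot label vectors, which is why the network below ends in a 2-unit softmax layer trained with categorical cross-entropy. A plain-Python sketch of that encoding follows; the subdirectory names `cat` and `dog` are assumptions, and `one_hot_labels` is a hypothetical helper written only for illustration.

```python
def one_hot_labels(class_dirs, sample_classes):
    """Mimic the default label encoding of flow_from_directory.

    class_dirs: names of the class subdirectories
    sample_classes: class name of each image in a batch
    """
    # Classes are indexed in alphanumeric order of their directory names.
    class_indices = {name: i for i, name in enumerate(sorted(class_dirs))}
    labels = []
    for name in sample_classes:
        row = [0.0] * len(class_indices)
        row[class_indices[name]] = 1.0
        labels.append(row)
    return class_indices, labels


class_indices, labels = one_hot_labels(["dog", "cat"], ["dog", "cat", "cat"])
print(class_indices)  # {'cat': 0, 'dog': 1}
print(labels)         # [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
```

On the real generators, the same mapping is available as `traindata.class_indices`.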


In [128]:
model = Sequential()
model.add(Conv2D(input_shape=(224,224,3),filters=64,kernel_size=(3,3),padding="same", activation="relu"))
model.add(Conv2D(filters=64,kernel_size=(3,3),padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu"))
model.add(MaxPool2D(pool_size=(2,2),strides=(2,2)))

model.add(Flatten())
model.add(Dense(units=4096,activation="relu"))
model.add(Dense(units=4096,activation="relu"))
model.add(Dense(units=2, activation="softmax"))


In [129]:
# `lr` is a deprecated alias; current Keras versions expect `learning_rate`.
opt = Adam(learning_rate=0.001)
model.compile(optimizer=opt, loss=keras.losses.categorical_crossentropy, metrics=['accuracy'])

model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_26 (Conv2D)           (None, 224, 224, 64)      1792      
_________________________________________________________________
conv2d_27 (Conv2D)           (None, 224, 224, 64)      36928     
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 112, 112, 64)      0         
_________________________________________________________________
conv2d_28 (Conv2D)           (None, 112, 112, 128)     73856     
_________________________________________________________________
conv2d_29 (Conv2D)           (None, 112, 112, 128)     147584    
_________________________________________________________________
max_pooling2d_11 (MaxPooling (None, 56, 56, 128)       0         
_________________________________________________________________
conv2d_30 (Conv2D)           (None, 56, 56, 256)      
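
The `Param #` column can be checked by hand: a `Conv2D` layer has one weight per kernel element per input channel, plus one bias, for each filter. The short sketch below reproduces the counts for the first four convolutional layers listed in the summary; the function name `conv2d_params` is illustrative, not a Keras API.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Trainable parameters in a Conv2D layer: weights plus one bias per filter."""
    kh, kw = kernel_size
    return (kh * kw * in_channels + 1) * filters


# (in_channels, filters) for the first four convolutional layers
layers = [(3, 64), (64, 64), (64, 128), (128, 128)]
counts = [conv2d_params((3, 3), cin, f) for cin, f in layers]
print(counts)  # [1792, 36928, 73856, 147584]
```

The max-pooling layers have no trainable parameters, which matches the zeros in the table.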

In [None]:
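# Save the full model whenever validation accuracy improves. The metric is
# logged as "accuracy" (see the progress bar below), so monitor "val_accuracy".
checkpoint = ModelCheckpoint("vgg16_1.h5", monitor='val_accuracy', verbose=1, save_best_only=True, save_weights_only=False, mode='auto')
# Stop if validation accuracy has not improved for 20 consecutive epochs.
early = EarlyStopping(monitor='val_accuracy', min_delta=0, patience=20, verbose=1, mode='auto')
# model.fit accepts generators directly; fit_generator is deprecated in TF 2.x.
hist = model.fit(traindata, steps_per_epoch=100, validation_data=validatedata, validation_steps=10, epochs=100, callbacks=[checkpoint, early])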





Epoch 1/100
  4/100 [>.............................] - ETA: 1:37:21 - loss: 3165.9537 - accuracy: 0.4694
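
The `steps_per_epoch` and `validation_steps` values determine how much of the data each epoch actually sees. As a rough check, assuming the generators use the `flow_from_directory` default batch size of 32 (no `batch_size` is passed above), the arithmetic works out as follows:

```python
import math

batch_size = 32  # default batch_size of flow_from_directory (assumed unchanged)
n_train, n_validate = 20000, 5000

# Steps needed for one full pass over each data set
steps_full_train = math.ceil(n_train / batch_size)        # 625
steps_full_validate = math.ceil(n_validate / batch_size)  # 157

# Images actually drawn with the settings used in the training call
images_per_epoch = 100 * batch_size       # steps_per_epoch=100 -> 3200
images_per_validation = 10 * batch_size   # validation_steps=10 -> 320

print(steps_full_train, steps_full_validate)
print(images_per_epoch, images_per_validation)
```

Under that assumption, each "epoch" here covers about a sixth of the training images, so the 100 epochs requested correspond to roughly 16 full passes over the training set.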

In [None]:
# History keys follow the metric name used in compile(), i.e. "accuracy".
plt.plot(hist.history["accuracy"])
plt.plot(hist.history["val_accuracy"])
plt.plot(hist.history["loss"])
plt.plot(hist.history["val_loss"])
plt.title("Model accuracy and loss")
plt.ylabel("Accuracy / Loss")
plt.xlabel("Epoch")
plt.legend(["Accuracy", "Validation Accuracy", "Loss", "Validation Loss"])
#plt.savefig("trainvalid.png")