In [1]:
!git clone --single-branch --branch production https://github.com/bippity/intuit-project.git
%cd intuit-project/Current\ CNN

Cloning into 'intuit-project'...
remote: Enumerating objects: 3936, done.
remote: Total 3936 (delta 0), reused 0 (delta 0), pack-reused 3936
Receiving objects: 100% (3936/3936), 1.68 GiB | 37.62 MiB/s, done.
Resolving deltas: 100% (44/44), done.
Checking out files: 100% (3953/3953), done.
/content/intuit-project/Current CNN


In [0]:
import matplotlib.pyplot as plt 
import os 
import cv2
import numpy as np
import random 
import pickle
import tensorflow as tf
from tensorflow.keras.models import Sequential 
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D

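Create the training data: load each image in color (3 channels), resize it to 200x200, and label it by its folder (Good = 0, Bad = 1). Note that OpenCV's cv2.imread returns the channels in BGR order rather than RGB.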

In [0]:
DATADIR = "Kaggle"
CATEGORIES = ["Good", "Bad"]
training_data = []
IMG_SIZE = 200         # Target width and height in pixels

def create_training_data():
    for category in CATEGORIES:                 # Loop through each category folder inside DATADIR
        path = os.path.join(DATADIR, category)  # Path to the category folder
        class_num = CATEGORIES.index(category)  # Label from the folder's position: Good = 0, Bad = 1
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_COLOR)  # Load the image as a 3-channel array
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))   # Resize so every image has the same shape
                training_data.append([new_array, class_num])  # Add the image and its label to the training data
            except Exception:
                pass  # Skip files that fail to load or resize (for example, non-image files)
            
create_training_data()
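Because the loop above silently skips unreadable files, it can help to check how many files each folder holds before loading. The following is a minimal sketch, not part of the original notebook: collect_labeled_paths is a hypothetical helper that uses the same folder-position labeling and only lists files without decoding them.

```python
import os

def collect_labeled_paths(datadir, categories):
    """Hypothetical helper: list (path, label) pairs and per-category file counts.

    Labels follow the same rule as the notebook: the index of the folder
    name in `categories`. No image is decoded here.
    """
    pairs, counts = [], {}
    for label, category in enumerate(categories):
        folder = os.path.join(datadir, category)
        names = sorted(os.listdir(folder))  # sorted for a deterministic order
        counts[category] = len(names)
        pairs.extend((os.path.join(folder, name), label) for name in names)
    return pairs, counts
```

Calling collect_labeled_paths(DATADIR, CATEGORIES) and comparing the counts with len(training_data) would show how many files were skipped and whether the two classes are balanced.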

Shuffle the data before feeding it to the CNN, so the two classes are mixed instead of grouped by folder

In [0]:
random.shuffle(training_data)
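The shuffle above gives a different order on every run. If reproducible runs are wanted, one option is to seed the generator. This is a small sketch under that assumption; shuffled_copy and the seed value 42 are illustrative names and choices, not from the original notebook.

```python
import random

def shuffled_copy(items, seed=42):
    """Return a shuffled copy of `items`; the same seed gives the same order."""
    rng = random.Random(seed)  # private generator, leaves the global state untouched
    out = list(items)
    rng.shuffle(out)
    return out

# Small demonstration on stand-in data instead of the real training_data
demo = [("img%d" % i, i % 2) for i in range(6)]
print(shuffled_copy(demo) == shuffled_copy(demo))
```

In the notebook itself, calling random.seed(42) right before random.shuffle(training_data) would have a similar effect.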


Separate the features from the labels and convert both to NumPy arrays

In [0]:
X = []
y = []

for features, label in training_data:
    X.append(features)
    y.append(label)
    
X = np.array(X).reshape(-1,IMG_SIZE, IMG_SIZE, 3)
y = np.array(y)

Next we normalize the data by scaling it. Pixel values range from 0 to 255, so dividing by 255 brings them into the range 0 to 1. Keras also has built-in ways to do this, such as the rescale argument of ImageDataGenerator.

In [0]:
X = X/255 # Scale pixel values from [0, 255] to [0, 1]
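To make the reshape and scaling steps concrete, here is a small sketch on stand-in data (four random images instead of the real dataset). It also shows one possible variation: dividing a uint8 array by 255 yields float64, so casting to float32 first halves the memory. The names images, X_demo, and X_scaled are illustrative only.

```python
import numpy as np

IMG_SIZE = 200  # same size as in the notebook

# Stand-in for the loaded images: 4 random uint8 color images
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(IMG_SIZE, IMG_SIZE, 3), dtype=np.uint8)
          for _ in range(4)]

# Stack into one array of shape (num_images, height, width, channels)
X_demo = np.array(images).reshape(-1, IMG_SIZE, IMG_SIZE, 3)

# Cast to float32 before dividing so the result stays float32
X_scaled = X_demo.astype(np.float32) / 255.0

print(X_demo.shape, X_demo.dtype, X_scaled.dtype)
print((X_demo / 255).dtype)  # plain division, as in the notebook
```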

Instantiate a VGG16 model preloaded with ImageNet weights.

We tell it the image size and that the images are in color (3 channels).

include_top = False leaves out the original classification layers; we will add our own.

In [9]:
IMG_SIZE = 200
IMG_SHAPE = (IMG_SIZE, IMG_SIZE, 3)
VGG16_MODEL = tf.keras.applications.VGG16(input_shape=IMG_SHAPE,
                                          include_top=False,
                                          weights='imagenet')

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5


Summary of the layers VGG16 includes

In [10]:
VGG16_MODEL.summary()

Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 200, 200, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 200, 200, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 200, 200, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 100, 100, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 100, 100, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 100, 100, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 50, 50, 128)       0     
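The Param # column in the summary can be checked by hand. A convolution layer with a k x k kernel has (k * k * input_channels + 1) * filters parameters, where the +1 is the bias of each filter, and VGG16 uses 3x3 kernels throughout. The sketch below recomputes the four convolution rows shown above; conv2d_params is an illustrative helper name.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard Conv2D layer."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

# (layer name, input channels, filters) for the rows shown in the summary
layers = [
    ("block1_conv1", 3, 64),
    ("block1_conv2", 64, 64),
    ("block2_conv1", 64, 128),
    ("block2_conv2", 128, 128),
]

for name, in_channels, filters in layers:
    print(name, conv2d_params(3, 3, in_channels, filters))
```

The pooling layers have no parameters, which is why their rows show 0.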

We freeze the VGG16 model so that its pretrained weights are not updated during training. We also create two more layers: a global average pooling layer and our output layer.

In [0]:
VGG16_MODEL.trainable = False
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
prediction_layer = tf.keras.layers.Dense(2,activation='softmax')

Combine the frozen VGG16 base and the two layers we just created into a Sequential model

In [0]:
model = tf.keras.Sequential([
  VGG16_MODEL,
  global_average_layer,
  prediction_layer
])
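To illustrate what the two added layers compute, here is a NumPy sketch of global average pooling followed by a 2-unit softmax layer. It assumes the base outputs a 6x6x512 feature map per image (what five 2x2 poolings of a 200x200 input would give) and uses random stand-in values for the features and weights, so it shows the shapes and arithmetic only, not the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the VGG16 base output on a batch of 2 images (assumed shape)
features = rng.normal(size=(2, 6, 6, 512))

# Global average pooling: average each channel over height and width
pooled = features.mean(axis=(1, 2))          # shape (2, 512)

# Dense layer with 2 units: random stand-in weights, zero bias
W = rng.normal(scale=0.01, size=(512, 2))
b = np.zeros(2)
logits = pooled @ W + b                      # shape (2, 2)

# Softmax, shifted by the row maximum for numerical stability
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)

print(pooled.shape, probs.shape)
print("trainable parameters in the head:", W.size + b.size)
```

Under these assumptions the head has 512 * 2 + 2 trainable parameters, and these are the only weights that training updates while the base is frozen.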

Compile the model with the 'adam' optimizer and 'sparse categorical crossentropy' as the loss

In [0]:
model.compile(optimizer='adam', 
              loss=tf.keras.losses.sparse_categorical_crossentropy,
              metrics=["accuracy"])
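Sparse categorical crossentropy fits here because the labels are plain integers (0 or 1) rather than one-hot vectors. For one sample, the loss is the negative log of the probability the model assigns to the true class. The sketch below is a simplified version of that formula; it ignores the small clipping Keras applies to avoid log(0), and the function names are illustrative.

```python
import math

def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: -log of the probability given to the true class."""
    return -math.log(probs[label])

def batch_loss(batch_probs, labels):
    """Mean loss over a batch of softmax outputs and integer labels."""
    losses = [sparse_categorical_crossentropy(p, y)
              for p, y in zip(batch_probs, labels)]
    return sum(losses) / len(losses)

# Confident and correct vs. confident and wrong, on made-up probabilities
print(sparse_categorical_crossentropy([0.9, 0.1], 0))
print(sparse_categorical_crossentropy([0.9, 0.1], 1))
```

A confident correct prediction gives a loss near zero, while a confident wrong one is penalized heavily.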

Fit the model for 5 epochs, holding out 10% of the data for validation

In [14]:
model.fit(X, y, batch_size = 12, epochs = 5, validation_split = .1)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7fb9c00aa0f0>
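With validation_split, Keras takes the validation samples from the end of X and y before any shuffling during training, which is one reason the earlier random.shuffle matters: otherwise the held-out 10% would come entirely from the last folder. The sketch below works out the split sizes and steps per epoch for a hypothetical dataset of 1000 images; the real count depends on how many files loaded successfully, and fit_split is an illustrative helper that mirrors this rule as I understand it.

```python
import math

def fit_split(num_samples, validation_split, batch_size):
    """Return (training samples, validation samples, training steps per epoch)."""
    split_at = int(math.floor(num_samples * (1.0 - validation_split)))
    n_train = split_at                   # first part of the data is used for training
    n_val = num_samples - split_at       # last part is held out for validation
    steps = math.ceil(n_train / batch_size)
    return n_train, n_val, steps

# Hypothetical dataset size, with the notebook's split and batch size
n_train, n_val, steps = fit_split(1000, 0.1, 12)
print(n_train, n_val, steps)
```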

In [15]:
model.save("VGG16_v1")

Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: VGG16_v1/assets


In [16]:
# Zip the saved model folder so it can be downloaded
!zip -r VGG16_v1.zip VGG16_v1

  adding: VGG16_v1/ (stored 0%)
  adding: VGG16_v1/variables/ (stored 0%)
  adding: VGG16_v1/variables/variables.data-00001-of-00002 (deflated 7%)
  adding: VGG16_v1/variables/variables.index (deflated 64%)
  adding: VGG16_v1/variables/variables.data-00000-of-00002 (deflated 80%)
  adding: VGG16_v1/assets/ (stored 0%)
  adding: VGG16_v1/saved_model.pb (deflated 92%)
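Outside Colab, where the zip shell command may not be available, the same archive can be produced from Python's standard library. This is a sketch of that alternative, not what the notebook ran; zip_model_dir is a hypothetical helper name.

```python
import os
import shutil

def zip_model_dir(model_dir):
    """Create <model_dir>.zip next to the folder and return the archive path.

    The archive keeps the folder name as its top-level entry, like `zip -r`.
    """
    model_dir = os.path.abspath(model_dir)
    parent, name = os.path.split(model_dir)
    return shutil.make_archive(model_dir, "zip", root_dir=parent, base_dir=name)
```

For the model saved above, zip_model_dir("VGG16_v1") would write VGG16_v1.zip in the current directory.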
