# Quickmodel
A simple command-line project I'm putting together. Goal: the user runs the script with a specific search term, and the program returns a pickled ML model trained to recognize images of that search term.
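
As a rough sketch of that interface, the search term could be read with ``argparse`` and turned into the two class names used later in this notebook. The function names, the ``--output`` flag, and its default file name are hypothetical; none of them exist in the project yet.

```python
import argparse

def parse_args(argv=None):
    """Parse the command line; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Train an image classifier for a search term.")
    parser.add_argument("search_term",
                        help="what the model should recognize, e.g. Eggs")
    # Hypothetical flag: where the trained model would be written
    parser.add_argument("--output", default="model.pkl",
                        help="output file for the trained model")
    return parser.parse_args(argv)

def class_names(search_term):
    """Positive class plus a NOT- class, mirroring the folder names used below."""
    return [search_term, f"NOT-{search_term}"]

# Passing the arguments explicitly keeps the example runnable inside a notebook
args = parse_args(["Eggs"])
print(class_names(args.search_term))  # ['Eggs', 'NOT-Eggs']
```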

In [1]:
import tensorflow as tf
from tensorflow import keras 
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
import numpy as np
import matplotlib.pyplot as plt
import os
from tensorflow.keras.optimizers import SGD
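img_dir = os.listdir('../data/')  # top-level entries, i.e. the class folders
# Count the files inside the class folders (assumed to be images only), not the folders themselves
img_count = sum(len(files) for _, _, files in os.walk('../data/'))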
type(img_dir)

list

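First, let's generate our dataset from our existing files. Luckily, ``keras`` provides a convenient way to do this with its ``ImageDataGenerator`` class, which rescales and augments the images and splits them into training and validation subsets.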

In [2]:
image_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    rescale=1./255, 
    validation_split=0.3
)

IMG_HEIGHT = 256
IMG_WIDTH = 256
BATCH_SIZE = 20
STEPS_PER_EPOCH = int(np.ceil(img_count / BATCH_SIZE))  # Keras expects an integer step count

train_generator = image_generator.flow_from_directory(
    directory='../data',
    target_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE,  # otherwise Keras falls back to its default of 32
    classes=['Eggs', 'NOT-Eggs'],  # should come from a script parameter
    subset='training',
    class_mode='binary')
validation_generator = image_generator.flow_from_directory(
    directory='../data',
    target_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE,
    classes=['Eggs', 'NOT-Eggs'],
    subset='validation',
    class_mode='binary')

Found 126 images belonging to 2 classes.
Found 54 images belonging to 2 classes.
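
For a sense of scale, the counts reported above translate into only a handful of batches per epoch. The sketch below assumes the generators actually use ``BATCH_SIZE = 20``; the helper name ``steps_for`` is made up for illustration.

```python
import math

BATCH_SIZE = 20
TRAIN_IMAGES = 126       # "Found 126 images" (training subset)
VALIDATION_IMAGES = 54   # "Found 54 images" (validation subset)

def steps_for(num_images, batch_size):
    """Batches needed to see every image once; the last batch may be partial."""
    return math.ceil(num_images / batch_size)

total = TRAIN_IMAGES + VALIDATION_IMAGES
print(total, VALIDATION_IMAGES / total)           # 180 0.3
print(steps_for(TRAIN_IMAGES, BATCH_SIZE))        # 7
print(steps_for(VALIDATION_IMAGES, BATCH_SIZE))   # 3
```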


Now that we have our dataset, we need to construct a model and train it.

In [3]:
model = keras.models.Sequential([
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(32, 2, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(256, activation='relu'),
    Dense(1, activation='sigmoid')
])

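# `lr` and `decay` are legacy arguments; InverseTimeDecay reproduces lr / (1 + decay * step)
lr_schedule = keras.optimizers.schedules.InverseTimeDecay(
    initial_learning_rate=0.01, decay_steps=1, decay_rate=1e-6)
opt = SGD(learning_rate=lr_schedule, momentum=0.9, nesterov=True)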
model.compile(optimizer=opt,
              loss='binary_crossentropy',
              metrics=['accuracy'])
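
To see where this network's capacity sits, the layer shapes and parameter counts can be worked out by hand. This is a back-of-the-envelope sketch, assuming 256x256 RGB inputs (three channels, the default color mode of ``flow_from_directory``) and the default 2x2 pooling; the helper names are made up for illustration.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit for a Dense layer."""
    return inputs * units + units

size, channels = 256, 3          # assumed input: 256x256 RGB
total = 0
for filters, kernel in [(32, 3), (64, 3), (64, 3), (32, 2)]:
    total += conv_params(kernel, channels, filters)
    channels = filters           # padding='same' keeps height and width
    size //= 2                   # each MaxPooling2D() halves height and width

flat = size * size * channels    # 16 * 16 * 32 = 8192 features after Flatten
total += dense_params(flat, 256) + dense_params(256, 1)

print(size, flat, total)         # 16 8192 2162209
```

Under these assumptions, about 97% of the parameters sit in the first ``Dense`` layer, so shrinking the feature map before ``Flatten`` is one obvious place to look when revisiting the architecture.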


Now, let's train and test our model. Keep in mind that the dataset is quite small (126 training and 54 validation images), so accuracy might be low. We will probably need to generate more augmented data.

In [None]:
from tensorflow.keras.callbacks import EarlyStopping

model.fit(
    train_generator,
#     steps_per_epoch = train_generator.samples // BATCH_SIZE,
    validation_data = validation_generator, 
#     validation_steps = validation_generator.samples // BATCH_SIZE,
    epochs = 30,
#     callbacks=[EarlyStopping(monitor='val_loss', patience=4)],
)

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
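
The commented-out ``EarlyStopping(monitor='val_loss', patience=4)`` callback would end training once the validation loss stops improving. The following is a simplified pure-Python sketch of that patience rule, not Keras's actual implementation, and the loss values are invented for illustration.

```python
def stopping_epoch(val_losses, patience=4):
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss       # new best value: reset the counter
            waited = 0
        else:
            waited += 1       # another epoch without improvement
            if waited >= patience:
                return epoch
    return None

# Invented validation losses: the best value is reached at epoch 3
losses = [0.70, 0.62, 0.55, 0.58, 0.57, 0.56, 0.60, 0.59]
print(stopping_epoch(losses))  # 7
```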

In [7]:
# Validation accuracy recorded after the last completed epoch
print(model.history.history['val_accuracy'][-1])

0.88


Our model reaches about 88% validation accuracy on the held-out 30% split (a single train/validation split, not cross-validation). That is definitely not amazing, and I have architectural improvements in mind for the future, but it is far better than the 60% baseline I started out with.
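
Because the validation subset holds only 54 images, the 0.88 figure carries noticeable uncertainty. As a rough illustration, assuming the validation images behave like independent draws, a normal-approximation interval can be computed as follows (the function name is made up).

```python
import math

def accuracy_interval(accuracy, n, z=1.96):
    """Approximate 95% interval for an accuracy measured on n independent samples."""
    stderr = math.sqrt(accuracy * (1 - accuracy) / n)
    return accuracy - z * stderr, accuracy + z * stderr

low, high = accuracy_interval(0.88, 54)
print(round(low, 2), round(high, 2))  # 0.79 0.97
```

Under that assumption, the measured accuracy is compatible with anything from roughly 79% to 97%, which is another reason to collect or generate more data before comparing architectures.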