<a href="https://colab.research.google.com/github/GantMan/MachineLearningTraining/blob/master/Truck_Identifier_From_Scratch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Let's build and train a Keras model that identifies trucks.



In [0]:
from keras.datasets import cifar10
from keras.models import Sequential
from keras import regularizers
# BatchNormalization is imported from keras.layers; the old
# keras.layers.normalization path is not available in newer Keras releases.
from keras.layers import BatchNormalization, Conv2D, Dense, Dropout, Flatten, MaxPool2D
import numpy as np
import matplotlib.pyplot as plt

We'll be using the [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html).  The CIFAR-10 dataset consists of 60000 32x32 color images in 10 classes, with 6000 images per class, all labeled.  

This fortunately comes with an existing training/testing split ready to go right in Keras!  Thanks [pre-existing Keras datasets](https://keras.io/datasets/)!

In [0]:
# Load data set
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

**The dataset is labeled with the following 10 numerical outcomes.**

0.   Airplane
1.   Automobile
2.   Bird
3.   Cat
4.   Deer
5.   Dog
6.   Frog
7.   Horse
8.   Ship
9.   Truck
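
As a small convenience, the numeric labels above can be kept in a plain Python list so a label index maps back to its name. The names `CLASS_NAMES` and `TRUCK_LABEL` are hypothetical helpers for illustration, not something Keras provides.

```python
# Hypothetical lookup table: index i holds the name for CIFAR-10 label i.
CLASS_NAMES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

# The label we care about in this notebook.
TRUCK_LABEL = CLASS_NAMES.index("truck")
print(TRUCK_LABEL)  # 9
```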


In [0]:
def visualize_classes():
    # Build a grid: one column per class, 10 stacked examples in each column
    graphic = []
    for i in range(0, 10):
        img_batch = x_train[y_train.flatten() == i][0:10]  # 10 examples of each class
        combo = np.concatenate(img_batch, axis=0)  # stack them into one tall image
        graphic.append(combo)
    plt.figure(figsize=(12, 24))
    plt.axis('off')
    plt.imshow(np.concatenate(graphic, axis=1))  # place the columns side by side



In [0]:
visualize_classes()


We use a bit of boolean broadcasting to convert the numeric labels to booleans.  We'll set truck to `True`, and all others to `False`.


In [0]:
# set all trucks to True and everything else to False
# by broadcasting the comparison throughout the dataset
y_train = y_train == 9
y_test = y_test == 9
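
To see what that comparison does, here is a minimal sketch on a made-up label array with the same `(n, 1)` shape as `y_train`. The array `toy_labels` is invented for illustration.

```python
import numpy as np

# Made-up labels shaped like y_train: one column, one row per image.
toy_labels = np.array([[9], [3], [0], [9], [5]])

# Comparing against a scalar broadcasts the test over every element.
is_truck = toy_labels == 9

print(is_truck.flatten().tolist())  # [True, False, False, True, False]
print(is_truck.mean())              # fraction of trucks in the toy data
```

In the real dataset each class holds one tenth of the images, so only about 10% of the labels become `True`. A model that always answers "not a truck" would already score about 90% accuracy, which is worth remembering when reading the accuracy numbers later.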

In [0]:
# prepare x to be normalized (between 0 and 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
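
The same scaling on a tiny made-up array shows the effect. The cast to `float32` comes first because the pixel data is stored as 8-bit integers, which can't hold fractional values. `toy_pixels` is an invented stand-in for image data.

```python
import numpy as np

# Invented stand-in for raw pixel data: 8-bit integers from 0 to 255.
toy_pixels = np.array([0, 51, 255], dtype=np.uint8)

# Cast first, then scale into the 0 to 1 range.
scaled = toy_pixels.astype("float32") / 255

print(scaled.dtype)                # float32
print(scaled.min(), scaled.max())  # 0.0 1.0
```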

In [0]:
model = Sequential()

model.add(
    Conv2D(
        32,
        (3, 3),
        padding='same',
        input_shape=(32, 32, 3),
        activation="relu"
    )
)
model.add(Conv2D(32, (3, 3), activation="relu"))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.25))

![pool after conv](https://pbs.twimg.com/media/DwCNkbIW0AENQy6.jpg =400x)

In [0]:
# second grouping
model.add(
    Conv2D(64, (3, 3), padding='same', activation="relu")
)
model.add(Conv2D(64, (3, 3), activation="relu"))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.25))
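
It can help to trace how the image shrinks through the two groupings. This is a rough sketch of the shape arithmetic, assuming stride-1 convolutions and Keras's default `'valid'` padding wherever `padding` is not given. The helper functions `conv_size` and `pooled_size` are hypothetical, not part of Keras.

```python
def conv_size(size, kernel=3, padding="valid"):
    # 'same' keeps the width/height; 'valid' loses kernel - 1 pixels (stride 1).
    return size if padding == "same" else size - kernel + 1

def pooled_size(size, pool=2):
    # A 2x2 max pool with a matching stride floors the division.
    return size // pool

size = 32
for _ in range(2):  # the two conv/conv/pool groupings
    size = conv_size(size, padding="same")
    size = conv_size(size, padding="valid")
    size = pooled_size(size)

# 64 filters come out of the last conv layer, so flattening yields this many values.
flattened = size * size * 64
print(size, flattened)  # 6 2304
```

So the feature maps go 32 → 30 → 15 in the first grouping and 15 → 13 → 6 in the second.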

## Adding L2 Regularization

Regularization adds a penalty for large weights, which helps keep values that grow too big for your computer from ruining everything, and it also curbs overfitting.  Exploding and vanishing gradients, even if they don't crash your machine, can slow things down to a crawl.   Regularization helps!   L2 is the most popular, unless you're looking to significantly remove useless inputs.  In those cases the more aggressive L1 regularization would work, since it tends to push unhelpful weights all the way to zero.

In [0]:
# Final dense layer
model.add(Flatten())
model.add(Dense(512, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(1, activation="sigmoid", kernel_regularizer=regularizers.l2(0.01)))
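
As a rough illustration of what the penalty adds to the loss: an L2 regularizer contributes the factor times the sum of squared weights, while L1 uses absolute values instead. The weights below are made up, and this is a sketch of the arithmetic rather than Keras's own code.

```python
import numpy as np

# Made-up weights for illustration.
weights = np.array([0.5, -2.0, 0.0, 3.0])
factor = 0.01  # same factor passed to regularizers.l2 above

# L2 squares each weight, so the big ones dominate the penalty.
l2_penalty = float(factor * np.sum(weights ** 2))
# L1 charges every non-zero weight in proportion to its size.
l1_penalty = float(factor * np.sum(np.abs(weights)))

print(round(l2_penalty, 4), round(l1_penalty, 4))  # 0.1325 0.055
```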


In [0]:
model.compile(
    loss="binary_crossentropy",
    optimizer="adam",
    metrics=['accuracy']
)
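
For intuition about what `binary_crossentropy` measures, here is a minimal NumPy sketch of the formula averaged over a few made-up predictions. It is only an illustration under those assumptions, not Keras's implementation.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip so log(0) never occurs.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

y_true = np.array([1.0, 0.0, 0.0, 1.0])
confident = np.array([0.9, 0.1, 0.2, 0.8])  # mostly right and sure of it
unsure = np.array([0.5, 0.5, 0.5, 0.5])     # shrugging at every image

confident_loss = binary_crossentropy(y_true, confident)
unsure_loss = binary_crossentropy(y_true, unsure)  # equals ln(2), about 0.693

print(confident_loss, unsure_loss)
```

Confident, correct predictions give a loss close to zero, while hedging at 0.5 everywhere gives ln(2).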

## Training with that GPU goodness

![GPU yum](https://pbs.twimg.com/media/DrdQ4MIXgAIgIAG.jpg)

To train we call `fit` on the model and we wait.  An epoch on a fast CPU is around 5 minutes.

In [0]:
model.fit(
    x_train,
    y_train,
    batch_size=32,
    epochs=3,
    validation_data=(x_test, y_test),
    shuffle=True
)
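
The progress bar during training counts batches, not images. As a quick check, assuming the standard CIFAR-10 split of 50,000 training images:

```python
import math

train_images = 50_000  # standard CIFAR-10 training split (assumed here)
batch_size = 32        # matches the fit call above

# The last batch is smaller than the rest, so round up.
steps_per_epoch = math.ceil(train_images / batch_size)
print(steps_per_epoch)  # 1563
```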



In [0]:
model.save("scratch_truck_model.h5")

At any time we can load this saved model.  We would import `load_model` from `keras.models` and call it with our model file, `scratch_truck_model.h5`.
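
A minimal sketch of that reload, assuming the file was saved in the current working directory. The names `model_path` and `restored_model` are just illustrative, and the import sits inside the check so the snippet does nothing heavy if the file is missing.

```python
from pathlib import Path

model_path = Path("scratch_truck_model.h5")

if model_path.exists():
    from keras.models import load_model

    # Rebuild the saved model from the file and show its layers.
    restored_model = load_model(str(model_path))
    restored_model.summary()
else:
    print(f"{model_path} not found; run the save cell first.")
```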

In [0]:
from keras.preprocessing import image
from pathlib import Path
import numpy as np

**We'll try showing our model the following 2 images:**

This 32x32x3 truck -  ![truck](https://www.cs.toronto.edu/~kriz/cifar-10-sample/truck2.png) as `truck.png`

And this 32x32x3 bird -  ![bird](https://www.cs.toronto.edu/~kriz/cifar-10-sample/bird8.png) as `not_truck.png`

In [0]:
import urllib.request

urllib.request.urlretrieve("https://www.cs.toronto.edu/~kriz/cifar-10-sample/truck2.png", "truck.png")
urllib.request.urlretrieve("https://www.cs.toronto.edu/~kriz/cifar-10-sample/bird8.png", "not_truck.png")


![wut cifar image](https://pbs.twimg.com/media/DwCQutKW0AAUKqk.jpg =400x)

In [0]:
for f in sorted(Path(".").glob("*.png")):
    # Load an image file to test
    image_to_test = image.load_img(
        str(f),
        target_size=(32, 32)
    )

    # Convert the image data to a numpy array
    # suitable for Keras
    image_to_test = image.img_to_array(image_to_test)
    # normalize to a 0 to 1 value
    image_to_test /= 255

    # Add a fourth dimension to the image since
    # Keras expects a list of images
    list_of_images = np.expand_dims(
        image_to_test,
        axis=0
    )
    # Make a prediction using the truck model
    results = model.predict(list_of_images)

    # Since we only passed in one test image,
    # we can just check the first result directly.
    image_likelihood = results[0][0]

    # The result will be a number from 0.0 to 1.0
    # representing the likelihood that this
    # image is a truck.
    if image_likelihood > 0.5:
        print(f"{f} is a truck! ({image_likelihood:.2f})")
    else:
        print(f"{f} is NOT a truck! ({image_likelihood:.2f})")

![confusion matrix](https://www.dataschool.io/content/images/2015/01/confusion_matrix2.png)

In [0]:
from sklearn.metrics import classification_report

predictions = model.predict(x_test, batch_size=32, verbose=1)
# If the model is more than 50% sure the object is a truck, call it a truck.
# Otherwise, call it "not a truck" via boolean
predictions = predictions > 0.5

# Calculate Precision and Recall for each class
report = classification_report(y_test, predictions)
print(report)
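
The precision and recall in the report come from four counts. Here is a minimal NumPy sketch with made-up boolean arrays standing in for `y_test` and `predictions`.

```python
import numpy as np

# Made-up stand-ins for y_test and the thresholded predictions.
actual = np.array([True, True, True, False, False, False, False, False])
predicted = np.array([True, True, False, True, False, False, False, False])

tp = int(np.sum(actual & predicted))    # trucks correctly flagged
fp = int(np.sum(~actual & predicted))   # non-trucks flagged as trucks
fn = int(np.sum(actual & ~predicted))   # trucks that were missed
tn = int(np.sum(~actual & ~predicted))  # non-trucks correctly passed over

precision = tp / (tp + fp)  # of everything called a truck, how much was right
recall = tp / (tp + fn)     # of all real trucks, how many were found
print(tp, fp, fn, tn)  # 2 1 1 4
```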

F-score (F1 score) is the harmonic mean of precision and recall:

>$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$

It's basically a way for us to roll the trade-off between precision and recall into a single combo score.
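
A quick worked example shows why the harmonic mean is used: it stays low when either precision or recall is low, where a plain average would hide the problem. The numbers are made up, and `harmonic_f1` is a hypothetical helper name.

```python
def harmonic_f1(precision, recall):
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# A model that is always right when it says "truck" but finds only 10% of them.
precision, recall = 1.0, 0.1

f1 = harmonic_f1(precision, recall)
plain_average = (precision + recall) / 2

print(round(f1, 3))             # 0.182
print(round(plain_average, 3))  # 0.55
```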

In [0]:
%%bash
pip install tensorflowjs
tensorflowjs_converter --input_format keras \
                       ./scratch_truck_model.h5 \
                       ./tfjs_target_dir

In [0]:
!ls ./tfjs_target_dir

![me](https://pbs.twimg.com/media/DvFA7pNV4AAiW49.jpg =400x)