# AniRec: Animal Recognition/Classifier Model

## Group Members (in alphabetical order):
### Madison Adams

### Zinah Aljanabi

### Grant Goodman

### Sergio Gonzalez

## ML Code Written By: Madison Adams
### Sources:
* Dataset: https://www.kaggle.com/alessiocorrado99/animals10
* https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/image_classification/cats_vs_dogs.py
* https://github.com/tensorflow/examples/tree/master/lite/examples/image_classification/android
* https://www.kaggle.com/ayushimishra2809/animal-classifier
* https://www.kaggle.com/subratasarkar32/cnn-for-animals-cat-dog-human
* Used the documentation and APIs for both Keras and TensorFlow

# Code for the AniRec ML Model:

## Library and Module Imports

In [1]:
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Dropout, Conv2D, MaxPooling2D, BatchNormalization
from tensorflow.keras.models import Sequential
import numpy as np
import os 
import random
import cv2
from sklearn.model_selection import train_test_split
from tqdm import tqdm
import matplotlib.pyplot as plt

## create_data(): Function to load the animal images and build the training and validation data for the model

In [2]:
data = []
image_size = 100
#in alphabetical order
animals = {"butterfly":0, "cat":1, "chicken":2, "cow":3, "dog":4, "elephant":5, "horse":6, "sheep":7, "spider":8, "squirrel":9}
def create_data(directory):
    for animal in os.listdir(directory):
        counter = 0
        path = os.path.join(directory, animal)
        for image in tqdm(os.listdir(path)):
            image_array = cv2.imread(os.path.join(path, image))
            if image_array is None: #skip files that OpenCV cannot read
                continue
            new_image_array = cv2.resize(image_array,(image_size, image_size))
            label = animals[str(animal)]
            data.append([new_image_array, label])
            counter += 1
            if counter == 1400:
                break
            
    random.shuffle(data) #randomize order of data
    x = []
    y = []
    for animal_image, labels in data:
        x.append(animal_image)
        y.append(labels)   
    x = np.asarray(x)
    y = np.asarray(y)
    y = y.reshape(-1,1)
    #hold out 20% of the data for validation (the remaining 80% is used for training)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2, random_state = 42) 
    
    return x_train, x_test, y_train, y_test
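
The shuffle-and-split step at the end of `create_data()` can be tried on its own without the image files. The sketch below is only an illustration: it uses small random arrays as hypothetical stand-ins for the images, and a plain NumPy permutation as a stand-in for `train_test_split`, so the sizes and names are assumptions rather than the notebook's actual data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in data: 10 classes with 50 tiny 8x8 "images" each
num_classes, per_class = 10, 50
x = rng.integers(0, 256, size=(num_classes * per_class, 8, 8, 3), dtype=np.uint8)
y = np.repeat(np.arange(num_classes), per_class).reshape(-1, 1)

# Shuffle images and labels together, then hold out the last 20%
# (a NumPy stand-in for train_test_split with test_size = 0.2)
order = rng.permutation(len(x))
x, y = x[order], y[order]
split = int(len(x) * 0.8)
x_train, x_test = x[:split], x[split:]
y_train, y_test = y[:split], y[split:]

print(x_train.shape, x_test.shape)  # (400, 8, 8, 3) (100, 8, 8, 3)

# Per-class counts in the held-out set; a random split is only roughly balanced
test_counts = np.bincount(y_test.ravel(), minlength=num_classes)
print(test_counts)
```

Printing the per-class counts is a quick way to see how evenly a random split distributes each animal between the training and validation sets.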

## display_plots(): Function to plot the model's training and validation accuracy and loss

In [3]:
def display_plots(history,epoch_value):
    epochs = epoch_value
    
    acc = history.history["accuracy"]
    val_acc = history.history["val_accuracy"]

    loss = history.history["loss"]
    val_loss = history.history["val_loss"]

    epochs_range = range(epochs)

    plt.figure(figsize = (8, 8))
    plt.subplot(1, 2, 1)
    plt.plot(epochs_range, acc, label = "Training Accuracy")
    plt.plot(epochs_range, val_acc, label = "Validation Accuracy")
    plt.legend(loc = "lower right")
    plt.title("Training and Validation Accuracy")

    plt.subplot(1, 2, 2)
    plt.plot(epochs_range, loss, label = "Training Loss")
    plt.plot(epochs_range, val_loss, label = "Validation Loss")
    plt.legend(loc = "upper right")
    plt.title("Training and Validation Loss")
    
    plt.show()
    plt.close()
    
    model.summary() #uses the global model created in the final cell

## create_model(): Function to create the CNN model and add all of the necessary layers

In [4]:
def create_model(num_classes):
    model = Sequential([
        Conv2D(32, kernel_size = 3, activation = "relu", input_shape = (100,100,3)),
        
        BatchNormalization(),
        Conv2D(32, kernel_size = 3, activation = "relu"),
        BatchNormalization(),
        Conv2D(32, kernel_size = 5, strides = 2, padding = "same", activation = "relu"),
        BatchNormalization(),
        Dropout(0.4),
        Conv2D(64, kernel_size = 5, strides = 2, padding = "same", activation = "relu"),
        BatchNormalization(),
        Dropout(0.4),
        Conv2D(256, kernel_size = 4, activation = "relu"),
        BatchNormalization(),
        Flatten(),
        Dropout(0.4),
        Dense(num_classes, activation = "softmax")
    ])
    
    model.compile(optimizer = "adam", loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])

    return model
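
To see how the 100x100 input shrinks as it passes through the convolution layers in `create_model()`, the spatial sizes can be worked out by hand. The sketch below uses the standard output-size rules for "valid" and "same" padding; the helper name `conv_out` is hypothetical and is not part of Keras.

```python
import math

def conv_out(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a Conv2D layer along one dimension."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1

# (kernel_size, strides, padding, filters) for each Conv2D in create_model()
layers = [
    (3, 1, "valid", 32),
    (3, 1, "valid", 32),
    (5, 2, "same", 32),
    (5, 2, "same", 64),
    (4, 1, "valid", 256),
]

size = 100  # input images are 100x100x3
sizes = []
for kernel, stride, padding, filters in layers:
    size = conv_out(size, kernel, stride, padding)
    sizes.append(size)
    print(f"Conv2D({filters}, kernel={kernel}, stride={stride}, {padding}) -> {size}x{size}x{filters}")

# Flatten turns the last feature map into one vector per image
flat_features = size * size * layers[-1][3]
# Dense(10): one weight per (feature, class) pair plus one bias per class
dense_params = flat_features * 10 + 10
print(flat_features, dense_params)  # 112896 1128970
```

BatchNormalization and Dropout do not change the tensor shape, so they are left out of the calculation. Most of the trainable parameters end up in the final Dense layer because the flattened feature vector is large.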

## prediction_vs_actual(): Function to display the actual animal label next to the label the model predicts, for a few sample images from the test set

In [5]:
def prediction_vs_actual(x_test, y_test, model):
    im_list = [100,250,989,505,600,312,832,1400]
    for i in im_list:
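        img = x_test[i]
        plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)) #OpenCV loads images as BGR; matplotlib expects RGB
        plt.show()
        
        #predict_classes() was removed in TensorFlow 2.6; take the argmax of the softmax output instead
        prediction = np.argmax(model.predict(img.reshape(-1,image_size,image_size,3)), axis = -1)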
        actual =  y_test[i]
        print("actual: ", actual[0])
        print("prediction: ", prediction[0])

## Call the necessary functions in order to create the model:

In [6]:
print("To ensure balanced training and validation, only about 1400 images of each animal will be used")

num_classes = 10
epochs = 60
batch_size = 140

#If on Jupyter Lab, use this (uncomment both lines):
#directory = os.getcwd()
#directory = os.path.join(directory, "animals10", "raw-img", "")

#If on Kaggle, use one of the two "directory =" statements below (uncomment the one that matches your dataset):
#directory = "../input/translated-animals10/animals10/raw-img/"
directory = "../input/animals10/raw-img/"
#If neither of the two "directory =" statements above works, you will have to configure the path for the dataset yourself
#Uncomment the line below when trying to configure the path on Kaggle
#print(directory)

x_train, x_test, y_train, y_test = create_data(directory)
model = create_model(num_classes)

history = model.fit(x_train, y_train, epochs = epochs, batch_size = batch_size, validation_data = (x_test, y_test))
display_plots(history,epochs)
prediction_vs_actual(x_test, y_test, model)

answer = input("Do you want to save the model? (type y for yes or n for no): ")
if answer == "y":
    #SavedModel format
    model.save("animal_model") #in TensorFlow 2.x this creates a folder named "animal_model" containing saved_model.pb
    
    #H5/HDF5 model
    model.save("final_animal_model_1.h5") #will be saved as "final_animal_model_1.h5"
    
    print("SavedModel folder and .h5 file were created")

100%|██████████| 1668/1668 [00:38<00:00, 42.81it/s]
100%|██████████| 2112/2112 [00:24<00:00, 85.67it/s] 
100%|██████████| 4863/4863 [00:45<00:00, 106.96it/s]
100%|██████████| 1820/1820 [00:21<00:00, 84.36it/s] 
100%|██████████| 4821/4821 [00:46<00:00, 102.94it/s]
100%|██████████| 3098/3098 [00:33<00:00, 93.60it/s] 
100%|██████████| 2623/2623 [00:26<00:00, 99.14it/s] 
100%|██████████| 1862/1862 [00:16<00:00, 114.79it/s]
100%|██████████| 1866/1866 [00:23<00:00, 80.30it/s] 
100%|██████████| 1446/1446 [00:20<00:00, 72.26it/s]


Train on 20943 samples, validate on 5236 samples

KeyboardInterrupt: 