## Plant Classification MVP

This project uses neural networks and transfer learning to classify leaf images by plant type, across 11 classes.

This notebook trains the final model on the combined training and validation data and reports its final metrics. I also run the model on a few of my own leaf images to check how it handles individual examples.

In [9]:
# display and plotting imports
%pylab inline
import seaborn as sns
sns.set()
from IPython.display import SVG

import os
import numpy as np
import pandas as pd

# sklearn imports
from sklearn.metrics import accuracy_score

# keras imports
from keras.models import Model, Sequential
from keras.layers import Dense, Activation, Dropout
from keras.preprocessing.image import ImageDataGenerator

# transfer learning model: VGG16 convolutional base pretrained on ImageNet,
# loaded without its fully connected classifier head
from tensorflow import keras
from keras.applications.vgg16 import VGG16

conv_base = VGG16(weights='imagenet',
                  include_top=False)

Populating the interactive namespace from numpy and matplotlib


## Pulling in and processing image data (train and test)

In [10]:
base_dir = '/Users/mehikapatel/Plant_NN_Project/data'

train_dir = os.path.join(base_dir, 'train')
test_dir = os.path.join(base_dir, 'test')

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20

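def extract_features(directory, sample_count):
    # Run every image through the pretrained VGG16 base and collect its output.
    # A 150x150 input yields a 4x4x512 feature map; labels are one-hot over 11 classes.
    features = np.zeros(shape=(sample_count, 4, 4, 512))
    labels = np.zeros(shape=(sample_count, 11))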
    generator = datagen.flow_from_directory(
        directory,
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode='categorical')
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        features[i * batch_size : (i + 1) * batch_size] = features_batch
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            # Note that since generators yield data indefinitely in a loop,
            # we must `break` after every image has been seen once.
            break
    return features, labels
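
The `(sample_count, 4, 4, 512)` shape used above follows from the VGG16 architecture: with `include_top=False`, the base applies five 2x2 max-pooling stages with stride 2, and its last convolutional block has 512 channels. The sketch below reproduces that arithmetic in plain Python. The helper name `vgg16_feature_shape` is made up for illustration and is not part of Keras.

```python
def vgg16_feature_shape(height, width, n_pools=5, channels=512):
    """Feature-map shape after VGG16's five 2x2, stride-2 max-pool stages.

    Hypothetical helper for illustration. VGG16's convolutions use 'same'
    padding, so only the pooling layers shrink the feature map, and each
    one floors the size to half.
    """
    for _ in range(n_pools):
        height //= 2
        width //= 2
    return (height, width, channels)


shape = vgg16_feature_shape(150, 150)
flat_length = shape[0] * shape[1] * shape[2]
print(shape, flat_length)  # (4, 4, 512) 8192
```

The flattened length of 8192 is the input size the dense classifier receives later in the notebook.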

In [11]:
train_features, train_labels = extract_features(train_dir, 1929)
test_features, test_labels = extract_features(test_dir, 344) 

Found 1929 images belonging to 11 classes.
Found 344 images belonging to 11 classes.
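
Neither 1929 nor 344 is a multiple of the batch size of 20, so the last batch from each generator is smaller than the rest. The loop in `extract_features` still works because NumPy clips a slice that runs past the end of an array, so the short final batch lands in exactly the rows that remain. A quick check of the batch arithmetic in plain Python follows; the helper `batch_plan` is a made-up name for illustration.

```python
import math


def batch_plan(sample_count, batch_size=20):
    """Return (number of batches, size of the final batch)."""
    n_batches = math.ceil(sample_count / batch_size)
    last_size = sample_count - (n_batches - 1) * batch_size
    return n_batches, last_size


for name, count in [("train", 1929), ("test", 344)]:
    n_batches, last_size = batch_plan(count)
    print(f"{name}: {n_batches} batches, last batch has {last_size} images")
```

The `break` condition `i * batch_size >= sample_count` fires right after that final short batch, which is what stops the otherwise endless generator.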


In [54]:
# flatten each 4x4x512 feature map into a vector of length 8192:
train_features = np.reshape(train_features, (1929, 4 * 4 * 512))
test_features = np.reshape(test_features, (344, 4 * 4 * 512))

## Compile and fit the model!

In [55]:
train_features.shape[1:]

(8192,)

In [16]:
final = keras.Sequential([
    keras.layers.InputLayer(input_shape=train_features.shape[1:]),
    keras.layers.Dense(units=150, activation="relu"),
    keras.layers.Dense(units=125, activation="relu"),
    keras.layers.Dense(units=100, activation="relu"),
    keras.layers.Dense(units=75, activation="relu"),
    keras.layers.Dense(units=50, activation="relu"),
    keras.layers.Dense(units=25, activation="relu"),
    keras.layers.Dense(units=11, activation= "softmax"),
])

final.compile("nadam", loss="categorical_crossentropy", metrics=["acc"])

final.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_7 (Dense)              (None, 150)               1228950   
_________________________________________________________________
dense_8 (Dense)              (None, 125)               18875     
_________________________________________________________________
dense_9 (Dense)              (None, 100)               12600     
_________________________________________________________________
dense_10 (Dense)             (None, 75)                7575      
_________________________________________________________________
dense_11 (Dense)             (None, 50)                3800      
_________________________________________________________________
dense_12 (Dense)             (None, 25)                1275      
_________________________________________________________________
dense_13 (Dense)             (None, 11)                286       
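
The parameter counts in the summary can be checked by hand: a `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. A small sketch of that check, using the layer sizes from the model definition above:

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


# Flattened input length first, then the units of each Dense layer in order.
layer_sizes = [8192, 150, 125, 100, 75, 50, 25, 11]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)

print(counts)
print(total)
```

Over 96% of the trainable parameters sit in the first layer, because it connects all 8192 flattened VGG16 features to 150 units.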

In [17]:
final.fit(train_features, train_labels, epochs=50)

Epoch 1/50
...
Epoch 50/50


<tensorflow.python.keras.callbacks.History at 0x7faac0fc8d30>

### Final Test Accuracy Score

In [19]:
test_loss, test_acc = final.evaluate(test_features, test_labels)
print(f'\n\nTest Accuracy: {test_acc}')
print(f'\n\nTest Loss: {test_loss}')



Test Accuracy: 0.9331395626068115


Test Loss: 0.24856244027614594
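
With 344 test images, the reported accuracy corresponds to a whole number of correct predictions. The sketch below recovers that count from the printed value. It assumes `evaluate` scored all 344 images exactly once.

```python
test_acc = 0.9331395626068115  # value printed above
n_test = 344                   # test images found by the generator

# Accuracy is correct / total, so multiplying back gives the correct count.
n_correct = round(test_acc * n_test)
n_wrong = n_test - n_correct

print(n_correct, n_wrong)
```

The small gap between `n_correct / n_test` and the printed accuracy comes from Keras reporting the metric in 32-bit floating point.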


In [56]:
# Save our model to use in streamlit!
# serialize the model architecture to JSON
model_json = final.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
# serialize weights to HDF5
final.save_weights("model.h5")
print("Saved model to disk")

Saved model to disk


## Test some holdout images!

In [31]:
base_dir = '/Users/mehikapatel/Plant_NN_Project/data'

holdout_dir = os.path.join(base_dir, 'holdout')


datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20

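# Same as extract_features above, except the labels array is 4 wide
# because the holdout directory contains only 4 class folders.
def extract_features(directory, sample_count):
    features = np.zeros(shape=(sample_count, 4, 4, 512))
    labels = np.zeros(shape=(sample_count, 4))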
    generator = datagen.flow_from_directory(
        directory,
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode='categorical')
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        features[i * batch_size : (i + 1) * batch_size] = features_batch
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            # Note that since generators yield data indefinitely in a loop,
            # we must `break` after every image has been seen once.
            break
    return features, labels

In [32]:
plantx, planty = extract_features(holdout_dir, 4)

Found 4 images belonging to 4 classes.


In [34]:
#flatten images:
plantx = np.reshape(plantx, (4, 4 * 4 * 512))

In [45]:
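# argmax over the softmax outputs gives the predicted class index
# (equivalent to the older Sequential.predict_classes, removed in TensorFlow 2.6)
predictions = np.argmax(final.predict(plantx), axis=1)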

In [1]:
predictions[1]

NameError: name 'predictions' is not defined

The holdout images were, in order, Pomegranate, Mango, Lemon, and Basil.

The model classified them as Pomegranate, Lemon, Lemon, and Guava, so it got 2 of the 4 right (Pomegranate and Lemon).
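
The `predictions` array holds integer class indices, so turning them into plant names requires the index mapping of the generator that produced the training labels (`generator.class_indices` in Keras, which orders the class folders alphanumerically). The holdout generator builds its own 4-class mapping, which does not line up with the 11-class training mapping. A minimal sketch of the lookup follows; the `class_indices` dictionary in it is a made-up subset for illustration, not the project's real mapping.

```python
# Hypothetical subset of a training generator's class_indices (name -> index).
# The real project has 11 classes; these indices are illustrative only.
class_indices = {"Basil": 0, "Guava": 1, "Lemon": 2, "Mango": 3, "Pomegranate": 4}

# Invert the mapping so a predicted index can be looked up by number.
index_to_name = {index: name for name, index in class_indices.items()}


def decode(predicted_indices):
    """Translate predicted class indices into class names."""
    return [index_to_name[int(i)] for i in predicted_indices]


print(decode([4, 2, 2, 1]))  # ['Pomegranate', 'Lemon', 'Lemon', 'Guava']
```

Because `flow_from_directory` shuffles by default (`shuffle=True`), the row order of `plantx` may not match the folder order, which is worth checking before comparing predictions against the input images.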