Code Tutorial followed as the basis for this project:

https://www.geeksforgeeks.org/python-image-classification-using-keras/


Modifications made:
1. Changed from a binary to a categorical model
2. Used flowers instead of cars and planes as the images to predict

Flower Dataset found at:

https://www.kaggle.com/alxmamaev/flowers-recognition/data#


The inspiration for this project was a feature in a video game that I really enjoy playing, The Legend of Zelda: Breath of the Wild. It's an open-world game where you fight monsters and collect weapons, food, and plants. A camera feature within the game lets you take a picture of any object you can see, and the game places a label above anything you can pick up within the viewport of the virtual camera. The game obviously knows exactly what object you're looking at because it built the in-game world, but in the real world we do not have this luxury. So the big idea is to make a system that can tell you what plant your phone camera is currently viewing.

This ML algorithm that attempts to identify plants would be the central component of a plant recognition app that helps plant biologists and environmental researchers catalog local plant populations and look out for invasive species more quickly, efficiently, and accurately. The end goal, if the project were developed past the extent required in this course, would be a native Android/iOS app that lets people view the real world through their phone's camera, with the application labeling the plants and flowers most prominent in the viewport. I would hope that this would give users a much greater appreciation for the nature that constantly surrounds them. Instead of walking past "trees", they could point out that they've walked past some Red Pines, White Pines, and Great Oaks. Provided that images of immature flowers are added to the dataset, the application could even tell which flowers will bud from plants that have yet to display them.

The technology stack for this machine learning project is the Keras library in Python, which is built on top of the ever-popular TensorFlow. Keras provides a more abstract view of the TensorFlow processes. The network architecture is a Convolutional Neural Network that takes an input shape of:

(StandardImageWidth, StandardImageHeight, NumberOfChannels)

where StandardImageWidth and StandardImageHeight are the dimensions that all images are resized to when entered into the CNN (224 by 224 in the code below) and NumberOfChannels is 3 for RGB color images (the channel count comes first instead if Keras is configured for the channels_first data format). The number of plants with available images in the dataset sets the size of the output layer rather than the input. The Convolutional Neural Network was chosen because it is well suited to image analysis problems such as handwritten digit recognition. Plant recognition should be similar, except that some photos may not be close-ups and may be taken from different angles or flipped.
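
To make the shapes concrete, the sketch below traces how the side length of an image shrinks through three convolution and pooling stages. It assumes the settings used in the code later in this document (224 by 224 input, 2 by 2 kernels with no padding and a stride of 1, 2 by 2 pooling, 64 filters in the last convolution). The helper names conv_out, pool_out, and trace_sizes are made up for illustration and are not part of the project code.

```python
def conv_out(size, kernel=2):
    """Output size of an unpadded, stride-1 convolution along one dimension."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling (leftover pixels are dropped)."""
    return size // pool


def trace_sizes(size, stages=3):
    """Return the side length after each convolution + pooling stage."""
    sizes = []
    for _ in range(stages):
        size = pool_out(conv_out(size))
        sizes.append(size)
    return sizes


sizes = trace_sizes(224)
flattened = sizes[-1] * sizes[-1] * 64  # 64 filters in the last convolution
print(sizes, flattened)
```

Under these assumptions the side length goes 111, 55, 27, so the Flatten layer would receive 27 x 27 x 64 = 46656 values per image.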

The implementation of this network has 3 layer sets, each a 2-dimensional convolution with ReLU activation followed by a 2-dimensional max pooling layer. The output layer has 5 nodes whose activations signal how likely it is that the viewed picture shows the corresponding plant or flower. In the current implementation, the first node is for the daisy, followed by dandelion, rose, sunflower, and lastly tulip. An example output for a new image would then be:

[.24, .15, .6, .09, .5]

This would signal that the most likely flower shown in the new image is a rose, but it's a close call between the most likely, a rose, and the second most likely, a tulip. One limitation of this network is that as the number of categories (different plants/flowers) grows, the accuracy degrades as the number of choices overwhelms the network. One item of research that could be done on this limitation is the effect that increasing the number of training images has on the accuracy of the network, to see how well it can keep up with the demands of a growing number of categories. One issue with that, though, is the increase in training time that would result from adding any significant number of training images to the dataset. The training time is already more than 30 minutes in the current implementation, using 10 epochs to reach an accuracy of .821, which seems like an acceptable accuracy for the time taken to train. An argument could also be made that the algorithm doesn't necessarily need to be razor-sharp accurate, as the correct plant or flower should be within the top 3 if the network is trained correctly. It just must be clear to the user that the algorithm is not infallible, and that if the activation numbers are close, the top choices should be compared more closely to determine the correct identification. Even so, this vastly cuts down the sheer number of possibilities and greatly decreases the amount of time a layperson would take to identify a flower or plant.
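
A minimal sketch of reading such an output vector: it picks the highest-scoring class and flags a close call when the runner-up is within a margin. The class order follows the one given above. The margin of 0.15 and the function name interpret are assumptions chosen for illustration, not part of the project code.

```python
CLASS_NAMES = ["daisy", "dandelion", "rose", "sunflower", "tulip"]


def interpret(scores, margin=0.15):
    """Return (best_label, runner_up_label, is_close_call) for one output vector."""
    # Sort (score, label) pairs so the highest score comes first.
    ranked = sorted(zip(scores, CLASS_NAMES), reverse=True)
    (best_score, best), (second_score, second) = ranked[0], ranked[1]
    return best, second, (best_score - second_score) < margin


print(interpret([.24, .15, .6, .09, .5]))
```

For the example vector this reports rose as the best guess, tulip as the runner-up, and marks the result as a close call because the two scores differ by only about 0.1.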

If this algorithm were to be set inside of a native application, then ideally the app would list the "top 3 best guesses" the program has.
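
One way such a "top 3" list could be produced from the network output is sketched below. The function name top_guesses is hypothetical, and the scores are the example values used earlier in this document.

```python
CLASS_NAMES = ["daisy", "dandelion", "rose", "sunflower", "tulip"]


def top_guesses(scores, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    pairs = zip(CLASS_NAMES, scores)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:k]


for label, score in top_guesses([.24, .15, .6, .09, .5]):
    print(f"{label}: {score:.2f}")
```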

Now to move on to discuss the implications of this project. The most direct and desired implication is that people who use this algorithm, or an app built around it, could quickly and easily learn what flower they are looking at. With more research and more images, this algorithm could possibly reach the level of accuracy of plant biologists, but that would require a much larger dataset, and plant biologists are generally very quick. However, this method could possibly save them the time it takes to walk around and bend down to take a closer look at flowers or plants. Another implication is that perhaps plant biologists would not appreciate an application like this because they enjoy cataloging plants without technology and just love doing the identification themselves. I got this view from one of my fellow graduating seniors, Jonathan Walt. As discussed in the vision portion, a good implication could be that users of the application or algorithm would gain a greater appreciation for plants and flowers and their great number of varieties. Now on to some possible negative implications of this project.



Now to move on towards the actual algorithm and system. The code is below.

The program takes a zip file as input. To use the algorithm properly, the zip file must have a certain structure. At the root level it must have two folders named "train" and "test". Each of these folders must have the same number of subfolders, one per possible classification. The number of classifications must match the size of the output layer (5 in the code below).
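
The layout described above could be checked before training with a sketch like the following. It uses only the standard zipfile module and builds a tiny archive in memory for demonstration. The function names class_folders and check_layout are hypothetical. The check is also slightly stricter than "equal number": it assumes both folders should contain the same class names, so that a class means the same thing in both splits.

```python
import io
import zipfile


def class_folders(zip_file, split):
    """Collect the class subfolder names found under `split/` in the archive."""
    classes = set()
    for name in zip_file.namelist():
        parts = name.split("/")
        # A class folder shows up as split/<class>/<file name or empty string>.
        if len(parts) >= 3 and parts[0] == split and parts[1]:
            classes.add(parts[1])
    return classes


def check_layout(zip_file):
    """Return the sorted class names, or raise ValueError if the layout is off."""
    train = class_folders(zip_file, "train")
    test = class_folders(zip_file, "test")
    if not train or not test:
        raise ValueError("archive needs 'train' and 'test' folders with class subfolders")
    if train != test:
        raise ValueError(f"class folders differ: {sorted(train ^ test)}")
    return sorted(train)


# Build a tiny archive in memory to demonstrate the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    for split in ("train", "test"):
        for flower in ("daisy", "rose"):
            demo.writestr(f"{split}/{flower}/example.jpg", b"")

with zipfile.ZipFile(buffer) as demo:
    print(check_layout(demo))
```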

In [1]:
# Importing all necessary libraries 
from keras.preprocessing.image import ImageDataGenerator 
from keras.models import Sequential 
from keras.layers import Conv2D, MaxPooling2D 
from keras.layers import Activation, Dropout, Flatten, Dense 
from keras import backend as K 


Using TensorFlow backend.


In [0]:
import zipfile

# Illustrative output for a dandelion image, in the class order
# daisy, dandelion, rose, sunflower, tulip. Not used by the code below.
resultForDandelion = [.25, .60, .05, .15, .24]

# Extract the flower images into the "dataset" folder.
with zipfile.ZipFile("flowers.zip", 'r') as zip_ref:
    zip_ref.extractall("dataset")

In [0]:

img_width, img_height = 224, 224


train_data_dir = 'dataset/train'
validation_data_dir = 'dataset/test'


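nb_train_samples = 2163       # images found under dataset/train (first "Found" line in the output)
nb_validation_samples = 2160  # images found under dataset/test (second "Found" line)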
epochs = 5
batch_size = 16


if K.image_data_format() == 'channels_first': 
	input_shape = (3, img_width, img_height) 
else: 
	input_shape = (img_width, img_height, 3) 

model = Sequential() 
model.add(Conv2D(32, (2, 2), input_shape=input_shape)) 
model.add(Activation('relu')) 
model.add(MaxPooling2D(pool_size=(2, 2))) 

model.add(Conv2D(32, (2, 2))) 
model.add(Activation('relu')) 
model.add(MaxPooling2D(pool_size=(2, 2))) 

model.add(Conv2D(64, (2, 2))) 
model.add(Activation('relu')) 
model.add(MaxPooling2D(pool_size=(2, 2))) 

model.add(Flatten()) 
model.add(Dense(64)) 
model.add(Activation('relu')) 
model.add(Dropout(0.5)) 
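model.add(Dense(5))  # one output node per flower class
# Sigmoid gives independent per-class scores that need not sum to 1;
# softmax is the more usual pairing with categorical_crossentropy.
model.add(Activation('sigmoid'))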

model.compile(loss='categorical_crossentropy', 
			optimizer='rmsprop', 
			metrics=['accuracy']) 


train_datagen = ImageDataGenerator( 
	rescale=1. / 255, 
	shear_range=0.2, 
	zoom_range=0.2, 
	horizontal_flip=True) 

test_datagen = ImageDataGenerator(rescale=1. / 255) 

train_generator = train_datagen.flow_from_directory( 
	train_data_dir, 
	target_size=(img_width, img_height), 
	batch_size=batch_size, 
	class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
	validation_data_dir,
	target_size=(img_width, img_height),
	batch_size=batch_size,
	class_mode='categorical',
	shuffle=False)  # keep file order so predictions line up with .filenames

model.fit_generator( 
	train_generator, 
	steps_per_epoch=nb_train_samples // batch_size, 
	epochs=epochs, 
	validation_data=validation_generator, 
	validation_steps=nb_validation_samples //batch_size) 

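filenames = validation_generator.filenames
nb_samples = len(filenames)

# steps counts batches, not images, so round up to cover every image once
validation_generator.reset()  # start again from the first batch
predict = model.predict_generator(
	validation_generator,
	steps=(nb_samples + batch_size - 1) // batch_size)
print(predict)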

Found 2163 images belonging to 5 classes.
Found 2160 images belonging to 5 classes.
Epoch 1/5
Epoch 2/5

In [0]:
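filenames = validation_generator.filenames
nb_samples = len(filenames)

# steps counts batches, not images, so round up to cover every image once
validation_generator.reset()  # start again from the first batch
predict = model.predict_generator(
    validation_generator,
    steps=(nb_samples + batch_size - 1) // batch_size)
print(predict)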
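
The printed array has one row per image and one column per class. A sketch of turning those rows into readable labels follows. It uses hand-written stand-ins for validation_generator.class_indices, validation_generator.filenames, and the prediction rows so that it runs without a trained model; the file names and scores are made up. It also assumes the predictions come back in the same order as the file names, which holds only when the generator is created with shuffle=False.

```python
# Stand-ins for validation_generator.class_indices, .filenames, and `predict`.
class_indices = {"daisy": 0, "dandelion": 1, "rose": 2, "sunflower": 3, "tulip": 4}
filenames = ["daisy/001.jpg", "rose/002.jpg"]  # made-up file names
predictions = [
    [.70, .10, .05, .10, .05],  # made-up scores
    [.24, .15, .60, .09, .50],
]

# Invert the mapping so an output index can be turned back into a class name.
index_to_label = {index: label for label, index in class_indices.items()}


def label_rows(filenames, predictions):
    """Pair each file name with the label of its highest-scoring class."""
    labeled = []
    for filename, row in zip(filenames, predictions):
        best_index = max(range(len(row)), key=lambda i: row[i])
        labeled.append((filename, index_to_label[best_index]))
    return labeled


for filename, label in label_rows(filenames, predictions):
    print(filename, "->", label)
```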