# Using image classification to label plant species
First, we import the required packages and set some plot style preferences.

In [1]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import cv2

In [2]:
plt.rcParams['figure.figsize'] = (21.0, 13.0)
plt.rcParams['font.size'] = 18.0
sns.set_style('darkgrid')
sns.set_palette('pastel')

As I have access to a GPU on my PC I will quickly check if TensorFlow is able to find it.

In [3]:
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

Found GPU at: /device:GPU:0


That all checks out. Now we can pre-process the labeled image data downloaded from the [Kaggle competition page](https://www.kaggle.com/c/plant-seedlings-classification/overview). First, we need to check how many different plant species labels there are, as these define the set of classes that our model can assign. In $\texttt{train.zip}$, the images of each plant species sit in a subdirectory named after that species, so listing the subdirectories gives every classification label, which we save as a Python list.

In [4]:
# os.listdir returns entries in arbitrary order, so sort for a reproducible label order
species_list = sorted(os.listdir('./train'))
species_list

['Black-grass',
 'Charlock',
 'Cleavers',
 'Common Chickweed',
 'Common wheat',
 'Fat Hen',
 'Loose Silky-bent',
 'Maize',
 'Scentless Mayweed',
 'Shepherds Purse',
 'Small-flowered Cranesbill',
 'Sugar beet']

Next we build a single data frame, without any pixel data yet, that holds the plant species label in the first column, the numerical plant species label in the second column and the image file location in the third column. This saves us from repeatedly going into each of the above subdirectories later on. After some research, I found the following to be the most straightforward way to iterate through each file in each subdirectory while keeping track of the subdirectory label. Note that we also set up the file name of the resized image, which will be created in the next couple of steps.

In [5]:
full_data = list()
for numerical_species, species in enumerate(species_list):
    for file_name in os.listdir(os.path.join('./train', species)):
        full_data.append([species, numerical_species, './train/{}/{}'.format(species, file_name), './pre_processed_train/{}/{}'.format(species, file_name)])
full_data = pd.DataFrame(full_data, columns=['species', 'numerical_species', 'file_name', 'pre_processed_file_name'])
full_data.head()

|   | species | numerical_species | file_name | pre_processed_file_name |
|---|---|---|---|---|
| 0 | Black-grass | 0 | ./train/Black-grass/0050f38b3.png | ./pre_processed_train/Black-grass/0050f38b3.png |
| 1 | Black-grass | 0 | ./train/Black-grass/0183fdf68.png | ./pre_processed_train/Black-grass/0183fdf68.png |
| 2 | Black-grass | 0 | ./train/Black-grass/0260cffa8.png | ./pre_processed_train/Black-grass/0260cffa8.png |
| 3 | Black-grass | 0 | ./train/Black-grass/05eedce4d.png | ./pre_processed_train/Black-grass/05eedce4d.png |
| 4 | Black-grass | 0 | ./train/Black-grass/075d004bc.png | ./pre_processed_train/Black-grass/075d004bc.png |


We can observe the total image count.

In [6]:
full_data.shape

(4750, 4)

We can also count the images available for each species.

In [7]:
full_data['species'].value_counts()

Loose Silky-bent             654
Common Chickweed             611
Scentless Mayweed            516
Small-flowered Cranesbill    496
Fat Hen                      475
Charlock                     390
Sugar beet                   385
Cleavers                     287
Black-grass                  263
Shepherds Purse              231
Maize                        221
Common wheat                 221
Name: species, dtype: int64

When defining the training/testing split, it is important to note that two of the plant species have only 221 images each. We need a good number of images of every species in the training data for the model to predict all classes well. Here we do this by choosing a random seed whose split keeps a reasonable share of each species in the training set.

Next we can consider the quality of the provided images. After a quick look through some of the photos, I noticed that the images vary in size quite a bit. To rectify this we can set a base image size and resize all images to it. A histogram of the image heights and widths would show the full spread, but to choose the base size we only need the smallest height and width. To find them we add two more columns to our data frame: the image height and the image width.

In [8]:
shape_list = [ cv2.imread(file).shape for file in full_data['file_name'] ]
heights = [ x[0] for x in shape_list ]
widths = [ x[1] for x in shape_list ]
full_data['image_height'] = heights
full_data['image_width'] = widths

In [9]:
full_data.head()

|   | species | numerical_species | file_name | pre_processed_file_name | image_height | image_width |
|---|---|---|---|---|---|---|
| 0 | Black-grass | 0 | ./train/Black-grass/0050f38b3.png | ./pre_processed_train/Black-grass/0050f38b3.png | 196 | 196 |
| 1 | Black-grass | 0 | ./train/Black-grass/0183fdf68.png | ./pre_processed_train/Black-grass/0183fdf68.png | 388 | 388 |
| 2 | Black-grass | 0 | ./train/Black-grass/0260cffa8.png | ./pre_processed_train/Black-grass/0260cffa8.png | 886 | 886 |
| 3 | Black-grass | 0 | ./train/Black-grass/05eedce4d.png | ./pre_processed_train/Black-grass/05eedce4d.png | 117 | 117 |
| 4 | Black-grass | 0 | ./train/Black-grass/075d004bc.png | ./pre_processed_train/Black-grass/075d004bc.png | 471 | 471 |


In [10]:
full_data['image_height'].min()

49

In [11]:
full_data['image_width'].min()

49

Shrinking an image loses far less than enlarging one, since enlarging requires interpolating new pixel data. The smallest image is 49-by-49 pixels, so we select a base image size of 48-by-48 pixels, which ensures that no image is upscaled.
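As a quick sanity check on that choice, the sketch below tests whether a candidate base size would force any image to be upscaled. The helper name `images_needing_upscale` is hypothetical, and the shapes are only the five sample rows shown earlier plus the 49-by-49 minimum; in the notebook the full list could be taken from `shape_list` (using the first two entries of each shape).

```python
def images_needing_upscale(shapes, target):
    """Return the (height, width) pairs smaller than target in either dimension."""
    return [(h, w) for h, w in shapes if h < target or w < target]

# Illustrative (height, width) pairs: five sample rows plus the smallest image.
sample_shapes = [(196, 196), (388, 388), (886, 886), (117, 117), (471, 471), (49, 49)]

too_small_for_48 = images_needing_upscale(sample_shapes, 48)
too_small_for_64 = images_needing_upscale(sample_shapes, 64)
print(too_small_for_48)
print(too_small_for_64)
```

Any base size above 49 would list at least the smallest image, which is the reason for stopping at 48.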

Next, before we resize the images, we can apply a neat computer vision trick. All of these plants are distinctly green and, more importantly, all of them are photographed from the top down against their rocky garden bed. This means that every image shows a green plant against a non-green background, so we can remove the background and keep only the plant. One way to do so is to apply a mask that discards all non-green pixels. This is done before resizing so that each resized image keeps a well-defined set of green pixels rather than being mostly black.

Applying this green mask also solves another important issue: it reduces what the images of a species share apart from the plant itself. We can assume that plants of the same species were photographed close to each other, i.e. in the same garden bed. Removing the background therefore stops our (yet to be constructed) neural network from learning the garden bed of each species instead of the plant. This should make the model better at identifying the actual plant rather than the plant together with its surroundings. We also apply a Gaussian blur to de-noise the image.

In [12]:
# Create the output directories (exist_ok lets the cell be re-run safely)
for species in species_list:
    os.makedirs('./pre_processed_train/{}'.format(species), exist_ok=True)
save_example = True
for file, pre_processed_file in zip(full_data['file_name'], full_data['pre_processed_file_name']):
    image = cv2.imread(file)
    # Keep only the pixels whose HSV values fall inside the green range
    greenmask = cv2.inRange(cv2.cvtColor(image, cv2.COLOR_BGR2HSV), (28, 25, 25), (92, 255, 255))
    green_image = np.zeros_like(image)
    green_image[greenmask > 0] = image[greenmask > 0]
    # Save one example image pair for the report
    if save_example:
        cv2.imwrite('example_image.png', image)
        cv2.imwrite('green_image.png', green_image)
        save_example = False
    # Downscale to 48x48, then blur lightly to reduce noise
    pre_processed_image = cv2.resize(green_image, (48, 48), interpolation=cv2.INTER_AREA)
    pre_processed_image = cv2.GaussianBlur(pre_processed_image, (5, 5), 0)
    cv2.imwrite(pre_processed_file, pre_processed_image)
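
To make the thresholds passed to `cv2.inRange` more concrete, here is a small standard-library sketch of the same range test on single pixels. It assumes OpenCV's 8-bit HSV convention (hue stored as degrees divided by two, saturation and value scaled to 0-255) and only approximates OpenCV's own rounding. The helper names and the two example colours are hypothetical.

```python
import colorsys

# Bounds copied from the cv2.inRange call, as (H, S, V) in OpenCV's 8-bit convention.
LOWER = (28, 25, 25)
UPPER = (92, 255, 255)

def to_opencv_hsv(r, g, b):
    """Convert an 8-bit RGB pixel to approximate OpenCV-style 8-bit HSV."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 180) % 180, round(s * 255), round(v * 255)

def is_green(r, g, b):
    """True if the pixel falls inside the green range used by the mask."""
    hsv = to_opencv_hsv(r, g, b)
    return all(low <= x <= high for low, x, high in zip(LOWER, hsv, UPPER))

leaf = (0, 200, 0)     # assumed example of a leaf-coloured pixel (RGB)
soil = (120, 90, 60)   # assumed example of a brownish background pixel (RGB)
print(to_opencv_hsv(*leaf), is_green(*leaf))
print(to_opencv_hsv(*soil), is_green(*soil))
```

In degrees, the hue bounds 28 and 92 correspond to roughly 56° to 184°, which spans yellow-green through green to cyan; the lower bounds of 25 on saturation and value exclude very grey and very dark pixels.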

Next we set up the convolutional neural network, built layer by layer as a Keras `Sequential` model, that will be our predictive model for this multi-class classification problem.

In [13]:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, BatchNormalization, Flatten
tf.logging.set_verbosity(tf.logging.ERROR)
convolutional_neural_network = Sequential()
# First layer
convolutional_neural_network.add(Conv2D(32, kernel_size = (9, 9), padding = 'same', activation = 'relu', input_shape = (48, 48, 3)))
convolutional_neural_network.add(Conv2D(64, kernel_size = (9, 9), padding = 'same', activation = 'relu'))
convolutional_neural_network.add(MaxPooling2D(pool_size = (3, 3), strides = (2, 2)))
convolutional_neural_network.add(BatchNormalization())
convolutional_neural_network.add(Dropout(0.2))

# Second layer
convolutional_neural_network.add(Conv2D(128, kernel_size = (3, 3), padding = 'same', activation = 'relu'))
convolutional_neural_network.add(Conv2D(256, kernel_size = (3, 3), padding = 'same', activation = 'relu'))
convolutional_neural_network.add(MaxPooling2D(pool_size = (3, 3), strides = (2, 2)))
convolutional_neural_network.add(BatchNormalization())
convolutional_neural_network.add(Dropout(0.2))

# Third layer
convolutional_neural_network.add(Conv2D(512, kernel_size = (3, 3), padding = 'same', activation = 'relu'))
convolutional_neural_network.add(Conv2D(1024, kernel_size = (3, 3), padding = 'same', activation = 'relu'))
convolutional_neural_network.add(MaxPooling2D(pool_size = (3, 3), strides = (2, 2)))
convolutional_neural_network.add(BatchNormalization())
convolutional_neural_network.add(Dropout(0.2))

# Fourth layer
convolutional_neural_network.add(Conv2D(512, kernel_size = (3, 3), padding = 'same', activation = 'relu'))
convolutional_neural_network.add(Conv2D(1024, kernel_size = (3, 3), padding = 'same', activation = 'relu'))
convolutional_neural_network.add(MaxPooling2D(pool_size = (3, 3), strides = (2, 2)))
convolutional_neural_network.add(BatchNormalization())
convolutional_neural_network.add(Dropout(0.2))

# Final layer
convolutional_neural_network.add(Flatten())
convolutional_neural_network.add(Dense(512, activation = 'relu'))
convolutional_neural_network.add(Dense(12, activation = 'softmax'))
convolutional_neural_network.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy']);

Using TensorFlow backend.
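
The short calculation below traces how the 48-by-48 input shrinks through the four pooling layers. It is a sketch that assumes the Keras defaults used above: the `'same'`-padded, stride-1 convolutions leave the spatial size unchanged, and `MaxPooling2D` uses `'valid'` padding. The helper name `pooled_size` is hypothetical.

```python
def pooled_size(size, pool=3, stride=2):
    """Output length of a max-pooling layer with 'valid' padding."""
    return (size - pool) // stride + 1

size = 48
sizes = []
for block in range(4):          # four convolution/pooling blocks in the model
    size = pooled_size(size)    # the 'same'-padded convolutions do not change the size
    sizes.append(size)

# The last Conv2D layer has 1024 filters, so Flatten sees size * size * 1024 values.
flattened_features = size * size * 1024
print(sizes, flattened_features)
```

Under these assumptions the feature map is only a few pixels wide by the time it reaches `Flatten`, so a fifth pooling block would not fit on 48-by-48 inputs.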


Next we split the data into an 80% training set and a 20% testing set.

In [14]:
from sklearn.model_selection import train_test_split
training_data, testing_data = train_test_split(full_data, test_size = 0.2, random_state = 6)
training_data['species'].value_counts()

Loose Silky-bent             513
Common Chickweed             496
Scentless Mayweed            423
Small-flowered Cranesbill    395
Fat Hen                      378
Charlock                     305
Sugar beet                   299
Cleavers                     233
Black-grass                  211
Shepherds Purse              190
Common wheat                 181
Maize                        176
Name: species, dtype: int64
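
An alternative to searching for a favourable seed is a stratified split, which samples the same fraction from every class. The notebook does not do this; the plain-Python sketch below only illustrates the idea, with a hypothetical helper `stratified_split` and two class sizes taken from the counts above. scikit-learn's `train_test_split` offers the same behaviour through its `stratify` argument.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so that each class contributes test_fraction of its items to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx

# Two classes with the sizes reported for the smallest and largest species.
labels = ['Maize'] * 221 + ['Loose Silky-bent'] * 654
train_idx, test_idx = stratified_split(labels)
maize_in_test = sum(1 for i in test_idx if labels[i] == 'Maize')
print(len(train_idx), len(test_idx), maize_in_test)
```

With the notebook's data frame, this would correspond to passing `stratify = full_data['species']` to `train_test_split`.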

Then we transform the features $X$ and classifications $y$ into the format required by the neural network. For the features we read each image from its pre-processed file location and stack the results into a NumPy array. For the classifications we encode each label as a one-hot vector, i.e. a set of binary dummy variables with one entry per species.

In [15]:
from keras.utils.np_utils import to_categorical
X_train = np.array([ cv2.imread(pre_processed_file) for pre_processed_file in training_data['pre_processed_file_name'] ])
y_train = to_categorical(training_data['numerical_species'])
X_test = np.array([ cv2.imread(pre_processed_file) for pre_processed_file in testing_data['pre_processed_file_name'] ])
y_test = to_categorical(testing_data['numerical_species'])
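
For readers unfamiliar with this encoding, the following plain-Python sketch shows what a one-hot vector looks like for the 12 species. The helper `one_hot` is hypothetical and only mimics the idea behind `to_categorical`; it is not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Encode integer labels as lists with a single 1.0 at the label's position."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

# Numerical labels 0, 7 and 11 correspond to Black-grass, Maize and Sugar beet
# in the alphabetical species list shown earlier.
encoded = one_hot([0, 7, 11], num_classes=12)
for row in encoded:
    print(row)
```

Each row has the same length as the network's softmax output, which is what `categorical_crossentropy` expects.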

Next we add a very useful step to the data that we feed into the neural network. The following image data generator randomly alters the images in every epoch, so that the network does not see exactly the same data on each pass through the training process. The alterations include rotating the image, shifting it horizontally or vertically, shearing, zooming in or out and flipping. This makes the trained network more robust, because it has to learn the features that distinguish the species rather than memorize a fixed set of images. It also helps a great deal against overfitting the training data.

In [16]:
from keras.preprocessing.image import ImageDataGenerator
data_generator = ImageDataGenerator(rotation_range = 360, width_shift_range = 0.5, height_shift_range = 0.5,
                                    shear_range = 0.1, zoom_range = [0.75, 1.25], horizontal_flip = True,
                                    vertical_flip = True, fill_mode = 'constant')
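
As a toy illustration of two of these alterations, the sketch below flips a tiny "image" stored as nested lists. It covers only the horizontal and vertical flips, uses hypothetical helper names, and is not how `ImageDataGenerator` is implemented.

```python
def horizontal_flip(image):
    """Mirror each row, swapping left and right."""
    return [row[::-1] for row in image]

def vertical_flip(image):
    """Reverse the row order, swapping top and bottom."""
    return image[::-1]

tiny = [[1, 2, 3],
        [4, 5, 6]]

flipped_h = horizontal_flip(tiny)
flipped_v = vertical_flip(tiny)
print(flipped_h)
print(flipped_v)
```

The pixel values are unchanged and only their arrangement differs, which suits top-down plant photos where orientation carries no label information.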

Finally we can run the training process, displaying the accuracy on the testing data after each epoch. As I have access to a GPU, I went a little overboard with the number of training loops as well as the number of images processed at a time. My PC was able to handle a batch size of 32, and `steps_per_epoch` is set so that each of the 30 epochs reported below covers roughly 20 passes over the training data. In total, the output below therefore represents about 600 passes through the augmented training data.

In [17]:
# steps_per_epoch should be an integer, so use floor division
convolutional_neural_network.fit_generator(data_generator.flow(X_train, y_train, batch_size = 32), validation_data = (X_test, y_test),
                                          steps_per_epoch = 20 * len(X_train) // 32, epochs = 30);

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30
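
The figure of 600 passes can be checked with a few lines of arithmetic. This sketch assumes the 3800 training images implied by the 80% split of 4750 and ignores the fact that the last batch of each pass through the generator is smaller than 32.

```python
n_train = 3800          # 80% of the 4750 images
batch_size = 32
passes_per_epoch = 20   # the factor used in steps_per_epoch
epochs = 30

steps_per_epoch = passes_per_epoch * n_train // batch_size
images_per_epoch = steps_per_epoch * batch_size
total_passes = epochs * images_per_epoch / n_train
print(steps_per_epoch, images_per_epoch, total_passes)
```

Because 3800 is not a multiple of 32, the true number of passes is very slightly below this estimate.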
