# Plant Seedlings Classification

### A simple notebook explaining, step by step, how to perform image classification using convolutional neural networks and Keras

### The data for this notebook can be found at https://www.kaggle.com/c/plant-seedlings-classification

#### First things first: we will be working with the train.zip and test.zip datasets. To use them with Keras we need to restructure the data a bit; specifically, we will split the training examples into 3 separate sets:

* Train set (80% of the data)
* Validation set (10% of the data)
* Fake-test set (10% of the data)

Why the fake-test set? It is a labelled set that is used neither for training nor for validation, so it lets us check whether we have overfitted the train and validation sets.
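
The 80/10/10 split described above can be sketched as follows. This is a minimal sketch, not the preprocessing actually used for this notebook: the helper name `split_files`, the fixed seed, and the choice to shuffle before slicing are all assumptions made for illustration.

```python
import random


def split_files(file_names, train_frac=0.8, validation_frac=0.1, seed=42):
    """Shuffle file names and cut them into train / validation / fake-test lists.

    Hypothetical helper: whatever is left after the train and validation
    slices goes to the fake-test set.
    """
    shuffled = sorted(file_names)          # sort first so the result is reproducible
    random.Random(seed).shuffle(shuffled)  # seeded shuffle, independent of global state
    n_train = int(len(shuffled) * train_frac)
    n_validation = int(len(shuffled) * validation_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    fake_test = shuffled[n_train + n_validation:]
    return train, validation, fake_test


# Example with made-up file names for a single class
names = ["img_%03d.png" % i for i in range(50)]
train, validation, fake_test = split_files(names)
print(len(train), len(validation), len(fake_test))
```

Applying such a split once per class folder, rather than to the whole dataset at once, keeps the class proportions similar across the three sets.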


We are going to be working with 12 classes, which are:

* Black-grass
* Charlock
* Cleavers
* Common Chickweed
* Common wheat
* Fat Hen
* Loose Silky-bent
* Maize
* Scentless Mayweed
* Shepherds Purse
* Small-flowered Cranesbill
* Sugar beet

The directory structure after preprocessing will be:

```
train
    ├── Black-grass
    │   ├── 129c51855.png
    │   ├── a08892355.png
    │   └── f84089a55.png
    ├── Charlock
    │   ├── 0d5f555a3.png
    │   ├── 20b955bc3.png
    │   └── d9de67550.png
    ├── Cleavers
    ├── ... the rest of the classes 
    │
validation
    │
    ├── Black-grass
    │   ├── 1.png
    │   ├── 2.png
    │   └── 3.png
    ├── Charlock
    │   ├── 4.png
    │   ├── 5.png
    │   └── 6.png
    ├── Cleavers
    ├── ... the rest of the classes  
    │
fake-test
    │
    ├── Black-grass
    │   ├── a.png
    │   ├── b.png
    │   └── c.png
    ├── Charlock
    │   ├── a.png
    │   ├── b.png
    │   └── c.png
    ├── Cleavers
    ├── ... the rest of the classes     
```

We need this structure because later we will use Keras's `ImageDataGenerator`, whose `flow_from_directory` method expects a separate directory for each set (train, validation, fake-test), with one subdirectory per class inside it.
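
Before handing the directories to Keras it can be worth checking that the layout really looks like the tree above. The following is a minimal sketch under stated assumptions: `count_images_per_class` is a hypothetical helper, and it simply treats every subdirectory as a class and every file inside it as an image. The demo builds a tiny throwaway layout instead of touching the real data.

```python
import os
import tempfile


def count_images_per_class(split_dir):
    """Return {class_name: number_of_files} for one split directory.

    Hypothetical helper: every subdirectory counts as a class and
    every entry inside it as one image.
    """
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if os.path.isdir(class_dir):
            counts[class_name] = len(os.listdir(class_dir))
    return counts


# Build a tiny throwaway layout to demonstrate the check
with tempfile.TemporaryDirectory() as root:
    for class_name, n_files in [("Black-grass", 3), ("Charlock", 2)]:
        class_dir = os.path.join(root, "train", class_name)
        os.makedirs(class_dir)
        for i in range(n_files):
            open(os.path.join(class_dir, "%d.png" % i), "w").close()
    demo_counts = count_images_per_class(os.path.join(root, "train"))

print(demo_counts)
```

On the real data, the same call on each of the three directories should list all 12 class names with a non-zero count.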

So, first things first: we are going to prepare our data.

In [11]:
from os import listdir
from os import mkdir
from os import makedirs
import os
import shutil
from IPython.display import Image, display
from keras.preprocessing.image import ImageDataGenerator
from keras.applications import vgg16, vgg19, xception
from keras.layers import Dense, GlobalAveragePooling2D, Dropout, Flatten
from keras.models import Model
from keras.models import load_model
from keras import optimizers
import pandas as pd
from skimage import io
import numpy as np
import cv2

from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau

%matplotlib inline


In [2]:
# This code will require GPU usage... so sometimes we will need to run it in floydhub
FLOYDHUB=True

# As per the image size we will use, I am going with 224... no particular reason really
IMAGE_WIDTH = 224
IMAGE_HEIGHT = 224

if FLOYDHUB:
    OUTPUT_DIR = "/output/"
    TRAIN_DIR = "/input/train/"
    VALIDATION_DIR = "/input/validation/"
    FAKE_TEST_DIR = "/input/fake-test"
    TEST_DIR = "/input/test"
else:
    OUTPUT_DIR = "/tmp/"
    TRAIN_DIR = "train/"
    VALIDATION_DIR = "validation/"
    FAKE_TEST_DIR = "fake-test/"
    TEST_DIR = "test/"




In [3]:
CLASS_NAMES = [
    "Black-grass",
    "Charlock",
    "Cleavers",
    "Common Chickweed",
    "Common wheat",
    "Fat Hen",
    "Loose Silky-bent",
    "Maize",
    "Scentless Mayweed",
    "Shepherds Purse",
    "Small-flowered Cranesbill",
    "Sugar beet",
]

In [24]:
batch_size = 16

# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
        rescale=1./255,
        rotation_range=180,
        vertical_flip=True,
        horizontal_flip=True)

# this is the augmentation configuration we will use for validation:
# only rescaling
validation_datagen = ImageDataGenerator(rescale=1./255)

# and the same for the test set
fake_test_datagen = ImageDataGenerator(rescale=1./255)

# this is a generator that will read pictures found in
# subfolders of TRAIN_DIR, and indefinitely generate
# batches of augmented image data
train_generator = train_datagen.flow_from_directory(
        TRAIN_DIR,  # this is the target directory
        target_size=(IMAGE_WIDTH, IMAGE_HEIGHT),  # all images will be resized to 224x224
        batch_size=batch_size,
        class_mode='categorical')  # since we use categorical_crossentropy loss, we need one-hot encoded labels

# this is a similar generator, for validation data
validation_generator = validation_datagen.flow_from_directory(
        VALIDATION_DIR,
        target_size=(IMAGE_WIDTH, IMAGE_HEIGHT),
        batch_size=batch_size,
        class_mode='categorical')

# And the generator for test data
fake_test_generator = fake_test_datagen.flow_from_directory(
        FAKE_TEST_DIR,
        target_size=(IMAGE_WIDTH, IMAGE_HEIGHT),
        batch_size=batch_size,
        class_mode='categorical')

Found 3806 images belonging to 12 classes.
Found 474 images belonging to 12 classes.
Found 470 images belonging to 12 classes.
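
To make the training augmentation settings above more concrete, here is a minimal NumPy sketch of the same kind of transformation. It is not what `ImageDataGenerator` does internally: `augment` is a hypothetical helper, and it uses only flips and 90-degree rotations as a stand-in for the arbitrary-angle rotations allowed by `rotation_range=180`. The input is random noise rather than a real seedling image.

```python
import numpy as np


def augment(image, rng):
    """Randomly flip and rotate one square HxWxC image, then rescale to [0, 1]."""
    if rng.random() < 0.5:
        image = np.flipud(image)                 # vertical flip
    if rng.random() < 0.5:
        image = np.fliplr(image)                 # horizontal flip
    quarter_turns = int(rng.integers(0, 4))      # 0, 90, 180 or 270 degrees
    image = np.rot90(image, k=quarter_turns, axes=(0, 1))
    return image.astype("float32") / 255.0       # same idea as rescale=1./255


rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
augmented = augment(fake_image, rng)
print(augmented.shape, augmented.min(), augmented.max())
```

Flips and rotations are reasonable for this dataset because the seedlings are photographed from above, so no orientation is more "correct" than another.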


In [32]:
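# Round each sample count down to a whole number of batches,
# reading the counts from the generators instead of hard-coding them
training_samples = (train_generator.samples // batch_size) * batch_size
validation_samples = (validation_generator.samples // batch_size) * batch_size
fake_test_samples = (fake_test_generator.samples // batch_size) * batch_size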
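
The arithmetic behind these values is easy to check by hand. The sketch below uses the image counts reported by the generators above and a batch size of 16. Rounding down to a multiple of the batch size and then dividing by the batch size, as the training call does later, is the same as a single floor division. The variable names here are illustrative only.

```python
import math

batch_size = 16
image_counts = {"train": 3806, "validation": 474, "fake-test": 470}

for name, count in image_counts.items():
    full_batches = count // batch_size              # number of complete batches
    all_batches = math.ceil(count / batch_size)     # would also include the last partial batch
    left_over = count - full_batches * batch_size   # images beyond the complete batches
    print(name, full_batches, all_batches, left_over)

# Steps per epoch when only complete batches are used
steps = {name: count // batch_size for name, count in image_counts.items()}
```

Using only complete batches means a handful of images are left out of each pass; using the ceiling instead would cover every image at the cost of one smaller final batch.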

In [33]:
xception_model = xception.Xception(weights='imagenet', include_top=False)

In [34]:
xception_model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_4 (InputLayer)             (None, None, None, 3) 0                                            
____________________________________________________________________________________________________
block1_conv1 (Conv2D)            (None, None, None, 32 864         input_4[0][0]                    
____________________________________________________________________________________________________
block1_conv1_bn (BatchNormalizat (None, None, None, 32 128         block1_conv1[0][0]               
____________________________________________________________________________________________________
block1_conv1_act (Activation)    (None, None, None, 32 0           block1_conv1_bn[0][0]            
___________________________________________________________________________________________

In [35]:
for layer in xception_model.layers:
    print(layer.name, "\t",  "trainable" if layer.trainable else "NOT trainable")

input_4 	 NOT trainable
block1_conv1 	 trainable
block1_conv1_bn 	 trainable
block1_conv1_act 	 trainable
block1_conv2 	 trainable
block1_conv2_bn 	 trainable
block1_conv2_act 	 trainable
block2_sepconv1 	 trainable
block2_sepconv1_bn 	 trainable
block2_sepconv2_act 	 trainable
block2_sepconv2 	 trainable
block2_sepconv2_bn 	 trainable
conv2d_13 	 trainable
block2_pool 	 trainable
batch_normalization_13 	 trainable
add_37 	 trainable
block3_sepconv1_act 	 trainable
block3_sepconv1 	 trainable
block3_sepconv1_bn 	 trainable
block3_sepconv2_act 	 trainable
block3_sepconv2 	 trainable
block3_sepconv2_bn 	 trainable
conv2d_14 	 trainable
block3_pool 	 trainable
batch_normalization_14 	 trainable
add_38 	 trainable
block4_sepconv1_act 	 trainable
block4_sepconv1 	 trainable
block4_sepconv1_bn 	 trainable
block4_sepconv2_act 	 trainable
block4_sepconv2 	 trainable
block4_sepconv2_bn 	 trainable
conv2d_15 	 trainable
block4_pool 	 trainable
batch_normalization_15 	 trainable
add_39 	 trainable

In [36]:
for layer in xception_model.layers[:-6]:  # Freeze everything except the last 6 layers
    layer.trainable = False


# New classification head on top of the Xception base
x = xception_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation="relu")(x)
x = Dropout(0.5)(x)
predictions = Dense(12, activation="softmax")(x)  # one output per class

model_trainable = Model(inputs=xception_model.input, outputs=predictions)
    
    
#for layer in xception_model.layers:
#    print(layer.name, "\t",  "trainable" if layer.trainable else "NOT trainable")
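
The slice `layers[:-6]` selects every layer except the last six, so those six layers of the base model stay trainable, together with the newly added pooling, dense and dropout layers. A minimal sketch of the slicing, using plain strings as a stand-in for the Keras layer objects (the names are made up):

```python
# Stand-in for xception_model.layers: plain strings instead of Keras layers
layer_names = ["layer_%d" % i for i in range(10)]

frozen = layer_names[:-6]      # everything except the last six entries
trainable = layer_names[-6:]   # the last six entries

print("frozen:   ", frozen)
print("trainable:", trainable)
```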



In [41]:
adam = optimizers.Adam(lr=0.00001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)

model_trainable.compile(loss='categorical_crossentropy',
                  optimizer=adam,
                  metrics=['accuracy'])

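# Halve the learning rate when val_loss has not improved for 3 epochs.
# Note: min_lr equals the optimizer's initial learning rate (1e-5), so with
# these values the callback has no room to lower it.
reduce_lr_callback = ReduceLROnPlateau(monitor='val_loss', 
                                       factor=0.5, 
                                       patience=3, 
                                       min_lr=0.00001,
                                       verbose=1)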
checkpoint_callback = ModelCheckpoint(OUTPUT_DIR+"bestXception.model", 
                                      monitor='val_loss', 
                                      verbose=1, 
                                      save_best_only=True, 
                                      save_weights_only=False, 
                                      mode='auto', 
                                      period=1)

history_trainable_augmented = model_trainable.fit_generator(train_generator,
                                                  steps_per_epoch=training_samples // batch_size,
                                                  epochs=15,
                                                  callbacks = [reduce_lr_callback,  checkpoint_callback],
                                                  validation_data=validation_generator,
                                                  validation_steps=validation_samples // batch_size)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15
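
To see how a reduce-on-plateau schedule behaves with `factor=0.5` and `patience=3`, here is a simplified simulation. It is a sketch of the idea only, not Keras's implementation: it ignores cooldown and the minimum-improvement threshold, `simulate_reduce_on_plateau` is a hypothetical helper, and the validation losses are made up.

```python
def simulate_reduce_on_plateau(val_losses, initial_lr, factor=0.5, patience=3, min_lr=1e-5):
    """Return the learning rate in effect after each epoch.

    Simplified model of reduce-on-plateau: no cooldown, no min_delta.
    """
    lr = initial_lr
    best = float("inf")
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best:
            best, wait = loss, 0              # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience and lr > min_lr:
                lr = max(lr * factor, min_lr)  # never go below min_lr
                wait = 0
        history.append(lr)
    return history


# Made-up validation losses that stop improving after the second epoch
losses = [1.00, 0.90, 0.95, 0.96, 0.97, 0.98]
from_higher_lr = simulate_reduce_on_plateau(losses, initial_lr=1e-4)
from_min_lr = simulate_reduce_on_plateau(losses, initial_lr=1e-5)
print(from_higher_lr)
print(from_min_lr)
```

In the second call the starting learning rate already equals `min_lr`, which is the situation in the cell above (`lr=0.00001` and `min_lr=0.00001`), so the rate never changes.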


In [42]:
model_trainable.evaluate_generator(fake_test_generator, steps= fake_test_samples // batch_size)

[0.20487483205466434, 0.92887931034482762]
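
The two numbers returned above are the loss followed by the metric passed to `compile`, that is, the categorical cross-entropy and the accuracy on the fake-test set. As a worked example of what they measure, here is a plain-Python sketch on made-up predictions. It is a simplification (for instance, it does not clip probabilities before taking the logarithm), and the function names are illustrative.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Mean of -sum(true * log(pred)) over the samples (one-hot targets)."""
    per_sample = [
        -sum(t * math.log(p) for t, p in zip(true_row, pred_row) if t > 0)
        for true_row, pred_row in zip(y_true, y_pred)
    ]
    return sum(per_sample) / len(per_sample)


def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class matches the target."""
    hits = sum(
        true_row.index(max(true_row)) == pred_row.index(max(pred_row))
        for true_row, pred_row in zip(y_true, y_pred)
    )
    return hits / len(y_true)


# Made-up example: 3 samples, 3 classes; the third sample is misclassified
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.5, 0.3, 0.2]]

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(loss, acc)
```

A confident wrong prediction, like the third sample, contributes far more to the loss than a correct one, which is why loss and accuracy do not always move together.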

## Prediction area

In [43]:
real_test_images = []

image_files = listdir(TEST_DIR)
for i, image_file in enumerate(image_files, start=1):
    raw_image = io.imread(os.path.join(TEST_DIR, image_file))
    # Resize to the same size the generators used for training
    scaled_img = cv2.resize(raw_image, (IMAGE_WIDTH, IMAGE_HEIGHT), interpolation=cv2.INTER_CUBIC)
    real_test_images.append(scaled_img)
    if i % 100 == 0:
        print("Loaded", i, "images so far...")
# Stack into one array and apply the same 1/255 rescaling as the generators
X = np.array(real_test_images)
X = X / 255
print("Done!")

Loaded 100 images so far...
Loaded 200 images so far...
Loaded 300 images so far...
Loaded 400 images so far...
Loaded 500 images so far...
Loaded 600 images so far...
Loaded 700 images so far...
Done!


In [44]:
def predict_and_dump(model_to_use, X_to_use, image_files_to_use, file_name):
    # One row of class probabilities per image, with columns labelled by class name
    results = model_to_use.predict(X_to_use, verbose=1)
    final_predictions = pd.DataFrame(columns=CLASS_NAMES, data=results)
    # Kaggle submission: the file name plus the most probable species
    kaggle_data = pd.DataFrame({"file": image_files_to_use})
    kaggle_data["species"] = final_predictions.idxmax(axis=1)
    kaggle_data.to_csv(file_name, index=False)
    return kaggle_data, final_predictions
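
Labelling the prediction columns with `CLASS_NAMES` only works if that list is in the same order as the class indices used during training. `flow_from_directory` assigns indices to the class folders in sorted order, so a quick check is that `CLASS_NAMES` is itself sorted; the generator's `class_indices` attribute remains the authoritative mapping. The sketch below also shows, in plain Python, what `idxmax` does for one row. `most_probable_class` is a hypothetical helper and the probabilities are made up.

```python
CLASS_NAMES = [
    "Black-grass", "Charlock", "Cleavers", "Common Chickweed",
    "Common wheat", "Fat Hen", "Loose Silky-bent", "Maize",
    "Scentless Mayweed", "Shepherds Purse", "Small-flowered Cranesbill",
    "Sugar beet",
]

# The column labels are only right if CLASS_NAMES matches the sorted folder order
class_names_sorted = CLASS_NAMES == sorted(CLASS_NAMES)


def most_probable_class(probabilities, class_names):
    """Plain-Python equivalent of idxmax over one row of probabilities."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best_index]


# Made-up probabilities for one image: index 7 holds the largest value
row = [0.01] * 12
row[7] = 0.89
predicted = most_probable_class(row, CLASS_NAMES)
print(class_names_sorted, predicted)
```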

In [46]:
output, predictions_raw = predict_and_dump(model_trainable, X, image_files, "xception60epochs.csv")

