# Introduction

This notebook is adapted from <a href="https://www.kaggle.com/code/larsmadsen/tf-keras-inception-resnet-v2-97-acc" style="text-decoration:none">LMADSEN</a>'s notebook. I made a few modifications following the suggestions in that notebook, and rearranged the code layout to suit my own coding style.

Thanks to **LMADSEN** for sharing this excellent work, which gave me the opportunity to learn about the Inception-ResNet model and how to implement it with tf.keras.

# Import Libraries

In [1]:
import tensorflow as tf
import pandas as pd

In [2]:
# Check the version of tensorflow and make sure that the GPU is available
print(tf.__version__)
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

# tf.config.experimental.list_physical_devices('GPU')

2.4.1
Num GPUs Available:  1


<br>

# Define Some Constants

These are the number of epochs to train, the number of images per batch, the image width and height in pixels, and the class (species) names used in the submission file.

In [3]:
nb_epoch     = 40
batch_size   = 16
width        = 299
height       = 299
species_list = ["Black-grass", "Charlock", "Cleavers", "Common Chickweed", 
                "Common wheat", "Fat Hen", "Loose Silky-bent", "Maize", 
                "Scentless Mayweed", "Shepherds Purse", 
                "Small-flowered Cranesbill", "Sugar beet"]

<br>

# Define the Model

Define the neural network. We use the built-in Inception-ResNet V2 module in tf.keras.

We don't import the original output layer, however, because we don't have 1000 categories. Instead, we add a few layers for our 12 categories.

In [4]:
def define_model(width, height):
    # Keras expects image tensors as (height, width, channels)
    model_input = tf.keras.layers.Input(shape=(height, width, 3), name='image_input')
    model_main = tf.keras.applications.inception_resnet_v2\
                                      .InceptionResNetV2(include_top=False,
                                                         weights='imagenet')(model_input)
    model_dense1 = tf.keras.layers.Flatten()(model_main)
    model_dense2 = tf.keras.layers.Dense(128, activation='relu')(model_dense1)
    model_out = tf.keras.layers.Dense(12, activation="softmax")(model_dense2)

    model = tf.keras.models.Model(model_input,  model_out)
    
    # `lr` is a deprecated alias; `learning_rate` is the current argument name
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.00004, beta_1=0.9, beta_2=0.999)
    model.compile(loss="categorical_crossentropy", 
                  optimizer=optimizer,
                  metrics=["accuracy"])
    return model

<br>

# Define Data Generator

Next, define the data generators, which traverse the image directories and augment the images as needed for training.

In [5]:
data_dir = "/kaggle/input/plant-seedlings-classification/"

In [6]:
def define_generators():
    train_datagen = tf.keras.preprocessing\
                      .image.ImageDataGenerator(
                                rotation_range=360,
                                width_shift_range=0.3,
                                height_shift_range=0.3,
                                shear_range=0.3,
                                zoom_range=0.5,
                                vertical_flip=True,
                                horizontal_flip=True,
                                
                                validation_split=0.2,
                                # Set validation_split=0.0 to train on the entire
                                # training set instead of holding out a validation subset.
                                )

    train_generator = train_datagen.flow_from_directory(
                                        # directory='/kaggle/input/plant-seedlings-classification/train',
                                        directory=data_dir + "train",
                                        target_size=(height, width),  # Keras expects (height, width)
                                        batch_size=batch_size,
                                        color_mode='rgb',
                                        class_mode="categorical",
                                        subset='training',
                                        )

    
    validation_generator = train_datagen.flow_from_directory(
                                           # directory='/kaggle/input/plant-seedlings-classification/train',
                                           directory=data_dir + "train",
                                           target_size=(height, width),  # Keras expects (height, width)
                                           batch_size=batch_size,
                                           color_mode='rgb',
                                           class_mode="categorical",
                                           subset='validation',
                                           )

    test_datagen = tf.keras.preprocessing.image.ImageDataGenerator()

    test_generator = test_datagen.flow_from_directory(
                                         # directory='/kaggle/input/plant-seedlings-classification/',
                                         directory=data_dir,
                                         classes=['test'],
                                         target_size=(height, width),  # Keras expects (height, width)
                                         batch_size=1,
                                         color_mode='rgb',
                                         shuffle=False,
                                         class_mode='categorical')

    return train_generator, validation_generator, test_generator
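
One thing worth checking: the generators above yield raw pixel values in the range [0, 255]. The ImageNet weights for Inception-ResNet V2 were trained on inputs scaled to [-1, 1], which is what `tf.keras.applications.inception_resnet_v2.preprocess_input` does. The sketch below reproduces that scaling in plain Python for illustration only; `scale_pixel` is a hypothetical helper that is not part of this notebook, and whether the scaling changes the results here has not been tested.

```python
def scale_pixel(value):
    """Map a pixel value from [0, 255] to [-1, 1] (hypothetical helper)."""
    return value / 127.5 - 1.0


# Darkest, middle and brightest pixel values.
for raw in (0, 127.5, 255):
    print(raw, "->", scale_pixel(raw))
```

If you want to try it, `ImageDataGenerator` accepts a `preprocessing_function` argument, so `preprocess_input` could be passed there for both the training and the test generators.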

<br>

# Define the Checkpoint

Define the checkpoint save callback on validation accuracy.

The callback is passed to `model.fit`, so the model with the highest validation accuracy is saved to `model.h5`. Load that file if you want to work with the best model instead of the one from the last training epoch.

In [7]:
def define_callbacks():
    save_callback = tf.keras.callbacks.ModelCheckpoint(filepath='model.h5',
                                                       monitor='val_accuracy',  # TF 2.x logs metrics=["accuracy"] as 'val_accuracy'
                                                       save_best_only=True,
                                                       verbose=1
                                                      )

    return save_callback
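
To make the effect of `save_best_only=True` concrete, here is a simplified plain-Python model of the idea: after each epoch the monitored value is compared with the best value seen so far, and a save happens only when it improves. This is a sketch and not Keras's actual implementation; `epochs_that_save` is a hypothetical helper and the accuracy values are made up for illustration.

```python
def epochs_that_save(monitored_values):
    """Return the 1-based epochs at which a best-only checkpoint would save.

    Simplified model: higher is better, as for an accuracy metric.
    """
    best = float("-inf")
    saved = []
    for epoch, value in enumerate(monitored_values, start=1):
        if value > best:
            best = value
            saved.append(epoch)
    return saved


# Made-up validation accuracies for five epochs.
val_accuracy = [0.61, 0.75, 0.73, 0.80, 0.79]
print(epochs_that_save(val_accuracy))  # [1, 2, 4]
```

In this made-up run the file on disk would hold the model from epoch 4, not from the final epoch.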

<br>

# Train the Model

Now, define the model and fit it with the training data.

In [8]:
model = define_model(width, height)
model.summary()

train_generator, validation_generator, test_generator = define_generators()


save_callback = define_callbacks()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/inception_resnet_v2/inception_resnet_v2_weights_tf_dim_ordering_tf_kernels_notop.h5
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
image_input (InputLayer)     [(None, 299, 299, 3)]     0         
_________________________________________________________________
inception_resnet_v2 (Functio (None, None, None, 1536)  54336736  
_________________________________________________________________
flatten (Flatten)            (None, 98304)             0         
_________________________________________________________________
dense (Dense)                (None, 128)               12583040  
_________________________________________________________________
dense_1 (Dense)              (None, 12)                1548      
Total params: 66,921,324
Trainable params: 66,860,780
Non-trainable params: 60,544
_________
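
The numbers in the summary above can be checked by hand. The sketch below assumes the backbone produces an 8 x 8 x 1536 feature map for a 299 x 299 input, which is consistent with the flattened size of 98304 in the summary. The backbone parameter count is copied from the summary rather than derived.

```python
# Feature map assumed to be 8 x 8 x 1536 for a 299 x 299 input.
flat_units = 8 * 8 * 1536

# A Dense layer has (inputs * units) weights plus one bias per unit.
dense_params = flat_units * 128 + 128
output_params = 128 * 12 + 12

backbone_params = 54_336_736  # copied from the summary above
total_params = backbone_params + dense_params + output_params

print(flat_units)     # 98304
print(dense_params)   # 12583040
print(output_params)  # 1548
print(total_params)   # 66921324
```

Nearly all of the parameters added on top of the backbone sit in the first Dense layer, because `Flatten` keeps every spatial position of the feature map.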

In [9]:
model.fit(train_generator,
          epochs=nb_epoch,
          steps_per_epoch=train_generator.samples // batch_size,
          validation_data= validation_generator,
          validation_steps=validation_generator.samples // batch_size,
          callbacks=[save_callback]  # saves the model with the best validation accuracy
)

Epoch 1/40
...
Epoch 40/40


<tensorflow.python.keras.callbacks.History at 0x73aa96f382d0>
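
`steps_per_epoch` and `validation_steps` above use floor division, so when the number of samples is not a multiple of `batch_size`, the final partial batch is not counted in the step total. Here is a small sketch of the arithmetic. The sample count is made up (the real counts are printed by `flow_from_directory` when it runs), and `steps_floor` and `steps_ceil` are hypothetical helpers.

```python
import math


def steps_floor(samples, batch_size):
    """Number of full batches only (what `samples // batch_size` gives)."""
    return samples // batch_size


def steps_ceil(samples, batch_size):
    """Number of batches needed to cover every sample once."""
    return math.ceil(samples / batch_size)


samples = 1000    # hypothetical sample count, for illustration only
batch_size = 16   # same value as in the constants cell

print(steps_floor(samples, batch_size))  # 62
print(steps_ceil(samples, batch_size))   # 63
```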

<br>

# Prediction and Submission

Cool, now we (hopefully) have a model that can predict the species!

Call it to get the predictions, then create a pandas DataFrame with the species name of the highest-probability class for each image. Finally, save the DataFrame as the submission file.

In [11]:
predictions = model.predict(test_generator, steps=test_generator.samples)

# Map the most probable class index of each prediction to its species name
class_list = [species_list[i] for i in predictions.argmax(axis=-1)]

submission = pd.DataFrame()
submission['file'] = test_generator.filenames
# Strip the leading 'test' directory, whichever path separator the OS uses
submission['file'] = submission['file'].str.replace(r'^test[\\/]', '', regex=True)
submission['species'] = class_list

submission.to_csv('./Inception-ResNet.csv', index=False)

print('Submission file generated. All done.')

Submission file generated. All done.
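
The lookup above relies on `species_list` being in the same order as the class indices that `flow_from_directory` assigned during training. The directory iterator exposes that mapping as `class_indices` (class name to index), so an alternative is to invert it. Below is a minimal sketch in plain Python. The three-class mapping and the probabilities are made up and only stand in for `train_generator.class_indices` and `predictions`; `argmax` is a hypothetical helper.

```python
# Hypothetical stand-in for train_generator.class_indices.
class_indices = {"Charlock": 0, "Cleavers": 1, "Maize": 2}

# Invert to index -> class name.
index_to_class = {index: name for name, index in class_indices.items()}

# Made-up softmax outputs for two images.
predictions = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
]


def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)


predicted_species = [index_to_class[argmax(row)] for row in predictions]
print(predicted_species)  # ['Cleavers', 'Charlock']
```

Deriving the names from `class_indices` keeps the submission correct even if the class directories are renamed or reordered.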
