#[SIF Artificial Intelligence Foundational] Week 4 - Rice Grain Detection.ipynb

🌟 In this Rice Classification Notebook, we will use the rice grains that we sorted into categories in the previous activity to train a classifier with supervised learning.

🌟 This notebook will take you through the full journey of the AI Project Cycle, starting with defining the problem.

🌟 Next, we will upload and visualise our data before designing a machine learning model to process it.

#1. Problem Scope

The **identification of different varieties of rice** is a widely discussed step in food processing. Traditionally, conventional techniques have been used to identify rice varieties from samples collected from different real field sources.

These techniques are expensive and slow. Furthermore, there are many varieties of rice, and it is very difficult to tell the actual variety from the grain alone.

So we need an automated system that helps us identify rice varieties with better accuracy. The main purpose of this notebook is to identify the variety of rice from images of individual grains.

#2a. Importing the Libraries

In [None]:
# Importing necessary libraries

# Building deep learning models
import tensorflow as tf 
from tensorflow import keras 
# For accessing pre-trained models
import tensorflow_hub as hub 
# For separating train and test sets
from sklearn.model_selection import train_test_split

# For visualizations
import matplotlib.pyplot as plt
import matplotlib.image as img
import PIL.Image as Image
import cv2

import os
import numpy as np
import pathlib
import zipfile

In [2]:
#Mount the Google Drive
from google.colab import drive
drive.mount('/content/drive')

#2b. Uploading the Data Set

Now we will point the notebook at the rice image dataset stored in Google Drive.

In [2]:
data_dir = "/content/drive/MyDrive/rice_image_dataset/" # Datasets path
data_dir = pathlib.Path(data_dir)
data_dir
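
Before going further, it can help to confirm that the dataset folder contains what we expect. The sketch below is one possible check and is not part of the original activity: `count_images_per_class` is a hypothetical helper that counts the files in each category subfolder. It is demonstrated on a throwaway temporary folder with made-up file names so that it runs anywhere; in Colab you would call it on `data_dir` after mounting the Drive.

```python
import pathlib
import tempfile

def count_images_per_class(root):
    """Return {subfolder name: number of files} for each category folder under root."""
    root = pathlib.Path(root)
    return {
        folder.name: sum(1 for item in folder.iterdir() if item.is_file())
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }

# Demo on a temporary folder that mimics the dataset layout (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    for category, n_files in [("Arborio", 3), ("Jasmine", 2)]:
        class_dir = pathlib.Path(tmp) / category
        class_dir.mkdir()
        for i in range(n_files):
            (class_dir / f"{category} ({i + 1}).jpg").touch()
    demo_counts = count_images_per_class(tmp)

print(demo_counts)
```

If a category shows fewer files than expected, the later `[:600]` slices will simply return fewer images for that category.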

#3. Data Visualization

We can now visualize the data using the code below. Take a look at a rice grain from each of the different categories!

In [2]:
tf.keras.utils.load_img("/content/drive/MyDrive/rice_image_dataset/Jasmine/Jasmine (1).jpg",target_size=(250,250))

In [1]:
#Let's try another one!
tf.keras.utils.load_img("/content/drive/MyDrive/rice_image_dataset/Arborio/Arborio (1).jpg",target_size=(250,250))

In [None]:
arborio = list(data_dir.glob('Arborio/*'))[:600]
basmati = list(data_dir.glob('Basmati/*'))[:600]
ipsala = list(data_dir.glob('Ipsala/*'))[:600]
jasmine = list(data_dir.glob('Jasmine/*'))[:600]
karacadag = list(data_dir.glob('Karacadag/*'))[:600]

In [None]:
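#Let's take a look at the rice grains next to each other

fig, ax = plt.subplots(ncols=5, figsize=(20,5))
fig.suptitle('Rice Category')

# Show the first image of each category in its own panel
samples = {'Arborio': arborio, 'Basmati': basmati, 'Ipsala': ipsala, 'Jasmine': jasmine, 'Karacadag': karacadag}
for axis, (name, paths) in zip(ax, samples.items()):
    axis.imshow(Image.open(paths[0]))
    axis.set_title(name)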

#4. Data Labeling

In [None]:
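# Contains the image paths for each category
df_images = {
    'arborio': arborio, 'basmati': basmati, 'ipsala': ipsala,
    'jasmine': jasmine, 'karacadag': karacadag,
}

# Contains numerical labels for the categories (any consistent 0-4 numbering works)
df_labels = {
    'arborio': 0, 'basmati': 1, 'ipsala': 2,
    'jasmine': 3, 'karacadag': 4,
}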

In [None]:
# Converting the images into numerical arrays


#Reshape the images into 250 by 250 by 3

In [None]:
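X, y = [], [] # X = images, y = labels
for label, images in df_images.items():
    for image in images:
        bgr_img = cv2.imread(str(image)) # OpenCV loads images in BGR channel order
        rgb_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2RGB) # The pre-trained MobileNetV2 model expects RGB
        resized_img = cv2.resize(rgb_img, (224, 224)) # Resizing to the 224x224 input size used by MobileNetV2
        X.append(resized_img)
        y.append(df_labels[label])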

In [None]:
# Normalizing: scale pixel values from 0-255 to 0-1 (float32 keeps memory use down)
X = np.array(X, dtype=np.float32)
X = X/255
y = np.array(y)

# Separating data into training, test and validation sets
X_train, X_test_val, y_train, y_test_val = train_test_split(X, y) # default split: 75% train, 25% held out
X_test, X_val, y_test, y_val = train_test_split(X_test_val, y_test_val) # held-out part: 75% test, 25% validation
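
To see how many images end up in each set, the arithmetic can be sketched in plain Python. This assumes that each of the five folders supplied the full 600 images (3,000 in total) and that a fractional split rounds the held-out share up, which is how scikit-learn handles it. `split_sizes` is a hypothetical helper written only for illustration.

```python
import math

def split_sizes(n_samples, test_size=0.25):
    """Mirror a fractional split: the held-out part is rounded up, the rest is kept."""
    n_held_out = math.ceil(test_size * n_samples)
    return n_samples - n_held_out, n_held_out

n_total = 5 * 600  # assumption: every category folder has at least 600 images

# First split: training set vs. the pool shared by test and validation
n_train, n_test_val = split_sizes(n_total)
# Second split: the larger part becomes the test set, the smaller the validation set
n_test, n_val = split_sizes(n_test_val)

print(n_train, n_test, n_val)  # 2250 562 188
```

Under these assumptions roughly 75% of the images are used for training, about 19% for testing, and about 6% for validation.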

#5a. Modeling


In [None]:
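mobile_net_url = 'https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4' # MobileNetV2 feature extractor (version 4 on TF Hub)
mobile_net = hub.KerasLayer(
        mobile_net_url, input_shape=(224,224, 3), trainable=False) # Pre-trained weights stay frozen; this variant comes without the final classification layer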

In [None]:
num_label = 5 # number of labels

model = keras.Sequential([
    mobile_net,
    keras.layers.Dense(num_label)
])

model.summary()

#5b. Training the Model

In [None]:
model.compile(
  optimizer="adam",
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  metrics=['acc'])

history = model.fit(X_train, y_train, epochs=10, validation_data=(X_val, y_val))
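
The final `Dense` layer outputs five raw scores (logits), one per rice category, and `from_logits=True` tells the loss function to apply softmax itself. The sketch below works through that calculation for a single image in plain Python. The logit values are made up for illustration, and the helper names are hypothetical rather than part of Keras.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(true_label, logits):
    """Loss for one example: negative log of the probability given to the true class."""
    return -math.log(softmax(logits)[true_label])

example_logits = [2.0, 0.5, -1.0, 0.1, 0.3]  # made-up scores for the 5 categories
probs = softmax(example_logits)
predicted_label = probs.index(max(probs))  # same idea as np.argmax
loss_if_correct = sparse_categorical_crossentropy(0, example_logits)
loss_if_wrong = sparse_categorical_crossentropy(2, example_logits)

print(predicted_label, round(loss_if_correct, 3), round(loss_if_wrong, 3))
```

The loss is small when the true category receives a high probability and grows as that probability shrinks, which is what training tries to minimise.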


In [None]:
from sklearn.metrics import classification_report

y_pred = model.predict(X_test, batch_size=64, verbose=1) # one row of 5 scores per test image
y_pred_labels = np.argmax(y_pred,axis=1) # index of the highest score = predicted category

print(classification_report(y_test, y_pred_labels))
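
`classification_report` prints precision, recall, and F1-score for each category. As a rough illustration of what those columns mean, here is a plain-Python sketch for a single class. The labels are toy values, not results from this model, and `precision_recall_f1` is a hypothetical helper.

```python
def precision_recall_f1(y_true, y_pred, positive_class):
    """Compute precision, recall and F1 for one class from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_class and p == positive_class)
    fp = sum(1 for t, p in pairs if t != positive_class and p == positive_class)
    fn = sum(1 for t, p in pairs if t == positive_class and p != positive_class)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

toy_true = [0, 0, 0, 1, 1, 2]  # toy labels for illustration only
toy_pred = [0, 0, 1, 1, 0, 2]
precision, recall, f1 = precision_recall_f1(toy_true, toy_pred, positive_class=0)
print(precision, recall, f1)
```

Precision asks "of everything predicted as this category, how much was right?", while recall asks "of everything that truly belongs to this category, how much did we find?". F1 combines the two into one number.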

#6. Visualizing the Model


In [None]:
plt.plot(history.history['acc'],marker='x')
plt.plot(history.history['val_acc'],marker='o')
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='lower right')
plt.show()

In [None]:
plt.plot(history.history['loss'],marker='o')
plt.plot(history.history['val_loss'], marker='o')
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper right')
plt.show()
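
When reading these curves, one common check is to see where the validation loss stops improving while the training loss keeps falling, which can be a sign of overfitting. The sketch below shows one way to find that point from a dictionary shaped like `history.history`. The numbers are invented for illustration and `best_epoch` is a hypothetical helper.

```python
def best_epoch(history_dict):
    """Return the 1-based epoch with the lowest validation loss."""
    val_losses = history_dict["val_loss"]
    return val_losses.index(min(val_losses)) + 1

# Invented values shaped like history.history (not results from this notebook)
toy_history = {
    "loss": [0.90, 0.50, 0.30, 0.20, 0.15],
    "val_loss": [0.80, 0.55, 0.40, 0.42, 0.47],
}
epoch = best_epoch(toy_history)
print(epoch)
```

In this toy example the training loss keeps dropping after the best epoch while the validation loss rises, so further training would not be expected to help on unseen images.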

#Reflections 🤔

Congratulations on completing the notebook! Now let's reflect on the experience and note down some of the interesting points that you encountered.

What did you learn from this Python coding activity?

What are some important metrics to determine how good a machine learning model is?

What did you enjoy most in this session?