
**Dataset collection:**

1. Mount your drive
2. Upload your kaggle.json file (download your Kaggle API token to get this file)
3. Download the dataset (here, you will get the dataset as imagenet100.zip)
4. Unzip the downloaded dataset

In [None]:
from google.colab import drive
drive.mount('/content/drive') # Select your account and click Continue. You will then see the drive mounted

Mounted at /content/drive


In [None]:
# Setting up the working directory (optional). The folder must already exist in your Drive.
%cd /content/drive/MyDrive/Colab Notebooks/

[Errno 2] No such file or directory: '/content/drive/MyDrive/Colab Notebooks/'
/content


In [None]:
# Upload the downloaded kaggle.json from your device
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

# Then move kaggle.json into the folder where the API expects to find it.
!mkdir -p ~/.kaggle/ && mv kaggle.json ~/.kaggle/ && chmod 600 ~/.kaggle/kaggle.json

In [None]:
!kaggle datasets download -d ambityga/imagenet100 #downloading the dataset from kaggle as imagenet100.zip

In [None]:
!unzip imagenet100.zip # Unzipping the imagenet100.zip file

**Installing the dependencies or setting up the environment**

1. Create a virtual environment, e.g. python -m venv imagenet_vnv
2. Install the libraries/packages specified below.


---



In [None]:
# Installing the dependencies

!pip install tensorflow
!pip install tqdm
!pip install numpy
!pip install scikit-learn
!pip install keras
!pip install pandas
!pip install opencv-python

In [None]:
# Importing the libraries

import os
import glob
import json
from zipfile import ZipFile

import cv2
import numpy as np
import tensorflow as tf
from tqdm.notebook import tqdm  # replaces the deprecated tqdm._tqdm_notebook
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import (BatchNormalization, Convolution2D, Dense, Dropout,
                          Flatten, MaxPooling2D)


**Data preparation**

1. Our training dataset, with a total of 100 classes, is split across 4 different folders (train.X1, train.X2, train.X3, train.X4), each containing 25 classes.
2. We need to merge these folders together.
3. Therefore, all of this data is copied to a newly created directory, '/content/train', and val.X (the validation data with 100 folders, representing 100 classes) is copied to a newly created directory, '/content/val'.
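
The merge described above can also be written as one reusable helper instead of one loop per folder. The following is only a minimal sketch: the function name merge_class_folders and the demo folder and file names are made up for illustration, and the demo runs on a temporary directory rather than on the real dataset.

```python
import os
import shutil
import tempfile


def merge_class_folders(source_dirs, target_dir):
    """Copy every class sub-folder of each source directory into target_dir."""
    copied = 0
    for source in source_dirs:
        for label in sorted(os.listdir(source)):
            src_label_dir = os.path.join(source, label)
            if not os.path.isdir(src_label_dir):
                continue  # skip stray files next to the class folders
            dst_label_dir = os.path.join(target_dir, label)
            os.makedirs(dst_label_dir, exist_ok=True)  # safe to re-run
            for filename in os.listdir(src_label_dir):
                shutil.copy(os.path.join(src_label_dir, filename),
                            os.path.join(dst_label_dir, filename))
                copied += 1
    return copied


# Tiny demo on a temporary directory (hypothetical class and file names).
demo_root = tempfile.mkdtemp()
for part, label in [("train.X1", "class_a"), ("train.X2", "class_b")]:
    os.makedirs(os.path.join(demo_root, part, label))
    with open(os.path.join(demo_root, part, label, "img0.JPEG"), "w") as fh:
        fh.write("placeholder")

demo_sources = [os.path.join(demo_root, p) for p in ("train.X1", "train.X2")]
demo_target = os.path.join(demo_root, "train")
num_copied = merge_class_folders(demo_sources, demo_target)
merged_labels = sorted(os.listdir(demo_target))
print(num_copied, merged_labels)
```

On Colab, the same helper could be called with the four '/content/train.X*' folders as sources and '/content/train' as the target.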

In [None]:
import pandas as pd
import os
import shutil

# Copying folders in train.X1 to train/{name of the folder}

for label in os.listdir('/content/train.X1'):
    os.makedirs(f'/content/train/{label}', exist_ok=True)  # exist_ok lets the cell be re-run
    print('dir created')
    for filename in os.listdir(f'/content/train.X1/{label}'):
        shutil.copy(f'/content/train.X1/{label}/{filename}', f'/content/train/{label}/{filename}')

In [None]:
# Copying folders in train.X2 to train/{name of the folder}

for label in os.listdir('/content/train.X2'):
    # Create only the class folder; the files are copied into it below
    os.makedirs(f'/content/train/{label}', exist_ok=True)
    print('dir created')
    for filename in os.listdir(f'/content/train.X2/{label}'):
        shutil.copy(f'/content/train.X2/{label}/{filename}', f'/content/train/{label}/{filename}')

In [None]:
# Copying folders in train.X3 to train/{name of the folder}

for label in os.listdir('/content/train.X3'):
    os.makedirs(f'/content/train/{label}', exist_ok=True)
    print('dir created')
    for filename in os.listdir(f'/content/train.X3/{label}'):
        shutil.copy(f'/content/train.X3/{label}/{filename}', f'/content/train/{label}/{filename}')

In [None]:
# Copying folders in train.X4 to train/{name of the folder}

for label in os.listdir('/content/train.X4'):
    os.makedirs(f'/content/train/{label}', exist_ok=True)
    print('dir created')
    for filename in os.listdir(f'/content/train.X4/{label}'):
        shutil.copy(f'/content/train.X4/{label}/{filename}', f'/content/train/{label}/{filename}')

In [None]:
# Copying folders in val.X to val/{name of the folder}

for label in os.listdir('/content/val.X'):
    os.makedirs(f'/content/val/{label}', exist_ok=True)
    print('dir created')
    for filename in os.listdir(f'/content/val.X/{label}'):
        shutil.copy(f'/content/val.X/{label}/{filename}', f'/content/val/{label}/{filename}')


**Training the data**


Here, the ImageDataGenerator class, which is generally used for data augmentation, loads the images in batches (no augmentation options are set here). As the next step, we generate the training data and the test data from the /content/train and /content/val directories respectively.

Parameters specified:

batch_size = 32 (the number of samples propagated through the network in each training step; one epoch is a full pass over all batches)

class_mode = "categorical" (the labels are one-hot encoded, since we have 100 classes in total)

target_size = (224, 224) (every image is resized to 224 x 224 pixels, which matches the input shape of the models used below and keeps memory and compute costs manageable)
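
To make the role of the batch size concrete, the sketch below computes how many batches make up one epoch. The sample count of 1000 is a made-up number for illustration, not the size of this dataset, and steps_per_epoch is a hypothetical helper name.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


example_samples = 1000  # hypothetical dataset size
example_steps = steps_per_epoch(example_samples, batch_size=32)
last_batch_size = example_samples - (example_steps - 1) * 32
print(example_steps, last_batch_size)
```

With 1000 samples and a batch size of 32, this gives 32 steps, of which the last one holds only 8 samples.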

In [None]:
from keras.preprocessing.image import ImageDataGenerator
trdata = ImageDataGenerator()
traindata = trdata.flow_from_directory(directory="/content/train",batch_size=32,class_mode= "categorical",target_size=(224,224))
tsdata = ImageDataGenerator()
# shuffle=False keeps the prediction order aligned with testdata.classes during evaluation
testdata = tsdata.flow_from_directory(directory="/content/val", batch_size=32, class_mode= "categorical",target_size=(224,224), shuffle=False)

Here, the dataset can be trained using 4 different models:
1. VGG16
2. Inceptionv3
3. ResNet50
4. EfficientNet

VGG16:
VGG-16 is one of the most popular pre-trained models for image classification. It was introduced for the ILSVRC 2014 challenge and developed by the Visual Geometry Group at the University of Oxford. VGG-16 outperformed the then-standard AlexNet and was quickly adopted by researchers and industry for image classification tasks.

Inception:
Inception v3 is an image recognition model that has been shown to attain greater than 78.1% accuracy on the ImageNet dataset. The model is the culmination of many ideas developed by multiple researchers over the years. It is a convolutional neural network for image analysis and object detection that got its start as a module for GoogLeNet. It is the third edition of Google's Inception convolutional neural network, originally introduced during the ImageNet Large Scale Visual Recognition Challenge.

ResNet50:
ResNet-50 is a 50-layer convolutional neural network: 49 convolutional layers and one fully connected layer, together with one max-pooling and one average-pooling layer. Residual neural networks are a type of artificial neural network (ANN) built by stacking residual blocks. ResNet50 is a powerful image classification model that can be trained on large datasets and achieved state-of-the-art results when it was introduced. One of its key innovations is the use of residual connections, which add a block's input to its output, so that the block only has to learn a residual function: the difference between the desired output and the input.
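
The idea of a residual connection can be sketched in a few lines of plain Python. This is only an illustration on a small list of numbers, not the actual ResNet50 block; residual_block, zero_branch and small_branch are hypothetical names, and the branch functions stand in for the convolutional layers of a real block.

```python
def residual_block(x, residual_fn):
    """Return residual_fn(x) + x, computed element-wise."""
    fx = residual_fn(x)
    return [f + xi for f, xi in zip(fx, x)]


def zero_branch(x):
    # A branch that has learned nothing: the block then acts as the identity
    return [0.0 for _ in x]


def small_branch(x):
    # A branch that learns a small correction on top of the input
    return [0.1 * xi for xi in x]


features = [1.0, -2.0, 3.0]
identity_out = residual_block(features, zero_branch)
adjusted_out = residual_block(features, small_branch)
print(identity_out, adjusted_out)
```

Because the input is added back, a block whose branch outputs zeros simply passes its input through, which is one reason very deep stacks of such blocks remain trainable.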

EfficientNet:
EfficientNet is a convolutional neural network architecture and scaling method that uniformly scales all dimensions of depth, width and resolution using a compound coefficient.
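
As a rough worked example of compound scaling: depth, width and resolution are multiplied by alpha^phi, beta^phi and gamma^phi for a compound coefficient phi. The constants below (alpha = 1.2, beta = 1.1, gamma = 1.15) are the values reported in the EfficientNet paper and are an assumption here, since this notebook does not state them; compound_scale is a hypothetical helper name.

```python
# Scaling constants as reported in the EfficientNet paper (assumed here)
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Multipliers for depth, width and resolution at compound coefficient phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }


base = compound_scale(0)    # phi = 0 leaves the baseline network unchanged
scaled = compound_scale(1)  # phi = 1 scales every dimension once
# Compute cost grows roughly with depth * width^2 * resolution^2
flops_factor = ALPHA * BETA ** 2 * GAMMA ** 2
print(scaled, round(flops_factor, 3))
```

The constants are chosen so that flops_factor is close to 2, meaning each increment of phi roughly doubles the compute cost.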



**Defining the model**

We will use only the base models, with changes made only to the final layers. This is because ours is a 100-class classification problem, while these models are built to handle 1000 classes.


The 'choose_model' variable can be changed to use any one of the above-mentioned architectures to train on the dataset.

In [None]:
# Pick one of: 'VGG16', 'Inceptionv3', 'ResNet50', 'efficientNet'
choose_model = 'VGG16'

if choose_model == 'VGG16':
  from tensorflow.keras.applications.vgg16 import VGG16

  base_model = VGG16(input_shape = (224, 224, 3), # Shape of our images
                     include_top = False, # Leave out the last fully connected layers
                     weights = 'imagenet')

elif choose_model == 'Inceptionv3':
  from tensorflow.keras.applications.inception_v3 import InceptionV3

  base_model = InceptionV3(input_shape = (224, 224, 3),
                          include_top = False,
                          weights = 'imagenet')

elif choose_model == 'ResNet50':
  from tensorflow.keras.applications import ResNet50

  base_model = ResNet50(input_shape = (224, 224, 3),
                        include_top = False,
                        weights = 'imagenet')

elif choose_model == 'efficientNet':
  # Uses the EfficientNetB0 bundled with tf.keras.applications
  from tensorflow.keras.applications import EfficientNetB0

  base_model = EfficientNetB0(input_shape = (224, 224, 3),
                              include_top = False,
                              weights = 'imagenet')

else:
  raise ValueError(f"Unknown model: {choose_model!r}")


Since we do not need to retrain the pre-trained layers, we make them non-trainable (frozen).

In [None]:
for layer in base_model.layers:
    layer.trainable = False


We will then build the final fully connected layers. I have just used the basic settings, but feel free to experiment with different dropout values, optimisers, activation functions and learning rates.

The Flatten layer reshapes the output of the preceding layer into a one-dimensional vector, which can then be fed into the subsequent fully connected layers.

A dense layer is a layer where each neuron is connected to every neuron in the previous layer. In other words, the output of each neuron in a dense layer is computed as a weighted sum of the outputs of all the neurons in the previous layer, followed by an activation function. Here, the dense layer has 1024 neurons/units.

The Dropout layer is another typical component of CNNs. During training it acts as a mask that nullifies the contribution of some randomly chosen neurons towards the next layer and keeps all others. Here we use a dropout rate of 0.5, so about half of the units are dropped at each training step.

Finally, the output layer has 100 neurons, one for each of the 100 classes.
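
What these layers do to the data can be sketched without any deep learning library. The example below is a simplified illustration in plain Python, not the Keras implementation: the tiny 2 x 2 x 2 feature map and the three logits are made-up values, and the dropout shown is the common "inverted" variant that rescales the kept units.

```python
import math
import random


def flatten(feature_map):
    """Flatten a nested [height][width][channels] list into one vector."""
    return [v for row in feature_map for pixel in row for v in pixel]


def dropout(vector, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in vector]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


feature_map = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]  # 2 x 2 x 2
flat = flatten(feature_map)
dropped = dropout(flat, rate=0.5, rng=random.Random(0))
probs = softmax([2.0, 1.0, 0.1])
print(flat, dropped, probs)
```

In the real model the softmax output has 100 entries instead of 3, and the predicted class is the index of the largest one.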

In [None]:
x = Flatten()(base_model.output)

# Add a fully connected layer with 1024 hidden units and ReLU activation
x = Dense(1024, activation='relu')(x)

# Add a dropout rate of 0.5
x = Dropout(0.5)(x)

# Add a final softmax layer with 100 nodes for the classification output
x = Dense(100, activation='softmax')(x)

model = tf.keras.models.Model(base_model.input, x)

model.compile(optimizer = tf.keras.optimizers.Adam(learning_rate=0.001), loss = 'categorical_crossentropy', metrics = ['acc'])

**Summarizing the model**

In [None]:
model.summary()

We will now train the final model on the training and validation sets we created earlier. Make sure the generators still point to the same directories as before.

In [None]:
from keras.callbacks import ModelCheckpoint, EarlyStopping
# Save the best model (by validation accuracy) whenever it improves
checkpoint = ModelCheckpoint("Model.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, mode='auto')
early = EarlyStopping(monitor='val_acc', min_delta=0, patience=20, verbose=1, mode='auto')
# model.fit takes the generator as its first argument
hist = model.fit(traindata, steps_per_epoch=len(traindata), validation_data=testdata, validation_steps=len(testdata), epochs=5, callbacks=[checkpoint, early])

**Evaluation Metrics**
1. Training loss
2. Training accuracy
3. Validation loss
4. Validation accuracy

These values are obtained at every epoch. They are later plotted for better visualization.


In [None]:
#Graph to check loss and accuracy

import matplotlib.pyplot as plt
plt.plot(hist.history["acc"])
plt.plot(hist.history['val_acc'])
plt.plot(hist.history['loss'])
plt.plot(hist.history['val_loss'])
plt.title("Model accuracy and loss")
plt.ylabel("Value")
plt.xlabel("Epoch")
plt.legend(["Accuracy","Validation Accuracy","Loss","Validation Loss"])
plt.show()

Precision:

Precision is one indicator of a machine learning model's performance: the quality of the positive predictions made by the model. Precision is the number of true positives divided by the total number of positive predictions (i.e., the number of true positives plus the number of false positives).


Recall:

Recall measures how often a machine learning model correctly identifies positive instances (true positives) among all the actual positive samples in the dataset. You can calculate recall by dividing the number of true positives by the number of actual positive instances (i.e., true positives plus false negatives).


F1 Score:

The F1 score, or F-measure, is the harmonic mean of the precision and recall of a classification model. The two metrics contribute equally to the score, so a model scores well on F1 only if both its precision and its recall are high.

The classification report lists these three metrics for every class.
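
As a small worked example of these three formulas, the sketch below computes them from raw counts for a single class. The counts (8 true positives, 2 false positives, 4 false negatives) are made up for illustration, and precision_recall_f1 is a hypothetical helper, not part of scikit-learn.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from counts, guarding against division by zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one class
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here precision is 8 / 10 = 0.8, recall is 8 / 12, and F1 lies between the two, closer to the lower value.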

In [None]:
# Precision, recall, f1 score
import sklearn.metrics as metrics
from sklearn.metrics import classification_report

# The prediction order only matches testdata.classes if the generator was created with shuffle=False
Y_pred = model.predict(testdata, steps=len(testdata))
val_preds = np.argmax(Y_pred, axis=1)
val_trues = testdata.classes
print(classification_report(val_trues, val_preds))

A confusion matrix is a performance evaluation tool in machine learning that summarizes the predictions of a classification model. Each entry counts how many samples of one true class were predicted as a given class, so the true positives, false positives and false negatives of every class can be read from it.
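
The structure of a confusion matrix is easy to see on a toy example. The sketch below builds one in plain Python for three classes, with rows as true classes and columns as predicted classes; the label lists are made up, and build_confusion_matrix is a hypothetical helper shown only for illustration.

```python
def build_confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples of true class t that were predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Hypothetical labels for six samples and three classes
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm_demo = build_confusion_matrix(y_true, y_pred, num_classes=3)

# Per-class counts for class 1, read off the matrix
tp_1 = cm_demo[1][1]
fp_1 = sum(cm_demo[r][1] for r in range(3)) - tp_1  # column sum minus diagonal
fn_1 = sum(cm_demo[1]) - tp_1                       # row sum minus diagonal
print(cm_demo, tp_1, fp_1, fn_1)
```

The diagonal holds the correct predictions, so a good model concentrates its counts there.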

In [None]:
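# Confusion matrix
import sklearn.metrics as metrics

# The prediction order only matches testdata.classes if the generator was created with shuffle=False
Y_pred = model.predict(testdata, steps=len(testdata))
val_preds = np.argmax(Y_pred, axis=1)
val_trues = testdata.classes
cm = metrics.confusion_matrix(val_trues, val_preds)
cm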

**Prediction using new image**

In [None]:
# Create a new file test.py and run this file
import glob
import json

import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

# Load the saved model
model = load_model("Model.h5")
img_path = '/path/to/input_image.jpg' # Input the image path
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
# The training generators feed raw pixel values, so no extra
# preprocessing (such as preprocess_input) is applied here either
preds = model.predict(x)
# Create a list containing the class labels, sorted the same way
# flow_from_directory orders the class folders
with open('Labels.json') as f:
  data = json.load(f)
class_labels = []
for class_dir in sorted(glob.glob('/content/val/*')):
  label = class_dir.split('/')[-1]
  class_labels.append(data[label])
# Find the index of the class with maximum score
pred = np.argmax(preds, axis=-1)
# Print the label of the class with maximum score
print(class_labels[pred[0]])

**SHAP Values generation**

SHAP (SHapley Additive exPlanations) values are a way to explain the output of any machine learning model. SHAP uses a game-theoretic approach in which each input feature is treated as a player, and it measures each player's contribution to the final outcome. Shapley values are a widely used concept from cooperative game theory that comes with desirable properties, for example that the contributions add up to the total outcome.
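
The game-theoretic idea can be illustrated with an exact Shapley value computation on a toy game. This is only a sketch of the underlying definition (the average marginal contribution over all player orderings), not how the shap library computes values for images; the three players and the payoff function toy_value are made up.

```python
from itertools import permutations


def shapley_values(players, value_fn):
    """Average each player's marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for player in order:
            before = value_fn(frozenset(coalition))
            coalition.add(player)
            after = value_fn(frozenset(coalition))
            totals[player] += after - before
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical payoff: 'a' adds 10, 'b' adds 20, together they earn a bonus of 6."""
    score = 0.0
    if "a" in coalition:
        score += 10
    if "b" in coalition:
        score += 20
    if "a" in coalition and "b" in coalition:
        score += 6
    return score


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
```

The bonus of 6 is split equally between 'a' and 'b', player 'c' contributes nothing, and the three values add up to the payoff of the full coalition.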

In [None]:
import glob
import json

import shap
import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.models import load_model


# Class labels, sorted the same way flow_from_directory orders the class folders
with open('Labels.json') as f:
  data = json.load(f)
class_labels = []
for class_dir in sorted(glob.glob('/content/val/*')):
  label = class_dir.split('/')[-1]
  class_labels.append(data[label])

# Load the trained model
model = load_model('Model.h5')

# Define a function to preprocess the images
def preprocess_image(image_path):
    # Resize to the 224 x 224 input size the model was trained with
    img = load_img(image_path, target_size=(224, 224))
    x = img_to_array(img)
    x = np.expand_dims(x, axis=0)
    # Raw pixel values are kept, matching the training generators
    return x

# Single image implementation
image_path = '/content/val/n01440764/ILSVRC2012_val_00000293.JPEG'
image = preprocess_image(image_path)

predictions = model.predict(image)
predicted_class_index = np.argmax(predictions)
confidence = predictions[0, predicted_class_index]
predicted_class_name = class_labels[predicted_class_index]
print(f"Prediction for {image_path}: {predicted_class_name} ({confidence * 100:.2f}%)")


# Create a masker object for the PartitionExplainer
masker = shap.maskers.Image("inpaint_telea", image[0].shape)

# Compute SHAP values using PartitionExplainer
explainer = shap.Explainer(model, masker,output_names=class_labels)
shap_values = explainer(
    image, max_evals=500, batch_size=50, outputs=shap.Explanation.argsort.flip[:8]
)
shap.image_plot(shap_values)

# Complete folder implementation

# folder_path = 'path/to/folder'
# for i in os.listdir(folder_path):
#   image_path = os.path.join(folder_path, i)
#   image = preprocess_image(image_path)

#   predictions = model.predict(image)
#   predicted_class_index = np.argmax(predictions)
#   confidence = predictions[0, predicted_class_index]
#   predicted_class_name = class_labels[predicted_class_index]
#   print(f"Prediction for {image_path}: {predicted_class_name} ({confidence * 100:.2f}%)")


#   # Create a masker object for the PartitionExplainer
#   masker = shap.maskers.Image("inpaint_telea", image[0].shape)

#   # Compute SHAP values using PartitionExplainer
#   explainer = shap.Explainer(model, masker,output_names=class_labels)
#   shap_values = explainer(
#       image, max_evals=500, batch_size=50, outputs=shap.Explanation.argsort.flip[:4]
#   )
#   shap.image_plot(shap_values)





**Post Model Quantization**

Model quantization is vital for deploying large AI models on resource-constrained devices. Storing weights at lower precision, such as 8-bit or 16-bit values instead of 32-bit floats, reduces model size and improves efficiency, usually at a small cost in accuracy.
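
What 8-bit quantization does to a set of weights can be sketched in plain Python. This is a simplified affine (scale and zero-point) scheme for illustration only and is not the exact procedure used by TensorFlow Lite; the weight values and the helper names quantize and dequantize are made up.

```python
def quantize(values, num_bits=8):
    """Map floats to integers in [0, 2**num_bits - 1] using a scale and zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for constant input
    zero_point = round(qmin - lo / scale)
    quantized = [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the integers."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical float32 weights
q_weights, scale, zero_point = quantize(weights)
restored = dequantize(q_weights, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q_weights, scale, zero_point, max_error)
```

Each weight now fits in one byte instead of four, and the reconstruction error stays on the order of the quantization step (scale).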

In [None]:
import tensorflow as tf

# Load the Keras model
model = tf.keras.models.load_model('Model.h5') # the checkpoint saved during training

# Initialize the TFLiteConverter with the Keras model
converter = tf.lite.TFLiteConverter.from_keras_model(model)

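# Enable post-training quantization; without this the model is only converted, not quantized
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Convert the model to TensorFlow Lite format
tflite_model = converter.convert()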

# Save the TensorFlow Lite model to a file
with open('model.tflite', 'wb') as f:
    f.write(tflite_model) # model.tflite will be saved in your working directory

print('completed')