# Lego Minifigure Classification - Transfer Learning using MobileNetV2

![Lego](https://i.pinimg.com/originals/35/b7/0e/35b70e342fc3ad6e4d78f46febc05ebb.jpg)



## We'll go through the following steps, from loading and preparing our data to training our model and finally validating our results.

1. Importing Libraries

2. Loading Dataset - Exploratory Data Analysis

3. Data Preprocessing

4. Loading Base Model for Transfer Learning 

5. Compiling Model

6. Training Model

7. Visualizing Model Accuracy and Loss

8. Validation

## 1. Importing Libraries 

In [None]:
import os
import cv2
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dropout, Dense
from tensorflow.keras.applications import MobileNetV2

## 2. Loading Dataset

In [None]:
#Displaying directory files
os.listdir('/kaggle/input/lego-minifigures-classification/')

In [None]:
#Initializing variable with path to dataset
dataPath = '../input/lego-minifigures-classification/'

### Visualizing images using OpenCV and Matplotlib

In [None]:
#Reading image -> Resizing to (512x512) -> Converting to RGB -> Normalizing pixel values

image = cv2.imread('../input/lego-minifigures-classification/harry-potter/0002/009.jpg')
image = cv2.resize(image, (512,512))
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) / 255.0

plt.imshow(image)

In [None]:
#Reading index.csv dataframe
index_df = pd.read_csv(dataPath + 'index.csv')
index_df

In [None]:
#Reading metadata.csv dataframe
meta_df = pd.read_csv(dataPath + 'metadata.csv')
meta_df

In [None]:
#Merging index and metadata on class_id feature
data_df = pd.merge(index_df, meta_df[['class_id', 'minifigure_name']], on='class_id')
data_df

In [None]:
#Displaying overall information
data_df.info()

In [None]:
#Looking for missing data in dataframe
print("Missing Data:",data_df.isnull().any().any())
data_df.isnull().sum()

In [None]:
#Keeping count of number of each minifigure
labels = data_df['minifigure_name'].unique()
count = data_df['minifigure_name'].value_counts()

count

### Exploratory Data Analysis

In [None]:
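#Visualizing quantity of each minifigure in dataset
plt.figure(figsize=(12,10))
#Using count's own index keeps every bar aligned with its label
#(unique() and value_counts() return the names in different orders)
sns.barplot(x=count.index, y=count.values, palette="rocket")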

plt.xticks(rotation= 90)
plt.xlabel('Labels')
plt.ylabel('Count')
plt.title('Dataset Analysis')
plt.show()

In [None]:
#Splitting and creating a training and validation dataframe 
training = data_df[data_df["train-valid"] == 'train']
validation = data_df[data_df["train-valid"] == 'valid']

In [None]:
#Viewing our prepared dataframes for training and validation
training, validation

In [None]:
#Evaluating total number of classes
CLASSES = len(data_df['class_id'].unique())
CLASSES

## 3. Data Preprocessing

**We'll be preparing our image data for input to the Neural Network model.**

* Creating numpy arrays for the training and validation data and their labels
* Reading each image from the training and validation dataframes using the 'path' feature
* Converting the image format from BGR to RGB
* Resizing each image to (512 x 512)
* Normalizing the pixel values of each image to the range [0, 1]
* Storing the images and labels in the train and valid numpy arrays
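
The colour conversion and normalization steps above can be illustrated without any image files. The sketch below is only an illustration: the tiny synthetic array is an assumption standing in for what `cv2.imread` returns, and plain NumPy replaces the OpenCV call. For a 3-channel image, reversing the last axis gives the same result as `cv2.COLOR_BGR2RGB`, and dividing by 255 maps `uint8` values into [0, 1].

```python
import numpy as np

# Hypothetical 2x2 image in OpenCV's BGR channel order (uint8, like cv2.imread output)
bgr = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [10, 20, 30]]],
    dtype=np.uint8,
)

# Reversing the channel axis swaps B and R, which is what BGR -> RGB does
rgb = bgr[..., ::-1]

# Scale 8-bit intensities into the [0, 1] range as floats
normalized = rgb / 255.0

print(rgb[0, 0])  # the pure-blue BGR pixel, now in RGB order
print(normalized.min(), normalized.max())
```

Resizing is left out here because it needs an interpolation routine such as `cv2.resize`, which the notebook cells already use.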

In [None]:
#Training Data Preprocessing

trainData = np.zeros((training.shape[0], 512, 512, 3))

for i in range(training.shape[0]):
    
    image = cv2.imread('../input/lego-minifigures-classification/' + training["path"].values[i])
    
    #Converting BGR to RGB 
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    
    #Resizing image to (512 x 512)
    image = cv2.resize(image, (512,512))
    
    #Normalizing pixel values to [0,1]
    trainData[i] = image / 255.0

#Shifting class_id down by 1 so labels start at 0, as sparse categorical crossentropy expects
trainLabel = np.array(training["class_id"])-1

In [None]:
#Validation Data Preprocessing

validData = np.zeros((validation.shape[0], 512, 512, 3))

for i in range(validation.shape[0]):
    
    image = cv2.imread('../input/lego-minifigures-classification/' + validation["path"].values[i])
    
    #Converting BGR to RGB 
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    
    #Resizing image to (512 x 512)
    image = cv2.resize(image, (512,512))
    
    #Normalizing pixel values to [0,1]
    validData[i] = image / 255.0

#Shifting class_id down by 1 so labels start at 0, as sparse categorical crossentropy expects
validLabel = np.array(validation["class_id"])-1

In [None]:
#Viewing our prepared numpy arrays of data and labels
trainData, trainLabel

## 4. Loading Base Model for Transfer Learning

* We will be taking advantage of Transfer Learning provided under the [tensorflow.keras.applications library](https://keras.io/guides/transfer_learning)
* Using pretrained model weights of [MobileNetV2](https://keras.io/api/applications/mobilenet)
* Loading MobileNetV2 as 'base_model'
* Taking the output of MobileNetV2's second-to-last layer (global average pooling), which discards the original 1000-class ImageNet classifier
* Adding a Dropout layer with rate 0.5 - 50% of its inputs are randomly dropped during training to reduce overfitting
* Adding a Dense layer with CLASSES units and a 'softmax' activation
* Creating the Model from the base model's inputs and the new outputs
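
To make the new classification head less abstract, here is a minimal NumPy sketch of what a Dropout layer followed by a softmax Dense layer computes on pooled features. It is not the Keras implementation: the class count, batch size, and random weights are assumptions chosen for illustration, and 1280 is the pooled feature size of MobileNetV2 at its default width. In Keras, dropout is only active during training.

```python
import numpy as np

rng = np.random.default_rng(0)

FEATURES = 1280   # size of MobileNetV2's pooled feature vector (default width)
NUM_CLASSES = 5   # hypothetical class count, stands in for CLASSES
BATCH = 4         # hypothetical batch size

# Stand-in for the pooled features coming out of the base model
features = rng.normal(size=(BATCH, FEATURES))

def dropout(x, rate, rng):
    """Inverted dropout: zero a random fraction `rate` of units, rescale the rest."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Randomly initialised dense layer weights (training would learn these)
W = rng.normal(scale=0.01, size=(FEATURES, NUM_CLASSES))
b = np.zeros(NUM_CLASSES)

probs = softmax(dropout(features, 0.5, rng) @ W + b)
print(probs.shape)        # (4, 5)
print(probs.sum(axis=1))  # one probability distribution per image
```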

In [None]:
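#Loading Base Model with pretrained ImageNet weights
#Note: MobileNetV2() defaults to a 224x224x3 input, so recent TensorFlow versions may reject the 512x512 arrays prepared above
base_model = MobileNetV2()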

#Adding Dropout layer
x = Dropout(0.5)(base_model.layers[-2].output)

#Adding Dense layer
outputs = Dense(CLASSES, activation='softmax')(x)

#Creating model
model = Model(base_model.inputs, outputs)

In [None]:
#Displaying model summary
model.summary()

## 5. Compiling Model using Adam Optimizer

* Learning Rate - 0.0001
* Loss Function - Sparse Categorical Crossentropy
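
As a rough illustration of the loss named above, the sketch below computes sparse categorical crossentropy by hand with NumPy. The probabilities, labels, and the clipping constant `eps` are assumptions made for this example, not values from the notebook. It also shows why the labels must be zero-based integers: each label is used directly as a column index into the predicted probabilities.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """Mean negative log-probability assigned to the true class of each sample."""
    probs = np.clip(probs, eps, 1.0)                 # avoid log(0)
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    return -np.log(picked).mean()

# Hypothetical softmax outputs for two images over three classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])

# Zero-based integer labels (no one-hot encoding needed)
labels = np.array([0, 1])

loss = sparse_categorical_crossentropy(labels, probs)
print(loss)
```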

In [None]:
model.compile(
    optimizer=Adam(learning_rate=0.0001),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

## 6. Training Model - 40 Epochs

In [None]:
#Training model - 40 epochs
hist = model.fit(
    trainData, trainLabel,
    epochs=40,
    validation_data=(validData, validLabel),
    shuffle=True,
    batch_size=4
)
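
For a sense of scale, the small calculation below shows how many weight updates these settings imply. The number of training images is a hypothetical value picked for the example (the notebook uses `training.shape[0]`); the batch size and epoch count match the call above.

```python
import math

num_train_images = 150  # hypothetical; the notebook uses training.shape[0]
batch_size = 4
epochs = 40

# The last batch of an epoch may be smaller, hence the ceiling
steps_per_epoch = math.ceil(num_train_images / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```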

## 7. Visualizing Model Performance

* Plot 1 - Visualizing based on Model Loss 
* Plot 2 - Visualizing based on Model Accuracy

In [None]:
plt.figure(figsize=(16, 6))

#Plot 1 - Loss per epoch
plt.subplot(1, 2, 1)
plt.plot(hist.history['loss'], label='train loss')
plt.plot(hist.history['val_loss'], label='valid loss')
plt.xlabel('Epoch')
plt.title('Model Loss')
plt.grid()
plt.legend()

#Plot 2 - Accuracy per epoch
plt.subplot(1, 2, 2)
plt.plot(hist.history['accuracy'], label='train acc')
plt.plot(hist.history['val_accuracy'], label='valid acc')
plt.xlabel('Epoch')
plt.title('Model Accuracy')
plt.grid()
plt.legend()
plt.show()

## 8. Validation

* Loading test image from dataset
* Preprocessing data - Resizing, Converting Format and Normalizing for input to trained model
* Displaying test image 

In [None]:
testImage = cv2.imread('../input/lego-minifigures-classification/marvel/0001/001.jpg')
testImage = cv2.resize(testImage, (512,512))
testImage = cv2.cvtColor(testImage, cv2.COLOR_BGR2RGB) / 255.0

plt.imshow(testImage)

* Reshaping the image array to add a batch dimension
* Predicting on the data
* Displaying the minifigure name based on the prediction
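
The mapping from model output back to a name can be sketched in isolation. The probabilities and the name lookup below are made up for illustration, and the sketch assumes `class_id` starts at 1, which is what the `-1` shift applied to the labels during preprocessing implies.

```python
import numpy as np

# Hypothetical softmax output for a single image over four classes
prediction = np.array([[0.05, 0.10, 0.80, 0.05]])

# Hypothetical class_id -> name lookup (assumed to be 1-based)
names = {1: "FIGURE A", 2: "FIGURE B", 3: "FIGURE C", 4: "FIGURE D"}

# argmax gives the zero-based index of the most probable class
index = int(prediction.argmax())

# Undo the -1 shift applied to the labels before training
class_id = index + 1

print(class_id, names[class_id])
```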

In [None]:
testImage = np.reshape(testImage, (1, 512, 512, 3))

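#Index of the most probable class (zero-based)
predictedClass = model.predict(testImage).argmax()
#Shifting back by 1 to match the class_id values in metadata.csv
predictedClass = predictedClass + 1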

figureName = meta_df['minifigure_name'][meta_df['class_id'] == predictedClass].iloc[0]

print(figureName)