Task 3: Model Building


The following are popular and effective models for car brand identification:

1. Convolutional Neural Networks (CNNs):

CNNs have been the go-to choice for image classification tasks, including car brand identification. You can use architectures like VGG, ResNet, or MobileNet as a base and fine-tune them for your specific dataset.

2. Transfer Learning with Inception or ResNet:

Leveraging pre-trained Inception or ResNet models and fine-tuning them on your car brand dataset can often yield excellent results. Transfer learning helps save time and resources.

3. Support Vector Machines (SVM):

SVMs are powerful classifiers that can work well with appropriate feature extraction methods like HOG or SIFT. They are known for their ability to handle complex decision boundaries.

4. Random Forest:

Random Forest is an ensemble learning technique that can handle high-dimensional data well. It's robust and can be effective for car brand identification, especially when combined with appropriate feature engineering.

5. Bag-of-Words (BoW) with K-Nearest Neighbors (KNN):

This combination of BoW for feature extraction and KNN for classification can work surprisingly well for car brand identification. It's simple and interpretable, making it a good choice for some scenarios.

For this task, I used transfer learning. The approach is explained in detail below:

Transfer learning is a machine learning technique where a model trained on one task is adapted for use on a second, related task. In the context of car brand identification or any image classification task, transfer learning has become a popular and effective approach. Here's an overview of transfer learning models for image classification:

**1. Pre-trained Convolutional Neural Networks (CNNs):**
   - CNNs have revolutionized image classification tasks. Pre-trained CNN models, such as VGG, ResNet, Inception, and MobileNet, have been trained on massive image datasets like ImageNet, which contain a vast variety of objects, including cars. These models have learned to extract hierarchical and generic features from images, making them excellent feature extractors.

   - Transfer learning using pre-trained CNNs involves taking the layers of a pre-trained network and fine-tuning them for a specific task, such as car brand identification. The early layers capture low-level features like edges and textures, while deeper layers capture more abstract and high-level features. You can retrain the last few layers or some intermediate layers to adapt the model to your dataset.

   - Benefits of using pre-trained CNNs include reduced training time and the ability to leverage the rich features learned from a large and diverse dataset like ImageNet.

**2. Feature Extraction with CNNs:**
   - Another transfer learning approach is to use a pre-trained CNN as a fixed feature extractor. You can remove the top classification layers and use the output of the last convolutional layer as feature vectors. These feature vectors can then be fed into a separate classifier (e.g., SVM, Random Forest) for car brand identification.

   - This approach is particularly useful when you have limited labeled data for your specific task. It allows you to benefit from the generalization capabilities of pre-trained CNNs while training a simple classifier on top of the extracted features.
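As a rough illustration of training a simple classifier on top of fixed features, here is a minimal sketch. It swaps in a nearest-centroid classifier as a stand-in for the SVM or Random Forest mentioned above, and the 3-dimensional vectors are made-up placeholders for real CNN features (which would have hundreds of dimensions). The function names are hypothetical.

```python
from math import dist

def fit_centroids(features, labels):
    """Average the feature vectors of each class into one centroid."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))

# Made-up stand-ins for features extracted by a frozen CNN.
train_features = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1],
                  [0.1, 0.9, 0.2], [0.0, 0.8, 0.3]]
train_labels = ["Ford", "Ford", "Toyota", "Toyota"]

centroids = fit_centroids(train_features, train_labels)
print(predict(centroids, [0.7, 0.3, 0.0]))
```

Only the small classifier is fitted here; the feature extractor itself is never updated, which is what makes this approach workable with little labeled data.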

**3. Fine-tuning Pre-trained Models:**
   - Fine-tuning involves training the entire pre-trained model on your target dataset. You start with the pre-trained weights and then update them during training on your dataset. This can be beneficial when your target task is similar to the original task the model was trained on.

   - For car brand identification, you can fine-tune a pre-trained model by replacing the output layer with a new set of output units corresponding to the car brands in your dataset. The rest of the model's layers can be updated during training.
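The choice of which layers to update can be sketched as a small helper. This is an illustrative sketch, not the code used later in this notebook: `unfreeze_from` is a hypothetical name, and the toy objects stand in for a real model so the snippet runs without Keras. It relies only on each layer having a `name` and a `trainable` attribute, which Keras layers do.

```python
from types import SimpleNamespace

def unfreeze_from(model, first_trainable_name):
    """Freeze all layers before `first_trainable_name` and unfreeze the rest.

    Works on any object exposing `.layers`, where each layer has
    `.name` and `.trainable` attributes.
    """
    trainable = False
    for layer in model.layers:
        if layer.name == first_trainable_name:
            trainable = True
        layer.trainable = trainable
    return [layer.name for layer in model.layers if layer.trainable]

# Stand-in for the last layers of a VGG16 base (names follow the Keras VGG16).
layer_names = ["block4_conv3", "block4_pool",
               "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool"]
toy_model = SimpleNamespace(
    layers=[SimpleNamespace(name=n, trainable=False) for n in layer_names])

print(unfreeze_from(toy_model, "block5_conv1"))
```

With a real Keras model, the model has to be compiled again after changing `trainable` for the change to take effect. A lower learning rate is commonly used for the unfrozen layers so the pre-trained weights are not overwritten too quickly.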

**4. Domain Adaptation:**
   - If your target domain (e.g., images of cars in real-world conditions) is substantially different from the source domain (e.g., ImageNet), domain adaptation techniques can be applied. These techniques aim to reduce the domain gap between the source and target data to improve model performance.

Transfer learning with pre-trained models has proven to be highly effective in various image classification tasks, including car brand identification. It allows you to leverage the knowledge encoded in large, publicly available datasets and adapt it to your specific problem, often resulting in better performance with less data and computation compared to training from scratch. However, selecting the right pre-trained model and fine-tuning strategy depends on the nature of your dataset and the specific requirements of your task.

The transfer learning model I chose is VGG. It is explained in detail below:

The VGG (Visual Geometry Group) model is a deep convolutional neural network architecture designed for image classification. It was developed by the Visual Geometry Group at the University of Oxford and took first place in the localization task and second place in the classification task of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2014. The VGG architecture is known for its simplicity and effectiveness, making it a popular choice for image classification tasks.

Its main characteristics are:

**1. Architecture:**
   - The VGG model consists of multiple layers, primarily using 3x3 convolutional filters, with max-pooling layers interspersed. The key idea behind VGG is to use very small 3x3 convolutional filters repeatedly, which allows the network to learn complex features while keeping the number of parameters manageable.
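The claim that stacked 3x3 filters keep the parameter count manageable can be checked with a little arithmetic. The sketch below compares stacks of 3x3 convolutions (stride 1) with a single larger filter covering the same receptive field. It assumes the same number of input and output channels in every layer and ignores biases; the function names are hypothetical.

```python
def stacked_3x3(num_layers, channels):
    """Receptive field and weight count of stacked 3x3 convolutions (stride 1)."""
    receptive_field = 1 + 2 * num_layers
    weights = num_layers * 3 * 3 * channels * channels
    return receptive_field, weights

def single_conv(kernel, channels):
    """Receptive field and weight count of one kernel x kernel convolution."""
    return kernel, kernel * kernel * channels * channels

channels = 64
print(stacked_3x3(2, channels), single_conv(5, channels))  # (5, 73728) (5, 102400)
print(stacked_3x3(3, channels), single_conv(7, channels))  # (7, 110592) (7, 200704)
```

Two stacked 3x3 layers see the same 5x5 region as one 5x5 filter with fewer weights, and the stack also applies a non-linearity between the layers.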

**2. Layer Configurations:**
   - There are several variations of the VGG architecture, denoted as VGG11, VGG13, VGG16, and VGG19, depending on the depth of the network. The numbers in the names represent the total number of weight layers (including both convolutional and fully connected layers).

   - For example, VGG16 consists of 16 weight layers: 13 convolutional layers followed by 3 fully connected layers.
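The naming can be verified by counting layers. In the sketch below, each list gives the number of convolutional layers in the five blocks of a variant, as I recall them from the original VGG configurations; treat the exact per-block split as an assumption to check against the paper.

```python
# Number of convolutional layers in each of the five blocks.
VGG_CONV_LAYERS_PER_BLOCK = {
    "VGG11": [1, 1, 2, 2, 2],
    "VGG13": [2, 2, 2, 2, 2],
    "VGG16": [2, 2, 3, 3, 3],
    "VGG19": [2, 2, 4, 4, 4],
}
FC_LAYERS = 3  # every variant ends with three fully connected layers

weight_layers = {}
for name, convs_per_block in VGG_CONV_LAYERS_PER_BLOCK.items():
    weight_layers[name] = sum(convs_per_block) + FC_LAYERS
    print(name, sum(convs_per_block), "conv +", FC_LAYERS, "FC =", weight_layers[name])
```

Pooling layers have no weights, which is why they are not counted in the name.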

**3. Convolutional Blocks:**
   - Each convolutional block in VGG typically consists of two or more 3x3 convolutional layers followed by a max-pooling layer. The convolutional layers are usually followed by Rectified Linear Unit (ReLU) activation functions to introduce non-linearity.

   - After a series of convolutional blocks, the spatial dimensions of the feature maps decrease, while the depth increases.

**4. Max-Pooling:**
   - Max-pooling layers are used to reduce the spatial dimensions of the feature maps while retaining the most important information. In VGG, max-pooling is typically performed using 2x2 windows with a stride of 2.
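To make the pooling step concrete, here is a small pure-Python sketch of 2x2 max-pooling with a stride of 2 on a toy 4x4 feature map, followed by the spatial sizes a 224x224 input passes through after each of the five pooling layers. The helper name and the toy values are illustrative.

```python
def max_pool_2x2(grid):
    """2x2 max pooling with stride 2 on a 2-D list with even height and width."""
    return [
        [max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
         for c in range(0, len(grid[0]), 2)]
        for r in range(0, len(grid), 2)
    ]

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 4, 8],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 5], [6, 8]]

# Each pooling layer halves the height and width.
size, sizes = 224, [224]
for _ in range(5):
    size //= 2
    sizes.append(size)
print(sizes)  # [224, 112, 56, 28, 14, 7]
```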

**5. Fully Connected Layers:**
   - After the convolutional layers, VGG includes fully connected layers for final classification. The last fully connected layer is followed by a softmax activation, which turns its outputs into class probabilities.
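For reference, softmax can be written in a few lines. This is a generic sketch with made-up input scores, not code taken from VGG.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # made-up scores for three classes
print(probs, sum(probs))
```

The largest score always receives the largest probability, so taking the argmax of the probabilities gives the predicted class.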

**6. Number of Parameters:**
   - One of the characteristics of VGG is its simplicity in terms of architectural design. However, this simplicity comes at the cost of a large number of parameters, especially in deeper versions like VGG16 and VGG19. This can make training and deploying these models computationally expensive.
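The parameter count of VGG16 can be reproduced by hand. The sketch below assumes the standard VGG16 layout (3x3 convolutions with the channel widths listed, a 7x7x512 feature map feeding two 4096-unit layers and a 1000-class ImageNet output).

```python
# Output channels of the 13 convolutional layers in VGG16.
CONV_CHANNELS = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]

def conv_params(in_ch, out_ch, kernel=3):
    """Weights plus biases of one convolutional layer."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

conv_total, in_ch = 0, 3  # the input image has 3 channels
for out_ch in CONV_CHANNELS:
    conv_total += conv_params(in_ch, out_ch)
    in_ch = out_ch

fc_total = (dense_params(7 * 7 * 512, 4096)
            + dense_params(4096, 4096)
            + dense_params(4096, 1000))

print(conv_total)             # 14714688
print(fc_total)               # 123642856
print(conv_total + fc_total)  # 138357544
```

Most of the parameters sit in the fully connected layers, which is one reason the convolutional base is often reused without them.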

**7. Pre-trained Models:**
   - Due to its effectiveness, VGG models pre-trained on large image datasets like ImageNet are often used as the starting point for various computer vision tasks. Researchers and practitioners fine-tune these pre-trained models on specific datasets for tasks like object detection, image segmentation, and car brand identification.

**8. Limitations:**
   - While VGG was groundbreaking when it was introduced, newer architectures like ResNet and Inception have since surpassed it in terms of accuracy and efficiency. VGG's depth and parameter count can make it less suitable for real-time applications or resource-constrained environments.

In summary, the VGG model is a deep convolutional neural network architecture known for its simplicity and effectiveness in image classification tasks. It uses small 3x3 convolutional filters repeatedly to learn features from images and has different variations depending on the depth of the network. Pre-trained VGG models have been widely used as a starting point for various computer vision tasks, making them valuable tools in the field of deep learning.


Dataset Link: https://drive.google.com/drive/folders/18zhtVKfTju_rz7XnGe_QRpASijOdBqBh?usp=sharing


In [None]:
from google.colab import drive

In [None]:
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import os

drive_root = '/content/drive/My Drive'
os.listdir(drive_root)

['file1.pdf',
 'WhatsApp Image 2023-01-16 at 12.55.13 (1).jpeg',
 'WhatsApp Image 2023-01-16 at 12.55.13 (2).jpeg',
 'cir_new.pdf',
 'WhatsApp Image 2023-01-16 at 12.55.13.jpeg',
 'WhatsApp Image 2023-01-16 at 12.55.13 (3).jpeg',
 'DAESI_interview.pdf',
 'resume_siva_compressed.pdf',
 'Document from Kumar Singh Sharma (1).pdf',
 'Document from Kumar Singh Sharma (1)',
 'Photo from Kumar Singh Sharma (1)',
 'Photo from Kumar Singh Sharma',
 'Document from Kumar Singh Sharma.pdf',
 'Document from Kumar Singh Sharma',
 'kimaqr bolld (1).docx',
 'WhatsApp Image 2023-07-16 at 22.10.38.jpeg',
 'CERTIFICATES (1)-1.pdf',
 'WhatsApp Image 2023-07-16 at 22.10.38 (1).jpeg',
 'sraco.jpeg',
 'preprocessed_dataset.zip',
 'my_dataset',
 'siva.gslides',
 'sivakumar_resume.pdf',
 'sivakumar_resume.gdoc',
 'Colab Notebooks',
 'datasets']

In [None]:
from keras.applications import VGG16

In [None]:
img_rows, img_cols = 224, 224

In [None]:
model = VGG16(weights = 'imagenet',
                 include_top = False,
                 input_shape = (img_rows, img_cols, 3))

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5


In [None]:
# Layers are trainable by default; freeze the pre-trained base
# so that only the new head added below is trained
for layer in model.layers:
    layer.trainable = False

# Print each layer's index, type, and trainable flag
for i, layer in enumerate(model.layers):
    print(f"{i} {layer.__class__.__name__} {layer.trainable}")


0 InputLayer False
1 Conv2D False
2 Conv2D False
3 MaxPooling2D False
4 Conv2D False
5 Conv2D False
6 MaxPooling2D False
7 Conv2D False
8 Conv2D False
9 Conv2D False
10 MaxPooling2D False
11 Conv2D False
12 Conv2D False
13 Conv2D False
14 MaxPooling2D False
15 Conv2D False
16 Conv2D False
17 Conv2D False
18 MaxPooling2D False


In [None]:

def add_layer(bottom_model, num_classes):
    """Create the top (head) of the model that will be
    placed on top of the frozen base layers."""

    # GlobalAveragePooling2D and Dense are imported in the next cell,
    # which must run before this function is called.
    top_model = bottom_model.output
    top_model = GlobalAveragePooling2D()(top_model)
    top_model = Dense(1024, activation='relu')(top_model)
    top_model = Dense(512, activation='relu')(top_model)
    top_model = Dense(num_classes, activation='softmax')(top_model)
    return top_model


In [None]:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten, GlobalAveragePooling2D
from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D
from keras.models import Model


In [None]:
num_classes = 9

In [None]:
FC_Head = add_layer(model, num_classes)

modelnew = Model(inputs = model.input, outputs = FC_Head)

modelnew.summary()  # summary() prints the table itself and returns None

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 56, 56, 128)       0     
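
The summary above is cut off before the head added by `add_layer`. As a rough cross-check, the size of that head can be worked out by hand. The sketch below assumes the global average pooling layer outputs 512 features (the channel count of the last VGG16 block) and hard-codes the parameter count of the VGG16 convolutional base; the variable names are illustrative.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

gap_features = 512   # assumed output size of GlobalAveragePooling2D
num_classes = 9

head_params = (dense_params(gap_features, 1024)
               + dense_params(1024, 512)
               + dense_params(512, num_classes))

base_params = 14_714_688  # VGG16 convolutional base, frozen in this notebook

print(head_params)                # 1054729
print(base_params + head_params)  # 15769417
```

With the base frozen, only the head's parameters would be updated during training.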

In [None]:
from keras.preprocessing.image import ImageDataGenerator

In [None]:
train_data_dir = '/content/drive/MyDrive/datasets/train'
validation_data_dir = '/content/drive/MyDrive/datasets/validation'

In [None]:
# Let's use some data augmentation
train_datagen = ImageDataGenerator(
      rescale=1./255,
      rotation_range=45,
      width_shift_range=0.3,
      height_shift_range=0.3,
      horizontal_flip=True,
      fill_mode='nearest')

In [None]:
validation_datagen = ImageDataGenerator(rescale=1./255)

In [None]:
batch_size = 32

In [None]:
train_generator = train_datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_rows, img_cols),
        batch_size=batch_size,
        class_mode='categorical')

Found 3040 images belonging to 9 classes.


In [None]:
validation_generator = validation_datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_rows, img_cols),
        batch_size=batch_size,
        class_mode='categorical')

Found 15151 images belonging to 9 classes.


In [None]:
from keras.optimizers import RMSprop
from keras.callbacks import ModelCheckpoint, EarlyStopping

In [None]:

checkpoint = ModelCheckpoint("/root/face_vgg16.h5",
                             monitor="val_loss",
                             mode="min",
                             save_best_only = True,
                             verbose=1)

In [None]:
earlystop = EarlyStopping(monitor = 'val_loss',
                          min_delta = 0,
                          patience = 3,
                          verbose = 1,
                          restore_best_weights = True)

In [None]:
callbacks = [earlystop, checkpoint]

In [None]:
modelnew.compile(loss = 'categorical_crossentropy',
              optimizer = RMSprop(learning_rate = 0.001),
              metrics = ['accuracy'])

In [None]:
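# Samples used to size each epoch. The generators above found 3040 training
# and 15151 validation images, so these values cover only part of the data.
nb_train_samples = 300
nb_validation_samples = 50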

In [None]:
epochs = 5
batch_size = 32

In [None]:
history = modelnew.fit(
    train_generator,
    steps_per_epoch = nb_train_samples // batch_size,
    epochs = epochs,
    callbacks = callbacks,
    validation_data = validation_generator,
    validation_steps = nb_validation_samples // batch_size )


Epoch 1/5




Epoch 1: val_loss improved from inf to 4.02405, saving model to /root/face_vgg16.h5
Epoch 2/5
Epoch 2: val_loss improved from 4.02405 to 3.35139, saving model to /root/face_vgg16.h5
Epoch 3/5
Epoch 3: val_loss did not improve from 3.35139
Epoch 4/5
Epoch 4: val_loss did not improve from 3.35139
Epoch 5/5

Epoch 5: val_loss did not improve from 3.35139
Epoch 5: early stopping
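
The log above stops at epoch 5 because of the `patience = 3` setting. The sketch below is a simplified re-implementation of that stopping rule, not the Keras source. The first two loss values come from the log; the values for epochs 3 to 5 are hypothetical placeholders, since the log only says they did not improve.

```python
def early_stop_epoch(val_losses, patience, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    stop_epoch is None if training would run through every epoch.
    """
    best, best_epoch, wait = float("inf"), None, 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Epochs 1-2 are from the log; epochs 3-5 are made-up non-improving values.
val_losses = [4.02405, 3.35139, 3.60, 3.50, 3.40]
print(early_stop_epoch(val_losses, patience=3))  # (5, 2)
```

Because `restore_best_weights = True`, the model ends up with the weights from the best epoch (epoch 2 here) rather than those from the last epoch.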


In [None]:
modelnew.save("/root/face_vgg16.h5")

In [None]:
# Validation accuracy recorded for the last epoch that ran
final_accuracy = history.history["val_accuracy"][-1]
print(final_accuracy)

0.0625
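
The value printed above is easier to interpret once the step counts are worked out. The sketch below uses the image counts reported by the generators and the sample counts set before training. It suggests that validation ran on a single batch of 32 images, so 0.0625 is consistent with 2 correct predictions out of 32. That reading is an inference from the numbers, not something the log states directly.

```python
import math

batch_size = 32
train_images, val_images = 3040, 15151           # reported by the generators
nb_train_samples, nb_validation_samples = 300, 50

# Steps actually used per epoch versus steps needed to cover all images.
steps_used = (nb_train_samples // batch_size, nb_validation_samples // batch_size)
steps_full = (math.ceil(train_images / batch_size), math.ceil(val_images / batch_size))

val_images_seen = steps_used[1] * batch_size
print(steps_used)           # (9, 1)
print(steps_full)           # (95, 474)
print(val_images_seen)      # 32
print(2 / val_images_seen)  # 0.0625
print(1 / 9)                # chance level for 9 balanced classes
```

Setting `steps_per_epoch` and `validation_steps` to the full step counts, and training for more epochs, would likely give a more reliable accuracy estimate.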


In [None]:
from keras.preprocessing.image import ImageDataGenerator

# Define the directory where your dataset is located
data_dir = '/content/drive/MyDrive/datasets/train'  # Replace with the path to your dataset

# Create an ImageDataGenerator for the dataset
datagen = ImageDataGenerator(rescale=1./255)

# Create a generator for the dataset
generator = datagen.flow_from_directory(
    data_dir,
    target_size=(224, 224),  # Set the target image size
    batch_size=32,           # Set the batch size
    class_mode='categorical' # Set the class mode
)

# Get the class names from the generator
class_names = list(generator.class_indices.keys())

# Print the class names
print("Class Names:", class_names)


Found 3040 images belonging to 9 classes.
Class Names: ['Ford', 'Honday', 'Hyundai', 'Nissan', 'Renault', 'Suzuki', 'Tata', 'Toyota', 'Volkswagen']


In [None]:
from keras.preprocessing import image
import numpy as np

# Load the trained model
from keras.models import load_model
modelnew = load_model("/root/face_vgg16.h5")

# Define a function to predict the class of an image
def predict_class(model, image_path):
    img = image.load_img(image_path, target_size=(224, 224))
    img = image.img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = img / 255.0  # Same 1/255 rescaling as the training generator

    prediction = model.predict(img)
    class_index = np.argmax(prediction, axis=1)
    return class_index[0]

image_path = '/content/drive/MyDrive/datasets/test/1.png'  # Replace with your test image filepath
predicted_class = predict_class(modelnew, image_path)

predicted_class_name = class_names[predicted_class]

print("Predicted Class Name:", predicted_class_name)


Predicted Class Name: Suzuki
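
A single predicted name hides how confident the model is. The sketch below ranks classes by probability instead. The class names are copied from the output above (spelled as the dataset folders are named), while the probability vector is a made-up example, not real model output; `top_k` is a hypothetical helper.

```python
class_names = ['Ford', 'Honday', 'Hyundai', 'Nissan', 'Renault',
               'Suzuki', 'Tata', 'Toyota', 'Volkswagen']

def top_k(probabilities, names, k=3):
    """Return the k most probable (class name, probability) pairs."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return [(names[index], prob) for index, prob in ranked[:k]]

# Hypothetical softmax output for one image (sums to 1).
probabilities = [0.05, 0.05, 0.10, 0.05, 0.05, 0.40, 0.20, 0.05, 0.05]

for name, prob in top_k(probabilities, class_names):
    print(f"{name}: {prob:.2f}")
```

With the real model, `model.predict(img)[0]` could be passed in place of the made-up vector.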
