# Computer vision: Classifying car make, model and year

Computer vision could potentially be used to automate traffic censuses and other tasks that require identification of vehicles.
The <a href="https://www.tensorflow.org/datasets/catalog/cars196">cars196</a> dataset contains 16,185 images of 196 different types of cars, which
can be used to train a supervised learning system to determine the make and model of a vehicle in a photograph.

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
import tensorflow_datasets as tfds
from matplotlib import pyplot as plt

In [None]:
# NOTE: this notebook is written to run on Kaggle.
# In a non-Kaggle environment, change DATA_DIR to the location of the dataset
# and set download=True.

DATA_DIR = '/kaggle/input/cars196'

[train_ds, test_ds], ds_info = tfds.load(
    "cars196",
    # Load the predefined "train" and "test" splits as tf.data.Dataset objects.
    split=["train", "test"],
    as_supervised=True,  # Include labels
    with_info=True,
    download=False,
    data_dir=DATA_DIR,  #Kaggle input
)
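
The comment at the top of the cell describes switching between Kaggle and other environments by hand. One way to automate that choice is sketched below. The helper name `resolve_data_source` and the fallback directory `~/tensorflow_datasets` are assumptions made for illustration and are not part of the original notebook.

```python
import os

KAGGLE_DATA_DIR = "/kaggle/input/cars196"


def resolve_data_source(kaggle_dir=KAGGLE_DATA_DIR,
                        fallback_dir="~/tensorflow_datasets"):
    """Return (data_dir, download) for tfds.load.

    Hypothetical helper: on Kaggle the dataset is already attached, so no
    download is needed; elsewhere we point at a local directory and ask
    TFDS to download the data.
    """
    if os.path.isdir(kaggle_dir):
        return kaggle_dir, False
    return os.path.expanduser(fallback_dir), True


data_dir, download = resolve_data_source()
print(data_dir, download)
```

The two values could then be passed as `data_dir=data_dir` and `download=download` in the `tfds.load` call.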

Now, let's use the built-in visualization function to show some example images:

In [None]:
tfds.visualization.show_examples(train_ds, ds_info)

## Standardizing the data
Our raw images have a variety of sizes. In addition, each pixel consists of 3 integer values between 0 and 255 (RGB level values). This isn't a great fit for feeding a neural network. We need to do 2 things:

* Standardize to a fixed image size. We pick 150x150.
* Normalize pixel values between -1 and 1. We'll do this using a Normalization layer as part of the model itself.
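
The target range follows from simple arithmetic: subtracting 127.5 and dividing by 127.5 maps 0 to -1 and 255 to +1. Here is a quick NumPy check, assuming the normalization computes `(inputs - mean) / sqrt(var)` with a mean of 127.5 and a variance of 127.5 squared for each channel. The sample pixel values are made up for illustration.

```python
import numpy as np

# Per-channel statistics that map [0, 255] onto [-1, +1].
mean = np.array([127.5] * 3)
var = mean ** 2

# Three example RGB pixels: black, mid-gray and white.
pixels = np.array([
    [0.0, 0.0, 0.0],
    [127.5, 127.5, 127.5],
    [255.0, 255.0, 255.0],
])

scaled = (pixels - mean) / np.sqrt(var)
print(scaled)
```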

In general, it's a good practice to develop models that take raw data as input, as opposed to models that take already-preprocessed data. The reason being that, if your model expects preprocessed data, any time you export your model to use it elsewhere (in a web browser, in a mobile app), you'll need to reimplement the exact same preprocessing pipeline. This gets very tricky very quickly. So we should do the least possible amount of preprocessing before hitting the model.

Here, we'll do image resizing in the data pipeline (because a deep neural network can only process contiguous batches of data), and we'll do the input value scaling as part of the model, when we create it.
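
To see why a fixed size is needed, consider what happens when images of different sizes are stacked into one batch array. The sketch below uses NumPy only. The `resize_nearest` function is a hypothetical nearest-neighbour stand-in for `tf.image.resize` (which uses bilinear interpolation by default), and the image sizes are arbitrary.

```python
import numpy as np


def resize_nearest(image, height, width):
    """Nearest-neighbour resize; a simple stand-in for tf.image.resize."""
    rows = np.arange(height) * image.shape[0] // height
    cols = np.arange(width) * image.shape[1] // width
    return image[rows][:, cols]


rng = np.random.default_rng(0)
# Two fake RGB images with different heights and widths.
images = [
    rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8)
    for h, w in [(200, 300), (120, 90)]
]

# Stacking images of different shapes into one batch fails.
try:
    np.stack(images)
    stacked_raw = True
except ValueError:
    stacked_raw = False

# After resizing to a common size, the batch is one contiguous array.
batch = np.stack([resize_nearest(img, 150, 150) for img in images])
print(stacked_raw, batch.shape)
```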



In [None]:
train_ds

In [None]:
# Resize images to 150x150.
height, width = 150, 150

train_ds = train_ds.map(lambda x, y: (tf.image.resize(x, (height, width)), y))
test_ds = test_ds.map(lambda x, y: (tf.image.resize(x, (height, width)), y))
train_ds

## Preprocessing: Resizing and random data augmentation

When you don't have a large image dataset, it's a good practice to artificially introduce sample diversity by applying random yet realistic transformations to the training images, such as random horizontal flipping or small random rotations. This helps expose the model to different aspects of the training data while slowing down overfitting.
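
To make the idea concrete, here is a minimal NumPy sketch of two such transformations: zero-padding followed by a random crop back to the original size (a small random shift), and a random horizontal flip. The function name `random_crop_and_flip` and the 3-pixel padding are assumptions chosen for illustration; this is not the TensorFlow implementation.

```python
import numpy as np


def random_crop_and_flip(image, pad, rng):
    """Shift the image by up to `pad` pixels and flip it half of the time."""
    h, w, _ = image.shape
    # Zero-pad the height and width, leaving the channel axis untouched.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)))
    # Pick a random window of the original size inside the padded image.
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    cropped = padded[top:top + h, left:left + w]
    # Mirror left-to-right with probability 0.5.
    if rng.random() < 0.5:
        cropped = cropped[:, ::-1]
    return cropped


rng = np.random.default_rng(42)
image = rng.integers(0, 256, size=(150, 150, 3)).astype(np.float32)
augmented = random_crop_and_flip(image, pad=3, rng=rng)
print(augmented.shape)
```

Because the output has the same shape as the input, the transformation can be applied to every training image without affecting batching.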

Additionally, let's batch the data and use caching and prefetching to optimize loading speed:

In [None]:
batch_size = 32  # a standard batch size for images

def augment_func(image, label):
    # Pad by 3 pixels on each side, then take a random crop of the original size.
    image = tf.image.resize_with_crop_or_pad(image, height + 6, width + 6)
    #image = tf.clip_by_value(image, 0, 255)  # make sure no color value is higher than 255 or lower than 0
    image = tf.image.random_crop(image, size=[height, width, 3])
    image = tf.image.random_flip_left_right(image)  # show vehicles from mirrored viewpoints
    image = tf.image.random_hue(image, 0.2)  # random hue shift, so the same car appears in different colors
    image = tf.image.random_contrast(image, 0.5, 2)  # random contrast
    image = tf.image.random_saturation(image, 0, 2)  # random saturation
    return image, label


# cache() stores the resized images after the first pass; augmentation is mapped
# after the cache so that it is re-applied randomly in every epoch.
train_ds = train_ds.cache().map(augment_func).shuffle(100).batch(batch_size).prefetch(buffer_size=10)
# The test set is only batched: evaluation data should not be randomly augmented.
test_ds = test_ds.cache().batch(batch_size).prefetch(buffer_size=10)

Let's visualize one image from each of the first 18 batches after the random transformations.

Note that because the augmentations in the previous cell are applied randomly, these images will look different every time they are run through the model during training.

In [None]:
plt.figure(figsize=(10, 20))
# train_ds is already batched, so take(18) yields 18 batches rather than 18 single images.
for i, (image_batch, label) in enumerate(train_ds.take(18)):
    ax = plt.subplot(6, 3, i + 1)
    # Pixel values are floats after resizing; cast to integers so imshow reads them as 0-255.
    plt.imshow(image_batch[3].numpy().astype("int32"))
    plt.title(ds_info.features["label"].names[int(label[3])])
    plt.axis("off")


In [None]:
train_ds

## Build a model

Now let's build a model.

1. We add a Normalization layer to scale input values (initially in the [0, 255] range) to the [-1, 1] range, because this is the format expected by the pre-trained model that comes next.
2. We start with a pre-trained model trained on the [ImageNet](http://image-net.org/about-overview) dataset, which includes a large number of images with a large number of different labels, but does not include as much specificity regarding vehicle types as the cars196 dataset does. Training these models from scratch is tricky; it is much easier to start with a pre-trained model and fine-tune it for a different task.
3. We add a Dropout layer before the classification layer, for regularization.
4. We add our own classification layer at the end of the model, with 196 outputs representing our 196 vehicle classes, and a "softmax" activation, which forces the output values to all be between 0 and 1 and to sum to 1.


We need the number of outputs in the final layer to equal the number of variables or classes we want to predict: in this case, 196 vehicle types. 
We use a softmax activation on the final layer for classification problems, but if we want to use this model for regression we would only have to change the number of desired outputs and set `activation=None`.
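
As a small illustration of what softmax does to the raw outputs of the final layer, here is a NumPy sketch. The four example scores are made up; the real model produces 196 of them per image.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


logits = np.array([2.0, 1.0, 0.1, -1.0])
probs = softmax(logits)
print(probs, probs.sum())
```

The largest score still gets the largest probability, so the predicted class is simply the index of the maximum.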

In [None]:
base_model = tf.keras.applications.Xception(
    weights="imagenet",  # Load weights pre-trained on ImageNet.
    input_shape=(height, width, 3),
    include_top=False, # Do not include the final ImageNet classifier layer at the top.
)  

base_model.trainable = False # We freeze the base model

# Create a new model surrounding our pretrained base model.
inputs = tf.keras.Input(shape=(height, width, 3))

# Pre-trained Xception weights require that the input be normalized
# from (0, 255) to the range (-1., +1.). The normalization layer
# computes outputs = (inputs - mean) / sqrt(var).
norm_layer = keras.layers.experimental.preprocessing.Normalization()
mean = np.array([127.5] * 3)
var = mean ** 2
# Scale inputs to [-1, +1]
x = norm_layer(inputs)
norm_layer.set_weights([mean, var])

# The base model contains batchnorm layers. We want to keep them in inference mode
# when we unfreeze the base model for fine-tuning, so we make sure that the
# base_model is running in inference mode here.
# During training, a batchnorm layer standardizes its inputs (zero mean, unit standard
# deviation) using the statistics of the current batch and keeps a running estimate of
# those statistics, which stabilizes learning and can greatly reduce the number of epochs needed.
# During inference, it uses the stored statistics instead, so it acts as a simple
# linear transformation of the previous layer's output (often a convolution).
x = base_model(x, training=False)
# Global average pooling takes a tensor of shape (height, width, channels) and averages
# each channel over the whole height x width grid, leaving one value per channel.
# It takes the place of the fully connected layers of classical CNNs and condenses the
# features learned by the pretrained model into a vector for our classification layer.
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dropout(0.5)(x)  # Regularize with dropout
num_outputs = ds_info.features['label'].num_classes # This is the number of output variables we want, 196 in this case.
outputs = keras.layers.Dense(num_outputs, activation="softmax")(x) # Use activation=softmax for classification, and activation=None for regression.
model = keras.Model(inputs, outputs)

model.summary()
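
The global average pooling step used in the model can be reproduced in a few lines of NumPy, which may help when reading the output shapes in the summary. The feature-map shape of 5 x 5 x 2048 is an assumption about what Xception produces for 150 x 150 inputs, and the values are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
# A fake batch of 2 feature maps: (batch, height, width, channels).
features = rng.normal(size=(2, 5, 5, 2048))

# Average over the two spatial axes, keeping one value per channel.
pooled = features.mean(axis=(1, 2))
print(features.shape, "->", pooled.shape)
```

Each image is reduced to a single vector with one entry per channel, which is what the Dropout and Dense layers receive.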

In [None]:
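# The final Dense layer already applies softmax, so the loss receives
# probabilities rather than logits (from_logits=False).
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])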

epochs = 100
model.fit(train_ds, epochs=epochs,validation_data = test_ds)
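
The `from_logits` flag tells the loss whether it is receiving raw scores or probabilities, and it matters because the model's last layer already applies softmax. The NumPy sketch below is a simplified re-implementation for illustration, not the Keras code. It computes sparse categorical cross-entropy for a made-up probability vector and shows how the loss is distorted if softmax is applied a second time.

```python
import numpy as np


def softmax(scores):
    exp = np.exp(scores - np.max(scores))
    return exp / exp.sum()


def sparse_categorical_crossentropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -np.log(probs[label])


# Hypothetical model output that is already a probability distribution.
probs = np.array([0.9, 0.05, 0.05])
label = 0

loss_from_probs = sparse_categorical_crossentropy(probs, label)
# Treating the probabilities as logits applies softmax a second time,
# which flattens the distribution and inflates the loss.
loss_double_softmax = sparse_categorical_crossentropy(softmax(probs), label)
print(loss_from_probs, loss_double_softmax)
```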

## Fine-tune the model

We use a relatively low learning rate to prevent the model from unlearning what it learned when it was trained on the larger ImageNet dataset.
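
To get a feel for the effect of the learning rate, the sketch below compares how far a single update moves a set of weights at 1e-3 (the Keras default for Adam) and at 1e-5. It uses a plain gradient-descent step rather than Adam to keep the arithmetic visible, and the weights and gradient are made-up numbers.

```python
import numpy as np


def sgd_step(weights, gradient, learning_rate):
    """One plain gradient-descent update (a simplification of Adam)."""
    return weights - learning_rate * gradient


# Hypothetical pretrained weights and a gradient from the new task.
pretrained = np.array([0.8, -1.2, 0.3])
gradient = np.array([5.0, -3.0, 2.0])

drift_high = np.abs(sgd_step(pretrained, gradient, 1e-3) - pretrained).max()
drift_low = np.abs(sgd_step(pretrained, gradient, 1e-5) - pretrained).max()
print(drift_high, drift_low)
```

With the smaller rate, each step nudges the pretrained weights only slightly, so the features learned on ImageNet are adjusted gradually rather than overwritten.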

In [None]:
# Unfreeze the base_model. Note that it keeps running in inference mode
# since we passed `training=False` when calling it. This means that
# the batchnorm layers will not update their batch statistics.
# This prevents the batchnorm layers from undoing all the training
# we've done so far.
base_model.trainable = True
model.summary()

# The model output is already softmax probabilities, so from_logits=False.
model.compile(optimizer=keras.optimizers.Adam(1e-5),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

epochs = 100
model.fit(train_ds, epochs = epochs, validation_data = test_ds)

Finally, we can save the model for later use. If you are doing this on Kaggle, there is an option to download the saved file in the panel on the right side of the screen.

In [None]:
model.save("model.h5", save_format="h5")