# Building a Convolutional Image Classifier with Keras and TensorFlow

This project uses modern deep-learning networks to build an image classifier with Keras. We will design our own custom convnet from reusable blocks and use it to perform visual feature extraction. We will also use transfer learning to boost our model and data augmentation to extend our dataset.

Before we begin, let's break down the theory behind the project.

The goal of our project is to design a neural network that can "understand" a natural image well enough to solve the same kinds of problems the human visual system can solve.

There are many kinds of neural networks (e.g. RNNs, GNNs, CNNs), each suited to different purposes and applications in machine learning. For example, recurrent neural networks (RNNs) work well for text-based classification tasks such as sentiment analysis. The networks best suited to image classification are called convolutional neural networks (CNNs, or convnets).

A CNN consists of two parts: a convolutional base and a dense head. 

The base is used to extract the features from an image. What does this mean? Each convolutional layer applies filters (small matrices) that detect specific patterns in different parts of the image. These filters help break down the image into different levels of abstraction.
- First layers detect basic features (edges, corners, textures).
- Middle layers detect more complex structures (shapes, objects).
- Deeper layers recognize high-level features (e.g. faces, cats, cars).

Each layer transforms the image into multiple feature maps, which are "filtered versions" of the image that highlight different aspects of it. By gradually building up from low-level details to high-level concepts, the CNN forms an abstract representation of the input image, which makes classification easier.
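
To make the idea of a filter producing a feature map concrete, here is a minimal NumPy sketch. The helper `apply_filter`, the toy 6x6 image, and the hand-written `vertical_edge` kernel are all illustrative assumptions, not part of this project's model; in a real CNN the kernel weights are learned during training, and Keras performs this sliding-window operation for us.

```python
import numpy as np

def apply_filter(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the patch under the kernel element-wise and sum the result
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Toy grayscale image: dark left half, bright right half
image = np.zeros((6, 6), dtype=np.float32)
image[:, 3:] = 1.0

# Hypothetical hand-written kernel that responds to dark-to-bright vertical edges
vertical_edge = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]], dtype=np.float32)

feature_map = apply_filter(image, vertical_edge)
print(feature_map)  # every row is [0. 3. 3. 0.]
```

The feature map is zero where the image is uniform and large where the window straddles the edge, which is what "highlighting an aspect of the image" means in practice.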

The head now receives meaningful, structured information from the base instead of raw pixels. This allows it to make an informed prediction about the image's class.

During training, we want our network to learn two things:
1. which features to extract from an image (the job of the base)
2. which class goes with which features (the job of the head)

CNNs are rarely trained from scratch. A more common approach is to reuse the base of a pretrained model and attach an untrained head to it. In other words, we reuse the part of a network that has already learned to extract features and attach some fresh layers that learn to classify.
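
As a rough sketch of this "frozen base plus fresh head" pattern in Keras: the helper `attach_head`, the head sizes, and the tiny `toy_base` below are assumptions made for illustration only. In practice `base` would be a pretrained network, and `toy_base` is an untrained stand-in used so the sketch runs without downloading any weights.

```python
from tensorflow import keras
from tensorflow.keras import layers

def attach_head(base, input_shape=(128, 128, 3), num_hidden=6):
    """Freeze a feature-extracting `base` and stack a new, untrained dense head on it."""
    base.trainable = False  # keep the base's already-learned filters fixed
    inputs = keras.Input(shape=input_shape)
    x = base(inputs, training=False)  # run the base in inference mode
    x = layers.Flatten()(x)
    x = layers.Dense(num_hidden, activation='relu')(x)
    outputs = layers.Dense(1, activation='sigmoid')(x)  # one unit for a binary label
    return keras.Model(inputs, outputs)

# Hypothetical stand-in for a pretrained base (randomly initialized here)
toy_base = keras.Sequential([
    keras.Input(shape=(128, 128, 3)),
    layers.Conv2D(8, kernel_size=3, activation='relu'),
    layers.MaxPool2D(),
])

model = attach_head(toy_base)
model.summary()
```

Only the two `Dense` layers of the head have trainable weights here, so training updates the classifier while leaving the feature extractor untouched.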

Enough talking, let's get coding! 

## Step 1 - Loading the Data

In [None]:
# Import required libraries
import os
import warnings
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image_dataset_from_directory
import kagglehub

# Reproducibility
def set_seed(seed=31415):
    np.random.seed(seed)
    tf.random.set_seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    os.environ['TF_DETERMINISTIC_OPS'] = '1'
set_seed()

# Set Matplotlib defaults
plt.rc('figure', autolayout=True)
plt.rc('axes', labelweight='bold', labelsize='large', titleweight='bold', titlesize=18, titlepad=10)
plt.rc('image', cmap='magma')
warnings.filterwarnings("ignore")  # to clean up output cells

# Download the dataset using KaggleHub API
path = kagglehub.dataset_download("ryanholbrook/car-or-truck")
print("Path to dataset files:", path)

# Check if the dataset directory exists
if not os.path.exists(path):
    raise FileNotFoundError(f"Dataset path does not exist: {path}")

# Check the contents of the dataset directory
print("Contents of dataset directory:", os.listdir(path))

# Load training and validation sets
ds_train_ = image_dataset_from_directory(
    os.path.join(path, 'train'),
    labels='inferred',
    label_mode='binary',
    image_size=[128, 128],
    interpolation='nearest',
    batch_size=64,
    shuffle=True,
)

ds_valid_ = image_dataset_from_directory(
    os.path.join(path, 'valid'),
    labels='inferred',
    label_mode='binary',
    image_size=[128, 128],
    interpolation='nearest',
    batch_size=64,
    shuffle=False,
)

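# Data pipeline: convert images to float32.
# For integer inputs (e.g. uint8), convert_image_dtype also rescales pixel values to [0, 1].
def convert_to_float(image, label):
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    return image, label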

# tf.data.AUTOTUNE is the non-experimental name for this constant (TensorFlow 2.4+)
AUTOTUNE = tf.data.AUTOTUNE
ds_train = (
    ds_train_
    .map(convert_to_float)
    .cache()
    .prefetch(buffer_size=AUTOTUNE)
)
ds_valid = (
    ds_valid_
    .map(convert_to_float)
    .cache()
    .prefetch(buffer_size=AUTOTUNE)
)
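
To illustrate what the `convert_to_float` step does to pixel values, here is a small NumPy stand-in. It assumes the loaded images are integer-typed (`uint8`), which may depend on your TensorFlow version and the `interpolation` setting; inspecting `ds_train_.element_spec` is one way to check. The helper `to_unit_float` is hypothetical and only mirrors the rescaling that `tf.image.convert_image_dtype` applies to integer inputs. If the images already arrive as floats, that function leaves the values unchanged.

```python
import numpy as np

def to_unit_float(image_uint8):
    """NumPy stand-in for the uint8 -> float32 conversion: rescale 0..255 to 0..1."""
    return image_uint8.astype(np.float32) / 255.0

# Three sample pixel intensities: black, dark gray, white
pixels = np.array([[0, 51, 255]], dtype=np.uint8)
scaled = to_unit_float(pixels)
print(scaled.dtype, scaled.min(), scaled.max())
```

Keeping inputs in a small, consistent range like [0, 1] is a common preprocessing choice for neural networks.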

