# Stanford Dogs - A Classification problem

Classification is a fundamental task in machine learning, and the Stanford Dogs Dataset provides a valuable resource for training and evaluating classification models. The dataset consists of 20,580 images covering 120 dog breeds, each labeled with the corresponding breed.

By leveraging this dataset, we can develop a classification model that can accurately identify the breed of a given dog image. This can have practical applications in areas such as pet identification, animal welfare, and breed-specific research.

To build a classification model using the Stanford Dogs Dataset, we can employ various machine learning techniques, such as convolutional neural networks (CNNs). CNNs are particularly effective for image classification tasks, as they can automatically learn relevant features from the input images.

By training a CNN on the Stanford Dogs Dataset, we can teach the model to recognize distinctive patterns and characteristics of different dog breeds. Once trained, the model can be used to classify new dog images, providing predictions about the breed with a certain level of confidence.
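As a rough illustration of what "a certain level of confidence" means, the final layer of such a classifier usually turns raw scores into probabilities with a softmax. The sketch below uses plain Python; the breed names and raw scores are made up for illustration and do not come from a trained model.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)
    # Subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(z - highest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, class_names, k=3):
    """Return the k most likely (class name, probability) pairs."""
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical raw scores for four breeds (illustrative values only)
class_names = ["Chihuahua", "Beagle", "Pug", "Samoyed"]
logits = [2.0, 0.5, 0.1, -1.0]

probs = softmax(logits)
for name, p in top_k(probs, class_names, k=2):
    print(f"{name}: {p:.1%}")
```

The highest probability is what is typically reported as the model's confidence in its predicted breed.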

Evaluation of the classification model can be done using metrics such as accuracy, precision, recall, and F1 score. These metrics help assess the model's performance and determine its effectiveness in correctly classifying dog breeds.
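To make these metrics concrete, here is a minimal plain-Python sketch that computes accuracy plus per-class precision, recall, and F1 from two label lists. The function names and the tiny example labels are assumptions made for illustration; in practice a library such as scikit-learn would likely be used instead.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_metrics(y_true, y_pred):
    """Return precision, recall and F1 for every class seen in the labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report

# Made-up labels for illustration
y_true = ["Pug", "Pug", "Beagle", "Beagle", "Chihuahua"]
y_pred = ["Pug", "Beagle", "Beagle", "Beagle", "Pug"]

report = per_class_metrics(y_true, y_pred)
macro_f1 = sum(m["f1"] for m in report.values()) / len(report)
print(f"accuracy: {accuracy(y_true, y_pred):.2f}, macro F1: {macro_f1:.2f}")
```

With 120 breeds, per-class values like these can reveal breeds the model confuses even when overall accuracy looks acceptable.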

Overall, the Stanford Dogs Dataset offers a valuable opportunity to explore and develop classification models for dog breed identification. By leveraging this dataset and employing appropriate machine learning techniques, we can contribute to the field of computer vision and enhance our understanding of dog breeds.

## 00 - Preprocessing ⚙️

The dataset is split into two parts - Images and Annotations. 

The **Images** are pictures of the 120 different dog breeds present in the dataset. 
The **Annotations** are `.xml` files, which contain information about where the dog is located in each picture and what breed it is.

So first of all we need to load all of this information into Python, so it can be used to train our model.
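To show what one annotation might hold, the sketch below parses a small inline XML string with the standard library. It assumes a PASCAL VOC-style layout with a `bndbox` element inside `object`; the sample content and all numbers in it are made up for illustration, and `parse_annotation` is a hypothetical helper, not part of the notebook.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation content; the sizes and box coordinates are made up
SAMPLE_XML = """
<annotation>
  <folder>02085620</folder>
  <filename>n02085620_10074</filename>
  <size><width>333</width><height>500</height><depth>3</depth></size>
  <object>
    <name>Chihuahua</name>
    <bndbox><xmin>25</xmin><ymin>10</ymin><xmax>276</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>
"""

def parse_annotation(xml_text):
    """Extract folder, filename, breed label and bounding box from one annotation."""
    root = ET.fromstring(xml_text.strip())
    obj = root.find("object")
    box = obj.find("bndbox")
    return {
        "folder": root.findtext("folder").strip(),
        "filename": root.findtext("filename").strip(),
        "label": obj.findtext("name").strip(),
        # Bounding box as (xmin, ymin, xmax, ymax) in pixels
        "bbox": tuple(int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")),
    }

info = parse_annotation(SAMPLE_XML)
print(info["label"], info["bbox"])
```

The classification code in this notebook only needs the folder, filename, and label; the bounding box could be used later to crop each image to the dog.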

In [1]:
# Just imports
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img
import xml.etree.ElementTree as ET
import os
import pandas as pd

# Constants
dataset_dir = '.\\data'
images_dir = os.path.join(dataset_dir, 'images')
annotation_dir = os.path.join(dataset_dir, 'annotations')

In [2]:
# FUNCTION - Parse the annotations from their XML files to extract the image filename and label
def parse_annotations(annotation_dir):
    annotations = []
    for root_dir, _, files in os.walk(annotation_dir):
        for xml_file in files:
            if xml_file.endswith('.xml'):
                xml_path = os.path.join(root_dir, xml_file)
                tree = ET.parse(xml_path)
                root = tree.getroot()
                
                # Extract the folder and filename
                folder = root.find('folder').text.strip()

                #? If the folder is not present, then use the folder of the XML file
                if not folder or folder == '%s':
                    folder = os.path.basename(root_dir)

                #? Check if the folder name contains an 'n' at the start, if not, add it
                if not folder.startswith('n'):
                    folder = 'n' + folder

                filename = root.find('filename').text.strip() + '.jpg'  # Add .jpg extension

                #? If the filename is missing or still a placeholder, fall back to the XML file's own name
                if not filename or filename == '%s.jpg':
                    filename = xml_file.replace('.xml', '.jpg')

                
                # Construct the full image filename
                image_filename = os.path.join(folder, filename)
                
                # Extract the label
                label = root.find('object').find('name').text.strip()
                
                # Append the annotation to the list
                annotations.append((image_filename, label))
    return annotations

annotations = parse_annotations(annotation_dir)

# * Convert annotations to DataFrame
annotations_df = pd.DataFrame(annotations, columns=['filename', 'label'])

# * Show the filenames from the first 5 rows of the annotation DataFrame
print("\nShow paths to the first 5 images")
for i in range(5):
    print(annotations_df['filename'].iloc[i])


# Find the row with the specified filename
filename_to_find = r'n%s\%s.jpg'  # Raw string avoids the invalid '\%' escape; replace with the actual filename you're looking for
matching_row = annotations_df[annotations_df['filename'] == filename_to_find]

# Display the matching row
if not matching_row.empty:
    display(matching_row)
else:
    print(f"No match found for filename: {filename_to_find}")





In [None]:
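# * Load an image with the Keras load_img function, resize it and scale pixels to [0, 1]
# NOTE: ResNet50's ImageNet weights expect inputs prepared with
# tensorflow.keras.applications.resnet50.preprocess_input rather than values in [0, 1];
# this mismatch may contribute to the near-chance accuracy in the training log below.
def load_image(image_path):
    img = load_img(image_path, target_size=(224, 224))
    return img_to_array(img) / 255.0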

# Create a dictionary to map filenames to labels
filename_to_label = dict(zip(annotations_df['filename'], annotations_df['label']))

# Show the first 5 items in the dictionary
print("\nShow the first 5 items in the dictionary")
for i, (filename, label) in enumerate(filename_to_label.items()):
    print(f"{filename}: {label}")
    if i == 4:
        break

# Build the full path of every annotated image
image_paths = [os.path.join(images_dir, f) for f in filename_to_label.keys()]

# ? Check if the image paths are correct
print("\nShow paths to the first 5 images")
for i in range(5):
    print(image_paths[i])

# Check whether any image path still contains the unfilled 'n%s\%s.jpg' placeholder
for path in image_paths:
    if '.\\data\\images\\n%s\\%s.jpg' in path:
        print(f"Found in path: {path}")



# Create lists of images and labels
images = [load_image(img_path) for img_path in image_paths]
print(f"Loaded {len(images)} images")
labels = [filename_to_label[os.path.relpath(img_path, images_dir)] for img_path in image_paths]
print(f"Loaded {len(labels)} labels")



Show the first 5 items in the dictionary
n02085620\n02085620_10074.jpg: Chihuahua
n02085620\n02085620_10131.jpg: Chihuahua
n02085620\n02085620_10621.jpg: Chihuahua
n02085620\n02085620_1073.jpg: Chihuahua
n02085620\n02085620_10976.jpg: Chihuahua

Show paths to the first 5 images
.\data\images\n02085620\n02085620_10074.jpg
.\data\images\n02085620\n02085620_10131.jpg
.\data\images\n02085620\n02085620_10621.jpg
.\data\images\n02085620\n02085620_1073.jpg
.\data\images\n02085620\n02085620_10976.jpg
Loaded 20580 images
Loaded 20580 labels
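Loading every image into a Python list keeps the whole dataset in memory at once. The back-of-the-envelope sketch below estimates that footprint, assuming each pixel value is stored as a 4-byte `float32` (which I believe is the default for `img_to_array`, but treat it as an assumption). `array_bytes` is a hypothetical helper.

```python
def array_bytes(num_images, height=224, width=224, channels=3, bytes_per_value=4):
    """Estimate the raw size of a stack of images held in memory."""
    return num_images * height * width * channels * bytes_per_value

total = array_bytes(20580)
print(f"{total:,} bytes, about {total / 1024**3:.1f} GiB")
```

If this is more memory than the machine has available, loading images lazily in batches (for example with a generator or `tf.data`) may be a better fit than one large list.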


In [None]:
# Convert labels to categorical format
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical
import numpy as np

label_encoder = LabelEncoder()
labels_encoded = label_encoder.fit_transform(labels)
labels_categorical = to_categorical(labels_encoded)

# Split into training and validation sets
from sklearn.model_selection import train_test_split

X_train, X_val, y_train, y_val = train_test_split(images, labels_categorical, test_size=0.2, random_state=42)
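What the encoding step in the cell above does can be illustrated in plain Python: each breed name is mapped to an integer index (classes in sorted order, as `LabelEncoder` does), and each index becomes a one-hot vector. The helper names and the three-breed example are made up for illustration.

```python
def encode_labels(labels):
    """Map each label to an integer index, with classes in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return classes, [index[name] for name in labels]

def one_hot(indices, num_classes):
    """Turn integer class indices into one-hot vectors."""
    return [[1.0 if i == idx else 0.0 for i in range(num_classes)] for idx in indices]

# Made-up labels for illustration
example_labels = ["Chihuahua", "Pug", "Beagle", "Chihuahua"]
classes, encoded = encode_labels(example_labels)
vectors = one_hot(encoded, len(classes))

print(classes)   # class order used for the indices
print(encoded)   # one integer per label
print(vectors[0])
```

With the full dataset each one-hot vector has 120 entries, one per breed, which is the shape the `categorical_crossentropy` loss expects.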

In [None]:
import numpy as np

# Convert lists to numpy arrays
X_train = np.array(X_train)
X_val = np.array(X_val)
y_train = np.array(y_train)
y_val = np.array(y_val)

# Define the model (using ResNet50 as an example)
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(len(label_encoder.classes_), activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

# Freeze the base model layers
for layer in base_model.layers:
    layer.trainable = False

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=10,
    batch_size=32
)

# Fine-tune the model by unfreezing some layers
for layer in base_model.layers[-50:]:
    layer.trainable = True

model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), loss='categorical_crossentropy', metrics=['accuracy'])

history_fine = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=10,
    batch_size=32
)

# Evaluate the model
val_loss, val_accuracy = model.evaluate(X_val, y_val)
print(f'Validation accuracy: {val_accuracy * 100:.2f}%')

Epoch 1/10
515/515 ━━━━━━━━━━━━━━━━━━━━ 617s 1s/step - accuracy: 0.0104 - loss: 4.8539 - val_accuracy: 0.0134 - val_loss: 4.7618
Epoch 2/10
515/515 ━━━━━━━━━━━━━━━━━━━━ 594s 1s/step - accuracy: 0.0156 - loss: 4.7366 - val_accuracy: 0.0160 - val_loss: 4.7181
Epoch 3/10
515/515 ━━━━━━━━━━━━━━━━━━━━ 587s 1s/step - accuracy: 0.0185 - loss: 4.6993 - val_accuracy: 0.0177 - val_loss: 4.7110
Epoch 4/10
515/515 ━━━━━━━━━━━━━━━━━━━━ 633s 1s/step - accuracy: 0.0197 - loss: 4.6802 - val_accuracy: 0.0187 - val_loss: 4.7044
Epoch 5/10
515/515 ━━━━━━━━━━━━━━━━━━━━ 622s 1s/step - accuracy: 0.0215 - loss: 4.6733 - val_accuracy: 0.0207 - val_loss: 4.6870
Epoch 6/10
515/515 ━━━━━━━━━━━━━━━━━━━━ 597s 1s/step - accuracy: 0.0228 - loss: 4.6658 - val_accuracy: 0.0199 - val_loss: 4.6764
Epoch 7/10
515/515

## 01 - Compiling the model 🔧

The next step in the process is to compile the model itself. But before that we have to define which **Loss function**, **Optimizer** and **Metrics** we are going to use for this model.

For the **Loss function** we have a few different options:

- *Categorical cross-entropy* (for one-hot labels, as in the code above), *Sparse categorical cross-entropy* (for integer labels) etc.

For the **Optimizer** we also have a few different options:
- *Adam*, *SGD*, *RMSProp* etc.

For the **Metrics** we also have a few different options:
- *Accuracy*, *Precision*, *Recall*, *F1 score* etc.
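To give the loss value some meaning, the sketch below computes categorical cross-entropy by hand and evaluates it for a model that guesses uniformly across 120 breeds. The function is a simplified plain-Python stand-in for illustration, not the Keras implementation.

```python
import math

def categorical_cross_entropy(one_hot_target, predicted_probs, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # eps guards against log(0) when a probability is exactly zero
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot_target, predicted_probs))

num_classes = 120
target = [0.0] * num_classes
target[0] = 1.0  # the true class is class 0 in this example

# A model that assigns the same probability to every breed
uniform = [1.0 / num_classes] * num_classes
chance_loss = categorical_cross_entropy(target, uniform)

# A model that is 90% sure of the correct breed
confident = [0.9] + [0.1 / (num_classes - 1)] * (num_classes - 1)
confident_loss = categorical_cross_entropy(target, confident)

print(f"chance-level loss: {chance_loss:.4f}")
print(f"confident loss:    {confident_loss:.4f}")
```

The chance-level loss is ln(120), roughly 4.79. The losses in the training log above (about 4.67 to 4.85) sit close to that value, which suggests the model was doing little better than guessing during that run.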


## 02 - Train the model 🧠

The next step in the process is to train the now compiled model on our data. Here we also have a little exploratory work in figuring out:
- What *batch size* should we use?
- What *number of epochs* should we use?
- Is the model *overfitting* or *underfitting*?
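Two of these questions can be explored with simple arithmetic. The sketch below shows how batch size determines the number of steps per epoch, and how the gap between validation and training loss can hint at overfitting. The loss values are copied from the first six epochs of the training log above; the helper names are hypothetical, and the interpretation is a rule of thumb rather than a definitive test.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every training sample once."""
    return math.ceil(num_samples / batch_size)

def generalization_gap(train_loss, val_loss):
    """Validation loss minus training loss for each epoch."""
    return [v - t for t, v in zip(train_loss, val_loss)]

# 20,580 images with a 20% validation split leaves 16,464 for training
num_val = 20580 * 20 // 100
num_train = 20580 - num_val
print(steps_per_epoch(num_train, batch_size=32))

# Loss values taken from the first six epochs of the training log
train_loss = [4.8539, 4.7366, 4.6993, 4.6802, 4.6733, 4.6658]
val_loss = [4.7618, 4.7181, 4.7110, 4.7044, 4.6870, 4.6764]

for epoch, gap in enumerate(generalization_gap(train_loss, val_loss), start=1):
    print(f"epoch {epoch}: gap = {gap:+.4f}")
```

A validation loss that climbs while the training loss keeps falling would point toward overfitting. Here the gap stays small while both losses remain high, which looks more like underfitting.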



## Further plan!

1. **Choose the model architecture suitable for our problem** 🤔
    - Convolutional Neural Network (CNN - Good with Image data)
    - Recurrent Neural Network (RNN - Good with sequence data)
    - Another type??

2. **Compile our model** 🔧
    - What *Loss function* should we use? - Cross-entropy is used for classification?
    - What *Optimizer* should we use? Adam, SGD, RMSProp etc.
    - What *Metrics* should we use? Accuracy, precision, recall, f1 score etc.

3. **Train the model** ⚙️
    - What *batch size* should we use?
    - What *number of epochs* should we use?
    - Is the model *overfitting* or *underfitting*?

4. **Evaluate the model** 📊
    - Is the model performing as we would like? Based upon our selected metrics to be unbiased 😉

5. **Tune Hyperparameters (Optional) - To improve performance** 📈
    - Use grid search or a similar method to find the best hyperparameters
    - Adjust model layers, units, learning rate etc.

6. **Save the Model (Optional) - But would be smart** 🧠
    - This can be done, so we don't have to run all the code later to get the model up and running!

7. **Use the Model!**
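The grid search mentioned in step 5 can be sketched in plain Python as below. The `grid_search` helper, the parameter names, and the candidate values are all assumptions made for illustration. `fake_score` is a hypothetical stand-in: real code would build, train, and validate a model there and return its validation score.

```python
from itertools import product

def grid_search(param_grid, train_and_score):
    """Try every combination in param_grid and keep the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical stand-in for "train a model and return validation accuracy".
# It simply rewards values close to an arbitrary target combination.
def fake_score(learning_rate, batch_size, dense_units):
    return (
        -abs(learning_rate - 1e-4) * 1000
        - abs(batch_size - 32) / 100
        - abs(dense_units - 512) / 10000
    )

param_grid = {
    "learning_rate": [1e-3, 1e-4, 1e-5],
    "batch_size": [16, 32, 64],
    "dense_units": [256, 512, 1024],
}

best_params, best_score = grid_search(param_grid, fake_score)
print(best_params)
```

This grid already contains 27 combinations, and each training run on this dataset takes a long time, so a smaller grid or a random search over a subset of the data may be more practical.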