### Clarification about the data

#### I collected additional images of the four breeds that we want to predict. At the end, I used the images provided by Pento to check whether the model classifies them correctly. I also included a picture of a Dobermann to check that its predicted class was 'other'.

## Imports

In [None]:
import os
import tarfile
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.preprocessing.image import img_to_array, load_img, ImageDataGenerator
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.metrics import precision_score
from tensorflow.keras.models import load_model
import matplotlib.image as mpimg
import pickle

## Extract the data

In [None]:
# Path of the file
tar_file_path = './stanford_dogs/images.tar'

# Folder where the files will be extracted
extract_path = './stanford_dogs/'

# Extract the file
with tarfile.open(tar_file_path, 'r') as tar:
    tar.extractall(path=extract_path)
    print(f"Files extracted in: {extract_path}")

## Load the data obtained

In [None]:
# Define the path to the main folder that contains the class subfolders
base_dir = './stanford_dogs/images/filtered_images'

# Create an empty list to save the images and the labels
img_dataset = []
labels = []

# Classes that we need
class_mapping = {
    'n02099601-golden_retriever': 'Golden Retriever',
    'n02106662-German_shepherd': 'German Shepherd',
    'n02108915-French_bulldog': 'French Bulldog',
    'poodle': 'Poodle'
}

# Iterate over every class folder
for class_name in os.listdir(base_dir):
    class_dir = os.path.join(base_dir, class_name)

    # Make sure it is a directory
    if os.path.isdir(class_dir):
        print(f"Processing class: {class_name}")

        # Get the readable label for this class folder
        label = class_mapping.get(class_name)

        # Iterate over every image file in the subfolder
        for img_file in os.listdir(class_dir):
            img_path = os.path.join(class_dir, img_file)

            try:
                # Load the image and convert it to a NumPy array
                img = load_img(img_path, target_size=(299, 299))  # Resize to 299x299
                img_array = img_to_array(img)

                # Add the image to the dataset
                img_dataset.append(img_array)
                labels.append(label)  # Save the corresponding label

            except Exception as e:
                print(f"Error when processing image {img_file}: {e}")

# Convert the list of images to a NumPy array
if img_dataset:
    img_dataset = np.array(img_dataset)
    labels = np.array(labels)

    print(f"Dataset of images: {img_dataset.shape}")
    print(f"Labels: {labels.shape}")
else:
    print("Images were not found in the specified folder.")


## Load data, preprocess, training and validation

In [None]:
# Define the path to the main folder that contains all the classes
base_dir = r'./stanford_dogs/images/filtered_images'

# Create a datagenerator with preprocess_input for InceptionV3
datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,  
    validation_split=0.15,  
    horizontal_flip=True,
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    zoom_range=0.2,
    shear_range=0.2,
    brightness_range=[0.7, 1.3]
)

# Training data generator
train_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(299, 299),  # Input size used with InceptionV3
    batch_size=32,
    class_mode='categorical',
    subset='training'
)

# Validation data generator
validation_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(299, 299),
    batch_size=32,
    class_mode='categorical',
    subset='validation'
)

## Load a pre-trained model, add new custom layers, create the final model and compile

In [None]:
# Load the InceptionV3 model without the top layers (include_top=False)
base_model = InceptionV3(weights='imagenet', include_top=False)

# Add new custom layers
x = base_model.output
x = GlobalAveragePooling2D()(x)  
x = Dropout(0.5)(x) 
x = Dense(1024, activation='relu')(x)  
x = Dropout(0.5)(x) 
predictions = Dense(4, activation='softmax')(x)  

# Create the final model
model = Model(inputs=base_model.input, outputs=predictions)

# Freeze the InceptionV3 layers so we don't train them
for layer in base_model.layers:
    layer.trainable = False

# Compile the model with Adam, and the metrics: accuracy and precision
model.compile(optimizer=Adam(learning_rate=0.0001),
              loss='categorical_crossentropy',
              metrics=['accuracy', tf.keras.metrics.Precision()])

model.summary()

## Train the model

In [None]:
# Add EarlyStopping
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Train the model
history = model.fit(
    train_generator,
    epochs=20,
    validation_data=validation_generator,
    callbacks=[early_stopping]
)



# Save the history
with open('training_history.pkl', 'wb') as f:
    pickle.dump(history.history, f)


# Evaluate the model
val_loss, val_acc, val_prec = model.evaluate(validation_generator)
print(f"Accuracy in validation set: {val_acc * 100:.2f}%")
print(f"Precision in validation set: {val_prec * 100:.2f}%")
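
The `EarlyStopping` callback used above can be read as a patience counter on `val_loss`. The following is a minimal sketch of that idea in plain Python, not Keras's actual implementation; the function name `early_stopping_epoch` and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation losses.

    Simplified illustration: training stops once the loss has failed to
    improve for `patience` consecutive epochs. Epochs are 0-indexed.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last improvement

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                # Stop here; with restore_best_weights=True the weights
                # from best_epoch would be restored.
                return epoch, best_epoch

    # Ran out of epochs without triggering early stopping
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: the best value occurs at epoch 1
losses = [1.00, 0.90, 0.95, 0.96, 0.97, 0.98, 0.99, 0.50]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=5)
print(f"Stopped at epoch {stop_epoch}, best weights from epoch {best_epoch}")
```

Note that the last loss in the example is never reached: training has already stopped, which is the trade-off of a finite patience.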

## Save the created model

In [None]:
# Save the model in the specified folder

model.save(r'./pento-ssr-challenge/my_model.keras')

## Load the saved model from the specified folder

In [None]:
# Load the model from the specified folder
model = load_model(r'./pento-ssr-challenge/my_model.keras')

## Plot accuracy / precision vs epoch and loss vs epoch

In [None]:
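def plot_metrics(history):
    # Get the metrics
    loss = history.history.get('loss', [])
    val_loss = history.history.get('val_loss', [])
    accuracy = history.history.get('accuracy', [])
    val_accuracy = history.history.get('val_accuracy', [])
    # Keras auto-numbers metric names (precision, precision_1, ...), so look the key up
    precision_key = next((k for k in history.history if k.startswith('precision')), 'precision')
    precision = history.history.get(precision_key, [])
    val_precision = history.history.get(f'val_{precision_key}', [])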
    
    # Create a figure with two subplots
    plt.figure(figsize=(14, 7))

    # Loss chart
    plt.subplot(1, 2, 1)
    plt.plot(loss, label='Loss')
    plt.plot(val_loss, label='Val Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Loss vs Epoch')
    plt.legend()

    # Accuracy / precision chart
    plt.subplot(1, 2, 2)
    plt.plot(accuracy, label='Accuracy')
    plt.plot(val_accuracy, label='Val Accuracy')
    plt.plot(precision, label='Precision')
    plt.plot(val_precision, label='Val Precision')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy / Precision')
    plt.title('Accuracy / Precision vs Epoch')
    plt.legend()

    plt.show()

# Call the function with the history of training
plot_metrics(history)

In [None]:
# Path to the image file
image_path = r'./pento-ssr-challenge/metrics.png'

# Load the image
img = mpimg.imread(image_path)

# Display the image
plt.imshow(img)
plt.axis('off')  # Optional: Hide the axes
plt.show()


### Here, we define a classify_images function. It applies a probability threshold to decide whether the image of a dog belongs to one of our classes or should be labeled 'other'.

In [None]:
def classify_images(image_dir):
    # Load the model
    model = tf.keras.models.load_model('./pento-ssr-challenge/my_model.keras')

    # Define class labels (indices follow the alphabetical folder order used by flow_from_directory;
    # the model has 4 outputs, so only indices 0-3 can occur)
    class_labels = {0: 'golden_retriever', 1: 'german_shepherd', 2: 'french_bulldog', 3: 'poodle'}
    
    # Define the probability threshold
    threshold = 0.75

    # Recursively iterate over each file in the directory and subdirectories
    for root, _, files in os.walk(image_dir):
        for img_file in files:
            img_path = os.path.join(root, img_file)

            # Ensure it's a file
            if os.path.isfile(img_path):
                try:
                    # Load and preprocess the image
                    img = load_img(img_path, target_size=(299, 299))
                    img_array = img_to_array(img)
                    img_array = np.expand_dims(img_array, axis=0)
                    img_array = preprocess_input(img_array)

                    # Make the prediction
                    predictions = model.predict(img_array)
                    max_prob = np.max(predictions)
                    predicted_class = np.argmax(predictions, axis=1)[0]

                    # Get the corresponding label
                    if max_prob >= threshold:
                        class_label = class_labels.get(predicted_class, 'other')
                    else:
                        class_label = 'other'

                    print(f"Image: {img_file}, Classification: {class_label} (Probability: {max_prob:.2f})")

                except Exception as e:
                    print(f"Error processing image {img_file}: {e}")

# Path to the directory with images
new_images_dir = 'C:/Users/Julian Amuedo/Desktop/pento-ssr-challenge/dogs'

# Call the function
classify_images(new_images_dir)
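
The thresholding rule inside `classify_images` can be isolated and checked without loading the model. This is a minimal sketch under that assumption; `label_with_threshold` is a hypothetical helper name and the probability vectors are invented.

```python
def label_with_threshold(probabilities, class_labels, threshold=0.75):
    """Return (label, probability) for one softmax output vector.

    The most likely class is accepted only if its probability reaches
    `threshold`; otherwise the image is labeled 'other'.
    """
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    best_prob = probabilities[best_index]
    if best_prob >= threshold:
        return class_labels.get(best_index, "other"), best_prob
    return "other", best_prob


# Same index-to-label mapping as the four model outputs
labels = {0: "golden_retriever", 1: "german_shepherd", 2: "french_bulldog", 3: "poodle"}

confident = [0.05, 0.90, 0.03, 0.02]   # hypothetical output for a German Shepherd
uncertain = [0.40, 0.30, 0.20, 0.10]   # hypothetical output for an unknown breed

print(label_with_threshold(confident, labels))
print(label_with_threshold(uncertain, labels))
```

Raising the threshold sends more images to 'other'; lowering it makes the model more willing to commit to one of the four breeds.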


### Code breakdown for the image data generator and its corresponding generators

In [None]:
# Define the path to the main folder that contains all the classes
base_dir = r'./stanford_dogs/images/filtered_images'

#### This is the path to the directory where the images are organized into subdirectories, each corresponding to a different class. Each subdirectory name should match a class label.

In [None]:
# Create a datagenerator with preprocess_input for InceptionV3
datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,  
    validation_split=0.15,  
    horizontal_flip=True,
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    zoom_range=0.2,
    shear_range=0.2,
    brightness_range=[0.7, 1.3]
)


### ImageDataGenerator: A class from TensorFlow/Keras used to generate batches of tensor image data with real-time data augmentation.

### preprocessing_function=preprocess_input: Applies the preprocessing specific to the InceptionV3 model, which rescales pixel values from [0, 255] to [-1, 1]. This step is essential because the pre-trained weights expect inputs in that range.

### validation_split=0.15: Specifies that 15% of the images will be used for validation. This means the remaining 85% will be used for training. The ImageDataGenerator will use this split to automatically create validation data from the training data.

### horizontal_flip=True: Randomly flips the images horizontally. This augmentation technique helps the model generalize better by seeing the same image in different orientations.

### rotation_range=30: Randomly rotates the images by up to 30 degrees in either direction. This helps the model become robust to slight rotations.

### width_shift_range=0.2: Randomly shifts the images horizontally by a fraction of the width (20% of the width). This makes the model more robust to slight changes in the image's horizontal position.

### height_shift_range=0.2: Randomly shifts the images vertically by a fraction of the height (20% of the height). Similar to width_shift_range, this helps with vertical positional changes.

### zoom_range=0.2: Randomly zooms the images in or out by up to 20% (a zoom factor between 0.8 and 1.2). This helps the model handle variations in the scale of objects.

### shear_range=0.2: Applies random shearing transformations to the images. Shearing is a form of distortion where the image is slanted along one axis; in Keras the value is the maximum shear angle in degrees, so 0.2 is a very mild distortion. This can improve the model’s robustness to such distortions.

### brightness_range=[0.7, 1.3]: Randomly changes the brightness of the images within the specified range. This helps the model handle variations in lighting conditions.
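
Three of the steps above are easy to show on a toy image. The sketch below uses plain Python lists rather than Keras; the helper names are hypothetical, and the brightness function is a simplification (multiply and clip) rather than the exact routine Keras uses.

```python
def scale_to_unit_range(pixel):
    """Map a pixel value from [0, 255] to [-1, 1], as InceptionV3 preprocessing does."""
    return pixel / 127.5 - 1.0


def horizontal_flip(image):
    """Mirror an image stored as rows of pixel values (a list of lists)."""
    return [row[::-1] for row in image]


def adjust_brightness(image, factor):
    """Simplified brightness change: scale each pixel and clip to [0, 255]."""
    return [[min(255.0, max(0.0, p * factor)) for p in row] for row in image]


# A made-up 2x3 grayscale image
toy_image = [
    [0, 51, 255],
    [102, 153, 204],
]

flipped = horizontal_flip(toy_image)
brighter = adjust_brightness(toy_image, 1.3)
scaled = [[scale_to_unit_range(p) for p in row] for row in toy_image]

print("flipped: ", flipped)
print("brighter:", brighter)
print("scaled:  ", scaled)
```

In the real generator the augmentations are sampled randomly for every batch, so the model rarely sees exactly the same pixels twice.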

In [None]:
# Training data generator
train_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(299, 299),  # Input size used with InceptionV3
    batch_size=32,
    class_mode='categorical',
    subset='training'
)

# Validation data generator
validation_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(299, 299),
    batch_size=32,
    class_mode='categorical',
    subset='validation'
)


### train_generator: This is the generator that will provide batches of images for training.

### base_dir: The directory containing the images organized in subdirectories by class.

### target_size=(299, 299): Resizes all images to 299x299 pixels, which is the input size expected by the InceptionV3 model.

### batch_size=32: Number of images to return in each batch.

### class_mode='categorical': Specifies that the labels are one-hot encoded vectors (for multi-class classification).

### subset='training': Indicates that this generator should use the portion of the data designated for training, as specified by the validation_split parameter.

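### validation_generator: This generator provides batches of images for validation. Because it is created from the same datagen, the random augmentations above are also applied to the validation images.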

### subset='validation': Indicates that this generator should use the portion of the data designated for validation.    
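
To make `class_mode='categorical'` and `batch_size=32` concrete, here is a small sketch in plain Python. It assumes, as I understand `flow_from_directory` to behave, that class indices are assigned from the sorted subdirectory names; the helper names and the image count of 850 are hypothetical.

```python
import math


def class_indices_from_folders(folder_names):
    """Assign an integer index to each class folder in sorted order."""
    return {name: index for index, name in enumerate(sorted(folder_names))}


def one_hot(index, num_classes):
    """Encode a class index as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def batches_per_epoch(num_images, batch_size=32):
    """Number of batches needed to see every image once (last batch may be smaller)."""
    return math.ceil(num_images / batch_size)


folders = [
    "poodle",
    "n02108915-French_bulldog",
    "n02099601-golden_retriever",
    "n02106662-German_shepherd",
]
indices = class_indices_from_folders(folders)
print(indices)

# One-hot label for a French Bulldog image
print(one_hot(indices["n02108915-French_bulldog"], len(folders)))

# Assumed training-set size, for illustration only
print(batches_per_epoch(850, batch_size=32))
```

The resulting index order is the one the prediction code has to use when it maps an output position back to a breed name.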
    
    

# Model explanation


1. **Base Model: InceptionV3 (without top layers)**

   We load the pre-trained InceptionV3 model without the top layers (i.e., without the fully connected layers) to use it as a feature extractor.

   $$ f(\mathbf{X}) = \text{InceptionV3}(\mathbf{X}, \text{weights} = \text{imagenet}, \text{include\_top} = \text{False}) $$

   where $\mathbf{X}$ is the input image and $f(\mathbf{X})$ is the feature map produced by the convolutional base.

2. **New Custom Layers**

   - **Global Average Pooling Layer:**

     The global average pooling layer computes the average of each feature map across its spatial dimensions:

     $$ \mathbf{z} = \text{GlobalAveragePooling2D}(f(\mathbf{X})) $$

   - **Dropout Layer (0.5):**

     Dropout randomly sets a fraction $p = 0.5$ of the input units to zero during training:

     $$ \mathbf{z}' = \text{Dropout}(0.5)(\mathbf{z}) $$

   - **Dense Layer (1024 units, ReLU activation):**

     A fully connected dense layer with 1024 units and ReLU activation:

     $$ \mathbf{h} = \text{ReLU}(W \mathbf{z}' + b) $$

     where $W$ is the weight matrix, $b$ is the bias vector, and $\mathbf{h}$ is the output.

   - **Dropout Layer (0.5):**

     Another dropout layer is applied:

     $$ \mathbf{h}' = \text{Dropout}(0.5)(\mathbf{h}) $$

   - **Output Dense Layer (4 units, Softmax activation):**

     The final dense layer with 4 units (for the 4 classes) and softmax activation to get class probabilities:

     $$ \mathbf{y} = \text{Softmax}(W' \mathbf{h}' + b') $$

3. **Model Compilation**

   - **Adam Optimizer:**

     The Adam optimizer is defined as:

     $$ \text{AdamOptimizer}(\text{learning\_rate} = 0.0001) $$

   - **Categorical Cross-Entropy Loss:**

     The loss function is categorical cross-entropy:

     $$ \text{Loss} = - \sum_{i=1}^{C} y_i \log(p_i) $$

     where $C$ is the number of classes, $y_i$ is the true label (1 for the correct class, 0 otherwise), and $p_i$ is the predicted probability for class $i$.

   - **Accuracy Metric:**

     Accuracy measures the proportion of correctly classified samples:

     $$ \text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} $$

   - **Precision Metric:**

     Precision calculates the ratio of true positive predictions to the total number of positive predictions:

     $$ \text{Precision} = \frac{TP}{TP + FP} $$

     where $TP$ is the number of true positives and $FP$ is the number of false positives.
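
The loss and metric formulas above can be checked by hand on a few predictions. This is a minimal sketch in plain Python with invented probabilities; the per-class precision here is computed from the arg-max prediction, which is a simplification and not necessarily how `tf.keras.metrics.Precision` aggregates its counts.

```python
import math


def categorical_cross_entropy(y_true, y_prob):
    """Loss = -sum_i y_i * log(p_i) for one sample with a one-hot label."""
    return -sum(y * math.log(p) for y, p in zip(y_true, y_prob) if y > 0)


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(true_classes, predicted_classes):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(true_classes, predicted_classes))
    return correct / len(true_classes)


def precision_for_class(true_classes, predicted_classes, target):
    """TP / (TP + FP) for a single target class."""
    pairs = list(zip(true_classes, predicted_classes))
    tp = sum(t == target and p == target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical softmax outputs for four images and their true class indices
probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.5, 0.1, 0.3, 0.1],  # true class is 2, but class 0 wins
    [0.1, 0.1, 0.1, 0.7],
]
true_classes = [0, 1, 2, 3]
predicted = [argmax(p) for p in probs]

print("loss of first sample:", categorical_cross_entropy([1, 0, 0, 0], probs[0]))
print("accuracy:", accuracy(true_classes, predicted))
print("precision for class 0:", precision_for_class(true_classes, predicted, 0))
```

The third image shows why precision adds information: class 0 is predicted twice but is correct only once, which accuracy alone does not attribute to a specific class.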


## Extra comments I'd like to make in person

Why I chose InceptionV3: I could explain why I selected InceptionV3 as my base model, emphasizing its success in image classification tasks and how its pre-trained weights on ImageNet speed up training and improve performance.

Freezing layers: I'd discuss the importance of freezing the layers of the pre-trained model, explaining that it prevents updating the weights during training, which preserves the learned features and allows the new layers to focus on learning high-level representations for my specific task.

Custom layers: I would explain the reasoning behind adding a GlobalAveragePooling2D layer, a common practice that reduces overfitting by collapsing each feature map into a single average instead of flattening it (this discards the spatial layout but greatly cuts the number of parameters), and how the additional dense and dropout layers add capacity and regularization, respectively.
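
As a small illustration of what global average pooling computes, the sketch below averages each channel of a toy feature map in plain Python. The function name and the 2x2x2 values are made up; in the real model the pooled tensor is much larger (to my knowledge 8x8x2048 for a 299x299 input to InceptionV3).

```python
def global_average_pooling(feature_map):
    """Average each channel over all spatial positions.

    feature_map is indexed as feature_map[row][col][channel] and the
    result is a list with one value per channel.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])

    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value

    return [total / (height * width) for total in totals]


# Toy 2x2 feature map with 2 channels
toy_features = [
    [[1, 10], [2, 20]],
    [[3, 30], [4, 40]],
]
print(global_average_pooling(toy_features))
```

Whatever the spatial size, the output length equals the number of channels, which is why the dense layer that follows needs far fewer weights than it would after flattening.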

Softmax activation: I'd clarify why I used a softmax layer with 4 outputs, corresponding to the 4 dog breeds I’m trying to classify, ensuring that the model outputs class probabilities that sum to 1.
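
A quick way to see that the four outputs sum to 1 is to compute softmax by hand. This is a minimal sketch in plain Python; the logits are invented, and subtracting the maximum is a standard trick for numerical stability rather than something taken from the notebook.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for the four breeds
logits = [2.0, 1.0, 0.1, -1.0]
probabilities = softmax(logits)

print(probabilities)
print("sum:", sum(probabilities))
```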

Why precision as a metric: I’d go into why I chose to track precision: when mistaking one breed for a similar-looking one is costly, precision shows how often a predicted breed is actually the correct one.

Regularization with dropout: I could explain how dropout reduces overfitting by randomly ignoring a fraction of the neurons during training, which forces the network to generalize better.
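
The behaviour described here can be sketched in a few lines of plain Python. This is an illustrative version of "inverted" dropout, where surviving units are scaled by 1 / (1 - rate) so the expected activation stays the same; the function name and the seeded generator are assumptions made for the example.

```python
import random


def dropout(values, rate=0.5, training=True, rng=None):
    """Zero out each value with probability `rate` during training.

    Kept values are scaled by 1 / (1 - rate). At inference time
    (training=False) the input is returned unchanged.
    """
    if not training or rate == 0.0:
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


activations = [1.0] * 10
rng = random.Random(0)  # fixed seed so the example is repeatable

print("training: ", dropout(activations, rate=0.5, training=True, rng=rng))
print("inference:", dropout(activations, rate=0.5, training=False))
```

Because a different random subset is dropped on every training step, no single neuron can be relied on, which pushes the network toward more redundant features.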

How we could benefit from more data (images) of other breeds: I would like to explain that training on images of additional breeds could lead to better prediction of the 'other' class.