# Introduction to Computer Vision – Homework
## By: Hatim SALMI, Taha ELMARZOUKI

To accomplish this task, we follow these steps:

1. Data Preparation: We load the CIFAR-10 dataset and relabel its classes into two groups.

2. Model Architecture: We design a Convolutional Neural Network (CNN) suitable for binary classification.

3. Loss Function: We use binary cross-entropy, a loss function suited to binary classification.

4. Model Training: We train the model on the training set.

5. Evaluation: We evaluate the model using precision, recall, F1 score, accuracy, and a confusion matrix.

# Step 1: Data Preparation
## We load the CIFAR-10 dataset and create labels for the two categories.

In this step, we transform the multiclass labels from the CIFAR-10 dataset into binary labels. Our goal is to group the CIFAR-10 classes into two categories: "can fly" and "cannot fly". Here’s a detailed explanation of the process, including the code.


- Before running any TensorFlow code, we ensure that TensorFlow is installed.

- We import the necessary libraries, including TensorFlow and utilities for handling data, building the model, and evaluating performance.

- We load the CIFAR-10 dataset, which consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class (50,000 training images and 10,000 test images).

- We define the classes that can fly and those that cannot:
    - The CIFAR-10 dataset consists of 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.
    - Can Fly: 'airplane' and 'bird' (class indices 0 and 2 in CIFAR-10).
    - Cannot Fly: all other classes (indices 1, 3, 4, 5, 6, 7, 8, 9 in CIFAR-10).

- We then create binary labels for these classes:
    - The original labels (y_train and y_test) are integers from 0 to 9, representing the 10 classes.
    - 1 indicates that the class belongs to the "can fly" group.
    - 0 indicates that the class belongs to the "cannot fly" group.
    - This is done by checking whether each label is in the list of classes that can fly (fly_classes).

- We normalize the image data to the range [0, 1] by dividing the pixel values by 255, which generally helps training.
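
The relabeling described above can be sketched without downloading the dataset. This is a minimal illustration on a hand-made label list; `toy_labels` and the helper `to_binary` are hypothetical names, not part of the notebook. It also shows that grouping 2 of the 10 equally sized CIFAR-10 classes as "can fly" yields about 20% positive labels.

```python
# CIFAR-10 class names in index order.
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]
FLY_CLASSES = {0, 2}  # airplane, bird


def to_binary(labels, fly_classes=FLY_CLASSES):
    """Map each multiclass label to 1 ("can fly") or 0 ("cannot fly")."""
    return [1 if label in fly_classes else 0 for label in labels]


# Toy labels: one example of every class (a stand-in for y_train).
toy_labels = list(range(10))
toy_binary = to_binary(toy_labels)
positive_fraction = sum(toy_binary) / len(toy_binary)

print(toy_binary)         # [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(positive_fraction)  # 0.2
```

The two groups are therefore imbalanced, which is worth keeping in mind when reading accuracy figures.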



In [None]:
# Install TensorFlow
!pip install tensorflow

# Import necessary libraries
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score, accuracy_score
import seaborn as sns
import matplotlib.pyplot as plt

# Load the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Define the classes that can and cannot fly
fly_classes = [0, 2]  # airplane, bird
not_fly_classes = [1, 3, 4, 5, 6, 7, 8, 9]  # the rest

# Create binary labels
def create_binary_labels(y, fly_classes):
    y_binary = np.isin(y, fly_classes).astype(int)
    return y_binary

y_train_binary = create_binary_labels(y_train, fly_classes)
y_test_binary = create_binary_labels(y_test, fly_classes)

# Normalize the data
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255



# Step 2: Model Architecture
## We design a simple CNN for binary classification.

1. We build a Convolutional Neural Network (CNN) with several layers, including convolutional, max-pooling, dense, and dropout layers. The output layer uses a sigmoid activation function for binary classification.

2. We compile the model using the Adam optimizer, the binary cross-entropy loss function (suitable for binary classification), and accuracy as the evaluation metric.

Explanation: 

CNN Layers:

    Convolutional Layers: These layers apply a set of filters to the input image, capturing various features such as edges, textures, and patterns.
    Max Pooling Layers: These layers reduce the spatial dimensions of the feature maps, making the model more computationally efficient and helping to reduce overfitting.
    Flatten Layer: This layer converts the stack of 2D feature maps into a 1D vector, which can be fed into fully connected layers.
    Dense Layers: These layers perform the final classification based on the features extracted by the convolutional layers.
    Dropout Layer: This layer randomly sets a fraction of input units to 0 during training to prevent overfitting.
    Output Layer: For binary classification, this layer uses a sigmoid activation function to produce a probability score between 0 and 1.
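
To make the effect of each layer concrete, the sketch below traces the feature-map size and parameter count through the Step 2 model (32, 64, and 128 filters, then a 128-unit dense layer). It assumes the Keras defaults used there: 3x3 convolutions with 'valid' padding and stride 1, and 2x2 non-overlapping pooling. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * out_channels


size, channels = 32, 3  # CIFAR-10 input: 32x32 RGB
total_params = 0
for filters in (32, 64, 128):
    total_params += conv_params(3, channels, filters)
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after conv+pool with {filters} filters: {size}x{size}x{channels}")

flat = size * size * channels       # length of the flattened vector
total_params += (flat + 1) * 128    # Dense(128)
total_params += (128 + 1) * 1       # Dense(1) output layer
print(flat, total_params)           # 512 159041
```

The spatial size shrinks 32 -> 15 -> 6 -> 2, so the flatten layer hands a 512-element vector to the dense layers. Flatten and dropout add no trainable parameters.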

Activation Functions:

    ReLU (Rectified Linear Unit): This activation function introduces non-linearity to the model, allowing it to learn complex patterns.
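
As a small illustration of the two activations used in this model, here is a plain-Python sketch of ReLU and of the sigmoid applied at the output layer. It is a scalar version for clarity only, not how Keras implements them.

```python
import math


def relu(x):
    """ReLU: pass positive values through, clamp negatives to 0."""
    return max(0.0, x)


def sigmoid(x):
    """Sigmoid: squash any real number into (0, 1).

    Split by sign so math.exp never receives a large positive argument.
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


print(relu(-2.0), relu(3.0))  # 0.0 3.0
print(sigmoid(0.0))           # 0.5
```

A sigmoid output of 0.5 corresponds to a pre-activation of 0, which is why 0.5 is the natural decision threshold between the two classes.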

Optimizer and Loss Function:

    Binary cross-entropy penalizes the model according to how far the predicted probability is from the true label (0 or 1), so minimizing it pushes the sigmoid output toward the correct class for each image. We pair this loss with the Adam optimizer and track accuracy as the metric during training and evaluation.
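
For reference, binary cross-entropy for a label y and predicted probability p is -(y*log(p) + (1 - y)*log(1 - p)), averaged over the samples. The sketch below computes it in plain Python; the clipping constant `eps` is an assumption added here to avoid log(0), and the example probabilities are made up for illustration.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over paired labels and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0]
confident_right = binary_cross_entropy(labels, [0.9, 0.1])
confident_wrong = binary_cross_entropy(labels, [0.1, 0.9])
print(confident_right < confident_wrong)  # True
```

Confidently wrong predictions are penalized far more heavily than confidently right ones are rewarded, which is what drives the probabilities toward the true labels.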


In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    # First Convolutional Layer
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)), # 32 filters, each with a size of 3x3 and we apply the ReLU activation function.
    # First Max Pooling Layer
    MaxPooling2D((2, 2)), # Reduces the spatial dimensions by a factor of 2
    # Second Convolutional Layer
    Conv2D(64, (3, 3), activation='relu'), # 64 filters, each with a size of 3x3 and we apply the ReLU activation function.
    # Second Max Pooling Layer
    MaxPooling2D((2, 2)), # Reduces the spatial dimensions by a factor of 2
    # Third Convolutional Layer
    Conv2D(128, (3, 3), activation='relu'),
    # Third Max Pooling Layer
    MaxPooling2D((2, 2)),
    # Flatten Layer
    Flatten(), # Converts the 3D feature maps into 1D feature vectors.
    # First Dense Layer
    Dense(128, activation='relu'),
    # Dropout Layer
    Dropout(0.5), # Randomly sets 50% of the input units to 0 during each update to prevent overfitting.
    # Output Layer
    Dense(1, activation='sigmoid') # Applies the sigmoid activation function to produce a probability score for binary classification.
])
# Compile with the Adam optimizer and binary cross-entropy loss, which is suitable for binary classification tasks.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])


# Step 3: Model Training
## We train the model on the training data.

We train the model on the training data, using 20% of it for validation. The model is trained for 20 epochs with a batch size of 64.
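
The arithmetic behind these settings can be sketched as follows. Per the Keras documentation, `validation_split` takes the held-out samples from the end of the arrays before any shuffling. The variable names below are illustrative.

```python
import math

n_samples = 50_000        # CIFAR-10 training images
validation_split = 0.2
batch_size = 64

# Number of samples kept for training and held out for validation.
n_train = round(n_samples * (1 - validation_split))
n_val = n_samples - n_train

# Gradient updates per epoch (the last batch may be smaller in general).
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, steps_per_epoch)  # 40000 10000 625
```

So each of the 20 epochs performs 625 weight updates on 40,000 images and is validated on the remaining 10,000.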

In [None]:
history = model.fit(x_train, y_train_binary, epochs=20, batch_size=64, validation_split=0.2)


Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


## Analysis & Interpretation:

    1. Epochs 1-6:

        Training loss and validation loss both decrease steadily, and training and validation accuracy both increase, indicating that the model is learning effectively.
        By Epoch 6, the training loss is 0.2517 and the validation loss is 0.2808, with training accuracy at 90.01% and validation accuracy at 88.66%.

    2. Epochs 7-10:

        The training loss continues to decrease, and training accuracy continues to increase.
        However, the validation loss begins to fluctuate and increase slightly, while validation accuracy improves only marginally and then plateaus.
        By Epoch 10, the training loss is 0.1637 and the validation loss is 0.3111, with training accuracy at 93.54% and validation accuracy at 89.62%.

    3. Epochs 11-20:

        The model continues to improve its performance on the training set.
        The validation loss, however, increases more noticeably, and validation accuracy plateaus or declines slightly.
        By Epoch 20, the training loss is 0.0499 and the validation loss is 0.5711, with training accuracy at 98.09% and validation accuracy at 88.30%.


    Interpretation of Results

    Initial Learning (Epochs 1-6):

        The model shows effective learning, with both training and validation losses decreasing and accuracies increasing. This indicates that the model is able to generalize well to the validation set initially.

    Overfitting (Epochs 7-20):

        Starting around Epoch 7, the validation loss begins to increase while the training loss continues to decrease. This suggests that the model is starting to overfit the training data.
        Overfitting occurs when the model learns to perform very well on the training data but fails to generalize to new, unseen data. This is evident as the validation loss increases despite the high training accuracy.

    Validation Accuracy Plateau:

        The validation accuracy plateaus around 88-89%, indicating that the model has reached its generalization capacity given the current architecture and training setup.
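
One common response to this pattern is early stopping: halt training once the validation loss has not improved for a set number of epochs and keep the best weights. Keras provides this as the `tf.keras.callbacks.EarlyStopping` callback (with `monitor`, `patience`, and `restore_best_weights` arguments). The sketch below only illustrates the stopping rule in plain Python; the helper name and the loss values are made up for illustration and are not taken from the training log.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops at the first epoch where the validation loss has
    failed to improve for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses), best_epoch


# Hypothetical validation losses: improving, then drifting upward.
illustrative_losses = [0.40, 0.33, 0.30, 0.29, 0.285, 0.28,
                       0.29, 0.30, 0.31, 0.32]
print(early_stopping_epoch(illustrative_losses))  # (9, 6)
```

With a patience of 3, training on this illustrative curve would stop at epoch 9 and the weights from epoch 6 would be kept.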

    


# Step 4: Evaluation
## We evaluate the model using various metrics.

The cell below reloads the data, holds out 20% of the training set for validation with train_test_split, and rebuilds and retrains a slightly different CNN from the one in Step 2 (a third convolutional layer with 64 filters, a 64-unit dense layer, and no dropout) before evaluating it.

We evaluate the trained model on the test data, calculating predictions and then computing evaluation metrics such as precision, recall, F1 score, and accuracy. We also generate a confusion matrix.

In the code below, model.compile() sets the loss function and optimizer, model.evaluate() computes the test loss and test accuracy, which we then print, and model.predict() returns the predicted probabilities for the test set.

Finally, we threshold the probabilities at 0.5 and calculate the metrics.
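
The step from predicted probabilities to class labels is a simple threshold. Here is a minimal sketch, with a hypothetical helper `threshold` and made-up probabilities, that mirrors the strict `> 0.5` comparison used in the cell below.

```python
def threshold(probabilities, cutoff=0.5):
    """Label a sample 1 ("can fly") only if its probability exceeds cutoff."""
    return [1 if p > cutoff else 0 for p in probabilities]


example_probs = [0.03, 0.49, 0.50, 0.51, 0.97]  # hypothetical model outputs
print(threshold(example_probs))              # [0, 0, 0, 1, 1]
print(threshold(example_probs, cutoff=0.3))  # [0, 1, 1, 1, 1]
```

Lowering the cutoff labels more samples as "can fly", which tends to raise recall at the cost of precision; 0.5 is the conventional default rather than a tuned value.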



In [None]:
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix
from sklearn.model_selection import train_test_split
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Load CIFAR-10 dataset
(x_train_full, y_train_full), (x_test, y_test) = cifar10.load_data()

# Normalize pixel values to range [0, 1]
x_train_full = x_train_full.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Create binary labels (1 = "can fly": airplane or bird; 0 = "cannot fly")
y_train_full_binary = np.where((y_train_full == 0) | (y_train_full == 2), 1, 0)
y_test_binary = np.where((y_test == 0) | (y_test == 2), 1, 0)

# Split the full training set into training and validation sets (80-20 split)
x_train, x_val, y_train, y_val = train_test_split(x_train_full, y_train_full_binary, test_size=0.2, random_state=42)

# Define CNN architecture
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid')  # Output layer with sigmoid activation for binary classification
])

# Compile the model with appropriate loss function and optimizer
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(x_train, y_train, epochs=20, batch_size=64, validation_data=(x_val, y_val))

# Evaluate the model on the testing set
test_loss, test_accuracy = model.evaluate(x_test, y_test_binary)
print(f'Test loss: {test_loss}')
print(f'Test accuracy: {test_accuracy}')

# Make predictions on the testing set
y_test_pred = model.predict(x_test)
# Threshold the probabilities: values above 0.5 become 1 ("can fly"), all others 0 ("cannot fly")
y_test_pred_binary = (y_test_pred > 0.5).astype(int)

# Calculate evaluation metrics
precision = precision_score(y_test_binary, y_test_pred_binary)
recall = recall_score(y_test_binary, y_test_pred_binary)
f1 = f1_score(y_test_binary, y_test_pred_binary)
conf_matrix = confusion_matrix(y_test_binary, y_test_pred_binary)

print(f'Precision: {precision}')
print(f'Recall: {recall}')
print(f'F1 Score: {f1}')
print(f'Confusion Matrix:\n{conf_matrix}')


Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Test loss: 0.4624193012714386
Test accuracy: 0.8784000277519226
Precision: 0.697979797979798
Recall: 0.691
F1 Score: 0.6944723618090453
Confusion Matrix:
[[7402  598]
 [ 618 1382]]


## Interpretation
Training Performance:
    Loss and Accuracy:
        The training loss gradually decreases from approximately 0.43 to 0.07 over the 20 epochs, indicating that the model's predictions are improving.
        The training accuracy increases from around 0.82 to 0.97, showing that the model is learning to classify the training data more accurately with each epoch.

Validation Performance:
    Loss and Accuracy:
        The validation loss starts at about 0.39 and ends at about 0.47: it decreases during the first epochs but rises after around the 10th epoch. This suggests overfitting, where the model starts to memorize the training data instead of generalizing well to unseen data.
        The validation accuracy reaches a peak of around 0.89 after approximately the 9th epoch but then starts to decrease slightly, similar to the validation loss trend.

Testing Performance:
    Loss and Accuracy:
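        The test loss is approximately 0.46, and the test accuracy is about 0.88, so the model performs about as well on unseen test data as it did on the validation data. Note that 8,000 of the 10,000 test images belong to the "cannot fly" class, so always predicting "cannot fly" would already give 0.80 accuracy; precision, recall, and F1 score are therefore more informative here.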

Precision, Recall, and F1 Score:
    Precision: Precision is around 0.70, indicating that when the model predicts a sample to belong to the "can fly" category, it is correct about 70% of the time.
    Recall: Recall is approximately 0.69, indicating that the model correctly identifies about 69% of the samples belonging to the "can fly" category.
    F1 Score: The F1 score is about 0.69, which is the harmonic mean of precision and recall. It provides a balance between precision and recall.

Confusion Matrix:
    The confusion matrix shows the following:
        True Negative (TN): 7402 samples were correctly classified as "cannot fly".
        False Positive (FP): 598 samples were incorrectly classified as "can fly" when they actually cannot.
        False Negative (FN): 618 samples were incorrectly classified as "cannot fly" when they actually can.
        True Positive (TP): 1382 samples were correctly classified as "can fly".
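
As a consistency check, the reported metrics can be recomputed directly from the four confusion-matrix counts above. This sketch uses only the standard definitions and the numbers already printed by the notebook.

```python
# Counts from the confusion matrix (rows: actual, columns: predicted).
tn, fp, fn, tp = 7402, 598, 618, 1382
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)   # of predicted "can fly", how many are right
recall = tp / (tp + fn)      # of actual "can fly", how many are found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
# accuracy=0.8784 precision=0.6980 recall=0.6910 f1=0.6945
```

The values match the printed test accuracy, precision, recall, and F1 score, confirming that the matrix has been read in the right orientation.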

## Suggestions to possibly improve performance

Model Architecture:

    We may need to experiment with different CNN architectures, including variations in the number of layers, kernel sizes, and filter depths.

Data Augmentation:

    We may need to apply data augmentation techniques such as rotation, flipping, scaling, and shifting to artificially increase the size of the training dataset and improve model generalization.
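
To show what two of these transformations do to the pixel grid, here is a plain-Python sketch on a tiny single-channel "image" stored as nested lists. The helper names are hypothetical. In practice one would more likely rely on Keras preprocessing layers such as `RandomFlip` and `RandomTranslation` than on hand-written code.

```python
def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def shift_right(image, pixels, fill=0):
    """Shift columns right by `pixels`, padding the left edge with `fill`."""
    shifted = []
    for row in image:
        k = max(0, min(pixels, len(row)))  # clamp to the row width
        shifted.append([fill] * k + row[:len(row) - k])
    return shifted


tiny = [[1, 2, 3],
        [4, 5, 6]]
print(horizontal_flip(tiny))  # [[3, 2, 1], [6, 5, 4]]
print(shift_right(tiny, 1))   # [[0, 1, 2], [0, 4, 5]]
```

Each transformed copy keeps its original label, since a flipped or slightly shifted airplane is still an airplane.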

Hyperparameter Tuning:

    We may need to search for optimal hyperparameters such as learning rate and batch size.
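
A simple way to organize such a search is a grid over candidate values. The sketch below shows only the search loop. `fake_score` is a placeholder that stands in for "train the model and return validation accuracy"; its formula is invented purely so the example runs and says nothing about which settings actually work best.

```python
from itertools import product


def grid_search(train_and_score, learning_rates, batch_sizes):
    """Try every combination and return (best_score, learning_rate, batch_size)."""
    best = None
    for lr, bs in product(learning_rates, batch_sizes):
        score = train_and_score(lr, bs)
        if best is None or score > best[0]:
            best = (score, lr, bs)
    return best


def fake_score(lr, bs):
    """Placeholder objective, NOT a real training run."""
    return -abs(lr - 1e-3) * 100 - abs(bs - 64) / 1000


best_score, best_lr, best_bs = grid_search(
    fake_score, learning_rates=[1e-2, 1e-3, 1e-4], batch_sizes=[32, 64, 128]
)
print(best_lr, best_bs)  # 0.001 64
```

In a real run, `train_and_score` would build and fit the CNN with the given settings and return a validation metric, so the cost grows with the number of combinations tried.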
