<a href="https://colab.research.google.com/github/Zendellll/Road2_AI_Course/blob/main/AICourse_task7.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive', force_remount=True)


Mounted at /content/drive


In [None]:
!mkdir -p /content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds
!mkdir -p /content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/images
!mkdir -p /content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/labels

!cp /content/drive/MyDrive/drone_data/*.jpg /content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/images
!cp /content/drive/MyDrive/drone_data/*.txt /content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/labels
!cp /content/drive/MyDrive/bird_data/*.jpg /content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/images
!cp /content/drive/MyDrive/bird_data/*.txt /content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/labels

In [None]:
import torch
import torchvision.transforms as transforms
import random
import os
import cv2
import numpy as np
from keras.preprocessing.image import load_img, img_to_array
from google.colab.patches import cv2_imshow

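def augment_image(image):
    # OpenCV loads images as BGR; convert to RGB first so the PIL image returned
    # below (and the RGB2BGR conversion applied before saving) keeps correct colors
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # Convert image to PyTorch tensor
    image_tensor = transforms.ToTensor()(image)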

    # Randomly choose an augmentation
    augmentation_type = random.choice(['flip', 'color_jitter', 'rotation'])

    # Apply random flip
    if augmentation_type == 'flip':
        image_tensor = transforms.RandomHorizontalFlip()(image_tensor)

    # Apply random color jitter
    elif augmentation_type == 'color_jitter':
        image_tensor = transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.2)(image_tensor)

    # Apply random rotation
    elif augmentation_type == 'rotation':
        angle = random.uniform(-15, 15)
        image_tensor = transforms.functional.rotate(image_tensor, angle)

    # Convert back to PIL Image
    augmented_image = transforms.ToPILImage()(image_tensor)

    return augmented_image

# Function to convert YOLO coordinates to bounding box coordinates
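def yolo_to_bbox(image_shape, class_index, x_center, y_center, width, height):
    img_height, img_width, _ = image_shape

    # Scale the normalized coordinates to pixels and clamp them to the image
    # bounds, since a negative index would silently wrap around when slicing
    x_min = max(0, int((x_center - width / 2) * img_width))
    y_min = max(0, int((y_center - height / 2) * img_height))
    x_max = min(img_width, int((x_center + width / 2) * img_width))
    y_max = min(img_height, int((y_center + height / 2) * img_height))

    return x_min, y_min, x_max, y_max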

# Function to randomly crop a 32x32 section from the image
def random_crop(image, crop_size=(32, 32)):
    img_height, img_width, _ = image.shape

    # Randomly choose the top-left corner of the crop
    y_start = np.random.randint(0, img_height - crop_size[0] + 1)
    x_start = np.random.randint(0, img_width - crop_size[1] + 1)

    # Crop the image
    cropped_img = image[y_start:y_start + crop_size[0], x_start:x_start + crop_size[1]]

    return cropped_img
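
To make the YOLO-to-pixel conversion above concrete, here is a small worked example. It is only a sketch: `yolo_to_pixel_box` is a hypothetical helper that mirrors the arithmetic of `yolo_to_bbox` and additionally clamps the box to the image bounds (an assumption added here so that boxes touching the border still slice cleanly). The 640x480 image size and the box values are made up for illustration.

```python
def yolo_to_pixel_box(img_width, img_height, x_center, y_center, width, height):
    """Convert a normalized YOLO box to pixel corners (hypothetical helper)."""
    # YOLO stores the box center and size as fractions of the image size
    x_min = int((x_center - width / 2) * img_width)
    y_min = int((y_center - height / 2) * img_height)
    x_max = int((x_center + width / 2) * img_width)
    y_max = int((y_center + height / 2) * img_height)

    # Clamp to the image so the corners are always valid slice indices
    x_min, y_min = max(0, x_min), max(0, y_min)
    x_max, y_max = min(img_width, x_max), min(img_height, y_max)
    return x_min, y_min, x_max, y_max


# A box centered in a 640x480 image, a quarter as wide and half as tall
centered = yolo_to_pixel_box(640, 480, 0.5, 0.5, 0.25, 0.5)
print(centered)  # (240, 120, 400, 360)

# A box whose center sits on the left border: the left half is cut off
edge = yolo_to_pixel_box(640, 480, 0.0, 0.5, 0.25, 0.5)
print(edge)  # (0, 120, 80, 360)
```

The resulting corners can be used directly as `image[y_min:y_max, x_min:x_max]`, which is how the object crop is taken later in the notebook.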

# Pre-process data
## Prompt:
Today we are doing some AI coding on Google Colab, using Python, Keras, TensorFlow.

I have a dataset containing images of either birds or drones. All images are found in path - '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/images', and all labels are found in - '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/labels'. In the images dir, there are images of both birds and drones. Each image has a corresponding label file, which has the same name, and ends with .txt instead of .jpg.
All labels are in the YOLO annotation format. The class IDs are 0 for drone and 1 for bird.

I need you to go over all images in the image path I provided. For each image, start by cropping a random 32*32 area. Run the cropped image through the augmentation function to get an augmented image. Then, save that image into the path '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/other/images', and save a corresponding label file at '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/other/labels'. For the random crop image, the label should be 2 (the .txt file should have only '2' in it).
Then, using the coordinates from the label file, extract the object location and crop around the object. Take the cropped object and resize it to 32*32. Run it through the augmentation function. Once done, I need you to save the image. The location is based on the label file.
If the first number in the label file is '0', it means the object is a drone, in which case we will save the image at '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/drone/images' and save a corresponding label at '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/drone/labels' - the label file containing only a '0'.
Else (meaning the first number in the label file is '1'), the object is a bird. Then we will save it in '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/bird/images', and save a corresponding label file at '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/bird/labels' - with the label file containing only a '1'.

In [None]:
# Path definitions
image_path = '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/images'
label_path = '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/labels'

# Output paths
other_image_path = '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/other/images'
other_label_path = '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/other/labels'

drone_image_path = '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/drone/images'
drone_label_path = '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/drone/labels'

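bird_image_path = '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/bird/images'
bird_label_path = '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/bird/labels'

# Create the output folders up front; cv2.imwrite does not create missing directories
for out_dir in (other_image_path, other_label_path,
                drone_image_path, drone_label_path,
                bird_image_path, bird_label_path):
    os.makedirs(out_dir, exist_ok=True)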

In [None]:
# Process each image
for filename in os.listdir(image_path):
    if filename.endswith(".jpg"):
        # Load the image
        img = cv2.imread(os.path.join(image_path, filename))

        # Load corresponding label file
        label_filename = os.path.splitext(filename)[0] + '.txt'
        label_file_path = os.path.join(label_path, label_filename)

        with open(label_file_path, 'r') as label_file:
            class_id, x_center, y_center, width, height = map(float, label_file.readline().split())

            # Random crop
            random_cropped_img = random_crop(img)

            # Augment the random cropped image
            augmented_random_cropped_img = augment_image(random_cropped_img)

            # Save the augmented random cropped image
            cv2.imwrite(os.path.join(other_image_path, filename), cv2.cvtColor(np.array(augmented_random_cropped_img), cv2.COLOR_RGB2BGR))

            # Save the label file for the other class
            with open(os.path.join(other_label_path, label_filename), 'w') as other_label_file:
                other_label_file.write('2')

            # Extract object location
            x_min, y_min, x_max, y_max = yolo_to_bbox(img.shape, class_id, x_center, y_center, width, height)

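            # Crop around the object
            cropped_object_img = img[y_min:y_max, x_min:x_max]

            # Skip degenerate boxes: cv2.resize raises an error on an empty crop
            if cropped_object_img.size == 0:
                continue

            # Resize to 32x32
            resized_object_img = cv2.resize(cropped_object_img, (32, 32))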

            # Augment the resized object image
            augmented_resized_object_img = augment_image(resized_object_img)

            # Save the augmented resized object image
            if class_id == 0:
                cv2.imwrite(os.path.join(drone_image_path, filename), cv2.cvtColor(np.array(augmented_resized_object_img), cv2.COLOR_RGB2BGR))
                with open(os.path.join(drone_label_path, label_filename), 'w') as drone_label_file:
                    drone_label_file.write('0')
            elif class_id == 1:
                cv2.imwrite(os.path.join(bird_image_path, filename), cv2.cvtColor(np.array(augmented_resized_object_img), cv2.COLOR_RGB2BGR))
                with open(os.path.join(bird_label_path, label_filename), 'w') as bird_label_file:
                    bird_label_file.write('1')
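
The routing rule from the prompt (class ID decides the output folder, and the label file shares the image's name) can be sketched on its own, without OpenCV. This is a minimal illustration under stated assumptions: `CLASS_DIRS` and `output_paths` are hypothetical names, the file name `sample_001.jpg` is made up, and a temporary directory stands in for the Drive path. It also creates the folders before writing, which matters because `cv2.imwrite` signals a failed write through its return value rather than by raising.

```python
import os
import tempfile

# Class IDs as used in this notebook: 0 = drone, 1 = bird, 2 = other (random crop)
CLASS_DIRS = {0: "drone", 1: "bird", 2: "other"}


def output_paths(root, class_id, image_filename):
    """Return (image_path, label_path) for a sample, creating folders if needed."""
    class_dir = os.path.join(root, CLASS_DIRS[int(class_id)])
    image_dir = os.path.join(class_dir, "images")
    label_dir = os.path.join(class_dir, "labels")
    os.makedirs(image_dir, exist_ok=True)
    os.makedirs(label_dir, exist_ok=True)

    # The label file keeps the image's name, with .txt instead of .jpg
    stem = os.path.splitext(image_filename)[0]
    return (os.path.join(image_dir, image_filename),
            os.path.join(label_dir, stem + ".txt"))


with tempfile.TemporaryDirectory() as root:
    # class_id arrives as a float because the label line is parsed with float()
    img_out, lbl_out = output_paths(root, 1.0, "sample_001.jpg")
    with open(lbl_out, "w") as label_file:
        label_file.write("1")
    with open(lbl_out) as label_file:
        label_written = label_file.read()
    rel_img = os.path.relpath(img_out, root)
    rel_lbl = os.path.relpath(lbl_out, root)

print(rel_img, rel_lbl, label_written)
```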

In [None]:
print(len(os.listdir('/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/drone/images')), len(os.listdir('/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/drone/labels')))
print(len(os.listdir('/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/bird/images')), len(os.listdir('/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/bird/labels')))
print(len(os.listdir('/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/other/images')), len(os.listdir('/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/other/labels')))

4070 4070
1644 1644
5714 5714


# Convert to TF Dataset
## Prompt:
Perfect. Now my images are in '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/other/images', '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/drone/images' and '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/bird/images'.
My labels are in '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/other/labels',
'/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/drone/labels' and '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/bird/labels'.


I need you to convert all the files in those paths into one large TensorFlow dataset. Save the TensorFlow dataset in '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/', call it task7_DS. Ensure that the shapes and types of images and labels match TensorFlow's requirements.
## Prompt 2:
In the script you provided for converting the dataset into a tf dataset, on line - dataset = dataset.map(lambda x, y: (preprocess_image(x), preprocess_label(y)))
I got the error - NotImplementedError: Cannot convert a symbolic tf.Tensor (StringToNumber:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported.

In [None]:
import tensorflow as tf
from tensorflow.keras.utils import to_categorical

# Function to read and preprocess an image
def preprocess_image(image_path):
    img = tf.io.read_file(image_path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, (32, 32))  # Ensure the image size is 32x32
    img = img / 255.0  # Normalize to [0, 1]

    return img

# Function to read and preprocess a label
def preprocess_label(label_path):
    # Use TensorFlow ops only, so this runs inside dataset.map without .numpy()
    label = tf.io.read_file(label_path)
    label = tf.strings.to_number(tf.strings.strip(label), out_type=tf.int32)

    # One-hot encode (0 = drone, 1 = bird, 2 = other) to match the
    # categorical_crossentropy loss and the argmax calls used later
    return tf.one_hot(label, depth=3)

# Get list of image and label paths
other_image_paths = [os.path.join(other_image_path, img) for img in os.listdir(other_image_path) if img.endswith('.jpg')]
drone_image_paths = [os.path.join(drone_image_path, img) for img in os.listdir(drone_image_path) if img.endswith('.jpg')]
bird_image_paths = [os.path.join(bird_image_path, img) for img in os.listdir(bird_image_path) if img.endswith('.jpg')]

other_label_paths = [os.path.join(other_label_path, label) for label in os.listdir(other_label_path) if label.endswith('.txt')]
drone_label_paths = [os.path.join(drone_label_path, label) for label in os.listdir(drone_label_path) if label.endswith('.txt')]
bird_label_paths = [os.path.join(bird_label_path, label) for label in os.listdir(bird_label_path) if label.endswith('.txt')]

# Create TensorFlow dataset
image_paths = other_image_paths + drone_image_paths + bird_image_paths
label_paths = other_label_paths + drone_label_paths + bird_label_paths

dataset = tf.data.Dataset.from_tensor_slices((image_paths, label_paths))

# Map the preprocess functions to the dataset
dataset = dataset.map(lambda x, y: (preprocess_image(x), preprocess_label(y)))

# Save the TensorFlow dataset
tf.data.Dataset.save(dataset, '/content/drive/MyDrive/Learning/Road2_AI_Course/task7_ds/task7_DS')
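
The cell above builds the image and label lists from separate `os.listdir` calls, and `os.listdir` returns names in arbitrary order. Here that is harmless, because every label file inside one class folder holds the same value, but in general it is safer to pair files by their shared name. The following sketch shows one way to do that in plain Python; `pair_by_stem` is a hypothetical helper and the file names are made up.

```python
import os


def pair_by_stem(image_files, label_files):
    """Pair each image with the label file that shares its name (hypothetical helper)."""
    labels_by_stem = {os.path.splitext(name)[0]: name for name in label_files}

    pairs = []
    for image_name in sorted(image_files):
        stem = os.path.splitext(image_name)[0]
        # Images without a matching label are skipped instead of being mispaired
        if stem in labels_by_stem:
            pairs.append((image_name, labels_by_stem[stem]))
    return pairs


image_files = ["b.jpg", "a.jpg", "c.jpg"]
label_files = ["c.txt", "a.txt"]
pairs = pair_by_stem(image_files, label_files)
print(pairs)  # [('a.jpg', 'a.txt'), ('c.jpg', 'c.txt')]
```

The two columns of `pairs` could then be joined with the folder paths and passed to `tf.data.Dataset.from_tensor_slices` in place of the independently listed `image_paths` and `label_paths`.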


# Data visualization and balance
## Distribution and balance
### Prompt:
Perfect.
I need you to visualize and display the dataset's class distribution (drone, bird and other). Also, I need you to print out whether the dataset is balanced or not.

In [None]:
import matplotlib.pyplot as plt

# Function to visualize class distribution
def visualize_class_distribution(dataset):
    class_counts = [0, 0, 0]  # Counts for 'drone', 'bird', and 'other' (class IDs 0, 1, 2)

    for _, label in dataset:
        class_index = tf.argmax(label).numpy()
        class_counts[class_index] += 1

    classes = ['Drone', 'Bird', 'Other']  # Index matches the class ID in the label files

    # Plotting
    plt.bar(classes, class_counts, color=['blue', 'green', 'red'])
    plt.title('Class Distribution')
    plt.xlabel('Class')
    plt.ylabel('Count')
    plt.show()

    return class_counts

# Visualize class distribution
class_counts = visualize_class_distribution(dataset)

# Check if the dataset is balanced
is_balanced = all(count == class_counts[0] for count in class_counts[1:])
print(f"Is the dataset balanced? {'Yes' if is_balanced else 'No'}")
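
Beyond a yes/no answer, the file counts printed earlier in the notebook (4070 drone, 1644 bird, 5714 other) are enough to quantify the imbalance. The sketch below computes class proportions and inverse-frequency class weights from those counts. The weighting formula `total / (num_classes * count)` is one common convention and is an assumption here, not something the notebook itself uses.

```python
# File counts printed earlier in the notebook
class_counts = {"drone": 4070, "bird": 1644, "other": 5714}

total = sum(class_counts.values())
num_classes = len(class_counts)

# Share of the dataset taken by each class
proportions = {name: count / total for name, count in class_counts.items()}

# Inverse-frequency weights: rarer classes get a larger weight
class_weights = {name: total / (num_classes * count)
                 for name, count in class_counts.items()}

# Ratio between the largest and smallest class
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())

for name in class_counts:
    print(f"{name}: share={proportions[name]:.3f}, weight={class_weights[name]:.3f}")
print(f"largest/smallest class ratio: {imbalance_ratio:.2f}")
```

The "other" class makes up exactly half of the samples, since one random crop is produced per source image. If the imbalance turns out to hurt the bird class, weights like these could be passed to `model.fit` through its `class_weight` argument, keyed by class index (0, 1, 2) rather than by name.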


## Showcasing 5 random images
### Prompt:
Great. Now I need you to plot 5 random images from each class, just for showcase purposes.

In [None]:
import random

# Function to display random images from each class
def display_random_images(dataset, num_images=5):
    classes = ['Drone', 'Bird', 'Other']  # Index matches the class ID (0, 1, 2)
    class_indices = {0: [], 1: [], 2: []}

    # Collect indices of each class
    for i, (_, label) in enumerate(dataset):
        class_index = tf.argmax(label).numpy()
        class_indices[class_index].append(i)

    # Display random images from each class
    for class_index, class_name in enumerate(classes):
        print(f"Random images from {class_name} class:")
        for _ in range(num_images):
            random_index = random.choice(class_indices[class_index])
            image, _ = dataset.skip(random_index).take(1).as_numpy_iterator().next()
            plt.imshow(image)
            plt.title(class_name)
            plt.axis('off')
            plt.show()

# Display random images
display_random_images(dataset)


## Splitting DS into train and test
### Prompt:
Great. Now I need you to split our dataset into train and test sets

In [None]:
# Function to split dataset into train and test sets
def split_dataset(dataset, train_percentage=0.8, seed=None):
    # Get dataset size
    dataset_size = len(dataset)
    train_size = int(train_percentage * dataset_size)

    # Set seed for reproducibility
    tf.random.set_seed(seed)

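    # Shuffle the dataset once; without reshuffle_each_iteration=False the order
    # changes on every pass, so take() and skip() would overlap (train/test leakage)
    dataset = dataset.shuffle(dataset_size, seed=seed, reshuffle_each_iteration=False)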

    # Split into train and test sets
    train_dataset = dataset.take(train_size)
    test_dataset = dataset.skip(train_size)

    return train_dataset, test_dataset

# Split the dataset
train_dataset, test_dataset = split_dataset(dataset, train_percentage=0.8, seed=42)

# Print the sizes of train and test sets
print(f"Train set size: {len(train_dataset)}")
print(f"Test set size: {len(test_dataset)}")
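
The property a train/test split needs is that the shuffle happens once, so that the two parts never share a sample. With `tf.data`, `shuffle` reshuffles on every iteration by default, which means `take` and `skip` can see different orders on different passes. The sketch below shows the intended behavior in plain Python on sample indices. `split_indices` is a hypothetical helper, and 11428 is the total number of files reported earlier in the notebook.

```python
import random


def split_indices(num_samples, train_fraction=0.8, seed=42):
    """Shuffle sample indices once, then cut them into train and test parts."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # one fixed, reproducible order

    train_size = int(train_fraction * num_samples)
    return indices[:train_size], indices[train_size:]


train_idx, test_idx = split_indices(11428)
print(len(train_idx), len(test_idx))  # 9142 2286

# No sample appears in both parts, and none is lost
overlap = set(train_idx) & set(test_idx)
print(len(overlap))  # 0
```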


# Building and training the network
## Model set up
### Prompt:
Perfect. Now it's time to use Keras and build a Convolutional Neural Network. Make sure the input layer has the right shape for our data, and make sure the output layer has 3 perceptrons (one for each possible class).
### Prompt 2:
Change it so that the input layer is already a Conv2D layer and don't forget the MaxPool layer after the input layer as well

In [None]:
from tensorflow.keras import layers, models

# Define the CNN model
model = models.Sequential()

# Input layer with Conv2D and MaxPooling2D
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))

# Convolutional layers
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

# Flatten layer
model.add(layers.Flatten())

# Dense layers
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(128, activation='relu'))

# Output layer with 3 perceptrons (for each class)
model.add(layers.Dense(3, activation='softmax'))

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Display the model summary
model.summary()
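
The shapes and parameter counts that `model.summary()` reports can be checked by hand. The sketch below traces the feature-map size through the three Conv2D/MaxPooling2D pairs, assuming the Keras defaults that the model relies on: 'valid' padding and stride 1 for the 3x3 convolutions, and non-overlapping 2x2 pooling. `conv_out` and `pool_out` are hypothetical helper names.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling (remainder is dropped)."""
    return size // pool


size, channels = 32, 3  # 32x32 RGB input
params = 0
shapes = []

for filters in (32, 64, 128):
    # Each filter has 3*3*channels weights plus one bias
    params += (3 * 3 * channels + 1) * filters
    size = conv_out(size)
    shapes.append((size, size, filters))
    size = pool_out(size)
    shapes.append((size, size, filters))
    channels = filters

flat = size * size * channels  # length of the Flatten output

units_in = flat
for units in (256, 128, 3):
    # Each dense unit has one weight per input plus one bias
    params += (units_in + 1) * units
    units_in = units

print(shapes)  # [(30, 30, 32), (15, 15, 32), (13, 13, 64), (6, 6, 64), (4, 4, 128), (2, 2, 128)]
print(flat)    # 512
print(params)  # 257859
```

By the last pooling layer the feature map is only 2x2, so a 32x32 input leaves no room for a fourth Conv2D/MaxPooling2D pair with these settings.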


## Train
### Prompt:
Great. Now let's train this network on the dataset we have previously created. Print out the test accuracy when training is over

In [None]:
# Set the number of epochs
epochs = 10

# Convert the datasets to numpy arrays for compatibility with Keras
train_images, train_labels = zip(*train_dataset.as_numpy_iterator())
train_images = np.array(train_images)
train_labels = np.array(train_labels)

test_images, test_labels = zip(*test_dataset.as_numpy_iterator())
test_images = np.array(test_images)
test_labels = np.array(test_labels)

# Train the model
history = model.fit(train_images, train_labels,
                    epochs=epochs,
                    validation_data=(test_images, test_labels))

# Evaluate the model on the test set
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print(f'Test Accuracy: {test_accuracy * 100:.2f}%')


# Results
## Plotting accuracy, loss and confusion matrix
### Prompt:
Perfect. Finally, let's plot the accuracy and loss graphs. Also, make sure to plot the confusion matrix

In [None]:
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix
import seaborn as sns

# Plot accuracy and loss graphs
def plot_accuracy_loss(history):
    # Accuracy
    plt.figure(figsize=(12, 4))
    plt.subplot(1, 2, 1)
    plt.plot(history.history['accuracy'], label='Training Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.title('Training and Validation Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()

    # Loss
    plt.subplot(1, 2, 2)
    plt.plot(history.history['loss'], label='Training Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title('Training and Validation Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()

    plt.tight_layout()
    plt.show()

# Plot confusion matrix
def plot_confusion_matrix(y_true, y_pred, class_names):
    cm = confusion_matrix(y_true, y_pred)
    cm = cm / cm.sum(axis=1)[:, np.newaxis]  # Normalize

    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt=".2f", cmap="Blues", xticklabels=class_names, yticklabels=class_names)
    plt.title('Confusion Matrix')
    plt.xlabel('Predicted')
    plt.ylabel('True')
    plt.show()

# Convert one-hot encoded labels back to class indices
test_predictions = np.argmax(model.predict(test_images), axis=1)
test_true_labels = np.argmax(test_labels, axis=1)

# Plot accuracy and loss graphs
plot_accuracy_loss(history)

# Plot confusion matrix
class_names = ['Drone', 'Bird', 'Other']  # Index matches the class ID (0, 1, 2)
plot_confusion_matrix(test_true_labels, test_predictions, class_names)
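
To read the normalized confusion matrix correctly, it helps to see the computation on a tiny example. In the sketch below rows are true classes and columns are predicted classes, and each row is divided by its own sum, so the diagonal holds the per-class recall. The labels are made up for illustration, and `confusion_counts` and `normalize_rows` are hypothetical pure-Python stand-ins for `sklearn.metrics.confusion_matrix` and the NumPy row normalization used above. The zero-row guard is an addition: it avoids a division by zero if a class is missing from the test set.

```python
def confusion_counts(y_true, y_pred, num_classes):
    """Count samples per (true class, predicted class) pair."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def normalize_rows(matrix):
    """Divide each row by its sum; rows with no samples stay at zero."""
    normalized = []
    for row in matrix:
        row_total = sum(row)
        normalized.append([count / row_total if row_total else 0.0 for count in row])
    return normalized


# Made-up labels: 0 = drone, 1 = bird, 2 = other
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 0, 2, 2, 2, 2]

counts = confusion_counts(y_true, y_pred, num_classes=3)
print(counts)  # [[3, 1, 0], [1, 1, 0], [0, 0, 4]]

rates = normalize_rows(counts)
print(rates)  # [[0.75, 0.25, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
```

In this toy example the drone recall is 0.75 and the bird recall is 0.5, even though the overall accuracy is 0.8, which is why the per-class view is worth plotting for an imbalanced dataset.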
