# Notebook 3: Classification using CNN

**Run this notebook only if you are adopting or experimenting with this model: training can take up to two hours.**

In this notebook, we use a CNN model to classify the images, following the approach described in this [paper](https://ui.adsabs.harvard.edu/abs/2023SPIE12729E..0KC/abstract).

---

### Reading the data

First, we’ll load the saved image and label data from the NumPy files.

In [1]:
import numpy as np  # Importing NumPy for numerical operations and array handling

# Load the images and labels back from the saved NumPy files
train_images = np.load('train_images.npy')  # Load image training data
val_images = np.load('val_images.npy')      # Load image validation data
train_labels = np.load('train_labels.npy')  # Load label training data
val_labels = np.load('val_labels.npy')      # Load label validation data

print("Data loaded successfully from NumPy files.")

Data loaded successfully from NumPy files.
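
Before training, it can help to confirm that the loaded arrays match what the model expects. The following is a minimal sketch, not part of the original notebook: `check_split` is a hypothetical helper, and its defaults (images of shape `(512, 512, 3)` and 5 classes) are taken from the model definition in the next section. It assumes the labels are still integer class indices at this point, before one-hot encoding.

```python
import numpy as np

def check_split(images, labels, image_shape=(512, 512, 3), num_classes=5):
    """Hypothetical helper: validate one data split and return a short summary."""
    images = np.asarray(images)
    labels = np.asarray(labels)
    # There must be exactly one label per image
    if images.shape[0] != labels.shape[0]:
        raise ValueError(f"{images.shape[0]} images but {labels.shape[0]} labels")
    # Every image must have the shape the network's input layer expects
    if images.shape[1:] != tuple(image_shape):
        raise ValueError(f"unexpected image shape {images.shape[1:]}")
    # Integer labels must fall in the range [0, num_classes)
    if labels.min() < 0 or labels.max() >= num_classes:
        raise ValueError("label outside the expected class range")
    # Count how many samples belong to each class
    counts = np.bincount(labels.astype(int).ravel(), minlength=num_classes)
    return {"n_samples": int(images.shape[0]), "class_counts": counts.tolist()}
```

With the arrays loaded above, `check_split(train_images, train_labels)` and `check_split(val_images, val_labels)` would report the number of samples and the per-class counts, which also makes class imbalance visible before training starts.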


---

### Train CubeCatNet CNN model

We will define and train the Convolutional Neural Network (CNN) model described in the [paper](https://ui.adsabs.harvard.edu/abs/2023SPIE12729E..0KC/abstract).

In [2]:
import tensorflow as tf
from tensorflow.keras.models import Sequential  # Importing Sequential to build the model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense  # Importing necessary layers for the CNN

from tensorflow.keras.utils import to_categorical  # Importing the one-hot encoding helper from the same Keras namespace as the layers

# One-hot encode the labels (assuming you have 5 classes)
train_labels = to_categorical(train_labels, num_classes=5)
val_labels = to_categorical(val_labels, num_classes=5)


# Define the CNN model architecture
model = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(512, 512, 3)),  # Convolutional layer + ReLU activation
    MaxPooling2D((2, 2)),  # Max pooling layer
    Conv2D(32, (3, 3), activation='relu'),  # Convolutional layer + ReLU activation
    MaxPooling2D((2, 2)),  # Max pooling layer
    Conv2D(64, (3, 3), activation='relu'),  # Convolutional layer + ReLU activation
    MaxPooling2D((2, 2)),  # Max pooling layer
    Conv2D(128, (3, 3), activation='relu'),  # Convolutional layer + ReLU activation
    MaxPooling2D((2, 2)),  # Max pooling layer
    GlobalAveragePooling2D(),  # Global average pooling layer
    Dense(5, activation='softmax')  # Output layer with 5 neurons (one for each class) + Softmax activation
])

# Compile the model with appropriate loss function, optimizer, and metrics
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

print("Model defined and compiled successfully.")

# Train the model on the training data
history = model.fit(
    train_images, train_labels,
    epochs=10,  # Number of epochs
    batch_size=64,  # Batch size
)

print("Model training complete.")

2024-09-28 17:17:34.420777: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.

Model defined and compiled successfully.
Epoch 1/10
203/203 ━━━━━━━━━━━━━━━━━━━━ 699s 3s/step - accuracy: 0.7054 - loss: 0.9550
Epoch 2/10
203/203 ━━━━━━━━━━━━━━━━━━━━ 693s 3s/step - accuracy: 0.9866 - loss: 0.0539
Epoch 3/10
203/203 ━━━━━━━━━━━━━━━━━━━━ 691s 3s/step - accuracy: 0.9926 - loss: 0.0281
Epoch 4/10
203/203 ━━━━━━━━━━━━━━━━━━━━ 694s 3s/step - accuracy: 0.9962 - loss: 0.0128
Epoch 5/10
203/203 ━━━━━━━━━━━━━━━━━━━━ 693s 3s/step - accuracy: 0.9946 - loss: 0.0205
Epoch 6/10
203/203 ━━━━━━━━━━━━━━━━━━━━ 742s 3s/step - accuracy: 0.9653 - loss: 0.1648
Epoch 7/10
203/203 ━━━━━━━━━━━━━━━━━━━━ 693s 3s/step - accuracy: 0.9964 - loss: 0.0118
Epoch 8/10
203/203 ━━━━━━━━━━━━━━━━━━━━ 697s 3s/step - accuracy: 0.9985 - loss
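
To get a feel for how small this network is, the feature-map sizes and parameter counts of the architecture defined above can be worked out by hand. The sketch below is an illustration rather than part of the original notebook: `conv_out`, `pool_out`, and `describe_cnn` are hypothetical helper names, and the arithmetic assumes the Keras defaults that the model definition leaves implicit (`'valid'` padding and stride 1 for `Conv2D`, pool stride equal to the pool size for `MaxPooling2D`).

```python
def conv_out(size, kernel=3):
    # 'valid' padding with stride 1: the kernel must fit entirely inside the image
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping pooling with 'valid' padding: leftover pixels are dropped
    return size // pool

def describe_cnn(input_size=512, in_channels=3, filters=(16, 32, 64, 128),
                 num_classes=5, kernel=3):
    """Hypothetical helper: per-block sizes and parameter counts for the CNN above."""
    size, channels, total_params = input_size, in_channels, 0
    rows = []
    for f in filters:
        after_conv = conv_out(size, kernel)
        # Each filter has kernel*kernel*channels weights plus one bias
        params = kernel * kernel * channels * f + f
        after_pool = pool_out(after_conv)
        rows.append((f, after_conv, after_pool, params))
        total_params += params
        size, channels = after_pool, f
    # Global average pooling reduces each channel to one value and has no parameters,
    # so the dense layer sees `channels` inputs
    dense_params = channels * num_classes + num_classes
    total_params += dense_params
    return rows, dense_params, total_params

rows, dense_params, total_params = describe_cnn()
for f, after_conv, after_pool, params in rows:
    print(f"Conv2D({f}): {after_conv}x{after_conv} -> pooled {after_pool}x{after_pool}, {params} params")
print(f"Dense: {dense_params} params, total: {total_params} params")
```

Under these assumptions the side length shrinks from 512 to 30 pixels across the four blocks, and the model has 98,085 trainable parameters, most of them in the last convolutional layer. The same figures can be cross-checked against `model.summary()`.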

---

### Performance Evaluation

In [3]:
from source.pre import evaluate_pipeline

def preprocessing_fn(X):
    # The CNN takes the images as loaded, so this is an identity function
    return X

# Evaluate the pipeline
metrics = evaluate_pipeline(model, val_images, val_labels, preprocessing_fn)

# Print the evaluation metrics
print("Evaluation Metrics:")
for key, value in metrics.items():
    if key == 'evaluation_time':
        print(f"{key}: {value:.2f} seconds")
    elif key == 'pipeline_size':
        print(f"{key}: {value:.2f} MB")
    elif key == 'peak_memory_usage':
        print(f"{key}: {value:.2f} MB")
    elif key == 'average_cpu_usage':
        print(f"{key}: {value:.2f}%")
    else:
        print(f"{key}: {value:.4f}")

[1m102/102[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m34s[0m 334ms/step
(3237, 5)
(3237, 5)
Evaluation Metrics:
evaluation_time: 37.37 seconds
peak_memory_usage: 19525.90 MB
average_cpu_usage: 1019.84%
accuracy: 0.9975
f1_score: 0.9980
pipeline_size: 1.16 MB
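
The internals of `evaluate_pipeline` are not shown in this notebook, so the following is only a sketch of how the accuracy and F1 figures above could be computed. The two `(3237, 5)` shapes printed above suggest that both the labels and the predictions have one column per class; `accuracy_and_macro_f1` is a hypothetical helper, and macro averaging of the per-class F1 scores is an assumption, since the averaging mode used by `evaluate_pipeline` is not stated.

```python
import numpy as np

def accuracy_and_macro_f1(y_true_onehot, y_prob):
    """Hypothetical helper: accuracy and macro-averaged F1 from one-hot labels
    and predicted class probabilities (both of shape [n_samples, n_classes])."""
    y_true_onehot = np.asarray(y_true_onehot)
    y_prob = np.asarray(y_prob)
    # Convert one-hot rows and probability rows to integer class indices
    y_true = np.argmax(y_true_onehot, axis=1)
    y_pred = np.argmax(y_prob, axis=1)
    accuracy = float(np.mean(y_true == y_pred))

    f1_per_class = []
    for c in range(y_true_onehot.shape[1]):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        f1_per_class.append(2 * tp / denom if denom > 0 else 0.0)
    # Macro average: every class contributes equally, regardless of its size
    return accuracy, float(np.mean(f1_per_class))
```

With the objects from this notebook, a call such as `accuracy_and_macro_f1(val_labels, model.predict(val_images))` would apply it to the validation set; the result may differ slightly from the reported `f1_score` if `evaluate_pipeline` uses a different averaging mode.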
