# **National AI Competition - Technical Track**

This document provides guidance on how to properly submit your Kuih Classification project. Your submission will be tested using an automated script similar to the example provided, so it's essential that you follow these guidelines precisely.

## **Required Files for Submission**

1. Machine Learning Model

   *   Your model weights (e.g. ``keras_model.h5``)
   *   Class labels file (if applicable)

2. Testing Script

   *   A Google Colab notebook that can load and test your model
   *   The script must work with the predefined test dataset path

## **Example Test Dataset**

An example folder of test images is located in this Google Drive folder:
https://drive.google.com/drive/folders/1NzCoYjsMnTTPf3lWCfnylG8IF_VQenkM?usp=sharing

Be sure to add a shortcut to this folder in your own Drive for testing. Your script must be able to access and process images in this directory without modification.


## **Example Testing Script**

Below is an example testing script that utilizes model weights exported from Teachable Machine.




### 1. Setting Up the Environment

In this section, we install the necessary libraries. For Teachable Machine models, we specifically install TensorFlow 2.12.0; for your own models, the latest version of TensorFlow may be a better choice.

In [1]:
!pip uninstall -y numpy pandas tensorflow
!pip install --no-cache-dir tensorflow==2.12.0 numpy==1.23.5 pandas==1.5.3

Found existing installation: numpy 2.0.2
Uninstalling numpy-2.0.2:
  Successfully uninstalled numpy-2.0.2
Found existing installation: pandas 2.2.2
Uninstalling pandas-2.2.2:
  Successfully uninstalled pandas-2.2.2
Found existing installation: tensorflow 2.18.0
Uninstalling tensorflow-2.18.0:
  Successfully uninstalled tensorflow-2.18.0
Collecting tensorflow==2.12.0
  Downloading tensorflow-2.12.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.4 kB)
Collecting numpy==1.23.5
  Downloading numpy-1.23.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.3 kB)
Collecting pandas==1.5.3
  Downloading pandas-1.5.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Collecting gast<=0.4.0,>=0.2.1 (from tensorflow==2.12.0)
  Downloading gast-0.4.0-py3-none-any.whl.metadata (1.1 kB)
Collecting keras<2.13,>=2.12.0 (from tensorflow==2.12.0)
  Downloading keras-2.12.0-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting protobuf!=4.21
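
Since the install cell pins exact versions, it can help to confirm that those versions are the ones actually installed before moving on. The following is a minimal sketch using only the standard library; the helper name `find_version_mismatches` is hypothetical and not part of the competition's testing script.

```python
from importlib import metadata

# Versions pinned by the install cell above
PINNED = {"tensorflow": "2.12.0", "numpy": "1.23.5", "pandas": "1.5.3"}


def find_version_mismatches(expected):
    """Return {package: (expected, installed)} for packages that differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in find_version_mismatches(PINNED).items():
    print(f"{name}: expected {wanted}, found {installed}")
```

An empty result means the pinned versions are installed on disk. Note that packages already imported in the running session are not reloaded until the runtime restarts.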

### 2. Importing Libraries

Here, we import the necessary Python libraries and mount Google Drive to access the model weights, labels, and test images.

In [2]:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing import image
import os
import matplotlib.pyplot as plt
from google.colab import files
import io
import zipfile
from tqdm.notebook import tqdm

from google.colab import drive
drive.mount('/content/drive')

ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

### 3. Upload Model & Labels.txt

This is the example model file we provide. Change the path below to point to your own model.

*Note:* This model is very bad at classifying images, so you will need to train your own!

In [None]:
## Change this to where your keras_model.h5 is
model_filename = '/content/drive/MyDrive/Public-Access/keras_model.h5'
model = keras.models.load_model(model_filename)
print("Model loaded successfully!")


In [None]:
## Change this to where your labels.txt is
labels_filename = '/content/drive/MyDrive/Public-Access/labels.txt'
labels = {}
with open(labels_filename, 'r') as f:
    for line in f:
        if line.strip():
            idx, label = line.strip().split(' ', 1)
            labels[int(idx)] = label

print(f"Loaded {len(labels)} classes:")
for idx, label in labels.items():
    print(f"  {idx}: {label}")
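
The loop above assumes each non-empty line of `labels.txt` has the form `<index> <label>`, which is what the `split(' ', 1)` call expects. Below is a minimal sketch of the same parsing as a reusable function that raises an explicit error on malformed lines. `parse_labels` is a hypothetical helper name, and the sample labels are placeholders rather than the competition's class list.

```python
def parse_labels(lines):
    """Parse '<index> <label>' lines into a {index: label} dict."""
    labels = {}
    for line_number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines, as the original loop does
        parts = line.split(' ', 1)
        if len(parts) != 2 or not parts[0].isdigit():
            raise ValueError(f"Line {line_number} is not '<index> <label>': {raw!r}")
        labels[int(parts[0])] = parts[1]
    return labels


# Placeholder content for illustration only
sample_lines = ["0 Class A\n", "1 Class B with spaces\n", "\n"]
print(parse_labels(sample_lines))
```

To use it on a real file, pass the open file object directly, for example `parse_labels(open(labels_filename))`.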

### 4. Access Testing Directory

Remember to add a shortcut to the test folder in your own Drive for this to work!

In [None]:
# Get list of test images
test_dir = '/content/drive/MyDrive/Public-Access/Testing'
test_images = []
for root, _, filenames in os.walk(test_dir):
    for filename in filenames:  # named to avoid shadowing the google.colab `files` import
        if filename.lower().endswith(('.png', '.jpg', '.jpeg')):
            test_images.append(os.path.join(root, filename))

test_images.sort()  # Sort to ensure consistent order
print(f"Found {len(test_images)} test images")
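
The directory walk above can also be written with `pathlib`. The sketch below is an alternative under the same assumptions (recursive search, case-insensitive `.png`/`.jpg`/`.jpeg` match, sorted output). `list_test_images` is a hypothetical helper, not part of the official testing script.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_test_images(test_dir):
    """Return sorted paths of image files found anywhere under test_dir."""
    root = Path(test_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Test directory not found: {test_dir}")
    return sorted(
        str(path)
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
```

Raising `FileNotFoundError` early makes a missing Drive shortcut obvious, whereas `os.walk` on a missing directory simply yields nothing and reports zero images.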

### 5. Running Predictions

In this section, we preprocess each image for model input (resize, normalize) and map the model's prediction to its label.

In [None]:
predictions = []
input_shape = model.input_shape[1:3]

for img_path in tqdm(test_images):
    try:
        # Preprocess the image
        img = image.load_img(img_path, target_size=input_shape)
        img_array = image.img_to_array(img)
        img_array = np.expand_dims(img_array, axis=0)
        img_array = img_array / 255.0  # Normalize to [0,1]

        # Predict class probabilities
        pred_probs = model.predict(img_array, verbose=0)[0]  # shape: (n_classes,)
        predicted_class_idx = int(np.argmax(pred_probs))

        # Get the label for the predicted class
        predicted_label = labels.get(predicted_class_idx, f"Unknown ({predicted_class_idx})")

        # Store prediction result
        predictions.append({
            'image': os.path.basename(img_path),
            'predicted_class_index': predicted_class_idx,
            'predicted_label': predicted_label,
            'class_probabilities': pred_probs.tolist()  # convert to list for JSON-safe export
        })

    except Exception as e:
        print(f"Error processing {img_path}: {str(e)}")
        predictions.append({
            'image': os.path.basename(img_path),
            'predicted_class_index': -1,
            'predicted_label': 'Error',
            'class_probabilities': []
        })
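
Because failed images are recorded with `predicted_class_index` set to `-1` rather than being dropped, it may be worth tallying them before exporting. The sketch below uses a small hand-written list in the same shape as the loop above produces; the file names and labels are made up for illustration, and `summarize_predictions` is a hypothetical helper.

```python
from collections import Counter


def summarize_predictions(predictions):
    """Count predictions per label and list images that failed to process."""
    label_counts = Counter(p['predicted_label'] for p in predictions)
    failed = [p['image'] for p in predictions if p['predicted_class_index'] == -1]
    return label_counts, failed


# Made-up records in the same shape as the prediction loop's output
example_predictions = [
    {'image': 'img_001.jpg', 'predicted_class_index': 2,
     'predicted_label': 'Label C', 'class_probabilities': [0.1, 0.2, 0.7]},
    {'image': 'img_002.jpg', 'predicted_class_index': -1,
     'predicted_label': 'Error', 'class_probabilities': []},
]

label_counts, failed = summarize_predictions(example_predictions)
print(f"Predictions per label: {dict(label_counts)}")
print(f"Failed images: {failed}")
```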

### 6. Creating Output

In this section, you will convert the model's prediction results into a structured format and prepare it for submission.

**Instructions:**

1. Convert predictions to a DataFrame. Use pandas to store each image’s:

   - filename
   - predicted class index
   - predicted class label (match it back from labels.txt)

2. Save the results to a CSV file. Save the DataFrame using `df.to_csv("predictions.csv", index=False)`.

In [None]:
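results_df = pd.DataFrame(predictions)
display(results_df)

# Save the results to CSV, as described in the instructions above
results_df.to_csv("predictions.csv", index=False)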
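
Once `predictions.csv` has been written, a quick read-back can catch a missing column before submission. The sketch below uses only the standard library and assumes the file carries the column names produced by the prediction loop (`image`, `predicted_class_index`, `predicted_label`). `check_predictions_csv` is a hypothetical helper, and the exact columns required by the judges are not specified here.

```python
import csv

# Column names taken from the dictionaries built in the prediction loop
REQUIRED_COLUMNS = ('image', 'predicted_class_index', 'predicted_label')


def check_predictions_csv(path, required=REQUIRED_COLUMNS):
    """Return (row_count, missing_columns) for the CSV at `path`."""
    with open(path, newline='', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        header = reader.fieldnames or []
        missing = [col for col in required if col not in header]
        row_count = sum(1 for _ in reader)
    return row_count, missing
```

Calling `check_predictions_csv("predictions.csv")` should return the number of test images and an empty list of missing columns.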

### 7. Example Metrics Computation

In [None]:
from sklearn.metrics import (
    classification_report, accuracy_score,
    roc_auc_score, precision_recall_fscore_support
)
from sklearn.preprocessing import label_binarize
import numpy as np

# True and predicted labels
true_labels = [1, 2, 5]  # Adjust this list to match your full test set
results_df['true_class_index'] = true_labels
y_true = results_df['true_class_index'].astype(int).values
y_pred = results_df['predicted_class_index'].astype(int).values
y_probs = np.array(results_df['class_probabilities'].tolist())

## Quick fix for ROC AUC: this example test set covers only 3 classes (not needed if your test set contains all 8 classes)
FULL_NUM_CLASSES = 8  # total number of possible classes

# Pad probability vectors to length 8
def pad_probs(probs, target_len=FULL_NUM_CLASSES):
    padded = np.zeros(target_len)
    padded[:len(probs)] = probs  # assumes probs are in order (class 0, 1, 2, ...)
    return padded

# Apply padding
y_probs_padded = np.array([pad_probs(p, FULL_NUM_CLASSES) for p in results_df['class_probabilities']])

# Update your DataFrame or use directly in metrics
y_probs = y_probs_padded


In [None]:
# Number of classes
n_classes = FULL_NUM_CLASSES
class_names = list(range(FULL_NUM_CLASSES))

# Accuracy
acc = accuracy_score(y_true, y_pred)
print(f"\n✅ Accuracy: {acc:.4f}")

# Precision, Recall, F1 per class & macro
prec, rec, f1, _ = precision_recall_fscore_support(y_true, y_pred, labels=class_names, average=None)
macro_prec, macro_rec, macro_f1, _ = precision_recall_fscore_support(y_true, y_pred, average='macro')

print("\n📊 Per-class metrics:")
for i, cls in enumerate(class_names):
    print(f"Class {cls}: Precision={prec[i]:.4f}, Recall={rec[i]:.4f}, F1={f1[i]:.4f}")

print(f"\n📦 Macro Precision: {macro_prec:.4f}, Macro Recall: {macro_rec:.4f}, Macro F1: {macro_f1:.4f}")

# ROC AUC (requires binarized labels)
y_true_bin = label_binarize(y_true, classes=class_names)

# ROC AUC per class and macro
try:
    auc_per_class = roc_auc_score(y_true_bin, y_probs, average=None, multi_class='ovr')
    auc_macro = roc_auc_score(y_true_bin, y_probs, average='macro', multi_class='ovr')

    print("\n🎯 ROC AUC per class:")
    for i, cls in enumerate(class_names):
        print(f"Class {cls}: AUC = {auc_per_class[i]:.4f}")

    print(f"\n🌐 Macro ROC AUC: {auc_macro:.4f}")

except Exception as e:
    print(f"⚠️ ROC AUC could not be computed: {e}")
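
The two `precision_recall_fscore_support` calls above are not averaged over the same set of classes. The per-class call passes `labels=class_names`, while the macro call does not, so scikit-learn averages over the classes that appear in `y_true` or `y_pred`. The pure-Python sketch below shows the effect on macro F1. The `demo_pred` values are hypothetical, chosen only so that class 5 is the single correct prediction. Undefined ratios are treated as 0, and `macro_f1` is an illustrative helper rather than scikit-learn's implementation.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class, treating undefined precision/recall as 0."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 over the given classes."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


demo_true = [1, 2, 5]
demo_pred = [0, 0, 5]  # hypothetical predictions: only class 5 is correct

present = sorted(set(demo_true) | set(demo_pred))  # classes seen in the data
all_classes = list(range(8))                       # all 8 possible classes

print(f"Macro F1 over present classes {present}: {macro_f1(demo_true, demo_pred, present):.4f}")
print(f"Macro F1 over all 8 classes: {macro_f1(demo_true, demo_pred, all_classes):.4f}")
```

With these hypothetical predictions the first value is 0.2500 and the second is 0.1250. Passing `labels=class_names` to the macro call as well would make both calls average over the same 8 classes.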


## 🔍 What’s Happening?

1. **Per-Class Metrics**
   - Only class **5** was predicted correctly.
   - All other classes had **no true labels** and/or **no predicted labels**, hence precision, recall, and F1 are `0.0000`.
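   - The macro scores are low (`0.2500`) because macro averaging weights every class equally, so classes that score zero pull the mean down. Note that the macro call passes no `labels=` argument, so scikit-learn averages over the classes present in `y_true` or `y_pred` rather than over all 8.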

2. **ROC AUC**
   - Only classes with **at least one positive and one negative** sample can have an AUC.
   - Classes like **0, 3, 4, 6, 7** were **never in `y_true`**, so AUC = `nan`.
   - `roc_auc_score` emits warnings because for those classes, it’s mathematically **undefined**.

3. **Macro ROC AUC**
   - If *any* class has AUC = `nan`, then the macro average becomes `nan` too.
   - This is why the output shows `🌐 Macro ROC AUC: nan`.
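
The condition in point 2 can be checked directly before calling `roc_auc_score`. Below is a minimal sketch, assuming integer class indices from 0 to 7 and the three example true labels used in the metrics cell. `classes_with_defined_auc` is a hypothetical helper name.

```python
def classes_with_defined_auc(y_true, n_classes):
    """Return classes that have at least one positive and one negative sample."""
    defined = []
    for cls in range(n_classes):
        positives = sum(1 for t in y_true if t == cls)
        negatives = len(y_true) - positives
        if positives > 0 and negatives > 0:
            defined.append(cls)
    return defined


example_true = [1, 2, 5]  # the example true labels from the metrics cell
defined = classes_with_defined_auc(example_true, n_classes=8)
undefined = [c for c in range(8) if c not in defined]
print(f"AUC defined for classes:   {defined}")
print(f"AUC undefined for classes: {undefined}")
```

One option is to average the per-class AUC only over the defined classes (for example with `np.nanmean`), though whether that is appropriate depends on how the competition scores submissions.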

