# Amini Cocoa Contamination Challenge Solution By SPECIALZ🔥🔥🙌🏽

This notebook documents my submission to the Amini Cocoa Contamination Challenge, which aims to build machine learning models capable of identifying multiple cocoa leaf diseases—like CSSVD and anthracnose—directly from images. The goal is to develop models that not only generalize well to unseen diseases, but also run efficiently on low-end smartphones, making real-time diagnosis accessible to smallholder farmers across Africa.

### ⚙️ Our Approach
Throughout this challenge, we went through several ups and downs: experimenting with different YOLO models, adjusting image sizes, tuning confidence thresholds, and merging predictions. Finding a model that balanced accuracy, robustness, and efficiency was not easy. At one point, results improved significantly around the 46th epoch, which showed the importance of careful monitoring and validation.

This notebook covers:

1. Model training using YOLOv11

2. Inference with test-time augmentation (TTA)

3. Prediction fusion using Weighted Box Fusion (WBF)

4. Final submission preparation

5. An explainability section to meet the competition requirements

Let’s dive into the code and explore what worked—and what didn’t—on the path to building an AI solution that could one day live on a farmer’s phone.

### Install Necessary Packages

In [None]:
!pip -q install -U ultralytics iterative-stratification

### Import Necessary Packages

In [None]:
import pandas as pd
import os
from pathlib import Path
import shutil
from sklearn.model_selection import train_test_split
from tqdm.notebook import tqdm
import cv2
import yaml
import matplotlib.pyplot as plt
from ultralytics import YOLO
import multiprocessing
import warnings
warnings.filterwarnings("ignore")
import random
from datetime import datetime
import time
from glob import glob
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold
from PIL import Image
import torch
import numpy as np
from ultralytics import RTDETR
from ultralytics.data.build import YOLODataset
import ultralytics.data.build as build
device='cuda'


### 📁  Set configurations and seed

In [None]:
class CFG:
    seed = 42
    random_state = 42
    folds=10

def seed_everything(seed):
    # Seed every random number generator the pipeline relies on
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Prefer deterministic cuDNN kernels over auto-tuned ones
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    torch.use_deterministic_algorithms(True, warn_only=True)
seed_everything(CFG.seed)

In [None]:
# Read the CSV File
df = pd.read_csv('/kaggle/input/amini-cocoa-contamination-challenge/Train.csv')

# Extract Unique Class Labels
unique_classes = df['class'].unique()

# Create a Class-to-Index Mapping
class_mapping = {cls: idx for idx, cls in enumerate(unique_classes)}
print(class_mapping)

### ✅ Purpose
This is a common preprocessing step in machine learning tasks to convert categorical class labels into numerical values that can be used for model training.
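
One detail worth keeping in mind: `unique()` returns labels in order of first appearance, so the index each class receives depends on the row order of the CSV. The sketch below illustrates this with made-up label names, using `dict.fromkeys` as a stand-in for `unique()`; it is an illustration only, not part of the pipeline.

```python
# Toy labels (hypothetical); dict.fromkeys keeps first-appearance order, like Series.unique()
labels = ["healthy", "cssvd", "healthy", "anthracnose", "cssvd"]
unique_in_order = list(dict.fromkeys(labels))
class_mapping = {cls: idx for idx, cls in enumerate(unique_in_order)}

# Reordering the rows changes which index each class gets
reordered_mapping = {cls: idx for idx, cls in enumerate(dict.fromkeys(reversed(labels)))}

print(class_mapping)
print(reordered_mapping)
```

Reusing one mapping for label files, `data.yaml`, and inference avoids this kind of mismatch.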

### 📁 Directory and Data Loading

In [None]:
# Set the data directory
DATA_DIR = Path('/kaggle/input/amini-cocoa-contamination-challenge/')
IMGS_DIR = Path('/kaggle/input/amini-cocoa-contamination-challenge/dataset/images')

# Load train and test files
train = pd.read_csv(DATA_DIR / 'Train.csv')
test = pd.read_csv(DATA_DIR / 'Test.csv')
ss = pd.read_csv(DATA_DIR / 'SampleSubmission.csv')

### ✅ Purpose
This setup organizes paths and loads necessary CSV files for training, inference, and submission formatting in a Kaggle environment.

### 📁 Let's explore the Train, Test, Sample Submission File

In [None]:
train.head()

In [None]:
test.head()

In [None]:
ss.head()

Clearly, the test set has no bounding-box or confidence values, which makes sense because those are what we need to predict.

Let's explore further.

### 🏷️ Checking for Duplicates

In [None]:
# Define the columns to keep in the train set
defined_cols = ['Image_ID', 'confidence', 'class', 'ymin', 'xmin', 'ymax', 'xmax']
train = train[defined_cols]

print(f'Number of duplicated rows: {train.duplicated().sum()}')
print(f'Size of dataframe before removing duplicates: {train.shape}')

# Remove duplicates
train = train.drop_duplicates()
print(f'Number of duplicated rows after removing duplicates: {train.duplicated().sum()}')
print(f'Size of dataframe after removing duplicates: {train.shape}')

### 🏷️ Generating Class Label Dictionary

In [None]:
unique_classes = train['class'].unique()
full_label_dict = {cls: idx for idx, cls in enumerate(unique_classes)}
full_label_dict

### ✅ Purpose
Converts categorical class labels into numerical indices for use in model training and evaluation.

## 🧪 Data Preparation and Stratified K-Fold Setup for Multi-Label Classification

### Group Labels by Image

In [None]:
# Step 1: Group by Image_ID and aggregate class labels into lists
train['new_class'] = train['class'].map(full_label_dict)
grouped = train.groupby('Image_ID')['new_class'].apply(list).reset_index()

### ✅ Purpose
1. Maps class labels to integers using full_label_dict.
2. Groups the data by Image_ID, aggregating all associated class labels into lists.
3. This ensures each image has a list of all labels assigned to it (for multi-label classification).
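
As a minimal illustration of the grouping step, the sketch below runs the same `groupby(...).apply(list)` call on a tiny hand-made table. The image IDs and label indices are invented for the example.

```python
import pandas as pd

# Toy annotations (hypothetical IDs and labels): one row per bounding box
toy = pd.DataFrame({
    "Image_ID": ["a.jpg", "a.jpg", "b.jpg"],
    "new_class": [0, 1, 1],
})

# One row per image, with every box label collected into a list
toy_grouped = toy.groupby("Image_ID")["new_class"].apply(list).reset_index()
print(toy_grouped)
```

An image with several boxes of the same class keeps the repeats in its list; they are collapsed later when the per-class columns are filled.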


### Initialize Class Columns with -1

In [None]:
all_classes = train["class"].unique().tolist()
for unique_class in all_classes:
    grouped[unique_class] = -1

### ✅ Purpose
1. Retrieves the list of all unique class names from the training data.

2. Adds a column for each class in the grouped DataFrame.

3. Initializes these columns to -1. A column is later set to 1 when its class appears in the image; classes that are absent keep the -1 placeholder.

### Reverse Label Mapping

In [None]:
reverse_label_mapping = {full_label_dict[key]:key for key in full_label_dict} 

In [None]:
reverse_label_mapping

### ✅ Purpose
1. Creates a dictionary to map numeric labels back to their original string labels.

2. This is useful for labeling class columns correctly in the next step.

### One-Hot Encode Labels for Each Image

In [None]:
# Set the class column to 1 for every label present in the image
# (classes that are absent keep the -1 placeholder)
all_labels_list = list(grouped['new_class'].values)
for train_index, label_list in enumerate(all_labels_list):
    for label in set(label_list):
        # Look up the column name directly instead of scanning a hard-coded range(23)
        grouped.loc[train_index, reverse_label_mapping[int(label)]] = 1

### ✅ Purpose
1. Iterates through each image's list of labels.

2. Converts the list into a set of unique labels.

3. For each label, checks if it matches a known label index.

4. If matched, sets the corresponding column in grouped to 1, indicating presence of that class label for that image.
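
The same encoding can also be written one class column at a time. The sketch below is an alternative formulation on toy data, keeping the notebook's convention of 1 for present and -1 for absent; the class names and the `toy_reverse_mapping` dictionary are made up for the example.

```python
import pandas as pd

# Toy per-image label lists (hypothetical) and a made-up index-to-name mapping
toy_grouped = pd.DataFrame({
    "Image_ID": ["a.jpg", "b.jpg"],
    "new_class": [[0, 1, 1], [2]],
})
toy_reverse_mapping = {0: "healthy", 1: "cssvd", 2: "anthracnose"}

# Fill one column per class: 1 if the class index occurs in the image, else -1
for class_idx, class_name in toy_reverse_mapping.items():
    toy_grouped[class_name] = toy_grouped["new_class"].apply(
        lambda labels, class_idx=class_idx: 1 if class_idx in labels else -1
    )

print(toy_grouped)
```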

### Create Stratified Folds and Image Paths

In [None]:
X = grouped[['Image_ID']]
grouped['fold'] = -1
mskf = MultilabelStratifiedKFold(n_splits=CFG.folds, shuffle=True, random_state=CFG.random_state)
for i_fold, (train_index, test_index) in enumerate(mskf.split(X, grouped[all_classes])):
    grouped.loc[test_index, "fold"] = i_fold     

### ✅ Purpose
1. Initializes a MultilabelStratifiedKFold for balanced fold generation.

2. Splits the data so that each fold has a similar distribution of labels across classes.

3. Assigns a fold number to each image for use in cross-validation.
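
One way to sanity-check the split is to count how many images contain each class in every fold. The helper below is a sketch with a hypothetical name (`fold_class_counts`), shown on toy data; it assumes the notebook's convention that a class column equals 1 when the class is present.

```python
import pandas as pd

def fold_class_counts(grouped, class_columns, fold_column="fold"):
    """Count, per fold, how many images contain each class (column value == 1)."""
    present = grouped[class_columns].eq(1)
    return present.groupby(grouped[fold_column]).sum()

# Toy table (made-up values): two folds, two classes
toy = pd.DataFrame({
    "fold": [0, 0, 1, 1],
    "cssvd": [1, -1, 1, -1],
    "healthy": [-1, 1, -1, 1],
})
counts = fold_class_counts(toy, ["cssvd", "healthy"])
print(counts)
```

In the notebook this could be called as `fold_class_counts(grouped, all_classes)`; roughly equal counts down each column would suggest the stratification behaved as intended.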

In [None]:
# create image_path for grouped_data
grouped['image_path'] = [Path(str(IMGS_DIR) + '/train/' + x) for x in grouped.Image_ID]

# Drop duplicate rows from the test set
test = test.drop_duplicates(subset=['Image_ID'], ignore_index=True)
test['image_path'] = [Path(str(IMGS_DIR) + '/test/' + x) for x in test.Image_ID]  

## 📝 Function to Convert Bounding Boxes to YOLO Format and Save Annotations

---

### 📦 Function: `save_yolo_annotation`

All values are normalized by the image width and height.

### Steps:

1. Reads the image to get its dimensions.

2. Normalizes bounding box coordinates.

3. Writes a .txt label file for the corresponding image in YOLO format.

4. Failsafe: If the image cannot be read, an error is raised.

In [None]:
# Function to convert the bounding boxes to YOLO format and save them
def save_yolo_annotation(row):

    image_path, class_id, output_dir = row['image_path'], row['class_id'], row['output_dir']

    img = cv2.imread(str(image_path))
    if img is None:
        raise ValueError(f"Could not read image from path: {image_path}")

    height, width, _ = img.shape
    label_file = Path(output_dir) / f"{Path(image_path).stem}.txt"


    ymin, xmin, ymax, xmax = row['ymin'], row['xmin'], row['ymax'], row['xmax']

    # Normalize the coordinates
    x_center = (xmin + xmax) / 2 / width
    y_center = (ymin + ymax) / 2 / height
    bbox_width = (xmax - xmin) / width
    bbox_height = (ymax - ymin) / height

    with open(label_file, 'a') as f:
        f.write(f"{class_id} {x_center:.6f} {y_center:.6f} {bbox_width:.6f} {bbox_height:.6f}\n")

### 🔍 Explanation
Purpose: Converts bounding boxes from (xmin, ymin, xmax, ymax) format to the YOLO format:
```text
class_id x_center y_center width height
```
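
To make the arithmetic concrete, here is a small worked example of the conversion and its inverse. The function names `to_yolo` and `from_yolo` and the box and image sizes are chosen for illustration; the formulas are the same ones used in `save_yolo_annotation`.

```python
def to_yolo(xmin, ymin, xmax, ymax, width, height):
    """Convert pixel corner coordinates to normalized YOLO (center, size) values."""
    x_center = (xmin + xmax) / 2 / width
    y_center = (ymin + ymax) / 2 / height
    box_w = (xmax - xmin) / width
    box_h = (ymax - ymin) / height
    return x_center, y_center, box_w, box_h

def from_yolo(x_center, y_center, box_w, box_h, width, height):
    """Invert to_yolo: recover pixel corner coordinates."""
    xmin = (x_center - box_w / 2) * width
    ymin = (y_center - box_h / 2) * height
    xmax = (x_center + box_w / 2) * width
    ymax = (y_center + box_h / 2) * height
    return xmin, ymin, xmax, ymax

# A 200x200 pixel box in a 640x480 image (made-up numbers)
yolo_box = to_yolo(100, 50, 300, 250, width=640, height=480)
pixel_box = from_yolo(*yolo_box, width=640, height=480)
print(yolo_box)
print(pixel_box)
```

The round trip back to pixel coordinates is a quick way to confirm that width and height have not been swapped.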

In [None]:
# Save the annotations row by row (sequential loop with a progress bar)
def process_dataset(dataframe, output_dir):
    dataframe['output_dir'] = output_dir
    # convert the dataframe to a dictionary
    dataframe = dataframe.to_dict('records')
    for i in tqdm(range(len(dataframe))):
        save_yolo_annotation(dataframe[i])

### 🔍 Explanation
1. Adds an output_dir column to the DataFrame for storing annotation files.

2. Converts the DataFrame to a list of dictionaries (records) for easier iteration.

3. Iterates over each row and calls save_yolo_annotation to save YOLO labels.

4. A progress bar (tqdm) tracks the process for better visibility.
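
Because `save_yolo_annotation` reads the image once per bounding box, images with many boxes are decoded repeatedly. A possible variation, sketched below under stated assumptions, groups the rows by image so each size is looked up once and each label file is written in a single pass. The function name `write_labels_grouped` and the `get_size` callable are hypothetical and not part of the original notebook.

```python
import tempfile
from collections import defaultdict
from pathlib import Path

def write_labels_grouped(records, output_dir, get_size):
    """Write one YOLO label file per image.

    records: dicts with image_path, class_id, xmin, ymin, xmax, ymax.
    get_size: hypothetical callable returning (width, height) for an image path.
    """
    rows_by_image = defaultdict(list)
    for row in records:
        rows_by_image[str(row["image_path"])].append(row)

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for image_path, rows in rows_by_image.items():
        width, height = get_size(image_path)  # one size lookup per image
        lines = []
        for r in rows:
            x_center = (r["xmin"] + r["xmax"]) / 2 / width
            y_center = (r["ymin"] + r["ymax"]) / 2 / height
            box_w = (r["xmax"] - r["xmin"]) / width
            box_h = (r["ymax"] - r["ymin"]) / height
            lines.append(f"{r['class_id']} {x_center:.6f} {y_center:.6f} {box_w:.6f} {box_h:.6f}")
        # Mode "w" overwrites, so re-running the cell does not duplicate lines
        (output_dir / f"{Path(image_path).stem}.txt").write_text("\n".join(lines) + "\n")
    return len(rows_by_image)

# Demo with made-up boxes and a fixed 640x480 size instead of reading real images
toy_records = [
    {"image_path": "a.jpg", "class_id": 0, "xmin": 100, "ymin": 50, "xmax": 300, "ymax": 250},
    {"image_path": "a.jpg", "class_id": 1, "xmin": 0, "ymin": 0, "xmax": 64, "ymax": 48},
]
with tempfile.TemporaryDirectory() as tmp:
    n_files = write_labels_grouped(toy_records, tmp, get_size=lambda path: (640, 480))
    label_text = (Path(tmp) / "a.txt").read_text()
print(n_files)
print(label_text)
```

In the notebook, `get_size` could for instance wrap `Image.open(path).size` from PIL, which returns `(width, height)`.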

#### Let's Implement this now!

In [None]:
# Add an image_path column
train['image_path'] = [Path(str(IMGS_DIR) + '/train/' + x) for x in train.Image_ID]

# Map string classes to integers (label encoding targets)
train['class_id'] = train['class'].map(full_label_dict)

## 🔁 Preparing YOLO Training/Validation/Test Data for a Specific Fold

This code processes one fold (specifically `fold == 1`, whose files are written to `fold_2` directories because the paths use `fold + 1`) of the dataset to organize images and labels into the format required for training a YOLO model.

---

In [None]:
# 🧠 Loop Through Folds

for fold in range(CFG.folds):
    if fold == 1:
        # 📁 Define Directory Structure
        # images
        TRAIN_IMAGES_DIR = Path(f'/kaggle/working/train/images/fold_{fold + 1}')
        VAL_IMAGES_DIR = Path(f'/kaggle/working/val/images/fold_{fold + 1}')
        TEST_IMAGES_DIR = Path('/kaggle/working/test/images')

        # labels
        TRAIN_LABELS_DIR = Path(f'/kaggle/working/train/labels/fold_{fold + 1}')
        VAL_LABELS_DIR = Path(f'/kaggle/working/val/labels/fold_{fold + 1}')
        TEST_LABELS_DIR = Path('/kaggle/working/test/labels')

        # Get the train and val for that fold
        train_fold = grouped[grouped['fold'] != fold ].reset_index(drop=True)
        val_fold = grouped[grouped['fold'] == fold].reset_index(drop=True)

        DIRS = [TRAIN_IMAGES_DIR, VAL_IMAGES_DIR, TRAIN_LABELS_DIR, VAL_LABELS_DIR, TEST_IMAGES_DIR, TEST_LABELS_DIR]
        
        # Create necessary directories
        for DIR in DIRS:
            if DIR.exists():
                shutil.rmtree(DIR)
            DIR.mkdir(parents=True, exist_ok=True)
       
        # Copy train, val, and test images to their respective dirs
        for img in tqdm(train_fold.image_path.unique()):
            shutil.copy(img, TRAIN_IMAGES_DIR / img.parts[-1])
        print(f'Copied train file for fold{fold+1} to folder')

        for img in tqdm(val_fold.image_path.unique()):
            shutil.copy(img, VAL_IMAGES_DIR / img.parts[-1])
        print(f'Copied val file for fold{fold+1} to folder')

        for img in tqdm(test.image_path.unique()):
            shutil.copy(img, TEST_IMAGES_DIR / img.parts[-1])
        print('Copied test files to folder')


        X_train = train[train.Image_ID.isin(train_fold.Image_ID)].reset_index(drop=True)
        X_val = train[train.Image_ID.isin(val_fold.Image_ID)].reset_index(drop=True)


        print(f"-------------Process Datasets for fold {fold+1}")
        # Save train and validation labels to their respective dirs
        process_dataset(X_train, TRAIN_LABELS_DIR)
        process_dataset(X_val, VAL_LABELS_DIR)

        print(f"-------------End of Processing of Datasets for fold {fold+1}")

### ✅ Summary
This fold-wise script performs the following:

1. Sets up directories for training, validation, and test images and labels.

2. Splits the dataset by fold.

3. Copies image files into YOLO-style directories.

4. Converts and saves labels in YOLO format using the process_dataset function.

5. Prepares data for training object detection models such as YOLOv11.

ℹ️ Only fold == 1 is processed in this script; removing the `if fold == 1` condition extends it to all folds.
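
After the copy and conversion steps, it can be worth confirming that every image has a matching label file. The helper below is a sketch with a hypothetical name (`find_unlabelled_images`), demonstrated on a temporary folder with empty placeholder files rather than the real dataset.

```python
import tempfile
from pathlib import Path

def find_unlabelled_images(images_dir, labels_dir, image_suffixes=(".jpg", ".jpeg", ".png")):
    """Return the names of images that have no .txt label file with the same stem."""
    label_stems = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(
        p.name
        for p in Path(images_dir).iterdir()
        if p.suffix.lower() in image_suffixes and p.stem not in label_stems
    )

# Demo on a throwaway directory: a.jpg has a label, b.jpg does not
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "images"
    labels = Path(tmp) / "labels"
    images.mkdir()
    labels.mkdir()
    (images / "a.jpg").touch()
    (images / "b.jpg").touch()
    (labels / "a.txt").write_text("0 0.5 0.5 0.1 0.1\n")
    missing = find_unlabelled_images(images, labels)
print(missing)
```

In the notebook this could be called with `TRAIN_IMAGES_DIR` and `TRAIN_LABELS_DIR`; an unexpectedly long list would point to a problem in the copy or conversion step.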

### Optional: If You Are Using Kaggle, Run This Cell to Avoid Errors

In [None]:
# Define the new dataset directory structure within the current working directory
base_dir = './datasets'  # Create the dataset in the local writable directory
dirs = [
    os.path.join(base_dir, 'train/images'),
    os.path.join(base_dir, 'train/labels'),
    os.path.join(base_dir, 'val/images'),
    os.path.join(base_dir, 'val/labels')
]

# Create the directories
for dir_path in dirs:
    os.makedirs(dir_path, exist_ok=True)

# Example: Source directories where your current files are stored (update these paths)
source_train_images = './train/images'
source_train_labels = './train/labels'
source_val_images = './val/images'
source_val_labels = './val/labels'

# Move files to the new structure
def move_files(source, destination):
    if os.path.exists(source):
        for file_name in os.listdir(source):
            shutil.move(os.path.join(source, file_name), destination)

# Move training images and labels
move_files(source_train_images, os.path.join(base_dir, 'train/images'))
move_files(source_train_labels, os.path.join(base_dir, 'train/labels'))

# Move validation images and labels
move_files(source_val_images, os.path.join(base_dir, 'val/images'))
move_files(source_val_labels, os.path.join(base_dir, 'val/labels'))

## 📄 Create `data.yaml` File for YOLO Training

This script creates the `data.yaml` file, which YOLO models (such as YOLOv11 or YOLOv12) require in order to locate the dataset and its class names.

In [None]:
# Create a data.yaml file required by YOLO
# 🏷️ Extract Class Names and Count
class_names = train['class'].unique().tolist()
num_classes = len(class_names)

# 📁 Define Training and Validation Image Directories
# images
TRAIN_IMAGES_DIR = Path('/kaggle/working/datasets/train/images/fold_2/')
VAL_IMAGES_DIR = Path('/kaggle/working/datasets/val/images/fold_2/')

#📘 Build YAML Configuration Dictionary
data_yaml = {
    'train': str(TRAIN_IMAGES_DIR),
    'val': str(VAL_IMAGES_DIR),
    'nc': num_classes,
    'names': class_names
}

# Save the data.yaml file
yaml_path = Path('data.yaml')
with open(yaml_path, 'w') as file:
    yaml.dump(data_yaml, file, default_flow_style=False)

### ✅ Purpose
Constructs the dictionary for the data.yaml file, containing:

train: path to training images.

val: path to validation images.

nc: number of classes.

names: list of class names (in order).

Saves the constructed dictionary into a data.yaml file.

This file will be used by YOLO during training to understand dataset structure and class names.
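
Since YOLO label files store class indices, the order of `names` has to agree with the mapping used when the labels were written. The check below is a small sketch with a hypothetical helper name and made-up class names; in the notebook it could be called as `names_match_mapping(class_names, full_label_dict)`.

```python
def names_match_mapping(class_names, label_dict):
    """True if position i in class_names is the class that label_dict maps to index i."""
    if len(class_names) != len(label_dict):
        return False
    return all(label_dict.get(name) == idx for idx, name in enumerate(class_names))

# Toy mapping (hypothetical class names)
toy_label_dict = {"healthy": 0, "cssvd": 1, "anthracnose": 2}
in_order = names_match_mapping(["healthy", "cssvd", "anthracnose"], toy_label_dict)
shuffled = names_match_mapping(["cssvd", "healthy", "anthracnose"], toy_label_dict)
print(in_order, shuffled)
```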

In [None]:
data_yaml

## 🧠 Train First Model, A YOLOv11 Model on Custom Dataset

This code snippet initializes a YOLOv11 model and trains it using the configuration specified in the `data.yaml` file.


### 🚀 Hyperparameters
```yaml
data: Path to the data.yaml file describing the dataset.

epochs: Train for up to 100 full passes over the training dataset.

imgsz: Resize input images to 640×640 pixels.

device: Use GPU device 0 for training.

batch: Batch size of 16 images per step.

optimizer: Use the AdamW optimizer (decoupled weight decay).

lr0: Initial learning rate of 3e-4.

momentum: Set to 0.9; Ultralytics uses this value as beta1 when the optimizer is AdamW.

weight_decay: Weight decay of 1e-2 to limit overfitting.

close_mosaic: Disable YOLO’s mosaic augmentation for the final 30 epochs (stabilizes training).

seed: Set a fixed random seed (42) for reproducibility.

patience: Stop training early after 10 epochs without improvement on the validation set.
```
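
The interaction between `epochs` and `close_mosaic` is easy to misread, so here is a short arithmetic sketch. It assumes the documented Ultralytics behaviour that `close_mosaic=N` turns mosaic off for the last N scheduled epochs; the helper name `mosaic_schedule` is made up for the example.

```python
def mosaic_schedule(epochs, close_mosaic):
    """Return (epochs with mosaic on, epochs with mosaic off) as 0-based ranges."""
    first_off = max(epochs - close_mosaic, 0)
    return range(0, first_off), range(first_off, epochs)

mosaic_on, mosaic_off = mosaic_schedule(epochs=100, close_mosaic=30)
print(f"mosaic on for {len(mosaic_on)} epochs, off from 0-based epoch {mosaic_off.start}")
```

Under this assumption, a run that early stopping ends before epoch 70 would never reach the mosaic-free phase, which is worth remembering when combining `close_mosaic` with a small `patience`.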

In [None]:
# 🔧 Load Pretrained YOLOv11 Model
model1 = YOLO("yolo11l.pt")

# 🚀 Train for up to 100 epochs; early stopping (patience=10) can end the run sooner
model1.train(data='data.yaml',
             epochs=100,               
             imgsz=640,
             device=0,
             batch=16,
             optimizer='AdamW',
             lr0=3e-4,           
             momentum=0.9,
             weight_decay=1e-2,
             close_mosaic=30,
             seed=42,
             patience=10      
)

### 📈 Outcome
This setup fine-tunes the YOLOv11 large model on the custom dataset, using AdamW with weight decay and mosaic augmentation to improve generalization. After training, the best model checkpoint is saved and can be used for inference or further evaluation.

## ✅ Evaluate the Trained YOLOv11 Model

After training, we evaluate the model's performance on the validation dataset using the `.val()` method.

In [None]:
results = model1.val()

Our validation score on this fold is 0.821, which is a strong result. Let's train another model and ensemble the two.
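
To record the headline numbers alongside the checkpoint, the metrics object returned by `.val()` can be condensed into a small dictionary. The sketch below assumes the object exposes `box.map50` and `box.map`, as Ultralytics detection metrics are expected to; attribute names may differ between versions. The helper name is hypothetical, and the stub with made-up numbers only stands in for `results` so the snippet runs without a trained model.

```python
from types import SimpleNamespace

def summarize_detection_metrics(results):
    """Collect mAP@0.5 and mAP@0.5:0.95 from a detection metrics object."""
    box = results.box
    return {
        "mAP50": round(float(box.map50), 3),
        "mAP50-95": round(float(box.map), 3),
    }

# Stand-in with made-up numbers; in the notebook, pass the object returned by model1.val()
fake_results = SimpleNamespace(box=SimpleNamespace(map50=0.9, map=0.7))
summary = summarize_detection_metrics(fake_results)
print(summary)
```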

Save `best.pt` for inference.

### Move to the second model notebook before the inference and explainability notebook