# Weed Detection using Computer Vision

This notebook focuses on developing a weed detection system using computer vision techniques. We are using the "Weed Detection" dataset from Kaggle, which contains images of weed species in agricultural settings.

## Dataset Description
- The dataset consists of annotated images of weed species in various growth stages.
- Images are split into training and test sets.
- Annotations are provided in COCO format for object detection.

## Objective
The goal is to build and train a model that can accurately detect and localize weed species in images. This has practical applications in precision agriculture and automated weed control systems.

## Approach
We will be using:
- YOLO (You Only Look Once) for object detection
- Data augmentation techniques to improve model robustness
- PyTorch and the Ultralytics library for deep learning implementation


In [1]:
import torch
torch.cuda.empty_cache()

In [None]:
%pip install torchmetrics[detection]
%pip install albumentations
%pip install torchvision
%pip install ultralytics
%pip install kagglehub

In [1]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import albumentations as A
import json
import shutil
import yaml
import cv2
import os
from ultralytics import YOLO
import torchvision
import torch
from torchvision.models import resnet50
import torch.nn.functional as F
from torch.utils.data import DataLoader
from transformers import DetrImageProcessor, DetrForObjectDetection
from torch.utils.data import Dataset
from PIL import Image
import torch.nn as nn
from tqdm.auto import tqdm
from torchmetrics.detection.mean_ap import MeanAveragePrecision
from datetime import datetime


In [2]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("jaidalmotra/weed-detection")
print("Path to dataset files:", path)

Path to dataset files: /root/.cache/kagglehub/datasets/jaidalmotra/weed-detection/versions/1


# YOLO (You Only Look Once)
In this section, we will:
- Convert our COCO format annotations to YOLO format
- Set up a YOLO model for weed detection 
- Train the model on our dataset
- Evaluate model performance
- Make predictions on test images

YOLO is an efficient real-time object detection system that processes images in a single pass through a neural network, making it ideal for our weed detection task.

In [3]:
# Define paths for train and test data
train_folder = os.path.join(path, 'train')
test_folder = os.path.join(path, 'test')


# Converting COCO Format to YOLO Format

In this step, we'll convert our dataset annotations from COCO format to YOLO format, which is required for training YOLO models.

COCO format uses absolute pixel coordinates in the format:
- `[x, y, width, height]` where (x,y) is the top-left corner
- Separate JSON files containing image info and annotations

YOLO format uses normalized coordinates in a simple text format:
- `<class_id> <x_center> <y_center> <width> <height>`
- All values are normalized between 0 and 1
- One text file per image with the same name

The conversion process will:
1. Read the COCO JSON annotations
2. For each annotation:
   - Convert absolute coordinates to normalized values
   - Convert top-left format to center point format
   - Write in YOLO's text file format
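
As a small worked example of the two coordinate steps above, the sketch below converts one box. The helper name `coco_to_yolo_bbox` and the box values are hypothetical and chosen only so the arithmetic is easy to follow; they do not come from the dataset.

```python
def coco_to_yolo_bbox(bbox, img_width, img_height):
    """Convert a COCO [x, y, w, h] pixel box to a normalized YOLO box."""
    x, y, w, h = bbox
    # Shift from the top-left corner to the box center, then normalize
    x_center = (x + w / 2) / img_width
    y_center = (y + h / 2) / img_height
    return x_center, y_center, w / img_width, h / img_height


# Hypothetical 320x160 box whose top-left corner is at (160, 320) in a 640x640 image
example = coco_to_yolo_bbox([160, 320, 320, 160], 640, 640)
print(example)  # (0.5, 0.625, 0.5, 0.25)
```

The center lands at x = (160 + 160) / 640 = 0.5 and y = (320 + 80) / 640 = 0.625, and the size becomes half the image width and a quarter of its height.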



In [4]:
def convert_to_yolo_format(coco_annotations, image_folder, output_folder):
    """
    Convert COCO annotations to YOLO format
    YOLO format: <class> <x_center> <y_center> <width> <height>
    Values are normalized between 0 and 1

    Note: image_folder is accepted for symmetry with the call sites
    but is not needed for the conversion itself.
    """
    # Create output folder if it doesn't exist
    os.makedirs(output_folder, exist_ok=True)

    # Create mapping of image_id to file_name and dimensions
    image_info = {}
    for img in coco_annotations['images']:
        image_info[img['id']] = {
            'file_name': img['file_name'],
            'width': img['width'],
            'height': img['height']
        }

    # Collect the YOLO lines of each label file before writing, so that
    # re-running this function overwrites files instead of appending duplicates
    labels_per_file = {}

    for ann in coco_annotations['annotations']:
        img_data = image_info[ann['image_id']]

        # Get bbox coordinates (COCO: top-left corner plus size, in pixels)
        x, y, w, h = ann['bbox']

        # Convert to YOLO format (normalized center point plus size)
        x_center = (x + w/2) / img_data['width']
        y_center = (y + h/2) / img_data['height']
        width = w / img_data['width']
        height = h / img_data['height']

        # Class ID: YOLO expects 0-based classes. This assumes every annotation
        # uses a category_id >= 1; an annotation with category_id 0 would be
        # written with the invalid class -1.
        class_id = ann['category_id'] - 1

        # Create YOLO format line
        yolo_line = f"{class_id} {x_center} {y_center} {width} {height}"

        label_file = os.path.splitext(img_data['file_name'])[0] + '.txt'
        labels_per_file.setdefault(label_file, []).append(yolo_line)

    # Write one label file per annotated image
    for label_file, yolo_lines in labels_per_file.items():
        label_path = os.path.join(output_folder, label_file)
        with open(label_path, 'w') as f:
            f.write('\n'.join(yolo_lines) + '\n')



# Function: visualize_yolo_dataset()

This function visualizes images and their corresponding YOLO format annotations by:
- Taking a directory path containing images and their annotation files
- Randomly selecting a specified number of images (default 3) 
- For each selected image:
  - Loading the image and its YOLO annotation file (.txt)
  - Converting YOLO's normalized coordinates to pixel coordinates
  - Drawing red bounding boxes around the annotated objects
  - Displaying the image with annotations in a subplot

Parameters:
- data_path: Directory containing the image and annotation files
- num_samples: Number of random images to visualize (default=3)

The YOLO annotation format used is `class_id x_center y_center width height`, where all coordinates are normalized between 0 and 1.
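
To illustrate the conversion from normalized coordinates back to pixels, here is a minimal sketch. The helper name `yolo_to_pixel_box` and the sample annotation line are hypothetical, and the 640x640 image size is assumed for the example.

```python
def yolo_to_pixel_box(x_center, y_center, width, height, img_width, img_height):
    """Convert a normalized YOLO box to (x_min, y_min, w, h) in pixels."""
    x_min = (x_center - width / 2) * img_width
    y_min = (y_center - height / 2) * img_height
    return x_min, y_min, width * img_width, height * img_height


# Hypothetical annotation line for a 640x640 image
line = "0 0.5 0.625 0.5 0.25"
class_id, *coords = line.split()
box = yolo_to_pixel_box(*map(float, coords), img_width=640, img_height=640)
print(int(class_id), box)  # 0 (160.0, 320.0, 320.0, 160.0)
```

The top-left corner is what `matplotlib.patches.Rectangle` expects as its anchor point, which is why the center is shifted by half the width and height.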




In [5]:
def visualize_yolo_dataset(data_path, num_samples=3):
    """
    Visualize images and their YOLO-format annotations
    """
    # Pick a few random images
    image_files = [f for f in os.listdir(data_path) if f.endswith(('.jpg', '.jpeg', '.png'))]
    selected_files = np.random.choice(image_files, min(num_samples, len(image_files)), replace=False)
    
    plt.figure(figsize=(15, 5*num_samples))
    
    for idx, img_file in enumerate(selected_files):
        # Load the image (OpenCV reads BGR, matplotlib expects RGB)
        img_path = os.path.join(data_path, img_file)
        img = cv2.imread(img_path)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        
        # Locate the matching YOLO annotation file
        txt_file = os.path.join(data_path, os.path.splitext(img_file)[0] + '.txt')
        
        plt.subplot(num_samples, 1, idx + 1)
        plt.imshow(img)
        
        if os.path.exists(txt_file):
            with open(txt_file, 'r') as f:
                lines = f.readlines()
                
            img_height, img_width = img.shape[:2]
            
            for line in lines:
                # YOLO format: class x_center y_center width height
                class_id, x_center, y_center, width, height = map(float, line.strip().split())
                
                # Convert normalized coordinates to pixels
                x = int((x_center - width/2) * img_width)
                y = int((y_center - height/2) * img_height)
                w = int(width * img_width)
                h = int(height * img_height)
                
                # Draw the bounding box
                rect = patches.Rectangle(
                    (x, y), w, h,
                    linewidth=2,
                    edgecolor='r',
                    facecolor='none'
                )
                plt.gca().add_patch(rect)
                
                # Add the label (class_id is parsed as a float, so cast it for display)
                plt.text(x, y-5, f'Weed {int(class_id)}', color='r', fontsize=12)
        
        plt.title(f'Image: {img_file}')
        plt.axis('off')
    
    plt.tight_layout()
    plt.show()


# Function Documentation

## clean_yolo_annotations(data_path)

This function cleans YOLO annotation files by removing duplicate bounding box annotations.

### Parameters:
- `data_path`: Path to the directory containing YOLO annotation files (.txt)

### Process:
1. Finds all .txt annotation files in the specified directory
2. For each file:
   - Reads all annotation lines
   - Converts annotations to tuples for deduplication
   - Uses a set to identify and remove duplicates
   - Rewrites the file with only unique annotations

### Returns:
- Number of duplicate annotations removed
- Number of files that were cleaned
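
The deduplication idea can be tried on in-memory lines before touching any files. The sketch below is a variant, not the function used in this notebook: the helper name `deduplicate_yolo_lines` and the sample lines are hypothetical, and unlike a plain `set` it keeps the first-seen order of the annotations.

```python
def deduplicate_yolo_lines(lines):
    """Return unique YOLO annotation lines, keeping first-seen order."""
    seen = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        # Compare numerically so "0.50" and "0.5" count as the same value
        key = (int(float(parts[0])), *map(float, parts[1:]))
        seen.setdefault(key, line.strip())
    return list(seen.values())


# Hypothetical file content: the second line duplicates the first
sample = [
    "0 0.5 0.5 0.2 0.2\n",
    "0 0.50 0.5 0.2 0.2\n",
    "0 0.1 0.1 0.05 0.05\n",
    "\n",
]
unique = deduplicate_yolo_lines(sample)
print(unique)  # ['0 0.5 0.5 0.2 0.2', '0 0.1 0.1 0.05 0.05']
```

Comparing parsed numbers rather than raw strings matters because the same box can be written with different formatting.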


In [6]:
def clean_yolo_annotations(data_path):
    """
    Clean YOLO annotations by removing duplicates
    """
    txt_files = [f for f in os.listdir(data_path) if f.endswith('.txt')]
    duplicates_found = 0
    files_cleaned = 0
    
    for txt_file in txt_files:
        file_path = os.path.join(data_path, txt_file)
        
        # Read the annotations, ignoring blank lines
        with open(file_path, 'r') as f:
            lines = [line for line in f.readlines() if line.strip()]
        
        # Convert each line to a tuple of floats so it can be stored in a set
        unique_annotations = set()
        for line in lines:
            values = tuple(map(float, line.strip().split()))
            unique_annotations.add(values)
        
        # If duplicates were found
        if len(unique_annotations) < len(lines):
            duplicates_found += len(lines) - len(unique_annotations)
            files_cleaned += 1
            
            # Rewrite the file without duplicates, keeping the class ID an integer
            with open(file_path, 'w') as f:
                for class_id, *coords in unique_annotations:
                    f.write(f"{int(class_id)} {' '.join(map(str, coords))}\n")
    
    print("Cleaning complete:")
    print(f"- {duplicates_found} duplicate annotations removed")
    print(f"- {files_cleaned} files cleaned")
    
    return duplicates_found, files_cleaned


# Check the annotations after cleaning
def verify_annotations_after_cleaning(data_path):
    """
    Print a few annotation files so the cleaned result can be inspected
    """
    txt_files = [f for f in os.listdir(data_path) if f.endswith('.txt')]
    
    print(f"\nChecking annotations in {data_path}:")
    print(f"Total number of annotation files: {len(txt_files)}")
    
    # Inspect a few examples
    sample_files = np.random.choice(txt_files, min(3, len(txt_files)), replace=False)
    print("\nExamples of cleaned annotations:")
    
    for txt_file in sample_files:
        with open(os.path.join(data_path, txt_file), 'r') as f:
            content = f.readlines()
        print(f"\n{txt_file} ({len(content)} annotations):")
        for line in content:
            print(line.strip())



## COCO Format Analysis
The dataset uses the COCO (Common Objects in Context) format, which is a standard for object detection tasks. We analyze:
- Total number of images in the dataset
- Number of object annotations
- Number of weed categories
- Distribution of annotations per image
- Basic image metadata (dimensions, file names)

This analysis helps us understand:
- The scale of our dataset
- The complexity of our detection task
- Potential class imbalance issues
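
One way to check for class imbalance is to count annotations per category. The sketch below runs on a toy COCO-style dictionary: the annotation entries are made up for illustration and the helper name `count_annotations_per_category` is hypothetical. The same function could be applied to the loaded annotation file.

```python
from collections import Counter


def count_annotations_per_category(coco):
    """Map each category name to its number of annotations."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(ann["category_id"] for ann in coco["annotations"])
    # Include categories with zero annotations so gaps are visible
    return {name: counts.get(cat_id, 0) for cat_id, name in names.items()}


# Toy data with made-up annotations (not taken from the real dataset)
toy_coco = {
    "categories": [
        {"id": 0, "name": "grass-weeds"},
        {"id": 1, "name": "0 ridderzuring"},
    ],
    "annotations": [
        {"image_id": 0, "category_id": 1},
        {"image_id": 0, "category_id": 1},
        {"image_id": 1, "category_id": 1},
    ],
}
per_category = count_annotations_per_category(toy_coco)
print(per_category)  # {'grass-weeds': 0, '0 ridderzuring': 3}
```

A category with zero annotations, as in this toy example, would indicate that it acts only as a grouping label and not as a detectable class.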

In [7]:
# Load and explore the COCO annotations
import json

# Load the COCO annotation file
with open(path + '/train/_annotations.coco.json', 'r') as f:
    coco_data = json.load(f)

# Print basic dataset information
print("Dataset Information:")
print(f"Number of images: {len(coco_data['images'])}")
print(f"Number of annotations: {len(coco_data['annotations'])}")
print(f"Number of categories: {len(coco_data['categories'])}\n")

# Print category information
print("Categories:")
for category in coco_data['categories']:
    print(f"ID: {category['id']}, Name: {category['name']}")

# Get some statistics about annotations per image
annotations_per_image = {}
for ann in coco_data['annotations']:
    img_id = ann['image_id']
    if img_id not in annotations_per_image:
        annotations_per_image[img_id] = 0
    annotations_per_image[img_id] += 1

# Calculate and print annotation statistics
num_annotations = list(annotations_per_image.values())
print(f"\nAnnotation Statistics:")
print(f"Average annotations per image: {sum(num_annotations)/len(num_annotations):.2f}")
print(f"Max annotations in a single image: {max(num_annotations)}")
print(f"Min annotations in a single image: {min(num_annotations)}")

# Print sample image information
print("\nSample Image Information:")
for img in coco_data['images'][:3]:  # Show first 3 images
    print(f"ID: {img['id']}")
    print(f"File name: {img['file_name']}")
    print(f"Width: {img['width']}, Height: {img['height']}")
    print()


Dataset Information:
Number of images: 1661
Number of annotations: 4199
Number of categories: 2

Categories:
ID: 0, Name: grass-weeds
ID: 1, Name: 0 ridderzuring

Annotation Statistics:
Average annotations per image: 2.54
Max annotations in a single image: 19
Min annotations in a single image: 1

Sample Image Information:
ID: 0
File name: Rumex-obtusifolius-L_1703_jpg.rf.00ce9f9ea755f686d01d88767ff162ee.jpg
Width: 640, Height: 640

ID: 1
File name: ridderzuring_0981_jpg.rf.00938fc387bb7acbdd49a94a987fa58c.jpg
Width: 640, Height: 640

ID: 2
File name: Rumex-obtusifolius-L_0194_jpg.rf.00065c173713e91bdc5e600a761fa880.jpg
Width: 640, Height: 640



In [8]:
# Load the COCO annotation file
with open(path + '/train/_annotations.coco.json', 'r') as f:
    coco_data = json.load(f)

# Convert COCO annotations to YOLO format
convert_to_yolo_format(coco_data, path + '/train', 'yolo_dataset/train')

train_path = 'yolo_dataset/train'

In [9]:
# Convert COCO annotations to YOLO format for test set
with open(path + '/test/_annotations.coco.json', 'r') as f:
    test_coco_data = json.load(f)

# Convert test annotations to YOLO format 
convert_to_yolo_format(test_coco_data, path + '/test', 'yolo_dataset/test')

test_path = 'yolo_dataset/test'

In [10]:
# Create the YOLO folders if they do not exist
for dataset in ['train', 'test']:
    os.makedirs(f'yolo_dataset/{dataset}', exist_ok=True)

# Copy the images next to the label files for each dataset
for dataset in ['train', 'test']:
    source_folder = os.path.join(path, dataset)
    target_folder = f'yolo_dataset/{dataset}'
    
    # Copy all images
    for img_file in os.listdir(source_folder):
        if img_file.endswith(('.jpg', '.jpeg', '.png')):
            src_path = os.path.join(source_folder, img_file)
            dst_path = os.path.join(target_folder, img_file)
            shutil.copy2(src_path, dst_path)

# Write data.yaml and build the lists of image and label paths
data_yaml = {
    'train': train_path,
    'val': test_path,
    'nc': 1,
    'names': ['weed']
}

with open('yolo_dataset/data.yaml', 'w') as f:
    yaml.dump(data_yaml, f)

# For each dataset (train and test)
for dataset in ['train', 'test']:
    img_path = f'yolo_dataset/{dataset}'
    
    # List all annotation files (.txt)
    txt_files = [f for f in os.listdir(img_path) if f.endswith('.txt')]
    
    # Build the matching image paths
    img_paths = []
    label_paths = []
    
    for txt_file in txt_files:
        base_name = os.path.splitext(txt_file)[0]
        # Check whether the corresponding image exists
        for ext in ['.jpg', '.jpeg', '.png']:
            img_file = base_name + ext
            if os.path.exists(os.path.join(img_path, img_file)):
                img_paths.append(os.path.join(img_path, img_file))
                label_paths.append(os.path.join(img_path, txt_file))
                break
    
    # Write the paths to files
    with open(f'yolo_dataset/{dataset}_images.txt', 'w') as f:
        f.write('\n'.join(img_paths))
    
    with open(f'yolo_dataset/{dataset}_labels.txt', 'w') as f:
        f.write('\n'.join(label_paths))
    
    print(f"\nDataset {dataset}:")
    print(f"- Number of annotation files: {len(txt_files)}")
    print(f"- Number of image-annotation pairs: {len(img_paths)}")


Dataset train:
- Number of annotation files: 1655
- Number of image-annotation pairs: 1655

Dataset test:
- Number of annotation files: 244
- Number of image-annotation pairs: 244


In [None]:
visualize_yolo_dataset('yolo_dataset/train')
visualize_yolo_dataset('yolo_dataset/test')

In [None]:
# Import required libraries
import numpy as np
import cv2
import os
import albumentations as A
import shutil

# Define augmentation pipeline
transform = A.Compose([
    A.RandomRotate90(p=0.5),
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.OneOf([
        A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
        A.RandomGamma(gamma_limit=(80, 120), p=0.5),
    ], p=0.5),
    A.OneOf([
        A.GaussNoise(var_limit=(10.0, 50.0), p=0.5),
        A.GaussianBlur(blur_limit=(3, 7), p=0.5),
    ], p=0.5),
    A.OneOf([
        A.RandomScale(scale_limit=0.2, p=0.5),
        A.Resize(height=640, width=640, p=0.5),
    ], p=0.5),
], bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']))

# Create augmented training directory
augmented_train_folder = os.path.join('yolo_dataset/train_with_aug')
os.makedirs(augmented_train_folder, exist_ok=True)

# First, copy all original files to the new directory
original_train_folder = os.path.join('yolo_dataset/train')
for file in os.listdir(original_train_folder):
    src = os.path.join(original_train_folder, file)
    dst = os.path.join(augmented_train_folder, file)
    shutil.copy2(src, dst)

# Get list of training images
image_files = [f for f in os.listdir(original_train_folder) if f.endswith(('.jpg', '.jpeg', '.png'))]
num_augmentations = 3  # Number of augmented versions to create per image

# Perform augmentation
for img_file in image_files:
    # Load image
    img_path = os.path.join(original_train_folder, img_file)
    image = cv2.imread(img_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    
    # Load corresponding label file
    label_file = os.path.splitext(img_file)[0] + '.txt'
    label_path = os.path.join(original_train_folder, label_file)
    
    if os.path.exists(label_path):
        # Read boxes and class labels
        boxes = []
        class_labels = []
        with open(label_path, 'r') as f:
            for line in f:
                class_id, x_center, y_center, width, height = map(float, line.strip().split())
                boxes.append([x_center, y_center, width, height])
                class_labels.append(class_id)
        
        # Create augmented versions
        for i in range(num_augmentations):
            # Apply augmentation
            transformed = transform(image=image, bboxes=boxes, class_labels=class_labels)
            aug_image = transformed['image']
            aug_boxes = transformed['bboxes']
            # Use the labels returned by the transform so they stay aligned
            # with the boxes if any box is dropped during augmentation
            aug_labels = transformed['class_labels']
            
            # Save augmented image
            aug_img_file = f"{os.path.splitext(img_file)[0]}_aug_{i}{os.path.splitext(img_file)[1]}"
            aug_img_path = os.path.join(augmented_train_folder, aug_img_file)
            cv2.imwrite(aug_img_path, cv2.cvtColor(aug_image, cv2.COLOR_RGB2BGR))
            
            # Save augmented labels
            aug_label_file = f"{os.path.splitext(img_file)[0]}_aug_{i}.txt"
            aug_label_path = os.path.join(augmented_train_folder, aug_label_file)
            
            with open(aug_label_path, 'w') as f:
                for box, class_id in zip(aug_boxes, aug_labels):
                    f.write(f"{int(class_id)} {' '.join(map(str, box))}\n")

# Print statistics
original_images = len(image_files)
total_images = len([f for f in os.listdir(augmented_train_folder) if f.endswith(('.jpg', '.jpeg', '.png'))])
augmented_images = total_images - original_images

print(f"Original training images: {original_images}")
print(f"Augmented images created: {augmented_images}")
print(f"Total training images available: {total_images}")


In [None]:
# Get the absolute path of the project
import os
current_dir = os.getcwd()

# Create data.yaml with absolute paths, pointing training at the augmented data
data_yaml = {
    # Combined directory: original images plus their augmented copies
    # (use 'yolo_dataset/train' instead to train on the original data only)
    'train': os.path.join(current_dir, 'yolo_dataset/train_with_aug'),
    'val': os.path.join(current_dir, 'yolo_dataset/test'),  # absolute path to the test set
    'nc': 1,  # number of classes
    'names': ['weed']  # class names
}

# Write data.yaml
with open('yolo_dataset/data.yaml', 'w') as f:
    yaml.dump(data_yaml, f)

# Check the file contents
with open('yolo_dataset/data.yaml', 'r') as f:
    print(f.read())

In [None]:
# Check the contents of data.yaml
with open('yolo_dataset/data.yaml', 'r') as f:
    content = yaml.safe_load(f)
    print("Configuration data.yaml:")
    print(f"- Train path: {content['train']}")
    print(f"- Val path: {content['val']}")
    print(f"- Number of classes: {content['nc']}")
    print(f"- Class names: {content['names']}")

In [None]:
import torch
device = 'cuda' if torch.cuda.is_available() else 'cpu'

print(f"Using device: {device}")

## Model Training Section



In [None]:
# Check CUDA availability and GPU info
import torch
print(f"CUDA available: {torch.cuda.is_available()}")

# The device queries below raise an error on CPU-only machines, so guard them
if torch.cuda.is_available():
    print(f"Current device: {torch.cuda.current_device()}")
    print(f"Device name: {torch.cuda.get_device_name()}")
    print(f"Device memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

    # Clear CUDA cache
    torch.cuda.empty_cache()

In [None]:
# Load the pretrained YOLO11 medium model
model = YOLO("yolo11m.pt")

# Train the model
results = model.train(
    data='yolo_dataset/data.yaml',  # Path to the data.yaml file
    epochs=20,                     # Number of epochs
    imgsz=640,                     # Image size
    batch=16,                      # Batch size
    name='weed_detection',
    device=device
)

In [None]:
# Locate the output folder of the training run.
# Ultralytics appends a numeric suffix (e.g. weed_detection2) when a run with
# the same name already exists, so adjust the folder name if needed.
target_dir = os.path.join('runs', 'detect', 'weed_detection')

print(f"Target directory: {target_dir}")

# List contents of the run directory if it exists
if os.path.exists(target_dir):
    print(f"\nContents of {target_dir}:")
    for item in os.listdir(target_dir):
        print(f"- {item}")
else:
    print("\nDirectory not found. Please check if the path is correct.")


In [None]:
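# Import necessary libraries
import matplotlib.pyplot as plt
import pandas as pd

# Load and plot training results
results = pd.read_csv('runs/detect/weed_detection/results.csv')
# Some Ultralytics versions pad the column names with spaces; strip them
# so the lookups below work either way
results.columns = results.columns.str.strip()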

# Plot training metrics
plt.figure(figsize=(15, 10))

# Training Losses
plt.subplot(2, 2, 1)
plt.plot(results['epoch'], results['train/box_loss'], label='Box Loss')
plt.plot(results['epoch'], results['train/cls_loss'], label='Class Loss')
plt.plot(results['epoch'], results['train/dfl_loss'], label='DFL Loss')
plt.title('Training Losses')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

# Precision, Recall, mAP
plt.subplot(2, 2, 2)
plt.plot(results['epoch'], results['metrics/precision(B)'], label='Precision')
plt.plot(results['epoch'], results['metrics/recall(B)'], label='Recall')
plt.plot(results['epoch'], results['metrics/mAP50(B)'], label='mAP50')
plt.plot(results['epoch'], results['metrics/mAP50-95(B)'], label='mAP50-95')
plt.title('Model Performance Metrics')
plt.xlabel('Epoch')
plt.ylabel('Value')
plt.legend()

# Validation Losses
plt.subplot(2, 2, 3)
plt.plot(results['epoch'], results['val/box_loss'], label='Val Box Loss')
plt.plot(results['epoch'], results['val/cls_loss'], label='Val Class Loss')
plt.plot(results['epoch'], results['val/dfl_loss'], label='Val DFL Loss')
plt.title('Validation Losses')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

# Learning Rate
plt.subplot(2, 2, 4)
plt.plot(results['epoch'], results['lr/pg0'], label='Learning Rate')
plt.title('Learning Rate Schedule')
plt.xlabel('Epoch')
plt.ylabel('Learning Rate')
plt.legend()

plt.tight_layout()
plt.show()

# Print final metrics summary
print("\nFinal Model Performance:")
print(f"Precision: {results['metrics/precision(B)'].iloc[-1]:.3f}")
print(f"Recall: {results['metrics/recall(B)'].iloc[-1]:.3f}")
print(f"mAP50: {results['metrics/mAP50(B)'].iloc[-1]:.3f}")
print(f"mAP50-95: {results['metrics/mAP50-95(B)'].iloc[-1]:.3f}")

### **Strengths**

1. **Precision (0.759):**
   - The model has a relatively high precision, indicating that when it predicts a weed, it is correct about 75.9% of the time. This is beneficial in applications where false positives (incorrectly identifying non-weeds as weeds) need to be minimized, such as in automated weed control systems where unnecessary actions could be costly or damaging.

2. **mAP50 (0.736):**
   - The mean Average Precision at an IoU threshold of 50% is 73.6%, which suggests that the model is quite effective at detecting and localizing weeds with a reasonable degree of overlap between predicted and actual bounding boxes. This is a positive indicator of the model's ability to identify most weeds accurately.

### **Weaknesses**

1. **Recall (0.656):**
   - The recall is lower than precision, at 65.6%, indicating that the model misses a significant number of actual weed instances. This suggests that the model could benefit from improvements in detecting all instances of weeds, especially smaller or less distinct ones.

2. **mAP50-95 (0.363):**
   - The mean Average Precision across a range of IoU thresholds (50% to 95%) is quite low at 36.3%. This indicates that while the model can detect weeds, the precision of the bounding boxes is not as high, especially under stricter IoU conditions. This suggests that the model struggles with accurately placing bounding boxes around weeds, which could be due to variability in weed shapes, sizes, or occlusions in the dataset.
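
To make the gap between mAP50 and mAP50-95 concrete, the sketch below computes the IoU of two boxes. The helper name `iou_xyxy` and the box coordinates are hypothetical: a 100x100 prediction shifted by 10 pixels in each direction relative to the ground truth.

```python
def iou_xyxy(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


ground_truth = (100, 100, 200, 200)
prediction = (110, 110, 210, 210)  # same size, shifted by 10 px
score = iou_xyxy(ground_truth, prediction)
print(f"IoU = {score:.3f}")
print("Match at IoU 0.50:", score >= 0.50)
print("Match at IoU 0.75:", score >= 0.75)
```

The overlap is 90 x 90 = 8100 px² out of a union of 11900 px², an IoU of about 0.68. Such a prediction counts as correct for mAP50 but as a miss at the 0.75 threshold and above, which is how slightly misplaced boxes pull mAP50-95 down.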

### **Overall Performance Summary**

The model favors precision (0.759) over recall (0.656): most of its detections are correct, but it misses roughly a third of the weed instances. Together with the low mAP50-95, this leaves room for improvement in both detecting all weed instances and refining the accuracy of bounding box placements. Potential areas for improvement could include:

- **Data Augmentation:** Extend the augmentation pipeline beyond the flips, rotations, brightness, noise, blur, and scale changes already applied, to help the model generalize better to unseen data.
- **Model Architecture:** Experiment with more complex architectures or fine-tuning pre-trained models to improve detection capabilities.
- **Hyperparameter Tuning:** Adjust training parameters such as learning rate, batch size, and number of epochs to optimize model performance.
- **Additional Data:** Incorporate more training data, especially for underrepresented weed types or challenging conditions, to improve recall and bounding box precision. 

Overall, while the model shows promise, particularly in precision, further refinements could enhance its ability to detect and accurately localize all weed instances.
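
The trade-off between precision and recall can be summarized in a single number with the F1 score, their harmonic mean. The short sketch below applies it to the precision and recall reported above; the helper name `f1_score` is only illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported in the analysis above
f1 = f1_score(0.759, 0.656)
print(f"F1: {f1:.3f}")  # F1: 0.704
```

Because the harmonic mean is pulled toward the smaller value, raising recall would improve this score more than a further gain in precision.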


# Visualizing Training Results and Model Performance Metrics
We will now plot various visualizations generated during model training, including:
- Confusion matrices (normalized and raw) to evaluate classification performance
- Validation batch predictions and ground truth labels
- Training batch samples showing model's learning progress
- Performance curves (P, R, F1, PR curves) to assess model metrics
- Label statistics and correlations

These plots help us analyze the model's behavior and validate its performance on our weed detection task.

In [None]:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import os

# List of image files to plot (excluding non-image files)
image_files = [
    'confusion_matrix.png',
    'val_batch1_labels.jpg', 
    'results.png',
    'P_curve.png',
    'val_batch2_pred.jpg',
    'labels_correlogram.jpg', 
    'train_batch4161.jpg',
    'train_batch4160.jpg',
    'train_batch4162.jpg',
    'val_batch0_labels.jpg',
    'F1_curve.png',
    'train_batch0.jpg',
    'train_batch1.jpg',
    'confusion_matrix_normalized.png',
    'train_batch2.jpg',
    'val_batch1_pred.jpg',
    'val_batch0_pred.jpg',
    'R_curve.png',
    'labels.jpg',
    'val_batch2_labels.jpg',
    'PR_curve.png'
]

# Create a figure with subplots
n_images = len(image_files)
n_cols = 3
n_rows = (n_images + n_cols - 1) // n_cols

plt.figure(figsize=(20, 5*n_rows))

# Plot each image
for i, img_file in enumerate(image_files):
    img_path = os.path.join(target_dir, img_file)
    if os.path.exists(img_path):
        plt.subplot(n_rows, n_cols, i+1)
        img = mpimg.imread(img_path)
        plt.imshow(img)
        plt.title(img_file)
        plt.axis('off')
    else:
        print(f"File not found: {img_path}")

plt.tight_layout()
plt.show()
