This is the DL program now:
1) I chose ResNet as the pretrained model instead of a Vision Transformer (ViT) because ViT is pretrained on about 14 million images and needs at least thousands of images for fine-tuning to avoid overfitting.
Since I have only 1600 images, ResNet is more than enough and should be more accurate here. It is pretrained on only 1.2M images.
2) I chose a ResNet CNN instead of a plain CNN because its skip (residual) connections address the vanishing gradient problem, which is explained already.
3) Program overview: load the CSV data for train and test -> split the training data into train and validation sets -> train with L2 regularization and Dropout, validating after every epoch -> predict on the test set -> print the result
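
Point 2 can be illustrated with a toy calculation. This is only a sketch with scalar "layers" and an assumed local derivative of 0.1, not a real network: in a plain stack the chain rule multiplies the local derivatives, while a residual layer y = x + f(x) contributes a factor of 1 + f'(x), so the gradient does not shrink toward zero.

```python
# Toy scalar model: each "layer" has local derivative f'(x) = 0.1 (assumed value).
LOCAL_DERIVATIVE = 0.1
NUM_LAYERS = 20

# Plain stack y = f(f(...f(x))): the chain rule multiplies the local derivatives.
plain_grad = LOCAL_DERIVATIVE ** NUM_LAYERS

# Residual stack, y = x + f(x) in every layer: each factor becomes 1 + f'(x).
residual_grad = (1 + LOCAL_DERIVATIVE) ** NUM_LAYERS

print(f"plain:    {plain_grad:.3e}")
print(f"residual: {residual_grad:.3f}")
```

The plain gradient is practically zero after 20 layers, while the residual one stays above 1, which is why the early layers of a ResNet still receive a usable training signal.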

In [17]:
import os
import pandas as pd
import numpy as np
import torch.nn as nn
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import Dataset, DataLoader
from PIL import Image
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split


In [18]:
# -------------------- Config --------------------
BASE_PATH = "C:/Users/Sami/Desktop/CV"
IMAGE_PATH = os.path.join(BASE_PATH, "images")
TRAIN_CSV = os.path.join(BASE_PATH, "train.csv")
TEST_CSV = os.path.join(BASE_PATH, "test.csv")
SUBMISSION_CSV = "submission_ResnetCNN.csv"

BATCH_SIZE = 128 # Number of samples per gradient update
EPOCHS = 25 # Pass through the entire training set 25 times
LR = 2.5e-4 # Learning rate

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


Here I define the Dataset class that loads the leaf images and labels. I took it from the PyTorch website and modified it for this project.

In [19]:

# Define custom dataset to load images and labels
class LeafDataset(Dataset):
    def __init__(self, df, img_dir, transform=None, is_test=False):
        self.df = df.reset_index(drop=True)
        self.img_dir = img_dir
        self.transform = transform #Store the transformations to apply
        self.is_test = is_test # Flag: True for test set, False for train/val

    def __len__(self):
        return len(self.df) # Return number of images in the dataset

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        img_id = int(row["id"]) # needed for the image path
        img_path = os.path.join(self.img_dir, f"{img_id}.jpg")

        # IMPORTANT: I tried convert("L") for grayscale, but performance dropped
        # because the pretrained ResNet weights were trained on RGB images
        img = Image.open(img_path).convert("RGB")

        if self.transform:
            img = self.transform(img) # Apply transformations

        # for test: return only image
        if self.is_test:
            return img

        # for train/val: return image and label
        label = torch.tensor(row["species_encoded"], dtype=torch.long)
        return img, label  




Load the CSV data, same as in the GB part

In [20]:

#Load train/test CSVs and encode species labels into numbers
print("Device:", device)

train_df = pd.read_csv(TRAIN_CSV)
test_df = pd.read_csv(TEST_CSV)

le = LabelEncoder()
train_df["species_encoded"] = le.fit_transform(train_df["species"]) #take the species column and encode it into numbers
NUM_CLASSES = len(le.classes_)
print(f"Number of classes: {NUM_CLASSES}")


Device: cuda
Number of classes: 99
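
To show what the encoding step does, here is a small sketch with made-up species names (hypothetical, not taken from train.csv). `LabelEncoder` sorts the distinct names and numbers them from 0, and `inverse_transform` maps the numbers back to names.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical species names, for illustration only
species = ["Quercus_Rubra", "Acer_Opalus", "Acer_Opalus", "Tilia_Tomentosa"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(species)  # classes are sorted, then numbered from 0

print(list(encoder.classes_))
print(list(encoded))

# inverse_transform maps integer predictions back to species names
decoded = list(encoder.inverse_transform([0, 2]))
print(decoded)
```

Because the classes are sorted, the position of a name in `classes_` is the same as its encoded number, so `le.classes_` can later be used as column names for the model outputs.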


Split the training data into train and validation sets, as in the GB part

In [21]:

# Split training data into train and validation sets
train_idx, val_idx = train_test_split(
    range(len(train_df)), test_size=0.15, stratify=train_df["species_encoded"], random_state=42 # 15% held out for validation, class proportions preserved
)
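
The effect of `stratify` can be checked on toy labels. This is a sketch with an assumed balanced two-class label list, not the leaf data: the validation part receives the same share of every class.

```python
from collections import Counter
from sklearn.model_selection import train_test_split

# Toy labels (assumed): two classes with 20 samples each
labels = [0] * 20 + [1] * 20

toy_train_idx, toy_val_idx = train_test_split(
    range(len(labels)), test_size=0.15, stratify=labels, random_state=42
)

# Count how many samples of each class ended up in the validation part
val_counts = Counter(labels[i] for i in toy_val_idx)
print(len(toy_train_idx), len(toy_val_idx), dict(val_counts))
```

With many classes and few images per class, this matters because a purely random split could leave some species out of the validation set entirely.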


The transformations for training and validation are defined here. The training ones include augmentations that improve the model, for example RandomCrop and RandomHorizontalFlip.

In [22]:
# Define image transformations for training and validation
# Training includes augmentations; validation does not
train_tfm = transforms.Compose([
    transforms.Resize(254),            # resize the shorter side to 254 px
    transforms.RandomCrop(224),        # IMPORTANT: augmentation; 224 is chosen on purpose because ResNet is pretrained on 224x224 images
    transforms.RandomHorizontalFlip(), # IMPORTANT: augmentation
    transforms.ToTensor(),             # PIL image -> float tensor with values in [0, 1]
    transforms.Normalize((0.5,), (0.5,)), # preprocessing: rescale values to [-1, 1], which helps training
])

val_tfm = transforms.Compose([
    transforms.Resize(254),
    transforms.CenterCrop(224), # IMPORTANT: no augmentation
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,)),
])
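
The arithmetic behind `Normalize((0.5,), (0.5,))` is (x - mean) / std per pixel. The sketch below reproduces it in plain Python (the `normalize` helper is hypothetical, written only for this illustration). It also lists the commonly used ImageNet channel statistics as a possible alternative; this notebook does not use them.

```python
def normalize(pixel: float, mean: float = 0.5, std: float = 0.5) -> float:
    """Apply the per-pixel formula used by Normalize: (x - mean) / std."""
    return (pixel - mean) / std

# ToTensor yields values in [0, 1]; check the two extremes and the midpoint
black, gray, white = normalize(0.0), normalize(0.5), normalize(1.0)
print(black, gray, white)

# Commonly used ImageNet channel statistics (an alternative, not what this notebook uses)
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)
red_white = normalize(1.0, IMAGENET_MEAN[0], IMAGENET_STD[0])
print(round(red_white, 3))
```

With mean 0.5 and std 0.5 the range [0, 1] maps to [-1, 1]. Switching to the ImageNet statistics might match the pretrained weights more closely, but that is an untested assumption here and would change the reported results.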


Create PyTorch datasets and dataloaders for all sets

In [23]:
# Create PyTorch datasets and dataloaders for all sets
train_ds = LeafDataset(train_df.iloc[train_idx], IMAGE_PATH, train_tfm, is_test=False)
val_ds = LeafDataset(train_df.iloc[val_idx], IMAGE_PATH, val_tfm, is_test=False)
test_ds = LeafDataset(test_df, IMAGE_PATH, val_tfm, is_test=True)

train_loader = DataLoader(train_ds, batch_size=BATCH_SIZE, shuffle=True) # DataLoader groups the data into batches; each batch is moved to the CPU/GPU later in the training loop
val_loader = DataLoader(val_ds, batch_size=BATCH_SIZE, shuffle=False)
test_loader = DataLoader(test_ds, batch_size=BATCH_SIZE, shuffle=False)


I load the pretrained model, add Dropout, and replace the last layer so that it outputs the leaf classes

In [24]:
# Load a pretrained ResNet-18 model and adapt it for our classification task
model = torchvision.models.resnet18(weights="IMAGENET1K_V1") # options: ResNet-18, 50, 101
     # IMPORTANT: ResNet-18 and ResNet-50 are both pretrained on 1.2M images and have about 11M and 25M parameters (weights and biases). I used ResNet-18 because it is faster and overfits less

# Replace the final layer: Dropout followed by a new last layer for our classes
model.fc = nn.Sequential(
    nn.Dropout(p=0.2), # IMPORTANT: 20% of the inputs to the last layer are randomly zeroed during training to reduce overfitting
    nn.Linear(model.fc.in_features, NUM_CLASSES)
)

model = model.to(device) # Move the model to the GPU if one is available, otherwise keep it on the CPU
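
What Dropout with p=0.2 does can be sketched in plain Python. The `dropout` function below is a hypothetical helper written for this note, not the PyTorch implementation: during training each value is zeroed with probability p and the survivors are scaled by 1/(1-p), so the expected value stays the same; in evaluation mode nothing changes.

```python
import random

def dropout(values, p=0.2, training=True, rng=None):
    """Inverted dropout: zero each value with probability p, scale survivors by 1/(1-p)."""
    if not training:
        return list(values)  # evaluation mode: identity
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]

features = [1.0] * 10_000
dropped = dropout(features, p=0.2, rng=random.Random(0))

zero_fraction = dropped.count(0.0) / len(dropped)
mean_value = sum(dropped) / len(dropped)
print(f"zeroed: {zero_fraction:.3f}, mean after dropout: {mean_value:.3f}")
```

This is also why `model.train()` and `model.eval()` matter: they switch the Dropout layer between these two behaviours.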


Here the loss function is defined, and L2-style regularization (weight decay) is applied through the optimizer

In [25]:
# Define loss function and optimizer
optimizer = torch.optim.AdamW(model.parameters(), lr=LR, weight_decay=1e-4) # IMPORTANT: weight decay (L2-style regularization) reduces overfitting, like Dropout. AdamW is a variant of gradient descent
loss_func = torch.nn.CrossEntropyLoss()                                     # Weight decay discourages large weights by shrinking them a little at every step -> discourages memorizing/overfitting
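
A quick sanity check for the loss values: for one sample, cross-entropy is the negative log of the probability given to the true class. The sketch below (plain Python, with a hypothetical `cross_entropy` helper) computes the loss of a classifier that gives every one of the 99 classes the same probability, and of a confident correct prediction.

```python
import math

NUM_CLASSES = 99  # taken from the notebook output above

def cross_entropy(probs, target):
    """Cross-entropy for one sample: negative log of the probability of the true class."""
    return -math.log(probs[target])

# An untrained classifier that assigns equal probability to every class
uniform = [1.0 / NUM_CLASSES] * NUM_CLASSES
baseline_loss = cross_entropy(uniform, target=0)

# A confident, correct prediction gives a loss close to zero
confident = [0.001 / (NUM_CLASSES - 1)] * NUM_CLASSES
confident[0] = 0.999
confident_loss = cross_entropy(confident, target=0)

print(f"baseline: {baseline_loss:.3f}, confident: {confident_loss:.4f}")
```

The baseline is ln(99), about 4.6, so a training loss near that value in the first epoch is what one would expect from a freshly replaced last layer.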

This is the training and validation phase. After every epoch the validation accuracy is checked, and the model with the best accuracy so far is saved and used later

In [26]:
# The training loop with validation and best model saving
# IMPORTANT DEF: The gradient tells how much and in which direction to change each weight.
best_val_acc = 0.0
for epoch in range(EPOCHS):
    # --- Training Phase ---
    model.train()  # Set model to training mode (enables dropout, etc.)
    running_loss = 0.0 # Reset loss accumulator

    for X_mb, y_mb in train_loader: # Loop through batches of training data
        X_mb, y_mb = X_mb.to(device), y_mb.to(device) # Move batch to the selected device (GPU if available)

        optimizer.zero_grad()  # Reset gradients from previous batch
        y_hat = model(X_mb)    # IMPORTANT: forward pass, get predictions
        loss = loss_func(y_hat, y_mb) # IMPORTANT: calculation of the loss
        loss.backward()  # IMPORTANT: backward pass, calculate the gradient for every weight in the model
        optimizer.step() # IMPORTANT: gradient descent step, update weights using the calculated gradients. The basic formula is new_weight = old_weight - (learning_rate × gradient); AdamW additionally adapts the step size per weight

        running_loss += loss.item() # Accumulate loss

    # --- Validation Phase ---
    model.eval() # Set model to evaluation mode (disables dropout)
    correct, total = 0, 0

    with torch.no_grad(): # No gradients are needed for validation, which saves memory
        for X_mb, y_mb in val_loader: # Loop through validation batches
            X_mb, y_mb = X_mb.to(device), y_mb.to(device)
            y_hat = model(X_mb) # Get predictions
            _, predicted = torch.max(y_hat, 1) # Get class with highest score
            total += y_mb.size(0)  # Count total validated samples
            correct += (predicted == y_mb).sum().item() # Count a prediction as correct only if predicted == label

    # --- Print Epoch Results and Save Best Model ---
    avg_loss = running_loss / len(train_loader)
    val_acc = 100 * correct / total
    
    print(f"Epoch [{epoch+1}/{EPOCHS}], Loss: {avg_loss:.4f}, Val Acc: {val_acc:.2f}%")

    if val_acc > best_val_acc:
        best_val_acc = val_acc
        torch.save(model.state_dict(), "best_model.pth") # Save model weights
        print(f" -> New best model saved (Val Acc: {best_val_acc:.2f}%)")

Epoch [1/25], Loss: 4.2752, Val Acc: 15.44%
 -> New best model saved (Val Acc: 15.44%)
Epoch [2/25], Loss: 3.0102, Val Acc: 40.94%
 -> New best model saved (Val Acc: 40.94%)
Epoch [3/25], Loss: 2.2719, Val Acc: 48.32%
 -> New best model saved (Val Acc: 48.32%)
Epoch [4/25], Loss: 1.7825, Val Acc: 61.74%
 -> New best model saved (Val Acc: 61.74%)
Epoch [5/25], Loss: 1.4837, Val Acc: 65.10%
 -> New best model saved (Val Acc: 65.10%)
Epoch [6/25], Loss: 1.1789, Val Acc: 70.47%
 -> New best model saved (Val Acc: 70.47%)
Epoch [7/25], Loss: 1.0217, Val Acc: 76.51%
 -> New best model saved (Val Acc: 76.51%)
Epoch [8/25], Loss: 0.8733, Val Acc: 76.51%
Epoch [9/25], Loss: 0.7220, Val Acc: 84.56%
 -> New best model saved (Val Acc: 84.56%)
Epoch [10/25], Loss: 0.6602, Val Acc: 83.22%
Epoch [11/25], Loss: 0.6159, Val Acc: 86.58%
 -> New best model saved (Val Acc: 86.58%)
Epoch [12/25], Loss: 0.5234, Val Acc: 87.25%
 -> New best model saved (Val Acc: 87.25%)
Epoch [13/25], Loss: 0.4881, Val Acc: 8
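
The update formula from the training loop comment, new_weight = old_weight - (learning_rate × gradient), can be tried on a toy problem. This is a sketch of plain gradient descent on an assumed loss f(w) = (w - 3)^2 with an assumed learning rate. It is not AdamW, which additionally rescales every step.

```python
def gradient(w: float) -> float:
    """Derivative of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

learning_rate = 0.1  # assumed value for the toy problem
w = 0.0              # starting weight

history = [w]
for _ in range(50):
    # new_weight = old_weight - learning_rate * gradient
    w = w - learning_rate * gradient(w)
    history.append(w)

print(f"after 1 step: {history[1]:.2f}, after 50 steps: {history[-1]:.4f}")
```

Each step moves the weight against the gradient, so it approaches the minimum at w = 3, in the same way the training loss in the log keeps decreasing from epoch to epoch.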

The best validation accuracy is 92.6%

In [27]:
# Reload the weights that reached the highest validation accuracy
model.load_state_dict(torch.load("best_model.pth", map_location=device)) # Load the saved best model
print(f"\nBest validation accuracy: {best_val_acc:.2f}%")



Best validation accuracy: 92.62%


Testing starts here

In [28]:
# Generate predictions on the test set using the best saved model
model.eval()
probs_list = []  # List to store predicted probabilities
with torch.no_grad():
    for imgs in test_loader:
        imgs = imgs.to(device)
        logits = model(imgs)   # Get output scores
        probs = torch.softmax(logits, dim=1) # Convert to probabilities (each row sums to 1)
        probs_list.append(probs.cpu().numpy()) # Move to CPU and store as numpy array

preds = np.vstack(probs_list) # Stack all batches into one big array
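
The softmax step can be sketched in plain Python. The `softmax` helper and the scores below are made up for illustration: the scores are exponentiated and divided by their sum, so every value lies between 0 and 1 and the values add up to 1.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # made-up scores for three classes
probs = softmax(logits)
predicted_class = probs.index(max(probs))

print([round(p, 3) for p in probs], "sum =", round(sum(probs), 6))
```

Softmax keeps the order of the scores, so the class with the highest score is also the class with the highest probability.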

The submission file contains, for every test image, the probability of each label.
See the example of images 16 and 19


In [29]:
# Create and save submission CSV file
sub_df = pd.DataFrame(preds, columns=le.classes_)
sub_df.insert(0, "id", test_df["id"].astype(int))
sub_df.to_csv(SUBMISSION_CSV, index=False, float_format="%.3f")
print(f"Saved: {SUBMISSION_CSV}")


Saved: submission_ResnetCNN.csv
