## ProjF5 - Final Model

Use this document as a template to provide the evaluation of your final model. You are welcome to go in as much depth as needed.

Make sure you keep the sections specified in this template, but you are welcome to add more cells with your code or explanation as needed.

**Import Libraries**

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import imageio.v3 as imageio
import albumentations as A

from albumentations.pytorch import ToTensorV2
from torch.utils.data import Dataset, DataLoader
from torch import nn
from tqdm.notebook import tqdm
from sklearn.preprocessing import StandardScaler

import torch
import timm
import glob
import torchmetrics
import time
import psutil
import os

tqdm.pandas()

In [2]:
class Config():
    sub = "/kaggle/input/planttraits2024/sample_submission.csv"
    trgts = "/kaggle/input/planttraits2024/target_name_meta.tsv"
    train_path = "/kaggle/input/planttraits2024/train.csv"
    test_path = "/kaggle/input/planttraits2024/test.csv"
    train_image_path = "/kaggle/input/planttraits2024/train_images/"
    test_image_path = "/kaggle/input/planttraits2024/test_images/"
    IMAGE_SIZE = 384
    BACKBONE = 'swin_large_patch4_window12_384.ms_in22k_ft_in1k'
    TARGET_COLUMNS = ['X4_mean', 'X11_mean', 'X18_mean', 'X50_mean', 'X26_mean', 'X3112_mean']
    TARGET_COLS = ['X4_mean', 'X11_mean', 'X18_mean', 'X50_mean', 'X26_mean', 'X3112_mean']
    N_TARGETS = len(TARGET_COLUMNS)
    BATCH_SIZE = 10
    LR_MAX = 1e-4
    WEIGHT_DECAY = 0.01
    N_EPOCHS = 4
    TRAIN_MODEL = True
    IS_INTERACTIVE = os.environ['KAGGLE_KERNEL_RUN_TYPE'] == 'Interactive'
        
CONFIG = Config()

### 1. Load and Prepare Data

This should illustrate your code for loading the dataset and the split into training, validation and testing. You can add steps like pre-processing if needed.

In [3]:
import os
import random 

train = pd.read_csv(CONFIG.train_path)
#train

image_path = '/kaggle/input/planttraits2024/train_images'

image_files = [f for f in os.listdir(image_path) if f.endswith('.jpeg')]

random_images = random.sample(image_files, 5)

train["image_path"] = CONFIG.train_image_path + train['id'].astype(str) + '.jpeg'

train= train[['id', 'image_path', 'X4_mean', 'X11_mean', 'X18_mean', 'X26_mean', 'X50_mean', 'X3112_mean']].copy()
# Drop duplicates and nans
train = train.drop_duplicates().dropna()
train

Unnamed: 0,id,image_path,X4_mean,X11_mean,X18_mean,X26_mean,X50_mean,X3112_mean
0,192027691,/kaggle/input/planttraits2024/train_images/192...,0.401753,11.758108,0.117484,1.243779,1.849375,50.216034
1,195542235,/kaggle/input/planttraits2024/train_images/195...,0.480334,15.748846,0.389315,0.642940,1.353468,574.098472
2,196639184,/kaggle/input/planttraits2024/train_images/196...,0.796917,5.291251,8.552908,0.395241,2.343153,1130.096731
3,195728812,/kaggle/input/planttraits2024/train_images/195...,0.525236,9.568305,1.083629,0.154200,1.155308,1042.686546
4,195251545,/kaggle/input/planttraits2024/train_images/195...,0.411821,14.528877,0.657585,10.919966,2.246226,2386.467180
...,...,...,...,...,...,...,...,...
55484,190558785,/kaggle/input/planttraits2024/train_images/190...,0.337243,11.572778,0.233690,1.783193,1.608341,969.547831
55485,194523231,/kaggle/input/planttraits2024/train_images/194...,0.424371,6.114448,1.017099,12.713048,2.418300,1630.015480
55486,195888987,/kaggle/input/planttraits2024/train_images/195...,0.639659,5.549596,2.717395,10.206478,2.722599,602.229880
55487,135487319,/kaggle/input/planttraits2024/train_images/135...,0.774642,7.024218,4.429659,9.372170,3.251739,244.387170


In [4]:
train["image_path"] =CONFIG.train_image_path + train['id'].astype(str) + '.jpeg'

train= train[['id', 'image_path', 'X4_mean', 'X11_mean', 'X18_mean', 'X26_mean', 'X50_mean', 'X3112_mean']].copy()
# Drop duplicates and nans
train = train.drop_duplicates().dropna()
train

Unnamed: 0,id,image_path,X4_mean,X11_mean,X18_mean,X26_mean,X50_mean,X3112_mean
0,192027691,/kaggle/input/planttraits2024/train_images/192...,0.401753,11.758108,0.117484,1.243779,1.849375,50.216034
1,195542235,/kaggle/input/planttraits2024/train_images/195...,0.480334,15.748846,0.389315,0.642940,1.353468,574.098472
2,196639184,/kaggle/input/planttraits2024/train_images/196...,0.796917,5.291251,8.552908,0.395241,2.343153,1130.096731
3,195728812,/kaggle/input/planttraits2024/train_images/195...,0.525236,9.568305,1.083629,0.154200,1.155308,1042.686546
4,195251545,/kaggle/input/planttraits2024/train_images/195...,0.411821,14.528877,0.657585,10.919966,2.246226,2386.467180
...,...,...,...,...,...,...,...,...
55484,190558785,/kaggle/input/planttraits2024/train_images/190...,0.337243,11.572778,0.233690,1.783193,1.608341,969.547831
55485,194523231,/kaggle/input/planttraits2024/train_images/194...,0.424371,6.114448,1.017099,12.713048,2.418300,1630.015480
55486,195888987,/kaggle/input/planttraits2024/train_images/195...,0.639659,5.549596,2.717395,10.206478,2.722599,602.229880
55487,135487319,/kaggle/input/planttraits2024/train_images/135...,0.774642,7.024218,4.429659,9.372170,3.251739,244.387170


### 2. Prepare your Final Model

Here you can have your code to either train (e.g., if you are building it from scratch) your model. These steps may require you to use other packages or python files. You can just call them here. You don't have to include them in your submission. Remember that we will be looking at the saved outputs in the notebooked and we will not run the entire notebook.

In [5]:
train[CONFIG.TARGET_COLS] = np.log1p(train[CONFIG.TARGET_COLS])
train = train.dropna()

split_index = int(0.7 * len(train))
# Split the DataFrame into train and validation sets
train_data = train.iloc[:split_index].reset_index(drop=True)
val_data = train.iloc[split_index:].reset_index(drop=True)
train_data.shape, val_data.shape

  result = func(self.values, **kwargs)


((38819, 8), (16637, 8))

In [6]:
paths = train_data.image_path.tolist()

class CustomDataset(Dataset):
    def __init__(self, paths, labels, transform=None):
        self.paths = paths
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        image = Image.open(self.paths[idx]).convert('RGB')
        
        # Use normalized labels
        label = torch.tensor(self.labels[idx], dtype=torch.float32)

        if self.transform:
            image = self.transform(image)

        return image, label

    
from torchvision import transforms
# Define any image transformations you want to apply, here we also add augmentation. 
transform = transforms.Compose([
    transforms.Resize((384, 384)),
    transforms.RandomResizedCrop(384),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

In [7]:

train_paths  = train_data.image_path.tolist()
train_labels = train_data[CONFIG.TARGET_COLS].values

val_paths  = val_data.image_path.tolist()
val_labels = val_data[CONFIG.TARGET_COLS].values


#torch dataset
batch_size = CONFIG.BATCH_SIZE

# Create the datasets

dataset_train = CustomDataset(train_paths, train_labels, transform=transform)
train_dataloader = DataLoader(dataset_train, batch_size=batch_size, shuffle=True)


dataset_val = CustomDataset(val_paths, val_labels, transform=transform)
val_dataloader = DataLoader(dataset_val, batch_size=batch_size, shuffle=True)


# Define your dataset size and other configuration parameters
dataset_size = len(dataset_train)  # Assuming you have defined 'dataset' earlier
total_epochs = 5  # Total number of epochs

# Calculate total train steps
total_train_steps = dataset_size * batch_size * total_epochs

# Define warmup steps as 10% of total train steps
warmup_steps = int(total_train_steps * 0.10)

# Define decay steps as the remaining steps after warmup
decay_steps = total_train_steps - warmup_steps

print(f"Total Train Steps: {total_train_steps}")
print(f"Warmup Steps: {warmup_steps}")
print(f"Decay Steps: {decay_steps}")


Total Train Steps: 1940950
Warmup Steps: 194095
Decay Steps: 1746855


In [8]:
#!pip install pytorch_xla

In [9]:
# import torch_xla
# import torch_xla.core.xla_model as xm

In [10]:
class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = timm.create_model(
                CONFIG.BACKBONE,
                num_classes=CONFIG.N_TARGETS,
                pretrained=True)
        
    def forward(self, inputs):
        return self.backbone(inputs)

model = Model()
model = model.to('cuda')

model.safetensors:   0%|          | 0.00/801M [00:00<?, ?B/s]

In [11]:
class R2Loss(nn.Module):#causes nans
    def __init__(self):
        super(R2Loss, self).__init__()

    def forward(self, y_pred, y_true):

        SS_res = torch.sum((y_true - y_pred)**2)
        SS_tot = torch.sum((y_true - torch.mean(y_true))**2)

        epsilon = 1e-6  # Small epsilon to avoid division by zero
        r2 = 1 - (SS_res / (SS_tot + epsilon))
        mean_r2 = torch.mean(r2)

        return mean_r2

In [12]:

from PIL import Image


model = Model().to('cuda') # num_targets=len(CONFIG.TARGET_COLS)).to('cuda')

torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1)#to aviod nan losses

criterion = nn.MSELoss()  
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=5)

total_epochs = 3 # Config.EPOCHS
r2_loss =R2Loss()
train_losses = []  # To store training losses
val_losses = []    # To store validation losses
for epoch in range(total_epochs):
    print(f"Epoch {epoch + 1}/{total_epochs}")
    
    #R2.reset()
    model.train()  # Set the model to training mode
    
    #train_r2_value = 0
    # Use tqdm to add a progress bar
    train_r2_value= 0
    for images, targets in tqdm(train_dataloader, desc=f"Epoch {epoch + 1}/{total_epochs}"):
        optimizer.zero_grad()
        
        # Move data to GPU
        images = images.to('cuda')
        targets = targets.to('cuda')
        
        outputs = model(images)
        #print(outputs)
        
        loss = criterion(outputs, targets)#.item()
        train_r2_value = r2_loss(outputs, targets)
        
        #print(loss)
        loss.backward()
        
        #print(f"Epoch [{epoch+1}/{total_epochs}] - Loss: {loss:.4f} R2: {train_r2_value:.4f}")
        #print(loss)
        optimizer.step()
    
    # Update learning rate using the scheduler
    scheduler.step()
    r2_value = 0
    
    # Validation loop
    model.eval()  # Set the model to evaluation mode
    val_loss = 0.0
    with torch.no_grad():
        for images, targets in tqdm(val_dataloader, desc=f"Val Epoch {epoch + 1}/{total_epochs}"):
            # Assuming img_path is the path to your image file
            #img = Image.open(img_path)

            # Resize the image to the desired dimensions (e.g., 384x384)
            images = images.to('cuda')
            targets = targets.to('cuda')
            
            outputs = model(images)
            val_loss += criterion(outputs, targets)#.item()
            
            r2_value = r2_loss(outputs, targets)
            
            #print(val_loss)
    
    val_loss /= len(val_dataloader)
    #print(val_loss)
    # Append loss to lists
    train_losses.append(loss.item())
    val_losses.append(val_loss)
    print(f"Epoch [{epoch+1}/{total_epochs}] - Loss: {loss:.4f} R2: {train_r2_value:.4f}- Val Loss: {val_loss:.4f} - VAL R2: {r2_value:.4f}-")

Epoch 1/3


Epoch 1/3:   0%|          | 0/3882 [00:00<?, ?it/s]

Val Epoch 1/3:   0%|          | 0/1664 [00:00<?, ?it/s]

Epoch [1/3] - Loss: 0.3559 R2: 0.9239- Val Loss: 0.6171 - VAL R2: 0.9019-
Epoch 2/3


Epoch 2/3:   0%|          | 0/3882 [00:00<?, ?it/s]

Val Epoch 2/3:   0%|          | 0/1664 [00:00<?, ?it/s]

Epoch [2/3] - Loss: 0.6468 R2: 0.8908- Val Loss: 0.5740 - VAL R2: 0.7481-
Epoch 3/3


Epoch 3/3:   0%|          | 0/3882 [00:00<?, ?it/s]

Val Epoch 3/3:   0%|          | 0/1664 [00:00<?, ?it/s]

Epoch [3/3] - Loss: 0.5767 R2: 0.9172- Val Loss: 0.5602 - VAL R2: 0.9233-


In [14]:

torch.save(model, 'model.pth')

### 3. Model Performance

Make sure to include the following:
- Performance on the training set
- Performance on the test set
- Provide some screenshots of your output (e.g., pictures, text output, or a histogram of predicted values in the case of tabular data). Any visualization of the predictions are welcome.

### Performance / Metrics for train data 

In [None]:
print("The R2 Score for training data is 0.9122" )
print("The Mean square error  0.7462") 
print("The mean absolute error is mae: 0.5852" )
## Values taken from Model training cell

### Performance/ Metrics on Test data

### Since we created a new version of the notebook and could not revert back to it on kaggle - we could not do the predictions in the same notebook. The notebook is there in the zipped source-code. We have attched the screenshots in this notebook for your convinience

<img src="Test_Performance_Screenshot.png">

<img src="Test_Performance_Screenshot2.png">