# 🌱 Soil Image Classification Inference - Annam.ai, IIT Ropar 🌱

This notebook performs inference for the Soil Image Classification Challenge using a pre-trained ResNet18 model in PyTorch. It loads a trained model and generates predictions for the test set. Let's get started! 🚀

## 📚 Step 1: Import Libraries
We import the necessary libraries for data handling and model inference.

In [None]:
import os
import pandas as pd
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset
from torchvision import models, transforms
from PIL import Image

## 🛠️ Step 2: Set Up Paths
Define the paths to the test dataset directory and CSV file, and verify their existence.

In [None]:
base_dir = "/content/soil_competition-2025-2"
test_dir = os.path.join(base_dir, "test")
test_csv = os.path.join(base_dir, "test_ids.csv")

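# Verify that both the CSV and the image folder are present
print("Checking dataset files... 📂")
print(f"Test CSV exists: {os.path.exists(test_csv)}")
print(f"Test image directory exists: {os.path.isdir(test_dir)}")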
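
If you would rather stop early than discover a bad path several cells later, the existence check can be wrapped in a small helper. The sketch below is only an illustration: `missing_paths` is a hypothetical name, and the demo runs against a temporary folder rather than the real dataset.

```python
import os
import tempfile


def missing_paths(paths):
    """Return the subset of `paths` that does not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


# Demonstration on a throwaway directory (not the competition data).
with tempfile.TemporaryDirectory() as tmp:
    demo_csv = os.path.join(tmp, "test_ids.csv")
    demo_dir = os.path.join(tmp, "test")
    open(demo_csv, "w").close()  # create an empty CSV; the image folder is left missing
    missing = missing_paths([demo_csv, demo_dir])
    missing_names = [os.path.basename(p) for p in missing]

print(missing_names)
```

In the notebook itself you could call `missing_paths([test_csv, test_dir])` and raise a `FileNotFoundError` if the returned list is not empty.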

## 🖼️ Step 3: Define Data Preprocessing
We define image transformations for testing (without augmentation) to prepare images for ResNet18.

In [None]:
# Test transforms without augmentation
test_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

## 📊 Step 4: Create Test Dataset Class
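We define a custom `Dataset` that reads image file names from the CSV's `image_id` column and returns each transformed image together with its ID.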

In [None]:
class TestDataset(Dataset):
    def __init__(self, csv_file, img_dir, transform=None):
        self.test_data = pd.read_csv(csv_file)
        self.img_dir = img_dir
        self.transform = transform

    def __len__(self):
        return len(self.test_data)

    def __getitem__(self, idx):
        # The CSV's image_id column holds the image file name inside img_dir
        image_id = self.test_data.iloc[idx]['image_id']
        image = Image.open(os.path.join(self.img_dir, image_id)).convert('RGB')
        if self.transform:
            image = self.transform(image)
        return image, image_id
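
Because `__getitem__` opens files lazily, a missing image only shows up when the loader reaches it. One way to catch this up front is to compare the CSV against the folder before running inference. This is a minimal sketch: `find_missing_images` is a hypothetical helper, and the demo uses a temporary folder with made-up file names.

```python
import os
import tempfile

import pandas as pd


def find_missing_images(csv_file, img_dir, id_column="image_id"):
    """List the file names in the CSV that have no matching file in img_dir."""
    ids = pd.read_csv(csv_file)[id_column]
    return [name for name in ids if not os.path.isfile(os.path.join(img_dir, name))]


# Demonstration with a throwaway folder: a.jpg exists, b.jpg does not.
with tempfile.TemporaryDirectory() as tmp:
    img_dir = os.path.join(tmp, "test")
    os.makedirs(img_dir)
    open(os.path.join(img_dir, "a.jpg"), "wb").close()
    csv_path = os.path.join(tmp, "test_ids.csv")
    pd.DataFrame({"image_id": ["a.jpg", "b.jpg"]}).to_csv(csv_path, index=False)
    missing = find_missing_images(csv_path, img_dir)

print(missing)
```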

## 🚚 Step 5: Load Test Data
Create a data loader for the test dataset.

In [None]:
test_dataset = TestDataset(test_csv, test_dir, test_transforms)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

print(f"Test samples: {len(test_dataset)} 🧪")

## 🧠 Step 6: Define and Load the Model
Rebuild the ResNet18 architecture with a two-class output layer, then load the saved weights from `soil_classifier.pth`.

In [None]:
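# Build the architecture only: the trained weights come from the checkpoint,
# so downloading ImageNet weights here would be redundant.
model = models.resnet18(weights=None)
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 2)  # 2 classes: soil (1), not soil (0)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# map_location lets a checkpoint saved on a GPU load on a CPU-only machine
model.load_state_dict(torch.load('soil_classifier.pth', map_location=device))
model = model.to(device)
model.eval()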
print(f"Model loaded on: {device} ⚙️")

## 🔍 Step 7: Generate Test Predictions
Make predictions on the test set and collect image IDs.

In [None]:
predictions = []
image_ids = []
with torch.no_grad():
    for images, img_ids in test_loader:
        images = images.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        predictions.extend(predicted.cpu().tolist())  # plain Python ints
        image_ids.extend(img_ids)

# Remove a trailing .jpg from image_ids (only at the end of the name)
image_ids = [img_id[:-len('.jpg')] if img_id.endswith('.jpg') else img_id for img_id in image_ids]
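
`torch.max(outputs, 1)` returns the largest logit and its index for each row; the index is the predicted class. If a confidence score is also useful, the logits can be passed through a softmax first. The example below uses made-up logits rather than real model outputs.

```python
import torch

# Hypothetical logits for two images and two classes (not soil, soil).
logits = torch.tensor([[2.0, 0.5],
                       [0.1, 1.3]])

# Softmax turns each row into probabilities that sum to 1.
probs = torch.softmax(logits, dim=1)

# The max probability is the confidence; its index is the predicted class.
confidence, predicted = probs.max(dim=1)

print(predicted.tolist())
```

Softmax preserves the ordering within each row, so the predicted class is the same as taking the argmax of the raw logits.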

## 💾 Step 8: Create Submission File
Save the predictions to a CSV file for Kaggle submission.

In [None]:
submission = pd.DataFrame({
    "image_id": image_ids,
    "soil_label": predictions
})
submission.to_csv("submission.csv", index=False)
print("\nSubmission file created: ✅")
print(submission.head())
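
Before uploading, it can help to run a few basic checks on the table. The sketch below is one possible set of checks, not a description of the competition's validation rules: `check_submission` is a hypothetical helper, and the allowed labels (0 and 1) are taken from the model's two classes.

```python
import pandas as pd


def check_submission(df, expected_count, allowed_labels=(0, 1)):
    """Return a list of human-readable problems; an empty list means no issue found."""
    if list(df.columns) != ["image_id", "soil_label"]:
        return ["unexpected columns: " + ", ".join(map(str, df.columns))]
    problems = []
    if len(df) != expected_count:
        problems.append(f"expected {expected_count} rows, found {len(df)}")
    if df["image_id"].duplicated().any():
        problems.append("duplicate image_id values")
    if not df["soil_label"].isin(allowed_labels).all():
        problems.append("labels outside the allowed set")
    return problems


# Toy tables for illustration only.
good = pd.DataFrame({"image_id": ["a", "b"], "soil_label": [1, 0]})
bad = pd.DataFrame({"image_id": ["a", "a"], "soil_label": [1, 2]})

good_problems = check_submission(good, expected_count=2)
bad_problems = check_submission(bad, expected_count=2)

print(good_problems)
print(bad_problems)
```

In the notebook, `check_submission(submission, len(test_dataset))` would compare the output against the number of test samples.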

## 📝 Notes
- The model is a ResNet18 with a two-class output layer; its trained weights are loaded from `soil_classifier.pth`, which must be present in the working directory. 🧠
- Ensure the dataset paths are correct for your environment (e.g., `/content/soil_competition-2025-2`). 📂
- The submission file (`submission.csv`) is saved in the working directory for Kaggle submission. 🚀