# 🌍 Soil vs Non-Soil Image Classification with ResNet18

## 📌 Problem Statement
The goal is to classify images as **soil** (label = 1) or **non-soil** (label = 0) for the Soil Classification Part 2 (2025) Kaggle competition. Soil images come from a custom dataset, while non-soil images are sourced from CIFAR-10 and MNIST.

## 📊 Data Overview
- **Soil Images**: From `train_labels.csv` (label = 1).
- **Non-Soil Images**:
  - 120 images per class from CIFAR-10 (`test` split, label = 0).
  - 80 images per class from MNIST PNG (`testing` split, label = 0).
- **Total**: 3,222 images, split 80% train (2,577), 20% validation (645) using stratified `train_test_split`.
- **Preprocessing**: All images are resized to 224×224 pixels and normalized. Training images are additionally augmented with random flips, rotations, and color jitter.

## ⚙️ Method
- **Model**: Pretrained ResNet18 (ImageNet weights) whose final fully connected layer is replaced with an `nn.Linear` head for binary classification.
- **Training**:
  - Loss: `BCEWithLogitsLoss`
  - Optimizer: Adam (learning rate = 1e-4)
  - Scheduler: `ReduceLROnPlateau` (based on validation accuracy)
  - Batch size: 32
  - Epochs: 10
- **Hardware**: NVIDIA Tesla T4 GPU (Kaggle environment).

## ✅ Results
```json
{
  "validation_accuracy": 1.0,
  "f1_score": 1.0,
  "leaderboard_rank": 1
}
```
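
For reference, accuracy and F1 for a binary classifier can be computed from raw logits as in the sketch below. It assumes a 0.5 probability threshold and treats soil (label 1) as the positive class. The function name `binary_metrics` and the example logits are hypothetical, and this is not necessarily how the notebook or the competition computes its scores.

```python
import math


def binary_metrics(logits, labels, threshold=0.5):
    """Accuracy and F1 from raw logits, with label 1 as the positive class."""
    # sigmoid(z) >= threshold is equivalent to z >= log(threshold / (1 - threshold)).
    cutoff = math.log(threshold / (1.0 - threshold))
    preds = [1 if z >= cutoff else 0 for z in logits]

    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    correct = sum(1 for p, y in zip(preds, labels) if p == y)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"validation_accuracy": correct / len(labels), "f1_score": f1}


# Every prediction correct: both metrics are 1.0.
perfect = binary_metrics([2.0, -1.5, 0.3, -0.2], [1, 0, 1, 0])
# One false positive and one false negative: both metrics are 0.5.
mixed = binary_metrics([2.0, -1.5, -0.3, 0.2], [1, 0, 1, 0])
print(perfect, mixed)
```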

## 🥇 Achievements
- **Perfect F1 Score**: Achieved 1.0 on the test set.
- **#1 Leaderboard Rank**: Topped the Kaggle competition leaderboard.
- **Robust Model**: Consistently achieved 100% validation accuracy across all 10 epochs.

## 💾 Model & Files
- **Model**: Fine-tuned ResNet18, saved for inference.
- **Notebooks**:
  - `/notebooks/soil-classification-2.ipynb`: Data loading, model training, and validation, followed by generation of `submission.csv` with the test predictions.
- **Metrics**: `/docs/cards/ml-metrics.json` details performance metrics.
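
Writing the test predictions to `submission.csv` might look like the sketch below. The column names `image_id` and `label`, the helper name `write_submission`, and the example file names are hypothetical, since the competition's required format is not stated here. Logits at or above 0 (a sigmoid probability of at least 0.5) are mapped to soil (1).

```python
import csv
import os
import tempfile


def write_submission(image_ids, logits, out_path="submission.csv"):
    """Write one row per test image; column names are assumed, not confirmed."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image_id", "label"])
        for image_id, logit in zip(image_ids, logits):
            # logit >= 0 corresponds to a predicted soil probability >= 0.5
            writer.writerow([image_id, int(logit >= 0.0)])


# Example with made-up IDs and logits, written to a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "submission.csv")
write_submission(["img_001.jpg", "img_002.jpg"], [3.1, -2.0], demo_path)
```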