# Soil Type Classification - Project Card

## Project Overview
This project is part of the **Kaggle Soil Classification Challenge (2025)**. The goal is to classify images of soil into one of the following four categories:

- **Alluvial soil**
- **Black Soil**
- **Clay soil**
- **Red soil**

We fine-tuned a ResNet34 model, pretrained on ImageNet, on the labeled soil images and used it to predict the soil type of each test image.

---

## Dataset
The dataset was provided by the competition organizers and contains:
- `train_labels.csv`: Contains image IDs and their corresponding soil types.
- `train/`: Folder containing training images.
- `test_ids.csv`: Contains image IDs for testing.
- `test/`: Folder containing test images (some in `.gif` or `.webp` format).

Preprocessing steps:
- Resize images to 224x224 and normalize them.
- Apply augmentations (rotation, horizontal flipping, color jitter) to the training images.
- Convert unsupported formats (`.gif`, `.webp`) to `.jpg`.
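
The format-conversion step could look roughly like the sketch below. It assumes Pillow is installed; the helper name `convert_to_jpg` and the JPEG quality setting are illustrative and not taken from the project code.

```python
from pathlib import Path

from PIL import Image


def convert_to_jpg(src: Path, dst_dir: Path) -> Path:
    """Save `src` as an RGB JPEG inside `dst_dir` and return the new path.

    Hypothetical helper: the name and the quality value are assumptions.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / (src.stem + ".jpg")
    with Image.open(src) as img:
        # For animated GIFs, Pillow exposes the first frame by default.
        # convert("RGB") drops palette/alpha data, which JPEG cannot store.
        img.convert("RGB").save(dst, format="JPEG", quality=95)
    return dst
```

Calling `convert_to_jpg(Path("test/abc.gif"), Path("test_jpg"))` would write `test_jpg/abc.jpg`; both paths are placeholders.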

---

## Model Architecture
- **Backbone:** ResNet34 pretrained on ImageNet.
- **Head:** The final fully connected layer was replaced with `nn.Linear(in_features=512, out_features=4)`, one output per soil class.

### Loss & Optimization
- **Loss Function:** CrossEntropyLoss
- **Optimizer:** Adam (`lr=1e-4`)
- **Scheduler:** StepLR (step size = 5, gamma = 0.5)
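
To make the schedule concrete, the sketch below reproduces the StepLR decay rule in plain Python. It assumes the scheduler is stepped once per epoch, which the card does not state explicitly; `step_lr` is a hypothetical helper, not a PyTorch function.

```python
def step_lr(base_lr: float, epoch: int, step_size: int = 5, gamma: float = 0.5) -> float:
    """Learning rate in effect during `epoch` (0-indexed) under a StepLR-style decay."""
    return base_lr * gamma ** (epoch // step_size)


# Print the learning rate for each of the 15 training epochs.
for epoch in range(15):
    print(f"epoch {epoch + 1:2d}: lr = {step_lr(1e-4, epoch):.2e}")
```

Under that assumption, epochs 1 to 5 train at `1e-4`, epochs 6 to 10 at `5e-5`, and epochs 11 to 15 at `2.5e-5`.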

---

## Training and Validation
We split the data into **80% training** and **20% validation** using stratified sampling.
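
The card does not say which tool produced the split. As an illustration only, a stratified 80/20 split can be written with the standard library as below; `stratified_split` and the seed are assumptions. scikit-learn's `train_test_split(..., stratify=labels)` is a common alternative.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=42):
    """Return (train_idx, val_idx) so each class contributes ~val_fraction to validation."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)
```

Splitting per class keeps the class proportions of the validation set close to those of the full training set, which matters when some soil types have fewer images than others.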

### Training Details:
- **Epochs:** 15
- **Batch size:** 32
- **Device:** GPU (if available)

### Final Epoch Metrics:
```plaintext
Epoch 15/15
  ➤ Train Loss: 0.0250
  ➤ Per-class F1: [0.9714, 0.9787, 0.9367, 0.9903]
  ➤ Min F1: 0.9367
```

---

## Training & Evaluation

- Training used an 80/20 train-validation split.
- Data augmentation was applied to the training images to improve generalization:
  - Horizontal flip
  - Random rotation
  - Color jitter
- Validation images received only the minimal transforms (no augmentation), keeping evaluation consistent across epochs.

### Final Epoch Metrics (Epoch 15 of 15):

- **Train Loss**: `0.0250`
- **Per-Class F1 Scores**:
  - Alluvial soil: `0.9714`
  - Black Soil: `0.9787`
  - Clay soil: `0.9367`
  - Red soil: `0.9903`
- **Minimum F1**: `0.9367`
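
For reference, per-class F1 and its minimum can be computed as in the sketch below. It uses the standard definition F1 = 2·TP / (2·TP + FP + FN); `per_class_f1` is an illustrative helper, and the card does not state which library or data split produced the reported scores.

```python
def per_class_f1(y_true, y_pred, num_classes):
    """Return a list with one F1 score per class index."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples gets 0.0 by convention here.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# The minimum of the scores reported above is the Clay soil score.
reported = [0.9714, 0.9787, 0.9367, 0.9903]
min_f1 = min(reported)
```

Tracking the minimum rather than the mean highlights the weakest class, which a high average could otherwise hide.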

---

## Test Inference Pipeline

- Test image formats handled: `.jpg`, `.jpeg`, `.png`, `.webp`, `.gif`
- Images were preprocessed and converted to `.jpg` if needed.
- The trained model was reloaded and used to predict soil types on test images.
- Final predictions were mapped back to class names using the inverse label map.
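
The last step, mapping predicted indices back to class names and writing `image_id,soil_type` rows, might look like this sketch. The index order in `label_map` is an assumption (it follows the order of the class list in this card), and `write_submission` is a hypothetical helper.

```python
import csv

# Assumed class-to-index mapping; the actual order used in training may differ.
label_map = {"Alluvial soil": 0, "Black Soil": 1, "Clay soil": 2, "Red soil": 3}
inverse_label_map = {idx: name for name, idx in label_map.items()}


def write_submission(image_ids, predicted_indices, path):
    """Write one row per test image; refuse to write if any image lacks a prediction."""
    if len(image_ids) != len(predicted_indices):
        raise ValueError("every test image needs exactly one prediction")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image_id", "soil_type"])
        for image_id, idx in zip(image_ids, predicted_indices):
            writer.writerow([image_id, inverse_label_map[idx]])
```

The length check is a simple guard that every ID in `test_ids.csv` receives a prediction before the file is written.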

---

## Output

- `submission.csv` was generated with **100% test coverage**, i.e. one prediction for every test image.
- Format:
  ```csv
  image_id,soil_type
  xyz.jpg,Clay soil
  abc.jpg,Red soil
  ...
  ```
