In [1]:
{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": ["# Soil Classification - Project Summary"]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Overview\n",
        "This project aims to classify soil types based on image data using deep learning techniques. It was developed for the Kaggle Soil Classification competition. The goal was to maximize the **minimum F1-score across all classes**, making it important to balance performance on all soil categories."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Dataset\n",
        "- Source: Kaggle Soil Classification Competition\n",
        "- Input: RGB images of soil samples\n",
        "- Output: Categorical label for soil type\n",
        "- Number of classes: Varies (based on label file)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Model Architecture\n",
        "- Base model: Pretrained ResNet18 (from `torchvision.models`)\n",
        "- Final layer replaced with custom classifier\n",
        "- Image augmentations applied during training to prevent overfitting\n",
        "- Optimizer: Adam\n",
        "- Loss: CrossEntropyLoss\n",
        "- Metric: Per-class F1 score, with focus on **minimum F1**"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Training Details\n",
        "- Stratified data split (85% train, 15% validation)\n",
        "- Batch size: 32\n",
        "- Epochs: 10–20 (early stopping if no improvement)\n",
        "- Data loaders with image transformations for train/val/test\n",
        "- F1 score tracked each epoch for model checkpointing"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Evaluation\n",
        "- Best validation minimum F1 score: **1.0**\n",
        "- Public leaderboard score: **40**\n",
        "- Final submission made using predictions from the best model checkpoint"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Folder Structure\n",
        "- `notebooks/`: Jupyter notebooks for training and inference\n",
        "- `src/`: Source Python files (model, data loaders, etc.)\n",
        "- `docs/`: Architecture diagram and evaluation cards\n",
        "- `data/`: Contains download script (data not included)\n",
        "- `README.md`: Project overview\n",
        "- `requirements.txt`: Environment setup"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Conclusion\n",
        "This deep learning-based solution achieved consistent performance across all soil classes, successfully optimizing for the competition's unique metric. Future work could explore model ensembling, larger architectures, or spectral domain inputs if available."
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "name": "python",
      "version": ""
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}
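
The summary above names the minimum per-class F1 as the target metric. The following is a minimal sketch of that computation in plain Python, not the project's actual evaluation code; the function names `per_class_f1` and `min_f1` and the example labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def min_f1(y_true, y_pred):
    """Worst per-class F1, the quantity the summary says was maximized."""
    return min(per_class_f1(y_true, y_pred).values())


# Illustrative labels only (not taken from the competition data)
y_true = ["clay", "clay", "sand", "loam"]
y_pred = ["clay", "sand", "sand", "loam"]
scores = per_class_f1(y_true, y_pred)
worst = min_f1(y_true, y_pred)
```

Because the minimum is taken over classes, a single poorly predicted class caps the score no matter how well the others do, which is why the summary stresses balanced performance.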

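The Training Details cell mentions a stratified 85%/15% train/validation split. One way such a split could be done is sketched below in plain Python, assuming only a list of labels; `stratified_split`, its `seed` parameter, and the toy labels are hypothetical and are not taken from the project's `src/` code.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.15, seed=0):
    """Return (train_idx, val_idx), splitting each class in roughly the same ratio."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # local RNG so the split is reproducible
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Classes with 2+ samples contribute at least one validation sample;
        # singleton classes stay entirely in the training set.
        if len(indices) > 1:
            n_val = max(1, round(len(indices) * val_fraction))
        else:
            n_val = 0
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels for illustration: 20 samples of one class, 40 of another
labels = ["a"] * 20 + ["b"] * 40
train_idx, val_idx = stratified_split(labels)
```

In practice, scikit-learn's `train_test_split` with its `stratify` argument offers the same behaviour; the hand-written version is shown only to make the per-class logic explicit.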

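The Training Details cell also describes tracking F1 each epoch for checkpointing and stopping early when there is no improvement. A rough sketch of that bookkeeping follows; the `EarlyStopper` class, the patience of 3 epochs, and the score values are assumptions made for illustration, since the summary does not state them.

```python
class EarlyStopper:
    """Track the best validation score and count epochs without improvement."""

    def __init__(self, patience=3):
        self.patience = patience  # assumed value; not given in the summary
        self.best_score = float("-inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, score):
        """Record a score; return True when it is a new best (time to checkpoint)."""
        if score > self.best_score:
            self.best_score = score
            self.best_epoch = epoch
            self.bad_epochs = 0
            return True
        self.bad_epochs += 1
        return False

    @property
    def should_stop(self):
        return self.bad_epochs >= self.patience


# Made-up per-epoch minimum-F1 values, for illustration only
history = [0.40, 0.55, 0.70, 0.68, 0.70, 0.69, 0.90]
stopper = EarlyStopper(patience=3)
checkpoints = []
last_epoch = None
for epoch, score in enumerate(history, start=1):
    last_epoch = epoch
    if stopper.step(epoch, score):
        checkpoints.append(epoch)  # a real loop would save model weights here
    if stopper.should_stop:
        break
```

Note that a tie with the previous best counts as no improvement in this sketch, so training stops before the final value in `history` is ever seen; a larger patience trades extra epochs for a lower chance of stopping too soon.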
{'cells': [{'cell_type': 'markdown',
   'metadata': {},
   'source': ['# Soil Classification - Project Summary']},
  {'cell_type': 'markdown',
   'metadata': {},
   'source': ['## Overview\n',
    'This project classifies soil types from image data using deep learning. It was developed for the Kaggle Soil Classification competition. The goal was to maximize the **minimum F1-score across all classes**, so the weakest class determines the score and performance must be balanced across all soil categories.']},
  {'cell_type': 'markdown',
   'metadata': {},
   'source': ['## Dataset\n',
    '- Source: Kaggle Soil Classification Competition\n',
    '- Input: RGB images of soil samples\n',
    '- Output: Categorical label for soil type\n',
    '- Number of classes: determined by the label file']},
  {'cell_type': 'markdown',
   'metadata': {},
   'source': ['## Model Architecture\n',
    '- Base model: Pretrained ResNet18 (from `torchvision.models`)\n',
    '- Final layer replaced with custom classifier\n',