{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Car Brand Detection: Training Log\n\n",
    "This notebook documents the process of training the custom YOLOv5 model for the Car Brand Detection API.\n\n",
    "My goal was not to create a perfect, production-ready model, but to learn the end-to-end process of fine-tuning a model on a custom dataset and diagnosing its performance."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 1: Initial V1 Training (50 Epochs)\n\n",
    "I first fine-tuned the pretrained `yolov5s.pt` checkpoint for 50 epochs to get a baseline for the model's performance on this dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initial (V1) training run, executed from inside the yolov5/ directory.\n",
    "# --img 640: input image size; --batch 8: batch size; --weights yolov5s.pt: start from the pretrained small checkpoint.\n\n",
    "!python train.py --img 640 --batch 8 --epochs 50 --data ../Car-Brand-Dataset/data.yaml --weights yolov5s.pt --name car_brand_v1_50_epochs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### V1 Results & Diagnosis\n\n",
    "After testing the V1 model, I found it reliably detected only one brand: Toyota. This is a classic sign of an imbalanced dataset. I confirmed this by looking at the `labels.jpg` file created during training, whose class histogram showed far more Toyota labels than labels for any other brand."
   ]
  },
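  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The imbalance visible in `labels.jpg` can also be checked numerically. The cell below is a minimal sketch of such a check, assuming the dataset uses YOLO-format `.txt` label files (one `class_id x y w h` row per box). The `LABEL_DIR` path and the helper name `count_class_instances` are illustrative assumptions, not part of the original run; the printed class IDs map to the `names` list in `data.yaml`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter\n",
    "from pathlib import Path\n",
    "\n",
    "# Assumption: training labels live here; adjust to the actual dataset layout.\n",
    "LABEL_DIR = Path('../Car-Brand-Dataset/train/labels')\n",
    "\n",
    "def count_class_instances(label_dir):\n",
    "    # Count labeled boxes per class ID across all YOLO .txt label files.\n",
    "    counts = Counter()\n",
    "    for label_file in Path(label_dir).glob('*.txt'):\n",
    "        for line in label_file.read_text().splitlines():\n",
    "            parts = line.split()\n",
    "            if parts:  # skip blank lines\n",
    "                counts[int(parts[0])] += 1\n",
    "    return counts\n",
    "\n",
    "# Print classes from most to least frequent to make the skew obvious.\n",
    "for class_id, n in count_class_instances(LABEL_DIR).most_common():\n",
    "    print(f'class {class_id}: {n} boxes')"
   ]
  },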
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2: V2 Training (150 Epochs with Augmentation)\n\n",
    "To address the problems from V1, I retrained the model with two key changes:\n",
    "1.  **Train for longer (150 epochs)** to give the model more time to learn the features of the less common brands.\n",
    "2.  **Use stronger data augmentation** (`--hyp data/hyps/hyp.scratch-high.yaml`, YOLOv5's high-augmentation hyperparameter file) to artificially create more variety in the training data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This is the command I used to run the improved V2 training.\n",
    "# Note: This command is run from within the /yolov5 directory.\n\n",
    "!python train.py --img 640 --batch 8 --epochs 150 --data ../Car-Brand-Dataset/data.yaml --weights yolov5s.pt --hyp data/hyps/hyp.scratch-high.yaml --name car_brand_v2_150_epochs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### V2 Results & Final Analysis\n\n",
    "The V2 model was a significant improvement: it could now identify several other brands. However, it still struggled with visually similar cars, for example mistaking a BMW SUV for a Kia or Hyundai.\n\n",
    "This taught me the most important lesson: the quality and size of the dataset are the most critical factors. Even with longer training and augmentation, a small dataset of only ~1700 images (roughly 90 per class on average, and fewer for the rarer brands) is not enough to build a highly accurate model for a complex 19-class task. This project was a valuable, practical lesson in the real-world limitations of machine learning."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}