# Screw Detection System: From Manual Annotation to Edge Deployment

## SECTION 1: Project Overview & Journey

### The Challenge
In industrial settings, detecting tiny objects like **10x10px screws** within high-resolution **1920x1080** images is a common but difficult task. 

### Why Standard Models Fail
Most modern detectors (like YOLOv8) resize images to 640x640 during inference. This shrinkage turns a 10px screw into a ~3px blob of pixels, destroying critical features (threads, drive type) and making detection nearly impossible.
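
The shrinkage can be checked with a few lines of arithmetic. This is a minimal sketch that assumes an aspect-preserving (letterbox-style) resize, where the longer image side is fitted to the target size; `resized_object_size` is a hypothetical helper, not part of any library.

```python
def resized_object_size(obj_px, image_wh, target=640):
    """Approximate side length of an object after an aspect-preserving resize."""
    w, h = image_wh
    scale = min(target / w, target / h)  # the longer side is fitted to `target`
    return obj_px * scale

print(round(resized_object_size(10, (1920, 1080)), 1))  # 3.3
```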

### Our Solution: The Complete Pipeline
This project documents the end-to-end journey of building a specialized system using **Slicing Aided Hyper Inference (SAHI)** and **Sliced Training** to achieve **94.7% Precision**.

## SECTION 2: Data Preparation Pipeline

### Step 1: Manual Annotation
We started with an unannotated dataset from Kaggle. To build a robust model, we manually labeled every screw and washer in **Roboflow**. Manual annotation was critical to ensure bounding box precision for such small objects.

### Step 2: Augmentation
To improve generalization, we applied **Rotation** and **Noise** augmentations in Roboflow.

**Final Dataset Stats:**
- **Train:** 225 images
- **Val:** 15 images
- **Test:** 10 images

In [None]:
import os
import sys
import cv2
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path

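# Visualize a sample full-resolution image
PROJECT_ROOT = Path.cwd().parent  # assumes the notebook runs from a subfolder of the project
sample_img = PROJECT_ROOT / 'data' / 'samples' / 'sample1.jpg'

# cv2.imread returns None instead of raising when a file cannot be decoded
img = cv2.imread(str(sample_img)) if sample_img.exists() else None
if img is None:
    print(f"Sample image not found or unreadable: {sample_img}")
else:
    plt.figure(figsize=(10, 6))
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR; matplotlib expects RGB
    plt.title("Sample Full-Res Image (Manual Annotation Target)")
    plt.axis('off')
    plt.show()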

## SECTION 3: SAHI Architecture Explained

### The Solution: Slice -> Detect -> Merge
SAHI works by slicing the large image into smaller patches (e.g., 640x640), running detection on each patch at native resolution, and then merging the results back into the original coordinate space.

In [None]:
def visualize_tiling(img_shape, slice_size=640, overlap=0.2):
    h, w = img_shape[:2]
    stride = int(slice_size * (1 - overlap))
    
    # Visualizing the logic behind SAHI's tiling strategy
    print(f"Tiling Logic: Image {w}x{h}, Slice {slice_size}, Overlap {overlap*100}%")
    print(f"Stride: {stride} pixels")
    
visualize_tiling((1080, 1920))
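
To make the tiling concrete, the sketch below lists the tile coordinates for a 1920x1080 image. It assumes the last tile in each row and column is shifted back inside the image rather than padded; `tile_starts` and `tile_boxes` are hypothetical helpers written for illustration and are not SAHI functions.

```python
def tile_starts(length, slice_size=640, overlap=0.2):
    """Start offsets of tiles along one axis; the last tile is shifted back to stay in bounds."""
    if length <= slice_size:
        return [0]
    stride = int(slice_size * (1 - overlap))
    starts = list(range(0, length - slice_size, stride))
    starts.append(length - slice_size)  # final tile ends exactly at the image edge
    return starts

def tile_boxes(width, height, slice_size=640, overlap=0.2):
    """(x_min, y_min, x_max, y_max) for every tile, row by row."""
    return [
        (x, y, min(x + slice_size, width), min(y + slice_size, height))
        for y in tile_starts(height, slice_size, overlap)
        for x in tile_starts(width, slice_size, overlap)
    ]

boxes = tile_boxes(1920, 1080)
print(len(boxes))  # 8
print(boxes[:4])   # [(0, 0, 640, 640), (512, 0, 1152, 640), (1024, 0, 1664, 640), (1280, 0, 1920, 640)]
```

Under these assumptions a 1080p frame yields 4 columns by 2 rows, so each image costs 8 forward passes.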

## SECTION 4: Solving the "Screw Cut in Half" Problem

### The Challenge
Objects on tile boundaries are cut into two pieces. If we use 0% overlap, the model only sees half a screw in each tile, which it hasn't been trained for.

### The Solution: 20% Overlap
With 20% overlap, adjacent 640px tiles share a 128px band. Any object smaller than that band that is cut by the edge of one tile is fully contained in the neighboring overlapping tile, so a 10px screw is always seen whole at least once.
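
The effect of overlap can be checked along a single axis. The sketch below uses a hypothetical 10px screw straddling x = 640 and hypothetical helper names; it assumes the last tile is shifted back inside the image.

```python
def tile_intervals(length, slice_size, overlap):
    """1-D tile extents (start, end), with the last tile shifted back inside the image."""
    stride = int(slice_size * (1 - overlap))
    last = max(length - slice_size, 0)
    starts = list(range(0, last, stride)) + [last]
    return [(s, min(s + slice_size, length)) for s in starts]

def fully_visible(obj, tiles):
    """True if the object interval lies completely inside at least one tile."""
    lo, hi = obj
    return any(start <= lo and hi <= end for start, end in tiles)

screw = (635, 645)  # hypothetical 10 px screw straddling x = 640
print(fully_visible(screw, tile_intervals(1920, 640, 0.0)))  # False
print(fully_visible(screw, tile_intervals(1920, 640, 0.2)))  # True
```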

**NMS Merging:** After detecting objects in all tiles, we use Non-Maximum Suppression (NMS) to remove duplicate detections from the overlapping regions.
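
A minimal, class-agnostic sketch of the merging step follows. It uses IoU as the overlap measure, which is an assumption (SAHI also offers an intersection-over-smaller-area metric), and the detections are made-up values in full-image coordinates.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(detections, iou_threshold=0.6):
    """Greedy NMS over (box, score) pairs: keep the best box, drop heavy overlaps."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept

dets = [
    ((630, 100, 640, 110), 0.91),  # screw as seen by the left tile
    ((631, 100, 641, 110), 0.88),  # same screw as seen by the overlapping tile
    ((900, 300, 910, 310), 0.80),  # a different screw
]
print(len(nms(dets)))  # 2
```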

## SECTION 5: Training Dataset Slicing

We don't just use SAHI for inference. We **trained** the model on sliced tiles to align the training distribution with the inference patches.

**Slicing Strategy:**
- **Partial Filtering:** Removed objects with <30% visibility to prevent noisy gradients.
- **Empty Tile Strategy:** Kept 15-20% empty tiles (negative samples) to reduce False Positives.
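
The partial-object filter can be sketched as follows. This is an illustration under stated assumptions, not our exact slicing script: visibility is taken to mean the fraction of a box's area that falls inside the tile, and the helper names and sample boxes are hypothetical.

```python
def visible_fraction(box, tile):
    """Share of a box's area that falls inside a tile; both are (x_min, y_min, x_max, y_max)."""
    iw = max(0, min(box[2], tile[2]) - max(box[0], tile[0]))
    ih = max(0, min(box[3], tile[3]) - max(box[1], tile[1]))
    area = (box[2] - box[0]) * (box[3] - box[1])
    return (iw * ih) / area if area > 0 else 0.0

def boxes_for_tile(boxes, tile, min_visibility=0.3):
    """Clip boxes to the tile, shift them to tile coordinates, drop those below min_visibility."""
    tx, ty = tile[0], tile[1]
    kept = []
    for b in boxes:
        if visible_fraction(b, tile) >= min_visibility:
            kept.append((max(b[0], tile[0]) - tx, max(b[1], tile[1]) - ty,
                         min(b[2], tile[2]) - tx, min(b[3], tile[3]) - ty))
    return kept

tile = (512, 0, 1152, 640)
boxes = [
    (506, 50, 516, 60),    # 4 of 10 columns inside the tile -> 40% visible, kept
    (503, 80, 513, 90),    # 1 of 10 columns inside the tile -> 10% visible, dropped
    (700, 200, 710, 210),  # fully inside, kept
]
print(boxes_for_tile(boxes, tile))  # [(0, 50, 4, 60), (188, 200, 198, 210)]
```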

In [None]:
import json

# Resulting stats from our slicing process:
stats = {
    "Raw Training Images": 225,
    "Generated Slices": "~1800",
    "Empty Tiles Kept": "15%",
    "Avg Objects Per Slice": "~0.5"
}
print(json.dumps(stats, indent=4))

## SECTION 6: Model Training

We trained a **YOLOv8-Small** model on the sliced dataset.

**Training Config:**
- **Epochs:** 150
- **Batch:** 4-16 (depending on environment)
- **AMP:** Enabled, to fit training into the 3GB of VRAM on a GTX 780 Ti.
- **Augmentation:** Rotation disabled, since it was already applied in the Roboflow dataset.
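
With the Ultralytics package, this configuration could be expressed roughly as below. Only the values listed above come from our setup; the dataset path, the `imgsz` value, and the `train` wrapper are assumptions made for illustration.

```python
# Hypothetical paths and names; only epochs, batch range, AMP and rotation come from the text.
TRAIN_ARGS = {
    "data": "data/sliced/data.yaml",  # assumed location of the sliced dataset config
    "epochs": 150,
    "batch": 4,      # low end of the 4-16 range, for a 3 GB GPU
    "imgsz": 640,    # assumed to match the 640x640 tiles
    "amp": True,
    "degrees": 0.0,  # no extra rotation; the dataset was already rotated in Roboflow
}

def train(weights="yolov8s.pt", **overrides):
    """Fine-tune YOLOv8-Small with the settings above (requires the ultralytics package)."""
    from ultralytics import YOLO  # imported lazily so the config can be inspected without it
    model = YOLO(weights)
    return model.train(**{**TRAIN_ARGS, **overrides})
```

Calling `train(batch=16)` would override the batch size on a machine with more memory.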

## SECTION 7: Threshold Optimization

We performed a Grid Search across Confidence and NMS thresholds.

**Optimal Results:**
- **Confidence:** 0.7
- **NMS (SAHI):** 0.6

--- 

### Why 0.7 Confidence?
While 0.4 gave higher recall, 0.7 reduced False Positives by **55%**. In industrial automation, stability is more important than catching 100% of objects, as false alarms stop production lines.
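
A grid search of this kind can be sketched in a few lines. The selection criterion here is an assumption: we use an F-beta score with beta = 0.5 to favor precision, and `evaluate` is a hypothetical caller-supplied function that returns `(precision, recall)` on the validation set for a given threshold pair.

```python
from itertools import product

def f_beta(precision, recall, beta=0.5):
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def grid_search(evaluate, conf_values, nms_values, beta=0.5):
    """Return (best_score, conf, nms) over all threshold pairs.

    `evaluate(conf, nms)` must return (precision, recall) for that pair.
    """
    best = None
    for conf, nms in product(conf_values, nms_values):
        precision, recall = evaluate(conf, nms)
        score = f_beta(precision, recall, beta)
        if best is None or score > best[0]:
            best = (score, conf, nms)
    return best
```

A call such as `grid_search(evaluate, [0.3, 0.4, 0.5, 0.6, 0.7], [0.4, 0.5, 0.6])` would cover a grid like ours; the exact values searched are not listed here, so treat these as placeholders.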

## SECTION 8: SAHI Inference Pipeline

Below is the core implementation of our `SAHIDetector` logic.

In [None]:
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

MODEL_PATH = str(PROJECT_ROOT / 'models' / 'yolov8_sliced_best.pt')

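def load_model(confidence_threshold=0.1, device='cpu'):
    """Load the sliced-trained YOLOv8 weights through SAHI."""
    return AutoDetectionModel.from_pretrained(
        model_type='yolov8',  # newer SAHI releases may expect 'ultralytics' here
        model_path=MODEL_PATH,
        confidence_threshold=confidence_threshold,  # low base threshold; Section 7 reports 0.7 as the tuned value
        device=device
    )

def run_inference(image_path, detection_model=None):
    """Slice the image into 640x640 tiles with 20% overlap, detect per tile, merge with NMS."""
    if detection_model is None:
        # Loading is slow; pass a model from load_model() when processing many images
        detection_model = load_model()

    return get_sliced_prediction(
        image_path,
        detection_model,
        slice_height=640,
        slice_width=640,
        overlap_height_ratio=0.2,
        overlap_width_ratio=0.2,
        postprocess_type="NMS",
        postprocess_match_threshold=0.6  # NMS threshold from Section 7
    )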

print("Inference function defined.")
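
The cell above uses a base threshold of 0.1, while Section 7 reports 0.7 as the tuned confidence. One way to apply the tuned cutoff afterwards is sketched below; `confident_boxes` is a hypothetical helper, and it assumes the predictions come from `result.object_prediction_list`, whose items expose `score.value`, `category.name`, and `bbox.to_xyxy()`.

```python
def confident_boxes(predictions, min_score=0.7):
    """Return (label, score, [x_min, y_min, x_max, y_max]) for predictions at or above min_score."""
    rows = []
    for p in predictions:
        if p.score.value >= min_score:
            rows.append((p.category.name, p.score.value, p.bbox.to_xyxy()))
    return rows

# Usage, assuming `run_inference` from the cell above and an existing image file:
# result = run_inference("path/to/image.jpg")
# for label, score, box in confident_boxes(result.object_prediction_list):
#     print(label, round(score, 2), box)
```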

## SECTION 9: Performance Evaluation

| Approach | Precision | Recall | Latency | Key Difference |
| :--- | :--- | :--- | :--- | :--- |
| Baseline (1280 resize) | 89.6% | 94.7% | 470ms | 42 False Positives |
| SAHI (Standard Model) | 91.2% | 90.5% | 6000ms | Better but slow |
| **SAHI (Sliced-Trained)** | **94.7%** | 88.2% | 1500ms | **19 False Positives (-55%)** |
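
The relative changes quoted in the table can be reproduced directly from its figures. `relative_change` is a hypothetical helper used only for this check.

```python
def relative_change(before, after):
    """Signed relative change from `before` to `after`."""
    return (after - before) / before

# False positives: baseline vs. sliced-trained SAHI
print(f"{relative_change(42, 19):.1%}")      # -54.8%
# Latency: SAHI with the standard model vs. the sliced-trained model
print(f"{relative_change(6000, 1500):.1%}")  # -75.0%
```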

## SECTION 10: Raspberry Pi Deployment Analysis

Deploying this pipeline on edge hardware (Pi 4/5) requires specific trade-offs:

**1. INT8 Quantization:**
- Model size dropped from 22.5MB to **10.9MB**.
- Speed increased by ~2x on ARM CPU.

**2. Sequential Processing:**
- We process tiles one-by-one to keep RAM usage under **80MB**.

**3. Trade-offs:**
- **Accuracy vs Speed:** Decreasing the tile size to 416x416 makes inference 2.5x faster but reduces detection quality for the smallest screws.
- **Latency:** The Pi 5 achieves ~1.5s per 1080p image with manual slicing.
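
Sequential tile processing can be sketched as a simple loop that holds one tile at a time and maps each detection back to full-image coordinates. This is an illustration, not our deployed code: `detect_sequentially` and the `detect_tile` callback are hypothetical names, and the image is assumed to support NumPy-style 2-D slicing.

```python
def detect_sequentially(image, tile_boxes, detect_tile):
    """Run `detect_tile` on one tile at a time and map results to full-image coordinates.

    image       : array-like supporting image[y0:y1, x0:x1] (e.g. a NumPy array)
    tile_boxes  : iterable of (x0, y0, x1, y1) tile extents
    detect_tile : caller-supplied function returning [(x0, y0, x1, y1, score), ...]
                  in tile coordinates
    """
    detections = []
    for tx0, ty0, tx1, ty1 in tile_boxes:
        tile = image[ty0:ty1, tx0:tx1]  # a view for NumPy arrays, so no pixel copy
        for x0, y0, x1, y1, score in detect_tile(tile):
            detections.append((x0 + tx0, y0 + ty0, x1 + tx0, y1 + ty0, score))
        del tile  # only one tile is referenced at any time
    return detections
```

The merged list would still need duplicate removal (for example NMS) before use, since overlapping tiles can report the same screw twice.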

## SECTION 11: Complete Pipeline Summary

**Flowchart:**
Roboflow Manual Annotation -> Augmentation -> Sliced Training -> SAHI Optimization -> Edge Conversion (ONNX INT8) -> Pi Deployment.

### Lessons Learned
- Slicing is mandatory for sub-30px object detection.
- Training on slices beat using SAHI at inference time alone: precision rose from 91.2% to 94.7% and latency fell from 6000ms to 1500ms, at the cost of recall (90.5% to 88.2%).
- Negative samples (empty tiles) were key to reducing false alarms.