### **Author**
Shivansh Gupta

## **Importing All the Modules**

- **`YOLO` (from `ultralytics`)**
  - Load and run a pretrained **YOLO model** (this notebook ends up loading `yolo12n.pt`).

- **`cv2`**
  - Read/write images.
  - Draw bounding boxes and other shapes.
  - Perform color space conversions (e.g., BGR ↔ RGB).

- **`numpy as np`**
  - Work with image arrays.
  - Perform numeric and matrix operations efficiently.

- **`typing.List` / `typing.Dict`**
  - Add **type annotations** for better code clarity and readability.

- **`torch`**
  - Detect GPU availability using `torch.cuda.is_available()`.
  - Perform tensor operations if needed.

- **`os`**
  - Read environment variables.
  - Check file paths and manage the filesystem.

In [18]:
import sys

from ultralytics import YOLO
import cv2
import numpy as np
from typing import List, Dict
import torch

### Loading Environment Variables

Load environment variables from a `.env` file.

In [19]:
import os
from dotenv import load_dotenv
# Raw string, so the backslashes in the Windows path are not read as escape sequences.
load_dotenv(dotenv_path=r"D:\PycharmProjects\Eco_Vision\Backend\.env")


True

### YOLO Model Setup

- **Device Selection:**
  You can **force the use of CPU or GPU** manually.
  If not specified, the device will be **auto-detected** later.

- **Model Path (`model_path`):**
  - Uses the environment variable `MODEL_PATH` if it exists.
  - Otherwise, defaults to `"yolov8n.pt"`.

- **`YOLO(model_path)`**:
  - Calls the **YOLO class** from Ultralytics with the given `model_path`.
  - Loads the **pretrained model into memory**.
  - **Important:** The model is loaded **once in the constructor**, so you **don’t need to reload it** every time you perform detection.
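
A minimal sketch of the manual override described above. The `DEVICE` environment variable name and the `pick_device` helper are assumptions made for illustration; they are not part of the original notebook. The CUDA check is passed in as a flag so the helper stays independent of `torch`.

```python
import os
from typing import Optional


def pick_device(requested: Optional[str], cuda_available: bool) -> str:
    """Return the device to run on: an explicit request wins, otherwise auto-detect."""
    if requested:
        requested = requested.strip().lower()
        if requested.startswith("cuda") and not cuda_available:
            # A GPU was requested but none is available: fall back to the CPU.
            return "cpu"
        return requested
    return "cuda" if cuda_available else "cpu"


# Hypothetical variable name; in the notebook you would pass torch.cuda.is_available().
device_choice = pick_device(os.getenv("DEVICE"), cuda_available=False)
```

With no override set, the helper behaves like the auto-detection used in this notebook.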


In [20]:
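# Use MODEL_PATH from the environment if set, otherwise fall back to "yolov8n.pt".
# In this run MODEL_PATH points to "yolo12n.pt" (see the output below).
model_path = os.getenv("MODEL_PATH", "yolov8n.pt")
device = "cuda" if torch.cuda.is_available() else "cpu"  # auto-detect the GPU
model = YOLO(model_path)  # loaded once and reused for every detection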

#### As shown below, the model is loaded on the CPU rather than a GPU, since I don't have a dedicated graphics card 💀.

In [21]:
print(f"[YOLODetector] Loaded model: {model_path} on {device}")

[YOLODetector] Loaded model: yolo12n.pt on cpu


### **COCO 2017 Dataset Classes**

- These are the object classes we use from the **COCO 2017 dataset**.
- The model is restricted to these classes, so it reports detections only for them.

> **Note:** COCO 2017 has 80 classes in total; for this task we select the subset relevant to reusable items.

- When a detection is made:
  - If the **class ID** is **in** `self.reusable_classes` → ✅ keep it.
  - If the **class ID** is **not in** `self.reusable_classes` → ❌ ignore it.


This keeps the detection system **focused only on relevant items**, reducing noise from unnecessary classes.
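
The keep/ignore rule above can be sketched in plain Python. The `keep_reusable` helper and the `raw` sample data are hypothetical and only illustrate the membership test; in this notebook the same filter is applied inside the model call through the `classes` argument.

```python
from typing import Dict, List, Tuple

# Small subset of the mapping, for illustration only.
reusable = {39: "bottle", 41: "cup"}


def keep_reusable(pairs: List[Tuple[int, float]], allowed: Dict[int, str]) -> List[Tuple[str, float]]:
    """Keep only detections whose class ID is in `allowed`, replacing the ID with its name."""
    return [(allowed[cid], score) for cid, score in pairs if cid in allowed]


# Hypothetical raw detections as (class_id, confidence). Class 0 is "person" in COCO.
raw = [(39, 0.91), (0, 0.88), (41, 0.64)]
kept = keep_reusable(raw, reusable)
print(kept)  # [('bottle', 0.91), ('cup', 0.64)]
```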


In [22]:
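# Keys are the 0-based class indices of the 80-class COCO model (0..79).
reusable_classes = {
    39: 'bottle',
    41: 'cup',
    42: 'fork',
    43: 'knife',
    44: 'spoon',
    45: 'bowl',
    46: 'banana',
    47: 'apple',
    49: 'orange',      # 51 is 'carrot'
    63: 'laptop',      # 73 is 'book'
    66: 'keyboard',    # 76 is 'scissors'
    67: 'cell phone',
    73: 'book',        # 84 is outside the 0..79 range
}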

### **Model Inference**

- This is where the **actual inference happens**.
- The input image/frame is passed to the YOLO model.
- The model runs its **forward pass** and returns detections:
  - **Bounding boxes** (location of objects).
  - **Class IDs** (what the object is).
  - **Confidence scores** (how sure the model is).

> **Note:** Inference = the stage where the trained model is **applied to new data** to make predictions.

In [23]:
def detect_objects(image_path: str, conf: float = 0.5, imgsz: int = 640) -> List[Dict]:
    if not os.path.exists(image_path):
        raise FileNotFoundError(f"Image not found: {image_path}")

    # Collect the class IDs (39, 41, 42, ...) from the reusable_classes dictionary.
    classes = list(reusable_classes.keys())
    results = model(
        image_path,
        conf=conf,
        imgsz=imgsz,
        device=device,
        classes=classes,
    )

    detections = []
    for result in results:
        if not getattr(result, "boxes", None):  # skip images with no detections
            continue
        for box in result.boxes:
            # cls, conf and xyxy are tensors (YOLO is built on PyTorch), so convert them.
            class_id = int(box.cls[0])
            confidence = float(box.conf[0])
            x1, y1, x2, y2 = map(int, box.xyxy[0].tolist())
            detections.append({
                "class_id": class_id,
                "class_name": reusable_classes[class_id],
                "confidence": {
                    "score": round(confidence, 4),          # raw score (0..1)
                    "percent": round(confidence * 100, 1),  # human-friendly %
                },
                "bbox": [x1, y1, x2, y2],
            })
    return detections

`results` is a list of `Results` objects (one per input image). Each one contains the detected boxes, class IDs, confidences, and so on.

### **YOLO Inference Output (Ultralytics)**

When we run inference, **Ultralytics YOLO** returns a **list of `Results` objects**
👉 one `Results` object **per input image**.

---

#### **Step 1: What is a `Results` object?**
Each `Results` object contains:
- The **original input image** (stored as `orig_img`).
- All **detections** found in that image.
- **Helper methods** (e.g., `.plot()`, `.save()`).

➡️ In short: **`result` = container for one image’s predictions**.

---

#### **Step 2: What is `.boxes` inside a result?**
- `result.boxes` → an attribute of the `Results` object.
- It is a **`Boxes` object** (Ultralytics’ custom class).
- Stores **all bounding boxes YOLO predicted** for that image.
- Each entry in `result.boxes` = **one detection**.

---

#### **Step 3: What does each box contain?**
A single box has:
- `.cls` → predicted **class id** (e.g., `tensor([39.])`).
- `.conf` → **confidence score** (e.g., `tensor([0.872])`).
- `.xyxy` → bounding box in **[x1, y1, x2, y2]** format (absolute pixel values).
- `.xywh` → bounding box in **[x_center, y_center, width, height]** format.
- `.data` → raw tensor with all values stacked.
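
The relationship between the `.xyxy` and `.xywh` formats can be shown with a few lines of arithmetic. This is an illustrative sketch; `xyxy_to_xywh` is a hypothetical helper, not an Ultralytics function.

```python
from typing import Sequence, Tuple


def xyxy_to_xywh(box: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert corner format [x1, y1, x2, y2] to center format [x_center, y_center, width, height]."""
    x1, y1, x2, y2 = box
    width = x2 - x1
    height = y2 - y1
    return (x1 + width / 2, y1 + height / 2, width, height)


# First bounding box from the test run in this notebook.
print(xyxy_to_xywh([190, 67, 222, 181]))  # (206.0, 124.0, 32, 114)
```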

---

✅ Example:
If YOLO finds **3 objects** in an image, then `result.boxes` will contain **3 box objects**,
each with its own class, confidence, and coordinates.
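
Instead of looping box by box, the three attributes can also be read for all detections at once through `.cls`, `.conf`, and `.xyxy`. The sketch below assumes only that each attribute supports `.tolist()`. The `fake_boxes` stand-in is built from NumPy arrays so the example runs without a model; a real `result.boxes` holds PyTorch tensors, which offer the same call.

```python
from types import SimpleNamespace
from typing import Dict, List

import numpy as np


def boxes_to_dicts(boxes) -> List[Dict]:
    """Flatten a Boxes-like object into one plain dict per detection."""
    class_ids = boxes.cls.tolist()
    scores = boxes.conf.tolist()
    corners = boxes.xyxy.tolist()
    return [
        {"class_id": int(c), "confidence": round(float(s), 4), "bbox": [int(v) for v in xyxy]}
        for c, s, xyxy in zip(class_ids, scores, corners)
    ]


# Stand-in for result.boxes with two detections (illustrative values).
fake_boxes = SimpleNamespace(
    cls=np.array([39.0, 41.0]),
    conf=np.array([0.872, 0.655]),
    xyxy=np.array([[10.0, 20.0, 50.0, 80.0], [5.0, 5.0, 30.0, 40.0]]),
)
rows = boxes_to_dicts(fake_boxes)
print(len(rows))  # 2
```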

### **Let's run it and see how this works**

In [24]:
detect = detect_objects("test.jpg")
print(detect)


image 1/1 D:\PycharmProjects\Eco_Vision\Backend\ML notebook\test.jpg: 576x640 8 bottles, 266.7ms
Speed: 64.9ms preprocess, 266.7ms inference, 21.5ms postprocess per image at shape (1, 3, 576, 640)
[{'class_id': 39, 'class_name': 'bottle', 'confidence': {'score': 0.9479, 'percent': 94.8}, 'bbox': [190, 67, 222, 181]}, {'class_id': 39, 'class_name': 'bottle', 'confidence': {'score': 0.7831, 'percent': 78.3}, 'bbox': [26, 62, 51, 155]}, {'class_id': 39, 'class_name': 'bottle', 'confidence': {'score': 0.7175, 'percent': 71.7}, 'bbox': [45, 81, 79, 153]}, {'class_id': 39, 'class_name': 'bottle', 'confidence': {'score': 0.663, 'percent': 66.3}, 'bbox': [26, 62, 51, 127]}, {'class_id': 39, 'class_name': 'bottle', 'confidence': {'score': 0.6598, 'percent': 66.0}, 'bbox': [145, 59, 169, 156]}, {'class_id': 39, 'class_name': 'bottle', 'confidence': {'score': 0.6534, 'percent': 65.3}, 'bbox': [156, 61, 186, 178]}, {'class_id': 39, 'class_name': 'bottle', 'confidence': {'score': 0.6361, 'percent'

We get a list of detections; here the model found 8 of the 10 bottles in our image.
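
A short sketch of how such a detection list could be summarized per class. The `summarize` helper is hypothetical, and the `sample` list reuses two entries from the output above plus one made-up cup.

```python
from collections import Counter
from typing import Dict, List


def summarize(detections: List[Dict]) -> Dict[str, int]:
    """Count how many detections were found for each class name."""
    return dict(Counter(d["class_name"] for d in detections))


# Two entries copied from the output above, plus one hypothetical cup.
sample = [
    {"class_name": "bottle", "confidence": {"score": 0.9479, "percent": 94.8}},
    {"class_name": "bottle", "confidence": {"score": 0.7831, "percent": 78.3}},
    {"class_name": "cup", "confidence": {"score": 0.61, "percent": 61.0}},
]
counts = summarize(sample)
best = max(sample, key=lambda d: d["confidence"]["score"])
print(counts)  # {'bottle': 2, 'cup': 1}
```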

### **YOLO Inference Timing Flow**

The YOLO output line:

```

test.jpg: 576x640 8 bottles, 214.9ms
Speed: 8.5ms preprocess, 214.9ms inference, 3.8ms postprocess per image at shape (1, 3, 576, 640)

```

can be visualized as:

```

Input Image: test.jpg (processed at 576x640)
    │
    ▼
Preprocess: 8.5ms
  * Load image
  * Resize & normalize
  * Convert to tensor
  * Send to device (CPU or GPU)
    │
    ▼
Inference: 214.9ms
  * YOLO model predicts bounding boxes & class scores
    │
    ▼
Postprocess: 3.8ms
  * Non-Max Suppression (NMS)
  * Filter overlapping boxes
  * Scale boxes to original image
    │
    ▼
Output: 8 bottles detected
Total time: 227.2ms (8.5 + 214.9 + 3.8)

```

**Explanation:**
- **Preprocess:** Preparation before model runs.
- **Inference:** Model does all predictions.
- **Postprocess:** Refines and formats predictions.

> **Note:** Most of the time is spent in **inference**, which grows with image size or model complexity.
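
The three stage timings can be added up to get the end-to-end time per image and each stage's share of it. A small sketch using the numbers from the `Speed:` line above:

```python
# Stage timings in milliseconds, copied from the Speed line above.
stages = {"preprocess": 8.5, "inference": 214.9, "postprocess": 3.8}

total_ms = sum(stages.values())
shares = {name: round(100 * ms / total_ms, 1) for name, ms in stages.items()}

print(f"total: {total_ms:.1f} ms")  # total: 227.2 ms
print(shares)  # {'preprocess': 3.7, 'inference': 94.6, 'postprocess': 1.7}
```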


### **Input Tensor Shape**

The YOLO model input tensor has the shape:

```

(1, 3, 576, 640)

```

**Breakdown:**

| Dimension | Meaning |
|-----------|---------|
| 1         | Batch size (**one image**) |
| 3         | Number of color channels (**RGB**) |
| 576       | Image height in pixels |
| 640       | Image width in pixels |

> **Note:** YOLO requires input images to be resized and converted into a tensor of shape `(batch, channels, height, width)` before passing it through the model.
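
To make the layout concrete, here is a NumPy sketch of the change from OpenCV's `(height, width, channels)` order to `(batch, channels, height, width)`. It only illustrates the axis reordering on a dummy image; Ultralytics performs its own resizing and conversion internally, and this is not that code.

```python
import numpy as np

# Dummy image in OpenCV layout: (height, width, channels), 8-bit values.
image_hwc = np.zeros((576, 640, 3), dtype=np.uint8)

# Reorder to (channels, height, width), scale to 0..1, then add the batch axis.
tensor_like = image_hwc.transpose(2, 0, 1).astype(np.float32) / 255.0
batch = tensor_like[np.newaxis, ...]

print(batch.shape)  # (1, 3, 576, 640)
```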

In [25]:
def annotate_image(image_path: str, detections: List[Dict]) -> np.ndarray:
    img_bgr = cv2.imread(image_path)  # returns None if the file cannot be read
    if img_bgr is None:
        raise FileNotFoundError(f"Failed to read image: {image_path}")
    for det in detections:
        x1, y1, x2, y2 = det["bbox"]
        label = f"{det['class_name']} {det['confidence']['percent']}%"
        # Green box (BGR) with the label just above its top-left corner.
        cv2.rectangle(img_bgr, (x1, y1), (x2, y2), (16, 185, 129), 2)
        cv2.putText(img_bgr, label, (x1, y1 - 6),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.2,
                    (0, 0, 0), 1, cv2.LINE_AA)
    # cv2.imshow expects BGR, so display the image before converting it.
    cv2.imshow("Annotated Image", img_bgr)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    # Return RGB for matplotlib / web display.
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
    return img_rgb

### **`cv2.imread()`**

- **What:**
  OpenCV function to **read an image from disk**.

- **Why:**
  Loads the image into a **NumPy array** in **BGR format**.
  > Note: OpenCV uses **Blue-Green-Red (BGR) channel order** by default, not RGB.


### **`cv2.rectangle()`**

Draws a rectangle on an image, typically used for **bounding boxes** around detected objects.

```python
cv2.rectangle(img_bgr, (x1, y1), (x2, y2), (16, 185, 129), 2)
```

**Breakdown of parameters:**

| Parameter        | What it is                           | Why                                                            |
| ---------------- | ------------------------------------ | -------------------------------------------------------------- |
| `img_bgr`        | The image array (in BGR format)      | Rectangle will be drawn directly on this image                 |
| `(x1, y1)`       | Top-left corner of the rectangle     | Defines where the rectangle starts                             |
| `(x2, y2)`       | Bottom-right corner of the rectangle | Defines where the rectangle ends                               |
| `(16, 185, 129)` | BGR color tuple for the rectangle    | Chooses a visible color (here, a shade of green)               |
| `2`              | Line thickness in pixels             | Determines how thick the rectangle border appears on the image |

> **Note:** OpenCV uses **BGR** order, not RGB, for colors.

### **`cv2.putText()`**

Draws text on an image, typically used to **display the class name and confidence** above a bounding box.

```python
cv2.putText(img_bgr, label, (x1, y1 - 6),
            cv2.FONT_HERSHEY_SIMPLEX, 0.2,
            (0, 0, 0), 1, cv2.LINE_AA)
```

**Breakdown of parameters:**

| Parameter                  | What it is                                            | Why                                                            |
|----------------------------| ----------------------------------------------------- | -------------------------------------------------------------- |
| `img_bgr`                  | The image array where text will be drawn (BGR format) | Text appears directly on this image alongside the bounding box |
| `label`                    | Text string (e.g., `"bottle 92.1%"`)                  | Displays the **object name and confidence**                    |
| `(x1, y1 - 6)`             | Bottom-left corner coordinates of the text            | Slightly above the bounding box to avoid overlap               |
| `cv2.FONT_HERSHEY_SIMPLEX` | Predefined OpenCV font type                           | Determines the **style of the text**                           |
| `0.2`                      | Font scale                                            | Controls **text size** (0.2 is very small; 0.5 is moderately small) |
| `(0, 0, 0)`                | Text color in BGR (black)                             | Readable against light backgrounds                             |
| `1`                        | Thickness of the text stroke                          | Determines how bold the text appears                           |
| `cv2.LINE_AA`              | Anti-aliased line type                                | Smooths edges for better readability                           |

> **Note:** Using anti-aliasing (`cv2.LINE_AA`) makes the text **look smoother and more professional**.
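
One edge case worth keeping in mind: when a box touches the top of the image, `y1 - 6` becomes negative and the label can end up outside the visible area. A small sketch of a guard follows; `label_origin` and its `min_y` default are hypothetical and not part of the original code.

```python
from typing import Tuple


def label_origin(x1: int, y1: int, pad: int = 6, min_y: int = 12) -> Tuple[int, int]:
    """Place the label `pad` pixels above the box, but never above the top edge of the image."""
    return (max(x1, 0), max(y1 - pad, min_y))


print(label_origin(190, 67))  # (190, 61)
print(label_origin(26, 3))    # (26, 12)
```

The returned tuple could be passed to `cv2.putText()` in place of `(x1, y1 - 6)`.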

### **Converting BGR to RGB**

```python
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
```

**Breakdown of components:**

| Component           | What it is                                      | Why                                                               |
| ------------------- | ----------------------------------------------- | ----------------------------------------------------------------- |
| `img_rgb`           | New variable storing the converted image        | Needed in **RGB format** for consistent display (matplotlib, web) |
| `cv2.cvtColor()`    | OpenCV function to convert image color spaces   | Converts BGR → RGB                                                |
| `img_bgr`           | Input image in **BGR format**                   | Source image with drawn bounding boxes and text                   |
| `cv2.COLOR_BGR2RGB` | OpenCV constant specifying BGR → RGB conversion | Ensures red, green, and blue channels are correctly reordered     |

> **Note:** OpenCV uses **BGR by default**, while most display libraries (matplotlib, PIL, browsers) expect **RGB**. This conversion prevents color distortion.
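
For a 3-channel image, the BGR → RGB conversion amounts to reversing the channel axis. The NumPy sketch below illustrates that on a single pixel; it is meant to show what the conversion does, not to replace `cv2.cvtColor()`.

```python
import numpy as np

# One pixel that is pure blue in BGR order: (B, G, R) = (255, 0, 0).
pixel_bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)

# Reversing the last axis swaps the first and third channels.
pixel_rgb = pixel_bgr[..., ::-1]

print(pixel_rgb[0, 0].tolist())  # [0, 0, 255]
```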

### Now annotation is also working

In [26]:
annotate_image("test.jpg",detect)

array([[[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       [[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       [[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       ...,

       [[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       [[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       [[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]]

# **LET'S CREATE THE SERVICE LAYER**

In [27]:
import base64
import cv2
import json
from pathlib import Path
from Backend.app.models.yolo_detector import YOLODetector

## **Imports Overview**

- `base64` → Convert images to **base64 strings** for JSON-safe API responses.
- `cv2` → OpenCV for **image processing** (drawing, encoding, color conversion).
- `json` → Load JSON files like `reuse_mapping.json` into Python dicts.
- `Path` from `pathlib` → Modern, **cross-platform file path management**; usually cleaner than `os.path`.
- `YOLODetector` → Our custom class for **running YOLO detections**.

# Python `pathlib` Module (`Path`)

The `pathlib` module provides **classes to handle filesystem paths** in an object-oriented way. It's modern, readable, and safer than using string paths with `os.path`.

---

## 1. Importing Path
```python
from pathlib import Path
```

* `Path` is the main class representing a **file or directory path**.

---

## 2. Creating Paths

```python
p = Path("C:/Users/Shivansh/Documents")  # Absolute path
q = Path("my_folder/my_file.txt")        # Relative path
```

* Handles forward slashes `/` on any OS.

---

## 3. Checking Path Properties

```python
p.exists()   # True if path exists
p.is_file()  # True if path is a file
p.is_dir()   # True if path is a directory
```

---

## 4. Path Operations

### Joining Paths

```python
p = Path("C:/Users/Shivansh")
q = p / "Documents" / "file.txt"
print(q)  # C:\Users\Shivansh\Documents\file.txt on Windows (pathlib prints the OS's native separator)
```

* `/` operator joins paths (cleaner than `os.path.join`).

### Getting parts of a path

```python
print(q.name)     # file.txt
print(q.stem)     # file
print(q.suffix)   # .txt
print(q.parent)   # C:\Users\Shivansh\Documents on Windows
```

---

## 5. Creating Directories or Files

```python
p = Path("new_folder")
p.mkdir(exist_ok=True)  # Create folder if not exists

file_path = p / "my_file.txt"
file_path.touch()        # Create empty file
```

---

## 6. Iterating Over Files

```python
p = Path("C:/Users/Shivansh/Documents")
for file in p.iterdir():
    print(file)
```

### Filtering by extension

```python
for file in p.glob("*.txt"):
    print(file)
```

---

## 7. Reading and Writing Files

```python
file = Path("example.txt")

file.write_text("Hello Python!")   # Write to file
content = file.read_text()         # Read from file
print(content)
```

---

## ✅ Advantages of `pathlib` over `os.path`

1. Object-oriented – paths are objects with methods.
2. Cross-platform safe.
3. Cleaner syntax (`/` operator for joining).
4. Powerful: built-in iteration, globbing, and reading/writing.


#### **_`Path.cwd()` returns the current working directory (here, the notebook's folder); `.parent` then moves one level up._**

In [28]:
Path.cwd().parent / "app" / "data" / "reuse_mapping.json"

WindowsPath('D:/PycharmProjects/Eco_Vision/Backend/app/data/reuse_mapping.json')

In [29]:
data_path = Path.cwd().parent / "app" / "data" / "reuse_mapping.json"
with open(data_path, "r") as f:
    reuse_mapping = json.load(f)
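
Because `Path.cwd()` depends on where the notebook is launched, it can help to fail early with a clear message when the file is missing. The sketch below is one way to do that. `load_mapping` is a hypothetical helper, and the class-name → tip layout of the JSON file is an assumption based on how `reuse_mapping.get()` is used later. The demo writes a throwaway file so it runs without the project layout.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def load_mapping(path: Path) -> Dict[str, str]:
    """Load a class-name -> reuse-tip mapping, failing early with a clear message."""
    if not path.is_file():
        raise FileNotFoundError(f"Mapping file not found: {path}")
    with path.open("r", encoding="utf-8") as f:
        return json.load(f)


# Demo against a temporary file (illustrative content only).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "reuse_mapping.json"
    demo_path.write_text(json.dumps({"bottle": "Reuse bottles for water storage."}), encoding="utf-8")
    mapping = load_mapping(demo_path)

print(mapping.get("cup", "No tip available"))  # No tip available
```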

#### Now we add reuse tips with the function below and return the list of detection dicts plus the annotated image in base64 format

In [37]:
def run_detection(image_path: str, conf: float = 0.5) -> dict:
    # Step 1: get raw detections from YOLO.
    detections = detect_objects(image_path, conf=conf)

    # Step 2: attach a reuse tip to every detection.
    for d in detections:
        cls = d["class_name"]
        d["reuse_tip"] = reuse_mapping.get(cls, "No tip available")

    # Step 3: draw boxes and labels (returns an RGB array).
    annotated_img = annotate_image(image_path, detections)

    # Step 4: convert the annotated image to base64 (for the JSON response).
    _, buffer = cv2.imencode(".jpg", cv2.cvtColor(annotated_img, cv2.COLOR_RGB2BGR))
    img_base64 = base64.b64encode(buffer).decode("utf-8")

    return {
        "detections": detections,
        "annotated_image": img_base64,
        "total_items": len(detections)
    }

#### _Here you can see that `imencode()` returns a tuple._

In [31]:
import numpy as np

# 1. Create a dummy image (RGB)
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[:] = [255, 0, 0]  # fill with red

# 2. Encode the image to JPG in memory
buffer = cv2.imencode(".jpg", cv2.cvtColor(img, cv2.COLOR_RGB2BGR))  # a tuple: (success flag, np.ndarray of encoded bytes)
print(buffer)

(True, array([255, 216, 255, 224,   0,  16,  74,  70,  73,  70,   0,   1,   1,   0,   0,   1,   0,   1,   0,   0, 255, 219,   0,  67,   0,   2,   1,   1,   1,   1,   1,   2,   1,   1,   1,   2,   2,   2,   2,   2,   4,   3,   2,   2,   2,   2,   5,   4,   4,   3,   4,   6,   5,   6,   6,   6,   5,   6,   6,   6,   7,   9,
         8,   6,   7,   9,   7,   6,   6,   8,  11,   8,   9,  10,  10,  10,  10,  10,   6,   8,  11,  12,  11,  10,  12,   9,  10,  10,  10, 255, 219,   0,  67,   1,   2,   2,   2,   2,   2,   2,   5,   3,   3,   5,  10,   7,   6,   7,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,
        10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10,  10, 255, 192,   0,  17,   8,   0, 100,   0, 100,   3,   1,  34,   0,   2,  17,   1,   3,  17,   1, 255, 196,   0,  31,   0,   0,   1,   5,   1,
         1,   1,   1,   1,   1,   0,   0, 

## **Converting Annotated Image to Base64 for JSON**

**Step 1:** Convert RGB → BGR
```python
cv2.cvtColor(annotated_img, cv2.COLOR_RGB2BGR)
```

* `cv2.imencode()` treats the array as **BGR**, so the RGB image is converted back first.
* Without this step, red and blue would be swapped in the encoded JPEG.

---

**Step 2:** Encode as JPEG

```python
_, buffer = cv2.imencode(".jpg", img_bgr)
```

* Converts the image array into **JPEG bytes**.
* `_` → status (ignored), `buffer` → byte array of the image.

---

**Step 3:** Convert bytes → base64 string

```python
img_base64 = base64.b64encode(buffer).decode("utf-8")
```

* `base64.b64encode(buffer)` → bytes → safe for JSON.
* `.decode("utf-8")` → converts bytes to a **normal Python string**.

---

**Why:**

* The frontend can **display the image directly** without saving a file to disk.
* Makes sending images in **API responses easy and safe**.
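
The byte → base64 → byte round trip can be tried with the standard library alone. The sketch below uses the first four bytes of a JPEG (the same values that start the array printed earlier) instead of a full image; the `data_uri` line shows a common way for a frontend to embed the string in an `<img>` tag.

```python
import base64

# First four bytes of a JPEG file: 255, 216, 255, 224.
jpeg_header = bytes([255, 216, 255, 224])

encoded = base64.b64encode(jpeg_header).decode("utf-8")
decoded = base64.b64decode(encoded)

# Data URI form that an <img> tag can use directly as its src.
data_uri = f"data:image/jpeg;base64,{encoded}"

print(encoded)                 # /9j/4A==
print(decoded == jpeg_header)  # True
```

This is why every base64-encoded JPEG, including the one printed in this notebook, starts with `/9j/`.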

In [32]:
# 3. Convert to base64 string
img_base64 = base64.b64encode(buffer[1]).decode("utf-8")
print(img_base64)

/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAIBAQEBAQIBAQECAgICAgQDAgICAgUEBAMEBgUGBgYFBgYGBwkIBgcJBwYGCAsICQoKCgoKBggLDAsKDAkKCgr/2wBDAQICAgICAgUDAwUKBwYHCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgr/wAARCABkAGQDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD4vooor+Uz/fwKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAoo

### **Output**

We get **three things**:
1. **List of dicts** → detections enriched with reuse tips.
2. **Base64 annotated image** → ready for the JSON response.
3. **Total item count** → the number of detections.

In [38]:
run_detection("test.jpg", conf=0.5)


image 1/1 D:\PycharmProjects\Eco_Vision\Backend\ML notebook\test.jpg: 576x640 8 bottles, 225.8ms
Speed: 5.8ms preprocess, 225.8ms inference, 9.6ms postprocess per image at shape (1, 3, 576, 640)


{'detections': [{'class_id': 39,
   'class_name': 'bottle',
   'confidence': {'score': 0.9479, 'percent': 94.8},
   'bbox': [190, 67, 222, 181],
   'reuse_tip': 'Reuse bottles for water storage instead of single-use plastic.'},
  {'class_id': 39,
   'class_name': 'bottle',
   'confidence': {'score': 0.7831, 'percent': 78.3},
   'bbox': [26, 62, 51, 155],
   'reuse_tip': 'Reuse bottles for water storage instead of single-use plastic.'},
  {'class_id': 39,
   'class_name': 'bottle',
   'confidence': {'score': 0.7175, 'percent': 71.7},
   'bbox': [45, 81, 79, 153],
   'reuse_tip': 'Reuse bottles for water storage instead of single-use plastic.'},
  {'class_id': 39,
   'class_name': 'bottle',
   'confidence': {'score': 0.663, 'percent': 66.3},
   'bbox': [26, 62, 51, 127],
   'reuse_tip': 'Reuse bottles for water storage instead of single-use plastic.'},
  {'class_id': 39,
   'class_name': 'bottle',
   'confidence': {'score': 0.6598, 'percent': 66.0},
   'bbox': [145, 59, 169, 156],
   're

### **Base64 Conversion in `detection_service.py`**

```python
_, buffer = cv2.imencode(".jpg", cv2.cvtColor(annotated_img, cv2.COLOR_RGB2BGR))
img_base64 = base64.b64encode(buffer).decode("utf-8")
```

* `annotated_img` → RGB array from `yolo_detector`
* `cv2.cvtColor(..., cv2.COLOR_RGB2BGR)` → convert to BGR for `cv2.imencode()`
* `cv2.imencode(".jpg", ...)` → encode BGR image as JPEG bytes
* `base64.b64encode(buffer)` → bytes → base64 string for JSON

**Key point:**

* Base64 JPEG string **doesn’t care about BGR/RGB**; frontend will display it correctly.
* BGR → RGB conversion only matters when working with **raw NumPy arrays**, not encoded JPEGs.

# **LET'S WORK ON THE API LAYER**

In [43]:
import sys
!{sys.executable} -m pip install fastapi
!{sys.executable} -m pip install "uvicorn[standard]"
!{sys.executable} -m pip install python-multipart

Collecting fastapi
  Using cached fastapi-0.117.1-py3-none-any.whl.metadata (28 kB)
Collecting starlette<0.49.0,>=0.40.0 (from fastapi)
  Using cached starlette-0.48.0-py3-none-any.whl.metadata (6.3 kB)
Collecting pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4 (from fastapi)
  Using cached pydantic-2.11.9-py3-none-any.whl.metadata (68 kB)
Collecting annotated-types>=0.6.0 (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi)
  Using cached annotated_types-0.7.0-py3-none-any.whl.metadata (15 kB)
Collecting pydantic-core==2.33.2 (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi)
  Using cached pydantic_core-2.33.2-cp313-cp313-win_amd64.whl.metadata (6.9 kB)
Collecting typing-inspection>=0.4.0 (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi)
  Using cached typing_inspection-0.4.1-py3-none-any.whl.metadata (2.6 kB)
Using cached fastapi-0.117.1-py3-none-any.whl (95 kB)
Using cached pydantic-2.11.9-py3


[notice] A new release of pip is available: 25.0.1 -> 25.2
[notice] To update, run: D:\PycharmProjects\Eco_Vision\backend\venv\Scripts\python.exe -m pip install --upgrade pip


Collecting httptools>=0.6.3 (from uvicorn[standard])
  Downloading httptools-0.6.4-cp313-cp313-win_amd64.whl.metadata (3.7 kB)
Collecting watchfiles>=0.13 (from uvicorn[standard])
  Downloading watchfiles-1.1.0-cp313-cp313-win_amd64.whl.metadata (5.0 kB)
Collecting websockets>=10.4 (from uvicorn[standard])
  Using cached websockets-15.0.1-cp313-cp313-win_amd64.whl.metadata (7.0 kB)
Downloading httptools-0.6.4-cp313-cp313-win_amd64.whl (87 kB)
Downloading watchfiles-1.1.0-cp313-cp313-win_amd64.whl (292 kB)
Using cached websockets-15.0.1-cp313-cp313-win_amd64.whl (176 kB)
Installing collected packages: websockets, httptools, watchfiles
Successfully installed httptools-0.6.4 watchfiles-1.1.0 websockets-15.0.1





Collecting python-multipart
  Using cached python_multipart-0.0.20-py3-none-any.whl.metadata (1.8 kB)
Using cached python_multipart-0.0.20-py3-none-any.whl (24 kB)
Installing collected packages: python-multipart
Successfully installed python-multipart-0.0.20





## **Imports (API / Detection)**

1. `from pathlib import Path` → Modern, cross-platform **Path object** for file paths. Safer than strings + `os.path.join`.
2. `import base64` → Encode images to **base64 strings** for JSON; frontend can decode to display annotated images.
3. `from fastapi import ...` → FastAPI app, handle uploads (`File`, `UploadFile`), form fields (`Form`), and raise HTTP errors (`HTTPException`).
4. `from fastapi.middleware.cors import CORSMiddleware` → Handle **CORS** for frontend requests from different origins.
5. `from fastapi.responses import JSONResponse` → Return structured JSON responses.
6. `import cv2` → OpenCV for image processing: annotate images, convert RGB↔BGR, encode JPEG.
7. `import os` → Environment variables and OS operations.
8. `from Backend.app.services.detection_service import DetectionService` → Custom service wrapping YOLO, reuse tips, and image annotation.
9. `import tempfile` → Temporary files/directories; safer than saving uploads in project folder.

In [44]:
from pathlib import Path
import base64
from fastapi import FastAPI, File, UploadFile, Form, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
import cv2
import os
from Backend.app.services.detection_service import DetectionService
import tempfile

In [45]:
# Initialize FastAPI app
app = FastAPI(title="Reusable Item Detector API")

**Explanation:**

1. **Allowed Origins:**

```python
allowed_origins = [origin.strip() for origin in os.getenv("ALLOWED_ORIGINS", "http://localhost:3000").split(",")]
```

* Reads `ALLOWED_ORIGINS` environment variable (e.g., `"http://localhost:3000,http://example.com"`).
* Splits into a **list of domains** and strips extra spaces.
* Only these domains can call your API.

2. **Add Middleware:**

```python
app.add_middleware(
    CORSMiddleware,
    allow_origins=allowed_origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"]
)
```

* Adds **CORS support** to FastAPI.
* `allow_credentials=True` → allows cookies/auth headers.
* `allow_methods=["*"]` → all HTTP methods allowed.
* `allow_headers=["*"]` → all request headers allowed.
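
One detail of the origin parsing: a trailing comma or a doubled comma in `ALLOWED_ORIGINS` would leave an empty string in the list. A slightly more defensive variant is sketched below; `parse_origins` is a hypothetical helper, not part of the original code.

```python
from typing import List


def parse_origins(raw: str) -> List[str]:
    """Split a comma-separated origin list, trimming spaces and dropping empty entries."""
    return [part.strip() for part in raw.split(",") if part.strip()]


print(parse_origins("http://localhost:3000, http://example.com,"))
# ['http://localhost:3000', 'http://example.com']
```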

In [46]:
allowed_origins = (
    [origin.strip() for origin in (os.getenv("ALLOWED_ORIGINS", "http://localhost:3000").split(","))]
)

In [47]:
app.add_middleware(
    CORSMiddleware,
    allow_origins=allowed_origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"]
)

In [48]:
# Initialize service layers
detection_service = DetectionService()

[YOLODetector] Loaded model: yolo12n.pt on cpu


In [49]:
# Health check endpoint

@app.get("/health")
async def health():
    """
    Simple health check endpoint.
    """
    return {"status": "healthy"}

`async def health()` → declares the handler as a coroutine, so FastAPI runs it on the event loop and can serve other requests while one handler is waiting on I/O.
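
The effect of `async` can be seen without a web server. In the sketch below, two stand-in handlers (hypothetical, with `asyncio.sleep` simulating I/O) are started together, and the shorter one finishes first even though it was started second.

```python
import asyncio
from typing import List

events: List[str] = []


async def handler(name: str, delay: float) -> str:
    """Stand-in for an endpoint that waits on I/O (simulated with asyncio.sleep)."""
    events.append(f"{name} start")
    await asyncio.sleep(delay)  # control returns to the event loop here
    events.append(f"{name} end")
    return name


async def main() -> List[str]:
    # Both "requests" are in flight at once; neither blocks the other.
    return await asyncio.gather(handler("a", 0.05), handler("b", 0.01))


results = asyncio.run(main())
print(events)  # ['a start', 'b start', 'b end', 'a end']
```

Inside a notebook cell, where an event loop is already running, use `await main()` instead of `asyncio.run(main())`.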

In [51]:
# Detection endpoint

@app.post("/detect")
async def detect(
    file: UploadFile = File(...),
    confidence: float = Form(0.5)
):
    """
    Upload an image, run detection and return JSON response with detections and annotated image.
    """

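    #  Validate uploaded file (content_type may be missing, so fall back to "")
    if not (file.content_type or "").startswith("image/"):
        raise HTTPException(status_code=400, detail="Uploaded file must be an image")

    #  Save uploaded file temporarily using tempfile
    temp_dir = Path(tempfile.gettempdir())
    #  Keep only the base name so a crafted filename cannot point outside the temp folder
    safe_name = Path(file.filename or "upload").name
    temp_path = temp_dir / f"tmp_{safe_name}"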
    try:
        with temp_path.open("wb") as f:
            f.write(await file.read())

        #  Run detection
        result = detection_service.run_detection(str(temp_path), confidence)

        #  Convert annotated image to base64 (if available)
        annotated_b64 = None
        if result.get("annotated_image") is not None:
            # OpenCV expects BGR for encoding
            bgr_img = cv2.cvtColor(result["annotated_image"], cv2.COLOR_RGB2BGR)
            _, buf = cv2.imencode(".jpg", bgr_img)
            annotated_b64 = base64.b64encode(buf).decode("utf-8")

        #  Build JSON response
        resp = {
            "success": True,
            "detections": result.get("detections", []),
            "annotated_image": annotated_b64,
            "total_items": result.get("total_items", 0) #defaults to 0 if key is missing.
        }

        return JSONResponse(content=resp)

    finally:
        # Cleanup temporary file
        if temp_path.exists():
            temp_path.unlink()

**Parameters:**

   * `file: UploadFile = File(...)` → required uploaded image file with access to name, type, content.
   * `confidence: float = Form(0.5)` → optional threshold for detection confidence.

**Temporary saving:**

* `temp_dir = Path(tempfile.gettempdir())` → OS temp folder.
* `temp_path` → path inside that folder, named after the uploaded file with a `tmp_` prefix.
* `with temp_path.open("wb") as f: f.write(await file.read())` → reads the upload asynchronously, then writes the bytes to disk.
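
Naming the temp file after the upload means two uploads with the same name would share one path. One alternative, sketched here with the standard library, is to let `tempfile` generate a unique name and keep only the extension from the client. `save_upload` is a hypothetical helper, not part of the original endpoint.

```python
import tempfile
from pathlib import Path


def save_upload(data: bytes, filename: str) -> Path:
    """Write uploaded bytes to a uniquely named temp file and return its path."""
    # Keep only the extension of the client-supplied name; the rest is generated.
    suffix = Path(filename).suffix or ".bin"
    with tempfile.NamedTemporaryFile(delete=False, prefix="upload_", suffix=suffix) as tmp:
        tmp.write(data)
        return Path(tmp.name)


path = save_upload(b"fake image bytes", "../../photo.jpg")
try:
    content = path.read_bytes()
    print(path.suffix)  # .jpg
finally:
    path.unlink()  # same cleanup step as in the endpoint
```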

**Return JSON Response**

* Converts Python dict `resp` into **JSON**.
* Sends it to the client with `Content-Type: application/json`.
* FastAPI automatically sets **HTTP 200 OK** for successful responses.


**Temporary File Cleanup**

**Delete file:** `temp_path.unlink()` runs in the `finally` block, so the temporary file is removed even when detection raises an error.