**Feature Extraction:** https://github.com/christiansafka/img2vec

# **img2vec_pytorch**

### **1. The Core Concept**

`img2vec` is a tool that translates an **Image** (pixels) into a **Vector** (a list of numbers).

* **Input:** A raw image (e.g., a photo of a sunrise).
* **The Black Box:** A pre-trained Deep Learning model (ResNet-18 by default) that already knows how to "see" shapes, colors, and objects.
* **Output:** A feature vector (e.g., a list of 512 numbers) that represents the *content* of that image.

### **2. Why use it?**

Standard Machine Learning models (like Random Forest or SVM) struggle with raw images, because to them an image is just a massive grid of pixel values with no notion of shapes or objects.

* **Without img2vec:** You have to train a Convolutional Neural Network (CNN) end to end, or fine-tune one with a tool such as YOLO or Teachable Machine.
* **With img2vec:** You extract the "smart" features in a single forward pass and feed them into a simple, fast classifier like Random Forest. It essentially lets you get close to "Deep Learning" results with "Standard ML" speed and simplicity.

### **3. The Workflow (As seen in the tutorial)**

1. **Load Image:** Open the file using PIL (Python Imaging Library).
2. **Extract:** Pass the image to `img2vec.get_vec()`. This runs the image through the pre-trained neural network and captures the output before the final classification layer.
3. **Train:** Train a standard classifier (Random Forest) on these vectors.

### **4. The Code Cheatsheet**

Here is the essential code pattern extracted from the video:

```python
# 1. Setup
from img2vec_pytorch import Img2Vec  # the class is named Img2Vec, not Image2Vec
from PIL import Image

# Initialize the extractor (downloads pre-trained ResNet-18 weights on first use)
img2vec = Img2Vec()

# 2. Convert Image to Vector
# convert('RGB') guards against grayscale or RGBA files
img = Image.open('./weather_dataset/rain/rain1.jpg').convert('RGB')
feature_vector = img2vec.get_vec(img)

# feature_vector is now a NumPy array of 512 numbers representing the image
# You can now feed this into sklearn models

```

### **5. Performance Context**

In the video, this method achieved **94.4% accuracy** on the validation split of the weather dataset. This suggests that using `img2vec` to extract features and a simple Random Forest to classify them can be a highly effective strategy for standard image classification tasks.

---
---

# **Notes**

**Project Overview and Setup**

* **Goal:** Build an image classifier that utilizes feature extraction to categorize images.
* **Dataset:** Uses a weather dataset with four categories: `cloudy`, `rain`, `shine`, and `sunrise`.
    * The data is split into `train` (training) and `val` (validation) directories.
    * This is the same dataset used in previous YOLOv8 and Teachable Machine tutorials.


* **Environment:** The project is created in PyCharm using a virtual environment with Python 3.8.
* **Requirements:**
    * `img2vec_pytorch`: A library for feature extraction using pre-trained models.
    * `scikit-learn`: For the classifier (Random Forest) and metrics.
    * `pillow` (PIL): For image handling.



**Feature Extraction Implementation (`main.py`)**

* **Initialization:** Import `Img2Vec` from `img2vec_pytorch` and create a feature extractor object.
* **Data Preparation:**
    * Define paths for `train` and `val` directories.
    * Iterate through both directories and their subcategories (labels).
    * Load each image using `Image.open()` from PIL.
    * Extract features from the image using `img2vec.get_vec(image)`.
    * Append features and corresponding labels (directory names) to lists.


* **Data Organization:** Store training and validation data (features and labels) into a dictionary for easy access.
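
The data-preparation steps above can be sketched roughly as follows. This is a minimal sketch, not the tutorial's exact code: `load_split` and `build_dataset` are hypothetical helper names, the dictionary keys are an assumed naming scheme, and `extract` is an assumed callable that maps an image path to a feature vector. The directory layout (`<root>/<split>/<label>/<file>`) is inferred from the notes.

```python
import os


def load_split(split_dir, extract):
    """Walk split_dir/<label>/<image> and return (features, labels).

    `extract` is any callable that maps an image path to a feature vector.
    """
    features, labels = [], []
    for label in sorted(os.listdir(split_dir)):
        label_dir = os.path.join(split_dir, label)
        if not os.path.isdir(label_dir):
            continue  # skip stray files next to the category folders
        for name in sorted(os.listdir(label_dir)):
            features.append(extract(os.path.join(label_dir, name)))
            labels.append(label)  # the directory name is the class label
    return features, labels


def build_dataset(root, extract, splits=("train", "val")):
    """Return a dict with '<split>_data' and '<split>_labels' for each split."""
    data = {}
    for split in splits:
        feats, labs = load_split(os.path.join(root, split), extract)
        data[f"{split}_data"] = feats
        data[f"{split}_labels"] = labs
    return data


# Hypothetical wiring to img2vec (requires img2vec_pytorch and pillow):
#   from img2vec_pytorch import Img2Vec
#   from PIL import Image
#   img2vec = Img2Vec()
#   extract = lambda path: img2vec.get_vec(Image.open(path).convert("RGB"))
#   data = build_dataset("./weather_dataset", extract)
```

Passing the extractor in as an argument keeps the directory walk independent of the model, so the same loop can be checked with a dummy extractor before the slower ResNet pass is run.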

**Model Training and Evaluation**

* **Classifier:** Uses `RandomForestClassifier` from `sklearn.ensemble`.
    * Note: While Random Forest is used for its robustness and ease of use, other classifiers from scikit-learn could also be used.


* **Training:** Initialize the model and train it using `model.fit()` with the training features and training labels.
* **Testing:**
    * Generate predictions on the unseen validation set using `model.predict()`.
    * Calculate accuracy using `accuracy_score` from `sklearn.metrics` comparing predictions to validation labels.


* **Results:** The model achieved an accuracy of 94.4%.
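
A minimal sketch of the train-and-evaluate step is shown below. To keep it runnable without the dataset, it assumes synthetic 512-dimensional vectors in place of real `get_vec()` output and only two of the four classes. The `fake_features` helper is hypothetical, and the score it prints has nothing to do with the 94.4% reported in the video.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-ins for img2vec output: 512-dimensional vectors, two classes
# drawn around different means. Real features would come from get_vec().
rng = np.random.default_rng(0)


def fake_features(n_per_class):
    cloudy = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 512))
    rain = rng.normal(loc=3.0, scale=1.0, size=(n_per_class, 512))
    data = np.vstack([cloudy, rain])
    labels = ["cloudy"] * n_per_class + ["rain"] * n_per_class
    return data, labels


train_data, train_labels = fake_features(40)
val_data, val_labels = fake_features(10)

# Train on the training features, then score on the held-out validation set.
model = RandomForestClassifier(random_state=0)
model.fit(train_data, train_labels)

predictions = model.predict(val_data)
score = accuracy_score(val_labels, predictions)
print(f"validation accuracy: {score:.3f}")
```

Because the classifier only ever sees fixed-length vectors, swapping `RandomForestClassifier` for another scikit-learn estimator should only require changing the line that constructs `model`.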

**Saving the Model**

* **Method:** Use the `pickle` library to serialize and save the trained model to a file named `model.p`.
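
One way to write the save step, and the matching load, is sketched here. `save_model` and `load_model` are hypothetical helper names; only the `model.p` filename comes from the tutorial.

```python
import pickle


def save_model(model, path="model.p"):
    """Serialize a trained model (any picklable object) to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path="model.p"):
    """Load a model saved by save_model. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Note that the pickle stores only the classifier. The `img2vec` extractor is not part of it and has to be re-created wherever the model is loaded.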

**Inference (`infer.py`)**

* **Setup:** Create a separate script to load the saved model and predict new images.
* **Process:**
    1. Initialize the `Img2Vec` extractor.
    2. Load a target image (e.g., `cloudy4.jpg`) using PIL.
    3. Extract features from the new image using `get_vec`.
    4. Load the trained model from `model.p` using `pickle.load()`.
    5. Predict the category using `model.predict()` with the extracted features.


* **Result:** The script correctly predicted the category "cloudy" for the test image.
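
The inference steps can be condensed into a small function, sketched below. `classify_image` is a hypothetical name, and both the extractor and the model are passed in as arguments, so this is an assumed structure rather than the tutorial's actual `infer.py`.

```python
def classify_image(image_path, extract, model):
    """Return the predicted label for one image.

    `extract` maps an image path to a 1-D feature vector and `model` is any
    object with a scikit-learn style predict() method.
    """
    features = extract(image_path)
    # predict() expects a 2-D input (one row per sample), so wrap the vector.
    return model.predict([features])[0]


# Hypothetical wiring (requires img2vec_pytorch, pillow and a saved model.p):
#   import pickle
#   from img2vec_pytorch import Img2Vec
#   from PIL import Image
#   img2vec = Img2Vec()
#   extract = lambda path: img2vec.get_vec(Image.open(path).convert("RGB"))
#   with open("model.p", "rb") as f:
#       model = pickle.load(f)
#   print(classify_image("cloudy4.jpg", extract, model))
```

The detail that is easiest to miss is the list wrapped around `features`: scikit-learn estimators predict on a batch, so a single vector has to be passed as a batch of one.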

---
---

# **SVC only vs. SVC with feature extraction first**

The difference is massive. It is the difference between **memorizing pixels** versus **understanding content**.

Here is the breakdown of why the "Feature Extraction" method (Current Project) destroys the "Raw Pixel" method (Previous Parking Project) on real-world data.

### 1. The "Raw Pixel" Problem (Previous Project)

In your parking slot project, you resized images to **15x15** pixels and flattened them.

* **How it works:** The model looks at pixel #1 and asks "Is it gray?" It looks at pixel #50 and asks "Is it red?"
* **The Flaw:** It is **position dependent**. If you take a picture of a car, and then shift the camera 2 inches to the left, the object lands on different pixels and **nearly every pixel value changes**. The model will likely fail because it hasn't memorized this specific arrangement of pixels.
* **Why it worked before:** The parking camera was **fixed**. The cars were always in the exact same spot, at the exact same size.
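
The position dependence can be illustrated with a toy example. The sketch below assumes a synthetic 15x15 image containing a bright square (not real parking data) and shows that, after a small shift, the flattened vectors no longer share any bright pixel.

```python
import numpy as np

# A 15x15 "image" with a bright 5x5 square, standing in for a car.
image = np.zeros((15, 15))
image[5:10, 2:7] = 1.0

# The same scene with the object moved 5 pixels to the right.
shifted = np.roll(image, shift=5, axis=1)

# Flatten both, as a raw-pixel classifier would see them.
a, b = image.flatten(), shifted.flatten()

changed = int(np.sum(a != b))
overlap = float(a @ b)  # number of bright pixels the two vectors share
print(f"pixels that changed: {changed} of {a.size}")
print(f"bright pixels in common: {overlap:.0f} of {int(a.sum())}")
```

To a classifier trained on flattened pixels, these two vectors look like unrelated inputs even though they show the same object.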

### 2. The "Feature Extraction" Advantage (Current Project)

In this project, `img2vec` uses a pre-trained brain (ResNet-18) to look at the image *before* the classifier sees it.

* **How it works:** The neural network cares far less about exact pixel positions. It scans the image with learned filters, pools the results, and outputs a summary like: *"I see a sharp edge, a metallic texture, and a cylinder shape."*
* **The Vector:** That list of 512 numbers you get isn't pixels; it's a list of **concepts** (textures, shapes, patterns).
* **The Result:** If you take a picture of a "Glass Bottle" and rotate it slightly, zoom in, or move it to the corner, the **concepts** (glass texture, bottle shape) stay largely the same. The Random Forest classifies these concepts, not the pixels.
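
Part of this robustness comes from the global average pooling at the end of ResNet-18, which is the layer `img2vec` reads from by default. The sketch below is a deliberately tiny stand-in, not ResNet: one hand-written edge filter followed by a global average, with wrap-around borders assumed so that the invariance to shifts is exact. In a real network it is only approximate, and this argument does not cover rotation or zoom.

```python
import numpy as np


def edge_response(img):
    """Horizontal-gradient filter with wrap-around borders (a toy 'conv layer')."""
    return np.abs(np.roll(img, -1, axis=1) - img)


def pooled_feature(img):
    """Global average pooling: one number summarizing the whole response map."""
    return float(edge_response(img).mean())


# Same toy scene as a raw-pixel model would see it: a bright square, then shifted.
image = np.zeros((15, 15))
image[5:10, 2:7] = 1.0
shifted = np.roll(image, shift=5, axis=1)

raw_distance = float(np.linalg.norm(image.flatten() - shifted.flatten()))
pooled_gap = abs(pooled_feature(image) - pooled_feature(shifted))
print(f"raw-pixel distance: {raw_distance:.2f}")
print(f"pooled-feature gap: {pooled_gap:.2f}")
```

The filter responds to the square's edges wherever they are, and averaging over the whole map discards the location, so the pooled value is the same for both images while the raw vectors are far apart.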

### Summary Comparison

| Feature | Raw Pixels (Old Method) | Feature Extraction (New Method) |
| --- | --- | --- |
| **Input** | Raw colors (Red=255, Green=0...) | Abstract concepts (Shape, Texture, Edges) |
| **Rotation/Movement** | Fails easily (Fragile) | Largely unaffected (Robust) |
| **Image Size** | Must be tiny (15x15) to run fast | Can be any size; img2vec resizes it to 224x224 before the network sees it |
| **Intelligence** | "There is a red dot at coordinate 10,10" | "There is a shiny object in the image" |

**In short:**

* **Method 1 (Raw Pixels):** Like trying to identify a book by measuring exactly where the ink dots are on the page.
* **Method 2 (Feature Extraction):** Like identifying a book by reading the plot summary.