# **Product Recognition of Food Products**

## Image Processing and Computer Vision - Assignment Module \#1


Contacts:

- Prof. Giuseppe Lisanti -> giuseppe.lisanti@unibo.it
- Prof. Samuele Salti -> samuele.salti@unibo.it
- Alex Costanzino -> alex.costanzino@unibo.it
- Francesco Ballerini -> francesco.ballerini4@unibo.it


Computer vision-based object detection techniques can be applied in supermarket settings to build a system that identifies products on store shelves.
Given an image of a shelf, such a system could assist visually impaired customers or automate common store management tasks, such as detecting low-stock or misplaced products.

## Task
Develop a computer vision system that, given a reference image for each product, identifies that product in a single picture of a store shelf.

<figure>
<a href="https://imgbb.com/">
  <center>
  <img src="https://i.ibb.co/TwkMWnH/Screenshot-2024-04-04-at-14-54-51.png" alt="Screenshot-2024-04-04-at-14-54-51" border="0" width="300" />
  </center>
</a>
</figure>

For each type of product displayed on the shelf, the system should report:
1. Number of instances;
1. Dimensions of each instance (width and height, in pixels, of the bounding box that encloses it);
1. Position of each instance in the image reference system (center of the bounding box that encloses it).

#### Example of expected output
```
Product 0 - 2 instance found:
  Instance 1 {position: (256, 328), width: 57px, height: 80px}
  Instance 2 {position: (311, 328), width: 57px, height: 80px}
Product 1 - 1 instance found:
.
.
.
```
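The report format above can be produced directly from a list of bounding boxes. The following is a minimal sketch, assuming each detection is a hypothetical `(x, y, w, h)` tuple with `(x, y)` the top-left corner in pixels; the name `format_report` and the input layout are illustrative and not part of the assignment.

```python
def format_report(detections):
    """Format detections in the report layout shown above.

    `detections` maps a product id to a list of (x, y, w, h) boxes,
    where (x, y) is the top-left corner (an assumed convention).
    """
    lines = []
    for product_id in sorted(detections):
        boxes = detections[product_id]
        lines.append(f"Product {product_id} - {len(boxes)} instance found:")
        for k, (x, y, w, h) in enumerate(boxes, start=1):
            # The reported position is the center of the bounding box
            cx, cy = x + w // 2, y + h // 2
            lines.append(
                f"  Instance {k} {{position: ({cx}, {cy}), width: {w}px, height: {h}px}}"
            )
    return "\n".join(lines)


print(format_report({0: [(228, 288, 57, 80), (283, 288, 57, 80)]}))
```

With these two boxes the sketch reproduces the first three lines of the example report.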

### Track A - Single Instance Detection
Develop an object detection system that identifies single instances of products, given one reference image for each item and a scene image.

The system should correctly identify all the products in the shelf image.
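A common approach to this track is local-feature matching (e.g. SIFT) followed by a homography estimate. Once a 3x3 homography `H` from the reference image to the scene is available, the bounding box follows from projecting the four reference corners. The sketch below uses NumPy only and assumes `H` has already been estimated (for instance with `cv2.findHomography`); `bbox_from_homography` is an illustrative name.

```python
import numpy as np


def bbox_from_homography(H, ref_width, ref_height):
    """Project the reference-image corners with H and return
    (center_x, center_y, width, height) of the axis-aligned box."""
    corners = np.array(
        [[0, 0, 1],
         [ref_width, 0, 1],
         [ref_width, ref_height, 1],
         [0, ref_height, 1]],
        dtype=float,
    )
    projected = corners @ H.T                         # homogeneous scene coordinates
    projected = projected[:, :2] / projected[:, 2:3]  # back to pixel coordinates
    x_min, y_min = projected.min(axis=0)
    x_max, y_max = projected.max(axis=0)
    width, height = x_max - x_min, y_max - y_min
    return x_min + width / 2, y_min + height / 2, width, height


# Example: a pure scale-and-shift homography (scale 0.5, shift (100, 200))
H = np.array([[0.5, 0.0, 100.0],
              [0.0, 0.5, 200.0],
              [0.0, 0.0, 1.0]])
print(bbox_from_homography(H, ref_width=114, ref_height=160))
```

The axis-aligned box is a deliberate simplification: if the product appears rotated in the scene, the box is larger than the product itself.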

### Track B - Multiple Instances Detection
In addition to what is achieved in Track A, the system should also detect multiple instances of the same product.
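One possible way to separate several instances of the same product is to let every feature match vote for the position of the product center in the scene, then look for clusters of votes (the idea behind the generalized Hough transform). The sketch below is a simplified illustration, not a prescribed solution: it ignores rotation, assumes a known scale, and uses a coarse grid instead of a proper accumulator. The names `center_votes` and `find_instance_centers` and all parameter values are assumptions.

```python
import numpy as np


def center_votes(ref_pts, scene_pts, ref_center, scale=1.0):
    """Each match votes for an instance center: the scene keypoint plus the
    scaled offset from the reference keypoint to the reference center.
    Rotation is ignored (assumes products stand upright on the shelf)."""
    ref_pts = np.asarray(ref_pts, dtype=float)
    scene_pts = np.asarray(scene_pts, dtype=float)
    return scene_pts + scale * (np.asarray(ref_center, dtype=float) - ref_pts)


def find_instance_centers(votes, bin_size=20, min_votes=5):
    """Group votes into coarse bins and return the mean vote of every bin
    that holds at least `min_votes` votes."""
    votes = np.asarray(votes, dtype=float)
    bins = np.floor(votes / bin_size).astype(int)
    centers = []
    for b in np.unique(bins, axis=0):
        in_bin = np.all(bins == b, axis=1)
        if in_bin.sum() >= min_votes:
            centers.append(votes[in_bin].mean(axis=0))
    return centers


# Synthetic example: two instances of one product plus two false matches
ref_center = (50, 80)
ref_pts = np.array([(10, 10), (90, 150), (30, 60), (70, 20), (50, 120), (20, 140)])
scene_a = ref_pts - ref_center + (250, 330)   # instance centered at (250, 330)
scene_b = ref_pts - ref_center + (310, 330)   # instance centered at (310, 330)
false_ref = np.array([(10, 10), (90, 150)])
false_scene = np.array([(400, 40), (15, 500)])

votes = center_votes(
    np.vstack([ref_pts, ref_pts, false_ref]),
    np.vstack([scene_a, scene_b, false_scene]),
    ref_center,
)
print(find_instance_centers(votes))
```

A fixed grid can split a cluster that straddles a bin border, so a real implementation would likely smooth the accumulator or use a clustering method, and would then fit one homography per cluster.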

## Data
Two folders of images are provided:
* **Models**: contains one reference image for each product that the system should be able to identify.
* **Scenes**: contains different shelf pictures to test the developed algorithm in different scenarios. The images in this folder are corrupted by noise.
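Because the scene images are noisy, a denoising step before feature extraction is worth considering. The assignment does not state the type of noise, so the sketch below assumes impulse (salt-and-pepper) noise purely for illustration and implements a 3x3 median filter in NumPy. In practice OpenCV's `cv2.medianBlur` performs the same operation, and `cv2.bilateralFilter` or `cv2.fastNlMeansDenoising` may suit Gaussian-like noise better.

```python
import numpy as np


def median_filter_3x3(image):
    """3x3 median filter for a 2-D grayscale image (borders are replicated)."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    # Stack the 9 shifted views of the image and take the per-pixel median
    windows = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    )
    return np.median(windows, axis=0).astype(image.dtype)


# A flat gray patch with one "salt" and one "pepper" pixel
patch = np.full((5, 5), 120, dtype=np.uint8)
patch[1, 1], patch[3, 2] = 255, 0
print(median_filter_3x3(patch))
```

Inspecting a few scenes before choosing the filter and its window size is advisable, since an overly strong filter also removes the fine texture that feature detectors rely on.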

#### Track A - Single Instance Detection
* **Models**: {ref1.png to ref14.png}.
* **Scenes**: {scene1.png to scene5.png}.

#### Track B - Multiple Instances Detection
* **Models**: {ref15.png to ref27.png}.
* **Scenes**: {scene6.png to scene12.png}.
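The file ranges above can be encoded once so that each track is evaluated only on its own models and scenes. This is a minimal sketch; the `dataset/models` and `dataset/scenes` folder layout is taken from the archive listing, while `TRACKS` and `track_files` are illustrative names.

```python
import os

# Index ranges for each track, as listed above (upper bounds are exclusive)
TRACKS = {
    "A": {"models": range(1, 15), "scenes": range(1, 6)},
    "B": {"models": range(15, 28), "scenes": range(6, 13)},
}


def track_files(track, root="dataset"):
    """Return (model_paths, scene_paths) for track "A" or "B"."""
    spec = TRACKS[track]
    models = [os.path.join(root, "models", f"ref{i}.png") for i in spec["models"]]
    scenes = [os.path.join(root, "scenes", f"scene{i}.png") for i in spec["scenes"]]
    return models, scenes


models_a, scenes_a = track_files("A")
models_b, scenes_b = track_files("B")
print(len(models_a), len(scenes_a), len(models_b), len(scenes_b))
```

Building the paths from explicit indices also avoids relying on the order returned by `os.listdir`, which is not guaranteed to be sorted.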

In [2]:
from google.colab import drive
drive.mount('/content/drive')

!cp -r /content/drive/MyDrive/AssignmentsIPCV/dataset.zip ./
!unzip dataset.zip

Mounted at /content/drive
Archive:  dataset.zip
   creating: dataset/
  inflating: __MACOSX/._dataset      
   creating: dataset/scenes/
  inflating: __MACOSX/dataset/._scenes  
  inflating: dataset/.DS_Store       
  inflating: __MACOSX/dataset/._.DS_Store  
   creating: dataset/models/
  inflating: __MACOSX/dataset/._models  
  inflating: dataset/scenes/scene12.png  
  inflating: __MACOSX/dataset/scenes/._scene12.png  
  inflating: dataset/scenes/scene10.png  
  inflating: __MACOSX/dataset/scenes/._scene10.png  
  inflating: dataset/scenes/scene11.png  
  inflating: __MACOSX/dataset/scenes/._scene11.png  
  inflating: dataset/scenes/scene5.png  
  inflating: __MACOSX/dataset/scenes/._scene5.png  
  inflating: dataset/scenes/scene4.png  
  inflating: __MACOSX/dataset/scenes/._scene4.png  
  inflating: dataset/scenes/scene6.png  
  inflating: __MACOSX/dataset/scenes/._scene6.png  
  inflating: dataset/scenes/scene7.png  
  inflating: __MACOSX/dataset/scenes/._scene7.png  
  inflating: 

## Evaluation criteria
1. **Procedural correctness**. There are several ways to solve the assignment. Design your own sound approach and justify every decision you make;

2. **Clarity and conciseness**. Present your work in a readable way: format your code and comment every important step;

3. **Correctness of results**. Try to solve as many instances as possible. You should be able to solve all the instances of the assignment; however, a thoroughly justified and sound procedure with a lower number of solved instances will be valued **more** than a poorly designed approach.

In [3]:
import cv2
import os
import numpy as np
import zipfile

# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

# Path to the ZIP file
zip_file_path = '/content/drive/My Drive/dataset.zip'
# Extraction directory
extracted_folder = '/content/dataset/'

# Extract the ZIP file
with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
    zip_ref.extractall(extracted_folder)

# Path to the folder containing the reference images
models_folder = os.path.join(extracted_folder, 'models')
# Path to the folder containing the scene images
scenes_folder = os.path.join(extracted_folder, 'scenes')

# Function to extract SIFT features from an image
def extract_sift_features(image_path):
    # Read the image in grayscale
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    # Initialize the SIFT detector
    sift = cv2.SIFT_create()
    # Detect and describe features
    keypoints, descriptors = sift.detectAndCompute(image, None)
    return keypoints, descriptors

# Function to perform feature matching between two sets of descriptors
def match_features(descriptors1, descriptors2):
    # Initialize the brute force matcher
    bf = cv2.BFMatcher()
    # Perform descriptor matching
    matches = bf.knnMatch(descriptors1, descriptors2, k=2)
    # Apply ratio test to get the best matches
    good_matches = []
    for m, n in matches:
        if m.distance < 0.75 * n.distance:
            good_matches.append(m)
    return good_matches

# Minimum number of good matches required before estimating a homography
MIN_MATCH_COUNT = 10

# Function to calculate the position and size of one object instance in the scene
def calculate_object_position(good_matches, scene_keypoints, ref_keypoints, ref_shape):
    # A homography needs at least 4 correspondences; require more for robustness
    if len(good_matches) < MIN_MATCH_COUNT:
        return None
    # match_features is called with the scene as query and the reference as train
    scene_pts = np.float32([scene_keypoints[m.queryIdx].pt for m in good_matches]).reshape(-1, 1, 2)
    ref_pts = np.float32([ref_keypoints[m.trainIdx].pt for m in good_matches]).reshape(-1, 1, 2)
    # Estimate the homography that maps the reference image onto the scene
    H, _ = cv2.findHomography(ref_pts, scene_pts, cv2.RANSAC, 5.0)
    if H is None:
        return None
    # Project the four corners of the reference image into the scene
    ref_h, ref_w = ref_shape
    ref_corners = np.float32([[0, 0], [0, ref_h - 1], [ref_w - 1, ref_h - 1], [ref_w - 1, 0]]).reshape(-1, 1, 2)
    transformed_corners = cv2.perspectiveTransform(ref_corners, H)
    # Calculate the center and size of the bounding box enclosing the instance
    x, y, w, h = cv2.boundingRect(transformed_corners)
    return x + w / 2, y + h / 2, w, h

# Function to list the PNG files of a folder in numeric order (ref1, ref2, ..., ref10, ...)
def list_png_files(folder):
    files = [f for f in os.listdir(folder) if f.endswith('.png')]
    return sorted(files, key=lambda f: int(''.join(ch for ch in f if ch.isdigit()) or 0))

# Lists to store the features and sizes of all reference images
all_keypoints = []
all_descriptors = []
all_shapes = []

# Iterate over all images in the "models" folder
for filename in list_png_files(models_folder):
    image_path = os.path.join(models_folder, filename)
    # Extract SIFT features from the image
    keypoints, descriptors = extract_sift_features(image_path)
    # Add the features to the lists
    all_keypoints.append(keypoints)
    all_descriptors.append(descriptors)
    # Store (height, width) of the reference image to project its corners later
    all_shapes.append(cv2.imread(image_path, cv2.IMREAD_GRAYSCALE).shape[:2])

# List to store (scene name, object positions) for all scenes
all_object_positions = []

# Iterate over all images in the "scenes" folder
for filename in list_png_files(scenes_folder):
    scene_path = os.path.join(scenes_folder, filename)
    # Extract SIFT features from the scene image
    scene_keypoints, scene_descriptors = extract_sift_features(scene_path)
    # List to store the objects found in this scene
    object_positions = []
    # Iterate over the features of the reference images
    for ref_idx, ref_descriptors in enumerate(all_descriptors):
        # Skip images in which no features were detected
        if scene_descriptors is None or ref_descriptors is None:
            continue
        # Perform feature matching between scene features and reference features
        good_matches = match_features(scene_descriptors, ref_descriptors)
        # Locate the reference product in the scene, if enough matches agree
        position = calculate_object_position(good_matches, scene_keypoints,
                                             all_keypoints[ref_idx], all_shapes[ref_idx])
        if position is not None:
            object_positions.append((ref_idx, *position))
    # Add object positions for this scene to the general list
    all_object_positions.append((filename, object_positions))

# Print object positions for each scene
for scene_name, object_positions in all_object_positions:
    print(f"{scene_name}:")
    for ref_idx, center_x, center_y, width, height in object_positions:
        print(f"  Product {ref_idx} - 1 instance found:")
        print(f"    Instance 1 {{position: ({int(center_x)}, {int(center_y)}), width: {width}px, height: {height}px}}")