# Imports
The code imports the modules and packages needed for the tasks that follow:

- `os`: provides access to operating-system-dependent functionality, such as reading from and writing to the file system.
- `cv2`: the OpenCV Python bindings, which provide computer vision functionality.
- `sklearn.cluster.MiniBatchKMeans`: a mini-batch variant of the KMeans clustering algorithm, which splits a dataset into k clusters while fitting on small batches of the data.
- `tqdm`: wraps an iterable and displays a progress bar while it is iterated.
- `transformers.DetrImageProcessor`: the Hugging Face Transformers image processor for DETR (DEtection TRansformer), which prepares images for the model and post-processes its outputs.
- `transformers.DetrForObjectDetection`: the Hugging Face Transformers implementation of the DETR object detection model.
- `torch`: PyTorch, a deep learning framework used for building and training neural networks.
- `PIL`: the Python Imaging Library (installed through the Pillow package), which adds image processing capabilities to Python.
- `dotenv`: loads environment variables from a .env file (installed through the python-dotenv package).
- `pandas`: provides data structures and data analysis tools for handling tabular and time series data.

The code then uses the `load_dotenv` function to load the environment variables from the .env file.

In [None]:
!pip install opencv-python scikit-learn tqdm transformers torch Pillow python-dotenv pandas

In [None]:
import os
import cv2
from sklearn.cluster import MiniBatchKMeans
from tqdm import tqdm
from transformers import DetrImageProcessor, DetrForObjectDetection
import torch
from PIL import Image
from dotenv import load_dotenv
import pandas as pd
load_dotenv()

# Set the Base Folder Paths for the Project

The following code sets the base folder paths for the project, including:

- `output_path`: The base folder path for the project.
- `images_path`: The folder path for the images.
- `metadata_path`: The folder path for the metadata.
- `config_path`: The folder path for the configuration files.

The code then creates a `list_of_paths` that contains all of these folder paths.

In [None]:
# Set the base folder path for the project
output_path = "../output"
images_path = os.path.join(output_path, "images")
metadata_path = os.path.join(output_path, "metadata")
config_path = os.path.join(output_path, "config")

list_of_paths = [output_path, images_path, metadata_path, config_path]

# create_folder function

The `create_folder` function is used to create a folder at a specified path. If the folder already exists, the function will print a message saying so. If there is an error creating the folder, the function will print the error message.

## Parameters
- `path` (str): The path of the folder to be created.

## Returns
- None

In [None]:
def create_folder(path):
    """
    This function creates a folder at the specified path.
    If the folder already exists, it will print a message saying so.
    If there is an error creating the folder, it will print the error message.

    Parameters:
        :param path (str): The path of the folder to be created.

    Returns:
    None
    """
    try:
        # Use os.mkdir to create the folder at the specified path
        os.mkdir(path)
        print(f"Folder {path} created")
    except FileExistsError:
        # If the folder already exists, print a message saying so
        print(f"Folder {path} already exists")
    except Exception as e:
        # If there is an error creating the folder, print the error message
        print(f"Error creating folder {path}: {e}")
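
As a possible alternative, the same idea can be sketched with `os.makedirs`, which also creates missing parent folders and, with `exist_ok=True`, does not raise when the folder is already there. The helper name `ensure_folder` is hypothetical and not part of the notebook; the example runs in a temporary directory so it does not touch the project tree.

```python
import os
import tempfile


def ensure_folder(path):
    """Create `path` (and any missing parents); return True if it was newly created."""
    already_there = os.path.isdir(path)
    # exist_ok=True suppresses FileExistsError when the folder is already present
    os.makedirs(path, exist_ok=True)
    return not already_there


# Usage in a throwaway directory (hypothetical folder names)
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "output", "images")
    first = ensure_folder(target)   # creates both "output" and "images"
    second = ensure_folder(target)  # the folder exists now, so nothing is created
    exists = os.path.isdir(target)
```

Returning a boolean instead of printing leaves it to the caller to decide whether to report anything.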

# Initializing Folders
The `init_folder` function creates each of the specified folders.

**Function Parameters:**

- `folder_names (list)`: A list of folder names to be created.

**Function Behavior:**

- The function iterates over the list of folder names and calls the `create_folder` function for each name.
- This function is used to create the required output, images, metadata, and config folders.

In [None]:
def init_folder(folder_names: list):
    for folder_name in folder_names:
        create_folder(folder_name)

In [None]:
init_folder(list_of_paths)

## get_all_images
This function walks the given directory and its subdirectories and returns a list of full paths to all files with a .png or .jpg extension. If an error occurs while fetching images, the function prints the error message and returns an empty list.

### Args
- path (str): The path to the directory containing the images.

### Returns
- list: A list of full paths to all the images with .png or .jpg extensions.
- empty list: An empty list if an error occurred while fetching images.

In [None]:
def get_all_images(path):
    """Get all images from the given path.

    Args:
    path (str): path to the directory containing the images.

    Returns:
    - list: a list of full paths to all the images with png or jpg extensions.
    - empty list: an empty list if an error occurred while fetching images.
    """
    try:
        # use os.walk to traverse all the subdirectories and get all images
        return [os.path.join(root, name)
                for root, dirs, files in os.walk(path)
                for name in files
                if name.endswith((".png", ".jpg"))]
    except Exception as e:
        # return an empty list and log the error message if an error occurred
        print(f"An error occurred while fetching images: {e}")
        return []
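
Note that `str.endswith` is case-sensitive, so a file named `PHOTO.JPG` is not matched by the function above. A minimal sketch of a case-insensitive variant follows; the name `list_images_any_case` and the file names in the example are hypothetical.

```python
import os
import tempfile


def list_images_any_case(path, extensions=(".png", ".jpg")):
    """Return sorted full paths of files under `path` whose extension matches, ignoring case."""
    matches = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            # Lower-case the name before comparing so "PHOTO.JPG" also matches
            if name.lower().endswith(extensions):
                matches.append(os.path.join(root, name))
    return sorted(matches)


# Build a small throwaway tree to try it out
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.jpg", "B.JPG", os.path.join("sub", "c.png"), "notes.txt"):
        open(os.path.join(tmp, rel), "w").close()
    found = [os.path.relpath(p, tmp) for p in list_images_any_case(tmp)]
```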

# Facebook DETR model (detr-resnet-101)

The `detect_with_transformers` function takes an already opened PIL image and uses a pre-trained DEtection TRansformer (DETR) model, `facebook/detr-resnet-101`, to detect objects within it.

The function instantiates two components: a `DetrImageProcessor` and a `DetrForObjectDetection` model. The `DetrImageProcessor` converts the input image into tensors that can be fed into the model. The `DetrForObjectDetection` model then performs object detection by predicting bounding boxes and class logits. Both components are loaded with `from_pretrained` on every call.

Once the model has made its predictions, the function calls `processor.post_process_object_detection`, which converts the raw outputs into confidence scores, class label ids, and bounding boxes scaled to the image size passed in `target_sizes`. Only detections with a confidence score above the threshold (0.9 in this case) are kept.

Finally, the function maps each remaining label id to its class name through `model.config.id2label` and returns the list of class names. A print statement that reports the class label, confidence score, and box location of each detection is left commented out in the code.

In [None]:
def detect_with_transformers(image):
    """
    This function detects objects in an image using the DETR (DEtection TRansformer) model by Facebook.

    Args:
    image: An already opened PIL image to be processed.

    Returns:
    A list containing the labels of the detected objects in the image.

    Raises:
    None.
    """
    # Note: the processor and the model are loaded on every call
    processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-101")
    model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-101")
    inputs = processor(images=image, return_tensors="pt")
    # Inference only, so gradients are not needed
    with torch.no_grad():
        outputs = model(**inputs)

    # convert outputs (bounding boxes and class logits) to scores, labels and pixel boxes
    # let's only keep detections with score > 0.9
    # PIL's image.size is (width, height); target_sizes expects (height, width)
    target_sizes = torch.tensor([image.size[::-1]])
    results = processor.post_process_object_detection(outputs, target_sizes=target_sizes, threshold=0.9)[0]
    labels = []
    for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
        box = [round(i, 2) for i in box.tolist()]
        labels.append(model.config.id2label[label.item()])
        #print(
        #    f"Detected {model.config.id2label[label.item()]} with confidence "
        #    f"{round(score.item(), 3)} at location {box}"
        #)
    return labels
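
The score filtering and label lookup can be illustrated without loading the model. This is a minimal sketch on made-up scores and label ids: the function `labels_above_threshold` and the example mapping are hypothetical, and in the notebook the thresholding itself happens inside `post_process_object_detection`.

```python
def labels_above_threshold(scores, label_ids, id2label, threshold=0.9):
    """Return the class names of detections whose score exceeds `threshold`."""
    return [
        id2label[label_id]
        for score, label_id in zip(scores, label_ids)
        if score > threshold
    ]


# Made-up values for illustration only; they are not real model outputs
example_id2label = {1: "person", 17: "cat", 18: "dog"}
example_scores = [0.98, 0.42, 0.95, 0.91]
example_label_ids = [17, 1, 18, 17]

kept = labels_above_threshold(example_scores, example_label_ids, example_id2label)
# The same class can be detected several times, so tags are deduplicated afterwards
unique_tags = sorted(set(kept))
```

Here `kept` is `["cat", "dog", "cat"]`, and deduplicating gives `["cat", "dog"]`.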

# Save metadata

The `merge_df` function merges newly computed metadata into the project's metadata file. It takes one parameter, `df1`, a pandas DataFrame that must contain a `filename` column.

The function reads `metadata.csv` from the `metadata_path` folder, merges it with `df1` on the `filename` column, and writes the result back to the same file. Because `pd.merge` performs an inner join by default, only filenames present in both tables are kept.

The file `metadata.csv` must already exist; otherwise `pd.read_csv` raises an error.

The function returns the merged DataFrame.

In [None]:
def merge_df(df1):
    metadata_file = os.path.join(metadata_path, "metadata.csv")
    # merge on filename column
    df = pd.merge(df1, pd.read_csv(metadata_file), on="filename")
    df.to_csv(metadata_file, index=False)
    return df
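
To make the effect of the default inner join concrete, here is a small sketch on in-memory tables. The two DataFrames are hypothetical stand-ins for `metadata.csv` and for a table of new tags; the left join is shown only as one possible alternative, not as what the notebook does.

```python
import pandas as pd

# Hypothetical stand-ins for metadata.csv and for a table of new tags
existing = pd.DataFrame({"filename": ["a.jpg", "b.jpg", "c.jpg"],
                         "width": [640, 800, 1024]})
new_tags = pd.DataFrame({"filename": ["a.jpg", "c.jpg"],
                         "tags": [["cat"], ["dog", "person"]]})

# Default inner join, with the arguments in the same order as in merge_df:
# "b.jpg" has no tags, so its row disappears from the result
inner = pd.merge(new_tags, existing, on="filename")

# Left join starting from the existing table: every existing row survives,
# and rows without tags get a missing value in the "tags" column
left = pd.merge(existing, new_tags, on="filename", how="left")
```

With an inner join, any image that fails during processing is dropped from the saved file, so a left join may be preferable when the existing rows should be preserved.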

# Set tags in metadata
The `update_tags` function runs the DETR detector on a set of images, collects the detected labels (tags) for each image, and merges them into the saved metadata.

The function takes one parameter:

- `images`: a list of file paths for the images that need to be processed.

The function uses the tqdm library to display a progress bar. For each path, it extracts the file name and extension and skips files whose extension is not `jpg`, `jpeg`, or `png`.

It then opens the image with PIL, resizes it to 416x416, and calls `detect_with_transformers` to retrieve the labels (tags). Duplicate labels are removed, and the result is stored in a dictionary under the key `<name>.jpg` as `{"tags": labels}`.

If an image cannot be processed (e.g. the file is not found), the function skips it and continues with the next image. Finally, the dictionary is converted to a pandas DataFrame with a `filename` column, and `merge_df` merges it into `metadata.csv`.

In [None]:
def update_tags(images):
    # Run the DETR detector on each image and collect the detected tags
    metadata = {}
    for image_path in tqdm(images, desc="Updating tags"):
        # splitext also handles file names that contain more than one dot
        file_name, ext = os.path.splitext(os.path.basename(image_path))
        extensions = [".jpg", ".jpeg", ".png"]
        if ext.lower() not in extensions:
            continue
        try:
            # The context manager closes the file once detection is done
            with Image.open(image_path) as image:
                # DETR expects 3-channel input, so convert to RGB first,
                # then resize image to 416x416
                resized = image.convert("RGB").resize((416, 416))
                labels = detect_with_transformers(resized)

            # Remove duplicates from labels
            labels = list(set(labels))
            # add labels to metadata (keys always use the .jpg extension)
            metadata[file_name + '.jpg'] = {"tags": labels}
        except FileNotFoundError:
            print("File not found: ", file_name)
            continue
        except Exception as e:
            # Report the error instead of skipping the image silently
            print(f"Error processing {file_name}: {e}")
            continue

    # Convert the metadata dictionary to a pandas dataframe
    metadata = pd.DataFrame.from_dict(metadata, orient="index")
    # Turn the index into a column named filename
    metadata = metadata.rename_axis("filename").reset_index()
    # Save the metadata
    merge_df(metadata)
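
How a file name is split into a stem and an extension matters for names that contain more than one dot. The sketch below contrasts `str.split(".")` with `os.path.splitext`; the helper `split_name` and the example paths are hypothetical.

```python
import os


def split_name(path):
    """Split a path into (stem, extension), with the extension lower-cased and without its dot."""
    stem, ext = os.path.splitext(os.path.basename(path))
    return stem, ext.lstrip(".").lower()


# str.split(".") yields three parts here, so unpacking it into two names raises ValueError
parts = "holiday.2023.JPG".split(".")

# os.path.splitext only splits at the last dot
stem, ext = split_name("/data/images/holiday.2023.JPG")

# A name without any dot gets an empty extension instead of an error
no_ext = split_name("README")
```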


In [None]:
# Get the list of images
images = os.listdir(images_path)
images = [os.path.join(images_path, image) for image in images]

update_tags(images)

### Now, find dominant colors in the images
The functions rgb_to_hex and find_dominant_colors are used to find the dominant colors in an image.

The function rgb_to_hex takes in an RGB array with 3 values, and returns the hexadecimal representation of the color. This can be useful for formatting colors in a standardized way, as hexadecimal codes are widely used in web development and other applications.

The function find_dominant_colors takes the path to an image and the optional parameters k, downsample, and resize. The k parameter specifies the number of dominant colors to return, with a default value of 4. The downsample parameter divides the image width and height by the given factor (2 by default), and the resize parameter, if it is not None, resizes the image to a fixed size ((200, 200) by default) to speed up the processing.

The image is first converted from BGR to RGB, and then reshaped into a list of pixels. The MiniBatchKMeans algorithm clusters the pixels into k clusters, and the number of pixels assigned to each cluster is counted. The center color of each cluster is converted to hexadecimal representation and returned in a list of tuples, together with the fraction of the image (between 0 and 1) covered by that color.

In [None]:
def rgb_to_hex(rgb):
    return '#%02x%02x%02x' % (int(rgb[0]), int(rgb[1]), int(rgb[2]))
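
Cluster centers are floating-point values, and `int()` truncates them, so a channel value of 254.7 is written as `fe` rather than `ff`. One possible variant that rounds and clamps each channel is sketched below; the name `rgb_to_hex_rounded` is hypothetical.

```python
def rgb_to_hex_rounded(rgb):
    """Format an RGB triple as '#rrggbb', rounding and clamping each channel to 0..255."""
    channels = [min(255, max(0, int(round(float(c))))) for c in rgb[:3]]
    return "#{:02x}{:02x}{:02x}".format(*channels)


# Float channels, as they would come out of a cluster center
example_hex = rgb_to_hex_rounded([254.7, 127.6, 0.2])

# Out-of-range values are clamped instead of producing an invalid code
clamped_hex = rgb_to_hex_rounded([300, -5, 16])
```

The first call gives `#ff8000` and the second gives `#ff0010`.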

In [None]:
def find_dominant_colors(image_path, k=4, downsample=2, resize=(200, 200)):
    # Load image and convert to RGB
    image = cv2.imread(image_path)
    if image is None:
        # cv2.imread returns None instead of raising when the file cannot be read
        raise ValueError(f"Could not read image: {image_path}")
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # Downsample the image
    image = cv2.resize(image, (image.shape[1] // downsample, image.shape[0] // downsample))

    # Resize the image if requested
    if resize is not None:
        image = cv2.resize(image, resize)

    # Flatten the image into a list of RGB pixels
    image_flat = image.reshape((image.shape[0] * image.shape[1], 3))

    # Cluster the pixels using MiniBatchKMeans
    clt = MiniBatchKMeans(n_clusters=k, n_init=10, batch_size=100, random_state=42)
    labels = clt.fit_predict(image_flat)

    # Count the number of pixels assigned to each cluster
    # (minlength=k keeps one count per cluster even if a cluster is empty)
    counts = np.bincount(labels, minlength=k)

    # Calculate the fraction of pixels assigned to each cluster
    percentages = counts / len(labels)

    # Get the dominant colors
    dominant_colors = clt.cluster_centers_

    # Convert to hexadecimal format
    dominant_colors_hex = [rgb_to_hex(color) for color in dominant_colors]

    # Combine the dominant colors and their fractions into a list of tuples
    # (tolist() converts the NumPy values to plain Python floats)
    result = list(zip(dominant_colors_hex, percentages.tolist()))

    return result
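
The function returns the colors in cluster order, not ordered by how much of the image they cover. A minimal sketch of counting cluster assignments and ranking the colors by their share follows. It uses only the standard library and made-up labels, and `rank_colors` is a hypothetical helper rather than part of the notebook.

```python
from collections import Counter


def rank_colors(cluster_labels, hex_colors):
    """Return (hex_color, fraction) pairs sorted from the most to the least common cluster."""
    counts = Counter(cluster_labels)
    total = len(cluster_labels)
    # A cluster that received no pixels gets a fraction of 0
    pairs = [(hex_colors[i], counts.get(i, 0) / total) for i in range(len(hex_colors))]
    # Sort by fraction, largest first
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up cluster assignments for 8 pixels and 3 cluster colors
example_labels = [0, 2, 2, 1, 2, 0, 2, 2]
example_colors = ["#112233", "#445566", "#778899"]
ranked = rank_colors(example_labels, example_colors)
```

In this example cluster 2 covers 5 of the 8 pixels, so `#778899` comes first with a fraction of 0.625.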

The following code block processes the images and finds their dominant colors. The get_all_colors function first retrieves all the images under the folder specified by the images_path variable by calling get_all_images(). It then iterates over the images with a tqdm progress bar and calls find_dominant_colors() on each one, using downsample=2 and resize=(100, 100).

If find_dominant_colors() raises an error for an image, the code prints the error message and continues with the next image. Otherwise, the list of colors is converted to a string, and its single quotes are replaced with double quotes.

Finally, the file names and color strings are combined into a pandas DataFrame with the columns filename and dominant_color, and merge_df() merges it into metadata.csv.

In [None]:
import numpy as np


def get_all_colors(image_path):
    """
    This function extracts dominant colors from all images in a directory and merges the color information into the metadata file.

    Parameters:
    image_path (str): The path to the directory where the images are stored.

    Returns:
    None
    """
    # Get a list of all images in the directory
    img_files = get_all_images(image_path)
    filenames = []
    colors = []

    # Create a progress bar to track the progress of processing all images
    for img in tqdm(img_files, desc="Processing images (Approx: 25 minutes)"):
        try:
            # Extract the dominant colors of the current image
            color = find_dominant_colors(img, downsample=2, resize=(100, 100))
        except Exception as e:
            print("Error: ", e)
            continue

        if color:
            # color to string to avoid errors with quote marks
            color = str(color)
            # replace quotes by double quotes
            color = color.replace("'", '"')
            # Record the filename together with its color so both lists stay the same length
            filenames.append(os.path.basename(img))
            colors.append(color)

    # Create a dataframe with the image filenames and their dominant colors
    df = pd.DataFrame({"filename": filenames, "dominant_color": colors})

    merge_df(df)
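
The color list is stored as a string built with `str()` and a quote replacement, which ties the stored text to how the list elements happen to be printed. One possible alternative is to serialize the list with the standard `json` module, which also makes it straightforward to parse the column back later. The helper names below are hypothetical and the values are made up.

```python
import json


def colors_to_json(colors):
    """Serialize (hex_color, fraction) pairs to a JSON string."""
    # float() turns NumPy scalars into plain Python floats, which json can encode
    return json.dumps([[hex_color, float(fraction)] for hex_color, fraction in colors])


def colors_from_json(text):
    """Parse the JSON string back into a list of (hex_color, fraction) tuples."""
    return [(hex_color, fraction) for hex_color, fraction in json.loads(text)]


example = [("#ff8000", 0.625), ("#112233", 0.375)]
encoded = colors_to_json(example)
decoded = colors_from_json(encoded)
```

JSON has no tuple type, so the pairs are written as two-element lists and converted back to tuples when reading.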


In [None]:
get_all_colors(images_path)