# Metrics for evaluating the performance of 3D generative models

This repository combines several metrics for evaluating the performance of 3D generative models. They are grouped into the following categories:
- Semantic Metrics: Image-level similarity metrics such as MSE, CLIP-S, LPIPS, SSIM, and PSNR
- Geometry Metrics: 
    - Rel_BB_Aspect_Ratio_Diff: Bounding box aspect ratio difference
    - Rel_Pixel_Area_Diff: Relative pixel area difference
    - Squared_Outline_Normals_Angle_Diff: Squared difference of the angle between the outline normals
    - Squared_Summed_Outline_Normals_Angle_Diff: Squared difference of the summed outline normals
- Distribution Metrics: 
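    - Frechet Inception Distance (FID): Frechet distance between Gaussians fitted to the Inception features of the reference and generated images
    - Kernel Inception Distance (KID): Like FID, compares Inception features, but uses a polynomial-kernel maximum mean discrepancy (MMD) instead of fitting Gaussians
    - Inception Score (IS): Measures the quality and diversity of generated images from the Inception classifier's predictions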
- Prompt Alignment Metrics: 
    - CLIP-S: Measures the alignment of generated images with text prompts
    - ImageReward: Scores generated images against their prompt using a learned human-preference reward model
- Vehicle-Based Dimension Comparison:
    - Width and length comparison (normalized by height)
    - Vehicle wheelbase comparison (normalized by height). Uses Florence object detection (OD) to detect the wheels.
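
As a rough illustration of the two simplest geometry metrics listed above, the sketch below computes a relative bounding-box aspect ratio difference and a relative pixel area difference from binary silhouette masks. The function names, the mask representation (nested lists of 0/1), and the normalization by the reference value are assumptions made for illustration; the repository's own definitions may differ.

```python
def bounding_box(mask):
    """Return (width, height) of the tight box around the nonzero pixels of a 2D mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return cols[-1] - cols[0] + 1, rows[-1] - rows[0] + 1


def relative_diff(generated, reference):
    """Absolute difference, expressed as a fraction of the reference value (assumed convention)."""
    return abs(generated - reference) / reference


def rel_bb_aspect_ratio_diff(gen_mask, ref_mask):
    """Relative difference of the bounding-box aspect ratios (width / height)."""
    gen_w, gen_h = bounding_box(gen_mask)
    ref_w, ref_h = bounding_box(ref_mask)
    return relative_diff(gen_w / gen_h, ref_w / ref_h)


def rel_pixel_area_diff(gen_mask, ref_mask):
    """Relative difference of the number of foreground pixels."""
    gen_area = sum(1 for row in gen_mask for value in row if value)
    ref_area = sum(1 for row in ref_mask for value in row if value)
    return relative_diff(gen_area, ref_area)
```

Both values are 0 when the generated silhouette matches the reference and grow as the shapes diverge, which is why these metrics need viewpoint-aligned, background-free images.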


## Semantic metrics based on a list of images

This block shows how to use the `Metrics` class to compute the semantic metrics for a list of images. It can serve as a template for an evaluation routine that estimates model performance during training.

In [None]:
import os
import glob
import numpy as np
from PIL import Image
import torch
from metrics.helpers import preprocess_image # load_images_from_dir
from metrics.metrics import Metrics

os.environ["OMP_NUM_THREADS"] = "1"

device = "cuda" if torch.cuda.is_available() else "cpu"

# This helper is also available in metrics.helpers; it is repeated here for reference.
def load_images_from_dir(image_dir: str, device, preprocess_func) -> torch.Tensor:
    """
    Load all PNG images from a directory in sorted filename order, preprocess
    them with preprocess_func, and return a float32 tensor of shape
    (num_frames, channels, height, width).
    """
    files = sorted(glob.glob(os.path.join(image_dir, "*.png")))
    if not files:
        raise FileNotFoundError(f"No PNG images found in {image_dir}")
    images = [preprocess_func(Image.open(f)) for f in files]
    # Stack to (N, H, W, C), then move the channel axis forward: (N, C, H, W)
    arr = np.stack([np.asarray(img) for img in images])
    arr = np.transpose(arr, (0, 3, 1, 2))
    return torch.tensor(arr, dtype=torch.float32).to(device)

# create the Metrics object once so it can be reused for several evaluations
semantic_metric = Metrics(device=device, compute_distribution_metrics=False)
# load the images from the directories
gt_folder = "example_data/Meshfleet_Eval/Ground_Truth/0ae696bd837219e784b8b7979807184decd5abdb813f0fd7bbfbf6a82bdcc798"
gen_folder = "example_data/Meshfleet_Eval/Results_000/0ae696bd837219e784b8b7979807184decd5abdb813f0fd7bbfbf6a82bdcc798"
input_tensor  = load_images_from_dir(gt_folder, device, preprocess_image)
target_tensor = load_images_from_dir(gen_folder,  device, preprocess_image)
# compute the metrics
semantic_metric.compute_image(input_tensor, target_tensor)

In [None]:
# if you want to specify the metrics you can pass them as metric_list
semantic_metric = Metrics(device=device, compute_distribution_metrics=False, metric_list=["MSE", "CLIP-S", "LPIPS", "SSIM", "PSNR"])
semantic_metric.compute_image(input_tensor, target_tensor)
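
For orientation, two of the metrics named in `metric_list` have simple closed forms. The sketch below shows MSE and PSNR on flat sequences of pixel values. It is not the `Metrics` class implementation; the function names and the assumed value range (`max_value=1.0`) are illustrative assumptions.

```python
import math


def mse(a, b):
    """Mean squared error between two equally long sequences of pixel values."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, max_value=1.0):
    """Peak signal-to-noise ratio in dB; max_value is the assumed largest pixel value."""
    err = mse(a, b)
    if err == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / err)
```

Lower MSE and higher PSNR both indicate that the generated image is closer to the reference, so the two move in opposite directions.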

## Process multiple Objects where the generated images are aligned by viewpoints

For a detailed evaluation of model performance across several objects, use the code in this section.

It assumes that you have a set of images for each object in which the generated images and the reference images are aligned by viewpoint. The images of each object should be in their own directory; take a look at the example data in `example_data/` to see how the data should be structured. Name the images so that the sorted order of the filenames corresponds to the order of the viewpoints. For example, if you have 8 images for each object, the filenames should be `0.png`, `1.png`, `2.png`, ..., `7.png`. The code automatically sorts the images in each directory and calculates the metrics for each object.
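
Before running a long evaluation, it can help to check that a ground-truth directory and a generated directory really pair up view by view. The helper below is a hypothetical sketch (`list_view_pairs` is not part of the repository) that pairs files by plain sorted filename order, the same ordering the `load_images_from_dir` helper shown earlier uses.

```python
import os


def list_view_pairs(gt_dir, gen_dir, ext=".png"):
    """Pair ground-truth and generated images by sorted filename order."""
    gt_files = sorted(f for f in os.listdir(gt_dir) if f.endswith(ext))
    gen_files = sorted(f for f in os.listdir(gen_dir) if f.endswith(ext))
    if len(gt_files) != len(gen_files):
        raise ValueError(
            f"view count mismatch: {len(gt_files)} in {gt_dir}, "
            f"{len(gen_files)} in {gen_dir}"
        )
    return list(zip(gt_files, gen_files))
```

Note that plain string sorting places `10.png` before `2.png`. With more than ten views per object, zero-padded names such as `00.png`, `01.png`, ..., `11.png` keep the sorted order identical to the viewpoint order.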

To align the images by viewpoints and preprocess the images, you can use the `data_preprocessing.ipynb` notebook. If you use the geometry metrics, this step is necessary.


You can configure which metrics are calculated in the metrics config file (`cfg_path  = "example_data/config_small.json"` in the example below). The config also selects whether metrics are estimated per viewpoint or over all images. The distribution metrics (FID, KID, IS) are always calculated over all images. The semantic metrics (e.g. MSE, CLIP-S) can be calculated per viewpoint or over all images. The geometry metrics are always calculated per viewpoint.
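
The difference between the two aggregation modes can be illustrated with a small sketch. It assumes per-image scores are stored as one list per object, indexed by viewpoint; this layout and the function names are illustrative assumptions and do not reflect the structure returned by the repository.

```python
from statistics import mean


def aggregate_by_viewpoint(scores):
    """Average each viewpoint across objects; returns one value per viewpoint."""
    per_object = list(scores.values())
    n_views = len(per_object[0])
    if any(len(views) != n_views for views in per_object):
        raise ValueError("all objects must have the same number of viewpoints")
    return [mean(views[v] for views in per_object) for v in range(n_views)]


def aggregate_overall(scores):
    """Average over every image of every object; returns a single value."""
    return mean(value for views in scores.values() for value in views)
```

Per-viewpoint aggregation shows whether a model is systematically worse from certain angles (for example, rear views), while the overall mean gives a single number for comparing runs.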

In [None]:
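import json
import os

import torch
from metrics.metrics_eval import process_metrics_by_viewpoint, tensor_to_serializable

os.environ["OMP_NUM_THREADS"] = "1"

# fall back to the CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

gt_folder = "example_data/Meshfleet_Eval/Ground_Truth"
gen_folder = "example_data/Meshfleet_Eval/Results_000"
cfg_path  = "example_data/config_small.json"

metrics_result = process_metrics_by_viewpoint(
    ground_truth_folder=gt_folder,
    generated_folder=gen_folder,
    device=device,
    config_path=cfg_path,
)

# convert tensors to plain Python values before printing the result as JSON
print(json.dumps(tensor_to_serializable(metrics_result), indent=2))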

### Write results into json file

In [None]:
import json
import os
from metrics.metrics_eval import process_metrics_by_viewpoint, tensor_to_serializable, json_file_to_combined_table

# convert tensors to plain Python values so the result can be serialized
json_output = json.dumps(tensor_to_serializable(metrics_result), indent=4)

# Save the results as a JSON file in the parent folder of the ground-truth directory.
output_file = os.path.join(os.path.dirname(gt_folder), "metrics_results_config_test.json")
with open(output_file, 'w') as f:
    f.write(json_output)

print(f"Metrics saved to {output_file}")
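
The conversion step matters because `json.dumps` cannot serialize tensors directly. The sketch below shows one way such a conversion can work; `to_serializable` is a hypothetical stand-in written for illustration and is not the repository's `tensor_to_serializable`.

```python
import json


def to_serializable(obj):
    """Recursively convert tensor-like values in nested containers to plain Python values."""
    if hasattr(obj, "tolist"):
        # torch.Tensor and numpy.ndarray both provide tolist()
        return obj.tolist()
    if isinstance(obj, dict):
        # JSON object keys must be strings
        return {str(key): to_serializable(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_serializable(value) for value in obj]
    return obj
```

After this conversion the result contains only dicts, lists, numbers, and strings, so `json.dumps(to_serializable(result))` succeeds.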

## Prompt alignment metrics

If you want to evaluate a text-to-3D model, you can use the prompt alignment metrics. The method `process_folder_with_prompt_files` assumes that the images and prompts are in the same folder. The prompt files should have the same name as the images, but with a `.txt` extension. For example, if you have an image named `image_1.png`, the corresponding prompt file should be named `image_1.txt`. The prompt files should contain the text prompts used to generate the images.
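
The naming convention above can be checked up front with a small helper. This is a hypothetical sketch (`pair_images_with_prompts` is not a repository function) that assumes PNG images and UTF-8 encoded prompt files.

```python
from pathlib import Path


def pair_images_with_prompts(folder, image_ext=".png"):
    """Return (image_path, prompt_text) pairs; fail if an image has no prompt file."""
    pairs = []
    for image_path in sorted(Path(folder).glob(f"*{image_ext}")):
        # image_1.png -> image_1.txt
        prompt_path = image_path.with_suffix(".txt")
        if not prompt_path.exists():
            raise FileNotFoundError(f"missing prompt file for {image_path.name}")
        prompt = prompt_path.read_text(encoding="utf-8").strip()
        pairs.append((image_path, prompt))
    return pairs
```

Running such a check first surfaces a missing or misnamed prompt file immediately, rather than partway through a model-heavy scoring run.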

In [None]:
from metrics.metrics import Metrics, GeometryMetrics, ImageBasedPromptEvaluator
from metrics.helpers import preprocess_image, process_folder_with_prompt_files, process_folder_with_prompt
os.environ["OMP_NUM_THREADS"] = "1"

prompt_metric = ImageBasedPromptEvaluator() 

generated_folder = "example_data/0ae696bd837219e784b8b7979807184decd5abdb813f0fd7bbfbf6a82bdcc798"
image_scores = process_folder_with_prompt_files(
    generated_folder=generated_folder,
    preprocess_func=preprocess_image,
    prompt_metric=prompt_metric
)
image_scores

If you have a single prompt and a folder of images generated from it, you can use `process_folder_with_prompt` instead.

In [None]:
prompt = "Compact crossover in azure blue, white roof & mirrors."
process_folder_with_prompt(
    generated_folder=generated_folder,
    object_prompts=prompt,
    preprocess_func=preprocess_image,
    prompt_metric=prompt_metric)

## Vehicle based dimension comparison

If you want to compare dimensions without a detailed geometric analysis, you can use `estimate_dimension_differences` from `FlorenceWheelbaseOD`. This method estimates the dimensions and wheelbase of the (vehicle) objects from the images, normalizes them by the height of the object, and compares them. It returns a dictionary with the differences in the dimensions. The last value is the non-normalized height difference.
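
The height normalization can be sketched as follows. The dictionary keys, the sign convention (generated minus reference), and the function name are assumptions for illustration only; the keys and conventions used by `estimate_dimension_differences` may differ.

```python
def normalized_dimension_differences(generated, reference):
    """Compare height-normalized dimensions of a generated and a reference object.

    Both inputs are dicts with pixel measurements under the (hypothetical) keys
    "width", "length", "wheelbase", and "height".
    """
    differences = {}
    for key in ("width", "length", "wheelbase"):
        gen_ratio = generated[key] / generated["height"]
        ref_ratio = reference[key] / reference["height"]
        differences[key] = gen_ratio - ref_ratio
    # the height itself cannot be normalized by height, so it is compared directly
    differences["height"] = generated["height"] - reference["height"]
    return differences
```

Dividing by height makes the comparison independent of how large the object appears in each render, so only differences in proportions remain.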

In [None]:
import os 
from metrics.viewpoint_florence import FlorenceWheelbaseOD

os.environ["OMP_NUM_THREADS"] = "1"

original_folder = "example_data/Meshfleet_Eval/Ground_Truth/0ae696bd837219e784b8b7979807184decd5abdb813f0fd7bbfbf6a82bdcc798"
generated_folder = "example_data/Meshfleet_Eval/Results_000/0ae696bd837219e784b8b7979807184decd5abdb813f0fd7bbfbf6a82bdcc798"

# you have to remove the background of the images for the florence based metrics to work properly
# remove_background_recursive(generated_folder)
florence_wheelbase_od = FlorenceWheelbaseOD()
differences = florence_wheelbase_od.estimate_dimension_differences(generated_folder, original_folder, normalize=True)
differences