## Evaluation

This notebook evaluates the performance of MIRO across various simulated datasets.

We use the `calculate_metrics_for_experiments` function to compute the metrics reported in the manuscript and summarized in Table 1. This function, implemented in the `metrics.py` file, returns the results in a dictionary-like format; the cells below treat the returned object as a pandas DataFrame.

In [20]:
import pandas as pd
import lib

# Define the path to the results file
# All results are stored in the 'results' folder.
path = "results/rings_results.csv"

# Load the results data into a Pandas DataFrame.
data = pd.read_csv(path)

# Calculate the metrics for the experiments.
results = lib.calculate_metrics_for_experiments(data)
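
The call above delegates all metric computation to `lib`, whose source is not shown in this notebook. For orientation, the `ARI_values` row of the summary table presumably reports the adjusted Rand index, a chance-corrected agreement score between two clusterings. The sketch below is a minimal pure-Python version of the standard ARI formula, written under that assumption. The function name `adjusted_rand_index` is hypothetical and this is not the code in `metrics.py`; the `ARI_c` and `ARI_dagger` variants are not covered.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat clusterings of the same points.

    Illustrative helper only (hypothetical name, not part of `lib`).
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two points: nothing to disagree on

    # Number of point pairs that share a cluster in both labelings,
    # in the reference labeling, and in the predicted labeling.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    pairs_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = pairs_true * pairs_pred / total_pairs  # agreement expected by chance
    maximum = (pairs_true + pairs_pred) / 2           # largest attainable agreement
    if maximum == expected:
        # Degenerate case: both labelings are a single cluster or all singletons.
        return 1.0
    return (pairs_both - expected) / (maximum - expected)


# Toy labels: three reference clusters, two predicted clusters.
score = adjusted_rand_index([0, 0, 1, 2], [0, 0, 1, 1])
print(round(score, 2))
```

For the toy labels the score is 4/7, so the cell prints 0.57. A score of 1 means the two partitions are identical up to relabeling, and values near 0 mean agreement no better than chance.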

Let's print the average metrics for the entire test set.

In [21]:
# Compute the average metrics per method by grouping on 'class_names',
# which holds the method name (e.g., MIRO or DBSCAN).
# The mean is taken over all experiments and rounded to two decimal places for clarity.
aresults = results.groupby('class_names').mean().reset_index().round(2)

# Drop the 'experiment' column as it is not needed for this summary.
aresults = aresults.drop(columns=['experiment'])

# Set 'class_names' as the index and transpose the DataFrame for improved readability in visualization.
aresults = aresults.set_index('class_names').transpose()
aresults.columns.name = None

# Display the transposed DataFrame.
aresults

| Metric | DBSCAN | MIRO |
|---|---|---|
| IoU_values | 0.68 | 0.95 |
| ARI_values | 0.33 | 0.82 |
| ARI_c_values | 0.34 | 0.86 |
| ARI_dagger_values | 0.69 | 0.85 |
| AMI_values | 0.73 | 0.91 |
| JIc_values | 0.55 | 0.99 |
| RMSRE_N_values | 1.18 | 0.11 |
| RMSE_centr_values | 0.15 | 0.05 |
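
The group, average, and transpose steps that produce this summary can be tried in isolation on a small stand-in DataFrame. The sketch below assumes only the column layout visible in the notebook (`experiment`, `class_names`, and one column per metric). The values are made up purely to exercise the code and are not results. Unlike the notebook cell, it drops `experiment` before grouping and passes `numeric_only=True`, an optional safeguard in case the real results contain non-numeric columns.

```python
import pandas as pd

# Stand-in for the per-experiment results; every value here is invented.
toy = pd.DataFrame(
    {
        "experiment": [0, 1, 0, 1],
        "class_names": ["DBSCAN", "DBSCAN", "MIRO", "MIRO"],
        "IoU_values": [0.60, 0.70, 0.90, 1.00],
        "ARI_values": [0.30, 0.40, 0.80, 0.90],
    }
)

summary = (
    toy.drop(columns=["experiment"])  # the experiment id should not be averaged
    .groupby("class_names")           # one group per method
    .mean(numeric_only=True)          # average each metric over experiments
    .round(2)
    .transpose()                      # metrics as rows, methods as columns
)
summary.columns.name = None  # remove the leftover 'class_names' axis label

print(summary)
```

The result has one row per metric and one column per method, the same shape as the summary above.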
