# Experiment 2: Part Type Collision Analysis

This notebook gathers the results of Experiment 2.

## Methodology
For each part type, run the experiment many times across a range of hyperparameter settings. Specifically, isolate one hyperparameter, sweep it over a range of values, and record the computed collision rate for each run. Repeat this for every hyperparameter and every part type.
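The sweep described above can be sketched as a generator of run configurations. This is only an illustration: the hyperparameter names come from the columns logged in this notebook, but the baseline values, sweep ranges, and part type names are placeholders, and the `swept` key is a hypothetical bookkeeping field.

```python
from typing import Any, Dict, Iterator, List

# Placeholder baseline and sweep ranges; the real values are not given in this notebook.
BASELINE: Dict[str, Any] = {
    "part_dim": 5,
    "confidence_bound": 0.95,
    "part_pdf_ci": 0.95,
    "meta_pdf_ci": 0.95,
}
SWEEPS: Dict[str, List[Any]] = {
    "part_dim": [3, 5, 7],
    "confidence_bound": [0.90, 0.95, 0.99],
}
PART_TYPES = ["type_a", "type_b"]  # placeholder part type names


def one_at_a_time_configs(
    baseline: Dict[str, Any],
    sweeps: Dict[str, List[Any]],
    part_types: List[str],
) -> Iterator[Dict[str, Any]]:
    """Yield one config per (part type, hyperparameter, value).

    Only the swept hyperparameter is changed; every other hyperparameter
    keeps its baseline value.
    """
    for part_type in part_types:
        for name, values in sweeps.items():
            for value in values:
                config = dict(baseline)
                config[name] = value
                config["part_type"] = part_type
                config["swept"] = name  # hypothetical field: which knob was varied
                yield config


configs = list(one_at_a_time_configs(BASELINE, SWEEPS, PART_TYPES))
print(len(configs))
```

Each yielded config would then be passed to whatever function runs the experiment and logs the collision rate.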
## Deliverables
Graphs and analysis showing the impact of different hyperparameter values. How do they affect the final collision rate? Why do they affect the collision rate in that way? What does this tell us?
Graphs and analysis comparing the results across part types. Are different part types affected in the same way by the same change in hyperparameters? How close are their collision rates? What does this tell us about the relative importance of hyperparameters and part types?
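One possible way to frame the cross-part-type comparison is to compute the mean collision rate per part type at each hyperparameter value, then look at the spread between part types. The sketch below uses only the standard library and made-up records; the rates are not real results, and `mean_rate_by_part_type` is a hypothetical helper, not part of the experiment code.

```python
from collections import defaultdict
from statistics import mean

# Made-up records standing in for rows of the run table; not real results.
records = [
    {"part_type": "type_a", "part_dim": 3, "upper_collision_rate": 0.10},
    {"part_type": "type_a", "part_dim": 3, "upper_collision_rate": 0.14},
    {"part_type": "type_a", "part_dim": 5, "upper_collision_rate": 0.06},
    {"part_type": "type_b", "part_dim": 3, "upper_collision_rate": 0.20},
    {"part_type": "type_b", "part_dim": 5, "upper_collision_rate": 0.08},
]


def mean_rate_by_part_type(records, hyperparameter, metric="upper_collision_rate"):
    """Return {hyperparameter value: {part type: mean metric}}."""
    buckets = defaultdict(lambda: defaultdict(list))
    for rec in records:
        buckets[rec[hyperparameter]][rec["part_type"]].append(rec[metric])
    return {
        value: {ptype: mean(rates) for ptype, rates in by_type.items()}
        for value, by_type in buckets.items()
    }


table = mean_rate_by_part_type(records, "part_dim")
for value, by_type in sorted(table.items()):
    # Spread between the most and least affected part type at this value.
    spread = max(by_type.values()) - min(by_type.values())
    print(value, by_type, f"spread={spread:.3f}")
```

A small spread at every value would suggest the hyperparameter dominates; a large spread would suggest the part type matters more.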

## Source Code

The sections below contain all of our source code.

In [None]:
import mlflow
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [None]:
mlflow_client = mlflow.client.MlflowClient()
experiment_id = mlflow_client.get_experiment_by_name("Experiment 2").experiment_id
print(experiment_id)
# search_runs expects a list of experiment IDs.
runs = mlflow_client.search_runs([experiment_id], max_results=10_000)
# print(len(runs))

# One entry per run: merge the logged metrics and params into a single flat dict.
run_dicts = {
    run.info.run_id: {**run.data.metrics, **run.data.params}
    for run in runs
}


In [12]:
run_df = pd.DataFrame.from_dict(run_dicts, orient='index')
print(run_df.columns)

Index(['part_dim', 'confidence_bound', 'part_pdf_ci', 'meta_pdf_ci',
       'num_samples', 'part_type'],
      dtype='object')


In [13]:
analysis_groups = {
    "part_dim": run_df.groupby('part_dim'),
    "confidence_bound": run_df.groupby('confidence_bound'),
    "part_pdf_ci": run_df.groupby('part_pdf_ci'),
    "meta_pdf_ci": run_df.groupby('meta_pdf_ci')
}

In [18]:
mlflow.set_experiment("Experiment 2 Analysis")
metric = "upper_collision_rate"

for analysis_type, group in analysis_groups.items():
    x_vals = []
    y_vals = []

    # groupby yields one (value, rows) pair per distinct hyperparameter value.
    # MLflow returns params as strings, so matplotlib treats x_vals as category labels.
    for value, df in group:
        x_vals.append(value)
        y_vals.append(df[metric].mean())

    graph_path = f"psig_matcher/experiments/graphs/{analysis_type}_vs_{metric}.png"

    plt.figure()  # start a fresh figure so earlier curves are not redrawn
    plt.plot(x_vals, y_vals, label=f"{analysis_type} vs {metric}")
    plt.xlabel(analysis_type)
    plt.ylabel(f"Mean {metric} across all tested parts")
    plt.legend()
    plt.savefig(graph_path)
    mlflow.log_artifact(graph_path)
    plt.close()

KeyError: 'upper_collision_rate'
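
The column index printed earlier lists only parameter names, so the run table has no `upper_collision_rate` column, which is what raises this KeyError. One possible cause (an assumption, not confirmed here) is that the fetched runs logged no metric under that name. A minimal sketch for checking which keys each run actually carries, using stand-in data with hypothetical run IDs in place of the real `run_dicts`:

```python
from collections import Counter


def key_coverage(run_dicts):
    """Count, for each metric/param key, how many runs contain it."""
    counts = Counter()
    for values in run_dicts.values():
        counts.update(values.keys())
    return counts


def runs_missing(run_dicts, key):
    """Return the sorted IDs of runs that do not contain the given key."""
    return sorted(run_id for run_id, values in run_dicts.items() if key not in values)


# Stand-in data; run IDs and values are hypothetical.
run_dicts = {
    "run_a": {"part_dim": "5", "part_type": "type_a"},
    "run_b": {"part_dim": "7", "part_type": "type_a", "upper_collision_rate": 0.1},
}

coverage = key_coverage(run_dicts)
missing = runs_missing(run_dicts, "upper_collision_rate")
print(coverage)
print(missing)
```

If every run turns out to be missing the key, the next thing to inspect is the experiment code that logs the metric and the name it uses.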

---

## Conclusion

TBD.