# Getting Started with the Scoring Feature

This notebook demonstrates how to use the scoring feature from the LIPS package. We will cover the following topics:

- **Defining Scoring Based on Configuration File**: We will show how to define scoring configurations and explain the structure of the configuration file.
- **Application to Competitions**: We will apply the scoring feature to two competitions: the airfoil competition and the powergrid competition.

## 1. Configuration file

The core scoring configuration is located in **configurations/powergrid/scoring/ScoreConfig.ini**. This file defines the scoring logic and lets you customize how scores are evaluated. It has three key sections: `Thresholds`, `ValueByColor`, and `Coefficients`.

### 1. Thresholds: Defining Performance Boundaries

The `Thresholds` section specifies the performance benchmarks for individual metrics. It includes both the threshold values and the comparison type, indicating whether to minimize, maximize, or evaluate a ratio.

* **Comparison Types:**
    * `minimize`: Lower values are considered better.
    * `maximize`: Higher values are considered better.
    * `ratio`: Compares two values and checks whether their ratio falls below or above the thresholds.
* **Example:**
    ```ini
    "a_or": {"comparison_type": "minimize", "thresholds": [0.02, 0.05]}
    ```
    * **Explanation:** This entry defines thresholds for the metric `a_or`. A value below 0.02 falls within the best (green) range, a value between 0.02 and 0.05 falls within the middle (orange) range, and a value above 0.05 falls within the worst (red) range. Because the comparison type is `minimize`, lower `a_or` values are better.
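
The threshold rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not the LIPS implementation: the function name `colorize` is hypothetical, the treatment of values that sit exactly on a threshold is an assumption, and the `maximize` branch assumes the rule is simply mirrored. The `ratio` type is left out because its exact semantics are not specified here.

```python
def colorize(value, comparison_type, thresholds):
    """Map a metric value to "green", "orange", or "red".

    Hypothetical helper: boundary handling (values equal to a threshold
    count as orange) is an assumption, not documented behaviour.
    """
    low, high = thresholds
    if comparison_type == "minimize":
        # Lower is better: below the first threshold is the best range.
        if value < low:
            return "green"
        if value <= high:
            return "orange"
        return "red"
    if comparison_type == "maximize":
        # Assumed mirror image: above the second threshold is the best range.
        if value > high:
            return "green"
        if value >= low:
            return "orange"
        return "red"
    raise ValueError(f"unsupported comparison type: {comparison_type}")


print(colorize(0.01, "minimize", [0.02, 0.05]))
print(colorize(0.03, "minimize", [0.02, 0.05]))
print(colorize(0.06, "minimize", [0.02, 0.05]))
```

With the `a_or` thresholds from the example, the three calls land in the green, orange, and red ranges respectively.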

### 2. ValueByColor: Assigning Scores to Performance Ranges

The `ValueByColor` section maps performance ranges (represented by colors) to numerical scores. This allows you to quantify the qualitative assessment of your metrics.

* **Example:**
    ```ini
    {"green": 2, "orange": 1, "red": 0}
    ```
    * **Explanation:** In this example, a metric falling within the "green" range receives a score of 2, "orange" receives 1, and "red" receives 0.
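
As a minimal sketch of this mapping, the snippet below converts a set of already colorized metrics into scores with a plain dictionary lookup. The colors assigned to each metric are made up for illustration, and the variable names are hypothetical.

```python
# Mapping taken from the example above.
value_by_color = {"green": 2, "orange": 1, "red": 0}

# Hypothetical colorized metrics, for illustration only.
colors = {
    "a_or": "green",
    "inference_time": "orange",
    "spearman_correlation_drag": "red",
}

# Replace each color with its numerical score.
scores = {name: value_by_color[color] for name, color in colors.items()}
print(scores)
```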

### 3. Coefficients: Weighting Metric Contributions

The `Coefficients` section defines the relative importance of different metrics and sub-metrics in the overall score. This allows you to prioritize certain aspects of your model's performance.

* **Hierarchical Structure:** Coefficients can be organized hierarchically, allowing you to assign weights to both top-level metrics (e.g., "ML", "OOD", "Physics") and their sub-metrics (e.g., "Accuracy", "SpeedUP").
* **Example:**
    ```ini
    {"ML": {"value": 0.4, "Accuracy": {"value": 0.75}, "SpeedUP": {"value": 0.25}},
     "OOD": {"value": 0.3, "Accuracy": {"value": 0.75}, "SpeedUP": {"value": 0.25}},
     "Physics": {"value": 0.3}}
    ```
    * **Explanation:**
        * The "ML" metric contributes 40% (0.4) to the overall score.
        * Within "ML", "Accuracy" contributes 75% (0.75) and "SpeedUP" contributes 25% (0.25) to the "ML" sub-score.
        * The "OOD" metric contributes 30% (0.3) to the overall score and uses the same "Accuracy" and "SpeedUP" sub-metric weights as "ML".
        * The "Physics" metric contributes 30% (0.3) to the overall score and has no sub-metrics.
        * This structure allows for fine-grained control over the scoring process, ensuring that the final score reflects the relative importance of different aspects of your model's performance.
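
To make the weighting concrete, here is a small sketch that combines per-category scores using the coefficients above. It assumes that each level is aggregated as a plain weighted sum and that the input scores are already normalized to the range 0 to 1; the function `weighted_score` and the input scores are hypothetical, and the LIPS implementation may aggregate differently.

```python
coefficients = {
    "ML": {"value": 0.4, "Accuracy": {"value": 0.75}, "SpeedUP": {"value": 0.25}},
    "OOD": {"value": 0.3, "Accuracy": {"value": 0.75}, "SpeedUP": {"value": 0.25}},
    "Physics": {"value": 0.3},
}


def weighted_score(coeffs, scores):
    """Weighted sum of child scores, recursing into nested categories."""
    total = 0.0
    for name, node in coeffs.items():
        if name == "value":
            continue  # "value" is this node's own weight, not a child
        child = scores[name]
        if isinstance(child, dict):
            child = weighted_score(node, child)
        total += node["value"] * child
    return total


# Hypothetical normalized scores, for illustration only.
scores = {
    "ML": {"Accuracy": 1.0, "SpeedUP": 0.5},
    "OOD": {"Accuracy": 0.5, "SpeedUP": 0.0},
    "Physics": 1.0,
}

# ML = 0.75 * 1.0 + 0.25 * 0.5 = 0.875; OOD = 0.75 * 0.5 = 0.375
# global = 0.4 * 0.875 + 0.3 * 0.375 + 0.3 * 1.0 = 0.7625
global_score = weighted_score(coefficients, scores)
print(round(global_score, 4))
```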


## 2. Metric format

 The scoring feature expects metrics data in a nested dictionary format. The structure should represent a tree-like hierarchy, where each node
 contains either sub-metrics or leaf metrics. Leaf metrics are the actual values that will be colorized and scored.

 Here's an example of the expected format:

 ```json
 {
     "ML": {
         "metric1": 0.85,
         "metric2": 0.20
     },
     "OOD": {
         "metric3": 0.92,
         "metric4": 0.15
     },
     "Physics": {
         "metric5": 0.78,
         "metric6": 0.30
     }
 }
 ```

 In this example:
 - The top-level keys (`ML`, `OOD`, `Physics`) represent different categories or components.
 - Each category contains several metrics (e.g., `metric1`, `metric2`).
 - The values associated with each metric are numerical values.

 The metrics can also be structured in deeper hierarchies:

 ```json
 {
     "Category1": {
         "SubCategory1": {
             "metric1": 0.75,
             "metric2": 0.25
         },
         "SubCategory2": {
             "metric3": 0.60,
             "metric4": 0.40
         }
     },
     "Category2": {
         "metric5": 0.90,
         "metric6": 0.10
     }
 }
 ```

 It's important to maintain a consistent branching structure throughout the metrics data. This means that if one sub-category contains further nested
 categories, all other sub-categories at the same level should also have the same structure.

 The metric keys must match the keys defined in the configuration file. For example, if the configuration file defines thresholds for `metric1`, the metrics data must also contain `metric1`.
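
One way to check this before scoring is to walk the metrics tree, collect the leaf names, and compare them against the threshold keys. The sketch below is a hypothetical pre-check, not part of the scoring feature; `leaf_names` and the sample data are assumptions made for illustration.

```python
def leaf_names(tree):
    """Collect the names of all leaf metrics in a nested metrics dictionary."""
    names = set()
    for key, value in tree.items():
        if isinstance(value, dict):
            names |= leaf_names(value)  # descend into a sub-category
        else:
            names.add(key)  # a numerical leaf metric
    return names


# Hypothetical data, for illustration only.
metrics = {
    "ML": {"metric1": 0.85, "metric2": 0.20},
    "OOD": {"metric3": 0.92},
}
thresholds = {
    "metric1": {"comparison_type": "maximize", "thresholds": [0.7, 0.9]},
    "metric2": {"comparison_type": "minimize", "thresholds": [0.1, 0.3]},
}

# Leaf metrics that have no thresholds defined in the configuration.
missing = leaf_names(metrics) - set(thresholds)
print(sorted(missing))
```

Here `metric3` is reported because it appears in the metrics data but has no threshold entry.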

## 3. End-to-End Scoring Example

This section demonstrates an end-to-end example of how to use the scoring feature. We will load a configuration file, read a metrics file, colorize the metrics, calculate sub-scores, and then calculate the global score.

### 1. Define Example Configuration File

First, let's define an example configuration file (`config.ini`).
This file contains the thresholds, value_by_color mappings, and coefficients needed for scoring.

```ini
[thresholds]
a_or = {"comparison_type": "minimize", "thresholds": [0.02, 0.05]}
spearman_correlation_drag = {"comparison_type": "maximize", "thresholds": [0.7, 0.9]}
inference_time = {"comparison_type": "minimize", "thresholds": [500, 700]}
[valuebycolor]
green = 2
orange = 1
red = 0
[coefficients]
ML = {"value": 0.4, "Accuracy": {"value": 0.75}, "SpeedUP": {"value": 0.25}}
OOD = {"value": 0.3, "Accuracy": {"value": 0.75}, "SpeedUP": {"value": 0.25}}
Physics = {"value": 0.3}
```
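
To see what this file contains once parsed, the sketch below reads the same text with Python's standard `configparser` and decodes the JSON values with `json.loads`. This is only an illustration using the standard library; how the `Scoring` class actually loads the file is not shown here and may differ. Setting `optionxform = str` keeps key case intact, since `configparser` would otherwise lowercase keys such as `ML`.

```python
import configparser
import json

CONFIG_TEXT = """
[thresholds]
a_or = {"comparison_type": "minimize", "thresholds": [0.02, 0.05]}
spearman_correlation_drag = {"comparison_type": "maximize", "thresholds": [0.7, 0.9]}
inference_time = {"comparison_type": "minimize", "thresholds": [500, 700]}
[valuebycolor]
green = 2
orange = 1
red = 0
[coefficients]
ML = {"value": 0.4, "Accuracy": {"value": 0.75}, "SpeedUP": {"value": 0.25}}
OOD = {"value": 0.3, "Accuracy": {"value": 0.75}, "SpeedUP": {"value": 0.25}}
Physics = {"value": 0.3}
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # preserve key case ("ML" stays "ML")
parser.read_string(CONFIG_TEXT)

# Each value in [thresholds] and [coefficients] is a JSON object.
thresholds = {k: json.loads(v) for k, v in parser["thresholds"].items()}
value_by_color = {k: int(v) for k, v in parser["valuebycolor"].items()}
coefficients = {k: json.loads(v) for k, v in parser["coefficients"].items()}

print(thresholds["a_or"])
print(value_by_color)
print(list(coefficients))
```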

### 2. Define Example Metrics File

Next, let's define an example metrics file (`metrics.json`).
This file contains the metric values that we want to score.

```json
{
    "ML": {
        "a_or": 0.01,
        "spearman_correlation_drag": 0.95
    },
    "OOD": {
        "inference_time": 400
    },
    "Physics": {
        "a_or": 0.06,
        "spearman_correlation_drag": 0.75
    },
    "Speed": {
        "inference_time": 600
    }
}
```

```python
import json
from scoring import Scoring
from utils import read_json

# Initialize the Scoring class with the path to the configuration file
scoring = Scoring(config_path="config.ini")

# Load metrics data from the JSON file
metrics_path = "metrics.json"
metrics_data = read_json(json_path=metrics_path)
```

### 3. Colorize Metrics

Now, let's colorize the metrics using the `colorize_metrics` function.
This will convert the numerical metric values into color strings based on the
thresholds defined in the configuration file.

```python
# Colorize the metrics data
colorized_metrics = scoring.colorize_metrics(metrics_data)
print("Colorized Metrics:", json.dumps(colorized_metrics, indent=4))
```

### 4. Calculate Sub-Scores

Next, let's calculate the sub-scores using the `calculate_sub_scores` function.
This will calculate the score for each sub-tree in the metrics data.

```python
# Calculate sub-scores
sub_scores = scoring.calculate_sub_scores(colorized_metrics)
print("Sub-Scores:", json.dumps(sub_scores, indent=4))
```

### 5. Calculate Global Score

Finally, let's calculate the global score using the `calculate_global_score` function.
This will calculate the overall score for the entire metrics tree.

```python
# Calculate global score
global_score = scoring.calculate_global_score(sub_scores)
print("Global Score:", global_score)
```