# SoH estimation comparison with real-world SoH readouts
On the 14th of January, we received a small sample of SoH readouts from a few Tesla vehicles.  
This notebook benchmarks our SoH estimates against those real-world readouts.  

## Setup

In [None]:
import plotly.express as px

from core.plt_utils import *
from core.sql_utils import *
from core.stats_utils import *
from core.pandas_utils import *
from transform.fleet_info.main import fleet_info
from transform.raw_results.tesla_results import get_results
from transform.processed_tss.ProcessedTimeSeries import ProcessedTimeSeries

## Data extraction

In [None]:
charges = get_results()
charges

In [None]:
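def agg_soh_per_vin(charges: DF) -> DF:
    """Collapse per-charge results to one row per VIN and attach the fleet name."""
    return (
        charges
        .groupby("vin")
        .agg(
            soh=("soh", "median"),  # median is robust to outlier charges
            # "last" takes the last row per VIN, so it relies on the row order of `charges`
            odometer=("odometer", "last"),
            date=("date", "last"),
        )
        .reset_index()
        .pipe(left_merge, fleet_info, "vin", "vin", src_dest_cols=["fleet_name"])
    )

raw_soh_per_vin = agg_soh_per_vin(charges)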
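
The aggregation above can be tried in isolation with plain pandas. This is a minimal sketch, not the project code: the data is invented, `agg_soh_per_vin_sketch` is a hypothetical name, and `DataFrame.merge(how="left")` stands in for the project's `left_merge` helper, whose exact behavior is assumed here. The explicit sort by `date` is an addition, so that `"last"` picks the most recent row whatever the input order.

```python
import pandas as pd

# Toy data, invented for illustration only.
charges = pd.DataFrame({
    "vin": ["A", "A", "A", "B", "B"],
    "soh": [0.95, 0.93, 0.94, 0.88, 0.90],
    "odometer": [10_000, 10_500, 11_000, 52_000, 52_400],
    "date": pd.to_datetime(
        ["2024-01-01", "2024-01-08", "2024-01-15", "2024-01-03", "2024-01-10"]
    ),
})
fleet_info = pd.DataFrame({"vin": ["A", "B"], "fleet_name": ["fleet_1", "fleet_2"]})


def agg_soh_per_vin_sketch(charges: pd.DataFrame, fleet_info: pd.DataFrame) -> pd.DataFrame:
    """One row per VIN: median SoH, latest odometer and date, plus fleet name."""
    return (
        charges
        .sort_values("date")  # make "last" mean "most recent"
        .groupby("vin", as_index=False)
        .agg(
            soh=("soh", "median"),
            odometer=("odometer", "last"),
            date=("date", "last"),
        )
        # Assumed stand-in for the project's left_merge helper.
        .merge(fleet_info[["vin", "fleet_name"]], on="vin", how="left")
    )


soh_per_vin = agg_soh_per_vin_sketch(charges, fleet_info)
print(soh_per_vin)
```

If `get_results()` already returns rows in chronological order, the sort is redundant but harmless.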

In [None]:
ground_truth = (
    pd.read_csv(
        "data_cache/ground_truth.csv",
        dtype={
            "Score Aviloo": "int64",
            "SoH Readout": "float64",
            "VIN": "string",
            "BIB SOH": "float64",
            "Brand (FlashTest)": "string",
            "Model Group (FlashTest)": "string",
            "Mileage": "float64",
        }
    )
    .rename(columns={"VIN": "vin", "SoH Readout": "ground_truth_soh"})
)

## Comparison

In [None]:
soh_per_vin_with_ground_truth = (
    raw_soh_per_vin
    # Keep only the vehicles for which we have a readout.
    .query("vin in @ground_truth.vin")
    .pipe(left_merge, ground_truth, "vin", "vin", src_dest_cols=["ground_truth_soh"])
    # The readout is expressed in percent; convert it to a fraction like our estimate.
    .eval("ground_truth_soh = ground_truth_soh / 100.0")
    .eval("residual_soh = soh - ground_truth_soh")
    .assign(abs_residual_soh=lambda x: x["residual_soh"].abs())
    .reset_index(drop=True)
)
display(soh_per_vin_with_ground_truth)
display(soh_per_vin_with_ground_truth.describe())
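
The `query` filter above silently drops any ground-truth VIN that has no estimate, so it can help to check how many readouts actually take part in the comparison. The sketch below shows one way to do that with the `indicator` option of `DataFrame.merge`. The data is invented and the `estimates`, `coverage`, `missing_vins` and `n_matched` names are hypothetical.

```python
import pandas as pd

# Toy data, invented for illustration only.
estimates = pd.DataFrame({"vin": ["A", "B", "C"], "soh": [0.94, 0.89, 0.97]})
ground_truth = pd.DataFrame(
    {"vin": ["A", "B", "D"], "ground_truth_soh": [95.0, 90.0, 92.0]}
)

# Left-join from the ground truth so that every readout appears exactly once;
# the "_merge" column records whether a matching estimate was found.
coverage = ground_truth.merge(estimates, on="vin", how="left", indicator=True)

missing_vins = coverage.loc[coverage["_merge"] == "left_only", "vin"].tolist()
n_matched = int((coverage["_merge"] == "both").sum())

print(f"{n_matched} of {len(ground_truth)} ground-truth VINs have an estimate")
print(f"VINs without an estimate: {missing_vins}")
```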

## Conclusion
On this small sample, our SoH estimates track the readouts closely: the mean residual is 0.016 (1.6 percentage points of SoH) and the standard deviation is 0.013 (1.3 percentage points).  
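
A "mean residual" can be read as either a signed bias or a mean absolute error, and the two answer different questions. The sketch below computes both, along with the standard deviation and the RMSE, on invented numbers. The `comparison` and `summary` names are hypothetical and the values are not the notebook's results.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only; both columns are fractions (0 to 1).
comparison = pd.DataFrame({
    "soh": [0.94, 0.89, 0.97, 0.91],
    "ground_truth_soh": [0.95, 0.90, 0.95, 0.92],
})

residual = comparison["soh"] - comparison["ground_truth_soh"]

summary = pd.Series({
    "n": len(residual),
    "bias": residual.mean(),                   # signed: do we over- or under-estimate?
    "mae": residual.abs().mean(),              # typical size of the error
    "std": residual.std(),                     # sample std (ddof=1), as in describe()
    "rmse": np.sqrt((residual ** 2).mean()),   # penalizes large errors more
})
print(summary.round(4))
```

With only a few vehicles these figures are indicative at best, so reporting `n` next to them keeps the sample size visible.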