# Charging type identification
We want to know how the vehicles are being charged.  
To do so, we either use the explicit data supplied by the data providers or infer the charging type from other variables, such as the charging speed or the charging power.

## Setup

### Imports

In [None]:
import plotly.express as px

from core.constants import KWH_TO_KJ
from transform.raw_results.config import LEVEL_1_MAX_POWER, LEVEL_2_MAX_POWER
from core.pandas_utils import *
from transform.raw_tss.main import get_raw_tss, GET_RAW_TSS_FUNCTIONS
from transform.processed_tss.ProcessedTimeSeries import ProcessedTimeSeries
from transform.processed_tss.config import ALL_MAKES

### Raw time series EDA

We will look at the sanity checks of each brand to find the columns that we can use to infer the charging type.  
Currently, most of these variables are not passed into processed_tss so we have to look into raw_tss.  

In [None]:
for make in GET_RAW_TSS_FUNCTIONS.keys():
    print(make)
    # Keep only the columns whose name contains "charg".
    raw_tss = get_raw_tss(make).filter(like="charg")
    display(sanity_check(raw_tss))
    print("============================")
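For readers without access to `get_raw_tss` and `sanity_check`, the following is a minimal sketch of the same idea on a toy frame. The column names and the `summary` table are assumptions made for illustration; the real `sanity_check` helper may report different statistics.

```python
import pandas as pd

# Toy frame standing in for one make's raw time series (made-up columns).
raw_tss = pd.DataFrame({
    "vin": ["A", "A", "B"],
    "charging.status": ["CHARGING", None, "IDLE"],
    "charging.charging_rate": [7.2, None, 0.0],
    "odometer": [100, 120, 300],
})

# filter(like=...) keeps the columns whose name contains the substring.
charging_cols = raw_tss.filter(like="charg")

# Hypothetical stand-in for sanity_check: dtype, missing share, distinct values.
summary = pd.DataFrame({
    "dtype": charging_cols.dtypes.astype(str),
    "nan_ratio": charging_cols.isna().mean(),
    "n_unique": charging_cols.nunique(),
})
print(summary)
```

A column with a high `nan_ratio` or a single distinct value is unlikely to help identify the charging type.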

### Processed time series EDA

Here is what we can use for each brand:  
- bmw: charging_method  

- tesla: fast_charger_type, fast_charger_present, charge_rate, charger_actual_current, charger_power, charging_state  
    We could also use charge_port_cold_weather_mode, charger_pilot_current, charger_voltage  

- kia: No explicit charging type data. We might have to infer it from the charging speed.  

- mercedes-benz: charging.smart_charging_status, charging.status, charging.charging_rate  

- ford: No explicit charging type data. We might have to infer it from the charging speed.  

- renault: charging.status, charging.battery_charge_type, charging.charging_rate  

- opel: electricity.charging.mode, electricity.charging.planned, electricity.charging.rate  

- peugeot: electricity.charging.mode, electricity.charging.planned, electricity.charging.rate

- ds: electricity.charging.mode, electricity.charging.rate

- fiat: No explicit charging type data. We might have to infer it from the charging speed.  

- volvo-cars: possibly charging.status. Otherwise we might have to infer it from the charging speed.  
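The list above can be kept as a small lookup table so that later cells can tell which makes have explicit columns and which need inference. This is only a sketch: `CHARGING_TYPE_COLUMNS` and `makes_needing_inference` are hypothetical names, not part of the project code, and the table lists only the main candidates for each make.

```python
# Candidate columns per make, transcribed from the list above.
# An empty list means no explicit charging type column was found.
CHARGING_TYPE_COLUMNS: dict[str, list[str]] = {
    "bmw": ["charging_method"],
    "tesla": ["fast_charger_type", "fast_charger_present", "charge_rate",
              "charger_actual_current", "charger_power", "charging_state"],
    "kia": [],
    "mercedes-benz": ["charging.smart_charging_status", "charging.status",
                      "charging.charging_rate"],
    "ford": [],
    "renault": ["charging.status", "charging.battery_charge_type",
                "charging.charging_rate"],
    "opel": ["electricity.charging.mode", "electricity.charging.planned",
             "electricity.charging.rate"],
    "peugeot": ["electricity.charging.mode", "electricity.charging.planned",
                "electricity.charging.rate"],
    "ds": ["electricity.charging.mode", "electricity.charging.rate"],
    "fiat": [],
    "volvo-cars": ["charging.status"],  # uncertain candidate
}

def makes_needing_inference(columns_by_make: dict[str, list[str]]) -> list[str]:
    """Return the makes that have no explicit charging type column."""
    return sorted(make for make, cols in columns_by_make.items() if not cols)

print(makes_needing_inference(CHARGING_TYPE_COLUMNS))
```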

We will visualize these variables in time series scatter plots to compare the different values of the charging type.  

In [None]:
def compute_charging_levels(tss:DF) -> DF:
    return (
        tss
        .eval("level_1 = soc_diff.where((power < @LEVEL_1_MAX_POWER) & trimmed_in_charge, 0)")
        .eval("level_2 = soc_diff.where(power.between(@LEVEL_1_MAX_POWER, @LEVEL_2_MAX_POWER) & trimmed_in_charge, 0)")
        .eval("level_3 = soc_diff.where((power > @LEVEL_2_MAX_POWER) & trimmed_in_charge, 0)")
    )

def compute_charging_rate(tss:DF) -> DF:
    tss_grp = tss.groupby("vin")
    tss = (
        tss
        .assign(
            soc_diff=tss_grp["soc"].diff(),
            # total_seconds() keeps the first row of each VIN as NaN.
            time_diff=tss_grp["date"].diff().dt.total_seconds()
        )
        # Power estimate: energy gained over elapsed time; gaps longer than one hour are zeroed.
        .eval("power = capacity * @KWH_TO_KJ * soc_diff / time_diff")
        .eval("power = power.mask(time_diff > 3600, 0)")
    )
    # Use the median power of each segment to smooth out noisy samples.
    tss["power"] = tss.groupby(["vin", "in_charge_idx"])["power"].transform("median")
    return tss

def update_in_charge_idx(tss:DF) -> DF:
    tss = tss.dropna(subset=["soc", "date"])
    # True whenever in_charge differs from the previous row of the same VIN.
    tss["in_new_charge"] = tss.groupby("vin")["in_charge"].shift(1, fill_value=False).ne(tss["in_charge"])
    # Running count of changes gives each segment its own index per VIN.
    tss["in_charge_idx"] = tss.groupby("vin")["in_new_charge"].cumsum()
    return tss

tss = (
    ProcessedTimeSeries("mercedes-benz")
    .pipe(update_in_charge_idx)
    .pipe(compute_charging_rate)
    .pipe(compute_charging_levels)
)
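To make the segment indexing in `update_in_charge_idx` easier to follow, here is a minimal sketch of the same shift / compare / cumulative-sum pattern on a toy frame. The data is invented, and the explicit `astype(int)` cast is an addition made here so the cumulative sum is unambiguous.

```python
import pandas as pd

# Toy data: two vehicles with alternating idle and charging rows.
toy = pd.DataFrame({
    "vin": ["A"] * 6 + ["B"] * 3,
    "in_charge": [False, True, True, False, True, True, True, True, False],
})

# Previous in_charge value within each VIN; the first row is compared to False.
prev = toy.groupby("vin")["in_charge"].shift(1, fill_value=False)

# True on every change of state, at the start and at the end of a charge.
toy["in_new_charge"] = prev.ne(toy["in_charge"])

# Running count of state changes per VIN: one index per segment.
toy["in_charge_idx"] = toy["in_new_charge"].astype(int).groupby(toy["vin"]).cumsum()
print(toy)
```

Because the flag is raised on both the start and the end of a charge, idle segments get their own index too. For vehicle A the two charging segments carry indices 1 and 3.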

In [None]:
# Pick the VIN with the largest summed power, which gives the most charging activity to plot.
top_power_vin = tss.groupby("vin")["power"].sum().idxmax()
ts = tss.query("vin == @top_power_vin")

In [None]:
px.scatter(
    ts.eval("in_level_2 = level_2 > 0").eval("in_level_3 = level_3 > 0"),
    x="date",
    y="soc",
    color="power",
    color_continuous_scale="Rainbow",
    symbol="in_charge",
    #symbol="in_new_charge",
    #hover_data=["power", "soc_diff", "time_diff", "in_charge_idx", "level_3"],
)

In [None]:
tss = ProcessedTimeSeries("tesla")
TARGET_VIN = "5YJ3E7EA4LF547495"
ts = tss.query("vin == @TARGET_VIN")

In [None]:
px.scatter(
    ts,
    x="date",
    y="soc",
    color="charging_status"
)

In [None]:
for make in ALL_MAKES:
    print(make)
    processed_tss = ProcessedTimeSeries(make)
    if "charging_method" in processed_tss.columns:
        most_common_vin = (
            processed_tss
            .groupby("vin")["charging_method"]
            .nunique()
            .idxmax()
        )
        ts:DF = (
            processed_tss
            .query("vin == @most_common_vin")
            .astype({"charging_method": "string"})
            .eval("charging_method = charging_method.ffill()")
            .dropna(subset=["soc", "date"])
        )
        display(ts["charging_method"].value_counts(dropna=False))
        px.scatter(
            ts.eval("charging_method = charging_method.fillna('unknown')"),
            x="date",
            y="soc",
            color="charging_method",
            title=f"{make}, {most_common_vin} charging method",
        ).show()
    else:
        print("No charging method column.")
    print("============================")


### Charging power
Looking at the various brands, we can see that the charging_method column alone cannot give us the charging type for all of them.  
A few brands simply don't have it, and for those that do, its values are not always informative.  
Let's try to use the charging power instead.
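
As a rough illustration of the power-based approach, the sketch below estimates the average power of a few invented charging sessions and bins it into three levels. The thresholds are placeholder assumptions (roughly the commonly quoted AC Level 1 and Level 2 limits); the project's `LEVEL_1_MAX_POWER` and `LEVEL_2_MAX_POWER` may differ. The column names are hypothetical as well.

```python
import pandas as pd

KJ_PER_KWH = 3600  # 1 kWh = 3600 kJ

# Placeholder thresholds in kW (assumptions, not the project's config values).
ASSUMED_LEVEL_1_MAX_KW = 1.9
ASSUMED_LEVEL_2_MAX_KW = 19.2

# Invented charging sessions: battery capacity, SoC gained (fraction), duration.
sessions = pd.DataFrame({
    "vin": ["A", "A", "B", "B"],
    "capacity_kwh": [60.0, 60.0, 75.0, 75.0],
    "soc_gain": [0.10, 0.50, 0.02, 0.40],
    "duration_s": [3600, 1800, 3600, 14400],
})

# Energy gained in kJ divided by seconds gives the average power in kW.
sessions["power_kw"] = (
    sessions["capacity_kwh"] * KJ_PER_KWH * sessions["soc_gain"] / sessions["duration_s"]
)

# Bin the average power into charging levels.
sessions["level"] = pd.cut(
    sessions["power_kw"],
    bins=[0, ASSUMED_LEVEL_1_MAX_KW, ASSUMED_LEVEL_2_MAX_KW, float("inf")],
    labels=["level_1", "level_2", "level_3"],
)
print(sessions[["vin", "power_kw", "level"]])
```

With these assumed thresholds, a session that adds 50% to a 60 kWh battery in half an hour averages about 60 kW and lands in level 3, while the slower sessions fall into levels 1 and 2.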