## Impact of mean SoC on SoH

**Objective:**
Assess whether the mean SoC influences SoH degradation.

**What this notebook does:**

Computes a time-weighted mean SoC per vehicle: each SoC reading is weighted by the time spent at that level, and the weighted sum is divided by the total observation time.  
The analysis covers charging, driving, and idle (parked) phases.
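
As an illustration of the weighting, the sketch below computes `sum(soc_i * dt_i) / sum(dt_i)` on a few made-up samples for a single hypothetical vehicle `"A"`. The column names mirror the ones used later in this notebook, but the data and the variable `samples` are assumptions for the example only.

```python
import pandas as pd

# Hypothetical toy samples: one vehicle, SoC in percent.
samples = pd.DataFrame({
    "vin": ["A", "A", "A", "A"],
    "date": pd.to_datetime([
        "2024-01-01 00:00", "2024-01-01 01:00",
        "2024-01-01 04:00", "2024-01-01 05:00",
    ]),
    "soc": [50.0, 80.0, 80.0, 20.0],
})
samples = samples.sort_values(["vin", "date"])

# Seconds elapsed since the previous sample of the same vehicle (NaN for the first one).
samples["time_diff"] = samples.groupby("vin")["date"].diff().dt.total_seconds()
# Each SoC reading is weighted by the time elapsed since the previous sample.
samples["time_at_soc"] = samples["soc"] * samples["time_diff"]

# sum() skips the NaN of the first sample by default.
totals = samples.groupby("vin")[["time_at_soc", "time_diff"]].sum()
soc_mean = totals["time_at_soc"] / totals["time_diff"]
plain_mean = samples["soc"].mean()
```

Here the time-weighted mean (68 %) differs from the plain mean of the readings (57.5 %) because the vehicle stayed at 80 % for most of the window.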

**Conclusion:**

Vehicles with a lower mean SoC tend to show a lower SoH.  
Conversely, a high mean SoC (> 85 %) is associated with a higher SoH, which is somewhat surprising.  
We therefore assume that mean SoC has a direct effect on SoH but is not necessarily the most influential factor, which would explain the anomalies observed.

**Next steps:**  
Study whether the mean SoC during idle phases only (when the vehicle is neither driving nor charging) has an impact on SoH.  
Take a closer look at the vehicles that combine a high mean SoC with a high SoH.


In [None]:
import numpy as np
import pandas as pd
from core.sql_utils import *
import plotly.express as px
import plotly.graph_objects as go
from scipy.optimize import curve_fit
from sqlalchemy import text
from datetime import datetime

In [None]:
from transform.raw_tss.tesla_raw_tss import get_raw_tss
from transform.processed_tss.ProcessedTimeSeries import ProcessedTimeSeries


## Get data

In [None]:
def plot_log(df, column):
    """Plot one fitted SoH-vs-odometer trend per distinct value of `column`."""
    def log_function(x, a):
        # One-parameter model: SoH is 1 at 0 km and varies with log(1 + km / 1000)
        return 1 + a * np.log1p(x / 1000)

    fig = go.Figure()
    for value in df[column].unique():
        # Keep only rows usable for the fit (curve_fit rejects NaN inputs)
        df_model_temp = (
            df[df[column] == value]
            .dropna(subset=['soh', 'odometer'])
            .sort_values('odometer')
        )
        # Fit the log function to this group's points
        popt, _ = curve_fit(log_function, df_model_temp['odometer'], df_model_temp['soh'])
        # Evaluate the fitted curve over a fixed odometer range
        x_vals = np.linspace(0.1, 240000, 500)
        y_vals = log_function(x_vals, *popt)
        fig.add_trace(go.Scatter(x=x_vals, y=y_vals, name=f'{value} trend'))

    return fig
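
The model fitted in `plot_log` has a single parameter and is linear in it, so its least-squares solution has a closed form: with `z = log1p(odometer / 1000)`, the slope is `sum(z * (soh - 1)) / sum(z * z)`. The sketch below is one way to cross-check a `curve_fit` result under that observation. The helper name `fit_log_slope` and the synthetic values are assumptions for illustration, not part of the notebook's pipeline.

```python
import numpy as np

def fit_log_slope(odometer, soh):
    """Least-squares estimate of a in soh = 1 + a * log1p(odometer / 1000)."""
    z = np.log1p(np.asarray(odometer, dtype=float) / 1000)
    y = np.asarray(soh, dtype=float) - 1.0
    return float(np.sum(z * y) / np.sum(z * z))

# Synthetic, noise-free data generated with a known slope (illustrative values only).
true_a = -0.02
odometer = np.array([5_000, 20_000, 60_000, 120_000, 200_000])
soh = 1 + true_a * np.log1p(odometer / 1000)

a_hat = fit_log_slope(odometer, soh)
```

On noise-free data the estimate recovers the slope used to generate the points; on real data it should agree with `curve_fit` up to numerical tolerance.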

In [None]:
engine = get_sqlalchemy_engine()

# Open a single connection and close it automatically once the query has run
with engine.connect() as connection:
    dbeaver_df = pd.read_sql(text("""SELECT * FROM vehicle_data vd
            join vehicle v
            on v.id = vd.vehicle_id
            join vehicle_model vm 
            on vm.id = v.vehicle_model_id
            WHERE vm.model_name like '%model%';"""), connection)



soh_df = dbeaver_df.groupby('vin', as_index=False, observed=True)[['soh', 'odometer', 'version']].last()

In [None]:
ts = ProcessedTimeSeries('tesla')

In [None]:
#df['date'] = pd.to_datetime(df['date'])
ts.sort_values(['vin', 'date'], inplace=True)

In [None]:
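# Time elapsed since the previous point of the same vehicle.
# The result stays index-aligned with ts; NaT marks each vehicle's first point.
ts['time_diff'] = ts.groupby('vin', observed=True)['date'].diff()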

In [None]:
# total seconds between two points
ts['time_diff'] = ts['time_diff'].dt.total_seconds()
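
A minimal sketch of the per-vehicle time delta on made-up data, to show two properties the later aggregation relies on: `groupby(...).diff()` returns a series that carries the same index as the sorted frame, and the first sample of each vehicle has no predecessor, so its delta is missing. The frame `toy` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical interleaved samples for two vehicles.
toy = pd.DataFrame({
    "vin": ["A", "B", "A", "B"],
    "date": pd.to_datetime([
        "2024-01-01 00:00", "2024-01-01 00:00",
        "2024-01-01 00:30", "2024-01-01 02:00",
    ]),
})
toy = toy.sort_values(["vin", "date"])

# Delta to the previous sample of the same vehicle, converted to seconds.
delta = toy.groupby("vin")["date"].diff().dt.total_seconds()

# Same index as the frame, so it can be assigned back as a column row by row.
toy["time_diff"] = delta
```

The missing deltas are skipped by `sum`, so they do not contribute to a vehicle's total observed time.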

In [None]:
# SoC weighted by the seconds elapsed since the previous point (SoC x seconds)
ts['time_at_soc'] = ts['soc'] * ts['time_diff']

In [None]:
# SoC category flags: low (< 20 %), mid (20 % to < 80 %), high (>= 80 %)
ts['low_soc'] = (ts['soc'] < 20).astype(int)
ts['mid_soc'] = ((ts['soc'] >= 20) & (ts['soc'] < 80)).astype(int)
ts['high_soc'] = (ts['soc'] >= 80).astype(int)

In [None]:
# Time spent in each SoC category, in seconds
ts['time_at_low_soc'] = ts['low_soc'] * ts['time_diff']
ts['time_at_mid_soc'] = ts['mid_soc'] * ts['time_diff']
ts['time_at_high_soc'] = ts['high_soc'] * ts['time_diff']

In [None]:
# Share of the observed time each vehicle spent in each SoC category
ratio_soc = (ts.groupby('vin', as_index=False, observed=True).agg(
    total_time_at_low_soc=('time_at_low_soc', 'sum'),
    total_time_at_mid_soc=('time_at_mid_soc', 'sum'),
    total_time_at_high_soc=('time_at_high_soc', 'sum'),
    total_time_diff=('time_diff', 'sum'))
             .eval("ratio_low=total_time_at_low_soc/total_time_diff")
             .eval("ratio_high=total_time_at_high_soc/total_time_diff")
             .eval("ratio_mid=total_time_at_mid_soc/total_time_diff")
             .eval("ratio_extremum=(total_time_at_low_soc+total_time_at_high_soc)/total_time_diff"))
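
As an alternative sketch, the same time shares can be derived by binning SoC with `pd.cut` instead of building one indicator column per category. The thresholds match the ones used above (20 and 80); the upper edge of 101 is an assumption so that a reading of exactly 100 % falls in the last bin. The frame `toy` and its values are made up.

```python
import pandas as pd

# Hypothetical samples for one vehicle: SoC in percent, time_diff in seconds.
toy = pd.DataFrame({
    "vin": ["A"] * 4,
    "soc": [10.0, 50.0, 90.0, 80.0],
    "time_diff": [100.0, 200.0, 300.0, 400.0],
})

# right=False makes the bins left-closed: [0, 20), [20, 80), [80, 101).
toy["soc_band"] = pd.cut(
    toy["soc"], bins=[0, 20, 80, 101], labels=["low", "mid", "high"], right=False
)

# Seconds spent per vehicle and band, then normalised by the vehicle's total.
time_per_band = (
    toy.groupby(["vin", "soc_band"], observed=True)["time_diff"]
    .sum()
    .unstack(fill_value=0)
)
ratios = time_per_band.div(time_per_band.sum(axis=1), axis=0)
```

With this layout the three ratios of a vehicle sum to 1 by construction, which makes for a quick consistency check.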

In [None]:
# get SoH, odometer, version from dbeaver
ratio_soc = ratio_soc.merge(soh_df, how='inner', on='vin')

## Graph and results

In [None]:
# Flag vehicles that spent more than 20 % of the observed time below 20 % SoC
ratio_soc['ratio_low_cat'] = ratio_soc['ratio_low'].apply(lambda x: 'over 20%' if x > .2 else '20% or under')

In [None]:
ratio_soc["ratio_low_cat"].value_counts()

In [None]:
fig = plot_log(ratio_soc, 'ratio_low_cat')
fig.update_layout(title='Impact of the low SoC on the battery degradation')
fig.update_xaxes(title='odometer')
fig.update_yaxes(title='SoH')

## Archive

In [None]:
avg_soc = ts.groupby('vin', observed=True, as_index=False).agg(
    total_time_at_soc=("time_at_soc", 'sum'),
    total_time_diff=('time_diff', 'sum')).eval('soc_mean = total_time_at_soc/total_time_diff')

In [None]:
avg_soc = avg_soc.merge(soh_df, on='vin')

In [None]:
avg_soc.describe()

In [None]:
avg_soc['soc_cat'] = avg_soc['soc_mean'].apply(lambda x: "Low soc" if x <= 40 else "Normal soc")

### Graph and results

In [None]:
avg_soc['soc_cat'].value_counts()

In [None]:
fig = plot_log(avg_soc, 'soc_cat')
fig.update_layout(title='Impact of the soc on the battery degradation')
fig.update_xaxes(title='odometer')
fig.update_yaxes(title='SoH')