In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

Let's test whether warranty data can be used as a proxy for battery life. We will do this by assessing how strongly the following manufacturer-agnostic metrics influence the warranty terms:
* Cell chemistry
* Cooling type
* SoC window

We will also include manufacturer-specific data in the comparison. Analysing the relative contribution of each of these to explaining the warranty will show whether warranty is a good metric (strongly dependent on the physical properties of the battery pack) or not.

In [3]:
df = pd.read_excel('pack_benchmark_spec_v1.055.xlsm', sheet_name='Benchmarks', header=[0, 1], skiprows=1, nrows=1055)

In [4]:
df.loc[:,'Warranty'].describe()

Unnamed: 0,Years,Mileage,Cycles
count,159.0,115.0,49.0
mean,8.251572,111083.434323,6059.183673
std,2.056076,55007.134616,3872.171999
min,1.0,10.0,800.0
25%,8.0,100000.0,4000.0
50%,8.0,100000.0,6000.0
75%,8.0,100000.0,8000.0
max,15.0,372902.423866,20000.0


We can estimate that: 

$pack\ cycles = \frac{pack\ mileage\ (\mathrm{mi})}{pack\ capacity\ (\mathrm{kWh/cycle})} \times \frac{power\ consumption\ (\mathrm{kW})}{average\ speed\ (\mathrm{mi/h})}$ 
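
As a quick sanity check on the units, the estimate can be sketched as a small function. The function name `estimate_pack_cycles` and the input values are hypothetical and chosen only for illustration; mileage is assumed to be in miles and average speed in miles per hour.

```python
def estimate_pack_cycles(mileage_mi, capacity_kwh, power_kw, avg_speed_mph):
    """Estimate full-pack cycles from the warranty mileage.

    mileage / speed   -> hours driven
    hours * power     -> energy throughput (kWh)
    energy / capacity -> equivalent full cycles
    """
    return mileage_mi * power_kw / (capacity_kwh * avg_speed_mph)


# Illustrative (made-up) inputs: 100,000 mi warranty, 50 kWh pack,
# 15 kW average power draw, 60 mph average speed.
example_cycles = estimate_pack_cycles(100_000, 50, 15, 60)
print(example_cycles)
```

With these made-up inputs the pack delivers 25,000 kWh over 1,667 hours of driving, which corresponds to 500 equivalent full cycles.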

Further, we can estimate that the power consumption depends on the total pack capacity. Naively, the relationship should be linear (a similar discharge 'time' for each pack), but in reality discharge time probably increases with pack capacity (bigger vehicles need to run longer), so power consumption should grow more slowly than capacity. Average speed, on the other hand, is probably roughly constant (~60 mph). 

Putting this together, with the constant average speed absorbed into the proportionality and power consumption taken to scale as a power of pack capacity, the relationship is:

$pack\ cycles \sim pack\ mileage \times (pack\ capacity)^a$

Taking logarithms of both sides gives the linear equation:

$\log(pack\ cycles) = \log(pack\ mileage) + a\log(pack\ capacity) + b$ , \
with $a$ and $b$ being undetermined constants. 
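
Because the coefficient on $\log(pack\ mileage)$ is fixed at 1, the two constants could be estimated by regressing $\log(pack\ cycles) - \log(pack\ mileage)$ on $\log(pack\ capacity)$. The sketch below is one possible way to do this, not the notebook's method: the helper `fit_cycle_model` is hypothetical, and the data are synthetic and noise-free, generated from assumed values of $a$ and $b$ purely to show that the fit recovers them.

```python
import numpy as np


def fit_cycle_model(cycles, mileage, capacity):
    """Least-squares fit of log(cycles) - log(mileage) = a*log(capacity) + b."""
    cycles = np.asarray(cycles, dtype=float)
    mileage = np.asarray(mileage, dtype=float)
    capacity = np.asarray(capacity, dtype=float)
    # Move log(mileage) to the left-hand side, since its coefficient is fixed at 1.
    y = np.log(cycles) - np.log(mileage)
    x = np.log(capacity)
    # polyfit returns coefficients from highest degree down: slope (a), intercept (b).
    a, b = np.polyfit(x, y, deg=1)
    return a, b


# Synthetic data only: true_a and true_b are assumed values, not results.
rng = np.random.default_rng(0)
capacity = rng.uniform(5, 100, size=20)    # kWh
mileage = rng.uniform(5e4, 2e5, size=20)   # miles
true_a, true_b = -0.5, -3.0
cycles = mileage * capacity**true_a * np.exp(true_b)

a_hat, b_hat = fit_cycle_model(cycles, mileage, capacity)
print(a_hat, b_hat)
```

On real data the inputs would be the `('Warranty','Cycles')`, `('Warranty','Mileage')` and `('Energy','kWhtotal')` columns after dropping incomplete rows, and the fit is only meaningful with enough rows that vary in all three quantities.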

In [5]:
df.loc[:,[('Warranty','Cycles'),('Warranty','Mileage'),('Energy','kWhtotal')]].dropna(axis=0)

Unnamed: 0_level_0,Warranty,Warranty,Energy
Unnamed: 0_level_1,Cycles,Mileage,kWhtotal
884,10000.0,10.0,11.0
885,10000.0,10.0,22.0
886,10000.0,10.0,5.5


Only three rows have cycles, mileage and capacity all populated, so there is not enough data (yet) to fit $a$ and $b$. 