# Battery Charging Analysis for EV

The main goal of this notebook is to analyze battery behavior and how it affects performance over time.

Focus will be on the following aspects:
1. Battery capacity degradation
2. Battery life cycle analysis
3. Environmental impact on battery performance (temperature, humidity, etc.)
4. Range of EV based on battery performance (if this can be determined from the data)

## Import the libraries needed for the analysis

    1. pandas: data manipulation.
    2. NumPy: numerical operations in Python, especially on arrays and matrices.
    3. Seaborn: a statistical visualization library built on top of Matplotlib.
    4. Matplotlib: a plotting library for static visualizations in Python.
    5. Plotly: interactive visualizations in Python.
    6. os: a standard-library module for interacting with the operating system; it gives access to environment variables and the file system in a platform-independent way.

In [4]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px

import os


* Next, load the data, building the file path with os.path.join so that it stays portable across environments.

In [11]:
# os.environ["PROJECT_DIR"] = 'something_path'  # Change this path to your project directory.

file_path = os.path.join('..', 'data', 'raw', 'EV-Population', 'Electric_Vehicle_Population_Data.csv')
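The commented-out line above hints at a PROJECT_DIR environment variable, but the path itself is built relative to the notebook. A minimal sketch of one way to combine the two, assuming PROJECT_DIR (when set) points at the project root; `resolve_data_path` is a hypothetical helper, not part of the original notebook:

```python
import os

def resolve_data_path(default_root=".."):
    """Build the CSV path from PROJECT_DIR if it is set, else from default_root.

    Hypothetical helper: the directory layout below simply mirrors the
    relative path used in the cell above.
    """
    root = os.environ.get("PROJECT_DIR", default_root)
    return os.path.join(
        root, "data", "raw", "EV-Population",
        "Electric_Vehicle_Population_Data.csv",
    )

# With PROJECT_DIR unset this gives the same relative path as above.
file_path = resolve_data_path()
```

Using os.environ.get with a default keeps the notebook runnable on machines where the variable has not been configured.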

In [13]:
ev_population_df = pd.read_csv(file_path)

### 1.0 Basic information about the dataset

In [15]:
ev_population_df.head()

Unnamed: 0,VIN (1-10),County,City,State,Postal Code,Model Year,Make,Model,Electric Vehicle Type,Clean Alternative Fuel Vehicle (CAFV) Eligibility,Electric Range,Base MSRP,Legislative District,DOL Vehicle ID,Vehicle Location,Electric Utility,2020 Census Tract
0,5YJYGDEE1L,King,Seattle,WA,98122.0,2020,TESLA,MODEL Y,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,291,0,37.0,125701579,POINT (-122.30839 47.610365),CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA),53033010000.0
1,7SAYGDEE9P,Snohomish,Bothell,WA,98021.0,2023,TESLA,MODEL Y,Battery Electric Vehicle (BEV),Eligibility unknown as battery range has not b...,0,0,1.0,244285107,POINT (-122.179458 47.802589),PUGET SOUND ENERGY INC,53061050000.0
2,5YJSA1E4XK,King,Seattle,WA,98109.0,2019,TESLA,MODEL S,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,270,0,36.0,156773144,POINT (-122.34848 47.632405),CITY OF SEATTLE - (WA)|CITY OF TACOMA - (WA),53033010000.0
3,5YJSA1E27G,King,Issaquah,WA,98027.0,2016,TESLA,MODEL S,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,210,0,5.0,165103011,POINT (-122.03646 47.534065),PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA),53033030000.0
4,5YJYGDEE5M,Kitsap,Suquamish,WA,98392.0,2021,TESLA,MODEL Y,Battery Electric Vehicle (BEV),Eligibility unknown as battery range has not b...,0,0,23.0,205138552,POINT (-122.55717 47.733415),PUGET SOUND ENERGY INC,53035940000.0


* Next, check the shape of the DataFrame.

In [17]:
ev_population_df.shape

(177866, 17)

* Let's check for missing values and the data type of each column (general info).

In [18]:
ev_population_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 177866 entries, 0 to 177865
Data columns (total 17 columns):
 #   Column                                             Non-Null Count   Dtype  
---  ------                                             --------------   -----  
 0   VIN (1-10)                                         177866 non-null  object 
 1   County                                             177861 non-null  object 
 2   City                                               177861 non-null  object 
 3   State                                              177866 non-null  object 
 4   Postal Code                                        177861 non-null  float64
 5   Model Year                                         177866 non-null  int64  
 6   Make                                               177866 non-null  object 
 7   Model                                              177866 non-null  object 
 8   Electric Vehicle Type                              177866 non-null  object

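* As can be seen, the dataset mixes data types (object, float64, int64) and has some missing values; for example, County, City, and Postal Code each have 177861 non-null entries out of 177866 rows.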
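Reading missing values off the info() output means subtracting non-null counts from the row count by hand. A short sketch of counting them directly with isna().sum(); the small `toy_df` below is made up for illustration and only stands in for `ev_population_df`:

```python
import numpy as np
import pandas as pd

# Toy frame standing in for ev_population_df (values are illustrative only).
toy_df = pd.DataFrame({
    "County": ["King", None, "Kitsap"],
    "Postal Code": [98122.0, 98021.0, np.nan],
    "Model Year": [2020, 2023, 2021],
})

# Number of missing entries per column.
missing_counts = toy_df.isna().sum()

# Keep only the columns that actually have gaps, largest first.
missing_summary = missing_counts[missing_counts > 0].sort_values(ascending=False)
print(missing_summary)

# Data type of each column, as a compact alternative to info().
print(toy_df.dtypes)
```

On the real data the same two lines would be applied to `ev_population_df` instead of `toy_df`.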


* Finally, get summary statistics for the numeric columns of the dataset.

In [20]:
ev_population_df.describe()

Unnamed: 0,Postal Code,Model Year,Electric Range,Base MSRP,Legislative District,DOL Vehicle ID,2020 Census Tract
count,177861.0,177866.0,177866.0,177866.0,177477.0,177866.0,177861.0
mean,98172.453506,2020.515512,58.842162,1073.109363,29.127481,220231300.0,52976720000.0
std,2442.450668,2.989384,91.981298,8358.624956,14.892169,75849870.0,1578047000.0
min,1545.0,1997.0,0.0,0.0,1.0,4385.0,1001020000.0
25%,98052.0,2019.0,0.0,0.0,18.0,181474300.0,53033010000.0
50%,98122.0,2022.0,0.0,0.0,33.0,228252200.0,53033030000.0
75%,98370.0,2023.0,75.0,0.0,42.0,254844500.0,53053070000.0
max,99577.0,2024.0,337.0,845000.0,49.0,479254800.0,56033000000.0
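The table above shows a median Electric Range of 0, and the head() output pairs a range of 0 with "Eligibility unknown as battery range has not b...". A hedged sketch of how the share of zero entries could be measured, under the assumption (not confirmed by the data dictionary here) that 0 marks an unrecorded range rather than a real value; `toy_df` reuses the five Electric Range values from the head() output:

```python
import pandas as pd

# The five Electric Range values shown in the head() output above.
toy_df = pd.DataFrame({"Electric Range": [291, 0, 270, 210, 0]})

# Fraction of rows where the range is recorded as 0.
zero_share = (toy_df["Electric Range"] == 0).mean()
print(f"Share of zero-range rows: {zero_share:.0%}")

# Assumption: zeros mean "not recorded", so summarize only positive ranges.
known_range = toy_df.loc[toy_df["Electric Range"] > 0, "Electric Range"]
print(known_range.describe())
```

If the assumption holds, statistics computed on `known_range` would be a better basis for the range analysis than the raw column.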
