# <h> Exploratory Data Analysis for Electric Vehicle Range, Efficiency, and Price </h>

This project uses a dataset covering the brand, range, efficiency, body style, and price of 103 electric vehicles from 33 top car brands.

## <b> 1. Import and Clean Data </b>

This section imports the Python libraries and the dataset into a Jupyter notebook, then checks the data for duplicates, missing values, and other potential issues.

In [2]:
# Import Libraries
import pandas as pd 
from matplotlib import pyplot as plt

In [3]:
# Read in dataset
ev = pd.read_csv('/Users/kellyshreeve/Desktop/Data-Sets/ElectricCarData_Clean.csv')

In [4]:
# Print dataset info
ev.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 103 entries, 0 to 102
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   Brand            103 non-null    object 
 1   Model            103 non-null    object 
 2   AccelSec         103 non-null    float64
 3   TopSpeed_KmH     103 non-null    int64  
 4   Range_Km         103 non-null    int64  
 5   Efficiency_WhKm  103 non-null    int64  
 6   FastCharge_KmH   103 non-null    object 
 7   RapidCharge      103 non-null    object 
 8   PowerTrain       103 non-null    object 
 9   PlugType         103 non-null    object 
 10  BodyStyle        103 non-null    object 
 11  Segment          103 non-null    object 
 12  Seats            103 non-null    int64  
 13  PriceEuro        103 non-null    int64  
dtypes: float64(1), int64(5), object(8)
memory usage: 11.4+ KB
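The `info()` output above shows 103 non-null entries in every column. A duplicate check is not shown in this notebook, so the following is only a minimal sketch of how duplicates and missing values could be counted. The `toy` frame and its rows are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Illustrative stand-in for the EV table (made-up rows, not the real data)
toy = pd.DataFrame({
    'Brand': ['Tesla', 'Honda', 'Honda'],
    'Model': ['Model 3', 'e', 'e'],
    'Range_Km': [450, 170, 170],
})

# Count rows that exactly repeat an earlier row
n_duplicates = int(toy.duplicated().sum())

# Count missing values in each column
missing_per_column = toy.isna().sum()

print(n_duplicates)
print(missing_per_column)
```

On the real data, the same two calls would be applied to `ev` instead of `toy`.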


In [7]:
# Print the first 10 rows of the dataset
ev.head(10)

| | Brand | Model | AccelSec | TopSpeed_KmH | Range_Km | Efficiency_WhKm | FastCharge_KmH | RapidCharge | PowerTrain | PlugType | BodyStyle | Segment | Seats | PriceEuro |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Tesla | Model 3 Long Range Dual Motor | 4.6 | 233 | 450 | 161 | 940 | Yes | AWD | Type 2 CCS | Sedan | D | 5 | 55480 |
| 1 | Volkswagen | ID.3 Pure | 10.0 | 160 | 270 | 167 | 250 | Yes | RWD | Type 2 CCS | Hatchback | C | 5 | 30000 |
| 2 | Polestar | 2 | 4.7 | 210 | 400 | 181 | 620 | Yes | AWD | Type 2 CCS | Liftback | D | 5 | 56440 |
| 3 | BMW | iX3 | 6.8 | 180 | 360 | 206 | 560 | Yes | RWD | Type 2 CCS | SUV | D | 5 | 68040 |
| 4 | Honda | e | 9.5 | 145 | 170 | 168 | 190 | Yes | RWD | Type 2 CCS | Hatchback | B | 4 | 32997 |
| 5 | Lucid | Air | 2.8 | 250 | 610 | 180 | 620 | Yes | AWD | Type 2 CCS | Sedan | F | 5 | 105000 |
| 6 | Volkswagen | e-Golf | 9.6 | 150 | 190 | 168 | 220 | Yes | FWD | Type 2 CCS | Hatchback | C | 5 | 31900 |
| 7 | Peugeot | e-208 | 8.1 | 150 | 275 | 164 | 420 | Yes | FWD | Type 2 CCS | Hatchback | B | 5 | 29682 |
| 8 | Tesla | Model 3 Standard Range Plus | 5.6 | 225 | 310 | 153 | 650 | Yes | RWD | Type 2 CCS | Sedan | D | 5 | 46380 |
| 9 | Audi | Q4 e-tron | 6.3 | 180 | 400 | 193 | 540 | Yes | AWD | Type 2 CCS | SUV | D | 5 | 55000 |


In [14]:
# Rename columns with snake case

ev = ev.rename(
    columns={'Brand':'brand',
             'Model':'model',
             'AccelSec':'accel_sec',
             'TopSpeed_KmH':'top_speed_kmh',
             'Range_Km':'range_km',
             'Efficiency_WhKm':'efficiency_whkm',
             'FastCharge_KmH':'fast_charge_kmh',
             'RapidCharge':'rapid_charge',
             'PowerTrain':'power_train',
             'PlugType':'plug_type',
             'BodyStyle':'body_style',
             'Segment':'segment',
             'Seats':'seats',
             'PriceEuro':'price_euro'
             }
)

In [22]:
# Check columns are now all snake case
print(ev.columns)

Index(['brand', 'model', 'accel_sec', 'top_speed_kmh', 'range_km',
       'efficiency_whkm', 'fast_charge_kmh', 'rapid_charge', 'power_train',
       'plug_type', 'body_style', 'segment', 'seats', 'price_euro'],
      dtype='object')
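The explicit mapping above gives full control over each name. As an alternative, the same names could be derived programmatically. The sketch below is one possible approach, not the method used in this notebook: `to_snake_case` is a hypothetical helper that splits camel case in the part before the first underscore and lowercases the unit suffix after it.

```python
import re

# Column names as listed in the ev.info() output
original_columns = [
    'Brand', 'Model', 'AccelSec', 'TopSpeed_KmH', 'Range_Km',
    'Efficiency_WhKm', 'FastCharge_KmH', 'RapidCharge', 'PowerTrain',
    'PlugType', 'BodyStyle', 'Segment', 'Seats', 'PriceEuro',
]

def to_snake_case(name):
    """Hypothetical helper: 'TopSpeed_KmH' -> 'top_speed_kmh'."""
    # Split off the unit suffix (if any) at the first underscore
    head, sep, unit = name.partition('_')
    # Insert '_' at each lowercase/digit-to-uppercase boundary, then lowercase
    head = re.sub(r'(?<=[a-z0-9])(?=[A-Z])', '_', head).lower()
    # Lowercase the unit suffix without splitting it further
    return head + sep + unit.lower()

snake_columns = [to_snake_case(c) for c in original_columns]
print(snake_columns)
```

Because `DataFrame.rename` also accepts a function for `columns`, a helper like this could be passed directly as `ev.rename(columns=to_snake_case)`.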


In [23]:
# Change fast_charge_kmh data type from object to int

# First check the unique values of fast_charge_kmh
print(ev['fast_charge_kmh'].unique())

['940' '250' '620' '560' '190' '220' '420' '650' '540' '440' '230' '380'
 '210' '590' '780' '170' '260' '930' '850' '910' '490' '470' '270' '450'
 '350' '710' '240' '390' '570' '610' '340' '730' '920' '-' '550' '900'
 '520' '430' '890' '410' '770' '460' '360' '810' '480' '290' '330' '740'
 '510' '320' '500']
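Scanning the unique values by eye works for a short list like the one above. For longer columns, the non-numeric entries can be located directly. This is a minimal sketch on a toy Series with illustrative values, assuming every valid entry is a string of digits.

```python
import pandas as pd

# Toy stand-in for the fast_charge_kmh column (illustrative values only)
fast_charge = pd.Series(['940', '250', '-', '620'], name='fast_charge_kmh')

# True where the string is not made up entirely of digits
non_numeric = ~fast_charge.str.isdigit()

# Show the offending values together with their row positions
print(fast_charge[non_numeric])
```

The index of the filtered Series gives the rows that need cleaning before a numeric conversion.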


In [25]:
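# Change '-' to NaN
# replace() returns a new Series, so the result must be assigned back;
# float('nan') is a true missing value, unlike the string 'NaN'
ev['fast_charge_kmh'] = ev['fast_charge_kmh'].replace('-', float('nan'))

print(ev['fast_charge_kmh'].unique())

['940' '250' '620' '560' '190' '220' '420' '650' '540' '440' '230' '380'
 '210' '590' '780' '170' '260' '930' '850' '910' '490' '470' '270' '450'
 '350' '710' '240' '390' '570' '610' '340' '730' '920' nan '550' '900'
 '520' '430' '890' '410' '770' '460' '360' '810' '480' '290' '330' '740'
 '510' '320' '500']


In [21]:
# Change fast_charge_kmh to numeric type data
# The column now holds a NaN, so the result is float64 rather than int
ev['fast_charge_kmh'] = pd.to_numeric(ev['fast_charge_kmh'])

# Confirm the former '-' entry is now missing
print(ev['fast_charge_kmh'].iloc[57])

nan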
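The stated goal of this step is an integer column, but a plain `int64` column cannot hold a missing value. One option is pandas' nullable `Int64` dtype. The sketch below uses a toy Series with illustrative values and is an assumption about how the conversion could be finished, not a step taken in this notebook.

```python
import pandas as pd

# Toy stand-in for the fast_charge_kmh column (illustrative values only)
fast_charge = pd.Series(['940', '250', '-', '620'])

# errors='coerce' turns unparseable strings such as '-' into NaN
as_float = pd.to_numeric(fast_charge, errors='coerce')

# Nullable integer dtype keeps whole numbers and marks the gap as <NA>
as_int = as_float.astype('Int64')

print(as_int.dtype)
print(as_int.isna().sum())
```

Keeping the column as `float64` is also a reasonable choice if later plotting or statistics code expects floats.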