# **EV Energy Consumption Prediction**

## **1. Define the Problem**

Electric vehicles (EVs) are key to sustainable transportation, but predicting their energy consumption remains a challenge.
Unlike that of fuel-based cars, EV efficiency depends on factors such as driving speed, distance, road type, weather, and temperature.
This uncertainty often causes “range anxiety” among drivers.
Accurate energy prediction is essential for better route planning, fleet optimization, and charging infrastructure management.
The problem, then, is to develop a model that learns from these variables and predicts EV energy usage reliably.

## Problem Statement

**“Develop a predictive machine learning model to estimate the energy consumption (in kWh) of electric vehicles based on driving behavior, road conditions, and weather data. The model aims to reduce range anxiety for EV drivers, support fleet management, and optimize charging schedules.”**

## **2. Dataset Overview**

#### Dataset Details
* **Dataset Name**: **EV Energy Consumption Dataset**
* **Source**: https://www.kaggle.com/datasets/ziya07/ev-energy-consumption-dataset
* **File Format**: .csv

#### Dataset Description

The EV Energy Consumption Dataset is designed to support research on energy efficiency in New Energy Vehicles (NEVs), particularly Electric Vehicles (EVs). This dataset captures real-time driving behavior, road conditions, weather factors, and vehicle attributes to predict energy consumption in kWh.

#### Key Features

- Driving Behavior: Speed, acceleration, and driving mode (Eco, Normal, Sport).
- Road & Traffic Conditions: Road type (Highway, Urban, Rural), slope percentage, and traffic levels.
- Weather Factors: Temperature, humidity, wind speed, and weather conditions (Sunny, Rainy, Snowy, Foggy).
- Vehicle Attributes: Battery state, voltage, temperature, tire pressure, and vehicle weight.
- Energy Consumption (Target Variable): Estimated in kWh based on influencing factors.

#### Loading the Dataset

In [42]:
import pandas as pd

file_name = "EV_Energy_Consumption_Dataset.csv"
df = pd.read_csv(file_name)

#### Exploring and Understanding Data

In [43]:
df.head()

Unnamed: 0,Vehicle_ID,Timestamp,Speed_kmh,Acceleration_ms2,Battery_State_%,Battery_Voltage_V,Battery_Temperature_C,Driving_Mode,Road_Type,Traffic_Condition,Slope_%,Weather_Condition,Temperature_C,Humidity_%,Wind_Speed_ms,Tire_Pressure_psi,Vehicle_Weight_kg,Distance_Travelled_km,Energy_Consumption_kWh
0,1102,2024-01-01 00:00:00,111.507366,-2.773816,30.415148,378.091525,25.314786,2,1,1,6.879446,4,0.74177,42.172533,7.829253,31.11202,1822.967368,20.757508,12.054317
1,1435,2024-01-01 00:01:00,48.612323,-0.796982,97.385534,392.718377,18.240755,1,2,1,-3.007212,4,-3.495516,57.018427,4.495572,31.504366,2091.831914,0.642918,4.488701
2,1860,2024-01-01 00:02:00,108.73332,0.2538,84.9126,398.993495,44.449145,1,1,3,0.029585,1,9.248275,69.028911,5.144489,33.838015,1816.702497,40.842824,11.701377
3,1270,2024-01-01 00:03:00,38.579484,-2.111395,28.777904,358.128273,28.980155,1,2,2,8.271943,3,2.868409,86.638349,4.518283,33.256014,1283.102642,5.305229,7.389266
4,1106,2024-01-01 00:04:00,57.172438,1.477883,29.74016,310.888162,33.184551,2,1,1,2.776814,2,16.750244,27.189185,4.263406,33.579678,2160.350788,5.825926,6.761205
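
In the preview above, `Driving_Mode`, `Road_Type`, `Traffic_Condition`, and `Weather_Condition` appear as integer codes rather than labels, and the mapping from code to label is not stated here. A linear model would treat those codes as ordered magnitudes, so one option is to one-hot encode them. The snippet below is a minimal sketch on a toy frame built from the first five preview rows; the names `toy` and `encoded` are illustrative and not part of the notebook.

```python
import pandas as pd

# Toy frame: Speed_kmh, Driving_Mode and Road_Type from the first five preview rows.
toy = pd.DataFrame({
    "Speed_kmh": [111.507366, 48.612323, 108.73332, 38.579484, 57.172438],
    "Driving_Mode": [2, 1, 1, 1, 2],
    "Road_Type": [1, 2, 1, 2, 1],
})

# One indicator column per observed code, so the codes are not read as magnitudes.
encoded = pd.get_dummies(toy, columns=["Driving_Mode", "Road_Type"], dtype=int)
print(encoded.columns.tolist())
```

Whether encoding helps on this dataset is an open question; it is worth comparing against the raw integer codes.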


## **3. Data Preprocessing**

In [44]:
# Count missing values in each column
df.isnull().sum()

Vehicle_ID                0
Timestamp                 0
Speed_kmh                 0
Acceleration_ms2          0
Battery_State_%           0
Battery_Voltage_V         0
Battery_Temperature_C     0
Driving_Mode              0
Road_Type                 0
Traffic_Condition         0
Slope_%                   0
Weather_Condition         0
Temperature_C             0
Humidity_%                0
Wind_Speed_ms             0
Tire_Pressure_psi         0
Vehicle_Weight_kg         0
Distance_Travelled_km     0
Energy_Consumption_kWh    0
dtype: int64
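
The null check reports zero missing values in every column. A related check that often goes with it is counting fully duplicated rows. The sketch below shows both on a small made-up frame (the values and the names `toy`, `total_missing`, `duplicate_rows`, and `cleaned` are illustrative, not taken from the dataset).

```python
import numpy as np
import pandas as pd

# Made-up frame with one missing value and one exact duplicate row.
toy = pd.DataFrame({
    "Speed_kmh": [111.5, 48.6, np.nan, 48.6],
    "Energy_Consumption_kWh": [12.05, 4.49, 11.70, 4.49],
})

# Total number of missing cells across all columns.
total_missing = int(toy.isnull().sum().sum())

# Rows that repeat an earlier row exactly.
duplicate_rows = int(toy.duplicated().sum())

# Drop incomplete rows, then drop exact repeats.
cleaned = toy.dropna().drop_duplicates()
print(total_missing, duplicate_rows, len(cleaned))
```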

In [45]:
# Inspect column dtypes and non-null counts
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 19 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Vehicle_ID              5000 non-null   int64  
 1   Timestamp               5000 non-null   object 
 2   Speed_kmh               5000 non-null   float64
 3   Acceleration_ms2        5000 non-null   float64
 4   Battery_State_%         5000 non-null   float64
 5   Battery_Voltage_V       5000 non-null   float64
 6   Battery_Temperature_C   5000 non-null   float64
 7   Driving_Mode            5000 non-null   int64  
 8   Road_Type               5000 non-null   int64  
 9   Traffic_Condition       5000 non-null   int64  
 10  Slope_%                 5000 non-null   float64
 11  Weather_Condition       5000 non-null   int64  
 12  Temperature_C           5000 non-null   float64
 13  Humidity_%              5000 non-null   float64
 14  Wind_Speed_ms           5000 non-null   
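
The `df.info()` output lists `Timestamp` with dtype `object`, meaning the values are stored as strings. If time of day is expected to matter, one possible preprocessing step is to parse the column and derive numeric features from it. This is a sketch on two timestamps copied from the notebook outputs; the `Hour` and `DayOfWeek` columns are hypothetical additions, not columns of the dataset.

```python
import pandas as pd

# Two timestamps as they appear in the notebook outputs.
toy = pd.DataFrame({"Timestamp": ["2024-01-01 00:00:00", "2024-01-03 22:27:00"]})

# Parse the strings into datetime64 values.
toy["Timestamp"] = pd.to_datetime(toy["Timestamp"])

# Hypothetical derived features: hour of day and weekday (Monday = 0).
toy["Hour"] = toy["Timestamp"].dt.hour
toy["DayOfWeek"] = toy["Timestamp"].dt.dayofweek
print(toy)
```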

## **4. Data Splitting**

In [47]:
# Features: every column except the target
x = df.drop('Energy_Consumption_kWh', axis=1)
# Target: energy consumption in kWh
y = df['Energy_Consumption_kWh']
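
Built this way, `x` keeps `Timestamp` (a string column according to `df.info()`) and `Vehicle_ID` (an identifier). scikit-learn's `LinearRegression` expects numeric input, so the string column would need to be dropped or converted before fitting. The sketch below drops both columns on a two-row toy frame; dropping `Vehicle_ID` is an assumption about what is useful, and `x_toy` and `y_toy` are illustrative names.

```python
import pandas as pd

# Two rows from the preview, reduced to four columns for brevity.
toy = pd.DataFrame({
    "Vehicle_ID": [1102, 1435],
    "Timestamp": ["2024-01-01 00:00:00", "2024-01-01 00:01:00"],
    "Speed_kmh": [111.507366, 48.612323],
    "Energy_Consumption_kWh": [12.054317, 4.488701],
})

# Assumption: the identifier and the raw timestamp string are not used as features.
x_toy = toy.drop(columns=["Energy_Consumption_kWh", "Timestamp", "Vehicle_ID"])
y_toy = toy["Energy_Consumption_kWh"]
print(x_toy.dtypes)
```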

In [48]:
x

Unnamed: 0,Vehicle_ID,Timestamp,Speed_kmh,Acceleration_ms2,Battery_State_%,Battery_Voltage_V,Battery_Temperature_C,Driving_Mode,Road_Type,Traffic_Condition,Slope_%,Weather_Condition,Temperature_C,Humidity_%,Wind_Speed_ms,Tire_Pressure_psi,Vehicle_Weight_kg,Distance_Travelled_km
0,1102,2024-01-01 00:00:00,111.507366,-2.773816,30.415148,378.091525,25.314786,2,1,1,6.879446,4,0.741770,42.172533,7.829253,31.112020,1822.967368,20.757508
1,1435,2024-01-01 00:01:00,48.612323,-0.796982,97.385534,392.718377,18.240755,1,2,1,-3.007212,4,-3.495516,57.018427,4.495572,31.504366,2091.831914,0.642918
2,1860,2024-01-01 00:02:00,108.733320,0.253800,84.912600,398.993495,44.449145,1,1,3,0.029585,1,9.248275,69.028911,5.144489,33.838015,1816.702497,40.842824
3,1270,2024-01-01 00:03:00,38.579484,-2.111395,28.777904,358.128273,28.980155,1,2,2,8.271943,3,2.868409,86.638349,4.518283,33.256014,1283.102642,5.305229
4,1106,2024-01-01 00:04:00,57.172438,1.477883,29.740160,310.888162,33.184551,2,1,1,2.776814,2,16.750244,27.189185,4.263406,33.579678,2160.350788,5.825926
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4995,1289,2024-01-04 11:15:00,60.036619,-0.685606,78.360306,355.284001,23.338767,2,1,2,0.273436,1,2.678321,73.356245,13.936821,28.327849,1796.093041,47.102202
4996,1294,2024-01-04 11:16:00,107.291944,1.674625,68.163963,375.453798,19.056652,2,2,2,9.793821,3,37.105374,78.446046,12.197087,29.155964,2095.099662,7.538642
4997,1450,2024-01-04 11:17:00,110.384512,1.348736,72.918142,303.834905,28.155775,1,3,2,-0.429536,1,-2.955831,78.251840,1.595182,34.369795,2328.330539,5.644158
4998,1903,2024-01-04 11:18:00,6.618791,-2.642843,91.904887,302.708365,31.781034,2,1,2,-3.366394,3,16.154754,74.929797,13.503355,33.363985,1274.506744,32.768008


In [49]:
y

0       12.054317
1        4.488701
2       11.701377
3        7.389266
4        6.761205
          ...    
4995    10.089577
4996     9.600692
4997     9.148392
4998     5.581431
4999     7.841474
Name: Energy_Consumption_kWh, Length: 5000, dtype: float64

In [52]:
from sklearn.model_selection import train_test_split

In [53]:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
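
With `test_size=0.2` and 5000 rows, the split should leave 4000 rows for training and 1000 for testing, and `random_state=42` makes the shuffle reproducible. The sketch below checks this on a synthetic frame of the same length; the data is random and the `*_demo` names are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same number of rows as the dataset.
rng = np.random.default_rng(0)
x_demo = pd.DataFrame({"Speed_kmh": rng.uniform(0, 120, size=5000)})
y_demo = pd.Series(rng.uniform(0, 15, size=5000), name="Energy_Consumption_kWh")

x_tr_demo, x_te_demo, y_tr_demo, y_te_demo = train_test_split(
    x_demo, y_demo, test_size=0.2, random_state=42
)

# Features and target stay aligned: both keep the original row index.
print(x_tr_demo.shape, x_te_demo.shape)
print(x_tr_demo.index.equals(y_tr_demo.index))
```

Because the rows are shuffled before splitting, the training index is not in chronological order, which matches the `x_train` output in the notebook.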

In [54]:
x_train

Unnamed: 0,Vehicle_ID,Timestamp,Speed_kmh,Acceleration_ms2,Battery_State_%,Battery_Voltage_V,Battery_Temperature_C,Driving_Mode,Road_Type,Traffic_Condition,Slope_%,Weather_Condition,Temperature_C,Humidity_%,Wind_Speed_ms,Tire_Pressure_psi,Vehicle_Weight_kg,Distance_Travelled_km
4227,1155,2024-01-03 22:27:00,115.393373,-1.649758,31.473489,303.163562,26.072395,1,3,1,6.642880,4,-3.276264,82.450244,11.831798,33.410697,2449.149862,38.168939
4676,1394,2024-01-04 05:56:00,5.647344,-2.319435,68.487147,308.503245,31.170747,2,2,1,-3.079258,4,38.427152,64.697475,13.297315,34.945110,1876.381381,43.753732
800,1505,2024-01-01 13:20:00,83.356915,-2.321402,58.118657,383.985800,31.419330,2,3,3,7.311235,4,26.162694,39.186261,3.471165,34.246790,1561.496006,10.038422
3671,1110,2024-01-03 13:11:00,110.375109,-1.614202,60.081646,383.524695,19.806092,2,3,2,3.664627,4,19.759301,59.757948,9.424877,31.582317,2337.483800,33.198900
4193,1189,2024-01-03 21:53:00,19.687631,0.391734,34.169325,352.620868,40.478184,2,3,2,-3.207713,1,4.374144,21.128248,7.521935,28.743683,2355.442037,27.183454
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4426,1983,2024-01-04 01:46:00,69.568454,-2.972133,45.366593,345.813836,21.502059,3,2,2,1.200163,1,-4.977621,79.865534,11.396680,29.600289,2214.750488,19.686173
466,1219,2024-01-01 07:46:00,33.158293,-2.406188,73.915556,395.032454,36.250777,2,1,3,2.767304,2,28.867363,83.479937,5.014739,30.337503,1840.316066,22.257895
3092,1553,2024-01-03 03:32:00,12.678041,-2.712195,28.769999,353.809694,13.624353,1,3,3,-0.944594,2,28.987705,76.171822,8.478853,32.840751,1528.665909,45.589327
3772,1914,2024-01-03 14:52:00,77.428517,-1.980680,78.541762,363.253377,42.624789,1,1,2,9.841818,1,3.253857,35.863015,13.907661,31.484743,1464.575034,46.518093


## **5. Model Training**

In [55]:
from sklearn.linear_model import LinearRegression

In [56]:
model = LinearRegression()
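
Once the model object exists, the usual next steps are fitting on the training split and scoring on the held-out split. The following is a minimal sketch on synthetic, all-numeric data; the coefficients used to generate `y_syn` are made up for illustration and say nothing about the real dataset, and the `*_syn` names and `model_demo` are not part of the notebook.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic features: speed (km/h) and distance (km).
rng = np.random.default_rng(0)
speed = rng.uniform(0, 120, size=500)
distance = rng.uniform(0, 50, size=500)
X_syn = np.column_stack([speed, distance])

# Made-up linear relationship plus Gaussian noise.
y_syn = 2.0 + 0.05 * speed + 0.1 * distance + rng.normal(0, 0.5, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.2, random_state=42)

model_demo = LinearRegression()
model_demo.fit(X_tr, y_tr)       # learn intercept and coefficients
pred = model_demo.predict(X_te)  # predictions for the held-out rows

mae = mean_absolute_error(y_te, pred)
r2 = r2_score(y_te, pred)
print(f"MAE: {mae:.3f}  R^2: {r2:.3f}")
```

MAE is reported in the same unit as the target (kWh in the notebook's case), which makes it easy to relate to range estimates.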