# Battery Remaining Useful Life (RUL) Prediction
This notebook predicts the Remaining Useful Life (RUL) of batteries with the K-Nearest Neighbors (KNN) regression algorithm. Each row of the dataset describes one charge/discharge cycle through timing and voltage features (for example discharge time, charging time, and maximum discharge voltage), with the number of remaining cycles as the target.

In [1]:
import os

# Install git if not already installed
!apt-get install -y git

# Clone the repository
!git clone https://github.com/Elite-AI-Club/AI-Driven-Innovation-Electronics.git

# Change working directory to the repository
%cd AI-Driven-Innovation-Electronics/6_Battery_RUL

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
git is already the newest version (1:2.34.1-1ubuntu1.11).
0 upgraded, 0 newly installed, 0 to remove and 49 not upgraded.
Cloning into 'AI-Driven-Innovation-Electronics'...
remote: Enumerating objects: 100, done.
remote: Counting objects: 100% (100/100), done.
remote: Compressing objects: 100% (78/78), done.
remote: Total 100 (delta 29), reused 85 (delta 17), pack-reused 0 (from 0)
Receiving objects: 100% (100/100), 20.64 MiB | 23.07 MiB/s, done.
Resolving deltas: 100% (29/29), done.
/content/AI-Driven-Innovation-Electronics/6_Battery_RUL


In [17]:
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
import numpy as np
import seaborn as sns


## Load and Prepare the Data
First, load the dataset and prepare it for the model. This involves selecting relevant features and the target variable.

In [18]:
pwd

'/content/AI-Driven-Innovation-Electronics/6_Battery_RUL'

In [5]:
data = pd.read_csv('Battery_RUL.csv')
data

index,Cycle_Index,Discharge Time (s),Decrement 3.6-3.4V (s),Max. Voltage Dischar. (V),Min. Voltage Charg. (V),Time at 4.15V (s),Time constant current (s),Charging time (s),RUL
0,1.0,2595.30,1151.488500,3.670,3.211,5460.001,6755.01,10777.82,1112
1,2.0,7408.64,1172.512500,4.246,3.220,5508.992,6762.02,10500.35,1111
2,3.0,7393.76,1112.992000,4.249,3.224,5508.993,6762.02,10420.38,1110
3,4.0,7385.50,1080.320667,4.250,3.225,5502.016,6762.02,10322.81,1109
4,6.0,65022.75,29813.487000,4.290,3.398,5480.992,53213.54,56699.65,1107
...,...,...,...,...,...,...,...,...,...
15059,1108.0,770.44,179.523810,3.773,3.742,922.775,1412.38,6678.88,4
15060,1109.0,771.12,179.523810,3.773,3.744,915.512,1412.31,6670.38,3
15061,1110.0,769.12,179.357143,3.773,3.742,915.513,1412.31,6637.12,2
15062,1111.0,773.88,162.374667,3.763,3.839,539.375,1148.00,7660.62,1


In [6]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15064 entries, 0 to 15063
Data columns (total 9 columns):
 #   Column                     Non-Null Count  Dtype  
---  ------                     --------------  -----  
 0   Cycle_Index                15064 non-null  float64
 1   Discharge Time (s)         15064 non-null  float64
 2   Decrement 3.6-3.4V (s)     15064 non-null  float64
 3   Max. Voltage Dischar. (V)  15064 non-null  float64
 4   Min. Voltage Charg. (V)    15064 non-null  float64
 5   Time at 4.15V (s)          15064 non-null  float64
 6   Time constant current (s)  15064 non-null  float64
 7   Charging time (s)          15064 non-null  float64
 8   RUL                        15064 non-null  int64  
dtypes: float64(8), int64(1)
memory usage: 1.0 MB


In [7]:
data.describe()

statistic,Cycle_Index,Discharge Time (s),Decrement 3.6-3.4V (s),Max. Voltage Dischar. (V),Min. Voltage Charg. (V),Time at 4.15V (s),Time constant current (s),Charging time (s),RUL
count,15064.0,15064.0,15064.0,15064.0,15064.0,15064.0,15064.0,15064.0,15064.0
mean,556.155005,4581.27396,1239.784672,3.908176,3.577904,3768.336171,5461.26697,10066.496204,554.194172
std,322.37848,33144.012077,15039.589269,0.091003,0.123695,9129.552477,25155.845202,26415.354121,322.434514
min,1.0,8.69,-397645.908,3.043,3.022,-113.584,5.98,5.98,0.0
25%,271.0,1169.31,319.6,3.846,3.488,1828.884179,2564.31,7841.9225,277.0
50%,560.0,1557.25,439.239471,3.906,3.574,2930.2035,3824.26,8320.415,551.0
75%,833.0,1908.0,600.0,3.972,3.663,4088.3265,5012.35,8763.2825,839.0
max,1134.0,958320.37,406703.768,4.363,4.379,245101.117,880728.1,880728.1,1133.0
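The summary above shows negative minimums for `Decrement 3.6-3.4V (s)` and `Time at 4.15V (s)`, and maximums far above the 75th percentile in several duration columns. Such values are implausible for durations and may be worth inspecting before modelling. The following is a minimal sketch of one way to flag them; the helper name `flag_implausible_rows`, the `iqr_factor` threshold, and the toy values are assumptions for illustration and are not part of the original notebook.

```python
import pandas as pd

def flag_implausible_rows(df, duration_cols, iqr_factor=10.0):
    """Return a boolean mask marking rows with negative or extreme durations.

    Hypothetical helper: a value counts as extreme when it exceeds
    Q3 + iqr_factor * IQR for its column.
    """
    mask = pd.Series(False, index=df.index)
    for col in duration_cols:
        q1, q3 = df[col].quantile([0.25, 0.75])
        upper = q3 + iqr_factor * (q3 - q1)
        mask |= (df[col] < 0) | (df[col] > upper)
    return mask

# Toy frame (made-up rows) reusing two column names from the dataset.
toy = pd.DataFrame({
    "Discharge Time (s)": [1500.0, 1600.0, 1550.0, 1580.0, 958320.37],
    "Time at 4.15V (s)": [2900.0, -113.584, 3000.0, 2950.0, 3100.0],
})

bad = flag_implausible_rows(toy, list(toy.columns))
print(toy[bad])  # rows that would need a closer look
```

Whether flagged rows should be dropped, clipped, or kept is a judgement call that depends on how the measurements were recorded.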


## Extract the features and the target variable


In [8]:
X = data.iloc[:, 1:-1]
y = data['RUL']
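The positional slice `iloc[:, 1:-1]` keeps every column except the first (`Cycle_Index`) and the last (`RUL`). In the rows shown above, `Cycle_Index` and `RUL` appear to sum to a nearly constant value, so leaving the cycle counter out avoids handing the model the target almost directly. As a sketch, the same selection can be written by column name, which keeps working if the column order changes; the two rows below are copied from the table above and `feature_cols` is an illustrative name.

```python
import pandas as pd

# First two rows of the dataset as displayed above.
sample = pd.DataFrame({
    "Cycle_Index": [1.0, 2.0],
    "Discharge Time (s)": [2595.30, 7408.64],
    "Decrement 3.6-3.4V (s)": [1151.4885, 1172.5125],
    "Max. Voltage Dischar. (V)": [3.670, 4.246],
    "Min. Voltage Charg. (V)": [3.211, 3.220],
    "Time at 4.15V (s)": [5460.001, 5508.992],
    "Time constant current (s)": [6755.01, 6762.02],
    "Charging time (s)": [10777.82, 10500.35],
    "RUL": [1112, 1111],
})

# Select features by name instead of by position.
feature_cols = [c for c in sample.columns if c not in ("Cycle_Index", "RUL")]
X_named = sample[feature_cols]
y_named = sample["RUL"]

# The positional slice used in the notebook gives the same frame.
X_positional = sample.iloc[:, 1:-1]
print(X_named.equals(X_positional))
```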

In [9]:
X

index,Discharge Time (s),Decrement 3.6-3.4V (s),Max. Voltage Dischar. (V),Min. Voltage Charg. (V),Time at 4.15V (s),Time constant current (s),Charging time (s)
0,2595.30,1151.488500,3.670,3.211,5460.001,6755.01,10777.82
1,7408.64,1172.512500,4.246,3.220,5508.992,6762.02,10500.35
2,7393.76,1112.992000,4.249,3.224,5508.993,6762.02,10420.38
3,7385.50,1080.320667,4.250,3.225,5502.016,6762.02,10322.81
4,65022.75,29813.487000,4.290,3.398,5480.992,53213.54,56699.65
...,...,...,...,...,...,...,...
15059,770.44,179.523810,3.773,3.742,922.775,1412.38,6678.88
15060,771.12,179.523810,3.773,3.744,915.512,1412.31,6670.38
15061,769.12,179.357143,3.773,3.742,915.513,1412.31,6637.12
15062,773.88,162.374667,3.763,3.839,539.375,1148.00,7660.62


In [10]:
y

index,RUL
0,1112
1,1111
2,1110
3,1109
4,1107
...,...
15059,4
15060,3
15061,2
15062,1


## Split the Data
Hold out 20% of the rows as a test set so the model is evaluated on data it did not see during training. Fixing `random_state` makes the split reproducible.

In [11]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
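`train_test_split` shuffles individual rows, so neighbouring cycles of the same battery can end up on both sides of the split, and a nearest-neighbour model may then score well simply by finding an adjacent cycle. A stricter alternative is to split by battery. The sketch below uses scikit-learn's `GroupShuffleSplit` on made-up data; it assumes that a drop in `Cycle_Index` marks the start of a new battery, which the original notebook does not state, and `battery_id` is a hypothetical derived column.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Toy data: three hypothetical batteries with four cycles each (made-up values).
toy = pd.DataFrame({
    "Cycle_Index": [1, 2, 3, 4] * 3,
    "Discharge Time (s)": np.linspace(2000.0, 800.0, 12),
    "RUL": [3, 2, 1, 0] * 3,
})

# Assumption: Cycle_Index restarts whenever a new battery begins.
battery_id = (toy["Cycle_Index"].diff() < 0).cumsum()

# Hold out one whole battery (test_size as an int counts groups).
splitter = GroupShuffleSplit(n_splits=1, test_size=1, random_state=42)
train_idx, test_idx = next(
    splitter.split(toy[["Discharge Time (s)"]], toy["RUL"], groups=battery_id)
)

train_batteries = set(battery_id.iloc[train_idx])
test_batteries = set(battery_id.iloc[test_idx])
print("train batteries:", train_batteries)
print("test batteries:", test_batteries)
```

With this kind of split the test score reflects performance on a battery the model has never seen, which is usually lower than the score from a row-level shuffle.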

## Standardize the Features
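KNN predicts from distances between feature vectors, so features with large numeric ranges (here, times in seconds) would otherwise dominate features with small ranges (voltages). Standardize each feature to zero mean and unit variance, fitting the scaler on the training set only and reusing it for the test set.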

In [12]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
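To illustrate why scaling matters for a distance-based model, the sketch below compares KNN with and without standardization on synthetic data (not the battery dataset). The target depends on a small-range, voltage-like feature, while a large-range, time-like feature is pure noise. It also wraps the scaler and regressor in a scikit-learn pipeline via `make_pipeline`, one way to ensure the scaler is always fitted on training data only. All variable names and values here are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic features on very different scales.
voltage = rng.normal(3.9, 0.1, size=400)        # informative, small range
seconds = rng.normal(5000.0, 2000.0, size=400)  # noise, large range
X_toy = np.column_stack([voltage, seconds])
y_toy = 1000.0 * (voltage - 3.9) + rng.normal(0.0, 5.0, size=400)

X_tr, X_te = X_toy[:300], X_toy[300:]
y_tr, y_te = y_toy[:300], y_toy[300:]

# KNN on raw features: distances are dominated by the seconds column.
raw_knn = KNeighborsRegressor(n_neighbors=5).fit(X_tr, y_tr)

# KNN behind a scaler: both columns contribute comparably to distances.
scaled_knn = make_pipeline(
    StandardScaler(), KNeighborsRegressor(n_neighbors=5)
).fit(X_tr, y_tr)

r2_raw = raw_knn.score(X_te, y_te)
r2_scaled = scaled_knn.score(X_te, y_te)
print(f"R^2 without scaling: {r2_raw:.3f}")
print(f"R^2 with scaling:    {r2_scaled:.3f}")
```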

## Train the KNN Model
Fit a KNN regressor with `n_neighbors=5` on the scaled training data. Each prediction is the mean RUL of the five nearest training cycles.

In [13]:
knn = KNeighborsRegressor(n_neighbors=5)
knn.fit(X_train_scaled, y_train)
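The value `n_neighbors=5` is scikit-learn's default rather than a tuned choice. One possible way to pick it is cross-validation over a few candidates, sketched below with `GridSearchCV` on synthetic data. The candidate list, the data, and the variable names are assumptions for illustration; on the real dataset the search would be run on the training split only.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic regression data standing in for the scaled battery features.
X_toy = rng.normal(size=(300, 3))
y_toy = X_toy @ np.array([50.0, -20.0, 10.0]) + rng.normal(0.0, 5.0, size=300)

# Scaling lives inside the pipeline so each CV fold fits its own scaler.
pipe = make_pipeline(StandardScaler(), KNeighborsRegressor())
candidate_k = [1, 3, 5, 9, 15]

search = GridSearchCV(
    pipe,
    param_grid={"kneighborsregressor__n_neighbors": candidate_k},
    cv=5,
    scoring="neg_root_mean_squared_error",  # higher (closer to 0) is better
)
search.fit(X_toy, y_toy)

best_k = search.best_params_["kneighborsregressor__n_neighbors"]
print(f"best k: {best_k}, CV RMSE: {-search.best_score_:.2f}")
```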

## Evaluate the Model
Evaluate the model using Root Mean Squared Error (RMSE) on the test set.

In [14]:
y_pred = knn.predict(X_test_scaled)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
print(f'Root Mean Squared Error: {rmse}')

Root Mean Squared Error: 37.60927373613025
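RMSE is expressed in the same unit as the target, so the value above means predictions are off by roughly 38 cycles in the root-mean-square sense, on a target that ranges from 0 to 1133. Because RMSE weights large errors heavily, it can help to report mean absolute error (MAE) alongside it. The sketch below computes both on a small made-up example; the numbers are illustrative and unrelated to the notebook's results.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Made-up true and predicted RUL values (in cycles).
y_true_toy = np.array([100.0, 200.0, 300.0, 400.0])
y_pred_toy = np.array([110.0, 190.0, 320.0, 380.0])

# RMSE: square root of the mean squared error.
rmse_toy = np.sqrt(mean_squared_error(y_true_toy, y_pred_toy))

# MAE: mean of the absolute errors, less sensitive to a few large misses.
mae_toy = mean_absolute_error(y_true_toy, y_pred_toy)

print(f"RMSE: {rmse_toy:.3f} cycles")
print(f"MAE:  {mae_toy:.3f} cycles")
```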


In [15]:
# R² (coefficient of determination) on the training data
print(f'Train R²: {knn.score(X_train_scaled, y_train)}')

Train R²: 0.9933708187909873


In [16]:
# R² (coefficient of determination) on the test data
print(f'Test R²: {knn.score(X_test_scaled, y_test)}')

Test R²: 0.9863450390972913
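For scikit-learn regressors, `score` returns the coefficient of determination R², not a classification accuracy: it is 1 minus the ratio of the residual sum of squares to the total sum of squares around the mean of the true values. The sketch below checks this definition against `r2_score` on a small made-up example.

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up true and predicted RUL values (in cycles).
y_true_toy = np.array([100.0, 200.0, 300.0, 400.0])
y_pred_toy = np.array([110.0, 190.0, 320.0, 380.0])

# Residual sum of squares: squared errors of the predictions.
ss_res = np.sum((y_true_toy - y_pred_toy) ** 2)

# Total sum of squares: squared deviations from the mean of the true values.
ss_tot = np.sum((y_true_toy - y_true_toy.mean()) ** 2)

r2_manual = 1.0 - ss_res / ss_tot
r2_sklearn = r2_score(y_true_toy, y_pred_toy)

print(f"manual R^2:  {r2_manual:.4f}")
print(f"sklearn R^2: {r2_sklearn:.4f}")
```

A small gap between the training and test R² values, as in the two cells above, suggests limited overfitting under this particular row-level split.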
