## ⚡ Smart Grid Stability Prediction

Given *data about smart grids*, let's try to predict whether a given grid is **stable** or not (the `stabf` label), and also estimate its numerical **stability** value (the `stab` column).

We will use both classification and regression models from XGBoost to make our predictions.

Data source: https://www.kaggle.com/datasets/pcbreviglieri/smart-grid-stability

### Importing Libraries

In [1]:
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split

from xgboost import XGBClassifier, XGBRegressor

In [2]:
data = pd.read_csv('smart_grid_stability_augmented.csv')
data

Unnamed: 0,tau1,tau2,tau3,tau4,p1,p2,p3,p4,g1,g2,g3,g4,stab,stabf
0,2.959060,3.079885,8.381025,9.780754,3.763085,-0.782604,-1.257395,-1.723086,0.650456,0.859578,0.887445,0.958034,0.055347,unstable
1,9.304097,4.902524,3.047541,1.369357,5.067812,-1.940058,-1.872742,-1.255012,0.413441,0.862414,0.562139,0.781760,-0.005957,stable
2,8.971707,8.848428,3.046479,1.214518,3.405158,-1.207456,-1.277210,-0.920492,0.163041,0.766689,0.839444,0.109853,0.003471,unstable
3,0.716415,7.669600,4.486641,2.340563,3.963791,-1.027473,-1.938944,-0.997374,0.446209,0.976744,0.929381,0.362718,0.028871,unstable
4,3.134112,7.608772,4.943759,9.857573,3.525811,-1.125531,-1.845975,-0.554305,0.797110,0.455450,0.656947,0.820923,0.049860,unstable
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
59995,2.930406,2.376523,9.487627,6.187797,3.343416,-1.449106,-0.658054,-1.236256,0.601709,0.813512,0.779642,0.608385,0.023892,unstable
59996,3.392299,2.954947,1.274827,6.894759,4.349512,-0.952437,-1.663661,-1.733414,0.502079,0.285880,0.567242,0.366120,-0.025803,stable
59997,2.364034,8.776391,2.842030,1.008906,4.299976,-0.943884,-1.380719,-1.975373,0.487838,0.149286,0.986505,0.145984,-0.031810,stable
59998,9.631511,2.757071,3.994398,7.821347,2.514755,-0.649915,-0.966330,-0.898510,0.365246,0.889118,0.587558,0.818391,0.037789,unstable


In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 60000 entries, 0 to 59999
Data columns (total 14 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   tau1    60000 non-null  float64
 1   tau2    60000 non-null  float64
 2   tau3    60000 non-null  float64
 3   tau4    60000 non-null  float64
 4   p1      60000 non-null  float64
 5   p2      60000 non-null  float64
 6   p3      60000 non-null  float64
 7   p4      60000 non-null  float64
 8   g1      60000 non-null  float64
 9   g2      60000 non-null  float64
 10  g3      60000 non-null  float64
 11  g4      60000 non-null  float64
 12  stab    60000 non-null  float64
 13  stabf   60000 non-null  object 
dtypes: float64(13), object(1)
memory usage: 6.4+ MB


### Preprocessing

In [4]:
df = data.copy()

#### Task: Classification

In [5]:
df = df.drop('stab', axis=1)
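
Dropping `stab` matters for the classification task because it would leak the label. In every row printed earlier, a negative `stab` goes with `stable` and a positive one with `unstable`. The sketch below checks this on just those nine printed rows; whether it holds for all 60,000 rows is an assumption that would need to be verified on the full frame.

```python
import pandas as pd

# (stab, stabf) pairs copied from the nine rows printed in the data preview.
rows = pd.DataFrame({
    'stab': [0.055347, -0.005957, 0.003471, 0.028871, 0.049860,
             0.023892, -0.025803, -0.031810, 0.037789],
    'stabf': ['unstable', 'stable', 'unstable', 'unstable', 'unstable',
              'unstable', 'stable', 'stable', 'unstable'],
})

# Label implied by the sign of stab: negative -> stable, otherwise unstable.
implied = rows['stab'].lt(0).map({True: 'stable', False: 'unstable'})

# True when the sign of stab reproduces stabf in every sampled row.
sign_matches_label = bool((implied == rows['stabf']).all())
print(sign_matches_label)
```

If the same check passes on the full dataset, a classifier given `stab` would only need to learn a single threshold, which would make the reported accuracy meaningless.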

In [6]:
y = df['stabf'].copy()
y = y.map({'stable': 1, 'unstable': 0})  # encode labels: stable -> 1, unstable -> 0
X = df.drop('stabf', axis=1)
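
A label encoding like this can also be guarded against unexpected values. The following is a minimal sketch on a small made-up Series standing in for `df['stabf']`. It relies on `Series.map` turning any label that is missing from the dictionary into `NaN`, which makes a fail-fast check straightforward.

```python
import pandas as pd

# Made-up miniature of the 'stabf' column; the real column has 60,000 entries.
labels = pd.Series(['unstable', 'stable', 'unstable', 'stable'], name='stabf')

# Explicit mapping: a label absent from the dict becomes NaN rather than
# passing through unchanged.
encoded = labels.map({'stable': 1, 'unstable': 0})

# Fail fast if a label outside {'stable', 'unstable'} slipped in.
assert encoded.notna().all(), "unexpected label in stabf"

# Make the integer dtype explicit for the classifier.
encoded = encoded.astype('int64')
print(encoded.tolist())  # [0, 1, 0, 1]
```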


In [7]:
X

Unnamed: 0,tau1,tau2,tau3,tau4,p1,p2,p3,p4,g1,g2,g3,g4
0,2.959060,3.079885,8.381025,9.780754,3.763085,-0.782604,-1.257395,-1.723086,0.650456,0.859578,0.887445,0.958034
1,9.304097,4.902524,3.047541,1.369357,5.067812,-1.940058,-1.872742,-1.255012,0.413441,0.862414,0.562139,0.781760
2,8.971707,8.848428,3.046479,1.214518,3.405158,-1.207456,-1.277210,-0.920492,0.163041,0.766689,0.839444,0.109853
3,0.716415,7.669600,4.486641,2.340563,3.963791,-1.027473,-1.938944,-0.997374,0.446209,0.976744,0.929381,0.362718
4,3.134112,7.608772,4.943759,9.857573,3.525811,-1.125531,-1.845975,-0.554305,0.797110,0.455450,0.656947,0.820923
...,...,...,...,...,...,...,...,...,...,...,...,...
59995,2.930406,2.376523,9.487627,6.187797,3.343416,-1.449106,-0.658054,-1.236256,0.601709,0.813512,0.779642,0.608385
59996,3.392299,2.954947,1.274827,6.894759,4.349512,-0.952437,-1.663661,-1.733414,0.502079,0.285880,0.567242,0.366120
59997,2.364034,8.776391,2.842030,1.008906,4.299976,-0.943884,-1.380719,-1.975373,0.487838,0.149286,0.986505,0.145984
59998,9.631511,2.757071,3.994398,7.821347,2.514755,-0.649915,-0.966330,-0.898510,0.365246,0.889118,0.587558,0.818391


In [8]:
X.mean()

tau1    5.250000
tau2    5.250001
tau3    5.250001
tau4    5.250001
p1      3.750000
p2     -1.250000
p3     -1.250000
p4     -1.250000
g1      0.525000
g2      0.525000
g3      0.525000
g4      0.525000
dtype: float64

In [9]:
X.var()

tau1    7.520945
tau2    7.520960
tau3    7.520960
tau4    7.520960
p1      0.565698
p2      0.187504
p3      0.187504
p4      0.187504
g1      0.075210
g2      0.075209
g3      0.075209
g4      0.075209
dtype: float64
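
The means hint at a constraint among the power columns: the mean of `p1` (3.75) is the negated sum of the means of `p2`, `p3` and `p4` (3 × -1.25). The first five rows printed earlier satisfy the same relation row by row. The quick check below uses values copied from that table; whether the relation holds for every row of the dataset is an assumption that this sample does not prove.

```python
import numpy as np

# Columns p1..p4 for rows 0-4, copied from the printed data preview.
p = np.array([
    [3.763085, -0.782604, -1.257395, -1.723086],
    [5.067812, -1.940058, -1.872742, -1.255012],
    [3.405158, -1.207456, -1.277210, -0.920492],
    [3.963791, -1.027473, -1.938944, -0.997374],
    [3.525811, -1.125531, -1.845975, -0.554305],
])

# p1 + (p2 + p3 + p4) should be zero if p1 balances the other three.
residual = p[:, 0] + p[:, 1:].sum(axis=1)

# Tolerance allows for the six-decimal rounding of the printed values.
balanced = bool(np.allclose(residual, 0.0, atol=1e-5))
print(balanced)
```

If the relation does hold throughout, `p1` adds no information beyond `p2`, `p3` and `p4`, which is worth keeping in mind when reading feature importances.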

In [10]:
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, shuffle=True, random_state=1)

In [11]:
X_train.shape, X_test.shape

((42000, 12), (18000, 12))

In [12]:
y_train

51782    1
53781    0
55123    0
35823    1
48869    0
        ..
50057    1
32511    0
5192     0
12172    1
33003    0
Name: stabf, Length: 42000, dtype: int64

In [13]:
clf = XGBClassifier()
clf.fit(X_train, y_train)
print("Classifier trained.")

Classifier trained.


In [14]:
print("Classification Test Accuracy: {:.2f}%".format(clf.score(X_test, y_test)*100))

Classification Test Accuracy: 97.62%
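
Accuracy alone does not show how the errors split between the two classes. The sketch below computes confusion-matrix counts, precision and recall with plain NumPy. The label arrays are made up for illustration; in this notebook they would be `y_test` and `clf.predict(X_test)`.

```python
import numpy as np

# Made-up labels for illustration only (1 = stable, 0 = unstable).
y_true = np.array([1, 0, 0, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 0, 0, 0, 0])

# Confusion-matrix cells, treating "stable" (1) as the positive class.
tp = int(np.sum((y_true == 1) & (y_pred == 1)))
tn = int(np.sum((y_true == 0) & (y_pred == 0)))
fp = int(np.sum((y_true == 0) & (y_pred == 1)))
fn = int(np.sum((y_true == 1) & (y_pred == 0)))

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)  # share of predicted-stable grids that are stable
recall = tp / (tp + fn)     # share of stable grids that were found

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

For grid operation, a false positive (an unstable grid predicted as stable) is arguably the costlier mistake, so precision on the stable class is a useful number to report next to accuracy.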


#### Task: Regression

In [15]:
df = data.copy()

In [16]:
df = df.drop('stabf', axis=1)

In [17]:
y = df['stab'].copy()
X = df.drop('stab', axis=1)

In [18]:
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, shuffle=True, random_state=1)
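
Both tasks call `train_test_split` with the same `train_size` and `random_state` on the same number of rows, so they should end up with the same train/test partition. The sketch below checks that on a small made-up frame, not the real data; the column name `feature` is a placeholder.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up stand-in for the real data: 10 rows, one feature, two targets.
toy = pd.DataFrame({
    'feature': np.arange(10),
    'stab': np.linspace(-0.05, 0.05, 10),
})
toy['stabf'] = np.where(toy['stab'] < 0, 'stable', 'unstable')

X = toy[['feature']]

# Same rows, same train_size, same random_state; only the target differs.
_, _, y_cls_train, y_cls_test = train_test_split(
    X, toy['stabf'], train_size=0.7, shuffle=True, random_state=1)
_, _, y_reg_train, y_reg_test = train_test_split(
    X, toy['stab'], train_size=0.7, shuffle=True, random_state=1)

# True when both calls selected the same rows in the same order.
same_split = (y_cls_train.index.equals(y_reg_train.index)
              and y_cls_test.index.equals(y_reg_test.index))
print(same_split)
```

Sharing the partition means the classifier and the regressor are evaluated on the same held-out grids, so their scores can be compared row for row.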

In [19]:
reg = XGBRegressor()
reg.fit(X_train, y_train)
print("Regressor trained.")

Regressor trained.


In [21]:
print("Regression Test R^2 Score: {:.5f}".format(reg.score(X_test, y_test)))

Regression Test R^2 Score: 0.96223
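
The R^2 score printed by `reg.score` is the coefficient of determination, 1 - SS_res / SS_tot. The sketch below reproduces that formula with NumPy on made-up numbers, to show how to read the value: 1.0 is a perfect fit, and 0.0 is no better than always predicting the mean of the targets. The helper name `r2_manual` is hypothetical.

```python
import numpy as np

def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot

# Made-up targets and predictions for illustration only.
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = np.array([1.0, 2.0, 3.0, 4.0, 4.0])

r2 = r2_manual(y_true, y_pred)
r2_mean_baseline = r2_manual(y_true, np.full(5, y_true.mean()))

print(round(r2, 3))      # 0.9
print(r2_mean_baseline)  # 0.0
```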
