# Energy Grid Stability

In this notebook I use the given data about an energy grid to predict whether or not it is stable.

In [1]:
from __future__ import absolute_import, division, print_function, unicode_literals

import numpy as np
import pandas as pd

!pip install -q tensorflow==2.0.0-alpha0
import tensorflow as tf

from tensorflow import feature_column
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split

%matplotlib inline



### Preprocessing data

In [34]:
# read csv file into dataframe

energy = pd.read_csv('energy grid stability.csv')
energy.head()

Unnamed: 0,tau1,tau2,tau3,tau4,p1,p2,p3,p4,g1,g2,g3,g4,stabf
0,2.95906,3.079885,8.381025,9.780754,3.763085,-0.782604,-1.257395,-1.723086,0.650456,0.859578,0.887445,0.958034,unstable
1,9.304097,4.902524,3.047541,1.369357,5.067812,-1.940058,-1.872742,-1.255012,0.413441,0.862414,0.562139,0.78176,stable
2,8.971707,8.848428,3.046479,1.214518,3.405158,-1.207456,-1.27721,-0.920492,0.163041,0.766689,0.839444,0.109853,unstable
3,0.716415,7.6696,4.486641,2.340563,3.963791,-1.027473,-1.938944,-0.997374,0.446209,0.976744,0.929381,0.362718,unstable
4,3.134112,7.608772,4.943759,9.857573,3.525811,-1.125531,-1.845975,-0.554305,0.79711,0.45545,0.656947,0.820923,unstable


The target column in this data is categorical, which means I have to encode it as numbers so that it can be used in the model.

In [35]:
# process the categorical data

processed_energy = {"stabf": {"unstable": 0, "stable": 1}}


In [36]:
energy.replace(processed_energy, inplace=True)
energy.head()

Unnamed: 0,tau1,tau2,tau3,tau4,p1,p2,p3,p4,g1,g2,g3,g4,stabf
0,2.95906,3.079885,8.381025,9.780754,3.763085,-0.782604,-1.257395,-1.723086,0.650456,0.859578,0.887445,0.958034,0
1,9.304097,4.902524,3.047541,1.369357,5.067812,-1.940058,-1.872742,-1.255012,0.413441,0.862414,0.562139,0.78176,1
2,8.971707,8.848428,3.046479,1.214518,3.405158,-1.207456,-1.27721,-0.920492,0.163041,0.766689,0.839444,0.109853,0
3,0.716415,7.6696,4.486641,2.340563,3.963791,-1.027473,-1.938944,-0.997374,0.446209,0.976744,0.929381,0.362718,0
4,3.134112,7.608772,4.943759,9.857573,3.525811,-1.125531,-1.845975,-0.554305,0.79711,0.45545,0.656947,0.820923,0
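A dictionary-based `replace` leaves any value that is not in the mapping untouched, so a misspelled or unexpected label would pass through unnoticed. The following is a minimal plain-Python sketch of the same encoding with an explicit check. `LABEL_MAP` and `encode_labels` are hypothetical names used only for illustration and are not part of the notebook.

```python
# Hypothetical helper illustrating the unstable -> 0, stable -> 1 encoding.
LABEL_MAP = {"unstable": 0, "stable": 1}

def encode_labels(labels):
    """Map each string label to its integer code, failing loudly on unknown values."""
    unknown = sorted(set(labels) - set(LABEL_MAP))
    if unknown:
        raise ValueError(f"unexpected labels: {unknown}")
    return [LABEL_MAP[label] for label in labels]

# Labels of the first five rows shown in the table above.
print(encode_labels(["unstable", "stable", "unstable", "unstable", "unstable"]))
# [0, 1, 0, 0, 0]
```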


Now that all of the data is in numerical form and ready to use, I split it into three segments that will be used for training, validating, and testing the model.

In [38]:
train, test = train_test_split(energy, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)
print(len(train), 'train examples')
print(len(val), 'validation examples')
print(len(test), 'test examples')

6400 train examples
1600 validation examples
2000 test examples
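The counts above follow from applying a 20% split twice: the second split takes 20% of the remaining 80%, so the overall proportions are 64% train, 16% validation, and 20% test. The arithmetic can be sketched as follows. `split_sizes` is a hypothetical helper, and it assumes the held-out part of each split is rounded up to a whole row (which makes no difference for 10,000 rows).

```python
def split_sizes(n_rows, test_pct=20, val_pct=20):
    """Return (train, val, test) sizes for two successive percentage splits.

    Assumes the held-out part of each split is rounded up to a whole row.
    """
    n_test = -(-n_rows * test_pct // 100)      # ceiling division on integers
    n_remaining = n_rows - n_test
    n_val = -(-n_remaining * val_pct // 100)   # 20% of what is left after the test split
    n_train = n_remaining - n_val
    return n_train, n_val, n_test

print(split_sizes(10_000))  # (6400, 1600, 2000)
```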


The next step is to create an input pipeline which will allow us to use feature columns as a bridge to map from the columns in the Pandas dataframe to features used to train the model.

In [41]:
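def df_to_dataset(dataframe, shuffle=True, batch_size=32):
  # work on a copy so the caller's dataframe keeps its 'stabf' column
  dataframe = dataframe.copy()
  labels = dataframe.pop('stabf')
  ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
  if shuffle:
    # a buffer as large as the dataset gives a full shuffle
    ds = ds.shuffle(buffer_size=len(dataframe))
  ds = ds.batch(batch_size)
  return ds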


In [42]:
batch_size = 5  # A small batch size is used for demonstration purposes
train_ds = df_to_dataset(train, batch_size=batch_size)
val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size)
test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size)

Now that the pipeline is created, let's inspect one batch to see the data it returns.

In [44]:
for feature_batch, label_batch in train_ds.take(1):
  print('Every feature:', list(feature_batch.keys()))
  print('A batch of targets:', label_batch )

Every feature: ['tau1', 'tau2', 'tau3', 'tau4', 'p1', 'p2', 'p3', 'p4', 'g1', 'g2', 'g3', 'g4']
A batch of targets: tf.Tensor([1 0 0 1 0], shape=(5,), dtype=int32)
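To make the shuffle-then-batch behaviour of the pipeline more concrete, here is a rough plain-Python sketch of the same idea. It is not how `tf.data` is implemented; `shuffled_batches` and its `seed` parameter are hypothetical and exist only for this illustration.

```python
import random

def shuffled_batches(rows, batch_size, shuffle=True, seed=None):
    """Yield consecutive batches of `rows`; the last batch may be smaller."""
    rows = list(rows)                      # copy so the caller's data is untouched
    if shuffle:
        random.Random(seed).shuffle(rows)  # full shuffle before batching
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

batches = list(shuffled_batches(range(12), batch_size=5, seed=0))
print([len(b) for b in batches])  # [5, 5, 2]
```

With the 6,400 training rows and a batch size of 5, the same logic gives 1,280 batches per epoch.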


The next step is to create feature columns for the data that will be used to predict the label. I used numeric feature columns because they are the most basic type and every input in this dataset is already a number.

In [46]:
feature_columns = []

# numeric cols
for header in ['tau1', 'tau2', 'tau3', 'tau4', 'p1', 'p2', 'p3', 'p4', 'g1', 'g2', 'g3', 'g4']:
  feature_columns.append(feature_column.numeric_column(header))


Now that the feature columns have been created, we can use them to create a feature layer, which will serve as the input to the model.

In [47]:
feature_layer = tf.keras.layers.DenseFeatures(feature_columns)

In this example I used a standard model with two dense hidden layers to try to predict the stability of the energy grid.

In [50]:
model = tf.keras.Sequential([
  feature_layer,
  layers.Dense(128, activation='relu'),
  layers.Dense(128, activation='relu'),
  layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(train_ds,
          validation_data=val_ds,
          epochs=5)

Epoch 1/5


Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x123374240>
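For readers less familiar with the `sigmoid` output and the `binary_crossentropy` loss used in the model above, the quantities involved can be sketched in plain Python. This is an illustration of the formulas for a single example, not the library's implementation, and the clipping value `eps` is an assumption made here to avoid taking the log of zero.

```python
import math

def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one example: small when p is close to the true label (0 or 1)."""
    p = min(max(p, eps), 1.0 - eps)  # keep p away from exactly 0 or 1
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

print(sigmoid(0.0))                 # 0.5
print(binary_crossentropy(1, 0.9))  # confident and correct: low loss
print(binary_crossentropy(1, 0.1))  # confident and wrong: high loss
```

A falling loss during training therefore means the predicted probabilities are moving closer to the true 0/1 labels.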

Accuracy increased and loss decreased with each epoch, which is a good sign. Next we can check whether the model generalizes by evaluating it on the test set.

In [51]:
loss, accuracy = model.evaluate(test_ds)
print("Accuracy", accuracy)

Accuracy 0.92
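The accuracy reported here is the fraction of examples whose predicted probability falls on the correct side of a 0.5 threshold. A minimal plain-Python sketch of that calculation is shown below. The function name `binary_accuracy` is hypothetical, and the probabilities are made-up illustrative values, not outputs of the trained model.

```python
def binary_accuracy(y_true, probabilities, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    predictions = [1 if p > threshold else 0 for p in probabilities]
    correct = sum(int(t == y) for t, y in zip(y_true, predictions))
    return correct / len(y_true)

labels = [1, 0, 0, 1, 0]                  # illustrative targets
probabilities = [0.9, 0.2, 0.6, 0.7, 0.1]  # illustrative, not real model outputs
print(binary_accuracy(labels, probabilities))  # 0.8
```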


The model performed just as well on the test set as it did on the training set, which is a good sign and means that it is able to make predictions on data it has not seen.