## Power Outage Data

- Collected from: https://www.oe.netl.doe.gov/OE417_annual_summary.aspx
- Report information: https://www.oe.netl.doe.gov/docs/OE417_Form_Instructions_05312021.pdf




In [32]:
import pandas as pd
import numpy as np

In [33]:
# Load the 2015-2019 annual summaries and concatenate them into one DataFrame
data_list = []
for year in range(2015, 2020):
    data = pd.read_excel(f"../Data/{year}_Annual_Summary.xls", header=1)
    data_list.append(data)
df = pd.concat(data_list, axis=0, ignore_index=True)

# Approach adapted from:
# https://stackoverflow.com/questions/20906474/import-multiple-csv-files-into-pandas-and-concatenate-into-one-dataframe



In [34]:
df.isnull().sum()

Month                           0
Date Event Began                0
Time Event Began                0
Date of Restoration             0
Time of Restoration             0
Area Affected                   0
NERC Region                     6
Alert Criteria                  0
Event Type                      0
Demand Loss (MW)                0
Number of Customers Affected    0
dtype: int64

In [35]:
df = df.fillna("Unknown")

In [36]:
df.shape

(846, 11)

In [37]:
df_ca = df[df['Area Affected'].str.lower().str.contains("california")]

In [38]:
df_ca.shape

(122, 11)

**Drop rows where `Demand Loss (MW)` is 0**

In [39]:
df_ca = df_ca[df_ca['Demand Loss (MW)'] != '0']

In [42]:
df_ca.shape

(89, 11)
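
The filter above compares against the string `'0'`, so it only removes rows where the value is stored as exactly that text. If the column also holds an integer `0` or a value such as `'0.0'`, those rows would slip through. A minimal sketch of a more tolerant filter, assuming a mixed-type column, is shown below. The `sample` frame is a made-up stand-in for `df_ca`, not data from the OE-417 files.

```python
import pandas as pd

# Illustrative stand-in for df_ca; these rows are invented for the example.
sample = pd.DataFrame({
    "Area Affected": ["California", "Northern California", "California", "California"],
    "Demand Loss (MW)": ["0", 0, "Unknown", "15"],
})

# Coerce to numbers; non-numeric entries such as "Unknown" become NaN.
loss = pd.to_numeric(sample["Demand Loss (MW)"], errors="coerce")

# Keep rows whose loss is nonzero or unknown (NaN != 0 evaluates to True).
kept = sample[loss != 0]
print(kept)
```

Applied to `df_ca`, this may drop a different number of rows than the string comparison, so the resulting shape is worth checking against `(89, 11)`.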

In [43]:
df_ca

| | Month | Date Event Began | Time Event Began | Date of Restoration | Time of Restoration | Area Affected | NERC Region | Alert Criteria | Event Type | Demand Loss (MW) | Number of Customers Affected |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | February | 2015-02-04 00:00:00 | 11:55:00 | 2015-02-04 00:00:00 | 11:56:00 | Vollmers, California | WECC | Suspected Physical Attack | Vandalism | Unknown | Unknown |
| 9 | February | 2015-02-05 00:00:00 | 11:20:00 | 2015-02-05 00:00:00 | 11:21:00 | Dunismuir, California | WECC | Suspected Physical Attack | Vandalism | Unknown | Unknown |
| 10 | February | 2015-02-06 00:00:00 | 20:58:00 | Unknown | Unknown | Northern California | WECC | Loss of electric service to more than 50,000 c... | Severe Weather - Wind | Unknown | 65000 |
| 29 | March | 2015-03-26 00:00:00 | 15:21:00 | 2015-03-26 00:00:00 | 16:59:00 | Contra Costa County, California | WECC | Electrical System Separation (Islanding) where... | System Operations | 15 | Unknown |
| 30 | March | 2015-03-29 00:00:00 | 04:26:00 | 2015-03-29 00:00:00 | 09:21:00 | California | WECC | Suspected Physical Attack | Vandalism | Unknown | Unknown |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 756 | May | 05/24/2019 | 21:47:00 | 05/24/2019 | 23:58:00 | California: | WECC | Electrical System Separation (Islanding) where... | Severe Weather | 20 | 10961 |
| 758 | June | 06/02/2019 | 18:19:00 | 06/02/2019 | 20:43:00 | California: | WECC | Electrical System Separation (Islanding) where... | Severe Weather/Transmission Interruption | Unknown | Unknown |
| 769 | June | 06/12/2019 | 14:56:00 | 06/12/2019 | 15:50:00 | California: Imperial County, Riverside County; | WECC | Firm load shedding of 100 Megawatts or more im... | Generation Inadequacy | 982 | 30907 |
| 808 | July | 07/23/2019 | 03:22:00 | 07/23/2019 | 05:40:00 | California: Santa Cruz County; | WECC | Damage or destruction of its Facility that res... | Vandalism | Unknown | 25 |


**Should we also drop rows where `Demand Loss (MW)` is "Unknown"?**

**Some of these might be actual power outage events.**

In [41]:
df_ca[df_ca['Demand Loss (MW)'] == 'Unknown'].head()

| | Month | Date Event Began | Time Event Began | Date of Restoration | Time of Restoration | Area Affected | NERC Region | Alert Criteria | Event Type | Demand Loss (MW) | Number of Customers Affected |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | February | 2015-02-04 00:00:00 | 11:55:00 | 2015-02-04 00:00:00 | 11:56:00 | Vollmers, California | WECC | Suspected Physical Attack | Vandalism | Unknown | Unknown |
| 9 | February | 2015-02-05 00:00:00 | 11:20:00 | 2015-02-05 00:00:00 | 11:21:00 | Dunismuir, California | WECC | Suspected Physical Attack | Vandalism | Unknown | Unknown |
| 10 | February | 2015-02-06 00:00:00 | 20:58:00 | Unknown | Unknown | Northern California | WECC | Loss of electric service to more than 50,000 c... | Severe Weather - Wind | Unknown | 65000 |
| 30 | March | 2015-03-29 00:00:00 | 04:26:00 | 2015-03-29 00:00:00 | 09:21:00 | California | WECC | Suspected Physical Attack | Vandalism | Unknown | Unknown |
| 34 | April | 2015-04-06 00:00:00 | 08:12:00 | 2015-04-06 00:00:00 | 12:08:00 | Butte County, California | WECC | Loss of electric service to more than 50,000 c... | System Operations | Unknown | 80000 |
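
One way to approach the question above is to split the "Unknown" rows by whether `Number of Customers Affected` is known: an unknown demand loss with tens of thousands of customers affected still looks like an outage, while a row with both fields unknown gives no evidence either way. The sketch below shows this split on a small made-up frame. The names `likely_outage` and `no_evidence` are hypothetical labels introduced here, and the rule itself is an assumption, not a decision taken in this notebook.

```python
import pandas as pd

# Illustrative stand-in for df_ca; values are invented for the example.
sample = pd.DataFrame({
    "Event Type": ["Vandalism", "Severe Weather - Wind", "System Operations"],
    "Demand Loss (MW)": ["Unknown", "Unknown", "15"],
    "Number of Customers Affected": ["Unknown", "65000", "Unknown"],
})

# Convert both columns to numbers; "Unknown" becomes NaN.
loss = pd.to_numeric(sample["Demand Loss (MW)"], errors="coerce")
customers = pd.to_numeric(sample["Number of Customers Affected"], errors="coerce")

unknown_loss = loss.isna()

# Unknown loss, but a positive customer count: probably a real outage.
likely_outage = unknown_loss & customers.gt(0)

# Unknown loss and no positive customer count: nothing indicates an outage.
no_evidence = unknown_loss & ~customers.gt(0)

print(sample[likely_outage])
print(sample[no_evidence])
```

Under this rule, rows flagged `no_evidence` would be the candidates to drop, while `likely_outage` rows would be kept despite the missing demand figure.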
