# Data Cleaning

## Problem

Problem 8: Utilizing Yelp data to estimate the number of businesses in a given locality and categorizing them according to FEMA's seven lifelines

Problem Statement: Prior to and during a disaster, it is important to understand the projected and actual effects of the event on the community, including its economic effects on critical services. FEMA has identified seven “lifelines” that require attention during a disaster:

(1) Safety and Security\
(2) Food, Water, Sheltering\
(3) Health and Medical\
(4) Energy (power, fuel)\
(5) Communications\
(6) Transportation\
(7) Hazardous Waste

This tool will utilize Yelp to estimate the effects of the event on each of the seven lifelines. This can include the number of businesses or services in each category or even, if available, their status (if provided by users and reviews in Yelp). The tool will search for relevant data and categorize it according to a list of impacted neighborhoods or a list of affected zip codes. It will provide an estimation of the potential impact of the event, at least according to the data available in Yelp.
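
The categorization step described above can be sketched in plain Python. This is a rough illustration only: the keyword lists in `LIFELINE_KEYWORDS`, the helper names `classify_business` and `count_by_zip`, and the record layout (`zip_code`, `categories`) are all assumptions made for this example, not the mapping or schema used by the project or by Yelp.

```python
from collections import Counter

# Hypothetical keyword lists; a real mapping would need to be curated by hand.
LIFELINE_KEYWORDS = {
    "Safety and Security": ["police", "fire department", "security"],
    "Food, Water, Sheltering": ["grocery", "restaurant", "hotel", "shelter"],
    "Health and Medical": ["hospital", "pharmacy", "urgent care"],
    "Energy (power, fuel)": ["gas station", "utilities", "propane"],
    "Communications": ["mobile phones", "internet service"],
    "Transportation": ["public transportation", "towing", "airport"],
    "Hazardous Waste": ["hazardous waste", "recycling"],
}


def classify_business(categories):
    """Return the first lifeline whose keyword appears in any category title."""
    lowered = [c.lower() for c in categories]
    for lifeline, keywords in LIFELINE_KEYWORDS.items():
        if any(kw in cat for kw in keywords for cat in lowered):
            return lifeline
    return None  # business does not map to any lifeline


def count_by_zip(businesses):
    """Count businesses per (zip code, lifeline) pair, skipping unmatched ones."""
    counts = Counter()
    for business in businesses:
        lifeline = classify_business(business["categories"])
        if lifeline is not None:
            counts[(business["zip_code"], lifeline)] += 1
    return counts


# Made-up records standing in for business search results.
sample = [
    {"name": "A", "zip_code": "02108", "categories": ["Pharmacy"]},
    {"name": "B", "zip_code": "02108", "categories": ["Grocery"]},
    {"name": "C", "zip_code": "02139", "categories": ["Gas Stations"]},
    {"name": "D", "zip_code": "02139", "categories": ["Nail Salons"]},
]
counts = count_by_zip(sample)
```

A first-match rule keeps the sketch short; a business that plausibly serves several lifelines would need a multi-label variant.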

## Imports

In [1]:
import requests
import json
import re

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt

from bs4 import BeautifulSoup

import time

from nltk.stem import WordNetLemmatizer
from nltk.tokenize import RegexpTokenizer
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords

from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
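# the stop_words module is not importable in newer scikit-learn releases;
# the English stop-word list is exposed as ENGLISH_STOP_WORDS instead
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS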

from sklearn import preprocessing

from sklearn.feature_selection import RFE
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2

from sklearn.ensemble import ExtraTreesClassifier

from sklearn.metrics import confusion_matrix

#from sklearn.naive_bayes import CategoricalNB
from sklearn.naive_bayes import GaussianNB
from sklearn.naive_bayes import MultinomialNB

%matplotlib inline

[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/tringuyen/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


## Disaster Declarations Summaries (FEMA)

In [3]:
#read in csv file
disaster = pd.read_csv('../datasets/DisasterDeclarationsSummaries.csv')
disaster.head()

Unnamed: 0,disasterNumber,ihProgramDeclared,iaProgramDeclared,paProgramDeclared,hmProgramDeclared,state,declarationDate,fyDeclared,disasterType,incidentType,title,incidentBeginDate,incidentEndDate,disasterCloseOutDate,declaredCountyArea,placeCode,hash,lastRefresh,id
0,1,0,1,1,1,GA,1953-05-02T00:00:00.000Z,1953,DR,Tornado,TORNADO,1953-05-02T00:00:00.000Z,1953-05-02T00:00:00.000Z,1954-06-01T00:00:00.000Z,,,1dcb40d0664d22d39de787b706b0fa69,2019-07-26T18:08:57.368Z,5d1bbd8c8bdcfa6efb32fd8d
1,2,0,1,1,1,TX,1953-05-15T00:00:00.000Z,1953,DR,Tornado,TORNADO & HEAVY RAINFALL,1953-05-15T00:00:00.000Z,1953-05-15T00:00:00.000Z,1958-01-01T00:00:00.000Z,,,61612cea5779e361b429799098974b6a,2019-07-26T18:08:57.370Z,5d1bbd8c8bdcfa6efb32fd8e
2,3,0,1,1,1,LA,1953-05-29T00:00:00.000Z,1953,DR,Flood,FLOOD,1953-05-29T00:00:00.000Z,1953-05-29T00:00:00.000Z,1960-02-01T00:00:00.000Z,,,86f3e47785cb7acc51364d4535d36101,2019-07-26T18:08:57.369Z,5d1bbd8c8bdcfa6efb32fd8f
3,6,0,1,1,1,MI,1953-06-09T00:00:00.000Z,1953,DR,Tornado,TORNADO,1953-06-09T00:00:00.000Z,1953-06-09T00:00:00.000Z,1956-03-30T00:00:00.000Z,,,2208518c84c44f8e4164248d47f89ead,2019-07-26T18:08:57.369Z,5d1bbd8c8bdcfa6efb32fd92
4,4,0,1,1,1,MI,1953-06-02T00:00:00.000Z,1953,DR,Tornado,TORNADO,1953-06-02T00:00:00.000Z,1953-06-02T00:00:00.000Z,1956-02-01T00:00:00.000Z,,,1dbe5937a01fc74c8e699912e3f555cb,2019-07-26T18:08:57.370Z,5d1bbd8c8bdcfa6efb32fd91


In [4]:
# keep only the Massachusetts declarations
df = disaster.loc[disaster['state'] == 'MA'].copy()
df.head()

Unnamed: 0,disasterNumber,ihProgramDeclared,iaProgramDeclared,paProgramDeclared,hmProgramDeclared,state,declarationDate,fyDeclared,disasterType,incidentType,title,incidentBeginDate,incidentEndDate,disasterCloseOutDate,declaredCountyArea,placeCode,hash,lastRefresh,id
6,7,0,1,1,1,MA,1953-06-11T00:00:00.000Z,1953,DR,Tornado,TORNADO,1953-06-11T00:00:00.000Z,1953-06-11T00:00:00.000Z,1956-06-01T00:00:00.000Z,,,1118ef337c5993b0e939b63ea4440c69,2019-07-26T18:08:57.369Z,5d1bbd8d8bdcfa6efb32fd94
17,22,0,1,1,1,MA,1954-09-02T00:00:00.000Z,1954,DR,Hurricane,HURRICANES,1954-09-02T00:00:00.000Z,1954-09-02T00:00:00.000Z,1956-12-01T00:00:00.000Z,,,83306db895dfa0b9da2b93c00e1ff0e0,2019-07-26T18:08:57.373Z,5d1bbd8d8bdcfa6efb32fda1
41,43,0,1,1,1,MA,1955-08-20T00:00:00.000Z,1955,DR,Hurricane,HURRICANE & FLOODS,1955-08-20T00:00:00.000Z,1955-08-20T00:00:00.000Z,1962-03-20T00:00:00.000Z,,,80cfeac2e8c043c24574d02909e2d63c,2019-07-26T18:08:57.382Z,5d1bbd8e8bdcfa6efb32fdb7
2154,325,0,1,1,0,MA,1972-03-06T00:00:00.000Z,1972,DR,Flood,SEVERE STORMS & FLOODING,1972-03-06T00:00:00.000Z,1972-03-06T00:00:00.000Z,1977-02-04T00:00:00.000Z,"Essex (County)(in PMSA 1120,4160,7090)",99009.0,a72066a412f6bea052805392a5994c8a,2019-07-26T18:08:59.385Z,5d1bbd988bdcfa6efb330603
2157,325,0,1,1,0,MA,1972-03-06T00:00:00.000Z,1972,DR,Flood,SEVERE STORMS & FLOODING,1972-03-06T00:00:00.000Z,1972-03-06T00:00:00.000Z,1977-02-04T00:00:00.000Z,"Plymouth (County)(in (P)MSA 1120,1200,5400)",99023.0,5a2b24f2de32bbe66004432b3d6f9a6c,2019-07-26T18:08:59.385Z,5d1bbd988bdcfa6efb330611


In [5]:
# remove unnecessary columns, then drop rows that still have missing values
df.drop(columns=['incidentEndDate', 'disasterCloseOutDate'], inplace=True)
df.dropna(inplace=True)
df.head()

Unnamed: 0,disasterNumber,ihProgramDeclared,iaProgramDeclared,paProgramDeclared,hmProgramDeclared,state,declarationDate,fyDeclared,disasterType,incidentType,title,incidentBeginDate,declaredCountyArea,placeCode,hash,lastRefresh,id
2154,325,0,1,1,0,MA,1972-03-06T00:00:00.000Z,1972,DR,Flood,SEVERE STORMS & FLOODING,1972-03-06T00:00:00.000Z,"Essex (County)(in PMSA 1120,4160,7090)",99009.0,a72066a412f6bea052805392a5994c8a,2019-07-26T18:08:59.385Z,5d1bbd988bdcfa6efb330603
2157,325,0,1,1,0,MA,1972-03-06T00:00:00.000Z,1972,DR,Flood,SEVERE STORMS & FLOODING,1972-03-06T00:00:00.000Z,"Plymouth (County)(in (P)MSA 1120,1200,5400)",99023.0,5a2b24f2de32bbe66004432b3d6f9a6c,2019-07-26T18:08:59.385Z,5d1bbd988bdcfa6efb330611
2183,325,0,1,1,0,MA,1972-03-06T00:00:00.000Z,1972,DR,Flood,SEVERE STORMS & FLOODING,1972-03-06T00:00:00.000Z,Suffolk (County),99025.0,3e49b58d09182e5b4afcb771d9b26f3c,2019-07-26T18:08:59.386Z,5d1bbd988bdcfa6efb330607
2210,325,0,1,1,0,MA,1972-03-06T00:00:00.000Z,1972,DR,Flood,SEVERE STORMS & FLOODING,1972-03-06T00:00:00.000Z,"Norfolk (County)(in PMSA 1120,1200,6060)",99021.0,8b491886dc5dac1ccad76e9c677dfb45,2019-07-26T18:08:59.381Z,5d1bbd988bdcfa6efb330605
2557,357,0,1,0,0,MA,1972-09-28T00:00:00.000Z,1972,DR,Fishing Losses,TOXIC ALGAE IN COASTAL WATERS,1972-09-28T00:00:00.000Z,Barnstable (County),99001.0,534d287c2413197cb2ae9b28b532c79a,2019-07-26T18:08:59.677Z,5d1bbd998bdcfa6efb33078f


In [6]:
# Sum the program-declaration flags for each county and incident type
program_cols = ['ihProgramDeclared', 'iaProgramDeclared', 'paProgramDeclared', 'hmProgramDeclared']
df2 = df.groupby(['declaredCountyArea', 'incidentType'])[program_cols].sum()
df2.head()

declaredCountyArea,incidentType,ihProgramDeclared,iaProgramDeclared,paProgramDeclared,hmProgramDeclared
Barnstable (County),Coastal Storm,0,2,2,0
Barnstable (County),Fishing Losses,0,1,0,0
Barnstable (County),Flood,0,1,1,0
Barnstable (County),Hurricane,0,1,8,2
Barnstable (County),Severe Storm(s),0,0,4,4


In [7]:
# Pivot the incident types into columns and flatten the result into a regular dataframe
df3 = pd.DataFrame(df2.unstack().to_records())
df3.head()

Unnamed: 0,declaredCountyArea,"('ihProgramDeclared', 'Coastal Storm')","('ihProgramDeclared', 'Fire')","('ihProgramDeclared', 'Fishing Losses')","('ihProgramDeclared', 'Flood')","('ihProgramDeclared', 'Hurricane')","('ihProgramDeclared', 'Other')","('ihProgramDeclared', 'Severe Ice Storm')","('ihProgramDeclared', 'Severe Storm(s)')","('ihProgramDeclared', 'Snow')",...,"('hmProgramDeclared', 'Fire')","('hmProgramDeclared', 'Fishing Losses')","('hmProgramDeclared', 'Flood')","('hmProgramDeclared', 'Hurricane')","('hmProgramDeclared', 'Other')","('hmProgramDeclared', 'Severe Ice Storm')","('hmProgramDeclared', 'Severe Storm(s)')","('hmProgramDeclared', 'Snow')","('hmProgramDeclared', 'Terrorist')","('hmProgramDeclared', 'Tornado')"
0,Barnstable (County),0.0,,0.0,0.0,0.0,,,0.0,0.0,...,,0.0,0.0,2.0,,,4.0,0.0,,
1,Berkshire (County),,,,0.0,1.0,,0.0,1.0,0.0,...,,,0.0,1.0,,1.0,4.0,1.0,,
2,"Bristol (County)(in (P)MSA 1120,1200,2480,5400...",,,0.0,0.0,0.0,,,2.0,0.0,...,,0.0,0.0,2.0,,,5.0,0.0,0.0,
3,Dukes (County),0.0,,0.0,0.0,0.0,,,0.0,0.0,...,,0.0,0.0,2.0,,,3.0,0.0,,
4,"Essex (County)(in PMSA 1120,4160,7090)",0.0,0.0,0.0,1.0,0.0,0.0,0.0,2.0,0.0,...,0.0,0.0,1.0,0.0,0.0,1.0,8.0,2.0,,


In [8]:
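# Turn tuple-style labels such as "('ihProgramDeclared', 'Flood')" into
# "ihProgramDeclared_Flood"; the first column is renamed to 'county'
def clean_columns(df):
    cleaned = ['county']
    for col in df.columns[1:]:
        cleaned.append(col.replace('(', '').replace(')', '').replace("'", '').replace(', ', '_'))
    return cleaned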
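
As an alternative to the `to_records()` round trip followed by string replacement, the labels can be flattened while they are still tuples. The sketch below uses a small made-up frame; the names `toy`, `grouped`, and `wide` are illustrative and not part of the notebook.

```python
import pandas as pd

# Made-up stand-in for the filtered FEMA frame (values are illustrative).
toy = pd.DataFrame(
    {
        "declaredCountyArea": ["Barnstable (County)", "Barnstable (County)", "Dukes (County)"],
        "incidentType": ["Flood", "Severe Storm(s)", "Flood"],
        "iaProgramDeclared": [1, 0, 2],
        "paProgramDeclared": [1, 4, 0],
    }
)

grouped = toy.groupby(["declaredCountyArea", "incidentType"])[
    ["iaProgramDeclared", "paProgramDeclared"]
].sum()

# unstack() moves incidentType into the columns, giving (program, incident) tuples.
wide = grouped.unstack()

# Join each tuple with '_' and drop parentheses, mirroring clean_columns().
wide.columns = [
    f"{program}_{incident}".replace("(", "").replace(")", "")
    for program, incident in wide.columns
]

# Bring the county back as an ordinary column.
wide = wide.reset_index().rename(columns={"declaredCountyArea": "county"})
```

Working on the tuples directly avoids depending on how the labels happen to be rendered as strings.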

In [9]:
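# clean the county names
df3.columns = clean_columns(df3)
df3.fillna(0, inplace=True)
# strip the literal ' (County)' marker (parentheses escaped so they match literally);
# note that any trailing '(in PMSA ...)' annotation is left in place
df3['county'] = df3['county'].apply(lambda x: re.sub(r' \(County\)', '', x))
df3['county'] = df3['county'].apply(lambda x: re.sub(r'\.', '', x))
df3['county'] = df3['county'].apply(lambda x: x.lower().strip())
df3.head()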

Unnamed: 0,county,ihProgramDeclared_Coastal Storm,ihProgramDeclared_Fire,ihProgramDeclared_Fishing Losses,ihProgramDeclared_Flood,ihProgramDeclared_Hurricane,ihProgramDeclared_Other,ihProgramDeclared_Severe Ice Storm,ihProgramDeclared_Severe Storms,ihProgramDeclared_Snow,...,hmProgramDeclared_Fire,hmProgramDeclared_Fishing Losses,hmProgramDeclared_Flood,hmProgramDeclared_Hurricane,hmProgramDeclared_Other,hmProgramDeclared_Severe Ice Storm,hmProgramDeclared_Severe Storms,hmProgramDeclared_Snow,hmProgramDeclared_Terrorist,hmProgramDeclared_Tornado
0,barnstable,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,2.0,0.0,0.0,4.0,0.0,0.0,0.0
1,berkshire,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,1.0,0.0,1.0,4.0,1.0,0.0,0.0
2,"bristol(in (p)msa 1120,1200,2480,5400,6060)",0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0,...,0.0,0.0,0.0,2.0,0.0,0.0,5.0,0.0,0.0,0.0
3,dukes,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,2.0,0.0,0.0,3.0,0.0,0.0,0.0
4,"essex(in pmsa 1120,4160,7090)",0.0,0.0,0.0,1.0,0.0,0.0,0.0,2.0,0.0,...,0.0,0.0,1.0,0.0,0.0,1.0,8.0,2.0,0.0,0.0
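
The output above still carries the `(in PMSA ...)` annotation on some county names (for example `essex(in pmsa 1120,4160,7090)`), which would keep those rows from matching plain county names in other tables. One possible way to drop it is sketched below; `normalize_county` is a hypothetical helper, and it assumes everything from `(County)` onward can be discarded.

```python
import re


def normalize_county(raw):
    """Lower-case a declaredCountyArea value and drop '(County)' plus anything after it."""
    name = re.sub(r"\s*\(County\).*$", "", raw)
    return name.replace(".", "").lower().strip()


# Values copied from the dataframe previews shown earlier in this section.
examples = [
    "Barnstable (County)",
    "Essex (County)(in PMSA 1120,4160,7090)",
    "Bristol (County)(in (P)MSA 1120,1200,2480,5400...",
]
cleaned = [normalize_county(value) for value in examples]
```

It could be applied to the raw values with something like `df3['county'].apply(normalize_county)` before the names are lower-cased.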


In [10]:
# export the clean file
df3.to_csv('../datasets/disaster_program_clean.csv', index=False)

In [11]:
df3.shape

(14, 45)

## 2016 MA Census by Counties (ATSDR)

In [13]:
c_counties = pd.read_csv('../datasets/Massachusetts_COUNTY.csv')
c_counties.head()

Unnamed: 0,FID,ST,STATE,ST_ABBR,COUNTY,FIPS,LOCATION,AREA_SQMI,E_TOTPOP,M_TOTPOP,...,F_CROWD,F_NOVEH,F_GROUPQ,F_THEME4,F_TOTAL,E_UNINSUR,M_UNINSUR,EP_UNINSUR,MP_UNINSUR,E_DAYPOP
0,0,25,MASSACHUSETTS,MA,Dukes,25007,"Dukes County, Massachusetts",103.265273,17137.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1267.0,459.0,7.4,2.7,17381.0
1,1,25,MASSACHUSETTS,MA,Norfolk,25021,"Norfolk County, Massachusetts",396.092739,691218.0,0.0,...,0.0,0.0,0.0,0.0,0.0,14667.0,1152.0,2.1,0.2,708405.0
2,2,25,MASSACHUSETTS,MA,Worcester,25027,"Worcester County, Massachusetts",1510.635965,813589.0,0.0,...,0.0,0.0,0.0,0.0,0.0,24182.0,1437.0,3.0,0.2,785294.0
3,3,25,MASSACHUSETTS,MA,Barnstable,25001,"Barnstable County, Massachusetts",394.20624,214703.0,0.0,...,0.0,0.0,0.0,0.0,1.0,7507.0,781.0,3.5,0.4,210568.0
4,4,25,MASSACHUSETTS,MA,Essex,25009,"Essex County, Massachusetts",492.418905,769362.0,0.0,...,0.0,0.0,0.0,0.0,1.0,26396.0,1431.0,3.5,0.2,734690.0


In [14]:
c_counties.isnull().sum()

FID           0
ST            0
STATE         0
ST_ABBR       0
COUNTY        0
             ..
E_UNINSUR     0
M_UNINSUR     0
EP_UNINSUR    0
MP_UNINSUR    0
E_DAYPOP      0
Length: 124, dtype: int64

In [15]:
c_counties.shape

(14, 124)

In [20]:
c_counties.duplicated()

0     False
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
11    False
12    False
13    False
dtype: bool

In [28]:
c_counties.columns = c_counties.columns.str.lower()

In [30]:
c_counties.head()

Unnamed: 0,fid,st,state,st_abbr,county,fips,location,area_sqmi,e_totpop,m_totpop,...,f_crowd,f_noveh,f_groupq,f_theme4,f_total,e_uninsur,m_uninsur,ep_uninsur,mp_uninsur,e_daypop
0,0,25,MASSACHUSETTS,MA,Dukes,25007,"Dukes County, Massachusetts",103.265273,17137.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1267.0,459.0,7.4,2.7,17381.0
1,1,25,MASSACHUSETTS,MA,Norfolk,25021,"Norfolk County, Massachusetts",396.092739,691218.0,0.0,...,0.0,0.0,0.0,0.0,0.0,14667.0,1152.0,2.1,0.2,708405.0
2,2,25,MASSACHUSETTS,MA,Worcester,25027,"Worcester County, Massachusetts",1510.635965,813589.0,0.0,...,0.0,0.0,0.0,0.0,0.0,24182.0,1437.0,3.0,0.2,785294.0
3,3,25,MASSACHUSETTS,MA,Barnstable,25001,"Barnstable County, Massachusetts",394.20624,214703.0,0.0,...,0.0,0.0,0.0,0.0,1.0,7507.0,781.0,3.5,0.4,210568.0
4,4,25,MASSACHUSETTS,MA,Essex,25009,"Essex County, Massachusetts",492.418905,769362.0,0.0,...,0.0,0.0,0.0,0.0,1.0,26396.0,1431.0,3.5,0.2,734690.0


In [38]:
list(c_counties.columns)

['fid',
 'st',
 'state',
 'st_abbr',
 'county',
 'fips',
 'location',
 'area_sqmi',
 'e_totpop',
 'm_totpop',
 'e_hu',
 'm_hu',
 'e_hh',
 'm_hh',
 'e_pov',
 'm_pov',
 'e_unemp',
 'm_unemp',
 'e_pci',
 'm_pci',
 'e_nohsdp',
 'm_nohsdp',
 'e_age65',
 'm_age65',
 'e_age17',
 'm_age17',
 'e_disabl',
 'm_disabl',
 'e_sngpnt',
 'm_sngpnt',
 'e_minrty',
 'm_minrty',
 'e_limeng',
 'm_limeng',
 'e_munit',
 'm_munit',
 'e_mobile',
 'm_mobile',
 'e_crowd',
 'm_crowd',
 'e_noveh',
 'm_noveh',
 'e_groupq',
 'm_groupq',
 'ep_pov',
 'mp_pov',
 'ep_unemp',
 'mp_unemp',
 'ep_pci',
 'mp_pci',
 'ep_nohsdp',
 'mp_nohsdp',
 'ep_age65',
 'mp_age65',
 'ep_age17',
 'mp_age17',
 'ep_disabl',
 'mp_disabl',
 'ep_sngpnt',
 'mp_sngpnt',
 'ep_minrty',
 'mp_minrty',
 'ep_limeng',
 'mp_limeng',
 'ep_munit',
 'mp_munit',
 'ep_mobile',
 'mp_mobile',
 'ep_crowd',
 'mp_crowd',
 'ep_noveh',
 'mp_noveh',
 'ep_groupq',
 'mp_groupq',
 'epl_pov',
 'epl_unemp',
 'epl_pci',
 'epl_nohsdp',
 'spl_theme1',
 'rpl_theme1',
 'epl_age

In [31]:
c_counties.to_csv('../datasets/ma_2016_census_by_counties_clean.csv', index=False)

## MA Electric Power Transmission Lines (HIFLD, last updated 09/2019)

In [22]:
eptl = pd.read_csv('../datasets/MA_Electric_Power_Transmission_Lines.csv')
eptl.head()

Unnamed: 0,OBJECTID,ID,TYPE,STATUS,NAICS_CODE,NAICS_DESC,SOURCE,SOURCEDATE,VAL_METHOD,VAL_DATE,OWNER,VOLTAGE,VOLT_CLASS,INFERRED,SUB_1,SUB_2,SHAPE__Length
0,1,140809,"AC, OVERHEAD",IN SERVICE,221121,ELECTRIC BULK POWER TRANSMISSION AND CONTROL,IMAGERY,2014-04-16T00:00:00.000Z,IMAGERY,2017-02-15T00:00:00.000Z,NOT AVAILABLE,161.0,100-161,Y,PHILLIPS BEND,JOHN SEVIER,24018.513114
1,2,140837,"AC, OVERHEAD",IN SERVICE,221121,ELECTRIC BULK POWER TRANSMISSION AND CONTROL,"IMAGERY, https://www9.nationalgridus.com/oasis...",2015-06-16T00:00:00.000Z,IMAGERY,2019-03-05T00:00:00.000Z,NOT AVAILABLE,115.0,100-161,Y,TAP140359,TAP140373,5972.919614
2,3,140811,"AC, OVERHEAD",IN SERVICE,221121,ELECTRIC BULK POWER TRANSMISSION AND CONTROL,"IMAGERY, OpenStreetMap",2014-06-20T00:00:00.000Z,IMAGERY,2017-03-20T00:00:00.000Z,NOT AVAILABLE,115.0,100-161,Y,GENTILLY ROAD,MICHOUD STATION,14253.092823
3,4,140813,"AC, OVERHEAD",IN SERVICE,221121,ELECTRIC BULK POWER TRANSMISSION AND CONTROL,"IMAGERY, EIA 860",2016-10-04T00:00:00.000Z,IMAGERY/OTHER,2018-05-09T00:00:00.000Z,NOT AVAILABLE,161.0,100-161,Y,UNKNOWN137689,TAP137690,8950.844317
4,5,140814,OVERHEAD,IN SERVICE,221121,ELECTRIC BULK POWER TRANSMISSION AND CONTROL,"IMAGERY, EIA 861",2014-06-23T00:00:00.000Z,IMAGERY,2014-06-23T00:00:00.000Z,ALABAMA POWER CO,-999999.0,100-161,Y,UNKNOWN112122,UNKNOWN112349,5339.587603


In [23]:
eptl.isnull().sum()

OBJECTID         0
ID               0
TYPE             0
STATUS           0
NAICS_CODE       0
NAICS_DESC       0
SOURCE           0
SOURCEDATE       0
VAL_METHOD       0
VAL_DATE         0
OWNER            0
VOLTAGE          0
VOLT_CLASS       0
INFERRED         0
SUB_1            0
SUB_2            0
SHAPE__Length    0
dtype: int64
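
A zero null count does not necessarily mean the data is complete: the preview above shows `-999999.0` in `VOLTAGE` and `NOT AVAILABLE` in `OWNER`, which look like placeholders for missing values. Assuming that reading is right (it is an inference from the preview, not something the source states), they could be converted to real missing values before counting. The frame `toy` below is made up for illustration.

```python
import pandas as pd

# Made-up rows that mirror the placeholder values seen in the preview.
toy = pd.DataFrame(
    {
        "OWNER": ["NOT AVAILABLE", "NOT AVAILABLE", "ALABAMA POWER CO"],
        "VOLTAGE": [161.0, 115.0, -999999.0],
    }
)

cleaned = toy.copy()
# mask() replaces entries where the condition is True with a missing value.
cleaned["VOLTAGE"] = cleaned["VOLTAGE"].mask(cleaned["VOLTAGE"] == -999999.0)
cleaned["OWNER"] = cleaned["OWNER"].mask(cleaned["OWNER"] == "NOT AVAILABLE")

# The null count now reflects the placeholders as well.
missing = cleaned.isna().sum()
```

The same pattern would apply to `eptl` itself, using the column names as they are at that point in the notebook.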

In [32]:
eptl.columns = eptl.columns.str.lower()

In [33]:
eptl.head()

Unnamed: 0,objectid,id,type,status,naics_code,naics_desc,source,sourcedate,val_method,val_date,owner,voltage,volt_class,inferred,sub_1,sub_2,shape__length
0,1,140809,"AC, OVERHEAD",IN SERVICE,221121,ELECTRIC BULK POWER TRANSMISSION AND CONTROL,IMAGERY,2014-04-16T00:00:00.000Z,IMAGERY,2017-02-15T00:00:00.000Z,NOT AVAILABLE,161.0,100-161,Y,PHILLIPS BEND,JOHN SEVIER,24018.513114
1,2,140837,"AC, OVERHEAD",IN SERVICE,221121,ELECTRIC BULK POWER TRANSMISSION AND CONTROL,"IMAGERY, https://www9.nationalgridus.com/oasis...",2015-06-16T00:00:00.000Z,IMAGERY,2019-03-05T00:00:00.000Z,NOT AVAILABLE,115.0,100-161,Y,TAP140359,TAP140373,5972.919614
2,3,140811,"AC, OVERHEAD",IN SERVICE,221121,ELECTRIC BULK POWER TRANSMISSION AND CONTROL,"IMAGERY, OpenStreetMap",2014-06-20T00:00:00.000Z,IMAGERY,2017-03-20T00:00:00.000Z,NOT AVAILABLE,115.0,100-161,Y,GENTILLY ROAD,MICHOUD STATION,14253.092823
3,4,140813,"AC, OVERHEAD",IN SERVICE,221121,ELECTRIC BULK POWER TRANSMISSION AND CONTROL,"IMAGERY, EIA 860",2016-10-04T00:00:00.000Z,IMAGERY/OTHER,2018-05-09T00:00:00.000Z,NOT AVAILABLE,161.0,100-161,Y,UNKNOWN137689,TAP137690,8950.844317
4,5,140814,OVERHEAD,IN SERVICE,221121,ELECTRIC BULK POWER TRANSMISSION AND CONTROL,"IMAGERY, EIA 861",2014-06-23T00:00:00.000Z,IMAGERY,2014-06-23T00:00:00.000Z,ALABAMA POWER CO,-999999.0,100-161,Y,UNKNOWN112122,UNKNOWN112349,5339.587603


In [34]:
eptl.to_csv('../datasets/ma_electric_transmission_lines_clean.csv', index=False)