# Goal

Utilizing Yelp data to estimate the number of businesses in a given locality and categorize them by lifeline

Will's notes
Population density
15% create
30% limitation



## Problem

Problem 8: Utilizing Yelp data to estimate the number of businesses in a given locality and categorizing them according to FEMA's seven lifelines

Problem Statement: Prior to and during a disaster, it is important to understand the projected and actual effects of the event on the community, including its economic effects on critical services. FEMA has identified seven “lifelines” that require attention during a disaster:

(1) Safety and Security\
(2) Food, Water, Sheltering\
(3) Health and Medical\
(4) Energy (power, fuel)\
(5) Communications\
(6) Transportation\
(7) Hazardous Waste

This tool will utilize Yelp to estimate the effects of the event on each of the seven lifelines. This can include the number of businesses or services in each category or even, if available, their status (if provided by users and reviews in Yelp). The tool will search for relevant data and categorize it according to a list of impacted neighborhoods or a list of affected zip codes. It will provide an estimation of the potential impact of the event, at least according to the data available in Yelp.

## Imports

In [1]:
import requests
import json

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt

from bs4 import BeautifulSoup

import time

import regex as re

from nltk.stem import WordNetLemmatizer
from nltk.tokenize import RegexpTokenizer
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords

from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.feature_extraction import stop_words

from sklearn import preprocessing

from sklearn.feature_selection import RFE
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2

from sklearn.ensemble import ExtraTreesClassifier

from sklearn.metrics import confusion_matrix

#from sklearn.naive_bayes import CategoricalNB
from sklearn.naive_bayes import GaussianNB
from sklearn.naive_bayes import MultinomialNB

%matplotlib inline

[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/tringuyen/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


## Yelp Fushion API

In [2]:
api_key='OEwJjivomfYJlF4fY7s8ShrogVV8EFF11Um66hxN2hRpsDE5YgF5qeyVLOkOMY32sRDSXW7mfdC1amhjE1IhCjyqEhwNwjudYL82aH4jCjM-wqp6GR46OR3HmDk7XnYx'
headers = {'Authorization': 'Bearer %s' % api_key}


## (1) Safety & Security

In [3]:
url='https://api.yelp.com/v3/businesses/search'

# In the dictionary, term can take values like food, cafes or businesses like McDonalds
#https://www.yelp.com/developers/documentation/v3/all_category_list --Categories param
#https://www.yelp.com/developers/documentation/v3/business_search --Param
params_safety = {'categories':['firedepartments','policedepartments'],
                 'location':'Boston',
                 'limit': 50,
                 'radius':40000
                }


In [4]:
# Making a get request to the API
req=requests.get(url, params_safety, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [5]:
data = req.json()
posts_safety = data['businesses']
len(posts_safety)

18

## For Loop to Extract Name and Category of Each Business

In [6]:
posts_safety[1]['coordinates']['latitude']

42.3699354188619

In [7]:
names = []
categories = []
latitude = []
longitude = []
zipcode = []

for i in range(len(posts_safety)):
    names.append(posts_safety[i]['name'])
    categories.append(posts_safety[i]['categories'][0]['alias'])
    latitude.append(posts_safety[i]['coordinates']['latitude'])
    longitude.append(posts_safety[i]['coordinates']['longitude'])
    zipcode.append(posts_safety[i]['location']['zip_code'])

## Create a Dataframe for the Lifeline 1 Businesses

In [15]:
df_safety = pd.DataFrame(columns=['Business', 'Category'])
df_safety['Business'] = names
df_safety['Category'] = categories
df_safety['Lifeline'] = 1
df_safety['Latitude'] = latitude
df_safety['Longitude'] = longitude
df_safety['Zipcode']= zipcode

In [16]:
df_safety.head()

Unnamed: 0,Business,Category,Lifeline,Latitude,Longitude,Zipcode
0,Boston Police Department,policedepartments,1,42.361798,-71.060297,2114
1,Harvard University Police Department,policedepartments,1,42.369935,-71.112013,2138
2,Boston Police Headquarters,policedepartments,1,42.334077,-71.090885,2120
3,Somerville Fire Department,firedepartments,1,42.390857,-71.091024,2145
4,Cambridge Police Department,policedepartments,1,42.36726,-71.086151,2141


In [17]:
df_safety.to_csv('../datasets/safety_df.csv',index=False)

#### <i>Repeat Process for the Remaining Lifelines

## (2) Food, Water, Sheltering


In [18]:
params_food_shelter = {'categories':['foodbanks',
                                     'animalshelters',
                                    'homelessshelters',
                                    'communitycenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000}


In [19]:
req=requests.get(url, params_food_shelter, headers=headers)

# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [20]:
data = req.json()
posts_food_shelter = data['businesses']
len(posts_food_shelter)

28

In [22]:
names = []
categories = []
latitude = []
longitude = []
zipcode = []
for i in range(len(posts_food_shelter)):
    names.append(posts_food_shelter[i]['name'])
    categories.append(posts_food_shelter[i]['categories'][0]['alias'])
    latitude.append(posts_food_shelter[i]['coordinates']['latitude'])
    longitude.append(posts_food_shelter[i]['coordinates']['longitude'])
    zipcode.append(posts_food_shelter[i]['location']['zip_code'])

In [23]:
df_food_shelter = pd.DataFrame(columns = ['Business', 'Category'])
df_food_shelter['Business'] = names
df_food_shelter['Category'] = categories
df_food_shelter['Lifeline'] = 2
df_food_shelter['Latitude'] = latitude
df_food_shelter['Longitude'] = longitude
df_food_shelter['Zipcode'] = zipcode
df_food_shelter.head()

Unnamed: 0,Business,Category,Lifeline,Latitude,Longitude,Zipcode
0,MSPCA Angell,vet,2,42.323087,-71.111216,2130
1,Animal Rescue League of Boston,animalshelters,2,42.347221,-71.070002,2116
2,Greater Boston Food Bank,foodbanks,2,42.33423,-71.06566,2118
3,Ellen M Gifford Cat Shelter,animalshelters,2,42.34165,-71.16762,2135
4,Center for Arts at the Armory,venues,2,42.389736,-71.105905,2143


In [24]:
df_food_shelter.shape

(28, 6)

In [25]:
df_food_shelter.to_csv('../datasets/food_shelter_df.csv',index=False)

## (3) Health and Medical


In [26]:
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000
                        }

In [27]:
req=requests.get(url, params_health_medical, headers=headers)

# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [28]:
data = req.json()
posts_health_medical = data['businesses']
len(posts_health_medical)

50

In [29]:
names = []
categories = []
latitude = []
longitude = []
zipcode = []
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

### Lifeline 3 - Health and Medical Offset 50

In [30]:
#added offset parameter
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':50
                        }

req=requests.get(url, params_health_medical, headers=headers)

# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [31]:
data = req.json()
posts_health_medical = data['businesses']
len(posts_health_medical)

50

In [32]:
#add to created lists
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [33]:
len(names)

100

### Lifeline 3 - Health and Medical Offset 100

In [34]:
#offset = 100
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':100
                        }

req=requests.get(url, params_health_medical, headers=headers)

In [35]:
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [36]:
data = req.json()
posts_health_medical = data['businesses']
len(posts_health_medical)

50

In [37]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [38]:
len(names)

150

### Lifeline 3 - Health and Medical Offset 150

In [39]:
#offset = 150
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':150
                        }

req=requests.get(url, params_health_medical, headers=headers)

In [40]:
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [41]:
len(posts_health_medical)

50

In [42]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [43]:
len(names)

200

### Lifeline 3 - Health and Medical Offset 200

In [44]:
#offset = 200
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':200
                        }

In [45]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [46]:
len(posts_health_medical)

50

In [47]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [48]:
len(names)

250

### Lifeline 3 - Health and Medical Offset 250

In [49]:
#offset = 250
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':250
                        }

In [50]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [51]:
len(posts_health_medical)

50

In [52]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [53]:
len(names)

300

### Lifeline 3 - Health and Medical Offset 300

In [54]:
#offset = 300
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':300
                        }

In [55]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [56]:
len(posts_health_medical)

50

In [57]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [58]:
len(names)

350

### Lifeline 3 - Health and Medical Offset 350

In [59]:
#offset = 350
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':350
                        }

In [60]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [61]:
len(posts_health_medical)

50

In [62]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [63]:
len(names)

400

### Lifeline 3 - Health and Medical Offset 400

In [64]:
#offset = 400
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':400
                        }

In [65]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [66]:
len(posts_health_medical)

50

In [67]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [68]:
len(names)

450

### Lifeline 3 - Health and Medical Offset 450

In [69]:
#offset = 450
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':450
                        }

In [70]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [71]:
len(posts_health_medical)

50

In [72]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [73]:
len(names)

500

### Lifeline 3 - Health and Medical Offset 500

In [74]:
#offset = 500
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':500
                        }

In [75]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [76]:
len(posts_health_medical)

50

In [77]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [78]:
len(names)

550

### Lifeline 3 - Health and Medical Offset 550

In [79]:
#offset = 550
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':550
                        }

In [80]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [81]:
len(posts_health_medical)

50

In [82]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [83]:
len(names)

600

### Lifeline 3 - Health and Medical Offset 600

In [84]:
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':600
                        }

In [85]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [86]:
len(posts_health_medical)

50

In [87]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [88]:
len(names)

650

### Lifeline 3 - Health and Medical Offset 650

In [89]:
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':650
                        }

In [90]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [91]:
len(posts_health_medical)

50

In [92]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [93]:
len(names)

700

### Lifeline 3 - Health and Medical Offset 700

In [94]:
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':700
                        }

In [95]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [96]:
len(posts_health_medical)

50

In [97]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [98]:
len(names)

750

### Lifeline 3 - Health and Medical Offset 750

In [99]:
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':750
                        }

In [100]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [101]:
len(posts_health_medical)

50

In [102]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [103]:
len(names)

800

### Lifeline 3 - Health and Medical Offset 800

In [104]:
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':800
                        }

In [105]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [106]:
len(posts_health_medical)

50

In [107]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [108]:
len(names)

850

### Lifeline 3 - Health and Medical Offset 850

In [109]:
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':850
                        }

In [110]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [111]:
len(posts_health_medical)

50

In [112]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [113]:
len(names)

900

### Lifeline 3 - Health and Medical Offset 900

In [114]:
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':900
                        }

In [115]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [116]:
len(posts_health_medical)

50

In [117]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [118]:
len(names)

950

### Lifeline 3 - Health and Medical Offset 950

In [119]:
params_health_medical = {'categories':['emergencymedicine',
                                    'emergencyrooms',
                                    'hospitals',
                                      'medcenters'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                         'offset':950
                        }

In [120]:
req=requests.get(url, params_health_medical, headers=headers)
print('The status code is {}'.format(req.status_code))

The status code is 200


In [121]:
len(posts_health_medical)

50

In [122]:
for i in range(len(posts_health_medical)):
    names.append(posts_health_medical[i]['name'])
    categories.append(posts_health_medical[i]['categories'][0]['alias'])
    latitude.append(posts_health_medical[i]['coordinates']['latitude'])
    longitude.append(posts_health_medical[i]['coordinates']['longitude'])
    zipcode.append(posts_health_medical[i]['location']['zip_code'])

In [123]:
len(names)

1000

In [124]:
df_health_medical = pd.DataFrame(columns = ['Business', 'Category'])
df_health_medical['Business'] = names
df_health_medical['Category'] = categories
df_health_medical['Lifeline'] = 3
df_health_medical['Latitude'] = latitude
df_health_medical['Longitude'] = longitude
df_health_medical['Zipcode'] = zipcode
df_health_medical.head()

Unnamed: 0,Business,Category,Lifeline,Latitude,Longitude,Zipcode
0,Massachusetts General Hospital,hospitals,3,42.363155,-71.068834,2114
1,Brigham & Women's Hospital,hospitals,3,42.335929,-71.106716,2115
2,Boston Children's Hospital,hospitals,3,42.337273,-71.105997,2115
3,New England Baptist Hospital,hospitals,3,42.32919,-71.10626,2120
4,Fenway Health,medcenters,3,42.344077,-71.098988,2215


In [125]:
df_health_medical.shape

(1000, 6)

In [126]:
df_health_medical.to_csv('../datasets/health_medical_df.csv',index=False)

## (4) Energy (power, fuel)


In [127]:
#rework params to not inclue restaurants like kung fu tea
params_power_fuel = {'categories':['fueldocks',
                                   'servicestations',
                                   'utilities',
                                   'electricitysuppliers',
                                   'naturalgassuppliers',
                                   'watersuppliers'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000
                        }

In [128]:
req=requests.get(url, params_power_fuel, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [129]:
data = req.json()
posts_power_fuel = data['businesses']
len(posts_power_fuel)

50

In [130]:
names = []
categories = []
latitude = []
longitude = []
zipcode = []
for i in range(len(posts_power_fuel)):
    names.append(posts_power_fuel[i]['name'])
    categories.append(posts_power_fuel[i]['categories'][0]['alias'])
    latitude.append(posts_power_fuel[i]['coordinates']['latitude'])
    longitude.append(posts_power_fuel[i]['coordinates']['longitude'])
    zipcode.append(posts_power_fuel[i]['location']['zip_code'])

In [131]:
len(names)

50

### Lifeline 4 - Energy Offset 50

In [132]:
params_power_fuel = {'categories':['fueldocks',
                                   'servicestations',
                                   'utilities',
                                   'electricitysuppliers',
                                   'naturalgassuppliers',
                                   'watersuppliers'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':50
                        }

In [133]:
req=requests.get(url, params_power_fuel, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [134]:
data = req.json()
posts_power_fuel = data['businesses']
len(posts_power_fuel)

50

In [135]:
for i in range(len(posts_power_fuel)):
    names.append(posts_power_fuel[i]['name'])
    categories.append(posts_power_fuel[i]['categories'][0]['alias'])
    latitude.append(posts_power_fuel[i]['coordinates']['latitude'])
    longitude.append(posts_power_fuel[i]['coordinates']['longitude'])
    zipcode.append(posts_power_fuel[i]['location']['zip_code'])

In [136]:
len(names)

100

### Lifeline 4 - Energy Offset 100

In [137]:
params_power_fuel = {'categories':['fueldocks',
                                   'servicestations',
                                   'utilities',
                                   'electricitysuppliers',
                                   'naturalgassuppliers',
                                   'watersuppliers'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':100
                        }

In [138]:
req=requests.get(url, params_power_fuel, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [139]:
data = req.json()
posts_power_fuel = data['businesses']
len(posts_power_fuel)

50

In [140]:
for i in range(len(posts_power_fuel)):
    names.append(posts_power_fuel[i]['name'])
    categories.append(posts_power_fuel[i]['categories'][0]['alias'])
    latitude.append(posts_power_fuel[i]['coordinates']['latitude'])
    longitude.append(posts_power_fuel[i]['coordinates']['longitude'])
    zipcode.append(posts_power_fuel[i]['location']['zip_code'])

In [141]:
len(names)

150

### Lifeline 4 - Energy Offset 150

In [142]:
params_power_fuel = {'categories':['fueldocks',
                                   'servicestations',
                                   'utilities',
                                   'electricitysuppliers',
                                   'naturalgassuppliers',
                                   'watersuppliers'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':150
                        }

In [143]:
req=requests.get(url, params_power_fuel, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [144]:
data = req.json()
posts_power_fuel = data['businesses']
len(posts_power_fuel)

50

In [145]:
for i in range(len(posts_power_fuel)):
    names.append(posts_power_fuel[i]['name'])
    categories.append(posts_power_fuel[i]['categories'][0]['alias'])
    latitude.append(posts_power_fuel[i]['coordinates']['latitude'])
    longitude.append(posts_power_fuel[i]['coordinates']['longitude'])
    zipcode.append(posts_power_fuel[i]['location']['zip_code'])

In [146]:
len(names)

200

In [147]:
df_power_fuel = pd.DataFrame(columns = ['Business', 'Category'])
df_power_fuel['Business'] = names
df_power_fuel['Category'] = categories
df_power_fuel['Lifeline'] = 4
df_power_fuel['Latitude'] = latitude
df_power_fuel['Longitude'] = longitude
df_power_fuel['Zipcode'] = zipcode
df_power_fuel.head()

Unnamed: 0,Business,Category,Lifeline,Latitude,Longitude,Zipcode
0,Columbia Road Gulf Service,servicestations,4,42.320594,-71.056486,2125
1,Magazine Beach Shell Service,autorepair,4,42.356675,-71.114257,2139
2,Nick's Auto Service and Repair,autorepair,4,42.407815,-71.109323,2155
3,Foreign & Domestic Auto Service,servicestations,4,42.411106,-71.120628,2155
4,A-Z Auto Center Gas,servicestations,4,42.344437,-71.1418,2135


In [148]:
df_power_fuel.to_csv('../datasets/power_fuel_df.csv',index=False)

## (5) Communication


In [149]:
#rework params to not inclue restaurants like kung fu tea
params_communications = {'categories':['telecommunications',
                                   'printmedia',
                                   'radiostations',
                                   'televisionstations',],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000
                        }

In [150]:
req=requests.get(url, params_communications, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [151]:
data = req.json()
posts_communications = data['businesses']
len(posts_communications)

50

In [152]:
names = []
categories = []
latitude = []
longitude = []
zipcode = []

for i in range(len(posts_communications)):
    names.append(posts_communications[i]['name'])
    categories.append(posts_communications[i]['categories'][0]['alias'])
    latitude.append(posts_communications[i]['coordinates']['latitude'])
    longitude.append(posts_communications[i]['coordinates']['longitude'])
    zipcode.append(posts_communications[i]['location']['zip_code'])

In [153]:
len(names)

50

### Lifeline 5 - Communication Offset 50

In [154]:
params_communications = {'categories':['telecommunications',
                                   'printmedia',
                                   'radiostations',
                                   'televisionstations',],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':50
                        }

In [155]:
req=requests.get(url, params_communications, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [156]:
data = req.json()
posts_communications = data['businesses']
len(posts_communications)

50

In [157]:
for i in range(len(posts_communications)):
    names.append(posts_communications[i]['name'])
    categories.append(posts_communications[i]['categories'][0]['alias'])
    latitude.append(posts_communications[i]['coordinates']['latitude'])
    longitude.append(posts_communications[i]['coordinates']['longitude'])
    zipcode.append(posts_communications[i]['location']['zip_code'])

In [158]:
len(names)

100

### Lifeline 5 - Communication Offset 100

In [159]:
params_communications = {'categories':['telecommunications',
                                   'printmedia',
                                   'radiostations',
                                   'televisionstations',],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':100
                        }

In [160]:
req=requests.get(url, params_communications, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [161]:
data = req.json()
posts_communications = data['businesses']
len(posts_communications)

35

In [162]:
for i in range(len(posts_communications)):
    names.append(posts_communications[i]['name'])
    categories.append(posts_communications[i]['categories'][0]['alias'])
    latitude.append(posts_communications[i]['coordinates']['latitude'])
    longitude.append(posts_communications[i]['coordinates']['longitude'])
    zipcode.append(posts_communications[i]['location']['zip_code'])

In [163]:
len(names)

135

In [164]:
df_communication = pd.DataFrame(columns=['Business','Category'])
df_communication['Business'] = names
df_communication['Category'] = categories
df_communication['Lifeline'] = 5
df_communication['Latitude'] = latitude
df_communication['Longitude'] = longitude
df_communication['Zipcode'] = zipcode
df_communication.head()

Unnamed: 0,Business,Category,Lifeline,Latitude,Longitude,Zipcode
0,WERS 88.9FM,radiostations,5,42.352707,-71.064118,2108
1,WBUR 90.9 FM,radiostations,5,42.350547,-71.115528,2215
2,The David Pakman Show,radiostations,5,42.31001,-71.11348,2130
3,WMBR 88.1 FM,radiostations,5,42.359175,-71.087723,2142
4,Weekly Dig,printmedia,5,42.34347,-71.0635,2118


In [165]:
df_communication.to_csv('../datasets/communication_df.csv',index=False)

## (6) Transportation


In [166]:
#rework params to not inclue restaurants like kung fu tea
params_transportation = {'categories':['airlines',
                                   'busstations',
                                   'ferries',
                                   'metrostations',
                                   'publictransport',
                                   'trains',
                                   'taxis'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000
                        }

In [167]:
req=requests.get(url, params_transportation, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [168]:
data = req.json()
posts_transportation = data['businesses']
len(posts_transportation)

50

In [169]:
names = []
categories = []
latitude = []
longitude = []
zipcode = []
for i in range(len(posts_transportation)):
    names.append(posts_transportation[i]['name'])
    categories.append(posts_transportation[i]['categories'][0]['alias'])
    latitude.append(posts_transportation[i]['coordinates']['latitude'])
    longitude.append(posts_transportation[i]['coordinates']['longitude'])
    zipcode.append(posts_transportation[i]['location']['zip_code'])

In [170]:
len(names)

50

### Lifeline 6 - Transportation Offset 50

In [171]:
params_transportation = {'categories':['airlines',
                                   'busstations',
                                   'ferries',
                                   'metrostations',
                                   'publictransport',
                                   'trains',
                                   'taxis'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':50
                        }

In [172]:
req=requests.get(url, params_transportation, headers=headers)

# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [173]:
data = req.json()
posts_transportation = data['businesses']
len(posts_transportation)

50

In [174]:
for i in range(len(posts_transportation)):
    names.append(posts_transportation[i]['name'])
    categories.append(posts_transportation[i]['categories'][0]['alias'])
    latitude.append(posts_transportation[i]['coordinates']['latitude'])
    longitude.append(posts_transportation[i]['coordinates']['longitude'])
    zipcode.append(posts_transportation[i]['location']['zip_code'])

In [175]:
len(names)

100

### Lifeline 6 - Transportation Offset 100

In [176]:
params_transportation = {'categories':['airlines',
                                   'busstations',
                                   'ferries',
                                   'metrostations',
                                   'publictransport',
                                   'trains',
                                   'taxis'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':100
                        }

In [177]:
req=requests.get(url, params_transportation, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [178]:
data = req.json()
posts_transportation = data['businesses']
len(posts_transportation)

50

In [179]:
for i in range(len(posts_transportation)):
    names.append(posts_transportation[i]['name'])
    categories.append(posts_transportation[i]['categories'][0]['alias'])
    latitude.append(posts_transportation[i]['coordinates']['latitude'])
    longitude.append(posts_transportation[i]['coordinates']['longitude'])
    zipcode.append(posts_transportation[i]['location']['zip_code'])

In [180]:
len(names)

150

## Lifeline 6 - Transportation Offset 150

In [181]:
params_transportation = {'categories':['airlines',
                                   'busstations',
                                   'ferries',
                                   'metrostations',
                                   'publictransport',
                                   'trains',
                                   'taxis'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':150
                        }

In [182]:
req=requests.get(url, params_transportation, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [183]:
data = req.json()
posts_transportation = data['businesses']
len(posts_transportation)

50

In [184]:
for i in range(len(posts_transportation)):
    names.append(posts_transportation[i]['name'])
    categories.append(posts_transportation[i]['categories'][0]['alias'])
    latitude.append(posts_transportation[i]['coordinates']['latitude'])
    longitude.append(posts_transportation[i]['coordinates']['longitude'])
    zipcode.append(posts_transportation[i]['location']['zip_code'])

In [185]:
len(names)

200

### Lifeline 6 - Transportation Offset 200

In [186]:
params_transportation = {'categories':['airlines',
                                   'busstations',
                                   'ferries',
                                   'metrostations',
                                   'publictransport',
                                   'trains',
                                   'taxis'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':200
                        }

In [187]:
req=requests.get(url, params_transportation, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [188]:
data = req.json()
posts_transportation = data['businesses']
len(posts_transportation)

50

In [189]:
for i in range(len(posts_transportation)):
    names.append(posts_transportation[i]['name'])
    categories.append(posts_transportation[i]['categories'][0]['alias'])
    latitude.append(posts_transportation[i]['coordinates']['latitude'])
    longitude.append(posts_transportation[i]['coordinates']['longitude'])
    zipcode.append(posts_transportation[i]['location']['zip_code'])

In [190]:
len(names)

250

### Lifeline 6 - Transportation Offset 250 

In [191]:
params_transportation = {'categories':['airlines',
                                   'busstations',
                                   'ferries',
                                   'metrostations',
                                   'publictransport',
                                   'trains',
                                   'taxis'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':250
                        }

In [192]:
req=requests.get(url, params_transportation, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [193]:
data = req.json()
posts_transportation = data['businesses']
len(posts_transportation)

50

In [194]:
for i in range(len(posts_transportation)):
    names.append(posts_transportation[i]['name'])
    categories.append(posts_transportation[i]['categories'][0]['alias'])
    latitude.append(posts_transportation[i]['coordinates']['latitude'])
    longitude.append(posts_transportation[i]['coordinates']['longitude'])
    zipcode.append(posts_transportation[i]['location']['zip_code'])

In [195]:
len(names)

300

### Lifeline 6 - Transportation Offset 300

In [196]:
params_transportation = {'categories':['airlines',
                                   'busstations',
                                   'ferries',
                                   'metrostations',
                                   'publictransport',
                                   'trains',
                                   'taxis'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':300
                        }

In [197]:
req=requests.get(url, params_transportation, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [198]:
data = req.json()
posts_transportation = data['businesses']
len(posts_transportation)

50

In [199]:
for i in range(len(posts_transportation)):
    names.append(posts_transportation[i]['name'])
    categories.append(posts_transportation[i]['categories'][0]['alias'])
    latitude.append(posts_transportation[i]['coordinates']['latitude'])
    longitude.append(posts_transportation[i]['coordinates']['longitude'])
    zipcode.append(posts_transportation[i]['location']['zip_code'])

In [200]:
len(names)

350

### Lifeline 6 - Transportation Offset 350

In [201]:
params_transportation = {'categories':['airlines',
                                   'busstations',
                                   'ferries',
                                   'metrostations',
                                   'publictransport',
                                   'trains',
                                   'taxis'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000,
                       'offset':350
                        }

In [202]:
req=requests.get(url, params_transportation, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [203]:
data = req.json()
posts_transportation = data['businesses']
len(posts_transportation)

36

In [204]:
for i in range(len(posts_transportation)):
    names.append(posts_transportation[i]['name'])
    categories.append(posts_transportation[i]['categories'][0]['alias'])
    latitude.append(posts_transportation[i]['coordinates']['latitude'])
    longitude.append(posts_transportation[i]['coordinates']['longitude'])
    zipcode.append(posts_transportation[i]['location']['zip_code'])

In [205]:
len(names)

386

In [206]:
df_transportation = pd.DataFrame(columns=['Business','Category'])
df_transportation['Business'] = names
df_transportation['Category'] = categories
df_transportation['Lifeline'] = 6
df_transportation['Latitude'] = latitude
df_transportation['Longitude'] = longitude
df_transportation['Zipcode'] = zipcode
df_transportation.head()

Unnamed: 0,Business,Category,Lifeline,Latitude,Longitude,Zipcode
0,JetBlue,airlines,6,42.366658,-71.01616,2128
1,Amtrak,publictransport,6,42.332908,-71.06081,2111
2,South Station Bus Terminal,busstations,6,42.351095,-71.056176,2111
3,Uber,taxis,6,42.334556,-71.066119,2108
4,Provincetown Fast Ferry,ferries,6,42.360188,-71.049113,2110


In [207]:
df_transportation.to_csv('../datasets/transportation_df.csv',index=False)

## (7) Hazardous waste

In [208]:
#rework params to not inclue restaurants like kung fu tea
params_hazardous_waste = {'categories':['biohazardcleanup',
                                   'hazardouswasteddisposal'],
                       'location':'Boston',
                       'limit': 50,
                       'radius':40000}

In [209]:
req=requests.get(url, params_hazardous_waste, headers=headers)
# proceed only if the status code is 200
print('The status code is {}'.format(req.status_code))

The status code is 200


In [210]:
data = req.json()
posts_hazardous_waste = data['businesses']
len(posts_hazardous_waste)

1

In [211]:
names = []
categories = []
latitude = []
longitude = []
zipcode = []
for i in range(len(posts_hazardous_waste)):
    names.append(posts_hazardous_waste[i]['name'])
    categories.append(posts_hazardous_waste[i]['categories'][0]['alias'])
    latitude.append(posts_hazardous_waste[i]['coordinates']['latitude'])
    longitude.append(posts_hazardous_waste[i]['coordinates']['longitude'])
    zipcode.append(posts_hazardous_waste[i]['location']['zip_code'])

In [212]:
len(names)

1

In [213]:
df_hazardous_waste = pd.DataFrame(columns=['Business','Category'])
df_hazardous_waste['Business'] = names
df_hazardous_waste['Category'] = categories
df_hazardous_waste['Lifeline'] = 7
df_hazardous_waste['Latitude'] = latitude
df_hazardous_waste['Longitude'] = longitude
df_hazardous_waste['Zipcode'] = zipcode
df_hazardous_waste.head()

Unnamed: 0,Business,Category,Lifeline,Latitude,Longitude,Zipcode
0,Flood Fire Pro,damagerestoration,7,42.058138,-71.392596,2038


In [214]:
df_hazardous_waste.to_csv('../datasets/waste_df.csv',index=False)

In [237]:
df_hazardous_waste[['Zipcode']]

Unnamed: 0,Zipcode
0,2038


### Combining Lifeline DataFrames

In [252]:
df = pd.concat([df_safety,
          df_food_shelter,
          df_health_medical,
          df_power_fuel,
          df_communication,
          df_transportation,
          df_hazardous_waste])

In [253]:
df.head()

Unnamed: 0,Business,Category,Lifeline,Latitude,Longitude,Zipcode
0,Boston Police Department,policedepartments,1,42.361798,-71.060297,2114
1,Harvard University Police Department,policedepartments,1,42.369935,-71.112013,2138
2,Boston Police Headquarters,policedepartments,1,42.334077,-71.090885,2120
3,Somerville Fire Department,firedepartments,1,42.390857,-71.091024,2145
4,Cambridge Police Department,policedepartments,1,42.36726,-71.086151,2141


In [254]:
df.shape

(1768, 6)

In [255]:
df.head()

Unnamed: 0,Business,Category,Lifeline,Latitude,Longitude,Zipcode
0,Boston Police Department,policedepartments,1,42.361798,-71.060297,2114
1,Harvard University Police Department,policedepartments,1,42.369935,-71.112013,2138
2,Boston Police Headquarters,policedepartments,1,42.334077,-71.090885,2120
3,Somerville Fire Department,firedepartments,1,42.390857,-71.091024,2145
4,Cambridge Police Department,policedepartments,1,42.36726,-71.086151,2141


In [257]:
df[['Zipcode']].head()

Unnamed: 0,Zipcode
0,2114
1,2138
2,2120
3,2145
4,2141


In [258]:
df.shape

(1768, 6)

In [259]:
df.isnull().sum()

Business     0
Category     0
Lifeline     0
Latitude     2
Longitude    2
Zipcode      0
dtype: int64

In [260]:
df = df.dropna()

In [261]:
df.shape

(1766, 6)

In [263]:
df.drop_duplicates()
df.shape

(1766, 6)

In [264]:
df.columns

Index(['Business', 'Category', 'Lifeline', 'Latitude', 'Longitude', 'Zipcode'], dtype='object')

In [265]:
df['Business'].value_counts()

MinuteClinic                            54
T-Mobile                                29
Massachusetts General Hospital          21
Partners Urgent Care                    20
Beth Israel Deaconess Medical Center    19
                                        ..
Logan Airport Express                    1
Boston Airport Taxi Cab Service          1
Milton Cab                               1
Flood Fire Pro                           1
Exxon Tiger Mart                         1
Name: Business, Length: 827, dtype: int64

In [266]:
df['Category'].value_counts()

medcenters                    459
hospitals                     267
servicestations               157
taxis                         146
publictransport                95
familydr                       58
walkinclinics                  54
metrostations                  47
radiostations                  44
urgent_care                    42
airlines                       42
printmedia                     40
mobilephones                   33
autorepair                     25
physicians                     21
dermatology                    19
airport_shuttles               19
physicaltherapy                18
cosmeticsurgeons               18
optometrists                   18
medicaltransportation          18
limos                          17
policedepartments              14
animalshelters                 12
communitycenters                9
televisionstations              8
telecommunications              5
convenience                     5
trainstations                   5
trains        

In [267]:
df['Lifeline'].value_counts()

3    1000
6     384
4     200
5     135
2      28
1      18
7       1
Name: Lifeline, dtype: int64

In [268]:
df.shape

(1766, 6)

In [277]:
df['Zipcode'].notnull().value_counts()

True    1766
Name: Zipcode, dtype: int64

In [278]:
df.to_csv('../datasets/lifelines_raw_df.csv',index= False)