# IBM Data Science Capstone Project

## This notebook serves as the data analysis and data visualization portion of the final project.
#### The analysis will focus upon urgent care centers in two different Pacific Northwest Cities in North America - Seattle Washington and Vancouver British Columbia. This notebook will include markup language that will help describe process and rationale for the appropriate code in a step-by-step format.  

#### 1) To start, I import appropriate libraries including pandas, numpy, matplotlib, json, geopy, and folium amongst others to allow for certain functionality. 

In [213]:
# main libraries for use:
import pandas as pd
import numpy as np
import json
import matplotlib
!pip install lxml


# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

import requests # library to handle requests
from pandas.io.json import json_normalize 

#geopy library and folium library:
!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim

!conda install -c conda-forge folium=0.5.0 --yes
import folium 

# import k-means from clustering stage
from sklearn.cluster import KMeans

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 

print('Libraries imported.')


Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Libraries imported.


In [214]:
!pip install beautifulsoup4



### Data relating to neighborhoods in the two cities was found via Wikipedia.

In [215]:
initial_frame_Sea = pd.read_html("https://www.zipcodestogo.com/Washington/")

In [216]:
df = pd.concat(initial_frame_Sea)

In [217]:
df.shape

(744, 4)

In [218]:
df.tail(50)

Unnamed: 0,0,1,2,3
693,99260,Spokane,Spokane,View Map
694,99299,Spokane,Spokane,View Map
695,99301,Pasco,Franklin,View Map
696,99302,Pasco,Franklin,View Map
697,99320,Benton City,Benton,View Map
698,99321,Beverly,Grant,View Map
699,99322,Bickleton,Klickitat,View Map
700,99323,Burbank,Walla Walla,View Map
701,99324,College Place,Walla Walla,View Map
702,99326,Connell,Franklin,View Map


In [219]:
df=df.dropna()

In [220]:
df.head()

Unnamed: 0,0,1,2,3
0,Zip Codes for the State of Washington,Zip Codes for the State of Washington,Zip Codes for the State of Washington,Zip Codes for the State of Washington
1,Zip Code,City,County,Zip Code Map
2,98001,Auburn,King,View Map
3,98002,Auburn,King,View Map
4,98003,Federal Way,King,View Map


#### Remove rows 1 and 2 via indexing and then re-index. Also, delete column 3 and then rename the columns.

In [221]:
df.drop([0, 1], inplace = True)

In [222]:
df = df.reset_index(drop=True)

In [223]:
df.head()

Unnamed: 0,0,1,2,3
0,98001,Auburn,King,View Map
1,98002,Auburn,King,View Map
2,98003,Federal Way,King,View Map
3,98004,Bellevue,King,View Map
4,98005,Bellevue,King,View Map


In [224]:
df1 = df

In [225]:
df1 = df1.drop(3, 1)

In [226]:
df1.head()

Unnamed: 0,0,1,2
0,98001,Auburn,King
1,98002,Auburn,King
2,98003,Federal Way,King
3,98004,Bellevue,King
4,98005,Bellevue,King


### Define Foursquare Client Connection:

##### In this case we will assign a LIMIT of 30

In [259]:
CLIENT_ID = 'M4CDQNDGONDDCYDKOZGUFEGINFNFAQDGGO5XRVFKLLYDSZUL' # your Foursquare ID
CLIENT_SECRET = '3DTEVCMIHSLQ1TFHNGH44VBHMSPP2V4N420WV5W1RJVCFJNY' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 10
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: M4CDQNDGONDDCYDKOZGUFEGINFNFAQDGGO5XRVFKLLYDSZUL
CLIENT_SECRET:3DTEVCMIHSLQ1TFHNGH44VBHMSPP2V4N420WV5W1RJVCFJNY


### For purposes of this project, we will start by considering the primary city center of Seattle, Washington.

In [260]:
address = 'Seattle, WA'

geolocator = Nominatim(user_agent="seattle_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(" The geographical coordinates of Seattle, WA are {}, {}.".format(latitude, longitude))

 The geographical coordinates of Seattle, WA are 47.6038321, -122.3300624.


### Now we will search the Foursquare API for "healthcare" venues within 1500 meters (approximately 1 mi) of this location.

#### 1) Start by defining the search query. 

In [261]:
search_query = 'medical center'
radius = 750
print(search_query + ' .... OK!')

medical center .... OK!


#### 2) Next, create the Fourssquare URL instance which will correspond to the search query just stated. 

In [262]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=M4CDQNDGONDDCYDKOZGUFEGINFNFAQDGGO5XRVFKLLYDSZUL&client_secret=3DTEVCMIHSLQ1TFHNGH44VBHMSPP2V4N420WV5W1RJVCFJNY&ll=47.6038321,-122.3300624&v=20180604&query=medical center&radius=750&limit=10'

#### 3) Utilize the GET request function (from the imported requests library) and assign to a results object.

In [263]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5eed6aa8618f43001b731448'},
 'response': {'venues': [{'id': '43598100f964a520f4281fe3',
    'name': 'Harborview Medical Center',
    'location': {'address': '325 9th Ave',
     'lat': 47.604166524573486,
     'lng': -122.32413768768309,
     'labeledLatLngs': [{'label': 'display',
       'lat': 47.604166524573486,
       'lng': -122.32413768768309}],
     'distance': 446,
     'postalCode': '98104',
     'cc': 'US',
     'city': 'Seattle',
     'state': 'WA',
     'country': 'United States',
     'formattedAddress': ['325 9th Ave',
      'Seattle, WA 98104',
      'United States']},
    'categories': [{'id': '4bf58dd8d48988d196941735',
      'name': 'Hospital',
      'pluralName': 'Hospitals',
      'shortName': 'Hospital',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/building/medical_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1592617611',
    'hasPerk': False},
   {'id': '467018ebf964a520c2471fe3'

#### From this set, we will want to obtain the "venues" and to make it readable, format into a dataframe with use of JSON.

In [264]:
venues = (results["response"]["venues"])

df_1 = pd.json_normalize(venues)
df_1.head(30)

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,location.neighborhood,location.crossStreet
0,43598100f964a520f4281fe3,Harborview Medical Center,"[{'id': '4bf58dd8d48988d196941735', 'name': 'H...",v-1592617611,False,325 9th Ave,47.604167,-122.324138,"[{'label': 'display', 'lat': 47.60416652457348...",446,98104.0,US,Seattle,WA,United States,"[325 9th Ave, Seattle, WA 98104, United States]",,
1,467018ebf964a520c2471fe3,Swedish Medical Center - First Hill Campus,"[{'id': '4bf58dd8d48988d196941735', 'name': 'H...",v-1592617611,False,747 Broadway,47.608403,-122.32189,"[{'label': 'display', 'lat': 47.60840275122101...",796,98122.0,US,Seattle,WA,United States,"[747 Broadway, Seattle, WA 98122, United States]",,
2,43680180f964a52078291fe3,Virginia Mason Hospital and Seattle Medical Ce...,"[{'id': '4bf58dd8d48988d196941735', 'name': 'H...",v-1592617611,False,1100 9th Ave,47.610128,-122.327232,"[{'label': 'display', 'lat': 47.61012849753076...",732,98101.0,US,Seattle,WA,United States,"[1100 9th Ave, Seattle, WA 98101, United States]",First Hill,
3,4e018329ae609fa8ede2d102,Harborview Medical Center - Maleng Building,"[{'id': '4bf58dd8d48988d196941735', 'name': 'H...",v-1592617611,False,410 9th Ave N,47.604522,-122.323825,"[{'label': 'display', 'lat': 47.60452215587606...",474,98109.0,US,Seattle,WA,United States,"[410 9th Ave N, Seattle, WA 98109, United States]",,
4,4bce3b9bc564ef3b87f5edf0,Pacific Medical Center - First Hill,"[{'id': '4bf58dd8d48988d104941735', 'name': 'M...",v-1592617611,False,1101 Madison St,47.60923,-122.323692,"[{'label': 'display', 'lat': 47.60923046303072...",767,98104.0,US,Seattle,WA,United States,"[1101 Madison St, Seattle, WA 98104, United St...",,
5,4ba3b584f964a520fd5638e3,Medical Center,"[{'id': '4bf58dd8d48988d124941735', 'name': 'O...",v-1592617611,False,"1215 4th Ave, Ste 710",47.607919,-122.335035,"[{'label': 'display', 'lat': 47.607919, 'lng':...",588,,US,Seattle,WA,United States,"[1215 4th Ave, Ste 710 (at U of W), Seattle, W...",,at U of W
6,4ad3a9e7f964a52074e520e3,Kaiser Permanente Downtown Medical Center,"[{'id': '4bf58dd8d48988d104941735', 'name': 'M...",v-1592617611,False,1420 5th Ave Ste 375,47.610434,-122.333809,"[{'label': 'display', 'lat': 47.61043418159325...",786,98101.0,US,Seattle,WA,United States,"[1420 5th Ave Ste 375 (Union Street), Seattle,...",Seattle Central Business District,Union Street
7,56e30123498ea3be46c29888,Harborview Medical Center Operating Room,"[{'id': '4bf58dd8d48988d196941735', 'name': 'H...",v-1592617611,False,,47.604532,-122.323897,"[{'label': 'display', 'lat': 47.60453201302538...",469,,US,Seattle,WA,United States,"[Seattle, WA, United States]",,
8,4fc7d795e4b0548822056bdc,Skybridge at Swedish Medical Center,"[{'id': '4bf58dd8d48988d104941735', 'name': 'M...",v-1592617611,False,1101 Madison Tower,47.609853,-122.323183,"[{'label': 'display', 'lat': 47.60985273569063...",846,98104.0,US,Seattle,WA,United States,"[1101 Madison Tower, Seattle, WA 98104, United...",,
9,4f5508dde4b065b5dbfa2f9e,Gift Shop at Swedish Medical Center,"[{'id': '4bf58dd8d48988d128951735', 'name': 'G...",v-1592617611,False,747 Broadway,47.608854,-122.321413,"[{'label': 'display', 'lat': 47.60885388408757...",856,98122.0,US,Seattle,WA,United States,"[747 Broadway (Swedish First Hill Campus), Sea...",,Swedish First Hill Campus


#### Notice that there is substantial additional information here that we will want to omit as it doesn't affect the analysis.  In this case, we will keep 

In [265]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in df_Sea_4Sq.columns if col.startswith('location.')] + ['id']
dataframe_filtered = df_Sea_4Sq.loc[:, filtered_columns]

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,neighborhood,crossStreet,id
0,Swedish Medical Center - First Hill Campus,Hospital,747 Broadway,47.608403,-122.32189,"[{'label': 'display', 'lat': 47.60840275122101...",796,98122.0,US,Seattle,WA,United States,"[747 Broadway, Seattle, WA 98122, United States]",,,467018ebf964a520c2471fe3
1,Harborview Medical Center,Hospital,325 9th Ave,47.604167,-122.324138,"[{'label': 'display', 'lat': 47.60416652457348...",446,98104.0,US,Seattle,WA,United States,"[325 9th Ave, Seattle, WA 98104, United States]",,,43598100f964a520f4281fe3
2,Virginia Mason Hospital and Seattle Medical Ce...,Hospital,1100 9th Ave,47.610128,-122.327232,"[{'label': 'display', 'lat': 47.61012849753076...",732,98101.0,US,Seattle,WA,United States,"[1100 9th Ave, Seattle, WA 98101, United States]",First Hill,,43680180f964a52078291fe3
3,Harborview Medical Center - Maleng Building,Hospital,410 9th Ave N,47.604522,-122.323825,"[{'label': 'display', 'lat': 47.60452215587606...",474,98109.0,US,Seattle,WA,United States,"[410 9th Ave N, Seattle, WA 98109, United States]",,,4e018329ae609fa8ede2d102
4,Pacific Medical Center - First Hill,Medical Center,1101 Madison St,47.60923,-122.323692,"[{'label': 'display', 'lat': 47.60923046303072...",767,98104.0,US,Seattle,WA,United States,"[1101 Madison St, Seattle, WA 98104, United St...",,,4bce3b9bc564ef3b87f5edf0
5,Medical Dental Building,Medical Center,509 Olive Way,47.612942,-122.337302,"[{'label': 'display', 'lat': 47.61294173595843...",1150,98101.0,US,Seattle,WA,United States,"[509 Olive Way (at 5th Ave), Seattle, WA 98101...",,at 5th Ave,4a8442d2f964a5200bfc1fe3
6,Medical Center,Office,"1215 4th Ave, Ste 710",47.607919,-122.335035,"[{'label': 'display', 'lat': 47.607919, 'lng':...",588,,US,Seattle,WA,United States,"[1215 4th Ave, Ste 710 (at U of W), Seattle, W...",,at U of W,4ba3b584f964a520fd5638e3
7,E J Nordstrom Medical Tower,Medical Center,1229 Madison St,47.609798,-122.322231,"[{'label': 'display', 'lat': 47.60979810520199...",886,98104.0,US,Seattle,WA,United States,"[1229 Madison St (at Summit Ave), Seattle, WA ...",,at Summit Ave,4b688bfef964a520287f2be3
8,Kaiser Permanente Downtown Medical Center,Medical Center,1420 5th Ave Ste 375,47.610434,-122.333809,"[{'label': 'display', 'lat': 47.61043418159325...",786,98101.0,US,Seattle,WA,United States,"[1420 5th Ave Ste 375 (Union Street), Seattle,...",Seattle Central Business District,Union Street,4ad3a9e7f964a52074e520e3
9,King County Medical Examiner,Medical Center,908 Jefferson St,47.605137,-122.323722,"[{'label': 'display', 'lat': 47.6051372798497,...",497,98104.0,US,Seattle,WA,United States,"[908 Jefferson St (Terry Ave.), Seattle, WA 98...",,Terry Ave.,4b3b8716f964a520367525e3


In [266]:
dataframe_filtered.name

0            Swedish Medical Center - First Hill Campus
1                             Harborview Medical Center
2     Virginia Mason Hospital and Seattle Medical Ce...
3           Harborview Medical Center - Maleng Building
4                   Pacific Medical Center - First Hill
5                               Medical Dental Building
6                                        Medical Center
7                           E J Nordstrom Medical Tower
8             Kaiser Permanente Downtown Medical Center
9                          King County Medical Examiner
10                   Swedish Medical Group / Providence
11          Swedish Medical Center - Cherry Hill Campus
12                  Gift Shop at Swedish Medical Center
13                  Skybridge at Swedish Medical Center
14                Minor & James Medical - OB/Gyn & Endo
15           Swedish Medical Cherry Hill Parking Garage
16                           Pike Market Medical Clinic
17                             Performance Home 

In [267]:
dataframe_filtered.categories.unique()

array(['Hospital', 'Medical Center', 'Office', 'Gift Shop',
       "Doctor's Office", 'Parking', 'Tourist Information Center',
       'Cafeteria', 'Medical Lab', 'General Entertainment',
       'Hospital Ward', 'Café', 'Business Center'], dtype=object)

In [268]:
dataframe_filtered.keys()

Index(['name', 'categories', 'address', 'lat', 'lng', 'labeledLatLngs',
       'distance', 'postalCode', 'cc', 'city', 'state', 'country',
       'formattedAddress', 'neighborhood', 'crossStreet', 'id'],
      dtype='object')

In [280]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13) # generate map centred around Seattle City Center

# add a red circle marker to represent center of Seattle
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Seattle City Center',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

venues_map

In [275]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=15) # generate map centred around Ecco


# add Ecco as a red circle mark
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    popup='Ecco',
    fill=True,
    color='red',
    fill_color='red',
    fill_opacity=0.6
    ).add_to(venues_map)


# add popular spots to the map as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        fill=True,
        color='blue',
        fill_color='blue',
        fill_opacity=0.6
        ).add_to(venues_map)

# display map
venues_map