<h1>A Survey of Hospital Density in Wisconsin</h1>

<h2>Problem Statement</h2>

During initial testing, healthcare manufacturers often run trials of their equipment at nearby hospitals. This allows the manufacturer to quickly address any issues that arise.

However, it is equally important to obtain feedback from users in the field at this stage. To accomplish this, equipment is tested at several local hospitals at a time to maximize feedback.

The goal of this project is to determine which hospitals a Wisconsin manufacturer should test at during this phase, weighing both the density of hospitals in a region and their distance from the manufacturer.

<h2>The Data</h2>

In this notebook, we use Foursquare to acquire the locations of Wisconsin hospitals. These locations are used to create a heat map of hospitals within the state. The resulting density is then weighed against the distance from the manufacturer to determine which area would be the most productive to test in.

<h2>Code</h2>

<h3>Install Packages if Needed</h3>

In [1]:
!conda install -c conda-forge geopy --yes 
!conda install -c conda-forge folium=0.5.0 --yes

<h3>Import Libraries</h3>

In [12]:
import requests
import pandas as pd
import numpy as np
import random
import folium

from IPython.display import Image 
from IPython.core.display import HTML 
from pandas.io.json import json_normalize
from geopy.geocoders import Nominatim

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

<h3>Data Acquisition</h3>

<b>Manufacturer Information</b>

In [3]:
address = 'Milwaukee, WI'

#get latitude and longitude of the manufacturer address
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

print(address, "is located at", latitude, longitude)

Milwaukee, WI is located at 43.0349931 -87.922497


<b>Build the query</b>

In [4]:
# Set the query info
CLIENT_ID = '0LEQXT1Y141HZDYJP5PIUFCCRYTAMQZRABPJOAQ0X5ENU35K' # Foursquare ID
CLIENT_SECRET = 'ADIQGNWNC5JYDXE5I4RF0SCN5VDKYHRRYMOMDK4PQ3DGHUNM' #Foursquare Secret
VERSION = '20180604'
QUERY = 'Hospital'
RADIUS = 10000
LIMIT = 50

In [5]:
# Set the base foursquare api info
foursquare_url = 'https://api.foursquare.com/v2/venues/'
client_info = 'client_id=' + CLIENT_ID + '&client_secret=' + CLIENT_SECRET

In [6]:
# Build the url used to search for hospitals near the manufacturer
url = foursquare_url + 'search?' + client_info
url += '&ll=' + str(latitude) +',' + str(longitude)
url += '&v=' + VERSION
url += '&query=' + QUERY
url += '&radius=' + str(RADIUS)
url += '&limit=' + str(LIMIT)
print("URL generated:\n", url)

URL generated:
 https://api.foursquare.com/v2/venues/search?client_id=0LEQXT1Y141HZDYJP5PIUFCCRYTAMQZRABPJOAQ0X5ENU35K&client_secret=ADIQGNWNC5JYDXE5I4RF0SCN5VDKYHRRYMOMDK4PQ3DGHUNM&ll=43.0349931,-87.922497&v=20180604&query=Hospital&radius=10000&limit=50
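
The string concatenation above works for these particular values, but a query containing spaces or reserved characters would need escaping. A minimal sketch using the standard library's `urllib.parse.urlencode` follows. The function name `build_search_url`, the placeholder credentials, and the environment variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are assumptions made for illustration and are not part of the original notebook.

```python
import os
from urllib.parse import urlencode

FOURSQUARE_URL = 'https://api.foursquare.com/v2/venues/'


def build_search_url(lat, lng, query, radius, limit, version,
                     client_id, client_secret):
    """Build a venue search URL with escaped query parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': f'{lat},{lng}',
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    # safe=',' keeps the comma in the "ll" pair instead of encoding it as %2C
    return FOURSQUARE_URL + 'search?' + urlencode(params, safe=',')


# Hypothetical environment variable names; placeholders are used when unset
client_id = os.environ.get('FOURSQUARE_CLIENT_ID', 'YOUR_CLIENT_ID')
client_secret = os.environ.get('FOURSQUARE_CLIENT_SECRET', 'YOUR_CLIENT_SECRET')

example_url = build_search_url(43.0349931, -87.922497, 'Hospital', 10000, 50,
                               '20180604', client_id, client_secret)
print(example_url)
```

Reading the credentials from the environment keeps them out of the notebook source and out of any printed URL that gets shared.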


<b>Run the Query</b>

In [7]:
results = requests.get(url).json()
if results['meta']['code'] == 200 or results['meta']['code'] == 201:
    df = json_normalize(results['response']['venues'])
    print(len(df), "hospitals found.")
else:
    print("Unable to connect!") 

50 hospitals found.
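
In the cell above, a failed request only prints a message, so `df` is never defined and later cells fail with an unrelated error. One possible alternative, sketched here with plain dictionaries, is to raise as soon as the response reports a failure and to skip venues that lack coordinates. The helper name `extract_hospital_rows` and the sample payload are made up for illustration; the payload only mirrors the keys the notebook already reads (`meta.code`, `response.venues`, `name`, `location.lat`, `location.lng`).

```python
def extract_hospital_rows(results):
    """Return a list of {'label', 'lat', 'lng'} dicts from a search response.

    Raises RuntimeError when the response does not report success, so later
    steps never run against a missing result set.
    """
    code = results.get('meta', {}).get('code')
    if code not in (200, 201):
        raise RuntimeError(f'Foursquare request failed with code {code}')

    rows = []
    for venue in results.get('response', {}).get('venues', []):
        location = venue.get('location', {})
        # Skip venues without coordinates rather than failing later
        if 'lat' not in location or 'lng' not in location:
            continue
        rows.append({'label': venue.get('name'),
                     'lat': location['lat'],
                     'lng': location['lng']})
    return rows


# Made-up payload in the shape the query cell expects
sample_results = {
    'meta': {'code': 200},
    'response': {'venues': [
        {'name': 'Example Hospital A', 'location': {'lat': 43.04, 'lng': -87.92}},
        {'name': 'Example Hospital B', 'location': {}},  # no coordinates
    ]},
}

rows = extract_hospital_rows(sample_results)
print(len(rows), 'hospitals with coordinates.')
```

The resulting list of dictionaries can be passed directly to `pd.DataFrame(rows)` to obtain the same `label`, `lat`, and `lng` columns used in the following cells.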


In [8]:
df = df.rename(columns={'name': 'label', 'location.lat': 'lat', 'location.lng': 'lng'})
hospital_locations = df[['lat', 'lng']]

<h3>Data Analysis</h3>

In [40]:
# Fit the hospital coordinates into k groups
k = 5
kmeans = KMeans(n_clusters=k)
kmeans.fit(hospital_locations)

KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
    n_clusters=5, n_init=10, n_jobs=None, precompute_distances='auto',
    random_state=None, tol=0.0001, verbose=0)
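
To make the clustering step less of a black box, the following is a minimal pure-Python sketch of the basic k-means loop (Lloyd's algorithm): assign each point to its nearest centroid, move each centroid to the mean of its members, and repeat until assignments stop changing. It is a simplified illustration, not scikit-learn's implementation: it uses plain random initialization and a single run, whereas the output above shows `init='k-means++'` and `n_init=10`. The function name `lloyd_kmeans` and the sample points are made up.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, k, max_iter=100, seed=0):
    """Cluster points into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every point
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
    return labels, centroids


# Made-up points forming two well-separated groups
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sample_labels, sample_centroids = lloyd_kmeans(sample_points, k=2)
print(sample_labels)
print(sample_centroids)
```

Because `random_state=None` in the fitted model above, the cluster numbering may differ between runs of the notebook; passing a fixed `random_state` to `KMeans` is the usual way to make the grouping repeatable.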

In [62]:
# Get group and centroid info
groups = kmeans.labels_
centroids = pd.DataFrame(kmeans.cluster_centers_)

In [67]:
# Label the centroid columns and assign each cluster a display color
colors = {0: 'blue', 1: 'green', 2: 'purple', 3: 'orange', 4: 'black'}
i_to_lat_lng = {0: 'lat', 1: 'lng'}

centroids = centroids.rename(columns=i_to_lat_lng)
centroids['color'] = list(colors.values())

         lat        lng   color
0  42.975127 -87.876397    blue
1  43.040342 -88.022384   green
2  43.054947 -87.919557  purple
3  42.994242 -87.939706  orange
4  43.040015 -87.978226   black


In [68]:
# Add group assignment to the dataframe
grouped_locations = hospital_locations.copy()
grouped_locations['cluster'] = groups

In [69]:
# Convert group number to color
grouped_locations['cluster'] = grouped_locations['cluster'].apply(lambda x: colors[x])

In [70]:
# Determine the size of each group
unique, counts = np.unique(grouped_locations['cluster'], return_counts=True)

for color, count in zip(unique, counts):
    print("Size of", color, "Group:", count)

Size of black Group: 10
Size of blue Group: 1
Size of green Group: 17
Size of orange Group: 15
Size of purple Group: 7
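
The problem statement calls for weighing cluster density against distance from the manufacturer, a step the cells above stop short of. The sketch below is one possible way to do it. It copies the centroid coordinates and group sizes from the outputs above, computes the great-circle (haversine) distance from the manufacturer to each centroid, and ranks the clusters. The scoring rule `size / (1 + distance_km)` is a hypothetical weighting chosen only for illustration; the appropriate trade-off depends on the manufacturer's priorities.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two lat/lng points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def score(size, distance_km):
    """Hypothetical weighting: more hospitals is better, farther is worse."""
    return size / (1.0 + distance_km)


# Values copied from the notebook outputs above
manufacturer = (43.0349931, -87.922497)
clusters = {
    'blue':   {'centroid': (42.975127, -87.876397), 'size': 1},
    'green':  {'centroid': (43.040342, -88.022384), 'size': 17},
    'purple': {'centroid': (43.054947, -87.919557), 'size': 7},
    'orange': {'centroid': (42.994242, -87.939706), 'size': 15},
    'black':  {'centroid': (43.040015, -87.978226), 'size': 10},
}

ranking = sorted(
    ((color, info['size'], haversine_km(*manufacturer, *info['centroid']))
     for color, info in clusters.items()),
    key=lambda row: score(row[1], row[2]),
    reverse=True,
)

for color, size, dist in ranking:
    print(f'{color:>6}: {size:2d} hospitals, {dist:5.2f} km, '
          f'score {score(size, dist):.2f}')
```

Under this particular weighting, the orange group ranks first and the single-hospital blue group ranks last. A different weighting, such as one that penalizes distance more steeply, could reorder the groups in between.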


<h2>Visual Generation</h2>

<h3>Heat Map of Area Hospitals</h3>

In [43]:
#initialize map
area_map = folium.Map(location=[latitude, longitude], zoom_start=11)

# add a red circle marker to represent the manufacturer
folium.features.CircleMarker([latitude, longitude], radius=8, color='red', fill=True, 
                             fill_color='red', fill_opacity=0.6).add_to(area_map)

<folium.features.CircleMarker at 0x7fa4e651f828>

In [44]:
# add blue circle markers to represent nearby hospitals
for i in range(0, len(hospital_locations)):
    folium.features.CircleMarker([hospital_locations.lat[i], hospital_locations.lng[i]], radius=2,
                                 color='blue', fill=True, fill_color='blue', 
                                 fill_opacity=0.6).add_to(area_map)

In [45]:
#display heat map
area_map
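
The map above draws one marker per hospital, so density has to be judged by eye. As a rough numeric complement, the sketch below bins coordinates into a regular latitude/longitude grid and counts the hospitals in each cell. The function name `grid_density`, the cell size of 0.02 degrees (roughly 2 km at this latitude), and the sample points are assumptions for illustration only.

```python
import math
from collections import Counter


def grid_density(points, cell_deg=0.02):
    """Count points per grid cell; keys are (lat_index, lng_index) tuples."""
    counts = Counter()
    for lat, lng in points:
        cell = (math.floor(lat / cell_deg), math.floor(lng / cell_deg))
        counts[cell] += 1
    return counts


# Made-up coordinates near Milwaukee, not actual hospital locations
sample_points = [
    (43.031, -87.921),
    (43.035, -87.925),
    (43.039, -87.929),
    (43.071, -87.981),
]

density = grid_density(sample_points)
densest_cell, densest_count = density.most_common(1)[0]
print('Densest cell:', densest_cell, 'with', densest_count, 'points')
```

With the notebook's data, the same function could be called on `zip(hospital_locations.lat, hospital_locations.lng)`.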

<h3>Clusters</h3>

In [90]:
#initialize map
cluster_map = folium.Map(location=[latitude, longitude], zoom_start=11)

# add a red circle marker to represent the manufacturer
folium.features.CircleMarker([latitude, longitude], radius=8, color='red', 
                             fill=True, fill_color='red', fill_opacity=0.6).add_to(cluster_map)

<folium.features.CircleMarker at 0x7fa4e6164630>

In [91]:
#add centroids by color
#centroids will be darker and have a black outline
for i in range(0, len(centroids)):
    folium.features.CircleMarker([centroids.lat[i], centroids.lng[i]], radius=4,
                                 color='black', fill=True, 
                                 fill_color=centroids.color[i], 
                                 fill_opacity=1).add_to(cluster_map)

In [93]:
#add locations, colored according to their centroid
for i in range(0, len(grouped_locations)):
    folium.features.CircleMarker([grouped_locations.lat[i], grouped_locations.lng[i]], radius=4,
                                 color=grouped_locations.cluster[i], fill=True, 
                                 fill_color= grouped_locations.cluster[i], 
                                 fill_opacity=0.6, stroke=False).add_to(cluster_map)

In [94]:
cluster_map