# The Battle of Neighborhoods
## Introduction
Access to health care is an important part of modern life. The goal of this project is to create a platform for exploring medical centers in the city of Melbourne.
We are trying to answer the following questions:
1. What facilities are available in a given neighbourhood?
2. Where is the closest place that provides a specific service (such as an eye doctor)?

## Data
We use the Foursquare database to get the geographical location of each facility.

Suburb coordinates come from the Australian Postcode Location Data, available at:
http://www.corra.com.au/australian-postcode-location-data/

Hospital details come from the Department of Health and Human Services hospital database, available at:
https://discover.data.vic.gov.au/dataset/hospital-locations-spatial

## Methodology
### Exploratory Analysis

In [2]:
!pip install folium

Collecting folium
  Downloading folium-0.11.0-py2.py3-none-any.whl (93 kB)
Collecting branca>=0.3.0
  Downloading branca-0.4.1-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.1 folium-0.11.0


In [3]:
import numpy as np
import pandas as pd
import folium

# json_normalize is exposed at the top level in pandas 1.0 and later
from pandas import json_normalize

In [5]:
#Read location data
loc_df = pd.read_csv('Australian_Post_Codes_Lat_Lon.csv')
#Keep only Victoria
loc_df = loc_df[loc_df['state'] == 'VIC'].reset_index(drop = True)
#Remove unnecessary columns
loc_df = loc_df.drop(['type','dc','state','postcode'],axis=1)
loc_df = loc_df.groupby(['suburb']).first().reset_index()
loc_df.head()

|  | suburb | lat | lon |
|---|---|---|---|
| 0 | ABBEYARD | -36.976415 | 146.782515 |
| 1 | ABBOTSFORD | -37.801781 | 144.998752 |
| 2 | ABECKETT STREET | -37.809696 | 144.959314 |
| 3 | ABERFELDIE | -37.75669 | 144.896259 |
| 4 | ABERFELDY | -37.696566 | 146.364064 |


In [6]:
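#Setting up Foursquare
import os

# Read the credentials from the environment rather than hard-coding them in the notebook
# (the variable names FOURSQUARE_CLIENT_ID / FOURSQUARE_CLIENT_SECRET are an assumption)
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30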



#### Create a map of Melbourne (with markers for suburbs that have Melbourne in their name)

In [7]:
import requests
# create map of Melbourne using latitude and longitude values
map = folium.Map(location=[-37.814563, 144.970267], zoom_start=10)

df = loc_df[loc_df['suburb'].str.contains('MELBOURNE')].reset_index(drop = True)
# add markers to map
for lat, lng, sub in zip(df['lat'], df['lon'], df['suburb']):
    label = sub
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map)  
    
map

In [8]:
### Visualize Hospitals in Melbourne

In [9]:
search_query = 'Hospital'
suburb = 'MELBOURNE'
latitude = loc_df[loc_df['suburb'] == suburb]['lat'].values[0]
longitude = loc_df[loc_df['suburb'] == suburb]['lon'].values[0]
radius = 500


url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
results = requests.get(url).json()
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)[['name','location.lat','location.lng','categories']]
dataframe.head()




|  | name | location.lat | location.lng | categories |
|---|---|---|---|---|
| 0 | Dr.J.Delgado Memorial Hospital | -37.815608 | 144.972043 | [{'id': '4bf58dd8d48988d196941735', 'name': 'H... |
| 1 | Asian Medical Hospital | -37.815609 | 144.972044 | [{'id': '4bf58dd8d48988d196941735', 'name': 'H... |
| 2 | Central Hospital | -37.812964 | 144.968565 | [{'id': '4bf58dd8d48988d196941735', 'name': 'H... |
| 3 | Hospitality Training Victoria | -37.815527 | 144.965723 | [{'id': '4bf58dd8d48988d1a2941735', 'name': 'C... |
| 4 | Hospitality Training Australia | -37.815673 | 144.965747 | [{'id': '4bf58dd8d48988d124941735', 'name': 'O... |
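
The search URL above is assembled by string formatting, and the same pattern recurs for every query in this notebook. As a minimal sketch (not the notebook's own code), the query string could instead be built with the standard library's `urlencode`, which also escapes spaces and apostrophes in queries such as `Doctor's Office`. The helper name `build_search_url` and the placeholder credentials are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Endpoint used throughout this notebook
SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'

def build_search_url(client_id, client_secret, lat, lng, query,
                     radius=500, limit=30, version='20180604'):
    """Return a venue-search URL with every parameter percent-encoded (hypothetical helper)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    return SEARCH_ENDPOINT + '?' + urlencode(params)

# Placeholder credentials, for illustration only
url = build_search_url('MY_ID', 'MY_SECRET', -37.814563, 144.970267, "Doctor's Office")
print(url)
```

Keeping the parameters in a dictionary means a new search only changes the `query`, `radius`, or coordinates, rather than repeating the full format string.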


In [10]:
#Cleaning the categories column

def clean_category(x):
    if len(x) == 0:
        x_cleaned = -1
    else:
        x_cleaned = x[0]['name']
        
    return x_cleaned
dataframe['categories'] = dataframe['categories'].apply(clean_category)
dataframe.head()

|  | name | location.lat | location.lng | categories |
|---|---|---|---|---|
| 0 | Dr.J.Delgado Memorial Hospital | -37.815608 | 144.972043 | Hospital |
| 1 | Asian Medical Hospital | -37.815609 | 144.972044 | Hospital |
| 2 | Central Hospital | -37.812964 | 144.968565 | Hospital |
| 3 | Hospitality Training Victoria | -37.815527 | 144.965723 | Community College |
| 4 | Hospitality Training Australia | -37.815673 | 144.965747 | Office |


In [11]:
#Keep only hospitals
dataframe = dataframe[dataframe['categories'] == 'Hospital']

#Visualize
map = folium.Map(location=[latitude, longitude], zoom_start=15)

# add markers to map
for lat, lng, name in zip(dataframe['location.lat'], dataframe['location.lng'], dataframe['name']):
    label = name
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map)  
    
map

### Add pharmacies to the map in red

In [12]:
search_query = 'Pharmacy'
suburb = 'MELBOURNE'
latitude = loc_df[loc_df['suburb'] == suburb]['lat'].values[0]
longitude = loc_df[loc_df['suburb'] == suburb]['lon'].values[0]
radius = 500


url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
results = requests.get(url).json()
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)[['name','location.lat','location.lng','categories']]
dataframe['categories'] = dataframe['categories'].apply(clean_category)

# add markers to map
for lat, lng, name in zip(dataframe['location.lat'], dataframe['location.lng'], dataframe['name']):
    label = name
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='red',
        fill_opacity=0.7,
        parse_html=False).add_to(map)  
    
map



## Visualize medical centers in a suburb

In [13]:
def filter_results(dataframe,queries):
    # Keep only venues whose category is one of the requested categories
    # (DataFrame.append, used previously, was removed in pandas 2.0)
    df_filtered = dataframe[dataframe['categories'].isin(queries)]
    
    # Drop duplicate venues that were returned by more than one query
    df_filtered = df_filtered.groupby(['name']).first().reset_index()
    return df_filtered

search_queries = ['Acupuncturist','Alternative Healer','Chiropractor',"Dentist's Office","Doctor's Office",'Eye Doctor',
                 'Hospital','Maternity Clinic','Medical Lab','Mental Health Office','Nutritionist','Physical Therapist',
                 'Rehab Center','Veterinarian','Medical Center']
suburb = 'DONCASTER'
latitude = loc_df[loc_df['suburb'] == suburb]['lat'].values[0]
longitude = loc_df[loc_df['suburb'] == suburb]['lon'].values[0]
radius = 1000
frames = []
for search_query in search_queries:
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
    results = requests.get(url).json()
    # assign relevant part of JSON to venues
    venues = results['response']['venues']
    
    # skip queries that returned no venues
    if not venues:
        continue
    
    # transform venues into a dataframe
    dataframe = json_normalize(venues)[['name','location.lat','location.lng','categories']]
    dataframe['categories'] = dataframe['categories'].apply(clean_category)
    frames.append(dataframe)

# combine the per-query results (DataFrame.append was removed in pandas 2.0)
df = pd.concat(frames, ignore_index=True)

#Remove irrelevant results
df = filter_results(df,search_queries)
df



|  | name | location.lat | location.lng | categories |
|---|---|---|---|---|
| 0 | Eye surgery Associates | -37.773645 | 145.116016 | Doctor's Office |
| 1 | Myhealth Medical Centre | -37.785149 | 145.125294 | Doctor's Office |
| 2 | St George Specialist Clinic For Women | -37.775124 | 145.124096 | Medical Center |
| 3 | Vic Medical Doctors | -37.786453 | 145.1254 | Doctor's Office |


In [14]:
#Visualize
map = folium.Map(location=[latitude, longitude], zoom_start=14)

# add markers to map
for lat, lng, name, cat in zip(df['location.lat'], df['location.lng'], df['name'], df['categories']):
    label = name + '-' + cat
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map)  
    
map

## Finding the closest medical centers (a hospital as an example) to a location

In [15]:
search_query = 'Hospital'
latitude = -37.783031
longitude = 145.122517
radius = 3000


url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
results = requests.get(url).json()
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)[['id','name','location.lat','location.lng','categories','location.distance']]
dataframe['categories'] = dataframe['categories'].apply(clean_category)
dataframe = dataframe[dataframe['categories'] == 'Hospital']
dataframe



|  | id | name | location.lat | location.lng | categories | location.distance |
|---|---|---|---|---|---|---|
| 0 | 4b9ed98cf964a5201b0637e3 | Epworth Eastern Hospital | -37.8146 | 145.119292 | Hospital | 3525 |
| 1 | 4b0e4609f964a520835623e3 | Box Hill Hospital | -37.81395 | 145.118819 | Hospital | 3457 |
| 2 | 51206544e4b027095e549294 | Birralee - Box Hill Hospital | -37.810331 | 145.117596 | Hospital | 3069 |
| 3 | 504d84e3e4b086b5995081c3 | Box Hill Hospital - Delivery Suites | -37.812478 | 145.120139 | Hospital | 3284 |
| 6 | 504fbce1e4b0fb6d8f9c44cf | Box Hill Hospital - Operating Theatre | -37.813229 | 145.119902 | Hospital | 3369 |
| 7 | 4e783fbe7d8b90e441f7a208 | 4 West Box Hill Hospital | -37.813769 | 145.119029 | Hospital | 3435 |
| 11 | 4babf1faf964a520e8d73ae3 | Epworth Hospital - NeuroDiagnostics Unit | -37.814646 | 145.118769 | Hospital | 3534 |
| 12 | 5223df5411d2bc86b1b8fdc4 | Eastern Health Care, Boxhill Hospital | -37.817425 | 145.11763 | Hospital | 3852 |
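
The `location.distance` values returned by Foursquare are in metres, and the rows above are not sorted by them. As a rough sketch of how the closest venue could be picked out and the distances sanity-checked, the snippet below recomputes great-circle distances with the haversine formula for three of the rows above and sorts them. The function name `haversine_m` and the mean Earth radius of 6,371 km are assumptions, so the results should only approximately match Foursquare's figures.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres (assumed)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

origin = (-37.783031, 145.122517)  # search location used in the cell above

# (name, lat, lng) copied from three rows of the table above
hospitals = [
    ('Epworth Eastern Hospital', -37.8146, 145.119292),
    ('Box Hill Hospital', -37.81395, 145.118819),
    ('Birralee - Box Hill Hospital', -37.810331, 145.117596),
]

# Sort by computed distance, nearest first
ranked = sorted(
    ((name, haversine_m(origin[0], origin[1], lat, lng)) for name, lat, lng in hospitals),
    key=lambda item: item[1],
)
for name, dist in ranked:
    print('{:<30} {:.0f} m'.format(name, dist))
```

In the notebook itself the same ordering could be obtained by sorting the dataframe on its `location.distance` column; the haversine version is only useful when distances to a different origin are needed without another API call.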


In [16]:
#A function that retrieves rating for a venue
def get_rating(venue_id):
    
    url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)
    result = requests.get(url).json()
    try:
        rating = result['response']['venue']['rating']
    except KeyError:
        # the venue has no rating, or the request did not return venue details
        rating = 'NA'
        
    return rating
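
`get_rating` relies on the nested keys `response`, `venue` and `rating` all being present in the reply. The sketch below performs the same lookup as a pure function, so it can be exercised without calling the API. The function name `extract_rating` and the three sample payloads are made up for illustration and only mimic the nesting used above.

```python
def extract_rating(result, default='NA'):
    """Return result['response']['venue']['rating'], or `default` if any level is missing."""
    venue = result.get('response', {}).get('venue', {})
    return venue.get('rating', default)

# Made-up payloads that mimic the nesting read by get_rating
rated = {'response': {'venue': {'id': 'abc', 'rating': 8.1}}}
unrated = {'response': {'venue': {'id': 'def'}}}
failed = {'meta': {'code': 429}, 'response': {}}

print(extract_rating(rated))    # 8.1
print(extract_rating(unrated))  # NA
print(extract_rating(failed))   # NA
```

Separating the parsing from the HTTP request makes it easier to tell an unrated venue apart from a failed request when debugging.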

In [17]:
#Create results dataframe
df = pd.DataFrame()
df['Name'] = dataframe['name']
df['Category'] = dataframe['categories']
df['Distance'] = dataframe['location.distance']
df['Rating'] = dataframe['id'].apply(get_rating)
df

|  | Name | Category | Distance | Rating |
|---|---|---|---|---|
| 0 | Epworth Eastern Hospital | Hospital | 3525 |  |
| 1 | Box Hill Hospital | Hospital | 3457 |  |
| 2 | Birralee - Box Hill Hospital | Hospital | 3069 |  |
| 3 | Box Hill Hospital - Delivery Suites | Hospital | 3284 |  |
| 6 | Box Hill Hospital - Operating Theatre | Hospital | 3369 |  |
| 7 | 4 West Box Hill Hospital | Hospital | 3435 |  |
| 11 | Epworth Hospital - NeuroDiagnostics Unit | Hospital | 3534 |  |
| 12 | Eastern Health Care, Boxhill Hospital | Hospital | 3852 |  |


### Remove redundant rows and add private or public tag

In [18]:
hospitals_df = pd.read_csv('Hospital_Locations.csv')
hospitals_df.head()

FileNotFoundError: [Errno 2] File Hospital_Locations.csv does not exist: 'Hospital_Locations.csv'

In [None]:
df = df.merge(hospitals_df[['LabelName','Type']], how='inner', left_on='Name', right_on='LabelName').drop(['LabelName'], axis=1)
df

## Results 
We retrieved the required information from several databases. This includes a map of the healthcare centers in a suburb and a list of the closest service providers to a given location.

## Discussion
Foursquare does not seem to hold enough information about medical centers: most of the venues we retrieved have no rating. Adding another database that is more popular amongst users and carries a wider range of information on user experience might be useful.

## Conclusion
The goal of this project was to aid users in finding the best medical center near them. We did this by providing a list of different centers in each neighbourhood and creating a list of closest centers to each location.