# Capstone Project - Best location for developing resedential complex for senior citizens in Kolkata

## Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

In this project we will try to find an **optimal location** for building a **residential complex** exclusively for **senior citizens** in Kolkata. Specifically, this report will be targeted to a Real Estate company interested in **developing a property** in **Kolkata, West Bengal, India**.

We will try to find out such **location** which are in **vicinity of a hospital**. We are also particularly interested in **areas with maximum number of facilities in vicinity of ra**. We would also prefer locations **as close to city center as possible**, assuming that first two conditions are met.

We will use our data science powers to generate a few most promissing neighborhoods based on this criteria. Advantages of each area will then be clearly expressed so that best possible final location can be chosen by stakeholders.

## Data <a name="data"></a>

Based on definition of our problem, factors that will influence our decission are:
* number of and distance to nearby venues in the neighborhood, if any
* distance of location from city center

Following data sources will be needed to extract/generate the required information:
* centers of hospitals will be generated algorithmically and approximate location of hospitals will be obtained using **Foursquare API**
* number of nearby venues and their type and location in every neighborhood will be obtained using **Foursquare API**
* coordinates of Kolkata will be obtained using **Foursquare API** 

### Neighborhood Candidates

Let's create latitude & longitude coordinates for centroids of our candidate neighborhoods around hospital coordinates using **Foursquare API**.

Let's first find the latitude & longitude of Kolkata city center, using **Foursquare API**.

## Methodology <a name="methodology"></a>

In this project we will direct our efforts on detecting areas of hospitals in Kolkata that have most number of distinct ammenities/facilities in its vicinity. We will limit our analysis to area ~5km around city center.

In first step we will collect the required **data: location of hospitals within 5km from Kolkata center** (according to Foursquare categorization).

Second step in our analysis will be exploration of '**nearby venues**' across different hospitals of Kolkata - we will use **Foursquare** to find the areas close to the hospitals

In third and final step we will create **clusters of locations of hospitals**. We will take into consideration locations of hospitals with **most number of nearby venues in radius of 500 meters**. We will present map of all such locations and also create clusters (using **k-means clustering**) of those locations to identify the most favourable location for a residential complex for senior citizen near any hospital. As a result, it should be a starting point for final 'street level' exploration and search for optimal venue location by stakeholders.

## Import necessary Libraries

In [1]:
import requests # library to handle requests
import pandas as pd # library for data analsysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

#!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# tranforming json file into a pandas dataframe library
from pandas.io.json import json_normalize

#!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Folium installed
Libraries imported.


## Define Foursquare Credentials and Version

In [2]:
# Removed for security purpose
CLIENT_ID = '55K3SIPM2J3RH2KJADQMALH0IXRZTUN0AVWTEJTLX415NM0L' # your Foursquare ID
CLIENT_SECRET = 'TE3E2CZY5GUH00ISLFQLM1KSGJZWFTU3EL4NTWZYTFZRCSI5' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 100

Let's start by converting the Kokata's location to its latitude and longitude coordinates.

In order to define an instance of the geocoder, we need to define a user_agent. We will name our agent foursquare_agent, as shown below.


In [3]:
address = 'Kolkata'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

22.54541245 88.3567751581234


## Search for a specific venue category
> `https://api.foursquare.com/v2/venues/`**search**`?client_id=`**CLIENT_ID**`&client_secret=`**CLIENT_SECRET**`&ll=`**LATITUDE**`,`**LONGITUDE**`&v=`**VERSION**`&query=`**QUERY**`&radius=`**RADIUS**`&limit=`**LIMIT**

In [4]:
search_query = 'Hospital'
radius = 5000
print(search_query + ' .... OK!')

Hospital .... OK!


#### Define the corresponding URL

In [5]:
## Removed for security purpose
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=55K3SIPM2J3RH2KJADQMALH0IXRZTUN0AVWTEJTLX415NM0L&client_secret=TE3E2CZY5GUH00ISLFQLM1KSGJZWFTU3EL4NTWZYTFZRCSI5&ll=22.54541245,88.3567751581234&v=20180604&query=Hospital&radius=5000&limit=100'

### Let's get the geojson

In [6]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5eab6e166001fe001b7e9b30'},
 'response': {'venues': [{'id': '4e04abee81dc9d212d106c85',
    'name': 'Sambhunath Pandit Hospital',
    'location': {'address': '1, Lala Lajpat Rai Sarani',
     'lat': 22.538611179064645,
     'lng': 88.34813475608826,
     'labeledLatLngs': [{'label': 'display',
       'lat': 22.538611179064645,
       'lng': 88.34813475608826}],
     'distance': 1167,
     'postalCode': '700020',
     'cc': 'IN',
     'city': 'Kolkata',
     'state': 'West Bengal',
     'country': 'India',
     'formattedAddress': ['1, Lala Lajpat Rai Sarani',
      'Kolkata 700020',
      'West Bengal',
      'India']},
    'categories': [{'id': '4bf58dd8d48988d196941735',
      'name': 'Hospital',
      'pluralName': 'Hospitals',
      'shortName': 'Hospital',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/building/medical_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1588293149',
    'hasPerk': False}

#### Get relevant part of JSON and transform it into a *pandas* dataframe

In [7]:
# assign relevant part of JSON to venues
venues = results['response']['venues']

# tranform venues into a dataframe
dataframe = pd.json_normalize(venues)
dataframe.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.city,location.state,location.country,location.formattedAddress,location.crossStreet
0,4e04abee81dc9d212d106c85,Sambhunath Pandit Hospital,"[{'id': '4bf58dd8d48988d196941735', 'name': 'H...",v-1588293149,False,"1, Lala Lajpat Rai Sarani",22.538611,88.348135,"[{'label': 'display', 'lat': 22.53861117906464...",1167,700020,IN,Kolkata,West Bengal,India,"[1, Lala Lajpat Rai Sarani, Kolkata 700020, We...",
1,4e16b775d4c0c7a8fbb4b1c6,Dr. R. Ahmed Dental College and Hospital,"[{'id': '4d4b7105d754a06372d81259', 'name': 'C...",v-1588293149,False,"114, Acharya Jagadish Chandra Bose Rd",22.550755,88.368959,"[{'label': 'display', 'lat': 22.55075471877546...",1386,700014,IN,Kolkata,West Bengal,India,"[114, Acharya Jagadish Chandra Bose Rd, Kolkat...",
2,51df6356498ed1df58526826,Achintya Mohan Homoeopathic Hospital,"[{'id': '4bf58dd8d48988d177941735', 'name': 'D...",v-1588293149,False,"21, Justice Dwarka Nath Rd",22.533567,88.34792,"[{'label': 'display', 'lat': 22.53356721100715...",1602,700020,IN,Kolkata,West Bengal,India,"[21, Justice Dwarka Nath Rd, Kolkata 700020, W...",
3,506d9a73e4b08e51334fb585,Nil Ratan Sarkar Medical College and Hospital,"[{'id': '4bf58dd8d48988d1b3941735', 'name': 'M...",v-1588293149,False,"138, Acharya Jagadish Chandra Bose Rd",22.553502,88.37546,"[{'label': 'display', 'lat': 22.55350187773104...",2121,700014,IN,Kolkata,West Bengal,India,"[138, Acharya Jagadish Chandra Bose Rd, Kolkat...",
4,5056998ce4b0d505c3dfc70e,Priyamvada Birla Aravind Eye Hospital,"[{'id': '522e32fae4b09b556e370f19', 'name': 'E...",v-1588293149,False,"10, U. N. Brahmachari St",22.542949,88.356026,"[{'label': 'display', 'lat': 22.54294942923132...",284,700017,IN,Kolkata,West Bengal,India,"[10, U. N. Brahmachari St, Kolkata 700017, Wes...",


#### Get relevant part of JSON and transform it into a *pandas* dataframe

In [8]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns]

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered.head()

Unnamed: 0,name,categories,address,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,crossStreet,id
0,Sambhunath Pandit Hospital,Hospital,"1, Lala Lajpat Rai Sarani",22.538611,88.348135,"[{'label': 'display', 'lat': 22.53861117906464...",1167,700020,IN,Kolkata,West Bengal,India,"[1, Lala Lajpat Rai Sarani, Kolkata 700020, We...",,4e04abee81dc9d212d106c85
1,Dr. R. Ahmed Dental College and Hospital,College & University,"114, Acharya Jagadish Chandra Bose Rd",22.550755,88.368959,"[{'label': 'display', 'lat': 22.55075471877546...",1386,700014,IN,Kolkata,West Bengal,India,"[114, Acharya Jagadish Chandra Bose Rd, Kolkat...",,4e16b775d4c0c7a8fbb4b1c6
2,Achintya Mohan Homoeopathic Hospital,Doctor's Office,"21, Justice Dwarka Nath Rd",22.533567,88.34792,"[{'label': 'display', 'lat': 22.53356721100715...",1602,700020,IN,Kolkata,West Bengal,India,"[21, Justice Dwarka Nath Rd, Kolkata 700020, W...",,51df6356498ed1df58526826
3,Nil Ratan Sarkar Medical College and Hospital,Medical School,"138, Acharya Jagadish Chandra Bose Rd",22.553502,88.37546,"[{'label': 'display', 'lat': 22.55350187773104...",2121,700014,IN,Kolkata,West Bengal,India,"[138, Acharya Jagadish Chandra Bose Rd, Kolkat...",,506d9a73e4b08e51334fb585
4,Priyamvada Birla Aravind Eye Hospital,Eye Doctor,"10, U. N. Brahmachari St",22.542949,88.356026,"[{'label': 'display', 'lat': 22.54294942923132...",284,700017,IN,Kolkata,West Bengal,India,"[10, U. N. Brahmachari St, Kolkata 700017, Wes...",,5056998ce4b0d505c3dfc70e


#### Let's visualize the Hospitals that are nearby

In [9]:
dataframe_filtered.drop(['crossStreet'],axis =1,inplace=True)
dataframe_filtered.dropna(axis =0,inplace=True)

In [10]:
dataframe_filtered=dataframe_filtered.sort_values('name',axis =0).reset_index()
dataframe_filtered

Unnamed: 0,index,name,categories,address,lat,lng,labeledLatLngs,distance,postalCode,cc,city,state,country,formattedAddress,id
0,44,AMRI Hospital,Hospital,Gariahat Rd,22.512438,88.367629,"[{'label': 'display', 'lat': 22.51243801897641...",3836,700031,IN,Kolkata,West Bengal,India,"[Gariahat Rd, Kolkata 700031, West Bengal, India]",53ba349f498e1b9ab9738bf9
1,2,Achintya Mohan Homoeopathic Hospital,Doctor's Office,"21, Justice Dwarka Nath Rd",22.533567,88.34792,"[{'label': 'display', 'lat': 22.53356721100715...",1602,700020,IN,Kolkata,West Bengal,India,"[21, Justice Dwarka Nath Rd, Kolkata 700020, W...",51df6356498ed1df58526826
2,24,All India Blind Welfare Week Netaji Eye Hospital,Office,"28A, Harish Mukherjee Rd",22.536718,88.344079,"[{'label': 'display', 'lat': 22.53671847394578...",1624,700025,IN,Kolkata,West Bengal,India,"[28A, Harish Mukherjee Rd, Kolkata 700025, Wes...",529be61a11d29e0ad00f1ff3
3,26,Anandalok Hospital,Hospital,"67/A, Ashutosh Mukherjee Rd",22.53109,88.34585,"[{'label': 'display', 'lat': 22.53108975254625...",1950,700025,IN,Kolkata,West Bengal,India,"[67/A, Ashutosh Mukherjee Rd, Kolkata 700025, ...",52f393de11d224c5df1f6d0a
4,45,B. P. Poddar Hospital & Medical Research Limited,Hospital,"71/1, Humayun Kabir Sarani",22.510899,88.333038,"[{'label': 'display', 'lat': 22.51089852582056...",4551,700053,IN,Kolkata,West Bengal,India,"[71/1, Humayun Kabir Sarani, Kolkata 700053, W...",53e9b832498ef14e0421c890
5,31,B. R. Singh Railway Hospital,Hospital,Sealdah,22.565513,88.370612,"[{'label': 'display', 'lat': 22.56551253159502...",2651,700014,IN,Kolkata,West Bengal,India,"[Sealdah, Kolkata 700014, West Bengal, India]",5239191e7e48cf67c0c003f7
6,29,Belle Vue Clinic,Hospital,"9, Dr. U. N. Brahmachari St",22.542595,88.354872,"[{'label': 'display', 'lat': 22.54259469354197...",369,700017,IN,Kolkata,West Bengal,India,"[9, Dr. U. N. Brahmachari St, Kolkata 700017, ...",4da7d40afa8c4175d0ab668e
7,12,Bengal Infertility & Reproductive Therapy Hosp...,Hospital,Umananda Rd,22.537452,88.350141,"[{'label': 'display', 'lat': 22.53745177639009...",1118,700020,IN,Kolkata,West Bengal,India,"[Umananda Rd, Kolkata 700020, West Bengal, India]",5299479e11d29a7d0d92ac31
8,30,Bhagirathi Neotia Woman & Child Care Centre,Hospital,"2, Rawdon St",22.547544,88.35933,"[{'label': 'display', 'lat': 22.54754419875704...",354,700017,IN,Kolkata,West Bengal,India,"[2, Rawdon St, Kolkata 700017, West Bengal, In...",4cb1a5d6c5e6a1cdea45dff6
9,21,Bhowanipore Charitable Homoeopathic Hospital,Doctor's Office,"1B, Dr. Rajendra Rd",22.535311,88.34711,"[{'label': 'display', 'lat': 22.53531131510519...",1500,700020,IN,Kolkata,West Bengal,India,"[1B, Dr. Rajendra Rd, Kolkata 700020, West Ben...",529ddb59498e2aee9db2fd47


In [11]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=12.4) 
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Kolkata',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.name):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)
venues_map

## Explore the nearby locations of each hospital location using Foursquare
> `https://api.foursquare.com/v2/venues/`**explore**`?client_id=`**CLIENT_ID**`&client_secret=`**CLIENT_SECRET**`&ll=`**LATITUDE**`,`**LONGITUDE**`&v=`**VERSION**`&limit=`**LIMIT**

In [12]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Hospital', 
                  'Hospital Latitude', 
                  'Hospital Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [13]:
kolkata_venues = getNearbyVenues(names=dataframe_filtered['name'],
                                   latitudes=dataframe_filtered['lat'],
                                   longitudes=dataframe_filtered['lng']
                                  )

AMRI Hospital
Achintya Mohan Homoeopathic Hospital
All India Blind Welfare Week Netaji Eye Hospital
Anandalok Hospital
B. P. Poddar Hospital & Medical Research Limited
B. R. Singh Railway Hospital
Belle Vue Clinic
Bengal Infertility & Reproductive Therapy Hospital Pvt. Ltd.
Bhagirathi Neotia Woman & Child Care Centre
Bhowanipore Charitable Homoeopathic Hospital
Calcutta Lions Bimal Poddar Eye Hospital
Chittaranjan National Cancer Institute
Chittaranjan Shishu Sadan Hospital
Command Hospital
Currae Eye Care Hospital
Disha Eye Hospital
Dr. R. Ahmed Dental College and Hospital
Flemming Hospital
Fortis Hospital
GDDI Hospital
Good Samaritan Hospital
Islamia Hospital - School of Nursing
Islamia hospital
Lady Dufferin Hospital
M. R. Bangur Hospital
Maa ENT Hospital
Medical College and Hospital
Motor Hospital & Co
Narayana Superspeciality Hospital
Nightingale Hospital
Nil Ratan Sarkar Medical College and Hospital
Park Circus Charitable Hospital
Priyamvada Birla Aravind Eye Hospital
R. G. Kar M

In [14]:
kolkata_venues.shape

(378, 7)

In [15]:
kolkata_venues.head()

Unnamed: 0,Hospital,Hospital Latitude,Hospital Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,AMRI Hospital,22.512438,88.367629,Aminia,22.516571,88.366739,Mughlai Restaurant
1,AMRI Hospital,22.512438,88.367629,Golpark,22.516224,88.366218,Plaza
2,AMRI Hospital,22.512438,88.367629,Dakshinapan Shopping Complex,22.508749,88.366601,Shopping Mall
3,AMRI Hospital,22.512438,88.367629,Cafe Coffee Day,22.516449,88.366905,Café
4,AMRI Hospital,22.512438,88.367629,Dolly's Tea,22.509053,88.366717,Tea Room


## Analysis <a name="analysis"></a>

Let's perform some basic explanatory data analysis and derive some additional info from our raw data. 
#### Let's count the number of venues of various categories

In [16]:
venue_count=pd.DataFrame(kolkata_venues.groupby(['Hospital','Venue Category']).count())
venue_count.sort_index()
venue_count

Unnamed: 0_level_0,Unnamed: 1_level_0,Hospital Latitude,Hospital Longitude,Venue,Venue Latitude,Venue Longitude
Hospital,Venue Category,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
AMRI Hospital,Café,1,1,1,1,1
AMRI Hospital,Mughlai Restaurant,1,1,1,1,1
AMRI Hospital,Plaza,1,1,1,1,1
AMRI Hospital,Shopping Mall,1,1,1,1,1
AMRI Hospital,Tea Room,1,1,1,1,1
...,...,...,...,...,...,...
Tra General Hospital,Tex-Mex Restaurant,1,1,1,1,1
Tra General Hospital,Train Station,1,1,1,1,1
Woodlands Multispeciality Hospital Limited,South Indian Restaurant,1,1,1,1,1
id hospital,Clothing Store,1,1,1,1,1


#### Let's see how many various categories of vanues we have nearby

In [17]:
print('There are {} uniques categories.'.format(len(kolkata_venues['Venue Category'].unique())))

There are 73 uniques categories.


In [18]:
venue_count=venue_count.reset_index()
venue_count

Unnamed: 0,Hospital,Venue Category,Hospital Latitude,Hospital Longitude,Venue,Venue Latitude,Venue Longitude
0,AMRI Hospital,Café,1,1,1,1,1
1,AMRI Hospital,Mughlai Restaurant,1,1,1,1,1
2,AMRI Hospital,Plaza,1,1,1,1,1
3,AMRI Hospital,Shopping Mall,1,1,1,1,1
4,AMRI Hospital,Tea Room,1,1,1,1,1
...,...,...,...,...,...,...,...
305,Tra General Hospital,Tex-Mex Restaurant,1,1,1,1,1
306,Tra General Hospital,Train Station,1,1,1,1,1
307,Woodlands Multispeciality Hospital Limited,South Indian Restaurant,1,1,1,1,1
308,id hospital,Clothing Store,1,1,1,1,1


#### Count the total number of distinct venues nearby each hospital

In [19]:
venue_count.groupby('Hospital').count()
venue_count_data=venue_count.groupby('Hospital').count().reset_index()
venue_count_data

Unnamed: 0,Hospital,Venue Category,Hospital Latitude,Hospital Longitude,Venue,Venue Latitude,Venue Longitude
0,AMRI Hospital,5,5,5,5,5,5
1,Achintya Mohan Homoeopathic Hospital,5,5,5,5,5,5
2,All India Blind Welfare Week Netaji Eye Hospital,5,5,5,5,5,5
3,Anandalok Hospital,4,4,4,4,4,4
4,B. P. Poddar Hospital & Medical Research Limited,3,3,3,3,3,3
5,B. R. Singh Railway Hospital,4,4,4,4,4,4
6,Belle Vue Clinic,11,11,11,11,11,11
7,Bengal Infertility & Reproductive Therapy Hosp...,14,14,14,14,14,14
8,Bhagirathi Neotia Woman & Child Care Centre,8,8,8,8,8,8
9,Bhowanipore Charitable Homoeopathic Hospital,8,8,8,8,8,8


#### Using K-means clustering for forming clusters of locations

In [20]:
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 5

venue_count_data_clustering = venue_count_data.drop('Hospital', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(venue_count_data_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([0, 0, 0, 0, 4, 0, 3, 1, 3, 3, 4, 4, 4, 0, 2, 1, 0, 0, 0, 0, 3, 1,
       0, 4, 0, 1, 4, 4, 4, 1, 0, 1, 3, 4, 0, 2, 0, 1, 4, 0, 1, 4, 4])

#### Adding clusters to venue count dataframe

In [21]:
# add clustering labels
#venue_count_data.drop('Cluster Labels',axis=1,inplace=True)
venue_count_data.insert(0, 'Cluster Labels', kmeans.labels_)

#kolkata_merged['Hospital Name','Latitude','Longitude','Cluster Labels'] = dataframe_filtered['name','lat','lng','Cluster Labels']
venue_count_data

Unnamed: 0,Cluster Labels,Hospital,Venue Category,Hospital Latitude,Hospital Longitude,Venue,Venue Latitude,Venue Longitude
0,0,AMRI Hospital,5,5,5,5,5,5
1,0,Achintya Mohan Homoeopathic Hospital,5,5,5,5,5,5
2,0,All India Blind Welfare Week Netaji Eye Hospital,5,5,5,5,5,5
3,0,Anandalok Hospital,4,4,4,4,4,4
4,4,B. P. Poddar Hospital & Medical Research Limited,3,3,3,3,3,3
5,0,B. R. Singh Railway Hospital,4,4,4,4,4,4
6,3,Belle Vue Clinic,11,11,11,11,11,11
7,1,Bengal Infertility & Reproductive Therapy Hosp...,14,14,14,14,14,14
8,3,Bhagirathi Neotia Woman & Child Care Centre,8,8,8,8,8,8
9,3,Bhowanipore Charitable Homoeopathic Hospital,8,8,8,8,8,8


#### Let's visualize the clusters on amap

In [22]:
import folium
import matplotlib.cm as cm
import matplotlib.colors as colors
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12.4)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, name, cluster in zip(dataframe_filtered['lat'], dataframe_filtered['lng'], dataframe_filtered['name'], venue_count_data['Cluster Labels']):
    label = folium.Popup(str(name) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
    #print(rainbow[int(cluster)-1],cluster)
    
legend_html =   '''
                <div style="position: fixed; 
                            bottom: 20px; right: 10px; width: 170px; height: 200px; 
                            border:2px solid black; z-index:9999; font-size:12px;
                            "><h5>&nbsp; Favourability Legend </h5><br>
                              &nbsp; Highly Favourable &nbsp; <i class="fa fa-map-marker fa-2x" style="color:#8000ff"></i><br>
                              &nbsp; Favourable &nbsp; <i class="fa fa-map-marker fa-2x" style="color:#00b5eb"></i><br>
                              &nbsp; Moderately Favourable &nbsp; <i class="fa fa-map-marker fa-2x" style="color:#80ffb4"></i><br>
                              &nbsp; Unfavourable &nbsp; <i class="fa fa-map-marker fa-2x" style="color:#ffb360"></i><br>
                              &nbsp; Highly Unfavourable &nbsp; <i class="fa fa-map-marker fa-2x" style="color:#ff0000"></i><br>
                </div>
                ''' 
map_clusters.get_root().html.add_child(folium.Element(legend_html))
map_clusters

## Results and Discussion <a name="results"></a>

From our above analysis, we found that most of the hospitals are situated nearer to the city centre. We also found that most of the favourable locations are also situated nearer to the city centre. These locations have 10 to 20 different kinds of venues in the radius of 500m. 

we can also see that highly favourable and highly unfavourable locations are situated next to each other. This is an anomaly  There are various reasons for it. Either the hospital is not so popular or the hospital is situated on the border of a military facility where resedential construction is not allowed. The hospital is nearer to a railway station and the hospital belongs to railways and no outsider resedential construction is allowed on railway property. The hospital might be a military hospital where only army or their families are allowed.

We can also see that all the hospitals farther from city centre are unfavourable for property development as they lack good number of venues in their vicinity.

## Conclusion <a name="conclusion"></a>

Purpose of this project was to identify areas of Hospitals in Kolkata having maximum number of distinct nearby venues in order to aid stakeholders in narrowing down the search for optimal location for a new residential complex for senior citizens. By calculating number of nearby distict vicinity we have created a map showing the locations starting from most to least favourable for building the residential complex. These locations will provide a good idea to the stakeholders to choose the starting points for final exploration.

Final decission on optimal location will be made by stakeholders based on specific characteristics of neighborhoods and locations in every recommended hospital zone, taking into consideration additional factors like proximity to major roads, real estate prices, social and economic dynamics of every neighborhood, crime record of that perticular location etc.