### DS_Capstone_Project_Delivery - Recommending a City for Residential Living in Texas

This project collects venues in specific categories (schools, hospitals, and residential buildings) for two popular Texas cities, Austin and Houston, separately.

The venues are then clustered and visualized on a map to make the patterns easier to see.

Finally, the maps of the two cities are compared to recommend the better place for residential living.

In [2]:
# Import the necessary packages
import requests
import pandas as pd

from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes
import folium
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors


In [4]:
# The code was removed by Watson Studio for sharing.

In [9]:
# The Foursquare credentials (CLIENT_ID, CLIENT_SECRET, VERSION) are set in the hidden cell above

# Get nearby venues belonging to specific categories
# Inputs: name of the city (used only as a label; it is not sent to the API), latitude and longitude of the city, and search radius in meters
# Returns a pandas dataframe that contains the nearby venues

def getnearbyvenues(name, latitude, longitude, radius=1000):
    
    # List of category id - school, hospital, residential building
    cat_id = "4bf58dd8d48988d13b941735,4bf58dd8d48988d196941735,4d954b06a243a5684965b473"

    # create the API request URL
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&categoryId={}'.format(    
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            latitude, 
            longitude, 
            radius, 
            cat_id
            )
   
    # make the GET request
    results = requests.get(url).json()["response"]
    #print(results)
    venues_list = []
    # Fetch the necessary values from the response 
    for item in results['venues']:
          venues_list.append([item['name'],
                             item['categories'][0]['name'],
                             item['location']['distance'],
                             item['location']['lat'], 
                             item['location']['lng']
                            ])
    # Passing the column names to the constructor also works when no venues are returned
    nearby_venues_df = pd.DataFrame(venues_list, columns=["Name", "Category", "Distance", "Latitude", "Longitude"])
    return nearby_venues_df
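
The parsing loop in `getnearbyvenues` assumes that every venue has at least one category and a `distance` field. A minimal sketch of a more defensive version follows, kept separate from the network call so it can be tried on a hand-written response. The helper `parse_venues` and the sample dictionary are hypothetical; they are not part of the original notebook or of the Foursquare API.

```python
def parse_venues(response):
    """Turn the 'response' part of a venues/search reply into rows.

    Assumes the same layout the notebook reads: response['venues'] is a
    list of dicts with 'name', 'categories' and 'location' keys.
    """
    rows = []
    for item in response.get("venues", []):
        categories = item.get("categories") or []
        if not categories:
            # Skip venues without a category; they cannot be one-hot encoded.
            continue
        location = item.get("location", {})
        rows.append([
            item.get("name"),
            categories[0]["name"],
            location.get("distance"),
            location.get("lat"),
            location.get("lng"),
        ])
    return rows


# Hand-written sample in the assumed layout (not real API output).
sample = {
    "venues": [
        {"name": "Brykerwoods Elementary",
         "categories": [{"name": "Elementary School"}],
         "location": {"distance": 336, "lat": 30.304727, "lng": -97.751121}},
        {"name": "Uncategorized place", "categories": [],
         "location": {"distance": 10, "lat": 30.3, "lng": -97.75}},
    ]
}
rows = parse_venues(sample)
print(rows)
```

The resulting rows can be passed to `pd.DataFrame(rows, columns=["Name", "Category", "Distance", "Latitude", "Longitude"])` to get the same dataframe shape as the function above.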

In [10]:
# encode_and_cluster one-hot encodes the venue categories and fits the clustering model
# K-Means clustering is used; 3 and 4 clusters are tried

# Inputs: nearby venues dataframe fetched from the Foursquare requests, number of clusters

# Returns the encoded dataframe with a Cluster column added at the end

def encode_and_cluster(input_df, nclusters):
    #one hot encoding
    encoded_venues = pd.get_dummies(input_df['Category'])
    encoded_venues['Distance'] = input_df['Distance']
    
    #kmeans clustering
    k1 = KMeans(n_clusters = nclusters)
    k1.fit(encoded_venues)
    encoded_venues['Name'] = input_df['Name']
    encoded_venues['Latitude'] = input_df['Latitude']
    encoded_venues['Longitude'] = input_df['Longitude']
    encoded_venues['Cluster'] = k1.labels_.tolist()
    
    #encoded_venues
    return encoded_venues
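
One thing worth checking in `encode_and_cluster` is the scale of the features: the one-hot category columns are 0 or 1, while `Distance` is in meters. K-means works on Euclidean distance, so the raw `Distance` column can dominate the result. The sketch below works this through for three venues. The vectors are made up for illustration, and dividing by a 1000 m radius is only one possible rescaling, not something the notebook does.

```python
import math

# Feature vectors in the notebook's layout: [is_hospital, is_school, distance_m]
hospital_near = [1, 0, 400]
school_near = [0, 1, 400]     # different category, same distance
hospital_far = [1, 0, 500]    # same category, 100 m further out

# Raw features: a category mismatch costs sqrt(2), a 100 m gap costs 100.
d_category = math.dist(hospital_near, school_near)
d_distance = math.dist(hospital_near, hospital_far)
print(d_category, d_distance)


def rescale(vec, radius=1000):
    """Hypothetical rescaling: express the distance as a fraction of the radius."""
    return vec[:-1] + [vec[-1] / radius]


# After rescaling, the category mismatch outweighs the 100 m gap.
s_category = math.dist(rescale(hospital_near), rescale(school_near))
s_distance = math.dist(rescale(hospital_near), rescale(hospital_far))
print(s_category, s_distance)
```

This is consistent with the tables that follow, where each cluster covers a band of distances rather than a single venue category.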

In [11]:
# draw_map draws a folium map to visualize the selected city
# Inputs: clustered venues dataframe, the city's latitude,
# the city's longitude, and the number of clusters (for color coding the clusters)
# Returns a map of the city showing the clusters in different colors

def draw_map(city_venues_df, city_lat, city_long, nclusters):
    city_map = folium.Map(location=[city_lat, city_long], zoom_start=14)

    # set color scheme for the clusters: one evenly spaced rainbow color per cluster
    colors_array = cm.rainbow(np.linspace(0, 1, nclusters))
    rainbow = [colors.rgb2hex(i) for i in colors_array]

    # add markers to the map
    for lat, long, pos, cluster in zip(city_venues_df['Latitude'], city_venues_df['Longitude'], city_venues_df['Name'], city_venues_df['Cluster']):
        label = folium.Popup(str(pos) + ' Cluster ' + str(cluster), parse_html=True)
        folium.CircleMarker(
            [lat, long],
            radius=3,
            popup=label,
            # cluster labels run from 0 to nclusters-1, so they index the color list directly
            color=rainbow[cluster],
            fill=True,
            fill_color=rainbow[cluster],
            fill_opacity=0.7).add_to(city_map)
    
    return city_map

In [12]:
# Fetch the data from Foursquare for the categories of interest
# df1 is the data source for Austin
df1 = getnearbyvenues("Austin", 30.303, -97.754)

# df2 is the data source for Houston
df2 = getnearbyvenues("Houston", 29.78, -95.39)




In [13]:
# Encode and Cluster the fetched venues for Austin, Houston
# Number of Clusters used for Clustering is 3
austin_cluster_venues_nc3 = encode_and_cluster(df1, 3)
houston_cluster_venues_nc3 = encode_and_cluster(df2, 3)

In [14]:
austin_cluster_venues_nc3

Unnamed: 0,Doctor's Office,Elementary School,High School,Hospital,Hospital Ward,Medical Center,Private School,Residential Building (Apartment / Condo),School,Distance,Name,Latitude,Longitude,Cluster
0,0,1,0,0,0,0,0,0,0,336,Brykerwoods Elementary,30.304727,-97.751121,2
1,0,0,0,0,0,1,0,0,0,833,Seton Medical Center Austin (Hospital),30.30481,-97.74559,1
2,0,0,0,1,0,0,0,0,0,431,Seton Medical Center Rm 526,30.303255,-97.749524,2
3,1,0,0,0,0,0,0,0,0,465,Snyder Dermatology,30.306006,-97.750636,2
4,0,0,0,1,0,0,0,0,0,745,SMCA Respiratory Dept.,30.303388,-97.746257,1
5,0,0,0,0,1,0,0,0,0,824,Seton Medical Center ICU,30.304463,-97.74559,1
6,0,0,0,1,0,0,0,0,0,834,Seton Medical Center IMC,30.305062,-97.745647,1
7,0,0,0,1,0,0,0,0,0,954,SMCA PAT,30.305821,-97.74462,0
8,0,0,0,1,0,0,0,0,0,891,SMCA PACU,30.305225,-97.74509,1
9,0,0,0,1,0,0,0,0,0,689,Bailey Square 4th Floor Waiting Room,30.302537,-97.746851,1


In [15]:
austin_cluster_venues_nc3[austin_cluster_venues_nc3['Cluster'] == 0]

Unnamed: 0,Doctor's Office,Elementary School,High School,Hospital,Hospital Ward,Medical Center,Private School,Residential Building (Apartment / Condo),School,Distance,Name,Latitude,Longitude,Cluster
7,0,0,0,1,0,0,0,0,0,954,SMCA PAT,30.305821,-97.74462,0
11,0,0,0,1,0,0,0,0,0,958,SMCA Pre-Op,30.305551,-97.744475,0
19,0,0,0,0,0,0,0,0,1,991,Child craft school,30.29845,-97.745129,0
20,0,0,0,0,0,0,0,1,0,1030,Minnieapolis,30.310788,-97.748211,0
21,0,0,0,0,0,0,0,1,0,1075,Pecan Square,30.297297,-97.744971,0
22,0,0,0,0,0,0,0,1,0,1079,Shoal Creek Park,30.311519,-97.748642,0
23,0,0,0,0,0,0,0,1,0,999,Richardson at Tarrytown,30.308037,-97.762604,0
25,0,0,0,0,0,0,0,1,0,1000,West University Place,30.297344,-97.745912,0


From the above cluster, Austin has 2 hospitals, 1 school, and 5 residential buildings at distances of 900 m and above.
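
Counts like these can be computed rather than read off the table. A minimal sketch in plain Python follows, using the eight rows of Austin cluster 0 shown above with the one-hot columns collapsed to a single category label. `summarize` is a hypothetical helper; with the real dataframe, similar records could be obtained from `DataFrame.to_dict('records')`.

```python
from collections import Counter

# Austin, 3 clusters, cluster 0: (name, category, distance in meters)
cluster0 = [
    ("SMCA PAT", "Hospital", 954),
    ("SMCA Pre-Op", "Hospital", 958),
    ("Child craft school", "School", 991),
    ("Minnieapolis", "Residential Building (Apartment / Condo)", 1030),
    ("Pecan Square", "Residential Building (Apartment / Condo)", 1075),
    ("Shoal Creek Park", "Residential Building (Apartment / Condo)", 1079),
    ("Richardson at Tarrytown", "Residential Building (Apartment / Condo)", 999),
    ("West University Place", "Residential Building (Apartment / Condo)", 1000),
]


def summarize(rows):
    """Return per-category counts and the (min, max) distance of a cluster."""
    counts = Counter(category for _, category, _ in rows)
    distances = [distance for _, _, distance in rows]
    return counts, (min(distances), max(distances))


counts, distance_range = summarize(cluster0)
print(dict(counts))     # {'Hospital': 2, 'School': 1, 'Residential Building (Apartment / Condo)': 5}
print(distance_range)   # (954, 1079)
```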

In [16]:
austin_cluster_venues_nc3[austin_cluster_venues_nc3['Cluster'] == 1]

Unnamed: 0,Doctor's Office,Elementary School,High School,Hospital,Hospital Ward,Medical Center,Private School,Residential Building (Apartment / Condo),School,Distance,Name,Latitude,Longitude,Cluster
1,0,0,0,0,0,1,0,0,0,833,Seton Medical Center Austin (Hospital),30.30481,-97.74559,1
4,0,0,0,1,0,0,0,0,0,745,SMCA Respiratory Dept.,30.303388,-97.746257,1
5,0,0,0,0,1,0,0,0,0,824,Seton Medical Center ICU,30.304463,-97.74559,1
6,0,0,0,1,0,0,0,0,0,834,Seton Medical Center IMC,30.305062,-97.745647,1
8,0,0,0,1,0,0,0,0,0,891,SMCA PACU,30.305225,-97.74509,1
9,0,0,0,1,0,0,0,0,0,689,Bailey Square 4th Floor Waiting Room,30.302537,-97.746851,1
10,0,0,0,1,0,0,0,0,0,797,Seton Medical Center - Maternity Ward,30.305066,-97.746049,1
13,0,0,0,1,0,0,0,0,0,849,Seton Medical Center 6th Floor,30.305313,-97.745577,1
14,0,0,0,1,0,0,0,0,0,760,ARC Medical Park Tower (Obstetrics/gynecology),30.305577,-97.746673,1
15,1,0,0,0,0,0,0,0,0,763,After Hours Kids,30.308149,-97.748756,1


From the above cluster, Austin has many hospitals and medical offices, plus a school, at distances of 600-900 m.

In [17]:
austin_cluster_venues_nc3[austin_cluster_venues_nc3['Cluster'] == 2]

Unnamed: 0,Doctor's Office,Elementary School,High School,Hospital,Hospital Ward,Medical Center,Private School,Residential Building (Apartment / Condo),School,Distance,Name,Latitude,Longitude,Cluster
0,0,1,0,0,0,0,0,0,0,336,Brykerwoods Elementary,30.304727,-97.751121,2
2,0,0,0,1,0,0,0,0,0,431,Seton Medical Center Rm 526,30.303255,-97.749524,2
3,1,0,0,0,0,0,0,0,0,465,Snyder Dermatology,30.306006,-97.750636,2
12,0,0,1,0,0,0,0,0,0,558,St. Andrew's School,30.302686,-97.748203,2
17,0,0,0,0,0,0,1,0,0,622,austin STEM academy,30.307552,-97.750246,2
24,0,0,0,0,0,0,0,1,0,517,The Worthington,30.307539,-97.752856,2


From the above cluster, Austin has a few medical venues, a few schools, and one residence at distances of roughly 300-620 m.

In [18]:
houston_cluster_venues_nc3

Unnamed: 0,Elementary School,Hospital,Residential Building (Apartment / Condo),Distance,Name,Latitude,Longitude,Cluster
0,0,0,1,271,Camden Heights Apartments,29.777659,-95.390785,2
1,0,0,1,722,Sawyer Heights Lofts Luxury Apartments,29.77593,-95.38417,1
2,0,0,1,904,Assembly At Historic Heights,29.780531,-95.399337,1
3,0,0,1,1161,HiLine Heights Apartments,29.772697,-95.398591,0
4,0,0,1,771,Elan Heights Apartments,29.780882,-95.382077,1
5,0,1,0,1144,AFC Urgent Care Washington Heights,29.77341,-95.399092,0
6,0,0,1,1088,Fisher Homes,29.786288,-95.398623,0
7,1,0,0,871,Harvard Elementary School,29.785173,-95.396775,1
8,0,1,0,933,Today's Vision Sawyer Heights,29.774097,-95.383139,0
9,0,0,1,1235,Alta West End,29.774656,-95.401201,0


In [19]:
houston_cluster_venues_nc3[houston_cluster_venues_nc3['Cluster'] == 0]


Unnamed: 0,Elementary School,Hospital,Residential Building (Apartment / Condo),Distance,Name,Latitude,Longitude,Cluster
3,0,0,1,1161,HiLine Heights Apartments,29.772697,-95.398591,0
5,0,1,0,1144,AFC Urgent Care Washington Heights,29.77341,-95.399092,0
6,0,0,1,1088,Fisher Homes,29.786288,-95.398623,0
8,0,1,0,933,Today's Vision Sawyer Heights,29.774097,-95.383139,0
9,0,0,1,1235,Alta West End,29.774656,-95.401201,0
11,0,0,1,1172,Alta Heights Courtyard,29.77225,-95.398217,0
12,0,0,1,1106,Jojo & Rube's,29.786561,-95.398599,0
18,0,0,1,955,Alexan @6th Gym,29.782472,-95.399472,0
19,0,0,1,1126,Yale Street Lofts,29.786708,-95.398729,0
20,0,0,1,1014,Schniggas,29.785777,-95.39812,0


From the above cluster, Houston has very few hospitals and multiple residences, concentrated at distances of roughly 900-1250 m.

In [20]:
houston_cluster_venues_nc3[houston_cluster_venues_nc3['Cluster'] == 1]

Unnamed: 0,Elementary School,Hospital,Residential Building (Apartment / Condo),Distance,Name,Latitude,Longitude,Cluster
1,0,0,1,722,Sawyer Heights Lofts Luxury Apartments,29.77593,-95.38417,1
2,0,0,1,904,Assembly At Historic Heights,29.780531,-95.399337,1
4,0,0,1,771,Elan Heights Apartments,29.780882,-95.382077,1
7,1,0,0,871,Harvard Elementary School,29.785173,-95.396775,1
10,0,0,1,910,Yale @ 6th,29.78156,-95.399252,1
13,0,0,1,741,sawyer park parking garage,29.775327,-95.38453,1
14,0,0,1,706,Gant House,29.781348,-95.397143,1
15,0,0,1,659,Sawyer Heights Reflection Pond,29.776123,-95.384843,1
17,0,0,1,694,Sawyer Heights Mediation Garden,29.776211,-95.384287,1
22,0,0,1,871,alexan @6th gym,29.781771,-95.398784,1


From the above cluster, Houston has multiple residences and one school, but no hospitals, at distances of roughly 650-900 m.

In [21]:
houston_cluster_venues_nc3[houston_cluster_venues_nc3['Cluster'] == 2]

Unnamed: 0,Elementary School,Hospital,Residential Building (Apartment / Condo),Distance,Name,Latitude,Longitude,Cluster
0,0,0,1,271,Camden Heights Apartments,29.777659,-95.390785,2
16,0,0,1,317,1111 Studewood Place,29.782037,-95.3877,2


From the above cluster, Houston has residences only

In [22]:
# Draw folium map of Austin with clusters 3
m1 = draw_map(austin_cluster_venues_nc3, 30.303, -97.754, 3)
m1

In [23]:
# Draw map of Houston with clusters 3
m2 = draw_map(houston_cluster_venues_nc3, 29.78, -95.39, 3)
m2

In [24]:
# Encode and Cluster the fetched venues for Austin, Houston
# Number of Clusters used for Clustering is 4
austin_cluster_venues_nc4 = encode_and_cluster(df1, 4)
houston_cluster_venues_nc4 = encode_and_cluster(df2, 4)

In [25]:
austin_cluster_venues_nc4[austin_cluster_venues_nc4['Cluster'] == 0]

Unnamed: 0,Doctor's Office,Elementary School,High School,Hospital,Hospital Ward,Medical Center,Private School,Residential Building (Apartment / Condo),School,Distance,Name,Latitude,Longitude,Cluster
0,0,1,0,0,0,0,0,0,0,336,Brykerwoods Elementary,30.304727,-97.751121,0
2,0,0,0,1,0,0,0,0,0,431,Seton Medical Center Rm 526,30.303255,-97.749524,0
3,1,0,0,0,0,0,0,0,0,465,Snyder Dermatology,30.306006,-97.750636,0
12,0,0,1,0,0,0,0,0,0,558,St. Andrew's School,30.302686,-97.748203,0
24,0,0,0,0,0,0,0,1,0,517,The Worthington,30.307539,-97.752856,0


From the above cluster, Austin has a residence, a hospital, a doctor's office, and schools spread over distances of 300-600 m.

In [26]:
austin_cluster_venues_nc4[austin_cluster_venues_nc4['Cluster'] == 1]

Unnamed: 0,Doctor's Office,Elementary School,High School,Hospital,Hospital Ward,Medical Center,Private School,Residential Building (Apartment / Condo),School,Distance,Name,Latitude,Longitude,Cluster
1,0,0,0,0,0,1,0,0,0,833,Seton Medical Center Austin (Hospital),30.30481,-97.74559,1
5,0,0,0,0,1,0,0,0,0,824,Seton Medical Center ICU,30.304463,-97.74559,1
6,0,0,0,1,0,0,0,0,0,834,Seton Medical Center IMC,30.305062,-97.745647,1
8,0,0,0,1,0,0,0,0,0,891,SMCA PACU,30.305225,-97.74509,1
10,0,0,0,1,0,0,0,0,0,797,Seton Medical Center - Maternity Ward,30.305066,-97.746049,1
13,0,0,0,1,0,0,0,0,0,849,Seton Medical Center 6th Floor,30.305313,-97.745577,1
16,0,0,0,1,0,0,0,0,0,873,Seton Medical Center NICU,30.305053,-97.745228,1
18,0,0,0,0,0,0,0,0,1,879,ACE Academy,30.308226,-97.747133,1


From the above cluster, Austin has multiple hospitals and a school at distances of 800-900 m.

In [27]:
austin_cluster_venues_nc4[austin_cluster_venues_nc4['Cluster'] == 2]

Unnamed: 0,Doctor's Office,Elementary School,High School,Hospital,Hospital Ward,Medical Center,Private School,Residential Building (Apartment / Condo),School,Distance,Name,Latitude,Longitude,Cluster
7,0,0,0,1,0,0,0,0,0,954,SMCA PAT,30.305821,-97.74462,2
11,0,0,0,1,0,0,0,0,0,958,SMCA Pre-Op,30.305551,-97.744475,2
19,0,0,0,0,0,0,0,0,1,991,Child craft school,30.29845,-97.745129,2
20,0,0,0,0,0,0,0,1,0,1030,Minnieapolis,30.310788,-97.748211,2
21,0,0,0,0,0,0,0,1,0,1075,Pecan Square,30.297297,-97.744971,2
22,0,0,0,0,0,0,0,1,0,1079,Shoal Creek Park,30.311519,-97.748642,2
23,0,0,0,0,0,0,0,1,0,999,Richardson at Tarrytown,30.308037,-97.762604,2
25,0,0,0,0,0,0,0,1,0,1000,West University Place,30.297344,-97.745912,2


From the above cluster, Austin has multiple residences, a few hospitals, and a school at distances of 900-1100 m.

In [28]:
austin_cluster_venues_nc4[austin_cluster_venues_nc4['Cluster'] == 3]

Unnamed: 0,Doctor's Office,Elementary School,High School,Hospital,Hospital Ward,Medical Center,Private School,Residential Building (Apartment / Condo),School,Distance,Name,Latitude,Longitude,Cluster
4,0,0,0,1,0,0,0,0,0,745,SMCA Respiratory Dept.,30.303388,-97.746257,3
9,0,0,0,1,0,0,0,0,0,689,Bailey Square 4th Floor Waiting Room,30.302537,-97.746851,3
14,0,0,0,1,0,0,0,0,0,760,ARC Medical Park Tower (Obstetrics/gynecology),30.305577,-97.746673,3
15,1,0,0,0,0,0,0,0,0,763,After Hours Kids,30.308149,-97.748756,3
17,0,0,0,0,0,0,1,0,0,622,austin STEM academy,30.307552,-97.750246,3


From the above cluster, Austin has multiple hospitals and a school at distances of 600-800 m.

In [29]:
houston_cluster_venues_nc4[houston_cluster_venues_nc4['Cluster'] == 0]

Unnamed: 0,Elementary School,Hospital,Residential Building (Apartment / Condo),Distance,Name,Latitude,Longitude,Cluster
3,0,0,1,1161,HiLine Heights Apartments,29.772697,-95.398591,0
5,0,1,0,1144,AFC Urgent Care Washington Heights,29.77341,-95.399092,0
6,0,0,1,1088,Fisher Homes,29.786288,-95.398623,0
9,0,0,1,1235,Alta West End,29.774656,-95.401201,0
11,0,0,1,1172,Alta Heights Courtyard,29.77225,-95.398217,0
12,0,0,1,1106,Jojo & Rube's,29.786561,-95.398599,0
19,0,0,1,1126,Yale Street Lofts,29.786708,-95.398729,0
21,0,0,1,1121,Historic Heights Properties,29.787729,-95.39745,0
24,0,0,1,1039,Broadstone Park West,29.788599,-95.385801,0


From the above cluster, Houston has multiple residences and one urgent care clinic at distances of 1000 m and above.

In [30]:
houston_cluster_venues_nc4[houston_cluster_venues_nc4['Cluster'] == 1]

Unnamed: 0,Elementary School,Hospital,Residential Building (Apartment / Condo),Distance,Name,Latitude,Longitude,Cluster
1,0,0,1,722,Sawyer Heights Lofts Luxury Apartments,29.77593,-95.38417,1
4,0,0,1,771,Elan Heights Apartments,29.780882,-95.382077,1
13,0,0,1,741,sawyer park parking garage,29.775327,-95.38453,1
14,0,0,1,706,Gant House,29.781348,-95.397143,1
15,0,0,1,659,Sawyer Heights Reflection Pond,29.776123,-95.384843,1
17,0,0,1,694,Sawyer Heights Mediation Garden,29.776211,-95.384287,1
25,0,0,1,694,600 Heights,29.781503,-95.396979,1


From the above cluster, Houston has only residences, at distances of roughly 650-775 m.

In [31]:
houston_cluster_venues_nc4[houston_cluster_venues_nc4['Cluster'] == 2]

Unnamed: 0,Elementary School,Hospital,Residential Building (Apartment / Condo),Distance,Name,Latitude,Longitude,Cluster
0,0,0,1,271,Camden Heights Apartments,29.777659,-95.390785,2
16,0,0,1,317,1111 Studewood Place,29.782037,-95.3877,2


From the above cluster, Houston has only residences, at distances of 250-350 m.

In [32]:
houston_cluster_venues_nc4[houston_cluster_venues_nc4['Cluster'] == 3]

Unnamed: 0,Elementary School,Hospital,Residential Building (Apartment / Condo),Distance,Name,Latitude,Longitude,Cluster
2,0,0,1,904,Assembly At Historic Heights,29.780531,-95.399337,3
7,1,0,0,871,Harvard Elementary School,29.785173,-95.396775,3
8,0,1,0,933,Today's Vision Sawyer Heights,29.774097,-95.383139,3
10,0,0,1,910,Yale @ 6th,29.78156,-95.399252,3
18,0,0,1,955,Alexan @6th Gym,29.782472,-95.399472,3
20,0,0,1,1014,Schniggas,29.785777,-95.39812,3
22,0,0,1,871,alexan @6th gym,29.781771,-95.398784,3
23,0,0,1,1004,Caldareras Rentals,29.776318,-95.39949,3


From the above cluster, Houston has multiple residences, a hospital, and a school at distances of 850-1100 m.

In [33]:
# Draw map of Austin with clusters 4
m3 = draw_map(austin_cluster_venues_nc4, 30.303, -97.754, 4)
m3

In [34]:
# Draw map of Houston with clusters 4
m4 = draw_map(houston_cluster_venues_nc4, 29.78, -95.39, 4)
m4

Based on the above maps and clusters, Austin is the recommended city for residential living. Its venues combine all three categories (schools, hospitals, and residences), although hospitals are somewhat over-represented, whereas Houston's venues are almost entirely residential buildings.
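
The comparison can be backed by a simple tally. The sketch below uses category counts tallied by hand from the 26 rows per city (index 0 to 25) that appear across the cluster tables above. Grouping the categories into medical, school, and residential is an assumption made here for illustration, and `group_of` and `category_mix` are hypothetical helpers.

```python
# Category counts per city, tallied from the cluster tables in this notebook.
austin = {
    "Hospital": 11, "Doctor's Office": 2, "Medical Center": 1, "Hospital Ward": 1,
    "Elementary School": 1, "High School": 1, "Private School": 1, "School": 2,
    "Residential Building (Apartment / Condo)": 6,
}
houston = {
    "Elementary School": 1, "Hospital": 2,
    "Residential Building (Apartment / Condo)": 23,
}


def group_of(category):
    """Map a venue category to a broad group (assumed grouping)."""
    if "School" in category:
        return "school"
    if "Residential" in category:
        return "residential"
    return "medical"


def category_mix(counts):
    """Sum the venue counts per broad group."""
    mix = {"medical": 0, "school": 0, "residential": 0}
    for category, n in counts.items():
        mix[group_of(category)] += n
    return mix


austin_mix = category_mix(austin)
houston_mix = category_mix(houston)
print(austin_mix)    # {'medical': 15, 'school': 5, 'residential': 6}
print(houston_mix)   # {'medical': 2, 'school': 1, 'residential': 23}
```

On these counts, residential buildings make up 6 of 26 venues in the Austin sample and 23 of 26 in the Houston sample, which matches the reading of the maps.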