# Pocket Parks in Memphis

### Objective: 

The goal of this project is to characterize Memphis neighborhoods' access to outdoor recreational features and community centers in order to identify priority locations for pocket parks.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Download Neighborhood and Zip Code information</a>

2. <a href="#item2">Explore Memphis Neighborhoods</a>

3. <a href="#item3">Where are the Parks?</a>

4. <a href="#item4">Where are the Community Centers?</a>

5. <a href="#item5">Cluster Neighborhoods that have no Parks - KMeans</a>

6. <a href="#item6">Cluster Neighborhoods that have no Parks - DBSCAN</a>    
</font>
</div>

In [1]:
! pip install beautifulsoup4
!pip install geocoder
!pip install folium
#Import the necessary packages for web scraping, geocoding, mapping, clustering
import pandas as pd
pd.set_option('display.max_colwidth', None) #Show entire column (None replaces the deprecated -1)
import numpy as np
import requests

from bs4 import BeautifulSoup

import geocoder

import folium as fm
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

from pandas import json_normalize # transform JSON file into a pandas dataframe

import json # library to handle JSON files

# import k-means from clustering stage
from sklearn.cluster import KMeans
#import DBSCAN for second clustering
from sklearn.cluster import DBSCAN
import sklearn.utils
from sklearn.preprocessing import StandardScaler


print('Packages Downloaded')



Packages Downloaded


### 1. Download Neighborhood and Zip Code information

<a id='item1'></a>

Tripsavvy.com lists Memphis zip codes and their associated neighborhood(s). The following section uses the Beautiful Soup package to scrape this information and store it in a dataframe. The geocoder package then determines the coordinates for each zip code.

In [2]:
#Use the requests package to fetch the website and use the html5lib to parse the html. 
url = requests.get("https://www.tripsavvy.com/shelby-county-zip-codes-2321917").text
soup = BeautifulSoup(url, 'html5lib')
#print(soup.prettify())

In [3]:
#Identify the section of the page that lists the Memphis neighborhoods and their zip codes
list_section = soup.find('div',class_="comp text-passage mntl-sc-block travel-sc-block-html mntl-sc-block-html",id="mntl-sc-block_1-0-5")
#print(list_section.prettify())

In [4]:
#Each line item is a different zip code - neighborhood.
#The loop should pull out each pair and format it correctly. 

mem = [] #initialize list
for item in list_section.find_all('li'):
    zip_raw = item.text #Extract the zip code/neighborhood
    
    zipCode = int(zip_raw.split('-', maxsplit = 1)[0]) #Extract the zipcode and make it an integer

    neigh = zip_raw.split('-', maxsplit = 1)[1] #Extract the neighborhood
    neigh = neigh.replace(", including", ":").replace(" and", ",").lstrip() #Clean the neighborhood name and remove the first character (which is a space) 
    #print('The zip is', zipCode, 'and the neighborhood is', neigh)
    mem.append((zipCode, neigh))

#Create a pandas dataframe

mem = pd.DataFrame(mem, columns=('Zip Code', 'Neighborhood'))

#Remove the airport and some other neighborhoods that are farther away from the rest of the data

mem = mem[~mem['Neighborhood'].isin(['Airport', 'Atoka','Arlington','Bartlett, Ellendale', 'Collierville', 'Eads', 'East Hickory Hill', 'Hickory Hill','Whitehaven', 'Millington'])]
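The loop above assumes every list item has the shape "zip - neighborhood". As a minimal sketch of the same parsing in a standalone function that skips malformed items instead of raising, consider the following. The name `parse_zip_line` and the sample strings are hypothetical and only mimic the shape the loop expects; the sketch also strips whitespace on both sides rather than only the left.

```python
def parse_zip_line(text):
    """Split a 'ZIP - Neighborhood' string into (int zip, cleaned name).

    Returns None when the text does not match the expected shape.
    """
    parts = text.split('-', maxsplit=1)
    if len(parts) != 2:
        return None
    zip_part, neigh = parts[0].strip(), parts[1]
    if not zip_part.isdigit():
        return None
    # Same clean-up as the notebook loop, but stripping both ends
    neigh = neigh.replace(", including", ":").replace(" and", ",").strip()
    return int(zip_part), neigh


# Made-up sample strings (assumption: they resemble the scraped list items)
samples = [
    "38104 - Midtown Memphis",
    "38103 - Downtown Memphis, including South Main and Beale Street",
    "Not a zip line",
]
parsed = [p for p in map(parse_zip_line, samples) if p is not None]
print(parsed)
```

The resulting list of tuples can be passed to `pd.DataFrame(parsed, columns=('Zip Code', 'Neighborhood'))` exactly as in the cell above.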

In [5]:
#Loop until all the zip code coordinates are retrieved
zips = mem['Zip Code']
g=[] #Initialize a place to put the coordinates
for z in zips: #Using a for loop seems to save from issues with how many are being retrieved at one time.
    print(z) #printing out the postcode as it happens is a way to see when and if the process gets stuck. 
    geo = None #initialize the variable for the while loop
    while(geo is None):
        geo = geocoder.arcgis('{}, Memphis, Tennessee'.format(z)).latlng #save the latlng; the while loop makes sure it is NOT None before moving on. 
    g.append(geo)

38016
38018
38103
38104
38105
38106
38107
38108
38111
38112
38114
38117
38119
38120
38122
38126
38127
38128
38133
38134
38138
38139
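The `while` loop above retries forever if the geocoding service never returns a result. A minimal sketch of a bounded retry is shown here; `geocode_with_retry` and `fake_lookup` are hypothetical names, and the fake lookup stands in for `geocoder.arcgis(...).latlng` so the example runs without network access.

```python
import time


def geocode_with_retry(lookup, query, max_tries=5, delay=0.0):
    """Call lookup(query) until it returns a non-None value or max_tries is reached."""
    for _ in range(max_tries):
        result = lookup(query)
        if result is not None:
            return result
        time.sleep(delay)  # optional pause between attempts
    return None  # caller decides how to handle a zip code that never resolves


# Stand-in for the real geocoder: fails twice, then succeeds.
calls = {"n": 0}

def fake_lookup(query):
    calls["n"] += 1
    return None if calls["n"] < 3 else [35.135665, -90.006235]


coords = geocode_with_retry(fake_lookup, "38104, Memphis, Tennessee")
print(coords, "after", calls["n"], "attempts")
```

In the notebook, the lookup could be passed as `lambda q: geocoder.arcgis(q).latlng`, keeping the rest of the loop unchanged.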


In [6]:
#g #Is a list of sublists. Want to extract the first element of each for latitude and the second for longitude. 

lat = [item[0] for item in g] #Provides list of latitudes. 
long = [item[1] for item in g] #Provides list of longitudes.

mem =mem.assign(Latitude = lat, Longitude = long)
mem.head()


Unnamed: 0,Zip Code,Neighborhood,Latitude,Longitude
2,38016,Cordova,35.197533,-89.728039
4,38018,Cordova,35.139279,-89.80119
7,38103,"Downtown Memphis: South Main, South Bluffs, Beale Street",35.171334,-90.05133
8,38104,Midtown Memphis,35.135665,-90.006235
9,38105,Downtown Memphis: The Pinch District,35.15189,-90.03505


<a id='item2'></a>

### 2. Explore Memphis Neighborhoods

We will map the Memphis neighborhoods using Folium and use Foursquare to acquire the venues in each area. Folium's `Circle` takes its radius in meters, so it shows the search radius around each location.

In [7]:
#Retrieve the latitude and longitude for Memphis and create a base map.

memCoord = geocoder.arcgis('Memphis, Tennessee').latlng 
m_lat = memCoord[0]
m_long = memCoord[1]
print('latitude', m_lat, '\nlongitude', m_long)

latitude 35.14976000000007 
longitude -90.04924999999997


In [8]:
#Create a map of Memphis using the lat and long values and then superimpose the zip code neighborhoods on top.
#The map is named mem_map so that it does not shadow Python's built-in map().
mem_map = fm.Map(location=[m_lat, m_long], zoom_start = 10, control_scale = True)

for lt, lng, zp, neigh in zip(mem['Latitude'], mem['Longitude'], mem['Zip Code'],mem['Neighborhood']):
    label = '{}, {}'.format(neigh, zp)
    label = fm.Popup(label, parse_html=True)
    fm.Circle([lt,lng], #Circle radius is in meters
                    radius = 1500, 
                    popup = label,
                    color = 'orange',
                    fill=False,
                    parse_html=False).add_to(mem_map)
mem_map

A radius of 1,500 m seems appropriate, especially for the zip code neighborhoods within the I-240 loop. A larger radius may suit the suburban zip codes better, but it would cause too much overlap among the inner zip code neighborhoods. Further, 1,500 m is a reasonable walking distance.
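One way to check the overlap concern is to compare the great-circle distance between two zip code centroids with twice the radius. The sketch below uses the haversine formula with a mean Earth radius of 6,371 km (an assumption) and the coordinates of 38104 and 38105 from the table above; `haversine_m` is a hypothetical helper, not part of the notebook.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, earth_radius_m=6371000.0):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


RADIUS_M = 1500
midtown = (35.135665, -90.006235)  # 38104, from the table above
pinch = (35.15189, -90.03505)      # 38105, from the table above

d = haversine_m(*midtown, *pinch)
# Two circles of equal radius overlap when their centers are closer than 2 * radius
circles_overlap = d < 2 * RADIUS_M
print(round(d), "m apart; overlap:", circles_overlap)
```

Running the same check over every pair of centroids would show which radius values start to merge the inner zip codes.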

Zip Code is kept in the dataset because some areas (e.g., Cordova and East Memphis) are large enough geographically that the same neighborhood name spans multiple zip codes.

In [9]:
# The code was removed by Watson Studio for sharing.
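The hidden cell presumably defined `CLIENT_ID`, `CLIENT_SECRET`, and `VERSION`, which the request URL in the next cell relies on. A minimal sketch of loading such values from environment variables, so that they never appear in a shared notebook, might look like this. The variable names (`FOURSQUARE_CLIENT_ID` and so on) and the helper are assumptions, and the demo values are placeholders.

```python
import os


def load_foursquare_credentials(env=None):
    """Return (client_id, client_secret, version); raise KeyError if any is missing."""
    env = os.environ if env is None else env
    keys = ("FOURSQUARE_CLIENT_ID", "FOURSQUARE_CLIENT_SECRET", "FOURSQUARE_VERSION")
    missing = [k for k in keys if not env.get(k)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[k] for k in keys)


# A plain dict stands in for os.environ here; all values are placeholders.
demo_env = {
    "FOURSQUARE_CLIENT_ID": "your-client-id",
    "FOURSQUARE_CLIENT_SECRET": "your-client-secret",
    "FOURSQUARE_VERSION": "20200101",  # the v parameter is a YYYYMMDD date string
}
demo_id, demo_secret, demo_version = load_foursquare_credentials(demo_env)
print(demo_version)
```

In the notebook itself, calling `load_foursquare_credentials()` with no argument would read from the real environment and the result could be unpacked into `CLIENT_ID, CLIENT_SECRET, VERSION`.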

In [10]:
LIMIT = 500
RADIUS = 1500
def getNearbyVenues(names,zipCodes, latitudes, longitudes, limit = LIMIT, radius=RADIUS):
    
    venues_list=[]
    for name, zipCode, lat, lng in zip(names,zipCodes, latitudes, longitudes):
        #print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            limit)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name,
            zipCode,
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                    'Zip Code',
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

In [11]:
mem_venues = getNearbyVenues(names=mem['Neighborhood'], 
                             zipCodes =mem['Zip Code'],
                            latitudes = mem['Latitude'],
                            longitudes = mem['Longitude'])

mem_venues=mem_venues[mem_venues['Venue Category'] != 'Neighborhood'] #Remove instances where the venue category is neighborhood
mem_venues.head()

Unnamed: 0,Neighborhood,Zip Code,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Cordova,38016,35.197533,-89.728039,T J Mulligans,35.202129,-89.73183,Bar
1,Cordova,38016,35.197533,-89.728039,Baskin-Robbins,35.205247,-89.732022,Ice Cream Shop
2,Cordova,38016,35.197533,-89.728039,CVS pharmacy,35.205041,-89.733405,Pharmacy
3,Cordova,38016,35.197533,-89.728039,Red Fish Sushi Asian Bistro,35.205249,-89.73241,Sushi Restaurant
4,Cordova,38016,35.197533,-89.728039,Dollar General,35.203925,-89.73262,Discount Store


In [12]:
#Group the venues by neighborhood/Zip Code

mem_venues.groupby(['Neighborhood', 'Zip Code']).count()

Neighborhood,Zip Code,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Bartlett,38133,93,93,93,93,93,93
Bartlett,38134,41,41,41,41,41,41
Bellevue/McLemore (South Midtown),38106,15,15,15,15,15,15
Berclair,38122,24,24,24,24,24,24
Cordova,38016,23,23,23,23,23,23
Cordova,38018,58,58,58,58,58,58
"Downtown Memphis: South Main, South Bluffs, Beale Street",38103,21,21,21,21,21,21
Downtown Memphis: The Pinch District,38105,58,58,58,58,58,58
East Memphis,38120,7,7,7,7,7,7
"East Memphis, Fountain Square, Kirby Trace",38119,44,44,44,44,44,44


It's important to note that some neighborhoods are much better represented on Foursquare than others. For example, Frayser is a large community in Memphis, but it has only one venue listed in its zip code, which suggests its residents use Foursquare infrequently. Midtown, by contrast, covers a much smaller geographical area but is a popular neighborhood with 100 venues listed.

In [13]:
print('There are {} unique venue categories.'.format(len(mem_venues['Venue Category'].unique())))

There are 198 unique venue categories.


In [14]:
# one hot encoding
mem_hot = pd.get_dummies(mem_venues[['Venue Category']], prefix= "", prefix_sep="")

#Add Neighborhood column back to the dataframe and place it first.
mem_hot.insert(0,'Neighborhood', mem_venues['Neighborhood'], True)
mem_hot.insert(1,'Zip Code',mem_venues['Zip Code'],True)
#print(mem_hot.shape)

#Create a dataframe that has the mean frequency of each category and show the top 10 venues in each zipcode region
mem_mean = mem_hot.groupby(['Neighborhood', 'Zip Code']).mean().reset_index() 

In [15]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[2:] #Changed this to 2 since I want to exclude the zip code
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [16]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood','Zip Code']

for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = mem_mean['Neighborhood']
neighborhoods_venues_sorted['Zip Code'] = mem_mean['Zip Code']

for ind in np.arange(mem_mean.shape[0]):
    neighborhoods_venues_sorted.iloc[ind,2:] = return_most_common_venues(mem_mean.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,Zip Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bartlett,38133,American Restaurant,Clothing Store,Shoe Store,Department Store,Cosmetics Shop,Sandwich Place,Mexican Restaurant,Fast Food Restaurant,Furniture / Home Store,Pizza Place
1,Bartlett,38134,Gas Station,Fast Food Restaurant,American Restaurant,Hotel,Discount Store,Breakfast Spot,Donut Shop,Mobile Phone Shop,Fried Chicken Joint,Mediterranean Restaurant
2,Bellevue/McLemore (South Midtown),38106,Food,Climbing Gym,Fast Food Restaurant,Gas Station,Electronics Store,Music Venue,Park,Rental Car Location,Italian Restaurant,Museum
3,Berclair,38122,Mexican Restaurant,Pizza Place,Discount Store,Mobile Phone Shop,Clothing Store,Fried Chicken Joint,Food & Drink Shop,Food,Fast Food Restaurant,Restaurant
4,Cordova,38016,Fast Food Restaurant,Pizza Place,Pharmacy,Gym,Ice Cream Shop,Residential Building (Apartment / Condo),Mexican Restaurant,Breakfast Spot,Liquor Store,Sushi Restaurant
5,Cordova,38018,Mexican Restaurant,Sandwich Place,Gas Station,Burger Joint,Pharmacy,BBQ Joint,Cosmetics Shop,Sporting Goods Shop,Gym,Pub
6,"Downtown Memphis: South Main, South Bluffs, Beale Street",38103,Harbor / Marina,American Restaurant,Coffee Shop,Hotel,Gym,Grocery Store,Light Rail Station,Fried Chicken Joint,Nail Salon,Eastern European Restaurant
7,Downtown Memphis: The Pinch District,38105,Sandwich Place,Coffee Shop,Hotel,Bar,Pizza Place,Fried Chicken Joint,Rental Car Location,Fast Food Restaurant,Southern / Soul Food Restaurant,Food Court
8,East Memphis,38120,Trail,Opera House,Home Service,Paintball Field,Stables,Zoo,Food Truck,Food Court,Food & Drink Shop,Food
9,"East Memphis, Fountain Square, Kirby Trace",38119,Mexican Restaurant,Pizza Place,Sandwich Place,Convenience Store,Video Store,Pharmacy,Fast Food Restaurant,Gas Station,Liquor Store,Ice Cream Shop
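Neighborhoods with few venues still receive ten ranked columns in the table above. East Memphis (38120), for example, has only 7 venues, so its trailing entries are presumably zero-frequency ties rather than real venues. A minimal sketch of a ranking that only returns categories actually present is given below; `top_categories` and the toy category list are hypothetical. The share it computes is the same quantity as the mean of a one-hot column within a neighborhood.

```python
from collections import Counter


def top_categories(categories, num_top=10):
    """Return up to num_top (category, share) pairs, most frequent first.

    Categories that never occur are simply absent, so no zero-frequency
    filler can appear in the ranking.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    return [(cat, n / total) for cat, n in counts.most_common(num_top)]


# Toy venue categories for a single neighborhood (illustrative only)
toy = ["Trail", "Trail", "Stables", "Opera House"]
ranked = top_categories(toy)
print(ranked)
```

A neighborhood with three distinct categories yields three entries, which makes sparsely covered zip codes easy to spot.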


<a id='item3'></a>

### 3. Where are the Parks?

We want to isolate where there are and are not parks or outdoor recreational spaces to characterize what may be priority neighborhoods for parks. Various Foursquare venue categories have been included to be broad. 

In [17]:
mem_parks = mem_venues[mem_venues['Venue Category'].isin(['Park', 'Trail','Scenic lookout', 'Outdoors & Rec', 'Garden', 'Dog Run'])]


Orange markers represent the zip codes/neighborhoods of interest. Green markers represent zip codes that have at least one outdoor recreational feature within the Foursquare dataset.  

Important note: Foursquare does *NOT* have all the outdoor features in Memphis, which is a limitation of using this data set. 

In [18]:
#Create a map of Memphis using the lat and long values and then superimpose the zipcode neighbourhoods on top. 
park_map = fm.Map(location=[m_lat, m_long], zoom_start = 10, control_scale = True)


for lt, lng, zp, neigh in zip(mem['Latitude'], mem['Longitude'], mem['Zip Code'],mem['Neighborhood']):
    label = '{}, {}'.format(neigh, zp)
    label = fm.Popup(label, parse_html=True)
    fm.CircleMarker([lt,lng],
                    radius = 5, 
                    popup = label,
                    color = 'orange',
                    fill=False,
                    parse_html=False).add_to(park_map)
for lt, lng, nm, vc in zip(mem_parks['Neighborhood Latitude'], mem_parks['Neighborhood Longitude'], mem_parks['Venue'],mem_parks['Venue Category']):
    label = '{}, {}'.format(nm, vc) 
    label = fm.Popup(label, parse_html=True)
    fm.CircleMarker([lt,lng],
                    radius = 2, 
                    popup = label,
                    color = 'green',
                    fill=False ,
                    parse_html=False).add_to(park_map)   
park_map

#### Where are there NO parks?

In [19]:
noPark = mem_venues[~mem_venues['Zip Code'].isin(mem_parks['Zip Code'])]


for item in noPark['Neighborhood'].unique():
    print(item)

Cordova
Midtown Memphis
North Memphis: Snowden, New Chicago 
Kingsbury
University of Memphis area, Colonial Yorkshire, in East Memphis
Orange Mound
East Memphis, Fountain Square, Kirby Trace
Berclair
Frayser
Raleigh
Bartlett


<a id='item4'></a>

### 4. Where are the Community Centers?

Community centers may also serve many of the functions of parks (although they may lack green space). If we know where there are community centers, that may help to select neighborhoods that need pocket parks more than others.

Community center data is acquired from a local government website. Only the community center name and its zip code are of interest, and these become a dataframe. We keep only the zip codes that were retained in our dataset of neighborhoods, so each community center can be associated with a zip code and its coordinates.

In [20]:
!wget -q -O 'hh7a-g7mu.json' https://data.memphistn.gov/resource/hh7a-g7mu.json
print('Data downloaded!')

Data downloaded!


In [21]:
with open('hh7a-g7mu.json') as json_data:
    mem_data = json.load(json_data) #mem_data is a list with dictionaries. 

#mem_data[1].keys()

In [22]:
#Collect each community center and its zip code, then build the dataframe in one step
#(DataFrame.append was removed in pandas 2.0)
records = []
for data in mem_data:
    records.append({'Community Center': data['community_'],
                    'Zip Code': int(data['zip'])})
comm_cen = pd.DataFrame(records, columns=['Community Center', 'Zip Code'])
#Retain those zip codes which are in our dataset of interest. 
memZips =mem['Zip Code'].tolist()
comm_cen=comm_cen[comm_cen['Zip Code'].isin(memZips)]

#Get coordinates for the zip codes. 
#Loop until all the coordinates are retrieved for the community center
zips = comm_cen['Zip Code']
g=[] #Initialize a place to put the coordinates
for z in zips: #Using a for loop seems to save from issues with how many are being retrieved at one time.
    print(z) #printing out the postcode as it happens is a way to see when and if the process gets stuck. 
    geo = None #initialize the variable for the while loop
    while(geo is None):
        geo = geocoder.arcgis('{}, Memphis, Tennessee'.format(z)).latlng #save the latlng; the while loop makes sure it is NOT None before moving on. 
    g.append(geo)
    
#g #Is a list of sublists. Want to extract the first element of each for latitude and the second for longitude. 

lat = [item[0] for item in g] #Provides list of latitudes. 
long = [item[1] for item in g] #Provides list of longitudes.

comm_cen =comm_cen.assign(Latitude = lat, Longitude = long)
comm_cen.head()

38018
38107
38114
38117
38128
38106
38122
38107
38127
38106
38114
38111
38112
38108
38128
38127
38108
38107


Unnamed: 0,Community Center,Zip Code,Latitude,Longitude
0,Bert Ferguson Community Center,38018,35.139279,-89.80119
1,Kate Sexton Community Center,38107,35.1747,-90.02907
2,Glenview Community Center,38114,35.098935,-89.98349
3,Marion Hale Community Center,38117,35.111245,-89.906825
5,Cunningham Community Center,38128,35.224585,-89.919445


In [23]:
#Get neighborhood names for community centers from Memphis
comm_cen=pd.merge(left=comm_cen, right=mem[['Neighborhood', 'Zip Code']], left_on='Zip Code', right_on = 'Zip Code')
comm_cen.groupby(['Neighborhood','Zip Code']).count().reset_index()


Unnamed: 0,Neighborhood,Zip Code,Community Center,Latitude,Longitude
0,Bellevue/McLemore (South Midtown),38106,2,2,2
1,Berclair,38122,1,1,1
2,Cordova,38018,1,1,1
3,"East Memphis, Laurelwood",38117,1,1,1
4,Frayser,38127,2,2,2
5,"Highland Heights, Hollywood-Jackson, Evergreen, Overton Square, Binghampton",38112,1,1,1
6,Kingsbury,38108,2,2,2
7,"North Memphis: Snowden, New Chicago",38107,3,3,3
8,Orange Mound,38114,2,2,2
9,Raleigh,38128,2,2,2


Now we are ready to map the neighborhoods (orange), those with parks (green), and those with community centers (purple).

In [24]:
#Add the community centers to the park map from earlier
for lt, lng, cc in zip(comm_cen['Latitude'], comm_cen['Longitude'], comm_cen['Community Center']):
    label = '{}'.format(cc)
    label = fm.Popup(label, parse_html=True)
    fm.CircleMarker([lt,lng],
                    radius = 8, 
                    popup = label,
                    color = 'purple',
                    fill=False ,
                    parse_html=False).add_to(park_map)   
park_map

#### Where are there NO parks OR Community Centers? 

In [25]:
nopark_CC= noPark[~noPark['Zip Code'].isin(comm_cen['Zip Code'])]

for item in nopark_CC['Neighborhood'].unique():
    print(item)

Cordova
Midtown Memphis
East Memphis, Fountain Square, Kirby Trace
Bartlett
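The two filters used so far amount to a set difference on zip codes. The sketch below replays that logic with plain Python sets, using the no-park zip codes and the community center zip codes that appear in the outputs of this notebook; it is a cross-check of the reasoning, not a replacement for the dataframe code.

```python
# Zip codes without a Foursquare park, as they appear in the notebook outputs
no_park_zips = {38016, 38104, 38107, 38108, 38111, 38114,
                38119, 38122, 38127, 38128, 38133}

# Zip codes printed while geocoding the community centers (duplicates collapse in a set)
community_center_zips = {38018, 38107, 38114, 38117, 38128, 38106,
                         38122, 38127, 38111, 38112, 38108}

# Zip codes with neither a park nor a community center
neither = no_park_zips - community_center_zips
print(sorted(neither))
```

The four remaining zip codes correspond to Cordova, Midtown Memphis, East Memphis (Fountain Square, Kirby Trace), and Bartlett, matching the list printed above.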


<a id='item5'></a>

### 5. Cluster Neighborhoods that have no Parks - KMeans

It seems apparent that Cordova, Midtown, East Memphis, and Bartlett may need a pocket park more than the other neighborhoods, since they lack parks (according to Foursquare) and also do not have a community center (according to the government data).

Those may be the top priority, but clustering the 11 neighborhoods that don't have a park, regardless of community center status, may provide more insight into the needs of the community.
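The number of clusters used in the next cell is fixed by hand. One common way to sanity-check such a choice is to compare the K-Means inertia (within-cluster sum of squares) across several values of k and look for the point where the improvement levels off. The sketch below does this on synthetic data, which is an assumption made only for illustration; it does not use the notebook's venue frequencies.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data: three tight, well-separated groups of 10 points each
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.1 * rng.standard_normal((10, 2)) for c in centers])

# Fit K-Means for several k and record the inertia of each fit
inertias = {}
for k in range(1, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 3))
```

With only 11 neighborhoods, k cannot exceed the number of rows, and a cluster containing a single neighborhood mostly signals sparse Foursquare coverage rather than a distinct neighborhood type.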

In [27]:
#Separate the communities that don't have parks from those that do from the one hot encoding. 
noPark_hot = mem_hot[~mem_venues['Zip Code'].isin(mem_parks['Zip Code'])]
noPark_grouped = noPark_hot.groupby(['Neighborhood', 'Zip Code']).mean().reset_index() #Mean frequency of each venue category

#Set number of clusters
kclusters = 4

noPark_grouped_clustering = noPark_grouped.drop(['Neighborhood', 'Zip Code'], axis=1)

#Run k-means clustering

kmeans = KMeans(n_clusters = kclusters, random_state=0).fit(noPark_grouped_clustering)

kmeans.labels_[0:10]

noPark_venues_sorted=neighborhoods_venues_sorted[neighborhoods_venues_sorted['Zip Code'].isin(noPark_hot['Zip Code'])].reset_index(drop=True)
#print(noPark_venues_sorted.shape)

#Add clustering labels
noPark_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

noPark_merged = mem[mem['Zip Code'].isin(noPark['Zip Code'])]

#merge no Park neighborhood, zip code, and coordinates data with the sorted venues
noPark_merged = noPark_merged.join(noPark_venues_sorted.set_index(["Neighborhood", 'Zip Code']), on=["Neighborhood", 'Zip Code']).reset_index(drop=True)

noPark_merged

Unnamed: 0,Zip Code,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,38016,Cordova,35.197533,-89.728039,0,Fast Food Restaurant,Pizza Place,Pharmacy,Gym,Ice Cream Shop,Residential Building (Apartment / Condo),Mexican Restaurant,Breakfast Spot,Liquor Store,Sushi Restaurant
1,38104,Midtown Memphis,35.135665,-90.006235,0,American Restaurant,Burger Joint,Music Venue,Pizza Place,Bar,Mexican Restaurant,Sandwich Place,Vietnamese Restaurant,Coffee Shop,Theater
2,38107,"North Memphis: Snowden, New Chicago",35.1747,-90.02907,2,Diner,Home Service,Golf Course,Bar,Basketball Court,Fried Chicken Joint,Discount Store,Donut Shop,French Restaurant,Food Truck
3,38108,Kingsbury,35.17299,-89.978815,3,Liquor Store,Fish Market,Salon / Barbershop,Gym,Locksmith,Fried Chicken Joint,Zoo,Food Truck,Food Court,Food & Drink Shop
4,38111,"University of Memphis area, Colonial Yorkshire, in East Memphis",35.10931,-89.948325,0,Bar,Coffee Shop,Fried Chicken Joint,Pizza Place,Discount Store,Fast Food Restaurant,Sandwich Place,Chinese Restaurant,Video Store,Restaurant
5,38114,Orange Mound,35.098935,-89.98349,0,Fast Food Restaurant,Discount Store,Convenience Store,Fried Chicken Joint,Shoe Store,Wings Joint,Pizza Place,Grocery Store,Sandwich Place,Furniture / Home Store
6,38119,"East Memphis, Fountain Square, Kirby Trace",35.07857,-89.851095,0,Mexican Restaurant,Pizza Place,Sandwich Place,Convenience Store,Video Store,Pharmacy,Fast Food Restaurant,Gas Station,Liquor Store,Ice Cream Shop
7,38122,Berclair,35.162075,-89.91908,0,Mexican Restaurant,Pizza Place,Discount Store,Mobile Phone Shop,Clothing Store,Fried Chicken Joint,Food & Drink Shop,Food,Fast Food Restaurant,Restaurant
8,38127,Frayser,35.231135,-90.058948,1,Disc Golf,Furniture / Home Store,Fried Chicken Joint,French Restaurant,Food Truck,Food Court,Food & Drink Shop,Food,Flower Shop,Flea Market
9,38128,Raleigh,35.224585,-89.919445,0,Fast Food Restaurant,Discount Store,Pharmacy,Fried Chicken Joint,Wings Joint,Department Store,Convenience Store,Sandwich Place,Food,Smoothie Shop


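It appears that communities in cluster 0, the most common cluster, have many restaurants. Cluster 1 contains only Frayser, whose single venue is a disc golf course, which makes it unique. Cluster 2 (North Memphis) includes a golf course, which is not a free community amenity. Cluster 3 (Kingsbury) lists a gym and a zoo, which is free on Tuesdays for Tennessee residents. Note, however, that Kingsbury has only 6 venues, so the categories ranked seventh and lower (including the zoo) likely have zero frequency and reflect tie ordering rather than real venues.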

In [28]:
#Map the Clusters

#Set up color scheme for clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

#Add markers to map
markers_colors = []
for lt, lng, zp, cluster in zip(noPark_merged['Latitude'], noPark_merged['Longitude'], noPark_merged['Zip Code'], noPark_merged['Cluster Labels']):
    label = fm.Popup(str(zp) + ' Cluster ' + str(cluster), parse_html=True)
    fm.CircleMarker(
        [lt, lng],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(park_map)
park_map

Cluster 2 (teal) is not characterized by restaurants, which are natural 'third spaces' for people to meet. Although it has a community center, it has no park, so it may be a good candidate for a pocket park because community engagement options are limited (few restaurants). The same may be said of cluster 1 (purple). Midtown, identified earlier as lacking both parks and a community center, is part of cluster 0 (red) because it has many restaurants, like most of the other neighborhoods without parks.

<a id='item6'></a>

### 6. Cluster Neighborhoods that have no Parks - DBSCAN

Another way to visualize where appropriate pocket parks may be located would be using DBSCAN to group those neighborhoods that are geographically closer.
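When DBSCAN is run on raw longitude and latitude, `eps` is measured in degrees, and a degree of longitude is shorter than a degree of latitude at Memphis's latitude. As a hedged alternative, scikit-learn's DBSCAN accepts `metric="haversine"`, which expects `[latitude, longitude]` in radians and an `eps` expressed in radians (distance divided by the Earth's radius). The sketch below uses made-up points and a 1,500 m threshold; it illustrates the option and is not what the notebook's own cell does.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (assumption)
eps_m = 1500.0              # neighborhood radius in meters

# Hypothetical venue coordinates in degrees: [latitude, longitude]
pts_deg = np.array([
    [35.1350, -90.0060], [35.1355, -90.0065], [35.1360, -90.0055],  # group A
    [35.1975, -89.7280], [35.1980, -89.7285], [35.1970, -89.7275],  # group B
    [35.3000, -90.3000],                                            # isolated point
])

db = DBSCAN(eps=eps_m / EARTH_RADIUS_M, min_samples=3,
            metric="haversine", algorithm="ball_tree").fit(np.radians(pts_deg))
labels = db.labels_
print(labels)
```

Expressing `eps` in meters makes the threshold directly comparable with the 1,500 m search radius used for the Foursquare queries.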

In [29]:
#DBSCAN is deterministic for a fixed input order, so no random seed is needed.

noPark_map = noPark.copy() #Work on a copy so that adding a column does not trigger SettingWithCopyWarning
Clus_dataSet = noPark_map[['Venue Longitude','Venue Latitude']]

# Compute DBSCAN (eps is in degrees because the inputs are raw longitude/latitude)
db = DBSCAN(eps=0.03, min_samples=5).fit(Clus_dataSet)
core_samples_mask = np.zeros_like(db.labels_, dtype=bool)
core_samples_mask[db.core_sample_indices_] = True
labels = db.labels_
noPark_map["Clus_Db"]=labels

print(set(labels))

{0, 1, 2, 3, 4, 5, -1}


In [30]:

#Set up color scheme for clusters
x = np.arange(len(set(labels)))
ys = [i + x + (i*x)**2 for i in range(len(set(labels)))]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]



#Add markers to map
markers_colors = []
for lt, lng, cluster in zip(noPark_map['Venue Latitude'], noPark_map['Venue Longitude'], noPark_map['Clus_Db']):
    label = fm.Popup('Cluster' + str(cluster), parse_html=True)
    fm.CircleMarker(
        [lt, lng],
        radius=3,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.1).add_to(park_map)
park_map
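The expression `rainbow[cluster-1]` relies on Python's negative indexing: label 0 picks the last color in the list and the DBSCAN noise label -1 picks the second-to-last. A minimal sketch of an explicit label-to-color mapping that gives noise its own color is shown here; `build_color_map` is a hypothetical helper and the hex values are placeholders, not the colors produced by `cm.rainbow`.

```python
def build_color_map(labels, palette, noise_color="#808080"):
    """Map each distinct cluster label to a palette color; DBSCAN noise (-1) gets noise_color."""
    clusters = sorted(label for label in set(labels) if label != -1)
    if len(clusters) > len(palette):
        raise ValueError("palette has fewer colors than clusters")
    color_map = {label: palette[i] for i, label in enumerate(clusters)}
    if -1 in labels:
        color_map[-1] = noise_color
    return color_map


# Placeholder palette (assumption); any list of hex colors works
palette = ["#8000ff", "#00b5eb", "#80ffb4", "#ffb360", "#ff0000", "#2c7fb8"]
color_map = build_color_map([0, 1, 2, 3, 4, 5, -1], palette)
print(color_map)
```

Adopting a mapping like this would change which color each cluster receives, so any color names mentioned in the surrounding text would need to be rechecked.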

In [31]:
noPark_map.groupby(['Neighborhood', 'Zip Code','Clus_Db']).count()

Neighborhood,Zip Code,Clus_Db,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Bartlett,38133,5,93,93,93,93,93,93
Berclair,38122,3,24,24,24,24,24,24
Cordova,38016,0,23,23,23,23,23,23
"East Memphis, Fountain Square, Kirby Trace",38119,2,44,44,44,44,44,44
Frayser,38127,-1,1,1,1,1,1,1
Kingsbury,38108,1,6,6,6,6,6,6
Midtown Memphis,38104,1,100,100,100,100,100,100
"North Memphis: Snowden, New Chicago",38107,1,7,7,7,7,7,7
Orange Mound,38114,1,39,39,39,39,39,39
Raleigh,38128,4,35,35,35,35,35,35


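It appears that the neighborhoods outside the I-240 loop have venues that are geographically isolated: each forms its own cluster, and Frayser's single venue is labeled as noise (-1). The inner-loop neighborhoods of Kingsbury, Midtown, North Memphis, and Orange Mound merge into cluster 1. The exception is Berclair, in the northeastern corner of the inner loop, which forms its own cluster.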

#### You've reached the end. Thanks for going through this capstone notebook! 