<h1 align=center><font size = 5>Applied Data Science Capstone</font></h1>
<h2 align=center><font size = 4>Week 5 Assignment <br>
    The Battle of the Neighborhoods</font></h2>

## Introduction:
The client is planning to visit Miami, FL and wants to see the city's neighborhoods depicted on a map with markers, along with the venues found within a half-mile radius of each neighborhood. Additionally, they would like to identify neighborhoods that have both a park and a coffee shop, as they enjoy stopping in for a gourmet coffee and taking a relaxing stroll in the neighborhood park. They would also like a map of the places that match these criteria, where clicking a marker opens a text popup listing the type of venue, the venue name and the neighborhood.

In this notebook, data for Miami, Florida neighborhoods is extracted from a Wikipedia page and loaded into a dataframe for review. Venues within a half-mile radius of each neighborhood are then retrieved to identify the neighborhoods that have both a park and a coffee shop. Using the data gathered, a map of Miami with a marker for each neighborhood is displayed, as well as a map of the venues in neighborhoods that have both a coffee shop and a park. The user can click on a marker to display additional information in a text popup.

### Import Dependencies

In [1]:
import pandas as pd
import folium
from geopy.geocoders import Nominatim
import requests
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

### Load dataframe from Wikipedia page and set column names

In [2]:
df = pd.read_html('https://en.wikipedia.org/wiki/List_of_neighborhoods_in_Miami')[0]
df.columns = 'Neighborhood','Demonym','Population','2010 Population','Sub-Neighborhoods','Coordinates'
print(df.shape)
df.head()

(26, 6)


Unnamed: 0,Neighborhood,Demonym,Population,2010 Population,Sub-Neighborhoods,Coordinates
0,Allapattah,,54289,4401,,25.815-80.224
1,Arts & Entertainment District,,11033,7948,,25.799-80.190
2,Brickell,Brickellite,31759,14541,West Brickell,25.758-80.193
3,Buena Vista,,9058,3540,Buena Vista East Historic District and Design ...,25.813-80.192
4,Coconut Grove,Grovite,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...",25.712-80.257


### Split Coordinates column and create new columns for Latitude and Longitude

In [3]:
newloc=df.Coordinates.str.split('-',expand=True)
df["Latitude"]=newloc[0].astype(float)
df["Longitude"]=newloc[1].astype(float)
df.Longitude = df.Longitude*-1
df.head()

Unnamed: 0,Neighborhood,Demonym,Population,2010 Population,Sub-Neighborhoods,Coordinates,Latitude,Longitude
0,Allapattah,,54289,4401,,25.815-80.224,25.815,-80.224
1,Arts & Entertainment District,,11033,7948,,25.799-80.190,25.799,-80.19
2,Brickell,Brickellite,31759,14541,West Brickell,25.758-80.193,25.758,-80.193
3,Buena Vista,,9058,3540,Buena Vista East Historic District and Design ...,25.813-80.192,25.813,-80.192
4,Coconut Grove,Grovite,20076,3091,"Center Grove, Northeast Coconut Grove, Southwe...",25.712-80.257,25.712,-80.257
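
Splitting on `'-'` works here because every latitude is positive and every longitude is negative. As a hedged alternative, the sketch below extracts both signed numbers with a regular expression, so the longitude keeps its sign without the extra multiplication. The sample strings are copied from the table above; the pattern and the helper name `parse_coordinates` are assumptions for illustration and are not part of the original notebook.

```python
import pandas as pd

def parse_coordinates(series: pd.Series) -> pd.DataFrame:
    """Extract signed latitude/longitude from strings like '25.815-80.224'."""
    # Two decimals back to back; the second must carry an explicit sign.
    pattern = (
        r'^\s*(?P<Latitude>[+-]?\d+(?:\.\d+)?)'
        r'\s*(?P<Longitude>[+-]\d+(?:\.\d+)?)\s*$'
    )
    # Rows that do not match (including missing values) become NaN.
    return series.str.extract(pattern).astype(float)

sample = pd.Series(['25.815-80.224', '25.799-80.190', None])
coords = parse_coordinates(sample)
print(coords)
```

Rows with missing coordinates come back as NaN in both columns, so a later `dropna` on `Latitude` and `Longitude` still applies.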


### Drop unnecessary columns from dataframe

In [4]:
df.drop(['Demonym','Sub-Neighborhoods','Population','2010 Population','Coordinates'], axis=1, inplace=True)
print(df.shape)
df.head()

(26, 3)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,Allapattah,25.815,-80.224
1,Arts & Entertainment District,25.799,-80.19
2,Brickell,25.758,-80.193
3,Buena Vista,25.813,-80.192
4,Coconut Grove,25.712,-80.257


### Drop rows that are missing either Latitude or Longitude

In [5]:
df_clean =df.dropna(subset = ['Latitude', 'Longitude']).reset_index(drop=True)
print(df_clean.shape)
df_clean.head()

(24, 3)


Unnamed: 0,Neighborhood,Latitude,Longitude
0,Allapattah,25.815,-80.224
1,Arts & Entertainment District,25.799,-80.19
2,Brickell,25.758,-80.193
3,Buena Vista,25.813,-80.192
4,Coconut Grove,25.712,-80.257


### Visually analyze layout of neighborhoods.  Create map of Miami, overlay neighborhoods and set labels

In [6]:
# Get latitude and longitude for Miami

address = 'Miami, FL'

geolocator = Nominatim(user_agent="miami_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

# create map of Miami using latitude and longitude values
map_miami = folium.Map(location=[latitude, longitude], zoom_start=11)

# instantiate a feature group for the neighborhoods in the dataframe
neighborhoods = folium.map.FeatureGroup()

# loop through data and add each neighborhood to feature group
for lat, lng, neighborhood in zip(df_clean.Latitude, df_clean.Longitude, df_clean.Neighborhood):
    neighborhoods.add_child(
        folium.features.CircleMarker(
            [lat, lng],
            radius=5, 
            color='blue',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7
        )
    )

# add pop-up text to each marker on the map
latitudes = list(df_clean.Latitude)
longitudes = list(df_clean.Longitude)
labels = list(df_clean.Neighborhood)

for lat, lng, label in zip(latitudes, longitudes, labels):
    folium.Marker([lat, lng], popup=label).add_to(map_miami)    
    
# add neighborhoods to map
map_miami.add_child(neighborhoods)

### Set parameter values to be used for Foursquare API and retrieve results from url

In [7]:
CLIENT_ID = ''
CLIENT_SECRET = ''
VERSION = '20180605'
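
The credentials are left blank in the cell above. One possible way to supply them without writing them into the notebook is to read them from environment variables, as in the minimal sketch below. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are hypothetical choices, not something the notebook or Foursquare prescribes.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) from a mapping, or raise if either is missing."""
    # Hypothetical variable names; pick whatever fits your own setup.
    client_id = env.get('FOURSQUARE_CLIENT_ID', '')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET', '')
    if not client_id or not client_secret:
        raise RuntimeError('Foursquare credentials are not set in the environment.')
    return client_id, client_secret

# Demonstration with a stand-in mapping instead of the real environment
demo_env = {
    'FOURSQUARE_CLIENT_ID': 'demo-id',
    'FOURSQUARE_CLIENT_SECRET': 'demo-secret',
}
CLIENT_ID_DEMO, CLIENT_SECRET_DEMO = load_foursquare_credentials(demo_env)
print(CLIENT_ID_DEMO)
```

Failing early with a clear error avoids sending requests with empty credentials for every neighborhood.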

### Define function to get nearby venues for neighborhoods in Miami, using a radius of ~0.5 miles (805 meters) and limiting results to 100

In [8]:
def getNearbyVenues(names, latitudes, longitudes,radius=805,LIMIT=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius,
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
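
The function above assembles the query string by hand with `str.format`. As a sketch of an alternative, the same parameters can be passed through `urllib.parse.urlencode` so that every value is escaped consistently. The endpoint and parameter names are taken from the function above; the helper name `build_explore_url` and the demo credentials are hypothetical, and nothing here checks whether the endpoint is still served.

```python
from urllib.parse import urlencode, urlparse, parse_qs

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, version, lat, lng, radius=805, limit=100):
    """Build the explore URL with an escaped query string."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),  # latitude,longitude pair
        'radius': radius,                # meters
        'limit': limit,                  # maximum number of venues
    }
    return EXPLORE_ENDPOINT + '?' + urlencode(params)

# Demo values only; the coordinates are Allapattah's from the dataframe
url = build_explore_url('demo-id', 'demo-secret', '20180605', 25.815, -80.224)
query = parse_qs(urlparse(url).query)
print(query['ll'], query['radius'], query['limit'])
```

With `requests`, the same dictionary could instead be passed as `requests.get(EXPLORE_ENDPOINT, params=params)`, which performs the encoding itself.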

### Get the nearby venues for neighborhoods in subset and look at data

In [9]:
miami_venues = getNearbyVenues(names=df_clean['Neighborhood'],
                                   latitudes=df_clean['Latitude'],
                                   longitudes=df_clean['Longitude']
                                  )

Allapattah
Arts & Entertainment District
Brickell
Buena Vista
Coconut Grove
Coral Way
Design District
Downtown
Edgewater
Flagami
Grapeland Heights
Liberty City
Little Haiti
Little Havana
Lummus Park
Midtown
Overtown
Park West
The Roads
Upper Eastside
Venetian Islands
Virginia Key
West Flagler
Wynwood


In [10]:
miami_venues['Neighborhood'] = miami_venues['Neighborhood'].astype('str') 
print(miami_venues.shape)
miami_venues

(1143, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Allapattah,25.815,-80.224,Little Caesars,25.809315,-80.224240,Pizza Place
1,Allapattah,25.815,-80.224,Winn-Dixie,25.808179,-80.224911,Grocery Store
2,Allapattah,25.815,-80.224,Ribs On Deck,25.813065,-80.224282,American Restaurant
3,Allapattah,25.815,-80.224,Fritura Dominicana,25.809588,-80.223622,Food Truck
4,Allapattah,25.815,-80.224,amarillis,25.808804,-80.223752,Latin American Restaurant
5,Arts & Entertainment District,25.799,-80.190,Bunnie Cakes,25.799544,-80.190953,Cupcake Shop
6,Arts & Entertainment District,25.799,-80.190,Bunbury Miami,25.798284,-80.191118,Wine Shop
7,Arts & Entertainment District,25.799,-80.190,Jack's Home Cooking,25.800447,-80.191031,American Restaurant
8,Arts & Entertainment District,25.799,-80.190,Yodi's Threading Spa,25.800490,-80.189093,Spa
9,Arts & Entertainment District,25.799,-80.190,Plant Food + Wine Miami,25.800452,-80.192805,Restaurant
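
To sanity-check the half-mile radius, the distance between a neighborhood's coordinates and a returned venue can be computed with the haversine formula. The sketch below uses only the standard library; the helper name `haversine_m` is an assumption for illustration, and the coordinates are Allapattah and Little Caesars from the first row of the table above.

```python
import math

METERS_PER_MILE = 1609.344

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Half a mile expressed in whole meters, matching the radius used for the API call
radius_m = round(0.5 * METERS_PER_MILE)

# Allapattah (neighborhood) to Little Caesars (venue)
distance = haversine_m(25.815, -80.224, 25.809315, -80.224240)
print(radius_m, round(distance), distance <= radius_m)
```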


### Extract coffee shops and parks

In [11]:
miami_cs = miami_venues[miami_venues['Venue Category'].isin(['Coffee Shop','Park'])].reset_index(drop=True)
print(miami_cs.shape)
miami_cs

(63, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Arts & Entertainment District,25.799,-80.190,Basketball Court at Margaret Pace Park,25.798518,-80.185483,Park
1,Arts & Entertainment District,25.799,-80.190,Margaret Pace Park,25.795651,-80.186654,Park
2,Arts & Entertainment District,25.799,-80.190,Starbucks,25.805133,-80.189237,Coffee Shop
3,Arts & Entertainment District,25.799,-80.190,Bold Brew Cafe,25.798376,-80.187484,Coffee Shop
4,Brickell,25.758,-80.193,Starbucks,25.762836,-80.193068,Coffee Shop
5,Brickell,25.758,-80.193,Starbucks,25.765060,-80.192977,Coffee Shop
6,Brickell,25.758,-80.193,1814 Brickell Park,25.755720,-80.197062,Park
7,Buena Vista,25.813,-80.192,Blue Bottle Coffee,25.812247,-80.193319,Coffee Shop
8,Buena Vista,25.813,-80.192,OTL,25.813395,-80.192375,Coffee Shop
9,Buena Vista,25.813,-80.192,Angelina's Coffee & Yogurt,25.809732,-80.192609,Coffee Shop


In [12]:
# Calculate Category ID to use for map markers

miami_cs['Category ID'] = np.where(miami_cs['Venue Category']=='Coffee Shop', 1, 2)
miami_cs

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Category ID
0,Arts & Entertainment District,25.799,-80.190,Basketball Court at Margaret Pace Park,25.798518,-80.185483,Park,2
1,Arts & Entertainment District,25.799,-80.190,Margaret Pace Park,25.795651,-80.186654,Park,2
2,Arts & Entertainment District,25.799,-80.190,Starbucks,25.805133,-80.189237,Coffee Shop,1
3,Arts & Entertainment District,25.799,-80.190,Bold Brew Cafe,25.798376,-80.187484,Coffee Shop,1
4,Brickell,25.758,-80.193,Starbucks,25.762836,-80.193068,Coffee Shop,1
5,Brickell,25.758,-80.193,Starbucks,25.765060,-80.192977,Coffee Shop,1
6,Brickell,25.758,-80.193,1814 Brickell Park,25.755720,-80.197062,Park,2
7,Buena Vista,25.813,-80.192,Blue Bottle Coffee,25.812247,-80.193319,Coffee Shop,1
8,Buena Vista,25.813,-80.192,OTL,25.813395,-80.192375,Coffee Shop,1
9,Buena Vista,25.813,-80.192,Angelina's Coffee & Yogurt,25.809732,-80.192609,Coffee Shop,1


In [13]:
miami_cs.groupby(['Neighborhood','Venue Category']).count()

Neighborhood,Venue Category,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Category ID
Arts & Entertainment District,Coffee Shop,2,2,2,2,2,2
Arts & Entertainment District,Park,2,2,2,2,2,2
Brickell,Coffee Shop,2,2,2,2,2,2
Brickell,Park,1,1,1,1,1,1
Buena Vista,Coffee Shop,4,4,4,4,4,4
Buena Vista,Park,4,4,4,4,4,4
Coconut Grove,Park,3,3,3,3,3,3
Design District,Coffee Shop,4,4,4,4,4,4
Design District,Park,3,3,3,3,3,3
Downtown,Coffee Shop,4,4,4,4,4,4


In [14]:
df_grouped=miami_cs.groupby(['Neighborhood','Venue Category'])['Venue'].count().squeeze().unstack()
df_grouped

Neighborhood,Coffee Shop,Park
Arts & Entertainment District,2.0,2.0
Brickell,2.0,1.0
Buena Vista,4.0,4.0
Coconut Grove,,3.0
Design District,4.0,3.0
Downtown,4.0,1.0
Edgewater,2.0,3.0
Grapeland Heights,1.0,
Liberty City,1.0,1.0
Little Haiti,,1.0


In [15]:
## Drop neighborhoods that do not have both a Coffee Shop and a Park (rows where either column is NaN)

miami_cs_clean=df_grouped.dropna(subset = ['Coffee Shop', 'Park'])
print(miami_cs_clean.shape)
miami_cs_clean

(10, 2)


Neighborhood,Coffee Shop,Park
Arts & Entertainment District,2.0,2.0
Brickell,2.0,1.0
Buena Vista,4.0,4.0
Design District,4.0,3.0
Downtown,4.0,1.0
Edgewater,2.0,3.0
Liberty City,1.0,1.0
Midtown,6.0,1.0
Park West,2.0,2.0
Upper Eastside,3.0,2.0
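
The group, unstack and `dropna` steps can also be expressed as a single cross-tabulation followed by a boolean filter. This is a minimal sketch of that alternative, not the notebook's own method. The rows below are a small illustrative subset shaped like `miami_cs`, so the counts do not match the real data.

```python
import pandas as pd

# Illustrative rows only: one neighborhood with both categories, two with just one
sample = pd.DataFrame({
    'Neighborhood': ['Brickell', 'Brickell', 'Brickell', 'Coconut Grove', 'Grapeland Heights'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Park', 'Park', 'Coffee Shop'],
})

# Count venues per neighborhood and category; missing combinations become 0 instead of NaN
counts = pd.crosstab(sample['Neighborhood'], sample['Venue Category'])

# Keep neighborhoods with at least one coffee shop and at least one park
has_both = counts[(counts['Coffee Shop'] > 0) & (counts['Park'] > 0)]
print(has_both)
```

Because `crosstab` fills absent combinations with 0, the counts stay integers rather than becoming floats with NaN.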


In [16]:
miami_cs_clean.insert(0, 'Neighborhood_ID', range(1, 1 + len(miami_cs_clean)))
miami_cs_clean

Neighborhood,Neighborhood_ID,Coffee Shop,Park
Arts & Entertainment District,1,2.0,2.0
Brickell,2,2.0,1.0
Buena Vista,3,4.0,4.0
Design District,4,4.0,3.0
Downtown,5,4.0,1.0
Edgewater,6,2.0,3.0
Liberty City,7,1.0,1.0
Midtown,8,6.0,1.0
Park West,9,2.0,2.0
Upper Eastside,10,3.0,2.0


In [17]:
# Merge cleaned neighborhood data frame with venue data frame
df_merged = pd.merge(miami_cs_clean, miami_cs, on="Neighborhood")
df_merged

Unnamed: 0,Neighborhood,Neighborhood_ID,Coffee Shop,Park,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Category ID
0,Arts & Entertainment District,1,2.0,2.0,25.799,-80.19,Basketball Court at Margaret Pace Park,25.798518,-80.185483,Park,2
1,Arts & Entertainment District,1,2.0,2.0,25.799,-80.19,Margaret Pace Park,25.795651,-80.186654,Park,2
2,Arts & Entertainment District,1,2.0,2.0,25.799,-80.19,Starbucks,25.805133,-80.189237,Coffee Shop,1
3,Arts & Entertainment District,1,2.0,2.0,25.799,-80.19,Bold Brew Cafe,25.798376,-80.187484,Coffee Shop,1
4,Brickell,2,2.0,1.0,25.758,-80.193,Starbucks,25.762836,-80.193068,Coffee Shop,1
5,Brickell,2,2.0,1.0,25.758,-80.193,Starbucks,25.76506,-80.192977,Coffee Shop,1
6,Brickell,2,2.0,1.0,25.758,-80.193,1814 Brickell Park,25.75572,-80.197062,Park,2
7,Buena Vista,3,4.0,4.0,25.813,-80.192,Blue Bottle Coffee,25.812247,-80.193319,Coffee Shop,1
8,Buena Vista,3,4.0,4.0,25.813,-80.192,OTL,25.813395,-80.192375,Coffee Shop,1
9,Buena Vista,3,4.0,4.0,25.813,-80.192,Angelina's Coffee & Yogurt,25.809732,-80.192609,Coffee Shop,1


In [18]:
## How many neighborhoods and unique venues remain
nbrhds=df_merged.groupby(['Neighborhood']).ngroups
print('There are {} unique venues.'.format(len(df_merged['Venue'].unique())))
print('There are {} unique neighborhoods.'.format(len(df_merged['Neighborhood'].unique())))
df_merged

There are 28 unique venues.
There are 10 unique neighborhoods.


Unnamed: 0,Neighborhood,Neighborhood_ID,Coffee Shop,Park,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Category ID
0,Arts & Entertainment District,1,2.0,2.0,25.799,-80.19,Basketball Court at Margaret Pace Park,25.798518,-80.185483,Park,2
1,Arts & Entertainment District,1,2.0,2.0,25.799,-80.19,Margaret Pace Park,25.795651,-80.186654,Park,2
2,Arts & Entertainment District,1,2.0,2.0,25.799,-80.19,Starbucks,25.805133,-80.189237,Coffee Shop,1
3,Arts & Entertainment District,1,2.0,2.0,25.799,-80.19,Bold Brew Cafe,25.798376,-80.187484,Coffee Shop,1
4,Brickell,2,2.0,1.0,25.758,-80.193,Starbucks,25.762836,-80.193068,Coffee Shop,1
5,Brickell,2,2.0,1.0,25.758,-80.193,Starbucks,25.76506,-80.192977,Coffee Shop,1
6,Brickell,2,2.0,1.0,25.758,-80.193,1814 Brickell Park,25.75572,-80.197062,Park,2
7,Buena Vista,3,4.0,4.0,25.813,-80.192,Blue Bottle Coffee,25.812247,-80.193319,Coffee Shop,1
8,Buena Vista,3,4.0,4.0,25.813,-80.192,OTL,25.813395,-80.192375,Coffee Shop,1
9,Buena Vista,3,4.0,4.0,25.813,-80.192,Angelina's Coffee & Yogurt,25.809732,-80.192609,Coffee Shop,1
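
Counting unique values of `Venue` counts names, so a chain such as Starbucks is counted once no matter how many locations it has. The sketch below shows one way to count distinct places instead, by de-duplicating on name plus coordinates. The rows are copied from the tables above; the variable names are assumptions for illustration.

```python
import pandas as pd

venues = pd.DataFrame({
    'Venue': ['Starbucks', 'Starbucks', 'Starbucks', 'Bold Brew Cafe'],
    'Venue Latitude': [25.805133, 25.762836, 25.765060, 25.798376],
    'Venue Longitude': [-80.189237, -80.193068, -80.192977, -80.187484],
})

# Distinct venue names: every Starbucks collapses into one
unique_names = venues['Venue'].nunique()

# Distinct physical places: same name at different coordinates stays separate
unique_places = len(
    venues.drop_duplicates(subset=['Venue', 'Venue Latitude', 'Venue Longitude'])
)
print(unique_names, unique_places)
```

De-duplicating on coordinates also collapses a venue that is returned for two neighborhoods whose search radii overlap.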


In [19]:
# create map of venues
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# Count the venue categories (Coffee Shop and Park)
ncategories = df_merged['Category ID'].nunique()

# set one color per venue category
colors_array = cm.rainbow(np.linspace(0, 1, ncategories))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map, colored by venue category
for lat, lon, poi, cat_id, cat, area in zip(df_merged['Venue Latitude'], df_merged['Venue Longitude'], df_merged['Venue'], df_merged['Category ID'], df_merged['Venue Category'], df_merged['Neighborhood']):
    label = folium.Popup(str(cat) + ': ' + str(poi) + ' ; Neighborhood: ' + str(area), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cat_id-1],
        fill=True,
        fill_color=rainbow[cat_id-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
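
Since the markers are colored by venue category, a plain dictionary lookup is one alternative to generating a palette and indexing it by `Category ID`. The sketch below builds the popup text in the same format as the cell above and picks a color per category. The color values and the helper name `marker_style` are hypothetical choices for illustration.

```python
# Hypothetical colors: brown for coffee shops, green for parks
CATEGORY_COLORS = {'Coffee Shop': '#8b4513', 'Park': '#228b22'}

def marker_style(category, venue, neighborhood, default_color='#3186cc'):
    """Return (popup text, hex color) for one venue marker."""
    label = '{}: {} ; Neighborhood: {}'.format(category, venue, neighborhood)
    # Unknown categories fall back to a default color instead of raising
    color = CATEGORY_COLORS.get(category, default_color)
    return label, color

label, color = marker_style('Park', 'Margaret Pace Park', 'Arts & Entertainment District')
print(label)
print(color)
```

The returned values could then be passed as the `popup`, `color` and `fill_color` arguments of `folium.CircleMarker`.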

## Conclusion
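Of the 24 Miami neighborhoods with usable coordinates, 10 have at least one coffee shop and at least one park among the Foursquare venues returned within roughly half a mile (805 meters) of the neighborhood's coordinates. In any of these neighborhoods, our client, and others, can easily grab their favorite cup of coffee and enjoy it while taking a stroll in the park.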