# Capstone Project - The Battle of Neighborhoods (Week 1)

## Part 1

**A description of the problem and a discussion of the background.**

Clearly define a problem or an idea of your choice, where you would need to leverage the Foursquare location data to solve or execute. Remember that data science problems always target an audience and are meant to help a group of stakeholders solve a problem, so make sure that you explicitly describe your audience and why they would care about your problem.

This submission will eventually become your Introduction/Business Problem section in your final report. So I recommend that you push the report (having your Introduction/Business Problem section only for now) to your GitHub repository and submit a link to it.

## Part 2

**A description of the data and how it will be used to solve the problem.**

Describe the data that you will be using to solve the problem or execute your idea. Remember that you will need to use the Foursquare location data to solve the problem or execute your idea. You can absolutely use other datasets in combination with the Foursquare location data. So make sure that you provide adequate explanation and discussion, with examples, of the data that you will be using, even if it is only Foursquare location data.

This submission will eventually become your Data section in your final report. So I recommend that you push the report (having your Data section) to your GitHub repository and submit a link to it.

---

### Part 1 - Description of the problem

#### Introduction / Business Problem 

The main goal of this project is to identify the best place in Vancouver (Canada) to open a new restaurant specializing in Mediterranean cuisine. To answer this question, we will analyze the venues of Vancouver and collect information about the restaurants that already exist there. We will use the Foursquare API to gather details such as tips and restaurant categories. This will show which restaurant categories are most common and whether some are more popular than others. Based on this analysis, we will assess whether opening the restaurant is viable. The most important task is to detect potential competitors.

**Main goal**

Search for business opportunities starting with the opening of a new Mediterranean cuisine restaurant in Vancouver.

**Target Audience**

The following target populations would be potential clients: students, tourists, workers, families.

### Part 2 - Description of the data



A study of the Vancouver neighborhoods will be carried out in order to classify them according to the number of restaurants.

We will analyze the data collected for these venues (restaurants), such as category, location and more.

#### Work done in this notebook

In this notebook we set up the environment for our study.

So far we have only retrieved the boroughs and neighborhoods and plotted them on the map. The next task is to analyze the data and prepare a final report based on the venues obtained in the final part of this notebook.

*The same method used in the Toronto study has been applied here, but in order to get all boroughs and neighborhoods we needed to use different methods of collecting the information. A lot of data cleaning has been done.*

In [1]:
!pip install folium



In [2]:
from bs4 import BeautifulSoup
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.options.mode.chained_assignment = None  # default='warn'

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import since pandas 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [3]:
data_info = requests.get("https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_V")
soup = BeautifulSoup(data_info.text, 'html5lib')

In [4]:
table_contents = []
table = soup.find('table')
for row in table.find_all('td'):
    # Keep only cells whose borough text mentions Vancouver
    if "Vancouver" not in row.span.text:
        continue
    cell = {}
    cell['PostalCode'] = row.b.text[:3]
    borough_text = row.span.a.text
    cell['Borough'] = borough_text.split('(')[0]
    # The text after the borough name holds the neighborhoods, separated by ' /'
    neighborhoods = row.span.text.split(borough_text)[1]
    cell['Neighborhood'] = neighborhoods.strip(')').replace(' /', ',').replace(')', ' ').strip('( ')
    table_contents.append(cell)

# print(table_contents)
df=pd.DataFrame(table_contents)
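
The chain of `split`, `strip` and `replace` calls in the cell above can be hard to follow. The sketch below shows the same idea on a single string, using only the standard library. The sample text is an assumption about how a table cell reads once the HTML tags are removed, and `parse_cell_text` is a hypothetical helper, not part of the notebook.

```python
import re

def parse_cell_text(span_text, borough):
    """Turn '<borough>(<a> / <b> / ...)' into 'a, b, ...'."""
    # Keep everything after the first occurrence of the borough name
    remainder = span_text.split(borough, 1)[1]
    # Drop surrounding whitespace and the enclosing parentheses
    remainder = remainder.strip().strip("()").strip()
    # Neighborhoods are separated by slashes in the cell text
    parts = [p.strip() for p in re.split(r"\s*/\s*", remainder) if p.strip()]
    return ", ".join(parts)

# Assumed cell text for postal code V6A
sample = "Vancouver(Strathcona / Chinatown / Downtown Eastside)"
print(parse_cell_text(sample, "Vancouver"))
# Strathcona, Chinatown, Downtown Eastside
```

Working on plain strings like this makes the cleaning rule easy to check before applying it to every cell of the scraped table.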

In [5]:
df.shape

(44, 3)

In [6]:
df['Latitude'] = 0.0
df['Longitude'] = 0.0

In [7]:
geolocator = Nominatim(user_agent="vancouver_explorer")  # create the geocoder once

for index, row in df.iterrows():
    # Build one query string from the first listed neighborhood and the borough
    address = '{}, {}'.format(row['Neighborhood'].split(',')[0], row['Borough'])

    location = geolocator.geocode(address)
    if location is not None:
        # Use .loc instead of chained indexing so the values are written back to df
        df.loc[index, 'Latitude'] = location.latitude
        df.loc[index, 'Longitude'] = location.longitude
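
The loop above sends one request per row to Nominatim, whose public service asks clients to stay at or below one request per second. A minimal sketch of a throttled, cached lookup is shown below. `geocode_all` is a hypothetical helper that accepts any geocoding callable, so it can be tried without network access; geopy also ships `geopy.extra.rate_limiter.RateLimiter` for the same purpose.

```python
import time

def geocode_all(queries, geocode, delay_seconds=1.0):
    """Geocode each distinct query once, pausing between calls.

    `geocode` is any callable that returns an object with `.latitude` and
    `.longitude`, or None when nothing is found (as geopy geocoders do).
    """
    cache = {}
    for query in queries:
        if query in cache:
            continue  # already looked up; avoid a repeated request
        if cache:
            time.sleep(delay_seconds)  # pause before every call after the first
        location = geocode(query)
        cache[query] = (location.latitude, location.longitude) if location else None
    return cache
```

In this notebook it could be called as `geocode_all(addresses, geolocator.geocode)`, where `addresses` is a list of query strings; queries that cannot be resolved map to `None`.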

In [8]:
df = df[df["Latitude"] != 0.0]  # drop rows that could not be geocoded
df = df.drop_duplicates(subset="Latitude", keep=False)  # drop every row whose latitude occurs more than once
df = df.reset_index(drop=True)
df

| | PostalCode | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | V6A | Vancouver | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 |
| 1 | V6C | Vancouver | Waterfront, Coal Harbour, Canada Place | 49.28595 | -123.111279 |
| 2 | V6E | Vancouver | SE West End, Davie Village | 49.285062 | -123.12306 |
| 3 | V6G | Vancouver | NW West End, Stanley Park | 49.291275 | -123.135402 |
| 4 | V7G | North Vancouver | Outer East | 52.350956 | -128.504415 |
| 5 | V6H | Vancouver | West Fairview, Granville Island, NE Shaughnessy | 49.261956 | -123.130408 |
| 6 | V5K | Vancouver | North Hastings-Sunrise | 49.267549 | -123.027694 |
| 7 | V5L | Vancouver | North Grandview-Woodland | 49.276901 | -123.071121 |
| 8 | V6L | Vancouver | NW Arbutus Ridge, NE Dunbar-Southlands | 49.235905 | -123.155344 |
| 9 | V5R | Vancouver | South Renfrew-Collingwood | 49.261219 | -123.026585 |
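
One row in this output stands out: "Outer East" (V7G) was geocoded to latitude 52.35, about three degrees north of the other rows, so it is almost certainly a wrong match. A simple guard is to keep only coordinates inside a rough bounding box. The sketch below assumes box limits chosen for illustration only, and uses plain tuples copied from the output above.

```python
# Rough bounding box around the Vancouver area (an assumption for illustration;
# adjust the limits to the area you actually want to cover).
LAT_MIN, LAT_MAX = 49.0, 49.5
LON_MIN, LON_MAX = -123.4, -122.4

def in_bounds(lat, lon):
    """Return True when the point lies inside the bounding box."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX

# (postal code, borough, latitude, longitude) copied from the output above
rows = [
    ("V6A", "Vancouver", 49.277693, -123.088539),
    ("V7G", "North Vancouver", 52.350956, -128.504415),
    ("V6H", "Vancouver", 49.261956, -123.130408),
]

kept = [r for r in rows if in_bounds(r[2], r[3])]
print([r[0] for r in kept])  # ['V6A', 'V6H']
```

On the DataFrame itself the same filter could be written with `df['Latitude'].between(LAT_MIN, LAT_MAX) & df['Longitude'].between(LON_MIN, LON_MAX)` as a boolean mask.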


In [9]:
df.shape

(17, 5)

In [10]:
geolocator = Nominatim(user_agent="vancouver_explorer")
location = geolocator.geocode("Vancouver, Canada")
latitude = location.latitude
longitude = location.longitude
vancouver_map = folium.Map(location=[latitude, longitude], zoom_start=12)
vancouver_map

In [11]:
# create map of Vancouver using latitude and longitude values
vancouver_map = folium.Map(location=[latitude, longitude], zoom_start=7)

# add markers to map
for lat, lng, borough, neighborhood in zip(df['Latitude'], df['Longitude'], df['Borough'], df['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(vancouver_map)
    
vancouver_map

In [12]:
CLIENT_ID = 'WXDDCYBSUHBDMYUR0RVZGAP3N450CE443MAOXJXUFCSI40LI' # your Foursquare ID
CLIENT_SECRET = 'VL302DLWYYPSVYQROKFONGDNO4IE1ZMXZ25FT4CTS4S51R1M' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: WXDDCYBSUHBDMYUR0RVZGAP3N450CE443MAOXJXUFCSI40LI
CLIENT_SECRET:VL302DLWYYPSVYQROKFONGDNO4IE1ZMXZ25FT4CTS4S51R1M
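
Hardcoding and printing API credentials exposes them to anyone who can read the notebook. A common alternative is to read them from environment variables and mask them when printing. The sketch below is one possible approach; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are hypothetical.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the Foursquare credentials from environment variables.

    The variable names are hypothetical; use whatever names you export.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None

def mask(secret, visible=4):
    """Show only the first few characters of a secret when printing."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)
```

With this in place, the cell would assign `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` and print `mask(CLIENT_SECRET)` instead of the raw value.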


In [13]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
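
The function above indexes `categories[0]` directly, which would raise an `IndexError` for a venue that has no category. The sketch below shows one way to flatten a single response item defensively. `venue_row` is a hypothetical helper, and the sample items are hand-written to match only the fields that `getNearbyVenues` reads; they are not real API output.

```python
def venue_row(neighborhood, lat, lng, item):
    """Flatten one item of an explore response into a tuple of columns."""
    venue = item["venue"]
    categories = venue.get("categories") or []
    # Fall back to None when a venue carries no category at all
    category = categories[0]["name"] if categories else None
    return (
        neighborhood, lat, lng,
        venue["name"],
        venue["location"]["lat"],
        venue["location"]["lng"],
        category,
    )

# Hand-written sample items shaped like the fields getNearbyVenues reads
sample_items = [
    {"venue": {"name": "Union Market",
               "location": {"lat": 49.277371, "lng": -123.086989},
               "categories": [{"name": "Deli / Bodega"}]}},
    {"venue": {"name": "Uncategorized Place",  # hypothetical venue
               "location": {"lat": 49.2780, "lng": -123.0880},
               "categories": []}},
]

rows = [venue_row("Strathcona", 49.277693, -123.088539, it) for it in sample_items]
print(rows[0][-1], rows[1][-1])  # Deli / Bodega None
```

Rows with a `None` category can then be dropped or labeled explicitly before any per-category analysis.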

In [14]:
vancouver_venues = getNearbyVenues(names=df['Neighborhood'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

Strathcona, Chinatown, Downtown Eastside
Waterfront, Coal Harbour, Canada Place
SE West End, Davie Village
NW West End, Stanley Park
Outer East
West Fairview, Granville Island, NE Shaughnessy
North Hastings-Sunrise
North Grandview-Woodland
NW Arbutus Ridge, NE Dunbar-Southlands
South Renfrew-Collingwood
Killarney
NW Dunbar-Southlands, Chaldecutt, South University Endowment Lands
UBC
SE Oakridge, East Marpole, South Sunset
Bentall Centre
Pacific Centre
East Fairview, South Cambie


In [15]:
vancouver_venues.shape

(597, 7)
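
Each venue was requested with `radius=500`, which Foursquare interprets in meters around the neighborhood coordinates. A quick sanity check is to compute the great-circle distance between a neighborhood center and one of its venues. The sketch below uses the haversine formula with coordinates taken from this notebook's output; `haversine_m` is a hypothetical helper.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

# Strathcona neighborhood coordinates and the venue "Union Market"
distance = haversine_m(49.277693, -123.088539, 49.277371, -123.086989)
print(round(distance), "m from the neighborhood center")
print("within radius:", distance <= 500)
```

A venue reported far beyond the requested radius would suggest that the neighborhood coordinates and the venue coordinates do not belong together.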

In [16]:
vancouver_venues.head(10)

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 | Union Market | 49.277371 | -123.086989 | Deli / Bodega |
| 1 | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 | MacLean Park | 49.278809 | -123.088546 | Park |
| 2 | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 | Strathcona Park | 49.275183 | -123.084919 | Park |
| 3 | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 | Wilder Snail | 49.279346 | -123.087338 | Coffee Shop |
| 4 | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 | Strathcona Beer Company | 49.281294 | -123.085111 | Brewery |
| 5 | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 | Finch’s Market | 49.278565 | -123.093473 | Sandwich Place |
| 6 | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 | les amis du Fromage | 49.281199 | -123.086241 | Cheese Shop |
| 7 | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 | The Juice Truck | 49.281281 | -123.09212 | Food Truck |
| 8 | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 | The Heatley | 49.281091 | -123.089629 | Restaurant |
| 9 | Strathcona, Chinatown, Downtown Eastside | 49.277693 | -123.088539 | Astoria Pub | 49.281295 | -123.087777 | Pub |
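
Since the goal is to find competitors, the next step is to separate restaurants from other venue types. The sketch below counts restaurants per neighborhood using a few rows copied from the table above. It assumes that restaurant categories contain the word "Restaurant" (for example "Mediterranean Restaurant"), which would miss categories such as "Sandwich Place".

```python
from collections import Counter

# (neighborhood, venue, category) triples copied from the table above
venues = [
    ("Strathcona, Chinatown, Downtown Eastside", "Union Market", "Deli / Bodega"),
    ("Strathcona, Chinatown, Downtown Eastside", "Wilder Snail", "Coffee Shop"),
    ("Strathcona, Chinatown, Downtown Eastside", "The Heatley", "Restaurant"),
    ("Strathcona, Chinatown, Downtown Eastside", "Astoria Pub", "Pub"),
]

def is_restaurant(category):
    # Assumption: restaurant categories contain the word "Restaurant"
    return "restaurant" in category.lower()

restaurants_per_neighborhood = Counter(
    neighborhood for neighborhood, _, category in venues if is_restaurant(category)
)
print(restaurants_per_neighborhood)
```

On the full DataFrame a similar count could be obtained with `vancouver_venues[vancouver_venues['Venue Category'].str.contains('Restaurant')].groupby('Neighborhood').size()`.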


In [17]:
vancouver_venues.groupby('Neighborhood').count()

| Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|
| Bentall Centre | 75 | 75 | 75 | 75 | 75 | 75 |
| East Fairview, South Cambie | 44 | 44 | 44 | 44 | 44 | 44 |
| Killarney | 20 | 20 | 20 | 20 | 20 | 20 |
| NW Arbutus Ridge, NE Dunbar-Southlands | 38 | 38 | 38 | 38 | 38 | 38 |
| NW Dunbar-Southlands, Chaldecutt, South University Endowment Lands | 8 | 8 | 8 | 8 | 8 | 8 |
| NW West End, Stanley Park | 97 | 97 | 97 | 97 | 97 | 97 |
| North Grandview-Woodland | 40 | 40 | 40 | 40 | 40 | 40 |
| North Hastings-Sunrise | 11 | 11 | 11 | 11 | 11 | 11 |
| Pacific Centre | 13 | 13 | 13 | 13 | 13 | 13 |
| SE Oakridge, East Marpole, South Sunset | 37 | 37 | 37 | 37 | 37 | 37 |
