<h2>Import Packages</h2>

In [1]:
from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy as np

<h2>Get source code from Wikipedia URL</h2>

In [2]:
source = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
soup = BeautifulSoup(source,'lxml')

<h2>Get source code for ONLY the table of interest in the Wikipedia article</h2>
<p>Row 1: source code for the table</p>
<p>Row 2: source code for the rows in the table</p>

In [3]:
My_table = soup.find('table',{'class':'wikitable sortable'})
table_rows = My_table.find_all('tr')

<h2>Initiate Pandas DataFrame from source code</h2>

In [4]:
l = []
for tr in table_rows:
    cells = tr.find_all('td')
    row = [cell.text for cell in cells]  # distinct loop variable, so `tr` is not shadowed
    l.append(row)
df = pd.DataFrame(l, columns=['Postcode', 'Borough', 'Neighbourhood'])

<h2>Clean up DataFrame</h2>
<p>Rows 1-2: Remove all whitespace characters</p>
<p>Row 3: Drop the first row (which was the header row)</p>
<p>Row 4: Remove all rows where Borough is 'Not assigned'</p>
<p>Rows 5-7: Loop through the rows and, where Neighbourhood is 'Not assigned', set the Neighbourhood to its corresponding Borough</p>
<p>Rows 8-9: Combine the Neighbourhoods that share a Postcode into one comma-separated string, then drop the duplicate rows</p>

In [5]:
df = df.replace(r'\n', ' ', regex=True)  # r'\n' matches a newline; a doubled backslash would match a literal backslash followed by 'n'
df = df.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
df = df.drop([0])
df = df[df.Borough != 'Not assigned']
for row in range(len(df)):
    if df.iloc[row,2] == 'Not assigned':
        df.iloc[row,2] = df.iloc[row,1]
df["Neighbourhood"] = df.groupby("Postcode")["Neighbourhood"].transform(lambda neigh: ', '.join(neigh))
df = df.drop_duplicates()
df

Unnamed: 0,Postcode,Borough,Neighbourhood
3,M3A,North York,Parkwoods
4,M4A,North York,Victoria Village
5,M5A,Downtown Toronto,"Harbourfront, Regent Park"
7,M6A,North York,"Lawrence Heights, Lawrence Manor"
9,M7A,Queen's Park,Queen's Park
11,M9A,Etobicoke,Islington Avenue
12,M1B,Scarborough,"Rouge, Malvern"
15,M3B,North York,Don Mills North
16,M4B,East York,"Woodbine Gardens, Parkview Hill"
18,M5B,Downtown Toronto,"Ryerson, Garden District"
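
<p>The cleanup steps above can also be written without an explicit row loop. The following is a minimal sketch on a tiny hand-made frame; the sample rows and the helper name <code>clean_postcodes</code> are illustrative assumptions, not part of the original notebook. It filters, fills, and groups in the same order as the cell above.</p>

```python
import pandas as pd

# Hypothetical sample rows that mimic the scraped table after whitespace stripping.
raw = pd.DataFrame({
    "Postcode": ["M1A", "M3A", "M5A", "M5A", "M7A"],
    "Borough": ["Not assigned", "North York", "Downtown Toronto",
                "Downtown Toronto", "Queen's Park"],
    "Neighbourhood": ["Not assigned", "Parkwoods", "Harbourfront",
                      "Regent Park", "Not assigned"],
})

def clean_postcodes(frame):
    """Drop unassigned boroughs, fill missing neighbourhoods, merge rows per postcode."""
    out = frame[frame["Borough"] != "Not assigned"].copy()
    # Where the neighbourhood is unassigned, reuse the borough name.
    unnamed = out["Neighbourhood"] == "Not assigned"
    out.loc[unnamed, "Neighbourhood"] = out.loc[unnamed, "Borough"]
    # One row per postcode/borough, neighbourhoods joined with ", ".
    return (out.groupby(["Postcode", "Borough"], sort=False)["Neighbourhood"]
               .agg(", ".join)
               .reset_index())

cleaned = clean_postcodes(raw)
print(cleaned)
```

<p>Grouping on both Postcode and Borough yields one row per postcode directly, so a separate <code>drop_duplicates()</code> call is not needed in this variant.</p>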


<h2>Show Dimensions of DataFrame (which has 103 rows)</h2>

In [6]:
print(df.shape)

(103, 3)


<h2>Import Geo Data</h2>

In [7]:
import types
import pandas as pd
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share your notebook.
client_b261bf3e12a045ff909b5d2a62b31142 = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='YOUR_IBM_API_KEY',  # credential redacted; supply your own key
    ibm_auth_endpoint="https://iam.ng.bluemix.net/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3-api.us-geo.objectstorage.service.networklayer.com')

body = client_b261bf3e12a045ff909b5d2a62b31142.get_object(Bucket='courseracapstoneproject-donotdelete-pr-eymr55m1tsxq5b',Key='Geospatial_Coordinates.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

geo = pd.read_csv(body)
geo.head()



Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


<h2>Merge the two dataframes to connect Latitude and Longitude to the Postal Codes</h2>

In [8]:
df = df.merge(geo, left_on='Postcode', right_on='Postal Code', how='outer')
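
<p>An outer merge keeps postcodes from either table even when the other table has no match, which leaves missing values in those rows. The sketch below shows one way to check for that with the <code>indicator</code> and <code>validate</code> options of <code>merge</code>. The frames are small hand-made samples, and <code>M9Z</code> is a made-up postcode used only to produce an unmatched row.</p>

```python
import pandas as pd

# Hand-made samples; "M9Z" is a hypothetical postcode with no coordinates.
postcodes = pd.DataFrame({
    "Postcode": ["M1B", "M3A", "M9Z"],
    "Borough": ["Scarborough", "North York", "Hypothetical"],
})
coords = pd.DataFrame({
    "Postal Code": ["M1B", "M3A", "M1C"],
    "Latitude": [43.806686, 43.753259, 43.784535],
    "Longitude": [-79.194353, -79.329656, -79.160497],
})

checked = postcodes.merge(
    coords,
    left_on="Postcode",
    right_on="Postal Code",
    how="outer",
    validate="one_to_one",  # raises MergeError if a postcode repeats on either side
    indicator=True,         # adds a "_merge" column: both / left_only / right_only
)

# Rows that exist in only one of the two tables.
unmatched = checked[checked["_merge"] != "both"]
print(unmatched)
```

<p>If <code>unmatched</code> is empty, the outer merge behaves the same as an inner or left merge for this data.</p>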

<h2>Drop the redundant Postal Code column</h2>

In [9]:
df = df.drop(['Postal Code'], axis=1)

In [10]:
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Harbourfront, Regent Park",43.65426,-79.360636
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.718518,-79.464763
4,M7A,Queen's Park,Queen's Park,43.662301,-79.389494


<h2>Show dimensions of dataframe</h2>

In [11]:
df.shape

(103, 5)

<h2>Sort the dataframe alphabetically by the Neighbourhood column</h2>

In [12]:
df = df.sort_values(by=['Neighbourhood'])
df

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
30,M5H,Downtown Toronto,"Adelaide, King, Richmond",43.650571,-79.384568
78,M1S,Scarborough,Agincourt,43.794200,-79.262029
85,M1V,Scarborough,"Agincourt North, L'Amoreaux East, Milliken, St...",43.815252,-79.284577
89,M9V,Etobicoke,"Albion Gardens, Beaumond Heights, Humbergate, ...",43.739416,-79.588437
93,M8W,Etobicoke,"Alderwood, Long Branch",43.602414,-79.543484
28,M3H,North York,"Bathurst Manor, Downsview North, Wilson Heights",43.754328,-79.442259
39,M2K,North York,Bayview Village,43.786947,-79.385975
55,M5M,North York,"Bedford Park, Lawrence Manor East",43.733283,-79.419750
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
58,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


<h2>Import packages to create a Map of Toronto</h2>

In [13]:
from geopy.geocoders import Nominatim
import json
import requests
from pandas.io.json import json_normalize
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
!conda install -c conda-forge folium=0.5.0 --yes
import folium


<h2>Fetch the coordinates of Toronto to initiate Map</h2>

In [14]:
address = 'Toronto, CA'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.
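
<p>geopy's <code>geocode()</code> returns <code>None</code> when it finds no match, in which case reading <code>.latitude</code> raises an <code>AttributeError</code>. A minimal sketch of a guard follows. To keep it runnable without a network call it uses a stand-in object instead of a real geopy result; <code>StubLocation</code> and <code>coordinates_or_fallback</code> are hypothetical names.</p>

```python
from collections import namedtuple

# Stand-in for a geopy Location; only the two attributes used here are modelled.
StubLocation = namedtuple("StubLocation", ["latitude", "longitude"])

def coordinates_or_fallback(location, fallback):
    """Return (lat, lng) from a geocoding result, or `fallback` if nothing was found."""
    if location is None:
        return fallback
    return (location.latitude, location.longitude)

# A successful lookup (values taken from the output above) and a failed one.
found = coordinates_or_fallback(StubLocation(43.653963, -79.387207), fallback=(0.0, 0.0))
missing = coordinates_or_fallback(None, fallback=(0.0, 0.0))
print(found, missing)
```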


<h2>Start map, then loop through the dataframe of Neighborhoods in Toronto to:</h2>
<p><ol type="1">
  <li>Fetch the Coordinates of the Neighborhood</li>
  <li>Create a blue Marker for the Neighborhood</li>
  <li>Add it to the Map of Toronto</li>
</ol></p>

In [15]:
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(df['Latitude'], df['Longitude'], df['Neighbourhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
#display map   
map_toronto

<h2>Define Foursquare Credentials and Version</h2>

In [16]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID'  # credential redacted
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET'  # credential redacted
VERSION = '20180605'
LIMIT = 100
radius = 500

<h2>Testing out a call to the Foursquare API with one neighborhood first (Davisville)</h2>
<p>This test call fetches all venues within a 500-meter radius of Davisville, along with their category types (the Foursquare <code>radius</code> parameter is in meters). If it works for Davisville, a neighborhood chosen at random, the same call will be repeated for all neighborhoods.</p>

In [17]:
#Check that Davisville is at index label 79 of the dataframe (.loc looks up labels, not positions)
df.loc[79, 'Neighbourhood']

'Davisville'

In [18]:
#create variables for the name and coordinates of Davisville
neighborhood_latitude = df.loc[79, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = df.loc[79, 'Longitude'] # neighborhood longitude value

neighborhood_name = df.loc[79, 'Neighbourhood'] # neighborhood name

In [19]:
#Create a URL that will fetch data from Foursquare about Davisville using the variables created above
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/explore?&client_id=YOUR_FOURSQUARE_CLIENT_ID&client_secret=YOUR_FOURSQUARE_CLIENT_SECRET&v=20180605&ll=43.7043244,-79.3887901&radius=500&limit=100'
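
<p>The same request URL can be assembled from a dictionary of parameters, which handles escaping and avoids the stray <code>?&amp;</code> at the start of the query string. This is a sketch using only the standard library; <code>build_explore_url</code> is a hypothetical helper and the credentials are placeholders.</p>

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_explore_url(client_id, client_secret, version, lat, lng,
                      radius_m=500, limit=100):
    """Build a venues/explore URL; radius_m is the search radius in meters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius_m,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)

# Placeholder credentials; the coordinates are the Davisville values used above.
example_url = build_explore_url("CLIENT_ID_PLACEHOLDER", "CLIENT_SECRET_PLACEHOLDER",
                                "20180605", 43.7043244, -79.3887901)

# Decode the query string again to confirm the parameters survive encoding.
query = parse_qs(urlsplit(example_url).query)
print(query["ll"], query["radius"])
```

<p><code>urlencode</code> percent-encodes the comma in <code>ll</code>; it decodes back to the same value, as the round trip shows.</p>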

In [20]:
#check the results of the URL
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5d716d23c53093002cd6165c'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Davisville',
  'headerFullLocation': 'Davisville, Toronto',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 37,
  'suggestedBounds': {'ne': {'lat': 43.7088244045, 'lng': -79.38257691798016},
   'sw': {'lat': 43.699824395499995, 'lng': -79.39500328201983}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4ae6ea6ef964a52082a721e3',
       'name': 'Jules Cafe Patisserie',
       'location': {'address': '617 Mt Pleasant Ave',
        'crossStreet': 'at Manor Rd E',
        'lat': 43.70413799694304,
        'lng': -79.38841260442167,
        'labeledLatLngs':

In [21]:
#define function to fetch the category type of a venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:  # flattened rows store the list under 'venue.categories'
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [22]:
#clean results data and put it in a dataframe called nearby_venues
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues

Unnamed: 0,name,categories,lat,lng
0,Jules Cafe Patisserie,Dessert Shop,43.704138,-79.388413
1,Thobors Boulangerie Patisserie Café,Café,43.704514,-79.388616
2,Marigold Indian Bistro,Indian Restaurant,43.702881,-79.388008
3,XO Gelato,Dessert Shop,43.705177,-79.388793
4,Viva Napoli,Pizza Place,43.705752,-79.389125
5,Starbucks,Coffee Shop,43.706084,-79.389355
6,Zee Grill,Seafood Restaurant,43.704985,-79.388476
7,June Rowlands Park,Park,43.700517,-79.389189
8,Florentia Ristorante,Italian Restaurant,43.703594,-79.387985
9,Sakae Sushi,Sushi Restaurant,43.704944,-79.388704


<h2>Since call to Foursquare API worked for one neighborhood, will proceed with the same method for ALL neighborhoods in Toronto</h2>

In [23]:
#create a looped function to repeat what was just done for Davisville to ALL neighborhoods
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
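
<p>The list comprehension in <code>getNearbyVenues</code> indexes <code>categories[0]</code> directly, which raises an <code>IndexError</code> for a venue whose category list is empty. Below is a sketch of the row-extraction step that tolerates this case. It works on already-parsed response items, so it runs without an API call; <code>extract_venue_rows</code> is a hypothetical helper and the second sample venue is invented to show the empty case.</p>

```python
def extract_venue_rows(name, lat, lng, items):
    """Turn Foursquare 'items' into flat tuples, using None for a missing category."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng,
                     venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows

# First item mirrors the Davisville response above; the second is invented.
sample_items = [
    {"venue": {"name": "Jules Cafe Patisserie",
               "location": {"lat": 43.704138, "lng": -79.388413},
               "categories": [{"name": "Dessert Shop"}]}},
    {"venue": {"name": "Uncategorized Spot",
               "location": {"lat": 43.7045, "lng": -79.3886},
               "categories": []}},
]

rows = extract_venue_rows("Davisville", 43.7043244, -79.3887901, sample_items)
for row in rows:
    print(row)
```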

In [24]:
#apply the function that was defined above to the Toronto cities dataframe and store results in variable called Toronto_venues
Toronto_venues = getNearbyVenues(names=df['Neighbourhood'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

Adelaide, King, Richmond
Agincourt
Agincourt North, L'Amoreaux East, Milliken, Steeles East
Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown
Alderwood, Long Branch
Bathurst Manor, Downsview North, Wilson Heights
Bayview Village
Bedford Park, Lawrence Manor East
Berczy Park
Birch Cliff, Cliffside West
Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe
Brockton, Exhibition Place, Parkdale Village
Business Reply Mail Processing Centre 969 Eastern
CFB Toronto, Downsview East
CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara
Cabbagetown, St. James Town
Caledonia-Fairbanks
Canada Post Gateway Processing Centre
Cedarbrae
Central Bay Street
Chinatown, Grange Park, Kensington Market
Christie
Church and Wellesley
Clairlea, Golden Mile, Oakridge
Clarks Corners, Sullivan, Tam O'Shanter
Cliffcrest, Cliffside, Scarborough Village West
Cloverdale, Islington, Martin Grove, P

<h2>Show shape of the Toronto Venues dataframe and show top 5 rows</h2>

In [25]:
#Number of rows reflects number of venues that were found
print(Toronto_venues.shape)
Toronto_venues.head()

(2253, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Adelaide, King, Richmond",43.650571,-79.384568,Four Seasons Centre for the Performing Arts,43.650592,-79.385806,Concert Hall
1,"Adelaide, King, Richmond",43.650571,-79.384568,The Keg Steakhouse & Bar,43.649937,-79.384196,Steakhouse
2,"Adelaide, King, Richmond",43.650571,-79.384568,Nathan Phillips Square,43.65227,-79.383516,Plaza
3,"Adelaide, King, Richmond",43.650571,-79.384568,Rosalinda,43.650252,-79.385156,Vegetarian / Vegan Restaurant
4,"Adelaide, King, Richmond",43.650571,-79.384568,Shangri-La Toronto,43.649129,-79.386557,Hotel


<h2>Count number of venues that were found for each Neighborhood</h2>

In [26]:
Toronto_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Adelaide, King, Richmond",100,100,100,100,100,100
Agincourt,5,5,5,5,5,5
"Agincourt North, L'Amoreaux East, Milliken, Steeles East",3,3,3,3,3,3
"Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown",10,10,10,10,10,10
"Alderwood, Long Branch",10,10,10,10,10,10
"Bathurst Manor, Downsview North, Wilson Heights",19,19,19,19,19,19
Bayview Village,4,4,4,4,4,4
"Bedford Park, Lawrence Manor East",22,22,22,22,22,22
Berczy Park,56,56,56,56,56,56
"Birch Cliff, Cliffside West",4,4,4,4,4,4


<h2>Print number of unique category types for the venues that were found</h2>

In [27]:
print('There are {} unique categories.'.format(len(Toronto_venues['Venue Category'].unique())))

There are 271 unique categories.


<h2>Create dataframe that will house frequency of each venue category type (columns) for each Neighborhood in Toronto (rows)</h2>

In [28]:
# one hot encoding
toronto_onehot = pd.get_dummies(Toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = Toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [29]:
toronto_onehot.shape

(2253, 271)
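
<p>The shape above still has 271 columns even though a Neighborhood column was added to the one-hot columns for 271 categories. One possible explanation, which this output alone cannot confirm, is that a venue category is itself named Neighborhood, so the assignment overwrites that category column instead of adding a new one. The sketch below reproduces the effect on invented data and shows one way around it; the column name <code>Neighborhood Name</code> is an assumption.</p>

```python
import pandas as pd

# Invented sample in which one venue category is literally "Neighborhood".
venues = pd.DataFrame({
    "Neighborhood": ["Agincourt", "Agincourt", "Berczy Park"],
    "Venue Category": ["Lounge", "Neighborhood", "Coffee Shop"],
})

onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="")
collides = "Neighborhood" in onehot.columns  # True: the category became a column

# Assigning under the same name replaces the category column (still 3 columns).
overwritten = onehot.copy()
overwritten["Neighborhood"] = venues["Neighborhood"]

# Inserting under a distinct name keeps every category column (4 columns).
safe = onehot.copy()
safe.insert(0, "Neighborhood Name", venues["Neighborhood"])

print(collides, overwritten.shape, safe.shape)
```

<p><code>DataFrame.insert</code> also places the column first, so the separate column-reordering step is not needed in this variant.</p>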

In [30]:
#group the dataframe by unique Neighborhoods
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wings Joint,Women's Store
0,"Adelaide, King, Richmond",0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.030000,...,0.000000,0.0,0.020000,0.000000,0.000000,0.000000,0.000000,0.010000,0.000000,0.0
1,Agincourt,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0
2,"Agincourt North, L'Amoreaux East, Milliken, St...",0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0
4,"Alderwood, Long Branch",0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0
5,"Bathurst Manor, Downsview North, Wilson Heights",0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.0,0.000000,0.000000,0.052632,0.000000,0.000000,0.000000,0.000000,0.0
6,Bayview Village,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0
7,"Bedford Park, Lawrence Manor East",0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.045455,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0
8,Berczy Park,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.0,0.017857,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0
9,"Birch Cliff, Cliffside West",0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0


In [31]:
#confirm the size of the dataframe reflects the number of Neighborhoods (rows) and the number of unique venue category types (columns)
toronto_grouped.shape

(100, 271)

<h2>Show top 10 venue category types for each Neighborhood and store in dataframe called neighborhoods_venues_sorted</h2>

In [32]:
#create function that will sort the venues in descending order
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [33]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # positions past the 3rd use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Café,Steakhouse,Bar,Hotel,Cosmetics Shop,Restaurant,Thai Restaurant,Burger Joint,American Restaurant
1,Agincourt,Lounge,Sandwich Place,Breakfast Spot,Clothing Store,Chinese Restaurant,Drugstore,Discount Store,Dog Run,Doner Restaurant,Donut Shop
2,"Agincourt North, L'Amoreaux East, Milliken, St...",Park,Playground,Coffee Shop,Women's Store,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",Grocery Store,Liquor Store,Coffee Shop,Fast Food Restaurant,Beer Store,Sandwich Place,Fried Chicken Joint,Pizza Place,Pharmacy,Comic Shop
4,"Alderwood, Long Branch",Pizza Place,Coffee Shop,Gym,Pharmacy,Skating Rink,Sandwich Place,Dance Studio,Pool,Pub,Women's Store
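
<p>The column labels above rely on a list lookup that falls back to "th", which is correct up to 20 but would produce labels such as "21th" for longer rankings. A small sketch of a general ordinal helper follows; <code>ordinal</code> is a hypothetical function name, not something the notebook defines.</p>

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)

num_top_venues = 10
columns = ["Neighborhood"] + [
    "{} Most Common Venue".format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```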


In [34]:
neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Café,Steakhouse,Bar,Hotel,Cosmetics Shop,Restaurant,Thai Restaurant,Burger Joint,American Restaurant
1,Agincourt,Lounge,Sandwich Place,Breakfast Spot,Clothing Store,Chinese Restaurant,Drugstore,Discount Store,Dog Run,Doner Restaurant,Donut Shop
2,"Agincourt North, L'Amoreaux East, Milliken, St...",Park,Playground,Coffee Shop,Women's Store,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",Grocery Store,Liquor Store,Coffee Shop,Fast Food Restaurant,Beer Store,Sandwich Place,Fried Chicken Joint,Pizza Place,Pharmacy,Comic Shop
4,"Alderwood, Long Branch",Pizza Place,Coffee Shop,Gym,Pharmacy,Skating Rink,Sandwich Place,Dance Studio,Pool,Pub,Women's Store
5,"Bathurst Manor, Downsview North, Wilson Heights",Coffee Shop,Shopping Mall,Supermarket,Sushi Restaurant,Middle Eastern Restaurant,Deli / Bodega,Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Diner
6,Bayview Village,Café,Japanese Restaurant,Bank,Chinese Restaurant,Dessert Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
7,"Bedford Park, Lawrence Manor East",Thai Restaurant,Italian Restaurant,Coffee Shop,Greek Restaurant,Sandwich Place,Juice Bar,Fast Food Restaurant,Restaurant,Butcher,Indian Restaurant
8,Berczy Park,Coffee Shop,Cocktail Bar,Italian Restaurant,Steakhouse,Beer Bar,Cheese Shop,Seafood Restaurant,Bakery,Café,Farmers Market
9,"Birch Cliff, Cliffside West",College Stadium,General Entertainment,Café,Skating Rink,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop


<h2>Finally, to decide what product to supply as a wholesale distributor, count how often each venue category type appears as the 1st Most Common Venue across Neighborhoods and select the most frequent type</h2>

In [35]:
neighborhoods_venues_sorted['1st Most Common Venue'].value_counts()

Coffee Shop               20
Park                      15
Café                       6
Grocery Store              5
Home Service               4
Trail                      3
Pizza Place                3
Fast Food Restaurant       2
Breakfast Spot             2
Bakery                     2
Furniture / Home Store     2
Indian Restaurant          2
Sandwich Place             2
Gym / Fitness Center       2
Lounge                     1
Gym                        1
Chinese Restaurant         1
French Restaurant          1
Motel                      1
Pet Store                  1
College Stadium            1
Food Truck                 1
Bar                        1
Pool                       1
Electronics Store          1
Restaurant                 1
Shopping Mall              1
Pharmacy                   1
Clothing Store             1
History Museum             1
Garden Center              1
Convenience Store          1
Beer Store                 1
Playground                 1
Department Sto

<h2>COFFEE</h2>
<p>Coffee Shop is the 1st most common venue category in 20 neighborhoods, more than any other category type (Park follows with 15). It is therefore recommended to start a coffee supply wholesale distribution business in Toronto, where coffee shops are abundant.</p>