# Business Problem:

- Consider a grocery contractor in Scarborough, one of the boroughs of Toronto
- The contractor supplies fresh, high-quality ingredients to many kinds of venues, such as coffee shops, restaurants, breweries, cafés, and bakeries
- He intends to build a warehouse to store the ingredients he buys from farmers in Scarborough, so that he can supply even more customers
---
- However, it is difficult to decide where the warehouse should be located. If it is close to popular restaurants, ingredients can be delivered early, before the restaurants open in the morning. This would strengthen the contractor's reputation for reliability, which may attract more customers and improve his earnings
- On the other hand, if the warehouse is located nearer to the farmers, he may not be able to deliver the ingredients to the restaurants as early
---
- Beyond this trade-off, the question of which neighborhood in Scarborough is the best location for the warehouse must be considered as well
---
- Hence, finding the right neighborhood for the contractor's warehouse is the main objective of this project
- This is done by building a recommender system that produces a sorted list of neighborhoods, with the best suggested neighborhood first, so that the contractor can decide where to set up his warehouse
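
The idea of a sorted list of neighborhoods can be sketched as follows. This is only a minimal illustration, not the project's final scoring method: the venue counts are made-up numbers, `rank_neighborhoods` is a hypothetical helper, and scoring a neighborhood by the plain sum of its relevant venues is an assumption.

```python
def rank_neighborhoods(venue_counts):
    """Return neighborhood names sorted from most to fewest relevant venues.

    venue_counts maps a neighborhood name to a dict of {venue category: count}.
    The score used here (a plain sum of counts) is an assumption for illustration.
    """
    scores = {name: sum(counts.values()) for name, counts in venue_counts.items()}
    # Sort by score, highest first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Made-up example counts, not results from this project.
example_counts = {
    "Neighborhood A": {"Bakery": 2, "Coffee Shop": 5},
    "Neighborhood B": {"Bakery": 1, "Restaurant": 9},
    "Neighborhood C": {"Coffee Shop": 3},
}

ranking = rank_neighborhoods(example_counts)
print(ranking)
```

The first element of `ranking` plays the role of the best suggested neighborhood.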

# Data that are required:

1. We need geolocational information about Scarborough and its neighborhoods. The latitude and longitude of Scarborough are required to locate it on the map; these will be provided by the contractor. The postal codes that fall within Scarborough are required as well, and they will then be used to find the neighborhoods in Scarborough.
---
2. We need data about the venues in the different neighborhoods of Scarborough. To obtain this information, we will use Foursquare location data, which comprises basic and advanced information about each venue. Basic information includes the venue's precise latitude and longitude and its distance from the center of the neighborhood. Advanced information includes the venue's category, whether the venue is popular, and possibly the average price of its services.
---
A typical Foursquare response provides us with the following information:

| Postal Code | Neighborhood(s) | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Summary | Venue Category | Distance (meter) |
|---|---|---|---|---|---|---|---|
| M1L | Clairlea, Golden Mile, Oakridge | 43.711112 | -79.284577 | Tim Hortons | This spot is popular | Coffee Shop | 592 |
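
Foursquare reports the distance itself, so nothing has to be computed here. To give a feel for what a distance from the neighborhood center means, the sketch below computes a great-circle (haversine) distance. It assumes a spherical Earth and is not necessarily how Foursquare derives its value; `haversine_m` and the venue coordinates are hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (latitude, longitude) points in degrees."""
    earth_radius_m = 6371000  # mean Earth radius, an approximation
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


center = (43.711112, -79.284577)  # neighborhood center from the example row above
venue = (43.7150, -79.2800)       # hypothetical venue coordinates, for illustration only
print(round(haversine_m(center[0], center[1], venue[0], venue[1])))
```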

# Recommender System for Ingredient Contractor

In [2]:
# importing libraries
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
from bs4 import BeautifulSoup
import requests # library to handle requests
import json # library to handle JSON files
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

!conda install -c conda-forge geopy --yes
import geopy.geocoders # convert an address into latitude and longitude values

!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

print('Libraries are imported.')

Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /opt/conda/envs/DSX-Python35:

The following NEW packages will be INSTALLED:

    geographiclib: 1.49-py_0   conda-forge
    geopy:         1.18.1-py_0 conda-forge

geographiclib- 100% |################################| Time: 0:00:00  22.67 MB/s
geopy-1.18.1-p 100% |################################| Time: 0:00:00  32.90 MB/s
Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /opt/conda/envs/DSX-Python35:

The following NEW packages will be INSTALLED:

    altair:  2.2.2-py35_1 conda-forge
    branca:  0.3.1-py_0   conda-forge
    folium:  0.5.0-py_0   conda-forge
    vincent: 0.4.4-py_1   conda-forge

altair-2.2.2-p 100% |################################| Time: 0:00:00  21.92 MB/s
branca-0.3.1-p 100% |################################| Time: 0:00:00  31.77 MB/s
vincent-0.4.4- 100% |###################

## Postal Code in Toronto

In [3]:
# Load the dataset of Toronto postal codes and their coordinates
df_toronto = pd.read_csv('https://cocl.us/Geospatial_data')
df_toronto.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [4]:
# Scrape the postal code table from Wikipedia, clean it, and add it to df_toronto
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
page = requests.get(url).text

soup = BeautifulSoup(page, 'lxml')

table = soup.find('table', class_='wikitable sortable')

tableheader = table.find_all('tr')

tablehead = []
for t1 in tableheader:
    t2 = t1.find_all('th')
    row1 = [t1.text.strip() for t1 in t2 if t1.text.strip()]
    
    if row1:
        tablehead.append(row1)

df1 = pd.DataFrame(tablehead)

tablerows = table.find_all('tr')

tablebody = []
for t1 in tablerows:
    t2 = t1.find_all('td')
    row2 = [t1.text.strip() for t1 in t2 if t1.text.strip()]
    if row2:
        tablebody.append(row2)

df2 = pd.DataFrame(tablebody) 

df3 = pd.concat([df1, df2])
df3.columns = df3.iloc[0]
df3 = df3[1:]
df3 = df3.drop(df3[df3["Borough"] == "Not assigned"].index)

df3.replace("Not assigned", np.nan, inplace=True)
df3.ffill(axis =1)

df = df3

df['Postcode'] = df['Postcode'].astype(str)
df['Borough'] = df['Borough'].astype(str)
df['Neighbourhood'] = df['Neighbourhood'].astype(str)

df.set_index(['Postcode', 'Borough'], inplace=True)
df = df.groupby(level=['Postcode', 'Borough'], sort=False).agg( ','.join)

df = df.reset_index()

df = df.join(df_toronto)
df

Unnamed: 0,Postcode,Borough,Neighbourhood,Postal Code,Latitude,Longitude
0,M3A,North York,Parkwoods,M1B,43.806686,-79.194353
1,M4A,North York,Victoria Village,M1C,43.784535,-79.160497
2,M5A,Downtown Toronto,"Harbourfront,Regent Park",M1E,43.763573,-79.188711
3,M6A,North York,"Lawrence Heights,Lawrence Manor",M1G,43.770992,-79.216917
4,M7A,Queen's Park,,M1H,43.773136,-79.239476
5,M9A,Etobicoke,Islington Avenue,M1J,43.744734,-79.239476
6,M1B,Scarborough,"Rouge,Malvern",M1K,43.727929,-79.262029
7,M3B,North York,Don Mills North,M1L,43.711112,-79.284577
8,M4B,East York,"Woodbine Gardens,Parkview Hill",M1M,43.716316,-79.239476
9,M5B,Downtown Toronto,"Ryerson,Garden District",M1N,43.692657,-79.264848
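
In the output above, `Postcode` and `Postal Code` disagree on every row (for example `M3A` next to `M1B`), which suggests that `DataFrame.join` aligned the two tables by row index rather than by postal code, so the coordinates may not belong to the neighbourhoods they sit beside. A minimal sketch of a key-based alternative follows; the two small frames are stand-ins built from values shown in the outputs above, not the full datasets.

```python
import pandas as pd

# Tiny stand-ins for the scraped table and the coordinates file.
neighbourhoods = pd.DataFrame({
    "Postcode": ["M3A", "M1B"],
    "Borough": ["North York", "Scarborough"],
    "Neighbourhood": ["Parkwoods", "Rouge,Malvern"],
})
coordinates = pd.DataFrame({
    "Postal Code": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

# Match rows on the postal code itself; a left merge keeps every neighbourhood row
# and leaves NaN where no coordinates are available.
merged = neighbourhoods.merge(
    coordinates, how="left", left_on="Postcode", right_on="Postal Code"
)
print(merged)
```

With a merge like this, `M1B` receives the coordinates listed for `M1B`, regardless of where the row sits in either table.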


## Number of columns and rows of dataframe

In [7]:
df_toronto = df
df_toronto.shape

(103, 6)

## Creating a Map of Toronto with Its Neighbourhoods

In [8]:
# for the city of Toronto, latitude and longitude were looked up manually via Google search
toronto_latitude = 43.6932; toronto_longitude = -79.3832
map_toronto = folium.Map(location = [toronto_latitude, toronto_longitude], zoom_start = 10.7)

# add markers to map
for lat, lng, borough, neighborhood in zip(df_toronto['Latitude'], df_toronto['Longitude'], df_toronto['Borough'], df_toronto['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)  
    

map_toronto

## Selecting only the "Scarborough" Borough in Toronto (and Its Neighborhoods)

In [10]:
# select only the neighborhoods that belong to the "Scarborough" borough
scarborough_data = df_toronto[df_toronto['Borough'] == 'Scarborough']
scarborough_data = scarborough_data.reset_index(drop=True)
scarborough_data.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Postal Code,Latitude,Longitude
0,M1B,Scarborough,"Rouge,Malvern",M1K,43.727929,-79.262029
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union",M1S,43.7942,-79.262029
2,M1E,Scarborough,"Guildwood,Morningside,West Hill",M2J,43.778517,-79.346556
3,M1G,Scarborough,Woburn,M2N,43.77012,-79.408493
4,M1H,Scarborough,Cedarbrae,M3B,43.745906,-79.352188


## Create a Map of Scarborough and Its Neighbourhoods

In [12]:
address_scar = 'Scarborough, Toronto'
latitude_scar = 43.773077
longitude_scar = -79.257774
print('The geographical coordinates of "Scarborough" are: {}, {}.'.format(latitude_scar, longitude_scar))

map_Scarborough = folium.Map(location=[latitude_scar, longitude_scar], zoom_start=11.5)

# add markers to map
for lat, lng, label in zip(scarborough_data['Latitude'], scarborough_data['Longitude'], scarborough_data['Neighbourhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius = 10,
        popup = label,
        color ='blue',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7).add_to(map_Scarborough)  

map_Scarborough

The geographical coordinates of "Scarborough" are: 43.773077, -79.257774.


In [13]:
def foursquare_crawler(postal_code_list, neighborhood_list, lat_list, lng_list, LIMIT = 500, radius = 1000):
    """Query the Foursquare explore endpoint once per postal code and collect the raw venue items."""
    result_ds = []
    counter = 0
    for postal_code, neighborhood, lat, lng in zip(postal_code_list, neighborhood_list, lat_list, lng_list):

        # create the API request URL (CLIENT_ID, CLIENT_SECRET and VERSION are set in a later cell)
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, CLIENT_SECRET, VERSION,
            lat, lng, radius, LIMIT)

        # make the GET request and keep only the list of returned venue items
        results = requests.get(url).json()["response"]['groups'][0]['items']

        # store the raw result together with the location it was requested for
        tmp_dict = {
            'Postal Code': postal_code,
            'Neighborhood(s)': neighborhood,
            'Latitude': lat,
            'Longitude': lng,
            'Crawling_result': results,
        }
        result_ds.append(tmp_dict)
        counter += 1
        print('{}.'.format(counter))
        print('Data is Obtained, for the Postal Code {} (and Neighborhoods {}) SUCCESSFULLY.'.format(postal_code, neighborhood))
    return result_ds

In [15]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
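
The request URL in `foursquare_crawler` is assembled with `str.format`. The sketch below shows one possible alternative that URL-encodes the query parameters and reads the credentials from environment variables instead of the notebook. The helper `build_explore_url` and the environment variable names are hypothetical, and no request is sent.

```python
import os
from urllib.parse import parse_qs, urlencode, urlparse


def build_explore_url(lat, lng, radius=1000, limit=500):
    """Build the explore request URL with URL-encoded query parameters."""
    params = {
        # Environment variable names are hypothetical; use whatever your setup defines.
        "client_id": os.environ.get("FOURSQUARE_CLIENT_ID", ""),
        "client_secret": os.environ.get("FOURSQUARE_CLIENT_SECRET", ""),
        "v": "20180605",
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)


url = build_explore_url(43.773077, -79.257774)

# Decode the query string again to inspect the non-secret parameters.
query = parse_qs(urlparse(url).query)
print(query["ll"], query["radius"], query["limit"])
```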

## Crawling Foursquare database for Venues in the Neighborhoods inside "Scarborough"

In [16]:
print('Crawling different neighborhoods inside "Scarborough"')
Scarborough_foursquare_dataset = foursquare_crawler(list(scarborough_data['Postcode']),
                                                   list(scarborough_data['Neighbourhood']),
                                                   list(scarborough_data['Latitude']),
                                                   list(scarborough_data['Longitude']),)

Crawling different neighborhoods inside "Scarborough"
1.
Data is Obtained, for the Postal Code M1B (and Neighborhoods Rouge,Malvern) SUCCESSFULLY.
2.
Data is Obtained, for the Postal Code M1C (and Neighborhoods Highland Creek,Rouge Hill,Port Union) SUCCESSFULLY.
3.
Data is Obtained, for the Postal Code M1E (and Neighborhoods Guildwood,Morningside,West Hill) SUCCESSFULLY.
4.
Data is Obtained, for the Postal Code M1G (and Neighborhoods Woburn) SUCCESSFULLY.
5.
Data is Obtained, for the Postal Code M1H (and Neighborhoods Cedarbrae) SUCCESSFULLY.
6.
Data is Obtained, for the Postal Code M1J (and Neighborhoods Scarborough Village) SUCCESSFULLY.
7.
Data is Obtained, for the Postal Code M1K (and Neighborhoods East Birchmount Park,Ionview,Kennedy Park) SUCCESSFULLY.
8.
Data is Obtained, for the Postal Code M1L (and Neighborhoods Clairlea,Golden Mile,Oakridge) SUCCESSFULLY.
9.
Data is Obtained, for the Postal Code M1M (and Neighborhoods Cliffcrest,Cliffside,Scarborough Village West) SUCCESSFULL

## Saving results of Foursquare

In [17]:
import pickle
with open("Scarborough_foursquare_dataset.txt", "wb") as fp:   #Pickling
    pickle.dump(Scarborough_foursquare_dataset, fp)
print('Received Data from Internet is Saved to Computer.')

Received Data from Internet is Saved to Computer.


In [18]:
with open("Scarborough_foursquare_dataset.txt", "rb") as fp:   # Unpickling
    Scarborough_foursquare_dataset = pickle.load(fp)

## Cleaning the RAW Data Received from Foursquare Database

In [19]:
# This function reads the saved list that holds the data received from Foursquare.
# It extracts every venue of every neighborhood into one row of a DataFrame.

def get_venue_dataset(foursquare_dataset):
    columns = ['Postal Code', 'Neighborhood',
               'Neighborhood Latitude', 'Neighborhood Longitude',
               'Venue', 'Venue Summary', 'Venue Category', 'Distance']
    rows = []  # one dict per venue; the DataFrame is built once at the end

    for neigh_dict in foursquare_dataset:
        postal_code = neigh_dict['Postal Code']
        neigh = neigh_dict['Neighborhood(s)']
        lat = neigh_dict['Latitude']
        lng = neigh_dict['Longitude']
        print('Number of Venues in Coordination "{}" Postal Code and "{}" Neighborhood(s) is:'.format(postal_code, neigh))
        print(len(neigh_dict['Crawling_result']))

        for venue_dict in neigh_dict['Crawling_result']:
            # pull the fields of interest out of the nested venue record
            summary = venue_dict['reasons']['items'][0]['summary']
            name = venue_dict['venue']['name']
            dist = venue_dict['venue']['location']['distance']
            cat = venue_dict['venue']['categories'][0]['name']

            rows.append({'Postal Code': postal_code, 'Neighborhood': neigh,
                         'Neighborhood Latitude': lat, 'Neighborhood Longitude': lng,
                         'Venue': name, 'Venue Summary': summary,
                         'Venue Category': cat, 'Distance': dist})

    return pd.DataFrame(rows, columns = columns)

In [20]:
scarborough_venues = get_venue_dataset(Scarborough_foursquare_dataset)

Number of Venues in Coordination "M1B" Postal Code and "Rouge,Malvern" Neighborhood(s) is:
27
Number of Venues in Coordination "M1C" Postal Code and "Highland Creek,Rouge Hill,Port Union" Neighborhood(s) is:
46
Number of Venues in Coordination "M1E" Postal Code and "Guildwood,Morningside,West Hill" Neighborhood(s) is:
43
Number of Venues in Coordination "M1G" Postal Code and "Woburn" Neighborhood(s) is:
100
Number of Venues in Coordination "M1H" Postal Code and "Cedarbrae" Neighborhood(s) is:
29
Number of Venues in Coordination "M1J" Postal Code and "Scarborough Village" Neighborhood(s) is:
4
Number of Venues in Coordination "M1K" Postal Code and "East Birchmount Park,Ionview,Kennedy Park" Neighborhood(s) is:
59
Number of Venues in Coordination "M1L" Postal Code and "Clairlea,Golden Mile,Oakridge" Neighborhood(s) is:
8
Number of Venues in Coordination "M1M" Postal Code and "Cliffcrest,Cliffside,Scarborough Village West" Neighborhood(s) is:
42
Number of Venues in Coordination "M1N" Postal Code an
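
`get_venue_dataset` indexes directly into the nested record, so a venue without a category or a summary would raise an error. The sketch below shows one way such a record could be flattened more defensively. `flatten_item` is a hypothetical helper, and the sample item is hand-written to mirror only the keys that `get_venue_dataset` reads; it is not a complete Foursquare response.

```python
def flatten_item(item):
    """Pull the fields used above out of one raw item, tolerating missing pieces."""
    venue = item.get("venue", {})
    reasons = item.get("reasons", {}).get("items", [])
    categories = venue.get("categories", [])
    return {
        "Venue": venue.get("name"),
        "Venue Summary": reasons[0].get("summary") if reasons else None,
        "Venue Category": categories[0].get("name") if categories else None,
        "Distance": venue.get("location", {}).get("distance"),
    }


# Hand-written sample mirroring only the keys that get_venue_dataset reads.
sample_item = {
    "reasons": {"items": [{"summary": "This spot is popular"}]},
    "venue": {
        "name": "Tim Hortons",
        "location": {"distance": 851},
        "categories": [{"name": "Coffee Shop"}],
    },
}

print(flatten_item(sample_item))
print(flatten_item({"venue": {"name": "Venue without details"}}))
```

Missing fields come back as `None` instead of stopping the whole extraction, which leaves the decision about such rows to a later cleaning step.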

## Showing Venues for Each Neighborhood in Scarborough

In [21]:
scarborough_venues.head()

Unnamed: 0,Postal Code,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Summary,Venue Category,Distance
0,M1B,"Rouge,Malvern",43.727929,-79.262029,Tim Hortons,This spot is popular,Coffee Shop,851
1,M1B,"Rouge,Malvern",43.727929,-79.262029,Dollarama,This spot is popular,Discount Store,784
2,M1B,"Rouge,Malvern",43.727929,-79.262029,Chung Moi,This spot is popular,Chinese Restaurant,764
3,M1B,"Rouge,Malvern",43.727929,-79.262029,Giant Tiger,This spot is popular,Department Store,342
4,M1B,"Rouge,Malvern",43.727929,-79.262029,Subway,This spot is popular,Sandwich Place,674


In [22]:
scarborough_venues.tail()

Unnamed: 0,Postal Code,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Summary,Venue Category,Distance
960,M1X,Upper Rouge,43.643515,-79.577201,Cafe Sympatico,This spot is popular,Café,192
961,M1X,Upper Rouge,43.643515,-79.577201,No Frills,This spot is popular,Grocery Store,817
962,M1X,Upper Rouge,43.643515,-79.577201,TD Canada Trust,This spot is popular,Bank,861
963,M1X,Upper Rouge,43.643515,-79.577201,Hasty Market,This spot is popular,Convenience Store,194
964,M1X,Upper Rouge,43.643515,-79.577201,Renforth Mall Fish and Chips,This spot is popular,Fish & Chips Shop,855


## Saving a Cleaned Version of DataFrame as the Results from Foursquare

In [23]:
scarborough_venues.to_csv('scarborough_venues.csv')

## Loading Data from File (Saved "Foursquare" DataFrame for Venues)

In [24]:
scarborough_venues = pd.read_csv('scarborough_venues.csv')

## Summary Information about Neighborhoods inside "Scarborough"

In [25]:
neigh_list = list(scarborough_venues['Neighborhood'].unique())
print('Number of Neighborhoods inside Scarborough:')
print(len(neigh_list))
print('List of Neighborhoods inside Scarborough:')
neigh_list

Number of Neighborhoods inside Scarborough:
17
List of Neighborhoods inside Scarborough:


['Rouge,Malvern',
 'Highland Creek,Rouge Hill,Port Union',
 'Guildwood,Morningside,West Hill',
 'Woburn',
 'Cedarbrae',
 'Scarborough Village',
 'East Birchmount Park,Ionview,Kennedy Park',
 'Clairlea,Golden Mile,Oakridge',
 'Cliffcrest,Cliffside,Scarborough Village West',
 'Birch Cliff,Cliffside West',
 'Dorset Park,Scarborough Town Centre,Wexford Heights',
 'Maryvale,Wexford',
 'Agincourt',
 "Clarks Corners,Sullivan,Tam O'Shanter",
 "Agincourt North,L'Amoreaux East,Milliken,Steeles East",
 "L'Amoreaux West,Steeles West",
 'Upper Rouge']

## Summary Information about Neighborhoods inside "Scarborough" Cont'd

In [26]:
neigh_venue_summary = scarborough_venues.groupby('Neighborhood').count()
neigh_venue_summary.drop(columns = ['Unnamed: 0']).head()

Neighborhood,Postal Code,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Summary,Venue Category,Distance
Agincourt,100,100,100,100,100,100,100
"Agincourt North,L'Amoreaux East,Milliken,Steeles East",100,100,100,100,100,100,100
"Birch Cliff,Cliffside West",100,100,100,100,100,100,100
Cedarbrae,29,29,29,29,29,29,29
"Clairlea,Golden Mile,Oakridge",8,8,8,8,8,8,8


In [27]:
print('There are {} unique categories.'.format(len(scarborough_venues['Venue Category'].unique())))

print('Here is the list of different categories:')
list(scarborough_venues['Venue Category'].unique())

There are 216 unique categories.
Here is the list of different categories:


['Coffee Shop',
 'Discount Store',
 'Chinese Restaurant',
 'Department Store',
 'Sandwich Place',
 'Burger Joint',
 'Bank',
 'Train Station',
 'Fast Food Restaurant',
 'Hobby Shop',
 'Grocery Store',
 'Pizza Place',
 'Hockey Arena',
 'Convenience Store',
 'Bus Line',
 'Light Rail Station',
 'Rental Car Location',
 'Asian Restaurant',
 'Breakfast Spot',
 'Caribbean Restaurant',
 'Sri Lankan Restaurant',
 'Malay Restaurant',
 'Supermarket',
 'Indian Restaurant',
 'Bakery',
 'Noodle House',
 'Lounge',
 'Cantonese Restaurant',
 'Seafood Restaurant',
 'Restaurant',
 'Sushi Restaurant',
 'Pool',
 'Pharmacy',
 'Japanese Restaurant',
 'Pool Hall',
 'Shopping Mall',
 'Mediterranean Restaurant',
 'Skating Rink',
 'Shanghai Restaurant',
 'Badminton Court',
 'Motorcycle Shop',
 'Vietnamese Restaurant',
 'Filipino Restaurant',
 'Bubble Tea Shop',
 'Park',
 'BBQ Joint',
 'Toy / Game Store',
 'Electronics Store',
 'Candy Store',
 'Tea Room',
 'Movie Theater',
 'American Restaurant',
 'Juice Bar',
 'S

## One-hot Encoding the "categories" Column

In [57]:
scarborough_onehot = pd.get_dummies(data = scarborough_venues, drop_first  = False, 
                              prefix = "", prefix_sep = "", columns = ['Venue Category'])
scarborough_onehot.head()

Unnamed: 0.1,Unnamed: 0,Postal Code,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Summary,Distance,Accessories Store,Adult Boutique,Afghan Restaurant,American Restaurant,Antique Shop,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Dealership,BBQ Joint,Badminton Court,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Basketball Court,Beach,Beach Bar,Beer Bar,Beer Store,Bike Shop,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Line,Bus Stop,Butcher,Café,Cajun / Creole Restaurant,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Castle,Cheese Shop,Chinese Restaurant,Chiropractor,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Gym,College Quad,College Rec Center,College Theater,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cupcake Shop,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Diner,Discount Store,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fireworks Store,Fish & Chips Shop,Flea Market,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Fraternity House,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,Gay Bar,General Entertainment,General Travel,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hardware Store,Hawaiian Restaurant,Health Food Store,Historic Site,History Museum,Hobby Shop,Hockey Arena,Home Service,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Korean Restaurant,Latin American Restaurant,Light Rail Station,Liquor Store,Lounge,Malay Restaurant,Mediterranean Restaurant,Metro 
Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Motorcycle Shop,Movie Theater,Museum,Music School,Music Store,Music Venue,Nail Salon,Neighborhood.1,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Organic Grocery,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Persian Restaurant,Pet Store,Pharmacy,Pie Shop,Pizza Place,Playground,Plaza,Pool,Pool Hall,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Rental Car Location,Restaurant,River,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,School,Seafood Restaurant,Shanghai Restaurant,Shoe Store,Shopping Mall,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soccer Stadium,Soup Place,Souvlaki Shop,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sri Lankan Restaurant,Stadium,Steakhouse,Storage Facility,Supermarket,Sushi Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,University,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,0,M1B,"Rouge,Malvern",43.727929,-79.262029,Tim Hortons,This spot is popular,851,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,1,M1B,"Rouge,Malvern",43.727929,-79.262029,Dollarama,This spot is popular,784,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,2,M1B,"Rouge,Malvern",43.727929,-79.262029,Chung Moi,This spot is popular,764,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,3,M1B,"Rouge,Malvern",43.727929,-79.262029,Giant Tiger,This spot is popular,342,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,4,M1B,"Rouge,Malvern",43.727929,-79.262029,Subway,This spot is popular,674,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
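
To make the effect of `prefix = ""` and `prefix_sep = ""` concrete, here is a toy version of the same call on a made-up three-row frame. It also shows how summing the indicator columns per neighborhood, as the notebook does in a later cell, turns them into venue counts. The data is invented for illustration.

```python
import pandas as pd

# Made-up miniature of the venues table.
toy = pd.DataFrame({
    "Neighborhood": ["A", "A", "B"],
    "Venue Category": ["Bakery", "Coffee Shop", "Bakery"],
})

# Empty prefix and separator make the new columns carry the bare category names.
toy_onehot = pd.get_dummies(toy, columns=["Venue Category"], prefix="", prefix_sep="")
print(list(toy_onehot.columns))

# Summing the indicators per neighborhood yields the number of venues per category.
# The cast keeps the result integer whether pandas emits booleans or 0/1 values.
counts = toy_onehot.groupby("Neighborhood").sum().astype(int)
print(counts)
```

One side effect of the empty prefix is that a category name can collide with an existing column name, which is what happens with the `Neighborhood` category in the header above.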


## Selecting Related Features for the Ingredient Contractor

In [58]:
# This list of relevant features was selected manually
important_list_of_features = [
 'Neighborhood',
 'Neighborhood Latitude',
 'Neighborhood Longitude',

 'American Restaurant',
 'Asian Restaurant',
 
 'BBQ Joint',
 
 'Bakery',
 
 'Breakfast Spot',

 'Burger Joint',

 'Cajun / Creole Restaurant',
 'Cantonese Restaurant',
 'Caribbean Restaurant',
 'Chinese Restaurant',
 
 'Diner',

 'Fast Food Restaurant',
 'Filipino Restaurant',
 'Food & Drink Shop',
 'Fried Chicken Joint',
 
 'Greek Restaurant',
 'Grocery Store',
 
 'Indian Restaurant',

 'Italian Restaurant',
 'Japanese Restaurant',
 'Korean Restaurant',
 'Latin American Restaurant',

 'Malay Restaurant',
 
 'Mediterranean Restaurant',
 
 'Mexican Restaurant',
 'Middle Eastern Restaurant',
 
 'Noodle House',
 
 'Pizza Place',
 
 'Restaurant',
 'Sandwich Place',
 'Seafood Restaurant',
 'Shanghai Restaurant',
 
 'Sushi Restaurant',
 'Taiwanese Restaurant',
 
 'Thai Restaurant',
 
 'Vegetarian / Vegan Restaurant',
 
 'Vietnamese Restaurant',
 'Wings Joint']

## Updating the One-hot Encoded DataFrame and Grouping the Data by Neighborhoods

In [60]:
# 'Unnamed: 0' is not among the selected features, so only the coordinate columns are dropped
scarborough_onehot = scarborough_onehot[important_list_of_features].drop(
    columns = ['Neighborhood Latitude', 'Neighborhood Longitude'])

In [61]:
scarborough_onehot.head()

Unnamed: 0,Neighborhood,Neighborhood.1,American Restaurant,Asian Restaurant,BBQ Joint,Bakery,Breakfast Spot,Burger Joint,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Diner,Fast Food Restaurant,Filipino Restaurant,Food & Drink Shop,Fried Chicken Joint,Greek Restaurant,Grocery Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Noodle House,Pizza Place,Restaurant,Sandwich Place,Seafood Restaurant,Shanghai Restaurant,Sushi Restaurant,Taiwanese Restaurant,Thai Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint
0,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0


### Remove the duplicate "Neighborhood" columns

In [63]:
# Getting a list of columns
scarborough_onehot.columns

Index(['Neighborhood', 'Neighborhood', 'American Restaurant',
       'Asian Restaurant', 'BBQ Joint', 'Bakery', 'Breakfast Spot',
       'Burger Joint', 'Cajun / Creole Restaurant', 'Cantonese Restaurant',
       'Caribbean Restaurant', 'Chinese Restaurant', 'Diner',
       'Fast Food Restaurant', 'Filipino Restaurant', 'Food & Drink Shop',
       'Fried Chicken Joint', 'Greek Restaurant', 'Grocery Store',
       'Indian Restaurant', 'Italian Restaurant', 'Japanese Restaurant',
       'Korean Restaurant', 'Latin American Restaurant', 'Malay Restaurant',
       'Mediterranean Restaurant', 'Mexican Restaurant',
       'Middle Eastern Restaurant', 'Noodle House', 'Pizza Place',
       'Restaurant', 'Sandwich Place', 'Seafood Restaurant',
       'Shanghai Restaurant', 'Sushi Restaurant', 'Taiwanese Restaurant',
       'Thai Restaurant', 'Vegetarian / Vegan Restaurant',
       'Vietnamese Restaurant', 'Wings Joint'],
      dtype='object')
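
The two `Neighborhood` labels in the index above can also be handled without retyping the full column list. The sketch below is one possible shortcut on a toy frame: it keeps the first column under each label, which assumes that the first `Neighborhood` column is the one holding the neighborhood names. The toy data is invented for illustration.

```python
import pandas as pd

# Toy frame with a repeated column label, similar to the two 'Neighborhood' columns above.
toy = pd.DataFrame(
    [["Rouge,Malvern", 0, 1], ["Woburn", 0, 0]],
    columns=["Neighborhood", "Neighborhood", "Bakery"],
)

# columns.duplicated() flags every repeat of a label after its first occurrence,
# so negating it keeps only the first column under each name.
deduplicated = toy.loc[:, ~toy.columns.duplicated()]
print(list(deduplicated.columns))
```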

In [64]:
#Changing column name of duplicated 'Neighborhood' column
scarborough_onehot.columns = ['Neighborhood', 'to_be_removed', 'American Restaurant',
       'Asian Restaurant', 'BBQ Joint', 'Bakery', 'Breakfast Spot',
       'Burger Joint', 'Cajun / Creole Restaurant', 'Cantonese Restaurant',
       'Caribbean Restaurant', 'Chinese Restaurant', 'Diner',
       'Fast Food Restaurant', 'Filipino Restaurant', 'Food & Drink Shop',
       'Fried Chicken Joint', 'Greek Restaurant', 'Grocery Store',
       'Indian Restaurant', 'Italian Restaurant', 'Japanese Restaurant',
       'Korean Restaurant', 'Latin American Restaurant', 'Malay Restaurant',
       'Mediterranean Restaurant', 'Mexican Restaurant',
       'Middle Eastern Restaurant', 'Noodle House', 'Pizza Place',
       'Restaurant', 'Sandwich Place', 'Seafood Restaurant',
       'Shanghai Restaurant', 'Sushi Restaurant', 'Taiwanese Restaurant',
       'Thai Restaurant', 'Vegetarian / Vegan Restaurant',
       'Vietnamese Restaurant', 'Wings Joint']

scarborough_onehot.head()

Unnamed: 0,Neighborhood,to_be_removed,American Restaurant,Asian Restaurant,BBQ Joint,Bakery,Breakfast Spot,Burger Joint,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Diner,Fast Food Restaurant,Filipino Restaurant,Food & Drink Shop,Fried Chicken Joint,Greek Restaurant,Grocery Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Noodle House,Pizza Place,Restaurant,Sandwich Place,Seafood Restaurant,Shanghai Restaurant,Sushi Restaurant,Taiwanese Restaurant,Thai Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint
0,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0


In [66]:
# Dropping the column that was once a duplicate
scarborough_onehot.drop(columns = ['to_be_removed'], inplace = True)
scarborough_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Asian Restaurant,BBQ Joint,Bakery,Breakfast Spot,Burger Joint,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Diner,Fast Food Restaurant,Filipino Restaurant,Food & Drink Shop,Fried Chicken Joint,Greek Restaurant,Grocery Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Noodle House,Pizza Place,Restaurant,Sandwich Place,Seafood Restaurant,Shanghai Restaurant,Sushi Restaurant,Taiwanese Restaurant,Thai Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint
0,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0


### Grouping the data by Neighborhoods

In [70]:
scarborough_onehot = scarborough_onehot.groupby('Neighborhood').sum()

scarborough_onehot.head()

Neighborhood,American Restaurant,Asian Restaurant,BBQ Joint,Bakery,Breakfast Spot,Burger Joint,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Diner,Fast Food Restaurant,Filipino Restaurant,Food & Drink Shop,Fried Chicken Joint,Greek Restaurant,Grocery Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Noodle House,Pizza Place,Restaurant,Sandwich Place,Seafood Restaurant,Shanghai Restaurant,Sushi Restaurant,Taiwanese Restaurant,Thai Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wings Joint
Agincourt,1,0,2,3,1,1,0,0,2,0,1,0,0,0,0,0,0,2,2,1,0,0,0,0,2,0,0,0,4,2,1,0,0,0,0,3,0,0
"Agincourt North,L'Amoreaux East,Milliken,Steeles East",0,0,0,0,1,3,0,0,0,1,2,0,0,0,0,0,1,1,2,2,0,0,0,0,1,0,0,3,2,1,2,0,4,0,1,1,0,1
"Birch Cliff,Cliffside West",4,2,0,2,1,1,0,0,0,0,0,0,0,0,0,1,0,1,1,3,0,0,0,1,0,0,1,1,3,0,1,0,3,0,2,1,0,0
Cedarbrae,0,1,0,0,1,2,0,0,1,0,1,0,0,0,0,1,0,0,0,3,0,0,0,0,0,0,0,2,1,0,0,0,0,0,1,0,0,0
"Clairlea,Golden Mile,Oakridge",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
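
As a minimal sketch of what the `groupby('Neighborhood').sum()` step does, the toy frame below has one row per venue and a 0/1 column per category. Its rows and names (`toy_onehot`, `toy_counts`) are hypothetical and not taken from the Scarborough data. Summing within each neighborhood turns the indicators into venue counts per category.

```python
import pandas as pd

# Hypothetical one-hot rows: one row per venue, 1 marks the venue's category
toy_onehot = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B'],
    'Bakery':       [1, 0, 0, 0],
    'Pizza Place':  [0, 1, 1, 0],
    'Diner':        [0, 0, 0, 1],
})

# Summing the indicators per neighborhood gives the number of venues per category
toy_counts = toy_onehot.groupby('Neighborhood').sum()
print(toy_counts)
```

A row of zeros in the grouped frame, as for "Clairlea,Golden Mile,Oakridge" above, simply means that none of that neighborhood's venues fell into the listed categories.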


## Integrating Different Restaurants and Different Joints

#### (Assuming Different Restaurants Use the Same Raw Ingredients)

#### This assumption is made because the dataset does not contain enough detail about the neighborhoods

In [71]:
# Collect every column whose name contains 'Restaurant' and merge them into one total
restaurant_list = [col for col in scarborough_onehot.columns if 'Restaurant' in col]
scarborough_onehot['Total Restaurants'] = scarborough_onehot[restaurant_list].sum(axis = 1)
scarborough_onehot = scarborough_onehot.drop(columns = restaurant_list)

# Do the same for every column whose name contains 'Joint'
joint_list = [col for col in scarborough_onehot.columns if 'Joint' in col]
scarborough_onehot['Total Joints'] = scarborough_onehot[joint_list].sum(axis = 1)
scarborough_onehot = scarborough_onehot.drop(columns = joint_list)
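
The two merging steps follow the same pattern, so they could be wrapped in one helper. The sketch below is only an illustration: `collapse_columns` and the toy frame are hypothetical names and data, not part of the original notebook.

```python
import pandas as pd

def collapse_columns(df, keyword, total_name):
    """Sum every column whose name contains `keyword` into a new column `total_name`."""
    matching = [col for col in df.columns if keyword in col]
    out = df.drop(columns=matching)           # keep only the non-matching columns
    out[total_name] = df[matching].sum(axis=1)  # add the row-wise total
    return out

# Hypothetical venue counts for two neighborhoods
toy = pd.DataFrame(
    {'Thai Restaurant': [1, 0], 'Restaurant': [2, 1],
     'BBQ Joint': [0, 3], 'Bakery': [1, 1]},
    index=['A', 'B'],
)
toy = collapse_columns(toy, 'Restaurant', 'Total Restaurants')
toy = collapse_columns(toy, 'Joint', 'Total Joints')
print(toy)
```

Note that the substring match also catches the generic `Restaurant` column, which is the intended behaviour here since every restaurant type is assumed to use the same raw ingredients.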

## Showing the Fully-Processed DataFrame about Neighborhoods inside Scarborough

In [72]:
scarborough_onehot

Neighborhood,Bakery,Breakfast Spot,Diner,Food & Drink Shop,Grocery Store,Noodle House,Pizza Place,Sandwich Place,Total Restaurants,Total Joints
Agincourt,3,1,1,0,0,0,0,2,18,3
"Agincourt North,L'Amoreaux East,Milliken,Steeles East",0,1,2,0,1,0,3,1,17,4
"Birch Cliff,Cliffside West",2,1,0,0,0,1,1,0,23,1
Cedarbrae,0,1,1,0,0,0,2,0,8,2
"Clairlea,Golden Mile,Oakridge",0,0,0,0,0,0,0,0,0,0
"Clarks Corners,Sullivan,Tam O'Shanter",3,2,1,1,2,0,1,3,21,1
"Cliffcrest,Cliffside,Scarborough Village West",0,0,2,0,0,0,0,0,13,0
"Dorset Park,Scarborough Town Centre,Wexford Heights",3,0,0,0,3,0,3,2,28,2
"East Birchmount Park,Ionview,Kennedy Park",0,1,0,0,3,0,0,2,7,3
"Guildwood,Morningside,West Hill",2,0,0,0,0,0,1,2,6,3


##### This dataset is ready for analysis 

## K-Means Clustering

In [73]:
# import k-means from clustering stage
from sklearn.cluster import KMeans

# Run k-means clustering using 5 clusters
kmeans = KMeans(n_clusters = 5, random_state = 0).fit(scarborough_onehot)
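
The notebook fixes the number of clusters at 5. One common way to sanity-check such a choice is to compare the inertia (within-cluster sum of squared distances) for several values of `k`. The sketch below does this on a hypothetical toy array, not on the Scarborough data, and passes `n_init=10` explicitly as an assumption so that the result does not depend on the scikit-learn version's default.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical venue counts: three well-separated groups of two neighborhoods
toy_counts = np.array([[0, 1], [1, 0],
                       [10, 11], [11, 10],
                       [20, 21], [21, 20]], dtype=float)

# Fit k-means for several k and record the inertia of each fit
inertias = {}
for k in range(1, 5):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(toy_counts)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 2))
```

The value of `k` after which the inertia stops dropping sharply is a reasonable candidate; with only a handful of neighborhoods this is a rough guide rather than a rule.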

### Displaying Centers of Each Cluster

In [74]:
means_df = pd.DataFrame(kmeans.cluster_centers_)
means_df.columns = scarborough_onehot.columns
means_df.index = ['G1','G2','G3','G4','G5']
means_df['Total Sum'] = means_df.sum(axis = 1)
means_df.sort_values(axis = 0, by = ['Total Sum'], ascending=False)

Unnamed: 0,Bakery,Breakfast Spot,Diner,Food & Drink Shop,Grocery Store,Noodle House,Pizza Place,Sandwich Place,Total Restaurants,Total Joints,Total Sum
G3,2.0,0.0,0.0,0.0,2.0,0.0,4.5,3.0,29.5,2.5,43.5
G2,2.0,1.2,0.8,0.2,0.8,0.4,1.2,1.6,19.8,2.0,30.0
G5,0.0,0.0,1.0,0.0,0.5,0.0,0.0,0.5,12.0,1.5,15.5
G1,0.6,0.8,0.2,0.0,1.0,0.0,1.2,1.0,7.2,2.2,14.2
G4,0.0,0.0,-5.5511150000000004e-17,6.938894e-18,0.333333,1.387779e-17,0.333333,0.0,1.0,0.0,1.666667


### Result (groups ranked by the Total Sum of their cluster centers):
#### Best group: G3
#### Second-best group: G2
#### Third-best group: G5
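
The ranking above is read off the sorted table by eye. As a hedged sketch, the same ordering can be derived in code and attached to each neighborhood, which yields the sorted list of neighborhoods the project aims for. The arrays below are hypothetical stand-ins for `kmeans.cluster_centers_` and `kmeans.labels_`, and the names `ranked` and `rank_of_cluster` are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for kmeans.cluster_centers_ and kmeans.labels_
centers = np.array([[1.0, 7.0],    # cluster 0
                    [2.0, 20.0],   # cluster 1
                    [0.0, 1.0]])   # cluster 2
labels = np.array([1, 0, 2, 1])
neighborhoods = ['N1', 'N2', 'N3', 'N4']

# Rank clusters by the sum of their center coordinates (rank 1 = largest sum)
order = np.argsort(-centers.sum(axis=1))  # cluster ids, best first
rank_of_cluster = {int(cluster): rank + 1 for rank, cluster in enumerate(order)}

# Attach each neighborhood's cluster rank and sort so the best cluster comes first
ranked = pd.DataFrame({'Neighborhood': neighborhoods, 'Cluster': labels})
ranked['Rank'] = ranked['Cluster'].map(rank_of_cluster)
ranked = ranked.sort_values('Rank', kind='stable').reset_index(drop=True)
print(ranked)
```

Mapping labels to ranks this way avoids having to remember which group number corresponds to which position in the ranking when filtering neighborhoods later.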

## Inserting "kmeans.labels_" into the Original Scarborough DataFrame
### Finding the Corresponding Group for Each Neighborhood.

In [76]:
neigh_summary = pd.DataFrame([scarborough_onehot.index, 1 + kmeans.labels_]).T
neigh_summary.columns = ['Neighborhood', 'Group']
neigh_summary

Unnamed: 0,Neighborhood,Group
0,Agincourt,2
1,"Agincourt North,L'Amoreaux East,Milliken,Steel...",2
2,"Birch Cliff,Cliffside West",2
3,Cedarbrae,1
4,"Clairlea,Golden Mile,Oakridge",4
5,"Clarks Corners,Sullivan,Tam O'Shanter",2
6,"Cliffcrest,Cliffside,Scarborough Village West",5
7,"Dorset Park,Scarborough Town Centre,Wexford He...",3
8,"East Birchmount Park,Ionview,Kennedy Park",1
9,"Guildwood,Morningside,West Hill",1


## Deducing Results:
### Best Neighborhoods (Group 3)

In [77]:
neigh_summary[neigh_summary['Group'] == 3]

Unnamed: 0,Neighborhood,Group
7,"Dorset Park,Scarborough Town Centre,Wexford He...",3
16,Woburn,3


In [78]:
# Location details of the first neighborhood in Group 5 (use == 3 for the best group)
name_of_neigh = list(neigh_summary[neigh_summary['Group'] == 5]['Neighborhood'])[0]
scarborough_venues[scarborough_venues['Neighborhood'] == name_of_neigh].iloc[0,1:5].to_dict()

{'Neighborhood': 'Cliffcrest,Cliffside,Scarborough Village West',
 'Neighborhood Latitude': 43.667966999999997,
 'Neighborhood Longitude': -79.367675300000002,
 'Postal Code': 'M1M'}

### Group 1 Neighborhoods (Fourth by Total Sum)

In [79]:
neigh_summary[neigh_summary['Group'] == 1]

Unnamed: 0,Neighborhood,Group
3,Cedarbrae,1
8,"East Birchmount Park,Ionview,Kennedy Park",1
9,"Guildwood,Morningside,West Hill",1
11,"L'Amoreaux West,Steeles West",1
13,"Rouge,Malvern",1


### Group 4 Neighborhoods (Last by Total Sum)

In [80]:
neigh_summary[neigh_summary['Group'] == 4]

Unnamed: 0,Neighborhood,Group
4,"Clairlea,Golden Mile,Oakridge",4
14,Scarborough Village,4
15,Upper Rouge,4


---

## Thank You for viewing!
### By - TJQ