# Predicting the Best Place for a Warehouse

## Exploring Toronto Neighborhoods to Open a New Warehouse in Etobicoke


## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

It has never been easier to order from a warehouse store. Why leave the comfort of your warm bed when you can press a few buttons and have your full order (produce and all) delivered to your door shortly afterwards? A warehouse contractor wants to open a warehouse at a location from which his team can deliver products to customers in the surrounding borough. The warehouse should be the best placed of all in that borough: the less time the contractor needs to deliver the products, the greater the benefit. So how can this be achieved?
To minimize the chance of late deliveries, the contractor should plan and research the location so that customers experience the least possible delay. Customer satisfaction depends on prompt delivery, good quality, a fair price, and so on.

The daily work of the warehouse contractor is to deliver products to local customers in the least possible time. The delivery address may be far or near. Many regular customers run successful businesses and have very high demand. These are the most valuable customers, and the contractor does not want to keep them waiting. How can products be delivered to them in minimum time?
We use data for the city of Toronto, which includes many boroughs. The data is taken from the Wikipedia page https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M and processed for this analysis.

We want to find the best location in the borough of Etobicoke, Toronto, such that the delivery addresses nearest to the warehouse belong to the most valuable customers, those who bring the warehouse the most benefit. The neighbourhoods around the chosen location should contain a large number of customers.


## Data <a name="data"></a>

The data for this project combines three sources. The first is the Wikipedia page "List of postal codes of Canada: M", which lists the neighbourhoods in each Toronto borough.
The second dataset is created from scratch using the list of neighbourhoods available on the site "Latitudes and Longitudes". This page contains additional information about the boroughs.
The third data source is the Foursquare API. Venue information around each neighbourhood's latitude and longitude is requested by URL from the Foursquare API. Each request uses:
- CLIENT_ID = # your Foursquare ID
- CLIENT_SECRET = # your Foursquare Secret
- VERSION = # Foursquare API version
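
One way to supply these three values without writing them into the notebook is to read them from environment variables. The following is only a minimal sketch: the helper `load_foursquare_credentials` and the variable names `FOURSQUARE_CLIENT_ID`, `FOURSQUARE_CLIENT_SECRET` and `FOURSQUARE_VERSION` are assumptions made for illustration and are not part of the Foursquare API.

```python
import os


def load_foursquare_credentials(env=None):
    """Return the Foursquare settings as a dict read from `env`.

    `env` defaults to os.environ. The environment variable names are
    hypothetical and chosen only for this sketch.
    """
    env = os.environ if env is None else env
    required = ('FOURSQUARE_CLIENT_ID', 'FOURSQUARE_CLIENT_SECRET')
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError('missing environment variables: ' + ', '.join(missing))
    return {
        'CLIENT_ID': env['FOURSQUARE_CLIENT_ID'],
        'CLIENT_SECRET': env['FOURSQUARE_CLIENT_SECRET'],
        # fall back to the API version used in this notebook
        'VERSION': env.get('FOURSQUARE_VERSION', '20180605'),
    }


# Example with an explicit mapping instead of the real environment
demo = load_foursquare_credentials(
    {'FOURSQUARE_CLIENT_ID': 'abc', 'FOURSQUARE_CLIENT_SECRET': 'xyz'})
print(sorted(demo))
```

Passing a plain dictionary, as in the last lines, makes the helper easy to try out without touching the real environment.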

In [1]:
from bs4 import BeautifulSoup

# library to handle data in a vectorized manner
import numpy as np

# library for data analysis
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

# library to handle JSON files
import json
print('numpy, pandas ... imported.')

# !pip -q install geopy
# conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
print('geopy installed...')
# convert an address into latitude and longitude values
from geopy.geocoders import Nominatim
print('Nominatim imported...')

# library to handle requests
import requests
print('requests imported...')

# transform a JSON file into a pandas DataFrame
from pandas import json_normalize  # available at the top level since pandas 1.0
print('json_normalize imported...')

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
print('matplotlib imported...')

# import k-means from clustering stage
from sklearn.cluster import KMeans
print('Kmeans imported...')

# install the Geocoder
# !pip -q install geocoder
import geocoder

# import time
import time
import io

# !conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
# !pip -q install folium
print('folium installed...')
import folium # map rendering library
print('folium imported...')
print('Done')

numpy, pandas ... imported.
geopy installed...
Nominatim imported...
requests imported...
json_normalize imported...
matplotlib imported...
Kmeans imported...
folium installed...
folium imported...
Done


In [2]:
# Scrape the raw table of Canadian postal codes (prefix M) and convert it to a pandas DataFrame

In [3]:
url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
html = requests.get(url).text

In [4]:
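# wrap the HTML in StringIO: passing a literal HTML string is deprecated in recent pandas
df = pd.read_html(io.StringIO(html), header=0)[0]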

In [5]:
print('Shape: {}'.format(df.shape))
df.head()

Shape: (287, 3)


Unnamed: 0,Postcode,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


In [6]:
df = df[df.Borough != 'Not assigned']

In [7]:
print('Shape: {}'.format(df.shape))
df.head()

Shape: (210, 3)


Unnamed: 0,Postcode,Borough,Neighborhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
5,M6A,North York,Lawrence Heights
6,M6A,North York,Lawrence Manor


In [8]:
# Adding "Latitude" and "Longitude" columns

In [9]:
# Merging all boroughs with their latitudes and longitudes

In [10]:
url_location = "http://cocl.us/Geospatial_data"
source_location = requests.get(url_location).content
c = pd.read_csv(io.StringIO(source_location.decode('utf-8')))

In [11]:
c.columns = ['Postcode', 'Latitude', 'Longitude']
df = pd.merge(c, df, on='Postcode')

In [12]:
df = df[['Postcode', 'Borough', 'Neighborhood', 'Latitude', 'Longitude']]
df.head()

Unnamed: 0,Postcode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,Rouge,43.806686,-79.194353
1,M1B,Scarborough,Malvern,43.806686,-79.194353
2,M1C,Scarborough,Highland Creek,43.784535,-79.160497
3,M1C,Scarborough,Rouge Hill,43.784535,-79.160497
4,M1C,Scarborough,Port Union,43.784535,-79.160497


In [13]:
print("We are working with {} boroughs and {} neighborhoods".format(len(df['Borough'].unique()), df.shape[0]))

We are working with 11 boroughs and 210 neighborhoods


In [36]:
# Getting a map of Toronto using Folium

In [14]:
# latitude and longitude of Toronto, taken from the data collected above
toronto_latitude = 43.739416; toronto_longitude = -79.594054


map_toronto = folium.Map(location = [toronto_latitude, toronto_longitude], zoom_start = 11)

# add markers to map
for lat, lng, borough, neighborhood in zip(df['Latitude'], df['Longitude'], df['Borough'], df['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)  
    
map_toronto

In [15]:
# Getting all data for the borough 'Etobicoke'

In [16]:

Etobicoke_data = df[df['Borough'] == 'Etobicoke']
Etobicoke_data.head()

Unnamed: 0,Postcode,Borough,Neighborhood,Latitude,Longitude
161,M8V,Etobicoke,Humber Bay Shores,43.605647,-79.501321
162,M8V,Etobicoke,Mimico South,43.605647,-79.501321
163,M8V,Etobicoke,New Toronto,43.605647,-79.501321
164,M8W,Etobicoke,Alderwood,43.602414,-79.543484
165,M8W,Etobicoke,Long Branch,43.602414,-79.543484


In [17]:
# Map of the borough 'Etobicoke'

In [18]:
address_etobicoke = 'Etobicoke, Toronto'

latitude_etobicoke = 43.605647
longitude_etobicoke = -79.501321
print('The geographical coordinates of "Etobicoke" are: {}, {}.'.format(latitude_etobicoke, longitude_etobicoke))

map_Etobicoke = folium.Map(location=[latitude_etobicoke, longitude_etobicoke], zoom_start=10)

# add one marker per neighborhood to the map
for lat, lng, label in zip(Etobicoke_data['Latitude'], Etobicoke_data['Longitude'], Etobicoke_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius = 10,
        popup = label,
        color ='blue',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7).add_to(map_Etobicoke)

map_Etobicoke

The geographical coordinates of "Etobicoke" are: 43.605647, -79.501321.


## Methodology <a name="methodology"></a>

In this project we direct our efforts at detecting areas of Toronto with low restaurant density, particularly those with few warehouse teams supplying the goods that small stores and restaurants need to operate. We limit our analysis to one particular borough, Etobicoke.

In the first step we collected the required **data: the location and type (category) of every venue near the borough center** (Etobicoke). We also **identified the venue categories** (according to the Foursquare categorization).

The second step in our analysis is the calculation and exploration of '**shop centre density**' across different areas of Etobicoke, Toronto. We use **one-hot encoding** to turn the categorical venue data into numeric columns that are easier to read and aggregate.

In the third and final step we focus on the most promising areas and, within those, create **clusters of locations that meet some basic requirements** established in discussion with stakeholders: we take into consideration locations with **shop centres and restaurants close by, within a radius of 250 meters**, and we want locations **without Italian restaurants in that borough**. We present a map of all such locations and also create clusters (using **k-means clustering**) of those locations to identify general zones / neighborhoods / addresses that should be a starting point for the final 'street level' exploration and search for the optimal venue location by stakeholders.
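
The one-hot encoding step can be illustrated on a tiny table. This is a sketch on made-up rows (the neighbourhood names `A` and `B` are placeholders), not the project data: each venue category becomes its own 0/1 column, and summing per neighbourhood yields venue counts that a clustering algorithm can work with.

```python
import pandas as pd

# Toy venue table (made-up rows that mimic the real column names)
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Pizza Place', 'Grocery Store', 'Pizza Place',
                       'Bank', 'Grocery Store'],
})

# One 0/1 column per category, named after the category itself
onehot = pd.get_dummies(venues, columns=['Venue Category'],
                        prefix='', prefix_sep='', dtype=int)

# Number of venues of each category per neighborhood
counts = onehot.groupby('Neighborhood').sum()
print(counts)
```

Each row of `counts` is then a numeric feature vector describing one neighbourhood.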

### Foursquare
Now that we have our location candidates, let's use the Foursquare API to get information on the categories of customers (venues) in each neighborhood.

We are interested in venues close to Etobicoke; venues far from the borough are not relevant to us. So our list includes only venues within the shortest distance of Etobicoke.
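
As a rough sketch of this idea, the request URL can be assembled from its parameters and the returned items filtered by their reported distance. The helper names `build_explore_url` and `venues_within` are hypothetical; the endpoint and the item layout (`venue` → `location` → `distance`) follow the way this notebook reads the Foursquare response, and the two sample items reuse venue names and distances printed later in the notebook.

```python
from urllib.parse import urlencode

EXPLORE_URL = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Assemble the explore request URL; urlencode escapes the values."""
    query = urlencode({
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    })
    return EXPLORE_URL + '?' + query


def venues_within(items, max_distance):
    """Keep only items whose venue lies within max_distance meters."""
    return [item for item in items
            if item['venue']['location']['distance'] <= max_distance]


url = build_explore_url('ID', 'SECRET', '20180605', 43.605647, -79.501321, 750, 500)

sample_items = [
    {'venue': {'name': 'LCBO', 'location': {'distance': 408}}},
    {'venue': {'name': 'Cellar Door Restaurant', 'location': {'distance': 790}}},
]
near = venues_within(sample_items, 750)
print([item['venue']['name'] for item in near])
```

Filtering on the client side like this is only one option; the `radius` parameter of the request already limits how far the search reaches.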

In [19]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
radius = 750
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET:YOUR_FOURSQUARE_CLIENT_SECRET


In [20]:
def foursquare_crawler(postal_code_list, neighborhood_list, lat_list, lng_list, LIMIT = 500, radius = 1000):
    """Query the Foursquare explore endpoint once per neighborhood and collect the raw results."""
    result_ds = []
    rows = zip(postal_code_list, neighborhood_list, lat_list, lng_list)
    for counter, (postal_code, neighborhood, lat, lng) in enumerate(rows, start = 1):

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, CLIENT_SECRET, VERSION,
            lat, lng, radius, LIMIT)

        # make the GET request and keep the list of recommended venues
        results = requests.get(url).json()['response']['groups'][0]['items']
        result_ds.append({
            'Postal Code': postal_code,
            'Neighborhood(s)': neighborhood,
            'Latitude': lat,
            'Longitude': lng,
            'Crawling_result': results,
        })
        print('{}: Postal Code: {}, Neighborhoods: {}.'.format(counter, postal_code, neighborhood))
    return result_ds

## Analysis <a name="analysis"></a>

Let's perform some basic exploratory data analysis and derive some additional information from our raw data. First, let's crawl the venues around the different neighborhoods inside "Etobicoke".


In [21]:
print('Crawling different neighborhoods inside "Etobicoke"')
Etobicoke_foursquare_dataset = foursquare_crawler(list(Etobicoke_data['Postcode']),
                                                   list(Etobicoke_data['Neighborhood']),
                                                   list(Etobicoke_data['Latitude']),
                                                   list(Etobicoke_data['Longitude']),)

Crawling different neighborhoods inside "Etobicoke"
1: Postal Code: M8V, Neighborhoods: Humber Bay Shores.
2: Postal Code: M8V, Neighborhoods: Mimico South.
3: Postal Code: M8V, Neighborhoods: New Toronto.
4: Postal Code: M8W, Neighborhoods: Alderwood.
5: Postal Code: M8W, Neighborhoods: Long Branch.
6: Postal Code: M8X, Neighborhoods: The Kingsway.
7: Postal Code: M8X, Neighborhoods: Montgomery Road.
8: Postal Code: M8X, Neighborhoods: Old Mill North.
9: Postal Code: M8Y, Neighborhoods: Humber Bay.
10: Postal Code: M8Y, Neighborhoods: King's Mill Park.
11: Postal Code: M8Y, Neighborhoods: Kingsway Park South East.
12: Postal Code: M8Y, Neighborhoods: Mimico NE.
13: Postal Code: M8Y, Neighborhoods: Old Mill South.
14: Postal Code: M8Y, Neighborhoods: The Queensway East.
15: Postal Code: M8Y, Neighborhoods: Royal York South East.
16: Postal Code: M8Y, Neighborhoods: Sunnylea.
17: Postal Code: M8Z, Neighborhoods: Kingsway Park South West.
18: Postal Code: M8Z, Neighborhoods: Mimico NW.
1

Pickling is a way to convert a Python object (list, dict, etc.) into a byte stream. The idea is that this byte stream contains all the information necessary to reconstruct the object in another Python script.
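
A minimal round trip shows the idea. The record below is an illustrative stand-in shaped like one entry of the crawled dataset (with an empty result list); `pickle.dumps` and `pickle.loads` are the in-memory counterparts of the `pickle.dump` and `pickle.load` calls used in the next cells.

```python
import pickle

# Stand-in for one crawled entry (values copied from the tables above)
record = {
    'Postal Code': 'M8V',
    'Neighborhood(s)': 'Humber Bay Shores',
    'Latitude': 43.605647,
    'Longitude': -79.501321,
    'Crawling_result': [],
}

payload = pickle.dumps([record])   # Python object -> bytes
restored = pickle.loads(payload)   # bytes -> equal Python object

print(type(payload).__name__, restored == [record])
```

Because the stream is binary, the file must be opened in `"wb"` / `"rb"` mode, and pickled data should only be loaded from sources you trust.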

In [22]:
import pickle
with open("Etobicoke_foursquare_dataset.txt", "wb") as fp:   #Pickling
    pickle.dump(Etobicoke_foursquare_dataset, fp)
print('Data Saved to Computer.')

Data Saved to Computer.


In [23]:
with open("Etobicoke_foursquare_dataset.txt", "rb") as fp:   # Unpickling
    Etobicoke_foursquare_dataset = pickle.load(fp)

In [24]:
# Getting number of venues for each postal_code

In [25]:
def get_venue_dataset(foursquare_dataset):
    """Flatten the crawled Foursquare results into one DataFrame row per venue."""
    columns = ['Postal Code', 'Neighborhood',
               'Neighborhood Latitude', 'Neighborhood Longitude',
               'Venue', 'Venue Summary', 'Venue Category', 'Distance']
    rows = []

    for neigh_dict in foursquare_dataset:
        postal_code = neigh_dict['Postal Code']
        neigh = neigh_dict['Neighborhood(s)']
        lat = neigh_dict['Latitude']
        lng = neigh_dict['Longitude']
        print('Number of venues for postal code "{}" and neighborhood(s) "{}" is:'.format(postal_code, neigh))
        print(len(neigh_dict['Crawling_result']))

        for venue_dict in neigh_dict['Crawling_result']:
            # DataFrame.append was removed in pandas 2.0, so rows are collected in a list
            rows.append({'Postal Code': postal_code, 'Neighborhood': neigh,
                         'Neighborhood Latitude': lat, 'Neighborhood Longitude': lng,
                         'Venue': venue_dict['venue']['name'],
                         'Venue Summary': venue_dict['reasons']['items'][0]['summary'],
                         'Venue Category': venue_dict['venue']['categories'][0]['name'],
                         'Distance': venue_dict['venue']['location']['distance']})

    return pd.DataFrame(rows, columns = columns)

In [26]:
Etobicoke_venues = get_venue_dataset(Etobicoke_foursquare_dataset)

Number of venues for postal code "M8V" and neighborhood(s) "Humber Bay Shores" is:
18
Number of venues for postal code "M8V" and neighborhood(s) "Mimico South" is:
18
Number of venues for postal code "M8V" and neighborhood(s) "New Toronto" is:
18
Number of venues for postal code "M8W" and neighborhood(s) "Alderwood" is:
27
Number of venues for postal code "M8W" and neighborhood(s) "Long Branch" is:
27
Number of venues for postal code "M8X" and neighborhood(s) "The Kingsway" is:
45
Number of venues for postal code "M8X" and neighborhood(s) "Montgomery Road" is:
45
Number of venues for postal code "M8X" and neighborhood(s) "Old Mill North" is:
45
Number of venues for postal code "M8Y" and neighborhood(s) "Humber Bay" is:
7
Number of venues for postal code "M8Y" and neighborhood(s) "King's Mill Park" is:
7
Number of venues for postal code "M8Y" and neighborhood(s) "Kingsway Park South 

In [27]:
Etobicoke_venues.head()


Unnamed: 0,Postal Code,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Summary,Venue Category,Distance
0,M8V,Humber Bay Shores,43.605647,-79.501321,LCBO,This spot is popular,Liquor Store,408
1,M8V,Humber Bay Shores,43.605647,-79.501321,Huevos Gourmet,This spot is popular,Mexican Restaurant,532
2,M8V,Humber Bay Shores,43.605647,-79.501321,Sweet Olenka's,This spot is popular,Dessert Shop,512
3,M8V,Humber Bay Shores,43.605647,-79.501321,Kitchen on 6th,This spot is popular,Breakfast Spot,540
4,M8V,Humber Bay Shores,43.605647,-79.501321,Cellar Door Restaurant,This spot is popular,Italian Restaurant,790


In [28]:
# Saving to disk for use in further evaluation
Etobicoke_venues.to_csv('Etobicoke_venues.csv')

In [29]:
Etobicoke_venues = pd.read_csv('Etobicoke_venues.csv')


In [30]:
neigh_list = list(Etobicoke_venues['Neighborhood'].unique())
print('Number of Neighborhoods inside Etobicoke:')
print(len(neigh_list))
print('List of Neighborhoods inside Etobicoke:')
neigh_list

Number of Neighborhoods inside Etobicoke:
44
List of Neighborhoods inside Etobicoke:


['Humber Bay Shores',
 'Mimico South',
 'New Toronto',
 'Alderwood',
 'Long Branch',
 'The Kingsway',
 'Montgomery Road',
 'Old Mill North',
 'Humber Bay',
 "King's Mill Park",
 'Kingsway Park South East',
 'Mimico NE',
 'Old Mill South',
 'The Queensway East',
 'Royal York South East',
 'Sunnylea',
 'Kingsway Park South West',
 'Mimico NW',
 'The Queensway West',
 'Royal York South West',
 'South of Bloor',
 'Cloverdale',
 'Islington',
 'Martin Grove',
 'Princess Gardens',
 'West Deane Park',
 'Bloordale Gardens',
 'Eringate',
 'Markland Wood',
 'Old Burnhamthorpe',
 'Westmount',
 'Kingsview Village',
 'Martin Grove Gardens',
 'Richview Gardens',
 'St. Phillips',
 'Albion Gardens',
 'Beaumond Heights',
 'Humbergate',
 'Jamestown',
 'Mount Olive',
 'Silverstone',
 'South Steeles',
 'Thistletown',
 'Northwest']

In [31]:
neigh_venue_summary = Etobicoke_venues.groupby('Neighborhood').count()
neigh_venue_summary.drop(columns = ['Unnamed: 0']).head()

Neighborhood,Postal Code,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Summary,Venue Category,Distance
Albion Gardens,16,16,16,16,16,16,16
Alderwood,27,27,27,27,27,27,27
Beaumond Heights,16,16,16,16,16,16,16
Bloordale Gardens,16,16,16,16,16,16,16
Cloverdale,15,15,15,15,15,15,15


In [32]:
print('There are {} unique categories.'.format(len(Etobicoke_venues['Venue Category'].unique())))

print('Here is the list of different categories:')
list(Etobicoke_venues['Venue Category'].unique())

There are 89 unique categories.
Here is the list of different categories:


['Liquor Store',
 'Mexican Restaurant',
 'Dessert Shop',
 'Breakfast Spot',
 'Italian Restaurant',
 'Park',
 'Pizza Place',
 'Grocery Store',
 'Restaurant',
 'Indian Restaurant',
 'Skating Rink',
 'Pub',
 'Bakery',
 'Café',
 'Fried Chicken Joint',
 'Fast Food Restaurant',
 'Pharmacy',
 'Gym',
 'Discount Store',
 'Moroccan Restaurant',
 'Coffee Shop',
 'Sandwich Place',
 'Dance Studio',
 'Market',
 'Gas Station',
 'Convenience Store',
 'Donut Shop',
 'Trail',
 'Shopping Mall',
 'Intersection',
 'Garden Center',
 'French Restaurant',
 'Seafood Restaurant',
 'Tapas Restaurant',
 'Sushi Restaurant',
 'Pool Hall',
 'Gastropub',
 'Burger Joint',
 'Indie Movie Theater',
 'Thai Restaurant',
 'Toy / Game Store',
 'Greek Restaurant',
 'Bank',
 'Gourmet Shop',
 'Mobile Phone Shop',
 'River',
 'Cupcake Shop',
 'Business Service',
 'Laundry Service',
 'Pet Store',
 'Ice Cream Shop',
 'Eastern European Restaurant',
 'Wings Joint',
 'Burrito Place',
 'Yoga Studio',
 'Supplement Shop',
 'Movie Theater

In [33]:
# one hot encoding
Etobicoke_onehot = pd.get_dummies(data = Etobicoke_venues, drop_first  = False, 
                              prefix = "", prefix_sep = "", columns = ['Venue Category'])
Etobicoke_onehot.drop(columns = ['Unnamed: 0']).head()

Unnamed: 0,Postal Code,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Summary,Distance,American Restaurant,Asian Restaurant,Automotive Shop,BBQ Joint,Bakery,Bank,Beer Store,Breakfast Spot,Buffet,Burger Joint,Burrito Place,Bus Line,Business Service,Café,Caribbean Restaurant,Chinese Restaurant,Clothing Store,Coffee Shop,College Rec Center,Comfort Food Restaurant,Convenience Store,Cupcake Shop,Dance Studio,Deli / Bodega,Dessert Shop,Discount Store,Dog Run,Donut Shop,Eastern European Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flea Market,Flower Shop,French Restaurant,Fried Chicken Joint,Garden Center,Gas Station,Gastropub,Golf Course,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hardware Store,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Italian Restaurant,Laundry Service,Liquor Store,Market,Mattress Store,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Moroccan Restaurant,Movie Theater,Park,Pet Store,Pharmacy,Pizza Place,Pool Hall,Pub,Restaurant,River,Sandwich Place,Seafood Restaurant,Shopping Mall,Skating Rink,Supermarket,Supplement Shop,Sushi Restaurant,Tanning Salon,Tapas Restaurant,Thai Restaurant,Theater,Thrift / Vintage Store,Toy / Game Store,Trail,Transportation Service,Video Store,Wings Joint,Yoga Studio
0,M8V,Humber Bay Shores,43.605647,-79.501321,LCBO,This spot is popular,408,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,M8V,Humber Bay Shores,43.605647,-79.501321,Huevos Gourmet,This spot is popular,532,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,M8V,Humber Bay Shores,43.605647,-79.501321,Sweet Olenka's,This spot is popular,512,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,M8V,Humber Bay Shores,43.605647,-79.501321,Kitchen on 6th,This spot is popular,540,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,M8V,Humber Bay Shores,43.605647,-79.501321,Cellar Door Restaurant,This spot is popular,790,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [39]:
#  This list is created manually 
important_list_of_features = [
 
 'Neighborhood','Neighborhood Latitude','Neighborhood Longitude','Automotive Shop',
 'American Restaurant',
 'Asian Restaurant',
 'BBQ Joint',
 'Bakery','Bank','Burger Joint','Burrito Place','Bus Line','French Restaurant','Eastern European Restaurant','Donut Shop','Dessert Shop',
'Cupcake Shop','Comfort Food Restaurant','Café','Business Service',
 'Caribbean Restaurant',
 'Chinese Restaurant',
'Fast Food Restaurant','Fried Chicken Joint',
 'Greek Restaurant','Grocery Store','Garden Center',
 'Hardware Store','Hotel',  
 'Indian Restaurant','Ice Cream Shop', 'Italian Restaurant',
 'Laundry Service','Liquor Store', 
 'Mediterranean Restaurant','Mexican Restaurant',
 'Middle Eastern Restaurant', 
 'Pizza Place',
 
 'Restaurant',
 'Sandwich Place',
 'Seafood Restaurant','Seafood Restaurant', 'Sushi Restaurant',
 'Thai Restaurant','Tapas Restaurant',

 'Wings Joint']

In [40]:
Etobicoke_onehot = Etobicoke_onehot[important_list_of_features].drop(
    columns = ['Neighborhood Latitude', 'Neighborhood Longitude']).groupby(
    'Neighborhood').sum()


Etobicoke_onehot.head()

Neighborhood,Automotive Shop,American Restaurant,Asian Restaurant,BBQ Joint,Bakery,Bank,Burger Joint,Burrito Place,Bus Line,French Restaurant,Eastern European Restaurant,Donut Shop,Dessert Shop,Cupcake Shop,Comfort Food Restaurant,Café,Business Service,Caribbean Restaurant,Chinese Restaurant,Fast Food Restaurant,Fried Chicken Joint,Greek Restaurant,Grocery Store,Garden Center,Hardware Store,Hotel,Indian Restaurant,Ice Cream Shop,Italian Restaurant,Laundry Service,Liquor Store,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Pizza Place,Restaurant,Sandwich Place,Seafood Restaurant,Seafood Restaurant,Sushi Restaurant,Thai Restaurant,Tapas Restaurant,Wings Joint
Albion Gardens,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,1,0,3,0,1,0,0,0,0,0,0,0,0,0,3,0,1,0,0,0,0,0,0
Alderwood,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,0,0,0,2,0,1,0,0,0,0,0,0
Beaumond Heights,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,1,0,3,0,1,0,0,0,0,0,0,0,0,0,3,0,1,0,0,0,0,0,0
Bloordale Gardens,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0
Cloverdale,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,2,0,0,0,0,0,0,1,0,2,0,0,0,0,0,0,0,0


In [41]:
# Collapse every "... Restaurant" column into a single total
feat_name_list = list(Etobicoke_onehot.columns)
restaurant_list = [name for name in feat_name_list if 'Restaurant' in name]

Etobicoke_onehot['Total Restaurants'] = Etobicoke_onehot[restaurant_list].sum(axis = 1)
Etobicoke_onehot = Etobicoke_onehot.drop(columns = restaurant_list)

# Collapse every "... Joint" column into a single total
feat_name_list = list(Etobicoke_onehot.columns)
joint_list = [name for name in feat_name_list if 'Joint' in name]

Etobicoke_onehot['Total Joints'] = Etobicoke_onehot[joint_list].sum(axis = 1)
Etobicoke_onehot = Etobicoke_onehot.drop(columns = joint_list)

In [42]:
Etobicoke_onehot.head()


Neighborhood,Automotive Shop,Bakery,Bank,Burrito Place,Bus Line,Donut Shop,Dessert Shop,Cupcake Shop,Café,Business Service,Grocery Store,Garden Center,Hardware Store,Hotel,Ice Cream Shop,Laundry Service,Liquor Store,Pizza Place,Sandwich Place,Total Restaurants,Total Joints
Albion Gardens,0,0,0,0,1,0,0,0,0,0,3,0,1,0,0,0,0,3,1,2,1
Alderwood,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,0,1,2,1,0,0
Beaumond Heights,0,0,0,0,1,0,0,0,0,0,3,0,1,0,0,0,0,3,1,2,1
Bloordale Gardens,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,1,1,0,0,0
Cloverdale,0,0,1,0,0,0,0,0,0,0,1,0,0,2,0,0,0,2,0,2,0


### Run k-means to Cluster Neighborhoods into 5 Clusters


In [43]:
from sklearn.cluster import KMeans

In [44]:
# run k-means clustering
kmeans = KMeans(n_clusters = 5, random_state = 0).fit(Etobicoke_onehot)

In [45]:
means_df = pd.DataFrame(kmeans.cluster_centers_)
means_df.columns = Etobicoke_onehot.columns
means_df.index = ['G1','G2','G3','G4','G5']
means_df['Total Sum'] = means_df.sum(axis = 1)
means_df.sort_values(axis = 0, by = ['Total Sum'], ascending=False)

Unnamed: 0,Automotive Shop,Bakery,Bank,Burrito Place,Bus Line,Donut Shop,Dessert Shop,Cupcake Shop,Café,Business Service,Grocery Store,Garden Center,Hardware Store,Hotel,Ice Cream Shop,Laundry Service,Liquor Store,Pizza Place,Sandwich Place,Total Restaurants,Total Joints,Total Sum
G2,1.0,2.0,1.0,3.0,1.0,0.0,2.775558e-17,0.0,1.0,0.0,2.0,0.0,1.0,2.775558e-17,2.775558e-17,0.0,1.0,0.0,2.0,14.0,5.0,34.0
G4,0.0,1.0,1.0,0.0,-5.5511150000000004e-17,6.938894e-18,2.0,1.0,1.0,1.0,0.0,6.938894e-18,0.0,0.0,0.0,1.0,1.0,2.0,0.0,15.0,2.0,28.0
G3,0.0,-5.5511150000000004e-17,5.5511150000000004e-17,0.0,1.0,-6.938894e-18,0.0,0.0,0.0,0.0,3.0,-6.938894e-18,1.0,0.0,0.0,0.0,5.5511150000000004e-17,3.0,1.0,2.0,1.0,12.0
G1,1.387779e-17,0.2727273,5.5511150000000004e-17,0.0,5.5511150000000004e-17,-6.938894e-18,0.2727273,1.387779e-17,0.272727,1.387779e-17,0.272727,-6.938894e-18,0.0,2.775558e-17,0.7272727,1.387779e-17,0.2727273,0.272727,0.0,3.545455,0.2727273,6.181818
G5,4.1633360000000003e-17,5.5511150000000004e-17,0.5294118,0.0,0.2352941,0.1176471,0.0,2.775558e-17,0.235294,2.775558e-17,0.647059,0.1176471,-5.5511150000000004e-17,0.5882353,0.05882353,2.775558e-17,0.3529412,1.411765,0.411765,1.176471,-3.330669e-16,5.882353
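
The same steps can be tried on toy data to see how the table of cluster centers comes about. The counts below are made up and contain two obvious groups; `n_init=10` is set explicitly as an assumption of this sketch so that it behaves the same across scikit-learn versions. Rounding the centers also hides floating-point residue such as `2.775558e-17`, which can be read as 0.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy venue counts per neighborhood (made-up numbers, two obvious groups)
counts = pd.DataFrame(
    {'Grocery Store': [3, 3, 0, 0],
     'Pizza Place': [3, 2, 0, 1],
     'Total Restaurants': [2, 2, 14, 15]},
    index=['N1', 'N2', 'N3', 'N4'])

kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(counts)

# One row per cluster center, labelled G1, G2 like the table above
centers = pd.DataFrame(kmeans.cluster_centers_, columns=counts.columns,
                       index=['G{}'.format(i + 1) for i in range(2)])
centers['Total Sum'] = centers.sum(axis=1)
ranked = centers.round(2).sort_values(by='Total Sum', ascending=False)

# Group number of each neighborhood (labels start at 0, groups at 1)
groups = pd.Series(kmeans.labels_ + 1, index=counts.index, name='Group')

print(ranked)
print(groups)
```

Which group number a given cluster receives depends on the initialisation, so the labels themselves carry no ranking; only the center values do.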


#### Best Clustering Group: G1
#### Second Best Group: G3
#### Third Best Group: G4

In [46]:
neigh_summary = pd.DataFrame([Etobicoke_onehot.index, 1 + kmeans.labels_]).T
neigh_summary.columns = ['Neighborhood', 'Group']
neigh_summary

Unnamed: 0,Neighborhood,Group
0,Albion Gardens,3
1,Alderwood,5
2,Beaumond Heights,3
3,Bloordale Gardens,5
4,Cloverdale,5
5,Eringate,5
6,Humber Bay,1
7,Humber Bay Shores,1
8,Humbergate,3
9,Islington,5


### Best Neighborhood

In [47]:
neigh_summary[neigh_summary['Group'] == 1]


Unnamed: 0,Neighborhood,Group
6,Humber Bay,1
7,Humber Bay Shores,1
11,King's Mill Park,1
13,Kingsway Park South East,1
19,Mimico NE,1
21,Mimico South,1
24,New Toronto,1
28,Old Mill South,1
31,Royal York South East,1
37,Sunnylea,1


In [48]:
name_of_neigh = list(neigh_summary[neigh_summary['Group'] == 1]['Neighborhood'])[0]
Etobicoke_venues[Etobicoke_venues['Neighborhood'] == name_of_neigh].iloc[0,1:5].to_dict()

{'Postal Code': 'M8Y',
 'Neighborhood': 'Humber Bay',
 'Neighborhood Latitude': 43.6362579,
 'Neighborhood Longitude': -79.49850909999998}

In [49]:
# Second Best Group

In [50]:
neigh_summary[neigh_summary['Group'] == 3]


Unnamed: 0,Neighborhood,Group
0,Albion Gardens,3
2,Beaumond Heights,3
8,Humbergate,3
10,Jamestown,3
23,Mount Olive,3
33,Silverstone,3
34,South Steeles,3
41,Thistletown,3


In [51]:
name_of_neigh = list(neigh_summary[neigh_summary['Group'] == 3]['Neighborhood'])[0]
Etobicoke_venues[Etobicoke_venues['Neighborhood'] == name_of_neigh].iloc[0,1:5].to_dict()

{'Postal Code': 'M9V',
 'Neighborhood': 'Albion Gardens',
 'Neighborhood Latitude': 43.7394164,
 'Neighborhood Longitude': -79.5884369}

In [52]:
# Third Best Group

In [53]:
neigh_summary[neigh_summary['Group'] == 4]


Unnamed: 0,Neighborhood,Group
22,Montgomery Road,4
27,Old Mill North,4
38,The Kingsway,4


In [54]:
name_of_neigh = list(neigh_summary[neigh_summary['Group'] == 4]['Neighborhood'])[0]
Etobicoke_venues[Etobicoke_venues['Neighborhood'] == name_of_neigh].iloc[0,1:5].to_dict()

{'Postal Code': 'M8X',
 'Neighborhood': 'Montgomery Road',
 'Neighborhood Latitude': 43.65365360000001,
 'Neighborhood Longitude': -79.5069436}

In [55]:
# Fourth Best Group
neigh_summary[neigh_summary['Group'] == 2]


Unnamed: 0,Neighborhood,Group
14,Kingsway Park South West,2
20,Mimico NW,2
32,Royal York South West,2
35,South of Bloor,2
40,The Queensway West,2


In [56]:
# Fifth Best Group
neigh_summary[neigh_summary['Group'] == 5]


Unnamed: 0,Neighborhood,Group
1,Alderwood,5
3,Bloordale Gardens,5
4,Cloverdale,5
5,Eringate,5
9,Islington,5
12,Kingsview Village,5
15,Long Branch,5
16,Markland Wood,5
17,Martin Grove,5
18,Martin Grove Gardens,5


## Results and Discussion <a name="results"></a>

**Best Clustering Group: G1**

**Second Best Group: G3**

**Third Best Group: G4**

**Inserting "kmeans.labels_" into the Original Etobicoke DataFrame**

**Finding the Corresponding Group for Each Neighborhood.**


Our analysis shows that although Etobicoke, Toronto has a large number of neighbourhoods, there are pockets of low shopping-centre and restaurant density fairly close to the city centre. The highest concentration of restaurants was detected in Humber Bay, Humber Bay Shores, King's Mill Park, Kingsway Park South East, Mimico NE, Mimico South, New Toronto, Old Mill South, Royal York South East, Sunnylea and The Queensway East.

We therefore focused our attention on these areas of the Etobicoke borough, since they are closest to the prospective warehouse centre and are strong from an economic point of view.

Those venue locations were then clustered to create zones of interest containing the greatest number of shopping-centre locations. Addresses of the centres of those zones were also generated using geocoding, to be used as markers/starting points for more detailed local analysis based on other factors.
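
The notebook output shown here does not include how the zone centres were computed. One simple possibility, sketched below under that assumption, is to take the mean latitude and longitude of the neighbourhoods in each group. The `coords` frame and its values are hypothetical toy data, not results from the analysis.

```python
import pandas as pd

# Hypothetical toy coordinates; not the notebook's data.
coords = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D"],
    "Group": [1, 1, 3, 3],
    "Latitude": [43.60, 43.64, 43.72, 43.74],
    "Longitude": [-79.50, -79.48, -79.60, -79.58],
})

# Mean coordinate per group, used as a rough centre for each zone.
zone_centres = coords.groupby("Group")[["Latitude", "Longitude"]].mean()
print(zone_centres)
```

Each centre could then be passed to a reverse-geocoding service to obtain a street address; that step is left out here because it depends on an external service.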

The result of all this is 11 venue locations with the largest potential as warehouse sites, based on the number of and distance to existing venues - both venues in general and restaurants in particular. This, of course, does not imply that those zones are actually optimal locations for a new warehouse! The purpose of this analysis was only to provide information on areas within Etobicoke, Toronto. It is entirely possible that there is a very good reason for the small number of warehouses in any of those areas, reasons which would make them unsuitable for a new warehouse regardless of the lack of competition in the area. The recommended zones should therefore be considered only as a starting point for more detailed analysis, which could eventually result in a location that not only has no nearby competition but also takes other factors into account and meets all other relevant conditions.

## Conclusion <a name="conclusion"></a>

The purpose of this project was to identify areas of Etobicoke, Toronto close to the centre with a higher number of shopping centres and restaurants, in order to help stakeholders choose an optimal location for a new warehouse. By calculating the density and distribution of shops and restaurants from Foursquare data, we first identified neighbourhoods of the Etobicoke borough that justify further analysis, and then generated an extensive collection of locations that satisfy some basic requirements regarding existing nearby venues for opening a new warehouse. Those locations were then clustered to create major zones of interest (containing the greatest number of potential locations), and addresses of the zone centres were generated to be used as starting points for final exploration by stakeholders.

The final decision on the optimal warehouse location will be made by stakeholders based on the specific characteristics of neighbourhoods and locations in every recommended zone, taking into consideration additional factors such as the attractiveness of each location (proximity to a park or water), noise levels / proximity to major roads, real estate availability, prices, and the social and economic dynamics of each neighbourhood.

## Thank You Coursera and all my peers(^_^)