# Capstone Project 
## Choosing a location for a distribution center for a food supplier to maximize business.

This notebook is to be used for the capstone project for the course "Applied Data Science Capstone" by IBM on Coursera.

### Introduction

In this project I aim to find the location coordinates of a distribution center for a food supplier so as to maximize business. We choose New York City as the place to set up a distribution center for a food supplier that is expanding. Each distribution center is limited to a maximum distribution radius of 4 km, so our goal is to choose the location that gives the greatest outreach, that is, the one that can serve the most supermarkets within that radius.

A good solution to this problem would help distributors select an optimal location for their centers, which would in turn help reduce transportation costs, an important consideration for any supply chain. The method can be applied not only to distribution centers but also to other links in a supply chain across a range of industries.


The only constraint considered in this problem is the range of the distribution center. The quantity supplied, the number of supermarkets, the scale of each supermarket and similar factors are left out to keep the project within scope. They can be added in a later project aimed at real-world scenarios.
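
To make the 4 km range constraint concrete, the sketch below counts how many shops lie within 4 km of a candidate location using the haversine great-circle distance. The function names `haversine_km` and `shops_in_range` and the mean Earth radius of 6371 km are assumptions made for illustration and are not part of the notebook; the sample coordinates are copied from the outputs further down.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def shops_in_range(center, shops, max_km=4.0):
    """Return the names of the shops no farther than max_km from center."""
    return [name for name, lat, lon in shops
            if haversine_km(center[0], center[1], lat, lon) <= max_km]


# Candidate center and three shops taken from the notebook outputs.
candidate = (40.72438942, -73.98987903)
sample_shops = [
    ("55 Fulton Market", 40.708678, -74.004905),
    ("Trader Joe's", 40.743969, -73.979104),
    ("Trader Joe's", 40.94822, -74.070867),  # roughly 25 km to the north
]

reachable = shops_in_range(candidate, sample_shops)
print(reachable)
```

Only the first two sample shops fall inside the 4 km range; the third is far outside it.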

### Data to be Acquired

For this project we need a map of New York. We use clustering to group the supermarkets in New York based on their proximity to one another. To do this we also need the coordinates of the supermarkets in NY, which can be obtained from Foursquare using its API.
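
The notebook later relies on scikit-learn's `MeanShift` for the proximity-based clustering described above. As a rough illustration of the idea, the sketch below implements a simplified flat-kernel mean shift in plain Python: each point is repeatedly moved to the mean of the points within one bandwidth of it, and the resulting modes that end up close together are merged. This is not scikit-learn's implementation; the function name `mean_shift` and the toy points are assumptions made for the example.

```python
from math import dist  # available from Python 3.8


def mean_shift(points, bandwidth, max_iter=100, tol=1e-9):
    """Simplified flat-kernel mean shift. Returns (centers, labels)."""
    modes = []
    for p in points:
        current = tuple(p)
        for _ in range(max_iter):
            # All points within one bandwidth of the current position.
            neighbours = [q for q in points if dist(current, q) <= bandwidth]
            shifted = tuple(sum(c) / len(neighbours) for c in zip(*neighbours))
            if dist(shifted, current) < tol:
                break  # converged to a mode
            current = shifted
        modes.append(current)

    # Merge modes that lie within one bandwidth of an already kept center.
    centers = []
    for m in modes:
        if not any(dist(m, c) <= bandwidth for c in centers):
            centers.append(m)

    # Label each point with the index of its nearest center.
    labels = [min(range(len(centers)), key=lambda i: dist(p, centers[i]))
              for p in points]
    return centers, labels


# Two well-separated toy groups (hypothetical data, not shop coordinates).
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_centers, toy_labels = mean_shift(toy_points, bandwidth=1.0)
print(toy_centers)
print(toy_labels)
```

With a bandwidth of 1.0 the six toy points collapse into two clusters, one per group. The bandwidth plays the role of the distribution range: points farther apart than the bandwidth do not pull each other into the same cluster.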

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # comment out this line if folium is already installed
import folium # map rendering library

print('Libraries imported.')

Solving environment: done

# All requested packages already installed.

Libraries imported.


In [2]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.


In [3]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)
map_newyork 

In [4]:
CLIENT_ID = 'IVLTISY3A1TRTL4IXKARVKP4IFIADEPRJZMMK1LJZPKNEYY5' # your Foursquare ID
CLIENT_SECRET = 'T10QGN2GGJEI455OCV1CVJKHZJBSH4HJUQMTWX4GJ2VL43HZ' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
categoryId="52f2ab2ebcbc57f1066b8b46"
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: IVLTISY3A1TRTL4IXKARVKP4IFIADEPRJZMMK1LJZPKNEYY5
CLIENT_SECRET:T10QGN2GGJEI455OCV1CVJKHZJBSH4HJUQMTWX4GJ2VL43HZ


In [5]:
LIMIT = 1000 # maximum number of venues requested
radius = 27000 # search radius in meters
url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    latitude, 
    longitude, 
    categoryId,
    radius, 
    LIMIT)
url

'https://api.foursquare.com/v2/venues/search?&client_id=IVLTISY3A1TRTL4IXKARVKP4IFIADEPRJZMMK1LJZPKNEYY5&client_secret=T10QGN2GGJEI455OCV1CVJKHZJBSH4HJUQMTWX4GJ2VL43HZ&v=20180605&ll=40.7127281,-74.0060152&categoryId=52f2ab2ebcbc57f1066b8b46&radius=27000&limit=1000'
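
The URL above embeds the credentials directly in the notebook. A minimal sketch of an alternative, assuming the credentials are stored in environment variables named `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` (hypothetical names, not part of the original notebook), builds the same query string with the standard library's `urllib.parse.urlencode`:

```python
import os
from urllib.parse import urlencode

# Hypothetical environment variable names; placeholders are used when unset.
client_id = os.environ.get("FOURSQUARE_CLIENT_ID", "YOUR_CLIENT_ID")
client_secret = os.environ.get("FOURSQUARE_CLIENT_SECRET", "YOUR_CLIENT_SECRET")

base_url = "https://api.foursquare.com/v2/venues/search"
params = {
    "client_id": client_id,
    "client_secret": client_secret,
    "v": "20180605",                           # API version date
    "ll": "40.7127281,-74.0060152",            # latitude,longitude of New York City
    "categoryId": "52f2ab2ebcbc57f1066b8b46",  # category used in this notebook
    "radius": 27000,                           # meters
    "limit": 1000,
}

# safe="," keeps the comma in the ll value unescaped, as in the URL above.
search_url = base_url + "?" + urlencode(params, safe=",")
```

The same `params` dictionary can also be passed as `requests.get(base_url, params=params)`, which lets `requests` build the query string itself.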

In [6]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5ce53a874434b92155af278e'},
 'response': {'confident': False,
  'venues': [{'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/food_grocery_',
       'suffix': '.png'},
      'id': '4bf58dd8d48988d118951735',
      'name': 'Grocery Store',
      'pluralName': 'Grocery Stores',
      'primary': True,
      'shortName': 'Grocery Store'}],
    'hasPerk': False,
    'id': '4c43890cfb6eb713c1304e4a',
    'location': {'address': '55 Fulton St',
     'cc': 'US',
     'city': 'New York',
     'country': 'United States',
     'distance': 460,
     'formattedAddress': ['55 Fulton St',
      'New York, NY 10038',
      'United States'],
     'labeledLatLngs': [{'label': 'display',
       'lat': 40.70867762344115,
       'lng': -74.0049052224622}],
     'lat': 40.70867762344115,
     'lng': -74.0049052224622,
     'postalCode': '10038',
     'state': 'NY'},
    'name': '55 Fulton Market',
    'referralId': 'v-1558526599'},
   {'cat

In [7]:
shops = results['response']['venues']
nearby_shops = json_normalize(shops) # flatten the nested JSON into a dataframe

# keep only the shop name and its coordinates
filtered_columns = ['name', 'location.lat', 'location.lng']
nearby_shops = nearby_shops.loc[:, filtered_columns]
nearby_shops

| | name | location.lat | location.lng |
|---|---|---|---|
| 0 | 55 Fulton Market | 40.708678 | -74.004905 |
| 1 | Trader Joe's | 40.743969 | -73.979104 |
| 2 | Trader Joe's | 40.741739 | -73.993653 |
| 3 | Trader Joe's | 40.790624 | -73.969191 |
| 4 | Trader Joe's | 40.778527 | -73.981987 |
| 5 | Trader Joe's | 40.94822 | -74.070867 |
| 6 | Food Bazaar Supermarket | 40.752492 | -73.921069 |
| 7 | ShopRite | 40.749812 | -74.036071 |
| 8 | Walmart Supercenter | 40.862474 | -74.062503 |
| 9 | Walmart Supercenter | 40.749548 | -74.13562 |
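
The column names `location.lat` and `location.lng` come from flattening the nested `location` object of each venue. The plain-Python sketch below illustrates that dotted naming on a cut-down copy of the first venue in the response. The helper `flatten` is hypothetical and only mimics the naming scheme; `json_normalize` itself handles many more cases.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one dict whose keys are joined with sep."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Cut-down version of the first venue shown in the response above.
venue = {
    "name": "55 Fulton Market",
    "location": {
        "city": "New York",
        "lat": 40.70867762344115,
        "lng": -74.0049052224622,
    },
}

row = flatten(venue)
# Keep only the columns used in the notebook.
selected = {key: row[key] for key in ("name", "location.lat", "location.lng")}
print(selected)
```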


In [8]:
map_markets = folium.Map(location=[latitude, longitude], zoom_start=11)
markers_colors = []
for lat, lon, name  in zip(nearby_shops['location.lat'], nearby_shops['location.lng'],nearby_shops['name']):
    label = folium.Popup(name, parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=7,
        popup=label,
        fill=True,
        fill_opacity=0.7).add_to(map_markets)
       
map_markets

In [9]:
from sklearn.cluster import MeanShift
import numpy as np
import statistics


In [10]:
filtered_coordinates = [ 'location.lat','location.lng']
nearby_shops_coordinates=nearby_shops
nearby_shops_coordinates =nearby_shops_coordinates.loc[:, filtered_coordinates]
coordinates=np.array(nearby_shops_coordinates) 

In [35]:

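# bandwidth is expressed in degrees, the same unit as the latitude/longitude inputs
clustering = MeanShift(bandwidth=0.0363636).fit(coordinates)
print(clustering.labels_)
clusters = clustering.labels_ # cluster label of each shop
print(clustering.cluster_centers_)
centers = clustering.cluster_centers_ # one (latitude, longitude) row per cluster
centersdf = pd.DataFrame(centers)
centersdf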

[ 0  0  0  2  2  5  1  9  7  8  9  0  0  3 10  1  0  6  0  4  0  0  0  0  0
  1  0  2  0  0  2  9  1  2  2  2  1  0  0  0  0  0  0  1  1  1  0  0  0  0]
[[ 40.72438942 -73.98987903]
 [ 40.74380235 -73.94569124]
 [ 40.79132337 -73.97565321]
 [ 40.97405665 -73.864025  ]
 [ 40.83661982 -74.1516116 ]
 [ 40.94822016 -74.07086662]
 [ 40.71095091 -73.85819256]
 [ 40.86247418 -74.06250338]
 [ 40.749548   -74.13562   ]
 [ 40.79263537 -74.04234962]
 [ 40.66125057 -73.72609616]]


| | 0 | 1 |
|---|---|---|
| 0 | 40.724389 | -73.989879 |
| 1 | 40.743802 | -73.945691 |
| 2 | 40.791323 | -73.975653 |
| 3 | 40.974057 | -73.864025 |
| 4 | 40.83662 | -74.151612 |
| 5 | 40.94822 | -74.070867 |
| 6 | 40.710951 | -73.858193 |
| 7 | 40.862474 | -74.062503 |
| 8 | 40.749548 | -74.13562 |
| 9 | 40.792635 | -74.04235 |
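
The notebook does not explain the bandwidth value 0.0363636. It matches 4 km divided by roughly 110 km per degree of latitude, so it appears to express the 4 km range in degrees; that reading is an assumption. The sketch below reproduces the arithmetic and also shows that the same number of degrees spans a shorter east-west distance at New York's latitude. The constant `KM_PER_DEGREE` and the function name `km_to_degrees` are made up for the example.

```python
from math import cos, radians

KM_PER_DEGREE = 110.0  # assumed approximate length of one degree of latitude


def km_to_degrees(km, km_per_degree=KM_PER_DEGREE):
    """Convert a north-south distance in kilometres to degrees of latitude."""
    return km / km_per_degree


bandwidth_deg = km_to_degrees(4.0)
print(round(bandwidth_deg, 7))

# One degree of longitude shrinks by cos(latitude), so the same bandwidth
# covers a shorter distance in the east-west direction.
nyc_latitude = 40.7127281
east_west_km = bandwidth_deg * KM_PER_DEGREE * cos(radians(nyc_latitude))
print(round(east_west_km, 2))
```

Under these assumptions the bandwidth corresponds to about 4 km north-south but only about 3 km east-west, so clustering on raw degrees treats the range as slightly elliptical.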


In [18]:
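m = statistics.mode(clustering.labels_) # label of the cluster with the most shops
print(np.count_nonzero(clustering.labels_ == m)) # number of shops in that cluster
coordinate = clustering.cluster_centers_[m] # center of the largest cluster
coordinate
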

25


array([ 40.72438942, -73.98987903])
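
The selection of the largest cluster can also be sketched with `collections.Counter`, which reports the size of every cluster rather than only the most common label. The `labels` list below is copied from the `clustering.labels_` output above; the variable names are chosen for the example.

```python
from collections import Counter

# Cluster labels copied from the MeanShift output above (50 shops).
labels = [0, 0, 0, 2, 2, 5, 1, 9, 7, 8, 9, 0, 0, 3, 10, 1, 0, 6, 0, 4,
          0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 2, 9, 1, 2, 2, 2, 1, 0, 0, 0,
          0, 0, 0, 1, 1, 1, 0, 0, 0, 0]

counts = Counter(labels)
largest_label, largest_size = counts.most_common(1)[0]
print(largest_label, largest_size)  # prints: 0 25
```

Unlike `statistics.mode`, which before Python 3.8 raises an error when two labels are tied for most common, `most_common` always returns a result, and the full `counts` mapping makes it easy to compare the runner-up clusters.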

In [23]:
# one color per cluster, evenly spaced along the rainbow colormap
n_clusters = len(clustering.cluster_centers_)
colors_array = cm.rainbow(np.linspace(0, 1, n_clusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]
rainbow

['#8000ff',
 '#4e4dfc',
 '#1996f3',
 '#18cde4',
 '#4df3ce',
 '#80ffb4',
 '#b2f396',
 '#e6cd73',
 '#ff964f',
 '#ff4d27',
 '#ff0000']

In [27]:
nearby_shops['clusters']=clusters
nearby_shops.head()

| | name | location.lat | location.lng | clusters |
|---|---|---|---|---|
| 0 | 55 Fulton Market | 40.708678 | -74.004905 | 0 |
| 1 | Trader Joe's | 40.743969 | -73.979104 | 0 |
| 2 | Trader Joe's | 40.741739 | -73.993653 | 0 |
| 3 | Trader Joe's | 40.790624 | -73.969191 | 2 |
| 4 | Trader Joe's | 40.778527 | -73.981987 | 2 |


In [43]:
map_markets = folium.Map(location=[latitude, longitude], zoom_start=11)
# 'cluster' is used as the loop variable so the 'clusters' array is not overwritten
for lat, lon, name, cluster in zip(nearby_shops['location.lat'], nearby_shops['location.lng'], nearby_shops['name'], nearby_shops['clusters']):
    label = folium.Popup(name, parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=7,
        color=rainbow[cluster-1], # label 0 wraps around to the last color, red
        popup=label,
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_markets)

map_markets
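
The marker colors are looked up at index "cluster label minus one", so label 0 maps to index -1, which in Python is the last element of the list. The short sketch below, using the palette printed earlier and a helper name `cluster_color` made up for the example, shows why the largest cluster (label 0) is drawn in red.

```python
# Palette copied from the rainbow output above (11 colors, one per cluster).
rainbow = ['#8000ff', '#4e4dfc', '#1996f3', '#18cde4', '#4df3ce', '#80ffb4',
           '#b2f396', '#e6cd73', '#ff964f', '#ff4d27', '#ff0000']


def cluster_color(label, palette):
    """Return the color used for a cluster label (index label - 1)."""
    return palette[label - 1]


print(cluster_color(0, rainbow))   # '#ff0000' (red): index -1 is the last entry
print(cluster_color(1, rainbow))   # '#8000ff' (violet): index 0
print(cluster_color(10, rainbow))  # '#ff4d27': index 9
```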


### Conclusion

The optimal location for our distribution center is given by the coordinates 40.72438942, -73.98987903, because this cluster center has the largest number of shops associated with it (25 of the 50 shops retrieved). These shops are shown in red on the map above.