<h1 align="center"> Setting up a Cafe in Ahmedabad </h1>

<p align ="center"> Nandan J Kakadiya
<br>
August 13,2021
</p>

# 1. Introduction
The city of Ahmedabad is endowed with a rich architectural heritage that is vital to the local identity and continuity of the place. Along with the foremost heritage Indo-Islamic monuments of the 15th to 17th centuries, there are potential heritage precincts in the form of the Pols, the traditional residential clusters of the medieval period, which makes Ahmedabad exceptional. Combining these all, the historic walled city of Ahmedabad has it all to be the first and only city in India to be inscribed in UNESCO's World Heritage City list of 2017.As a result, Ahmedabad city is hotspot for foreigners as well as local tourists. Opening cafe allow owner to earn good revenue. Opening a new Cafe, like any other business decision, needs careful analysis and is far more complicated than it appears. As with any business, the cafe's location is one of the most critical considerations that will determine whether the business succeeds or fails.

# 2. Business Problem

There are already many Cafes in the city and many more are being built.The objective of this project is to analyze and select the best locations in the city of Ahmedabad, India, to open a new Cafe. This project is mainly focused on geospatial analysis of the Ahmedabad City to understand which would be the best place to open a new Cafe. Using data science methodology and machine learning Algorithms like clustering, this project aims to provide solutions to answer the business question: In the city of Ahmedabad, if someone is looking to open a cafe, where would you recommend that they open it?

# 3. Data

We'll need the following data to solve the problem:

• List of neighbourhoods in Ahmedabad. This defines the scope of this project which is confined to the city of Ahmedabad.

• In order to plot the neighbourhood we will need lattitude and longitude of the neighbourhood.

• Venue data, particularly data related to Cafe. We will use this data to perform clustering on the neighbourhoods.

###3.1 Sources of Data
This page is a list of neighbourhoods in Ahmedabad. I have extract the data from the Wikipedia page, with the help of Python requests and beautifulsoup library.

Then, using the Python Geocoder library, we can get the latitude and longitude coordinates of the neighbourhoods. After that, to access the venue data for those neighbourhoods, I used the Foursquare API.

Foursquare API will provide many categories of the venue data, and we are particularly interested in the Cafe category in order to help us solve the business problem. This is a project that will make use of many data science skills, from web scraping (Wikipedia), working with API (Foursquare), data cleaning, data wrangling, to machine learning (K-means clustering) and map visualization (Folium).

#4. Methodology
Let's import first necessary python linraries for our project

In [3]:
import pandas as pd
import requests
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors
import folium
from bs4 import BeautifulSoup
# import k-means for the clustering stage
from sklearn.cluster import KMeans

In [4]:
!pip install geocoder
import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import geocoder # to get coordinates


Collecting geocoder
  Downloading geocoder-1.38.1-py2.py3-none-any.whl (98 kB)
[?25l[K     |███▎                            | 10 kB 20.8 MB/s eta 0:00:01[K     |██████▋                         | 20 kB 10.4 MB/s eta 0:00:01[K     |██████████                      | 30 kB 8.6 MB/s eta 0:00:01[K     |█████████████▎                  | 40 kB 7.8 MB/s eta 0:00:01[K     |████████████████▋               | 51 kB 4.2 MB/s eta 0:00:01[K     |████████████████████            | 61 kB 4.4 MB/s eta 0:00:01[K     |███████████████████████▎        | 71 kB 4.7 MB/s eta 0:00:01[K     |██████████████████████████▋     | 81 kB 5.3 MB/s eta 0:00:01[K     |██████████████████████████████  | 92 kB 4.0 MB/s eta 0:00:01[K     |████████████████████████████████| 98 kB 2.9 MB/s 
[?25hCollecting ratelim
  Downloading ratelim-0.1.6-py2.py3-none-any.whl (4.0 kB)
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6


Perform scraping using Python requests and beautifulsoup packages to extract the list of neighbourhoods of Ahmedabad.

In [5]:
#use requests to get text of the web page
data = requests.get("https://en.wikipedia.org/wiki/Category:Neighbourhoods_in_Ahmedabad").text
# Parse data 
soup = BeautifulSoup(data, 'html.parser')

neighborhoods = []

In [6]:
# Append the data into the list
for row in soup.find_all("div", class_="mw-category")[0].findAll("li"):
  neighborhoods.append(row.text)

In [8]:
#create dataframe with the help of pandas and show first five rows of dataframe
df = pd.DataFrame({"Neighborhood": neighborhoods})
df.head()

Unnamed: 0,Neighborhood
0,Ahmedabad Cantonment
1,Alam Roza
2,Ambawadi
3,Amraiwadi
4,Asarwa


In [9]:
df.shape

(69, 1)

there are total 69 neighbourhood in Ahmedabad

We need to get the geographical coordinates in the form of latitude and longitude. To do so, we will use the Geocode package that will allow us to convert the address into geographical coordinates in the form of latitude and longitude.

In [10]:
def get_latlng(neighborhood):
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Ahmedabad, India'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords

In [11]:

# call the function to get the coordinates, store in a new list using list comprehension
coords = [ get_latlng(neighborhood) for neighborhood in df["Neighborhood"].tolist() ]

In [12]:
coords

[[23.066200000000038, 72.60219000000006],
 [23.027780000000064, 72.60025000000007],
 [23.01893000000007, 72.55437000000006],
 [23.00735000000003, 72.62268000000006],
 [23.047090000000026, 72.60481000000004],
 [23.047090000000026, 72.60481000000004],
 [22.84128000000004, 72.45453000000003],
 [23.027780000000064, 72.60025000000007],
 [23.034740000000056, 72.63023000000004],
 [23.00279000000006, 72.57705000000004],
 [23.002636014183093, 72.59816375515649],
 [23.030320000000074, 72.47247000000004],
 [22.806890000000067, 72.42511000000007],
 [23.112140000000068, 72.57989000000003],
 [23.081328795273873, 72.54831386956431],
 [23.027780000000064, 72.60025000000007],
 [23.036070000000052, 72.59213000000005],
 [23.32218000000006, 72.18817000000007],
 [23.02221012727162, 72.57671896668386],
 [23.072740000000067, 72.54961000000003],
 [22.974050000000034, 72.61173000000008],
 [23.050070000000062, 72.59831000000008],
 [23.010005039633256, 72.58094004475555],
 [23.015930000000026, 72.61082000000005]

In [13]:
# save Latitude and Longitude
df_1 = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

In [14]:
df_1.shape

(69, 2)

In [16]:
#add the into original dataset
df['Latitude'] = df_1['Latitude']
df['Longitude'] = df_1['Longitude']

In [17]:
df.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Ahmedabad Cantonment,23.0662,72.60219
1,Alam Roza,23.02778,72.60025
2,Ambawadi,23.01893,72.55437
3,Amraiwadi,23.00735,72.62268
4,Asarwa,23.04709,72.60481


In [18]:

# get the coordinates of Hyderabad
address = 'Ahmedabad, India'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Ahmedabad {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Ahmedabad 23.0216238, 72.5797068.


In [19]:
# Let's create map of Hyderabad using latitude and longitude values
map = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, long, neighborhood in zip(df['Latitude'], df['Longitude'], df['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, long],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.5).add_to(map)

map  

####foursquare API to explore the neighbourhoods

In [20]:
CLIENT_ID = 'F1OQVWFTBJ4LKJYQIYDSTOA3LTQ1YQFUPCYEVQTXGA5TXTBU' # your Foursquare ID
CLIENT_SECRET = 'K2TZ5QLELGMTH5C3QNU4ZG4R53UHM4GSDG1BHV31VFDUQO4G' # your Foursquare Secret
ACCESS_TOKEN = 'E5YB3SAPEIKBZUL3RLV3M1HBFDZBXCYW05Q503I3VCQOQQZ2' # your FourSquare Access Token
VERSION = '20180604'
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: F1OQVWFTBJ4LKJYQIYDSTOA3LTQ1YQFUPCYEVQTXGA5TXTBU
CLIENT_SECRET:K2TZ5QLELGMTH5C3QNU4ZG4R53UHM4GSDG1BHV31VFDUQO4G


In [32]:
radius = 1000
LIMIT = 50
search_query='Cafés'

In [33]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude,ACCESS_TOKEN, VERSION, search_query, radius, LIMIT)


In [40]:
venues = []

for lat, lng, neighborhood in zip(df['Latitude'], df['Longitude'], df['Neighborhood']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        lng,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))


In [41]:
# convert the venues list into a new DataFrame
v_df = pd.DataFrame(venues)

In [42]:
v_df.head()

Unnamed: 0,0,1,2,3,4,5,6
0,Ahmedabad Cantonment,23.0662,72.60219,Simran Farm,23.072137,72.603881,Indian Restaurant
1,Ahmedabad Cantonment,23.0662,72.60219,Ghoda Camp,23.066185,72.601202,Athletics & Sports
2,Ahmedabad Cantonment,23.0662,72.60219,I.P.S. Mess,23.065666,72.597547,Scenic Lookout
3,Ahmedabad Cantonment,23.0662,72.60219,Camp Sadar Bajar,23.072923,72.605533,Campground
4,Alam Roza,23.02778,72.60219,Moti Mahal,23.02912,72.599724,Indian Restaurant


In [43]:
v_df.shape

(570, 7)

In [44]:
#first rename the columns
v_df.columns = ['Neighborhood', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

In [45]:
v_df.groupby(["Neighborhood"]).count()


Unnamed: 0_level_0,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Ahmedabad Cantonment,4,4,4,4,4,4
Alam Roza,7,7,7,7,7,7
Ambawadi,27,27,27,27,27,27
Amraiwadi,6,6,6,6,6,6
Asarwa,4,4,4,4,4,4
...,...,...,...,...,...,...
Thaltej,30,30,30,30,30,30
Usmanpura,14,14,14,14,14,14
Vastral,5,5,5,5,5,5
Vastrapur,50,50,50,50,50,50


this shows how many venues returned for each category

In [48]:
v_df['VenueCategory'].unique()

array(['Indian Restaurant', 'Athletics & Sports', 'Scenic Lookout',
       'Campground', 'Train Station', "Men's Store", 'Clothing Store',
       'Food Court', 'Farmers Market', 'Bakery', 'Mexican Restaurant',
       'Snack Place', 'Pizza Place', 'Café', 'Shopping Mall', 'Hotel',
       'Park', 'Fast Food Restaurant', 'Coffee Shop', 'Tea Room',
       'Sandwich Place', 'Theater', 'Health Food Store', 'ATM',
       'Moving Target', 'IT Services', 'Movie Theater', 'Historic Site',
       'Boat or Ferry', 'Cafeteria', 'Flea Market', 'Bus Stop', 'River',
       'Zoo', 'Accessories Store', 'Ice Cream Shop', 'Sports Club',
       'Breakfast Spot', 'Lake', 'Platform', 'Art Museum', 'Restaurant',
       'Asian Restaurant', 'Vegetarian / Vegan Restaurant',
       'Department Store', 'Rental Car Location', 'Auto Garage', 'Museum',
       'Bistro', 'Bus Station', 'Gym / Fitness Center', 'Art Gallery',
       'Multiplex', 'Food Truck', 'Pharmacy', 'Dessert Shop',
       'Metro Station', 'Business 

We are here only looking for Cafes and Fast Food Shops

In [76]:
venue_onehot = pd.get_dummies(v_df[['VenueCategory']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
venue_onehot['Neighborhoods'] = v_df['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [venue_onehot.columns[-1]] + list(venue_onehot.columns[:-1])
venue_onehot = venue_onehot[fixed_columns]
venue_onehot.head()

Unnamed: 0,Neighborhoods,ATM,Accessories Store,Airport Lounge,Airport Terminal,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bakery,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Breakfast Spot,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Campground,Clothing Store,Coffee Shop,Construction & Landscaping,Cricket Ground,Cupcake Shop,Department Store,Dessert Shop,Diner,Donut Shop,Electronics Store,Farm,Farmers Market,Fast Food Restaurant,Flea Market,...,Italian Restaurant,Jewelry Store,Juice Bar,Lake,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Mobile Phone Shop,Movie Theater,Moving Target,Multicuisine Indian Restaurant,Multiplex,Museum,North Indian Restaurant,Paper / Office Supplies Store,Park,Performing Arts Venue,Pharmacy,Pizza Place,Platform,Recreation Center,Rental Car Location,Restaurant,River,Sandwich Place,Scenic Lookout,Sculpture Garden,Shopping Mall,Smoke Shop,Snack Place,Sports Club,Tea Room,Theater,Toy / Game Store,Train Station,Tree,Vegetarian / Vegan Restaurant,Video Store,Zoo
0,Ahmedabad Cantonment,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Ahmedabad Cantonment,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Ahmedabad Cantonment,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Ahmedabad Cantonment,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Alam Roza,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [77]:
v_group = venue_onehot.groupby(["Neighborhoods"]).sum().reset_index()
v_group

Unnamed: 0,Neighborhoods,ATM,Accessories Store,Airport Lounge,Airport Terminal,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bakery,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Breakfast Spot,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Campground,Clothing Store,Coffee Shop,Construction & Landscaping,Cricket Ground,Cupcake Shop,Department Store,Dessert Shop,Diner,Donut Shop,Electronics Store,Farm,Farmers Market,Fast Food Restaurant,Flea Market,...,Italian Restaurant,Jewelry Store,Juice Bar,Lake,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Mobile Phone Shop,Movie Theater,Moving Target,Multicuisine Indian Restaurant,Multiplex,Museum,North Indian Restaurant,Paper / Office Supplies Store,Park,Performing Arts Venue,Pharmacy,Pizza Place,Platform,Recreation Center,Rental Car Location,Restaurant,River,Sandwich Place,Scenic Lookout,Sculpture Garden,Shopping Mall,Smoke Shop,Snack Place,Sports Club,Tea Room,Theater,Toy / Game Store,Train Station,Tree,Vegetarian / Vegan Restaurant,Video Store,Zoo
0,Ahmedabad Cantonment,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Alam Roza,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,...,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0
2,Ambawadi,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,1,0,0,0,0,0,0,0,0,0,0,5,0,...,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,2,0,0,2,0,0,0,0,0,1,0,0,2,0,1,0,2,1,0,0,0,0,0,0
3,Amraiwadi,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Asarwa,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
58,Thaltej,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,5,0,0,0,0,0,2,0,0,1,0,0,0,...,1,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,2,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0
59,Usmanpura,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,2,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
60,Vastral,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
61,Vastrapur,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,4,0,2,0,0,0,1,0,1,0,2,1,0,0,7,0,...,0,0,1,0,0,1,0,0,0,0,0,0,1,0,0,1,0,0,0,2,0,0,0,5,0,2,0,0,2,0,2,0,0,0,1,0,0,2,0,0


In [78]:
v_group['Total Caffes'] =  v_group[['Fast Food Restaurant','Café','Food Court','Snack Place','Sandwich Place','Cafeteria','Ice Cream Shop','Breakfast Spot','Food Truck', 'Dessert Shop','Cupcake Shop','Donut Shop']].sum(axis=1)

In [79]:
v_group

Unnamed: 0,Neighborhoods,ATM,Accessories Store,Airport Lounge,Airport Terminal,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,BBQ Joint,Bakery,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Breakfast Spot,Bus Station,Bus Stop,Business Service,Cafeteria,Café,Campground,Clothing Store,Coffee Shop,Construction & Landscaping,Cricket Ground,Cupcake Shop,Department Store,Dessert Shop,Diner,Donut Shop,Electronics Store,Farm,Farmers Market,Fast Food Restaurant,Flea Market,...,Jewelry Store,Juice Bar,Lake,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Mobile Phone Shop,Movie Theater,Moving Target,Multicuisine Indian Restaurant,Multiplex,Museum,North Indian Restaurant,Paper / Office Supplies Store,Park,Performing Arts Venue,Pharmacy,Pizza Place,Platform,Recreation Center,Rental Car Location,Restaurant,River,Sandwich Place,Scenic Lookout,Sculpture Garden,Shopping Mall,Smoke Shop,Snack Place,Sports Club,Tea Room,Theater,Toy / Game Store,Train Station,Tree,Vegetarian / Vegan Restaurant,Video Store,Zoo,Total Caffes
0,Ahmedabad Cantonment,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Alam Roza,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,...,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1
2,Ambawadi,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,1,0,0,0,0,0,0,0,0,0,0,5,0,...,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,2,0,0,2,0,0,0,0,0,1,0,0,2,0,1,0,2,1,0,0,0,0,0,0,11
3,Amraiwadi,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Asarwa,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
58,Thaltej,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,5,0,0,0,0,0,2,0,0,1,0,0,0,...,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,2,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,6
59,Usmanpura,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,2,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,6
60,Vastral,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
61,Vastrapur,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,4,0,2,0,0,0,1,0,1,0,2,1,0,0,7,0,...,0,1,0,0,1,0,0,0,0,0,0,1,0,0,1,0,0,0,2,0,0,0,5,0,2,0,0,2,0,2,0,0,0,1,0,0,2,0,0,21


We will create dataframe which has one column total cafe

In [80]:
data = v_group[["Neighborhoods","Total Caffes"]]

#Cluster the neighbourhoods
Run k-means to cluster the neighborhoods in Ahmedabad into 3 clusters.

In [81]:
# number of clusters
k = 3

final_data= data[['Total Caffes']]

# run k-means clustering
kmeans = KMeans(n_clusters=k, random_state=0).fit(final_data)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:20]

array([0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0],
      dtype=int32)

In [89]:
# create a new dataframe that includes the cluster
final_df = data.copy()

In [91]:
final_df.head()

Unnamed: 0,Neighborhoods,Total Caffes
0,Ahmedabad Cantonment,0
1,Alam Roza,1
2,Ambawadi,11
3,Amraiwadi,0
4,Asarwa,0


In [92]:
coords = [ get_latlng(neighborhood) for neighborhood in final_df["Neighborhoods"].tolist() ]

In [93]:
df2 = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])
df2.shape

(63, 2)

In [94]:
final_df['Latitude']=df2['Latitude']
final_df['Longitude']=df2['Longitude']

In [95]:
final_df['Labels']=kmeans.labels_

In [96]:
final_df.head()

Unnamed: 0,Neighborhoods,Total Caffes,Latitude,Longitude,Labels
0,Ahmedabad Cantonment,0,23.0662,72.60219,0
1,Alam Roza,1,23.02778,72.60025,0
2,Ambawadi,11,23.01893,72.55437,1
3,Amraiwadi,0,23.00735,72.62268,0
4,Asarwa,0,23.04709,72.60481,0


In [100]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(3)
ys = [i+x+(i*x)**2 for i in range(3)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(final_df['Latitude'], final_df['Longitude'], final_df['Neighborhoods'], final_df['Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

#####Cluster 0

In [102]:
final_df.loc[final_df['Labels'] == 0]

Unnamed: 0,Neighborhoods,Total Caffes,Latitude,Longitude,Labels
0,Ahmedabad Cantonment,0,23.0662,72.60219,0
1,Alam Roza,1,23.02778,72.60025,0
3,Amraiwadi,0,23.00735,72.62268,0
4,Asarwa,0,23.04709,72.60481,0
5,Asarwa Chakla,0,23.04709,72.60481,0
6,Bahiyal,1,23.02778,72.60025,0
7,Bapunagar,0,23.03474,72.63023,0
8,Behrampura,1,23.00279,72.57705,0
10,Bopal,1,23.03032,72.47247,0
12,Chandlodiya,1,23.081329,72.548314,0


#####Cluster 1

In [103]:
final_df.loc[final_df['Labels'] == 1]

Unnamed: 0,Neighborhoods,Total Caffes,Latitude,Longitude,Labels
2,Ambawadi,11,23.01893,72.55437,1
26,"Jodhpur, Gujarat",10,23.02063,72.52522,1
38,Mithakali,12,23.02851,72.56525,1
61,Vastrapur,21,23.03721,72.53087,1


#####Cluster 2

In [104]:
final_df.loc[final_df['Labels'] == 2]

Unnamed: 0,Neighborhoods,Total Caffes,Latitude,Longitude,Labels
9,Bhairavnath Road,4,23.002636,72.598164,2
11,Chandkheda,3,23.11214,72.57989,2
22,Isanpur,3,22.97137,72.59743,2
28,Kalupur,3,23.02828,72.59374,2
29,Kalyanpura (Ahmedabad),4,23.04764,72.56149,2
30,"Khadia, Ahmedabad",3,23.02081,72.59244,2
36,Maninagar,4,23.00526,72.60731,2
39,Motera,3,23.10321,72.6051,2
40,Naranpura,5,23.05508,72.55546,2
44,Paldi,7,23.01341,72.57155,2


# Final observation

A good number of Caffe are concentrated near Sabarmati river in Ahmedabad. Cluster 1 shows the neighbourhood which has highest number of cafe in surroundings which are marked with color blue. Cluster 2 shows the moderate amount of Snacks shop in the area. This represents a great opportunity and high potential areas to open new cafe as there is very little to no competition area marked with red color. Meanwhile, Shops in cluster 1 are likely suffering from intense competition due to oversupply and high concentration of Cafes. Therefore, this project recommends Cafe frenchies owners to capitalize on these findings to open new shopps in neighbourhoods in cluster 0 with little to no competition. Owner with unique idea can also try in Cluster 2.Lastly, They are advised to avoid neighbourhoods in cluster 1 which already have a high concentration of caffes and restaurents and suffering from intense competition.