## IBM Applied Data Science Capstone Course 
##### _Pranav Shekhawat_

_________________

### **Opening a Take-Away Indian Restaurant in Christchurch City of New Zealand**

#### **The project is divided into 5 parts**
* **Part 1** Loading data into dataframe using pandas Library
* **Part 2** Processing geographical coordinates using geocoder
* **Part 3** Exploring venues using Foursquare API 
* **Part 4** Using Kmeans for Clustering Analysis
* **Part 5** Observation Analysis

__Please Note:__ _My suggestion to open a take-away restaurant is based solely on the changes brought about by the coronavirus pandemic, with more people ordering online rather than dining in. Investors can also use this research to open a dine-in restaurant._

_______

#### **Installations and Importing Libraries**

In [1]:
import pandas as pd # library for data analysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

import numpy as np # library to handle data in a vectorized manner

import json # library to handle JSON files

!pip install geocoder
!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import geocoder # to get coordinates

import requests # library to handle requests

!pip install beautifulsoup4
from bs4 import BeautifulSoup # library to parse HTML and XML documents

from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

print("Load Complete")


______

### **Part 1:** Loading Data

**Note:** I initially used data from Wikipedia, but it was not reliable for obtaining geographical coordinates. After a couple of unsuccessful attempts, I scraped and cleaned the data manually and uploaded it from local storage. 
Link to reference: https://en.wikipedia.org/wiki/Category:Suburbs_of_Christchurch

In [2]:
Data = pd.read_csv('Chch_data.csv') # List of Christchurch Suburbs 

In [6]:
Data.head()

Unnamed: 0,Suburbs
0,"Addington, New Zealand"
1,Aidanfield
2,Aranui
3,"Avondale, Christchurch"
4,Avonhead


________

### **Part 2:** Processing Geographical Coordinates

In [8]:
def get_latilong(suburbs): # define a function to get coordinates
    lati_long_coords = None     # initialize your variable to None
    while(lati_long_coords is None):     # loop until you get the coordinates
        g = geocoder.arcgis('{}, Christchurch, New Zealand'.format(suburbs))
        lati_long_coords = g.latlng
    return lati_long_coords
    
get_latilong('Christchurch Central, Christchurch')

[-43.52891999999997, 172.63236000000006]
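
The `while` loop above keeps retrying until the geocoder returns a result, so a suburb that never resolves would stall the notebook. A minimal sketch of a bounded variant follows. The name `get_latlong_bounded` and the parameters `geocode`, `max_attempts` and `pause_seconds` are hypothetical and not part of the original notebook; `geocode` stands for any callable that returns `[lat, lng]` or `None`.

```python
import time


def get_latlong_bounded(suburb, geocode, max_attempts=5, pause_seconds=1.0):
    """Return [lat, lng] for a suburb, or None after max_attempts failed tries.

    `geocode` is any callable that takes a query string and returns
    [lat, lng] or None (hypothetical parameter, used here for illustration).
    """
    query = '{}, Christchurch, New Zealand'.format(suburb)
    for attempt in range(max_attempts):
        coords = geocode(query)
        if coords is not None:
            return coords
        if attempt < max_attempts - 1:
            time.sleep(pause_seconds)  # brief pause before the next try
    return None  # caller decides how to handle an unresolved suburb
```

In the notebook it could be called as `get_latlong_bounded('Aranui', lambda q: geocoder.arcgis(q).latlng)`, reusing the same `geocoder.arcgis` call as the cell above.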

In [10]:
# call the function for every suburb and store the coordinates in a list
suburbs = Data['Suburbs']    
Locate = [get_latilong(suburb) for suburb in suburbs.tolist()]

In [11]:
Locate # Checking if we have retrieved coordinates for each suburb

[[-43.53964999999994, 172.60590000000002],
 [-43.564340277999975, 172.56806022000012],
 [-43.51393977899994, 172.7054613350001],
 [-43.530347448999976, 172.6324268940001],
 [-43.51032999999995, 172.55741000000012],
 [-43.521204498999964, 172.67728250100004],
 [-43.55705999999998, 172.61889000000008],
 [-43.56435291499997, 172.64296614300008],
 [-43.45934999999997, 172.62381000000005],
 [-43.51712342999997, 172.71910636600012],
 [-43.49091999999996, 172.59289000000012],
 [-43.453821836999964, 172.694461702],
 [-43.535664864999944, 172.7030473650001],
 [-43.530347448999976, 172.6324268940001],
 [-43.50445084099994, 172.59162613600006],
 [-43.530347448999976, 172.6324268940001],
 [-43.493454257999986, 172.68552574900002],
 [-43.56699999999995, 172.63544000000002],
 [-43.52891999999997, 172.63236000000006],
 [-43.530347448999976, 172.6324268940001],
 [-43.58579986099994, 172.61654545500005],
 [-43.513117284999964, 172.67515806200004],
 [-43.61832999999996, 172.72013000000004],
 [-43.750970

In [12]:
# Build a DataFrame of the retrieved coordinates
df_Locate = pd.DataFrame(Locate, columns=['Latitude', 'Longitude'])

In [13]:
# Merging the coordinates into the original dataframe
Data['Latitude'] = df_Locate['Latitude']
Data['Longitude'] = df_Locate['Longitude']

In [14]:
# Verify the result
print(Data.shape)
Data.head()

(76, 3)


Unnamed: 0,Suburbs,Latitude,Longitude
0,"Addington, New Zealand",-43.53965,172.6059
1,Aidanfield,-43.56434,172.56806
2,Aranui,-43.51394,172.705461
3,"Avondale, Christchurch",-43.530347,172.632427
4,Avonhead,-43.51033,172.55741
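
The pair -43.530347, 172.632427 appears several times in the coordinate list above (including for "Avondale, Christchurch"). This may mean the geocoder returned the same city-level result for more than one suburb. The sketch below flags rows that share a coordinate pair so they can be reviewed by hand. The helper name `flag_shared_coordinates` and the `decimals` parameter are hypothetical, and the toy rows are illustrative only.

```python
import pandas as pd


def flag_shared_coordinates(df, decimals=5):
    """Return the rows whose rounded (Latitude, Longitude) pair occurs more than once."""
    rounded = df[['Latitude', 'Longitude']].round(decimals)
    shared = rounded.duplicated(keep=False)  # True for every member of a duplicate group
    return df[shared]


# Toy stand-in for the notebook's `Data` frame (illustrative values only)
toy = pd.DataFrame({
    'Suburbs': ['A', 'B', 'C'],
    'Latitude': [-43.530347, -43.51394, -43.530347],
    'Longitude': [172.632427, 172.705461, 172.632427],
})
suspects = flag_shared_coordinates(toy)
print(suspects)
```

Calling `flag_shared_coordinates(Data)` would list the suburbs worth re-geocoding or correcting manually before the venue search.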


In [16]:
# save the DataFrame as CSV file
Data.to_csv("Christchurch_data.csv", index=False)

In [17]:
# Let's obtain the coordinates of Christchurch City
address = 'Christchurch, New Zealand'

geolocator = Nominatim(user_agent="chch_exp")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Christchurch are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Christchurch are -43.530955, 172.6366455.


**Map of Christchurch City**

In [20]:
# creating a map of Christchurch using latitude and longitude values
chch_map = folium.Map(location=[latitude, longitude], zoom_start=11)

# adding markers for suburbs
for lat, lng, suburbs in zip(Data['Latitude'], Data['Longitude'], Data['Suburbs']):
    label = '{}'.format(suburbs)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='darkred',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.9).add_to(chch_map)  
    
chch_map

_______

### **Part 3:** Foursquare API 

**Initializing API Credentials**

In [21]:
# define Foursquare Credentials and Version
CLIENT_ID = 'Sensitive_code' # your Foursquare ID
CLIENT_SECRET = 'Sensitive_code' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
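
Rather than typing the credentials into a cell, they can be read from the environment so they never appear in a shared notebook. This is only a sketch: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are assumptions made for illustration, not names defined by Foursquare or by the original notebook.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the client ID and secret from environment variables (hypothetical names)."""
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first')
    return client_id, client_secret
```

With both variables set, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would replace the two hard-coded assignments.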

**Using the API, we now explore each suburb for different venues**

* **Radius** is set to **800** meters as it is a small city
* **Limit** caps each request at the top **100** venues

In [22]:
radius = 800
LIMIT = 100

venues = []

for lat, long, suburbs in zip(Data['Latitude'], Data['Longitude'], Data['Suburbs']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            suburbs,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
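
The loop above indexes straight into the response, so a suburb with no result groups or a venue with an empty category list would raise an error. The sketch below separates the two concerns: building the query parameters and flattening a response defensively. Both helper names (`explore_params`, `parse_explore_items`) are hypothetical, and the response shape is assumed to be the one the loop above already relies on.

```python
def explore_params(client_id, client_secret, version, lat, lng, radius, limit):
    """Query parameters for the venues/explore request used above."""
    return {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }


def parse_explore_items(payload, suburb, lat, lng):
    """Flatten a response into (suburb, lat, lng, name, venue_lat, venue_lng, category) tuples."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Keep venues without a category instead of failing on categories[0]
        category = categories[0]['name'] if categories else None
        rows.append((suburb, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows
```

The dictionary can be passed as `requests.get("https://api.foursquare.com/v2/venues/explore", params=...)`, which lets `requests` handle URL encoding, and the parsed JSON can then be handed to `parse_explore_items`.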

In [23]:
# convert the venues list into a new DataFrame
Venue_data = pd.DataFrame(venues)

# define the column names
Venue_data.columns = ['Suburbs', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(Venue_data.shape)
Venue_data.head()

(1330, 7)


Unnamed: 0,Suburbs,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,"Addington, New Zealand",-43.53965,172.6059,Tower Junction Mega Centre,-43.538805,172.606176,Shopping Mall
1,"Addington, New Zealand",-43.53965,172.6059,The Court Theatre,-43.541363,172.610309,Theater
2,"Addington, New Zealand",-43.53965,172.6059,Addington Coffee Co-op,-43.54359,172.611595,Coffee Shop
3,"Addington, New Zealand",-43.53965,172.6059,North & South Gourmet (南北小厨),-43.543909,172.611512,Asian Restaurant
4,"Addington, New Zealand",-43.53965,172.6059,Christchurch Train Station,-43.539805,172.608015,Train Station


In [24]:
Venue_data.tail()

Unnamed: 0,Suburbs,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
1325,"Woolston, New Zealand",-43.5566,172.6801,The Tannery - Boutique Retail & Arts Emporium,-43.557187,172.680316,Boutique
1326,"Woolston, New Zealand",-43.5566,172.6801,Blue Smoke,-43.55694,172.680281,Wine Bar
1327,"Woolston, New Zealand",-43.5566,172.6801,In Situ,-43.558668,172.673332,Café
1328,"Woolston, New Zealand",-43.5566,172.6801,Three Boys Brewery,-43.555746,172.678998,Brewery
1329,"Woolston, New Zealand",-43.5566,172.6801,Mitchelli's,-43.557833,172.680027,Italian Restaurant


**Checking total number of venues for each suburb**

In [25]:
Venue_data.groupby(["Suburbs"]).count()

Unnamed: 0_level_0,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Suburbs,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Addington, New Zealand",33,33,33,33,33,33
Aidanfield,4,4,4,4,4,4
Aranui,5,5,5,5,5,5
"Avondale, Christchurch",91,91,91,91,91,91
Avonhead,8,8,8,8,8,8
Avonside,7,7,7,7,7,7
"Barrington, New Zealand",12,12,12,12,12,12
"Beckenham, New Zealand",8,8,8,8,8,8
"Belfast, New Zealand",17,17,17,17,17,17
"Bexley, New Zealand",4,4,4,4,4,4


**Checking the total number of unique venue categories in the city**

In [26]:
print('There are {} unique categories.'.format(len(Venue_data['VenueCategory'].unique())))

There are 152 unique categories.


In [30]:
Venue_data['VenueCategory'].unique()[:150]

array(['Shopping Mall', 'Theater', 'Coffee Shop', 'Asian Restaurant',
       'Train Station', 'Stadium', 'Falafel Restaurant',
       'Afghan Restaurant', 'Liquor Store', 'Bar', 'Pet Store',
       'Performing Arts Venue', 'Pizza Place', 'Gastropub', 'Hostel',
       'Café', 'Fast Food Restaurant', "Men's Store", 'Bakery',
       'Sporting Goods Shop', 'Racetrack', 'Chinese Restaurant',
       'Rugby Pitch', 'Shipping Store', 'Market',
       'Furniture / Home Store', 'Fish & Chips Shop', 'Pharmacy',
       'Lingerie Store', 'Park', 'Art Gallery', 'Cheese Shop',
       'Arts & Crafts Store', 'History Museum', 'Botanical Garden',
       'Food Truck', 'Bookstore', 'Lounge', 'Plaza', 'Restaurant',
       'Irish Pub', 'Department Store', 'Supermarket', 'Hotel',
       'Italian Restaurant', 'Playground', 'Gym', 'Speakeasy',
       'Thai Restaurant', 'Diner', 'Mediterranean Restaurant',
       'Modern European Restaurant', 'Cajun / Creole Restaurant',
       'Burger Joint', 'Historic Site', 

**Checking to see if "Indian Restaurant" is a category**

In [31]:
"Indian Restaurant" in Venue_data['VenueCategory'].unique()

True

In [36]:
a = pd.Series(Venue_data.VenueCategory) # Checking the top 20 venue categories
a.value_counts()[:20]

Café                    166
Hotel                    77
Bar                      68
Park                     59
Coffee Shop              45
Restaurant               37
Thai Restaurant          34
Italian Restaurant       28
Gastropub                27
Supermarket              26
Shopping Mall            23
Indian Restaurant        22
Pizza Place              22
Chinese Restaurant       22
Fast Food Restaurant     22
Department Store         21
Hostel                   20
Diner                    20
Bakery                   20
History Museum           17
Name: VenueCategory, dtype: int64

**Investigating each suburb by utilizing One Hot Encoding** 

In [38]:
# one hot encoding
Chch_onehot = pd.get_dummies(Venue_data[['VenueCategory']], prefix="", prefix_sep="")

# add suburbs column back to dataframe
Chch_onehot['Suburbs'] = Venue_data['Suburbs'] 

# move suburbs column to the first column
fixed_columns = [Chch_onehot.columns[-1]] + list(Chch_onehot.columns[:-1])
Chch_onehot = Chch_onehot[fixed_columns]
Chch_onehot.head(5)

Unnamed: 0,Suburbs,Accessories Store,Afghan Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Bakery,Bar,Bay,Beach,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bistro,Bookstore,Botanical Garden,Boutique,Breakfast Spot,Brewery,Burger Joint,Bus Stop,Business Service,Café,Cajun / Creole Restaurant,Camera Store,Campground,Casino,Cheese Shop,Chinese Restaurant,Clothing Store,Coffee Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Department Store,Diner,Egyptian Restaurant,Electronics Store,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Food Court,Food Truck,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden Center,Gas Station,Gastropub,Gay Bar,Gelato Shop,German Restaurant,Gift Shop,Golf Course,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Hill,Historic Site,History Museum,Hobby Shop,Home Service,Hostel,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Kids Store,Korean Restaurant,Lawyer,Lingerie Store,Liquor Store,Lounge,Market,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Miscellaneous Shop,Modern European Restaurant,Motel,Mountain,Moving Target,Multiplex,Music Store,Music Venue,Nature Preserve,Neighborhood,Nightclub,Noodle House,Organic Grocery,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Pier,Pizza Place,Playground,Plaza,Portuguese Restaurant,Post Office,Print Shop,Pub,Racetrack,Rental Car Location,Restaurant,Rugby Pitch,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Snack Place,Soccer Field,South Indian Restaurant,Souvlaki Shop,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Track,Trail,Train Station,Vegetarian / 
Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Yoga Studio
0,"Addington, New Zealand",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Addington, New Zealand",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
2,"Addington, New Zealand",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Addington, New Zealand",0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Addington, New Zealand",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0


In [39]:
Chch_grouped = Chch_onehot.groupby('Suburbs').mean().reset_index() # group rows by suburb and take the mean frequency of occurrence of each category
print(Chch_grouped.shape)
Chch_grouped

(72, 153)


Unnamed: 0,Suburbs,Accessories Store,Afghan Restaurant,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Bakery,Bar,Bay,Beach,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bistro,Bookstore,Botanical Garden,Boutique,Breakfast Spot,Brewery,Burger Joint,Bus Stop,Business Service,Café,Cajun / Creole Restaurant,Camera Store,Campground,Casino,Cheese Shop,Chinese Restaurant,Clothing Store,Coffee Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Department Store,Diner,Egyptian Restaurant,Electronics Store,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Food Court,Food Truck,Football Stadium,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden Center,Gas Station,Gastropub,Gay Bar,Gelato Shop,German Restaurant,Gift Shop,Golf Course,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Hill,Historic Site,History Museum,Hobby Shop,Home Service,Hostel,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Kids Store,Korean Restaurant,Lawyer,Lingerie Store,Liquor Store,Lounge,Market,Mediterranean Restaurant,Memorial Site,Men's Store,Mexican Restaurant,Miscellaneous Shop,Modern European Restaurant,Motel,Mountain,Moving Target,Multiplex,Music Store,Music Venue,Nature Preserve,Neighborhood,Nightclub,Noodle House,Organic Grocery,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Pier,Pizza Place,Playground,Plaza,Portuguese Restaurant,Post Office,Print Shop,Pub,Racetrack,Rental Car Location,Restaurant,Rugby Pitch,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shipping Store,Shoe Store,Shop & Service,Shopping Mall,Snack Place,Soccer Field,South Indian Restaurant,Souvlaki Shop,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Track,Trail,Train Station,Vegetarian / 
Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Yoga Studio
0,"Addington, New Zealand",0.0,0.030303,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.060606,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.151515,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.030303,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0
1,Aidanfield,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Aranui,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Avondale, Christchurch",0.0,0.0,0.0,0.010989,0.010989,0.010989,0.0,0.0,0.0,0.010989,0.076923,0.0,0.0,0.0,0.0,0.0,0.010989,0.010989,0.010989,0.0,0.0,0.0,0.010989,0.0,0.0,0.142857,0.010989,0.0,0.0,0.010989,0.010989,0.010989,0.010989,0.043956,0.0,0.0,0.0,0.010989,0.021978,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010989,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032967,0.010989,0.010989,0.010989,0.010989,0.0,0.0,0.010989,0.0,0.0,0.0,0.010989,0.021978,0.0,0.0,0.021978,0.098901,0.0,0.0,0.010989,0.010989,0.032967,0.010989,0.0,0.0,0.0,0.0,0.0,0.010989,0.0,0.010989,0.010989,0.0,0.010989,0.0,0.010989,0.0,0.0,0.0,0.010989,0.0,0.0,0.0,0.0,0.010989,0.0,0.0,0.0,0.010989,0.0,0.0,0.0,0.010989,0.0,0.010989,0.021978,0.0,0.0,0.0,0.010989,0.0,0.0,0.032967,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010989,0.0,0.0,0.0,0.0,0.0,0.010989,0.0,0.0,0.0,0.0,0.0,0.010989,0.0,0.0,0.0,0.0,0.021978,0.010989,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Avonhead,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Avonside,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,"Barrington, New Zealand",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.083333,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,"Beckenham, New Zealand",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,"Belfast, New Zealand",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.117647,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.058824,0.0,0.0,0.0,0.058824,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,"Bexley, New Zealand",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.75,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
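
Each value in the grouped table is the share of a suburb's venues that fall in a category. For example, Addington has 33 venues, so a category with one venue shows as 1/33 ≈ 0.030303. The toy example below (made-up suburbs "A" and "B", not project data) reproduces the same one-hot and group-mean steps on six rows so the arithmetic is easy to check by hand.

```python
import pandas as pd

# Six made-up venues in two made-up suburbs
toy_venues = pd.DataFrame({
    'Suburbs': ['A', 'A', 'A', 'A', 'B', 'B'],
    'VenueCategory': ['Café', 'Indian Restaurant', 'Café', 'Park', 'Park', 'Park'],
})

# One column per category, 1.0 where the venue belongs to it
toy_onehot = pd.get_dummies(toy_venues['VenueCategory'], dtype=float)
toy_onehot.insert(0, 'Suburbs', toy_venues['Suburbs'])

# The mean of a 0/1 column is the fraction of rows equal to 1
toy_grouped = toy_onehot.groupby('Suburbs').mean()
print(toy_grouped)
```

Suburb A has four venues, one of them an Indian restaurant, so its "Indian Restaurant" value is 0.25. Suburb B has none, so its value is 0.0.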


In [46]:
len(Chch_grouped[Chch_grouped["Indian Restaurant"] > 0])

20

In [47]:
# Creating new dataframe of Indian restaurants and Suburbs
Chch_IR = Chch_grouped[["Suburbs","Indian Restaurant"]] 

In [50]:
Chch_IR.head()

Unnamed: 0,Suburbs,Indian Restaurant
0,"Addington, New Zealand",0.0
1,Aidanfield,0.0
2,Aranui,0.0
3,"Avondale, Christchurch",0.010989
4,Avonhead,0.0


________

### **Part 4:** Kmeans Clustering

* **Number of Clusters:** "kclusters" is set to 3, so k-means will group the suburbs into 3 clusters

In [51]:
# set number of clusters
kclusters = 3

Chch_clustering = Chch_IR.drop(columns=["Suburbs"])

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Chch_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
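
The notebook fixes the number of clusters at 3. One common way to sanity-check such a choice is to compare the k-means inertia (within-cluster sum of squared distances) for several values of k and look for the point where the improvement levels off. The sketch below does this on a toy array standing in for the "Indian Restaurant" column. The values are made up for illustration and the result says nothing about the real data.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy stand-in for the 'Indian Restaurant' frequency column (not project data)
toy_freq = np.array([0.0, 0.0, 0.0, 0.01, 0.05, 0.06, 0.07, 0.2]).reshape(-1, 1)

inertias = {}
for k in range(1, 5):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(toy_freq)
    inertias[k] = model.inertia_  # within-cluster sum of squared distances

for k, value in inertias.items():
    print(k, round(value, 5))
```

Running the same loop on `Chch_clustering` would show whether 3 clusters is a reasonable elbow for the Christchurch data.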

In [52]:
# create a new dataframe that holds each suburb's Indian Restaurant frequency together with its cluster label
Chch_merged = Chch_IR.copy()

# add clustering labels
Chch_merged["Labels"] = kmeans.labels_

In [55]:
Chch_merged.head()

Unnamed: 0,Suburbs,Indian Restaurant,Labels
0,"Addington, New Zealand",0.0,1
1,Aidanfield,0.0,1
2,Aranui,0.0,1
3,"Avondale, Christchurch",0.010989,1
4,Avonhead,0.0,1


In [56]:
# merge Christchurch grouped data with our original Christchurch data 
# to add latitude/longitude for each suburb
Chch_merged = Chch_merged.join(Data.set_index("Suburbs"), on="Suburbs")

print(Chch_merged.shape)
Chch_merged.head() 

(72, 5)


Unnamed: 0,Suburbs,Indian Restaurant,Labels,Latitude,Longitude
0,"Addington, New Zealand",0.0,1,-43.53965,172.6059
1,Aidanfield,0.0,1,-43.56434,172.56806
2,Aranui,0.0,1,-43.51394,172.705461
3,"Avondale, Christchurch",0.010989,1,-43.530347,172.632427
4,Avonhead,0.0,1,-43.51033,172.55741


In [57]:
# sort the results by Cluster Labels
print(Chch_merged.shape)
Chch_merged.sort_values(["Labels"], inplace=True)
Chch_merged

(72, 5)


Unnamed: 0,Suburbs,Indian Restaurant,Labels,Latitude,Longitude
37,Merivale,0.05,0,-43.516545,172.615832
21,Edgeware,0.0625,0,-43.514293,172.647681
49,"Riccarton, New Zealand",0.037037,0,-43.5298,172.60119
23,Ferrymead,0.090909,0,-43.563577,172.702118
13,Bryndwr,0.076923,0,-43.504451,172.591626
59,"St Albans, New Zealand",0.1,0,-43.51343,172.63804
44,Papanui,0.035714,0,-43.495825,172.608419
51,"Richmond, Christchurch",0.047619,0,-43.50969,172.64245
42,"New Brighton, New Zealand",0.090909,0,-43.50733,172.72867
64,"Sydenham, New Zealand",0.052632,0,-43.55787,172.63663


**Map of Christchurch City with Cluster and Labels**

In [58]:
# create map
Chch_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Chch_merged['Latitude'], Chch_merged['Longitude'], Chch_merged['Suburbs'], Chch_merged['Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=1).add_to(Chch_clusters)
       
Chch_clusters

**Verification of clusters**

* **Cluster 0**

In [59]:
Chch_merged.loc[Chch_merged['Labels'] == 0]

Unnamed: 0,Suburbs,Indian Restaurant,Labels,Latitude,Longitude
37,Merivale,0.05,0,-43.516545,172.615832
21,Edgeware,0.0625,0,-43.514293,172.647681
49,"Riccarton, New Zealand",0.037037,0,-43.5298,172.60119
23,Ferrymead,0.090909,0,-43.563577,172.702118
13,Bryndwr,0.076923,0,-43.504451,172.591626
59,"St Albans, New Zealand",0.1,0,-43.51343,172.63804
44,Papanui,0.035714,0,-43.495825,172.608419
51,"Richmond, Christchurch",0.047619,0,-43.50969,172.64245
42,"New Brighton, New Zealand",0.090909,0,-43.50733,172.72867
64,"Sydenham, New Zealand",0.052632,0,-43.55787,172.63663


* **Cluster 1**

In [60]:
Chch_merged.loc[Chch_merged['Labels'] == 1]

Unnamed: 0,Suburbs,Indian Restaurant,Labels,Latitude,Longitude
45,"Parklands, New Zealand",0.0,1,-43.48164,172.706
50,"Richmond Hill, New Zealand",0.0,1,-43.575697,172.750407
43,Opawa,0.0,1,-43.555987,172.661955
47,Redcliffs,0.0,1,-43.565306,172.733929
48,"Redwood, Christchurch",0.0,1,-43.47743,172.6166
41,Murray Aynsley Hill,0.0,1,-43.558363,172.66588
40,"Mount Pleasant, New Zealand",0.0,1,-43.568966,172.722116
39,Moncks Bay,0.0,1,-43.568532,172.742179
46,"Phillipstown, New Zealand",0.0,1,-43.536644,172.662645
0,"Addington, New Zealand",0.0,1,-43.53965,172.6059


* **Cluster 2**

In [61]:
Chch_merged.loc[Chch_merged['Labels'] == 2]

Unnamed: 0,Suburbs,Indian Restaurant,Labels,Latitude,Longitude
60,St Andrews Hill,0.2,2,-43.55853,172.70962
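
K-means label numbers are arbitrary: label 0 is not guaranteed to be the "lowest" cluster. A small sketch of one way to make the ordering explicit is to rank the clusters by their centre, so rank 0 is the lowest Indian-restaurant frequency and the highest rank is the highest. The helper `rank_labels_by_center` is hypothetical and the toy values are illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans


def rank_labels_by_center(model):
    """Map each raw k-means label to its rank (0 = lowest centre) for 1-D data."""
    order = np.argsort(model.cluster_centers_.ravel())
    return {int(label): rank for rank, label in enumerate(order)}


# Toy stand-in for the 'Indian Restaurant' frequency column (not project data)
toy_freq = np.array([0.0, 0.0, 0.0, 0.01, 0.05, 0.06, 0.07, 0.2]).reshape(-1, 1)
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(toy_freq)

ranks = rank_labels_by_center(model)
ranked = [ranks[int(label)] for label in model.labels_]
print(ranked)
```

Applied to the fitted `kmeans` object from Part 4, the same mapping would show directly which label corresponds to the low, moderate and high frequency groups.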


### **Part 5:** Observation Analysis

**After examining the clusters, we can observe that:**

* **Cluster 2 (Green)** has the highest frequency of Indian restaurants (0.2) but contains a single suburb, St Andrews Hill
* **Cluster 0 (Red)** contains suburbs with a moderate frequency of Indian restaurants (about 0.04 to 0.10 in the rows shown above)
* **Cluster 1 (Purple)** contains very few or no Indian restaurants

**Cluster 1** can be called the cluster of opportunity, as it covers about 80% of the suburbs in the city of Christchurch. I therefore recommend that investors, entrepreneurs, or anyone else looking to open a take-away Indian restaurant capitalize on these findings. This cluster also covers the central part of the city, which offers plenty of opportunity because it is being redeveloped and new spaces and land are up for sale or rent.

________
**END**
________