##### Welcome to my Capstone project with IBM




# Battle of the Neighborhoods
## Introduction

This project aims to determine and visualise the best location for a steak and grill restaurant in Galway.
Galway is situated on the west coast of Ireland with a population greater than 250,000.
It is renowned for its vibrant lifestyle and for hosting numerous festivals, celebrations and events such as The Galway Arts Festival. In 2018, it was named the European Region of Gastronomy. The city is currently the European Capital of Culture for 2020, alongside Rijeka, Croatia. Alongside its strong cultural presence, it is a leading centre for medical and pharmaceutical research and home to biomedical multinational companies.
By combining information readily available on the internet with data science techniques, we can analyse each neighborhood in Galway and understand its venues and amenities. At a later stage, if the data were available, we could add further measures such as average house prices, crime rates and a neighborhood happiness index to strengthen the basis for choosing an area.

We chose a steakhouse and grill restaurant to take advantage of the high-quality meat produced in the west of Ireland and the availability of fresh fish from the Atlantic.


## Data

The data used in this project is taken from readily available online sources.
We scrape the list of neighborhoods and then parse it using the BeautifulSoup package.
As outlined in the introduction, statistical data from the census (housing prices, employment figures, crime rates and more) could be merged in if required.
Location and venue data are then added using the Foursquare API. The report outlines the top ten venue categories for each neighborhood, ranked by the mean frequency of each category.
Ideally we would prefer not to open near another restaurant, to avoid competition; the idea is that customers would travel to this location to enjoy the meal. A pleasant location with ample car parking, away from busy areas, would be preferred.
We would have liked to include traffic data, but no free public dataset was available. The next section starts the analysis.
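
As a rough illustration of the "avoid competition" criterion, the sketch below scores a neighborhood by the share of its venues whose category looks food-related. The keyword list and the function name `food_share` are assumptions made for illustration only; the two sample profiles use category frequencies of the kind computed later in this notebook.

```python
# Hypothetical keyword list: which category names count as direct competition.
FOOD_KEYWORDS = ("Restaurant", "Steakhouse", "Burger Joint", "Gastropub", "Bistro", "Diner")


def food_share(category_freq):
    """Sum the frequencies of categories whose name contains a food keyword."""
    return sum(
        freq
        for category, freq in category_freq.items()
        if any(keyword in category for keyword in FOOD_KEYWORDS)
    )


# Category frequencies per neighborhood (each profile sums to 1.0).
profiles = {
    "Ballybane": {
        "Rental Car Location": 0.4,
        "Construction & Landscaping": 0.2,
        "Grocery Store": 0.2,
        "Gas Station": 0.2,
    },
    '"New" Mervue': {
        "Rental Car Location": 0.2,
        "Grocery Store": 0.2,
        "Furniture / Home Store": 0.2,
        "Burger Joint": 0.2,
        "Betting Shop": 0.2,
    },
}

for name, freqs in profiles.items():
    print(f"{name}: food-related share = {food_share(freqs):.2f}")
```

A lower share suggests less direct competition, although it says nothing about parking or footfall, which would need separate data.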

### Accessing Data for project 

In [1]:
import pandas as pd
import requests
from bs4 import BeautifulSoup
from tabulate import tabulate

In [2]:
res = requests.get("http://www.galwaytransport.info/2008/12/galway-neighbourhood-maps-g.html")
soup = BeautifulSoup(res.content,'lxml')

In [3]:
# drop the first <a> tag on the page before collecting links
final_link = soup.a
final_link.decompose()

# collect the text of every link found inside a table cell
tds = soup.find_all('td')
names1 = []
for td in tds:
    for link in td.find_all('a'):
        fulllink = link.get('href')
        names = link.contents[0]
        names1.append(names)
        #print(fulllink)
        #print(names) # print in terminal to verify results
        


In [4]:
yup = ['Neighborhood']
df2 = pd.DataFrame(data = names1, columns = yup)
df2 = df2.assign(City=lambda x: 'Galway')
df2.drop(df2.tail(4).index,inplace=True)
df2['Address']= df2[['Neighborhood','City']].agg(','.join,axis =1 )
Galway = df2
Galway

Unnamed: 0,Neighborhood,City,Address
0,Ballinfoyle,Galway,"Ballinfoyle,Galway"
1,Ballybane,Galway,"Ballybane,Galway"
2,Ballybrit,Galway,"Ballybrit,Galway"
3,Bohermore,Galway,"Bohermore ,Galway"
4,Castlegar,Galway,"Castlegar,Galway"
5,Claddagh,Galway,"Claddagh ,Galway"
6,Cathedral,Galway,"Cathedral,Galway"
7,College Rd,Galway,"College Rd,Galway"
8,Docks,Galway,"Docks,Galway"
9,Dominic St,Galway,"Dominic St,Galway"
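
Some scraped names carry trailing whitespace, which is why addresses such as "Bohermore ,Galway" appear in the table above. A minimal sketch of normalising the name before building the address follows; the helper name `build_address` and the ", " separator are assumptions, not part of the original notebook.

```python
def build_address(neighborhood, city="Galway"):
    """Collapse stray whitespace in the scraped name and append the city."""
    clean_name = " ".join(neighborhood.split())
    return f"{clean_name}, {city}"


# A few names as they appear in the scraped table (note the trailing spaces).
scraped = ["Ballybane", "Bohermore ", "Claddagh ", "College Rd"]
addresses = [build_address(name) for name in scraped]
print(addresses)
```

Cleaner address strings may also give the geocoder a better chance of matching, though that is not guaranteed.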


## Using the Foursquare API

In [5]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files


import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans



print('Libraries imported.')

Libraries imported.


In [6]:
!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - folium=0.5.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    python_abi-3.6             |          1_cp36m           4 KB  conda-forge
    folium-0.5.0               |             py_0          45 KB  conda-forge
    branca-0.4.1               |             py_0          26 KB  conda-forge
    certifi-2020.4.5.2         |   py36h9f0ad1d_0         152 KB  conda-forge
    ca-certificates-2020.4.5.2 |       hecda079_0         147 KB  conda-forge
    altair-4.1.0               |             py_1         614 KB  conda-forge
    openssl-1.1.1g             |       h516909a_0         2.1 MB  conda-forge
    vincent-0.4.4              |             py_1          28 KB  conda-forge
    ------------------------------------------------------------
                       

In [7]:
!conda install -c conda-forge geopy --yes # installs geopy for geocoding; comment this line out if geopy is already installed


Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    geopy-1.22.0               |     pyh9f0ad1d_0          63 KB  conda-forge
    geographiclib-1.50         |             py_0          34 KB  conda-forge
    ------------------------------------------------------------
                                           Total:          97 KB

The following NEW packages will be INSTALLED:

    geographiclib: 1.50-py_0           conda-forge
    geopy:         1.22.0-pyh9f0ad1d_0 conda-forge


Downloading and Extracting Packages
geopy-1.22.0         | 63 KB     | ##################################### | 100% 
geographiclib-1.50   | 34 KB     | ##################################### | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

In [8]:
from geopy.extra.rate_limiter import RateLimiter
from geopy.geocoders import Nominatim 
df = Galway
locator = Nominatim(user_agent='myGeocoder')
# 1 - convenient function to delay between geocoding calls
geocode = RateLimiter(locator.geocode, min_delay_seconds=1)
# 2 - create location column
df['location'] = df['Address'].apply(geocode)
# 3 - create latitude, longitude and altitude from location column (returns tuple)
df['point'] = df['location'].apply(lambda loc: tuple(loc.point) if loc else None)
df

Unnamed: 0,Neighborhood,City,Address,location,point
0,Ballinfoyle,Galway,"Ballinfoyle,Galway",,
1,Ballybane,Galway,"Ballybane,Galway","(Ballybane, Ballybaan, Galway Municipal Distri...","(53.2861099, -9.0075049, 0.0)"
2,Ballybrit,Galway,"Ballybrit,Galway","(Ballybrit, Galway Municipal District, County ...","(53.29605135, -8.999850268879754, 0.0)"
3,Bohermore,Galway,"Bohermore ,Galway","(Bohermore, Townparks, Eyre Square, Galway Mun...","(53.2800015, -9.0430391, 0.0)"
4,Castlegar,Galway,"Castlegar,Galway","(Castlegar, Galway Municipal District, County ...","(53.29862035, -9.022817110178078, 0.0)"
5,Claddagh,Galway,"Claddagh ,Galway","(Claddagh, Galway Municipal District, County G...","(53.2674717, -9.0552822, 0.0)"
6,Cathedral,Galway,"Cathedral,Galway","(Cathedral, Earl's Island, Townparks, Nun's Is...","(53.2757878, -9.057282, 0.0)"
7,College Rd,Galway,"College Rd,Galway","(College Road, Townparks, Eyre Square, Galway ...","(53.280103100000005, -9.036835336996791, 0.0)"
8,Docks,Galway,"Docks,Galway","(Claddagh Basin, Cathair na Gaillimhe, County ...","(53.269408, -9.055739416010635, 0.0)"
9,Dominic St,Galway,"Dominic St,Galway","(Dominic Street, Portumna, Portumna ED, Loughr...","(53.0924733, -8.2156929, 0.0)"


In [9]:
Galway1 = df
Galway1.dropna(
    axis=0,
    how='any',
    thresh=None,
    subset=None,
    inplace=True
)
# 4 - split point column into latitude, longitude and altitude columns
Galway1[['latitude', 'longitude', 'altitude']] = pd.DataFrame(Galway1['point'].tolist(), index=df.index)
Galway1 = Galway1[Galway1.columns[[0,2,5,6]]]
Galway1

Unnamed: 0,Neighborhood,Address,latitude,longitude
1,Ballybane,"Ballybane,Galway",53.28611,-9.007505
2,Ballybrit,"Ballybrit,Galway",53.296051,-8.99985
3,Bohermore,"Bohermore ,Galway",53.280001,-9.043039
4,Castlegar,"Castlegar,Galway",53.29862,-9.022817
5,Claddagh,"Claddagh ,Galway",53.267472,-9.055282
6,Cathedral,"Cathedral,Galway",53.275788,-9.057282
7,College Rd,"College Rd,Galway",53.280103,-9.036835
8,Docks,"Docks,Galway",53.269408,-9.055739
9,Dominic St,"Dominic St,Galway",53.092473,-8.215693
10,Doughiska / Briarhill,"Doughiska / Briarhill,Galway",53.293813,-8.985367
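
The geocoder does not always return the intended place: in the table above, "Dominic St,Galway" resolved to Dominic Street in Portumna (53.09, -8.22) rather than a street in Galway city. One way to catch such cases is to measure each point's distance from the city centre. The sketch below uses the haversine formula; the 30 km cut-off and the helper name `haversine_km` are assumptions, and a cut-off like this only makes sense for the city neighborhoods, not for the outlying towns in the list.

```python
from math import asin, cos, radians, sin, sqrt

# City-centre coordinates that Nominatim returns for 'Galway, Ireland' in this notebook.
GALWAY_CENTRE = (53.2744122, -9.0490632)


def haversine_km(point_a, point_b):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (*point_a, *point_b))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


# Two geocoded points taken from the table above.
geocoded = {
    "Claddagh": (53.2674717, -9.0552822),
    "Dominic St": (53.0924733, -8.2156929),
}

MAX_KM = 30  # hypothetical cut-off for a city neighborhood
for name, point in geocoded.items():
    distance = haversine_km(GALWAY_CENTRE, point)
    flag = "CHECK" if distance > MAX_KM else "ok"
    print(f"{name}: {distance:.1f} km from centre [{flag}]")
```

Flagged rows could then be re-geocoded with a more specific address or corrected by hand.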


### Begin analysing data

In [10]:
address = 'Galway, Ireland'

geolocator = Nominatim(user_agent="Moods")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Galway City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Galway City are 53.2744122, -9.0490632.


In [11]:
# create map of Galway using latitude and longitude values
Galway_ire = folium.Map(location=[latitude, longitude], zoom_start=9)

# add markers to map
# (loop variables are named lat/lng so the city-centre latitude/longitude used for later maps is not overwritten)
for lat, lng, Address, Neighborhood in zip(Galway1['latitude'], Galway1['longitude'], Galway1['Address'], Galway1['Neighborhood']):
    label = '{}, {}'.format(Neighborhood, Address)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(Galway_ire)

Galway_ire

In [12]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [13]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
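
The function above builds the query string by hand and assumes every venue has at least one category. A slightly more defensive variant might look like the sketch below. The helper names `build_explore_url` and `extract_venue_rows` are hypothetical, and the sample items are made up to mirror only the fields that `getNearbyVenues` reads; no request is sent.

```python
from urllib.parse import urlencode

# Same endpoint as in getNearbyVenues above.
EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=30):
    """Build the explore URL, letting urlencode escape the parameter values."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    return f"{EXPLORE_ENDPOINT}?{urlencode(params)}"


def extract_venue_rows(name, lat, lng, items):
    """Turn response items into row tuples, tolerating venues without a category."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"], category))
    return rows


# Made-up items shaped like the fields used above; not real API output.
sample_items = [
    {"venue": {"name": "Example Pub", "location": {"lat": 53.27, "lng": -9.05},
               "categories": [{"name": "Pub"}]}},
    {"venue": {"name": "Uncategorised Place", "location": {"lat": 53.28, "lng": -9.04},
               "categories": []}},
]

url = build_explore_url("ID", "SECRET", "20180604", 53.27, -9.05)
rows = extract_venue_rows("Claddagh", 53.2674717, -9.0552822, sample_items)
print(url)
print(rows)
```

Rows with a `None` category can then be dropped or labelled before one-hot encoding, rather than stopping the whole loop with an `IndexError`.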

In [14]:
CLIENT_ID = '3P1OQRTPOSFMIU3WGKOGEK41QFDLUCE5W2SSKYROZUR5KN2I' # your Foursquare ID
CLIENT_SECRET = 'YF2ZZ3G1O42XCKGIM2Q3NDB1CWMASN44DLMJOVRTVWLAGQNI' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 3P1OQRTPOSFMIU3WGKOGEK41QFDLUCE5W2SSKYROZUR5KN2I
CLIENT_SECRET:YF2ZZ3G1O42XCKGIM2Q3NDB1CWMASN44DLMJOVRTVWLAGQNI
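
Hard-coding and printing API credentials makes them visible to anyone who reads the notebook. A common alternative is to read them from environment variables. The sketch below assumes variables named `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`; those names and the helper `load_foursquare_credentials` are illustrative choices, not something Foursquare prescribes.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) from the given mapping, or raise a clear error."""
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {"FOURSQUARE_CLIENT_ID": "demo-id", "FOURSQUARE_CLIENT_SECRET": "demo-secret"}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID loaded:", bool(client_id))
```

In the notebook itself, calling `load_foursquare_credentials()` with no argument would read from the real environment, and nothing secret needs to be printed.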


In [15]:
Galway_venues = getNearbyVenues(names=Galway1['Neighborhood'],
                                   latitudes=Galway1['latitude'],
                                   longitudes=Galway1['longitude']
                                  )

Ballybane
Ballybrit
Bohermore 
Castlegar
Claddagh 
Cathedral
College Rd
Docks
Dominic St
Doughiska / Briarhill
Kirwan's Lane
Knocknacarra
Merchant's Road
Merlin Park
"New" Mervue
Old Mervue
Newcastle / Dangan
Parkmore
Prospect Hill
Renmore
Roscam
Salthill - Lower
Salthill - Upper
Shantalla
Taylor's Hill
Terryland 
University
Wellpark
Westside
Woodquay
Athenry
Barna
Clarinbridge
Claregalway
Clifden
Kilcolgan
Kinvara
Loughrea
Monivea
Moycullen
Oranmore
Oughterard
Spiddal
Spanish Point


In [16]:
#Galway_venues.groupby('Neighborhood').count()
print('There are {} unique categories.'.format(len(Galway_venues['Venue Category'].unique())))

There are 87 unique categories.


### Analyze each neighborhood

In [17]:
# one hot encoding
Galway_onehot = pd.get_dummies(Galway_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Galway_onehot['Neighborhood'] = Galway_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Galway_onehot.columns[-1]] + list(Galway_onehot.columns[:-1])
Galway_onehot = Galway_onehot[fixed_columns]

Galway_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Aquarium,Arcade,Asian Restaurant,Athletics & Sports,Bakery,Bar,Bay,Beach,Bed & Breakfast,Beer Bar,Betting Shop,Bistro,Bookstore,Boxing Gym,Breakfast Spot,Burger Joint,Bus Stop,Cafeteria,Café,Castle,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner,Discount Store,Electronics Store,Fast Food Restaurant,Flea Market,Food & Drink Shop,French Restaurant,Furniture / Home Store,Gas Station,Gastropub,Gift Shop,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Home Service,Hostel,Hotel,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jewelry Store,Lake,Malay Restaurant,Mexican Restaurant,Movie Theater,Multiplex,Park,Pet Store,Pharmacy,Pie Shop,Pizza Place,Playground,Plaza,Pub,Racetrack,Rental Car Location,Restaurant,Rock Club,Rugby Pitch,Salad Place,Sandwich Place,Science Museum,Seafood Restaurant,Shopping Mall,Snack Place,Spanish Restaurant,Sports Club,Stadium,Supermarket,Tea Room,Theater,Trail,Train Station,Warehouse Store,Waste Facility,Waterfront
0,Ballybane,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Ballybane,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Ballybane,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Ballybane,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Ballybane,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [18]:
# Next, group rows by neighborhood and take the mean frequency of occurrence of each category
Galway_grouped = Galway_onehot.groupby('Neighborhood').mean().reset_index()
Galway_grouped

Unnamed: 0,Neighborhood,American Restaurant,Aquarium,Arcade,Asian Restaurant,Athletics & Sports,Bakery,Bar,Bay,Beach,Bed & Breakfast,Beer Bar,Betting Shop,Bistro,Bookstore,Boxing Gym,Breakfast Spot,Burger Joint,Bus Stop,Cafeteria,Café,Castle,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner,Discount Store,Electronics Store,Fast Food Restaurant,Flea Market,Food & Drink Shop,French Restaurant,Furniture / Home Store,Gas Station,Gastropub,Gift Shop,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Home Service,Hostel,Hotel,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jewelry Store,Lake,Malay Restaurant,Mexican Restaurant,Movie Theater,Multiplex,Park,Pet Store,Pharmacy,Pie Shop,Pizza Place,Playground,Plaza,Pub,Racetrack,Rental Car Location,Restaurant,Rock Club,Rugby Pitch,Salad Place,Sandwich Place,Science Museum,Seafood Restaurant,Shopping Mall,Snack Place,Spanish Restaurant,Sports Club,Stadium,Supermarket,Tea Room,Theater,Trail,Train Station,Warehouse Store,Waste Facility,Waterfront
0,"""New"" Mervue",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Athenry,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0
2,Ballybane,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Ballybrit,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Barna,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.4,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Bohermore,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0625,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0625,0.1875,0.0,0.0,0.0,0.0,0.0625,0.0,0.0
6,Castlegar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
7,Cathedral,0.0,0.0,0.0,0.033333,0.0,0.066667,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.033333,0.033333,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.066667,0.0,0.066667,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0
8,Claddagh,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.033333,0.2,0.0,0.0,0.2,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0
9,Claregalway,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
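
Taking the mean of one-hot rows is the same as dividing each category's count by the number of venues found in that neighborhood. The sketch below reproduces the Ballybane row using only the standard library; the category list corresponds to the five Ballybane venues in the one-hot table, and the function name `category_frequencies` is an assumption.

```python
from collections import Counter

# Categories of the five venues returned for Ballybane.
ballybane_categories = [
    "Grocery Store",
    "Rental Car Location",
    "Rental Car Location",
    "Gas Station",
    "Construction & Landscaping",
]


def category_frequencies(categories):
    """Share of venues per category, equivalent to the mean of one-hot columns."""
    counts = Counter(categories)
    total = len(categories)
    return {category: count / total for category, count in counts.items()}


freqs = category_frequencies(ballybane_categories)
for category, freq in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(f"{category}: {freq:.1f}")
```

Because the values are shares rather than counts, a neighborhood with a single venue (such as Ballybrit with its racetrack) gets a frequency of 1.0 for that category.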


## Let's print each neighborhood along with the top 5 most common venues

In [19]:
num_top_venues = 5

for hood in Galway_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Galway_grouped[Galway_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----"New" Mervue----
                    venue  freq
0     Rental Car Location   0.2
1           Grocery Store   0.2
2  Furniture / Home Store   0.2
3            Burger Joint   0.2
4            Betting Shop   0.2


----Athenry----
                 venue  freq
0        Train Station  0.25
1               Castle  0.25
2                 Café  0.25
3         Betting Shop  0.25
4  American Restaurant  0.00


----Ballybane----
                        venue  freq
0         Rental Car Location   0.4
1  Construction & Landscaping   0.2
2               Grocery Store   0.2
3                 Gas Station   0.2
4         American Restaurant   0.0


----Ballybrit----
                 venue  freq
0            Racetrack   1.0
1  American Restaurant   0.0
2        Movie Theater   0.0
3                Plaza   0.0
4           Playground   0.0


----Barna----
           venue  freq
0   Home Service   0.4
1          Hotel   0.2
2            Gym   0.2
3         Bistro   0.2
4  Movie Theater   0.0


----Boher

In [20]:
def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighborhood name) and sort the category frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [34]:
num_top_venues = 10
import numpy as np

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Galway_grouped['Neighborhood']

for ind in np.arange(Galway_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Galway_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head(5)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"""New"" Mervue",Grocery Store,Rental Car Location,Furniture / Home Store,Burger Joint,Betting Shop,Waterfront,Discount Store,Cocktail Bar,Coffee Shop,Construction & Landscaping
1,Athenry,Castle,Café,Train Station,Betting Shop,Electronics Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store
2,Ballybane,Rental Car Location,Grocery Store,Gas Station,Construction & Landscaping,Waterfront,Discount Store,Clothing Store,Cocktail Bar,Coffee Shop,Convenience Store
3,Ballybrit,Racetrack,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner
4,Barna,Home Service,Gym,Bistro,Hotel,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store
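
Note that neighborhoods with fewer than ten venue categories still fill all ten columns: Ballybrit has only a racetrack, so its "2nd" to "10th" entries are categories with a frequency of 0.0 in whatever order the sort leaves them. One possible way to avoid reading meaning into those entries is to keep only categories that actually occur, as in the sketch below. The function name `top_categories` and the alphabetical tie-break are assumptions; the sample frequencies come from the top-5 printout above.

```python
def top_categories(category_freq, num_top=10):
    """Rank categories by frequency, leaving out those that never occur."""
    present = [(category, freq) for category, freq in category_freq.items() if freq > 0]
    # Highest frequency first; ties are broken alphabetically for a stable order.
    present.sort(key=lambda pair: (-pair[1], pair[0]))
    return [category for category, _ in present[:num_top]]


ballybrit = {"Racetrack": 1.0, "American Restaurant": 0.0, "Movie Theater": 0.0, "Plaza": 0.0}
athenry = {
    "Train Station": 0.25,
    "Castle": 0.25,
    "Café": 0.25,
    "Betting Shop": 0.25,
    "American Restaurant": 0.0,
}

print(top_categories(ballybrit))
print(top_categories(athenry))
```

The resulting lists vary in length, so they would need padding (for example with `None`) before being written into fixed columns.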


### Cluster Neighborhoods

In [33]:
# set number of clusters
kclusters = 9

Galway_grouped_clustering = Galway_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Galway_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:12]

array([5, 6, 5, 2, 0, 6, 1, 0, 0, 4, 0, 0], dtype=int32)
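
With nine clusters for a few dozen neighborhoods, some clusters may hold only one member, which makes them hard to interpret. A quick way to check is to count the labels. The sketch below uses only the twelve labels printed above, so the counts are partial; in the notebook the same idea would apply to the full `kmeans.labels_` array.

```python
from collections import Counter

# The first 12 cluster labels shown in the output above.
first_labels = [5, 6, 5, 2, 0, 6, 1, 0, 0, 4, 0, 0]

sizes = Counter(first_labels)
for label, size in sorted(sizes.items()):
    print(f"cluster {label}: {size} neighborhood(s)")

# Clusters that have exactly one member among these twelve rows.
singletons = sorted(label for label, size in sizes.items() if size == 1)
print("clusters with a single member so far:", singletons)
```

If most clusters turn out to be singletons over the full data, a smaller `kclusters` may give groups that are easier to compare.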

In [35]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)


In [36]:
Galway_merged = Galway1

# merge Galway_grouped with Galway_data to add latitude/longitude for each neighborhood
Galway_merged = Galway_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

Galway_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,Address,latitude,longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Ballybane,"Ballybane,Galway",53.28611,-9.007505,5.0,Rental Car Location,Grocery Store,Gas Station,Construction & Landscaping,Waterfront,Discount Store,Clothing Store,Cocktail Bar,Coffee Shop,Convenience Store
2,Ballybrit,"Ballybrit,Galway",53.296051,-8.99985,2.0,Racetrack,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner
3,Bohermore,"Bohermore ,Galway",53.280001,-9.043039,6.0,Supermarket,Bed & Breakfast,Fast Food Restaurant,Stadium,Lake,Shopping Mall,Electronics Store,Racetrack,Department Store,Movie Theater
4,Castlegar,"Castlegar,Galway",53.29862,-9.022817,1.0,Waste Facility,Gym / Fitness Center,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner,Discount Store
5,Claddagh,"Claddagh ,Galway",53.267472,-9.055282,0.0,Pub,Restaurant,Pie Shop,Plaza,Rock Club,Bookstore,Mexican Restaurant,Coffee Shop,Beer Bar,Seafood Restaurant


## Create Map

In [37]:
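# neighborhoods for which no venues were returned have no cluster label; drop them rather than assigning them to cluster 0
Galway_merged = Galway_merged.dropna(subset=['Cluster Labels']).copy()
Galway_merged['Cluster Labels'] = Galway_merged['Cluster Labels'].astype(int)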

In [38]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Galway_merged['latitude'], Galway_merged['longitude'], Galway_merged['Neighborhood'], Galway_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
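
The map cell looks colours up with `rainbow[cluster-1]`. This works because Python's negative indexing sends label 0 to the last colour, but a direct `palette[cluster]` lookup is easier to follow. The sketch below builds such a palette with the standard-library `colorsys` module instead of matplotlib; that swap, and the function name `cluster_palette`, are choices made for this illustration.

```python
import colorsys


def cluster_palette(n_clusters):
    """Return one hex colour per cluster label, evenly spaced around the hue wheel."""
    palette = []
    for label in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(label / n_clusters, 1.0, 1.0)
        palette.append("#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255)))
    return palette


palette = cluster_palette(9)
for label, colour in enumerate(palette):
    print(label, colour)
```

Each marker could then use `color=palette[cluster]`, so label 0 maps to the first colour and label 8 to the last.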

## Examine Clusters 

In [102]:
Galway_merged.loc[Galway_merged['Cluster Labels'] == 1, Galway_merged.columns[[1] + list(range(5, Galway_merged.shape[1]))]]

Unnamed: 0,Address,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
20,"Renmore,Galway",Beach,Waterfront,Financial or Legal Service,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner,Discount Store


In [103]:
Galway_merged.loc[Galway_merged['Cluster Labels'] == 2, Galway_merged.columns[[1] + list(range(5, Galway_merged.shape[1]))]]

Unnamed: 0,Address,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,"Bohermore ,Galway",Supermarket,Fast Food Restaurant,Shopping Mall,Electronics Store,Bed & Breakfast,Lake,Stadium,Racetrack,Movie Theater,Gym
5,"Claddagh ,Galway",Restaurant,Pub,Spanish Restaurant,Rock Club,Breakfast Spot,Mexican Restaurant,Plaza,Seafood Restaurant,Beer Bar,Japanese Restaurant
6,"Cathedral,Galway",Pub,Café,Ice Cream Shop,Gastropub,Bakery,Italian Restaurant,Cheese Shop,Mexican Restaurant,Department Store,Restaurant
7,"College Rd,Galway",Bed & Breakfast,American Restaurant,Stadium,Hotel,Multiplex,Pub,Racetrack,Shopping Mall,Gym,Bay
8,"Docks,Galway",Pub,Restaurant,Bookstore,Mexican Restaurant,Cocktail Bar,Café,Coffee Shop,Pie Shop,Pizza Place,Japanese Restaurant
9,"Dominic St,Galway",Fast Food Restaurant,Grocery Store,Castle,Restaurant,Indian Restaurant,Electronics Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store
11,"Kirwan's Lane,Galway",Pub,Restaurant,Pizza Place,Bookstore,Seafood Restaurant,Cheese Shop,Mexican Restaurant,Pie Shop,Coffee Shop,Japanese Restaurant
13,"Merchant's Road,Galway",Pub,Pizza Place,Restaurant,Ice Cream Shop,Bookstore,Bistro,Seafood Restaurant,Cheese Shop,Mexican Restaurant,Pie Shop
23,"Salthill - Upper,Galway",Aquarium,Café,Fast Food Restaurant,Italian Restaurant,Pub,Waterfront,Restaurant,Arcade,Bed & Breakfast,Coffee Shop
24,"Shantalla,Galway",Grocery Store,Sandwich Place,Pizza Place,Café,Gastropub,Pub,Supermarket,Fast Food Restaurant,Mexican Restaurant,Discount Store


In [104]:
Galway_merged.loc[Galway_merged['Cluster Labels'] == 3, Galway_merged.columns[[1] + list(range(5, Galway_merged.shape[1]))]]

Unnamed: 0,Address,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,"Castlegar,Galway",Waste Facility,Fast Food Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner,Discount Store


In [105]:
Galway_merged.loc[Galway_merged['Cluster Labels'] == 4, Galway_merged.columns[[1] + list(range(5, Galway_merged.shape[1]))]]

Unnamed: 0,Address,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,"Merlin Park,Galway",Gift Shop,Waterfront,Fast Food Restaurant,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner,Discount Store


In [106]:
Galway_merged.loc[Galway_merged['Cluster Labels'] == 5, Galway_merged.columns[[1] + list(range(5, Galway_merged.shape[1]))]]

Unnamed: 0,Address,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
39,"Monivea,Galway",Playground,Park,Boxing Gym,Electronics Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner


In [107]:
Galway_merged.loc[Galway_merged['Cluster Labels'] == 6, Galway_merged.columns[[1] + list(range(5, Galway_merged.shape[1]))]]

Unnamed: 0,Address,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,"Ballybane,Galway",Rental Car Location,Grocery Store,Gas Station,Bus Stop,Waterfront,Discount Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store
15,"""New"" Mervue,Galway",Grocery Store,Rental Car Location,Furniture / Home Store,Burger Joint,Sports Club,Waterfront,Discount Store,Cocktail Bar,Coffee Shop,Construction & Landscaping
16,"Old Mervue,Galway",Grocery Store,Rental Car Location,Furniture / Home Store,Burger Joint,Sports Club,Waterfront,Discount Store,Cocktail Bar,Coffee Shop,Construction & Landscaping


In [108]:
Galway_merged.loc[Galway_merged['Cluster Labels'] == 7, Galway_merged.columns[[1] + list(range(5, Galway_merged.shape[1]))]]

Unnamed: 0,Address,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,"Ballybrit,Galway",Racetrack,Cheese Shop,Clothing Store,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner,Discount Store


In [109]:
Galway_merged.loc[Galway_merged['Cluster Labels'] == 8, Galway_merged.columns[[1] + list(range(5, Galway_merged.shape[1]))]]

Unnamed: 0,Address,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,"Newcastle / Dangan,Galway",Café,Trail,Bus Stop,Science Museum,Snack Place,Gas Station,Furniture / Home Store,Clothing Store,Cocktail Bar,Coffee Shop
27,"University,Galway",Café,Hotel,Rugby Pitch,Bus Stop,Waterfront,Electronics Store,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store
36,"Kilcolgan,Galway",Café,Gastropub,Bus Stop,Pet Store,Diner,Waterfront,Electronics Store,Coffee Shop,Construction & Landscaping,Convenience Store


In [110]:
Galway_merged.loc[Galway_merged['Cluster Labels'] == 9, Galway_merged.columns[[1] + list(range(5, Galway_merged.shape[1]))]]

Unnamed: 0,Address,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
18,"Parkmore,Galway",Athletics & Sports,Sports Club,Waterfront,Fast Food Restaurant,Cocktail Bar,Coffee Shop,Construction & Landscaping,Convenience Store,Department Store,Diner


In [111]:
Galway_merged.loc[Galway_merged['Cluster Labels'] == 10, Galway_merged.columns[[1] + list(range(5, Galway_merged.shape[1]))]]

Unnamed: 0,Address,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
