**Peer-graded Assignment**

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology and Analysis](#methodology)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)


# Introduction

## 1. Description of the Problem

As the capital of one of the most prosperous Scandinavian countries, Oslo is constantly facing reforms and new challenges. One of them concerns the food industry: it is quite hard for prospective business owners to find a suitable location, because most of the central and desirable spots have already been taken. In addition, rents in Oslo, and in Norway as a whole, are high, so the choice of location matters even more than it usually does. Finding a suitable location is therefore a demanding task, whether the goal is to open a new venue or to take over an established one. The challenge, however, makes the task all the more attractive. I will use as much location data as possible to build a portfolio of the best-suited locations for a new restaurant in Oslo.

## 2. Discussion of the Background

My client is an excellent chef and an astute businessman. He already owns a restaurant in Oslo city center that serves amazing food. His goal is to find good locations in which to expand his business into a restaurant chain within Oslo municipality. Drawing on his experience in this business, he understands that location is one of the most significant factors in the success of a restaurant.

These are the client's requirements:
1. The locations should be in the most populated boroughs of Oslo, each with a population above 55,000.
2. They should be close to public transportation, since Oslo is a green city.
3. They should be inside or close to shopping centers or public parking lots.
4. There should be many venues within a 1000-meter radius of each location.
5. The rental budget is not a constraint.

This project is tailored to one client. However, given its scope, it could serve a much wider audience with different requirements, because the criteria can be adjusted accordingly.

# Data

A description of the data and how it will be used to solve the problem

This project relies on public data from the 'citypopulation.de' website, Foursquare, and Google Maps.

1. Borough and population data for Oslo are obtained from 'citypopulation.de' to identify the most populated boroughs.
2. Once the most populated boroughs have been identified, Foursquare is used to find the venues within a 1000-meter radius of the center of each borough. These venues could be cinemas, shopping or fitness centers, well-known coffee shops, famous tourist attractions, and so on. Locations near such venues are prioritized, because they increase visibility and attract more customers to the prospective restaurants.
3. After the potential locations have been chosen, geolocation data from Foursquare is used to check whether there is public transportation around them, because Oslo is a green city and many people here use public transportation.
4. Foursquare is also used to find other restaurants, shops, nightlife spots, etc. within a 500-meter radius, to learn what kinds of businesses are in the area and whether there is potential for a restaurant concept.
5. A map with the geolocation data of the potential locations is generated to be presented to the client.

# Methodology and Analysis

## Installing and importing the necessary packages, then extracting the geolocation data for Oslo

In [1]:
conda install --channel conda-forge lxml

Solving environment: done


  current version: 4.5.11
  latest version: 4.7.12

Please update conda by running

    $ conda update -n base -c defaults conda



# All requested packages already installed.


Note: you may need to restart the kernel to use updated packages.


In [2]:
conda install --channel conda-forge geopandas


In [3]:
conda install --channel conda-forge geopy


In [4]:
conda install --channel conda-forge folium


In [5]:
## import necessary packages
import lxml
import pandas as pd
import geopy
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter
import matplotlib.pyplot as plt
import folium
from folium.plugins import FastMarkerCluster
import seaborn as sns 
import numpy as np
# import k-means for the clustering stage
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors
import random

In [6]:
#find the geolocation of Oslo
locator = Nominatim(user_agent="myGeocoder")
location = locator.geocode("Oslo, Norway")

In [7]:
# print the location
print(location.address)
print("Latitude = {}, Longitude = {}".format(location.latitude, location.longitude))

Oslo, 0026, Norge
Latitude = 59.9133301, Longitude = 10.7389701


## Getting the borough and population data from 'citypopulation.de' and importing the data into dataframe using pandas 

In [8]:
# read the data for the boroughs of Oslo from citypopulation.de
df=pd.read_html('https://www.citypopulation.de/php/norway-oslocity.php', header=0)[0] #read the table directly from the website into dataframe using pandas
df

Unnamed: 0,Name,Status,PopulationEstimate2005-01-01,PopulationEstimate2010-01-01,PopulationEstimate2015-01-01,PopulationEstimate2019-01-01,Unnamed: 6
0,Alna,Borough,44151,46603,48770,49457,→
1,Bjerke,Borough,24448,27632,30502,32500,→
2,Frogner,Borough,45640,50396,55965,58897,→
3,Gamle Oslo,Borough,35431,42569,49854,55683,→
4,Grorud,Borough,24729,26074,27283,27583,→
5,Grünerløkka,Borough,37774,45647,54701,60844,→
6,Marka,City Forest,1614,1638,1598,1633,→
7,Nordre Aker,Borough,41060,46287,49337,51558,→
8,Nordstrand,Borough,43297,46419,49428,51882,→
9,Østensjø,Borough,42681,45577,49133,50427,→


## Cleaning up the dataframe

In [9]:
# Remove the unnecessary columns and the last row, which contains the population of Oslo as a whole

In [10]:
df=df.drop(['Status', 'PopulationEstimate2005-01-01','PopulationEstimate2010-01-01','PopulationEstimate2015-01-01', 'Unnamed: 6'], axis=1)

In [11]:
df=df[:-1]

In [12]:
df=df.rename(columns={"Name": "Neighborhood"})
df

Unnamed: 0,Neighborhood,PopulationEstimate2019-01-01
0,Alna,49457
1,Bjerke,32500
2,Frogner,58897
3,Gamle Oslo,55683
4,Grorud,27583
5,Grünerløkka,60844
6,Marka,1633
7,Nordre Aker,51558
8,Nordstrand,51882
9,Østensjø,50427


In [13]:
# Rename the column PopulationEstimate2019-01-01 to just simply Population

In [14]:
df.rename(columns={'PopulationEstimate2019-01-01':'Population'}, inplace=True)

In [15]:
df

Unnamed: 0,Neighborhood,Population
0,Alna,49457
1,Bjerke,32500
2,Frogner,58897
3,Gamle Oslo,55683
4,Grorud,27583
5,Grünerløkka,60844
6,Marka,1633
7,Nordre Aker,51558
8,Nordstrand,51882
9,Østensjø,50427


## Extracting and importing the geo-location data for these boroughs

In [16]:
# create an ADDRESS column for each neighborhood
df['ADDRESS'] = df['Neighborhood'].astype(str) + ',' + 'Oslo' + ',' + 'Norway'

In [17]:
df.head()

Unnamed: 0,Neighborhood,Population,ADDRESS
0,Alna,49457,"Alna,Oslo,Norway"
1,Bjerke,32500,"Bjerke,Oslo,Norway"
2,Frogner,58897,"Frogner,Oslo,Norway"
3,Gamle Oslo,55683,"Gamle Oslo,Oslo,Norway"
4,Grorud,27583,"Grorud,Oslo,Norway"


In [18]:
# extract the geolocation of each neighborhood from the ADDRESS column
from geopy.extra.rate_limiter import RateLimiter
geocode = RateLimiter(locator.geocode, min_delay_seconds=1)
df['location'] = df['ADDRESS'].apply(geocode)
df['point'] = df['location'].apply(lambda loc: tuple(loc.point) if loc else None)

In [19]:
df.head()

Unnamed: 0,Neighborhood,Population,ADDRESS,location,point
0,Alna,49457,"Alna,Oslo,Norway","(Alna, Oslo, 0659, Norge, (59.9066322, 10.8061...","(59.9066322, 10.8061331, 0.0)"
1,Bjerke,32500,"Bjerke,Oslo,Norway","(Bjerke, Oslo, Norge, (59.9413947, 10.82920845...","(59.9413947, 10.8292084530298, 0.0)"
2,Frogner,58897,"Frogner,Oslo,Norway","(Frogner, Oslo, 0266, Norge, (59.9222241, 10.7...","(59.9222241, 10.7066491, 0.0)"
3,Gamle Oslo,55683,"Gamle Oslo,Oslo,Norway","(Gamle Oslo, Oslo, Norge, (59.8992367, 10.7347...","(59.8992367, 10.7347673396537, 0.0)"
4,Grorud,27583,"Grorud,Oslo,Norway","(Grorud, Oslo, Norge, (59.96234275, 10.8752898...","(59.96234275, 10.8752898462327, 0.0)"


In [20]:
df['point'][0][0]

59.9066322

## Creating new columns namely latitude and longitude for subsequent analysis 

In [21]:
# split point column into latitude, longitude and altitude columns
df[['latitude', 'longitude', 'altitude']] = pd.DataFrame(df['point'].tolist(), index=df.index)
df.head()

Unnamed: 0,Neighborhood,Population,ADDRESS,location,point,latitude,longitude,altitude
0,Alna,49457,"Alna,Oslo,Norway","(Alna, Oslo, 0659, Norge, (59.9066322, 10.8061...","(59.9066322, 10.8061331, 0.0)",59.906632,10.806133,0.0
1,Bjerke,32500,"Bjerke,Oslo,Norway","(Bjerke, Oslo, Norge, (59.9413947, 10.82920845...","(59.9413947, 10.8292084530298, 0.0)",59.941395,10.829208,0.0
2,Frogner,58897,"Frogner,Oslo,Norway","(Frogner, Oslo, 0266, Norge, (59.9222241, 10.7...","(59.9222241, 10.7066491, 0.0)",59.922224,10.706649,0.0
3,Gamle Oslo,55683,"Gamle Oslo,Oslo,Norway","(Gamle Oslo, Oslo, Norge, (59.8992367, 10.7347...","(59.8992367, 10.7347673396537, 0.0)",59.899237,10.734767,0.0
4,Grorud,27583,"Grorud,Oslo,Norway","(Grorud, Oslo, Norge, (59.96234275, 10.8752898...","(59.96234275, 10.8752898462327, 0.0)",59.962343,10.87529,0.0
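The split above assumes that every address was geocoded. A minimal sketch of how a missing result (`None` in the `point` column) could be handled explicitly; `split_point` is a hypothetical helper introduced for illustration and is not part of geopy or pandas:

```python
from typing import Optional, Tuple


def split_point(
    point: Optional[Tuple[float, float, float]]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (latitude, longitude) from a (lat, lon, alt) tuple.

    A missing geocoding result (None) yields (None, None).
    """
    if point is None:
        return None, None
    return point[0], point[1]


# First tuple copied from the output above; the None stands for a failed lookup
points = [(59.9066322, 10.8061331, 0.0), None]
coords = [split_point(p) for p in points]
print(coords)
```

Rows that come back as `(None, None)` can then be inspected or dropped before any map is drawn.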


## Cleaning up the new dataframe

In [22]:
# Remove the unnecessary columns
df=df.drop(['ADDRESS','location','point', 'altitude'], axis=1)

In [23]:
# check that each column has the expected type
df.dtypes

Neighborhood     object
Population        int64
latitude        float64
longitude       float64
dtype: object

In [24]:
df=df.drop(df.index[11])

In [25]:
df

Unnamed: 0,Neighborhood,Population,latitude,longitude
0,Alna,49457,59.906632,10.806133
1,Bjerke,32500,59.941395,10.829208
2,Frogner,58897,59.922224,10.706649
3,Gamle Oslo,55683,59.899237,10.734767
4,Grorud,27583,59.962343,10.87529
5,Grünerløkka,60844,59.925471,10.777421
6,Marka,1633,60.040262,10.671005
7,Nordre Aker,51558,59.953638,10.756412
8,Nordstrand,51882,59.87088,10.780353
9,Østensjø,50427,59.887563,10.832748


## Where are the different neighborhoods in Oslo on a map?

In [26]:
# Map of Oslo
address = 'OSLO, NORWAY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Oslo are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Oslo are 59.9133301, 10.7389701.


In [27]:
# create map of Oslo using latitude and longitude values
map_Oslo = folium.Map(location=[latitude, longitude], zoom_start=14)

# add markers to map
for lat, lng, label in zip(df['latitude'], df['longitude'], df['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Oslo)  
    
map_Oslo

## Filter and keep only the boroughs with population more than 55000

In [28]:
df=df[df.Population > 55000]

In [29]:
df

Unnamed: 0,Neighborhood,Population,latitude,longitude
2,Frogner,58897,59.922224,10.706649
3,Gamle Oslo,55683,59.899237,10.734767
5,Grünerløkka,60844,59.925471,10.777421
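As a cross-check, the same filter can be sketched in plain Python over the 2019 figures listed earlier. The constant name `MIN_POPULATION` is an assumption made for illustration; keeping the threshold in one place makes the criterion easy to adjust for a client with different requirements.

```python
MIN_POPULATION = 55000  # the client's threshold; the constant name is illustrative

# 2019 population estimates copied from the table shown earlier
population = {
    "Alna": 49457, "Bjerke": 32500, "Frogner": 58897, "Gamle Oslo": 55683,
    "Grorud": 27583, "Grünerløkka": 60844, "Marka": 1633,
    "Nordre Aker": 51558, "Nordstrand": 51882, "Østensjø": 50427,
}

# Keep only the boroughs strictly above the threshold
selected = sorted(name for name, count in population.items() if count > MIN_POPULATION)
print(selected)
```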


In [30]:
df=df.set_index('Neighborhood')
df

Unnamed: 0_level_0,Population,latitude,longitude
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Frogner,58897,59.922224,10.706649
Gamle Oslo,55683,59.899237,10.734767
Grünerløkka,60844,59.925471,10.777421


In [31]:
df.to_csv('Targetborough.csv')

## Explore the first borough (Frogner) to find suitable location

In [32]:
df_Boroughs=pd.read_csv('Targetborough.csv', index_col=False)

In [33]:
df_Boroughs

Unnamed: 0,Neighborhood,Population,latitude,longitude
0,Frogner,58897,59.922224,10.706649
1,Gamle Oslo,55683,59.899237,10.734767
2,Grünerløkka,60844,59.925471,10.777421


In [34]:
df_Boroughs.loc[0,'Neighborhood']

'Frogner'

In [35]:
neighborhood_latitude = df_Boroughs.loc[0, 'latitude'] # neighborhood latitude value
neighborhood_longitude = df_Boroughs.loc[0, 'longitude'] # neighborhood longitude value

neighborhood_name = df_Boroughs.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Frogner are 59.9222241, 10.7066491.


In [None]:
## I will remove this information once the whole notebook has been created successfully
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)
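Rather than pasting the credentials into the notebook and deleting them afterwards, they could be read from environment variables. A minimal sketch, assuming the hypothetical variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` (these names are not a Foursquare convention):

```python
import os


def load_credentials(env=os.environ):
    """Read the Foursquare credentials from a mapping (the environment by default).

    The variable names are assumptions made for this sketch.
    """
    client_id = env.get("FOURSQUARE_CLIENT_ID", "")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET", "")
    if not client_id or not client_secret:
        raise RuntimeError("Foursquare credentials are not set")
    return client_id, client_secret


# Example with an explicit mapping, so the sketch runs without real credentials
demo_id, demo_secret = load_credentials({
    "FOURSQUARE_CLIENT_ID": "demo-id",
    "FOURSQUARE_CLIENT_SECRET": "demo-secret",
})
print("CLIENT_ID is set:", bool(demo_id))
```

With this approach the notebook can be shared as-is, because the secrets never appear in a cell.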

## Top 10 venues for the Frogner borough 

In [None]:
# I will remove my client ID and client secret from this cell.
LIMIT = 10 # limit of number of venues returned by Foursquare API

radius = 500 # define radius

# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

In [38]:
import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import since pandas 1.0)

In [39]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5de66760216785001bb4870c'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Frogner',
  'headerFullLocation': 'Frogner, Oslo',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 26,
  'suggestedBounds': {'ne': {'lat': 59.9267241045, 'lng': 10.715611241746213},
   'sw': {'lat': 59.9177240955, 'lng': 10.697686958253787}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4bee53ce2c082d7f57553042',
       'name': 'Frognerborgen lekeplass',
       'location': {'lat': 59.92427517184612,
        'lng': 10.706790776968406,
        'labeledLatLngs': [{'label': 'display',
          'lat': 59.92427517184612,
          'lng': 10.706790776968406

## Get the category by using the category function from Foursquare, clean up the data and transfer it into a dataframe

In [40]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [41]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Frognerborgen lekeplass,Playground,59.924275,10.706791
1,Vigeland-museet,Art Museum,59.922777,10.701362
2,eckers,Sandwich Place,59.92002,10.70772
3,Vineria Ventidue,Italian Restaurant,59.920364,10.705681
4,Vinmonopolet (Elisenberg),Wine Shop,59.918818,10.707471


In [42]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

10 venues were returned by Foursquare.


## Let us take a look at these top 10 venues

In [43]:
nearby_venues

Unnamed: 0,name,categories,lat,lng
0,Frognerborgen lekeplass,Playground,59.924275,10.706791
1,Vigeland-museet,Art Museum,59.922777,10.701362
2,eckers,Sandwich Place,59.92002,10.70772
3,Vineria Ventidue,Italian Restaurant,59.920364,10.705681
4,Vinmonopolet (Elisenberg),Wine Shop,59.918818,10.707471
5,Vigelandsanlegget,Sculpture Garden,59.926236,10.703319
6,Feinschmecker,Scandinavian Restaurant,59.918456,10.708131
7,Oslo Museum; Bymuseet,History Museum,59.924046,10.702557
8,W. B. Samson,Bakery,59.918568,10.708996
9,Fjelberg Fisk & Vilt,Gourmet Shop,59.918358,10.705312
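To confirm that a returned venue really lies within the requested radius, the great-circle distance from the borough center can be computed with the haversine formula. A minimal sketch using only the standard library; the function name `haversine_m` and the mean Earth radius of 6,371 km are assumptions of this example:

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters (an approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


frogner_center = (59.9222241, 10.7066491)
playground = (59.924275, 10.706791)  # Frognerborgen lekeplass, from the table above

distance = haversine_m(*frogner_center, *playground)
print("{:.0f} m from the center; within 500 m: {}".format(distance, distance <= 500))
```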


## I will use the same process to find the top venues for all three boroughs.

In [44]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000, limit=10):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            limit)  # use the function's own limit argument instead of the global LIMIT
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
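The expression `v['venue']['categories'][0]` assumes that every venue has at least one category. A sketch of a more defensive row builder; `venue_row` is a hypothetical helper, and the sample items are hand-written to mirror the fields used in the function above:

```python
def venue_row(name, lat, lng, item):
    """Build one output row; the category is None when the venue has no categories."""
    venue = item["venue"]
    categories = venue.get("categories") or []
    category = categories[0]["name"] if categories else None
    return (name, lat, lng, venue["name"],
            venue["location"]["lat"], venue["location"]["lng"], category)


sample_items = [
    {"venue": {"name": "Gimle Kino",
               "location": {"lat": 59.917495, "lng": 10.709292},
               "categories": [{"name": "Movie Theater"}]}},
    # Made-up venue without categories, only to exercise the fallback
    {"venue": {"name": "Uncategorized venue",
               "location": {"lat": 59.92, "lng": 10.70},
               "categories": []}},
]

rows = [venue_row("Frogner", 59.922224, 10.706649, item) for item in sample_items]
for row in rows:
    print(row)
```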

In [45]:
Oslo_venues = getNearbyVenues(names=df_Boroughs['Neighborhood'],
                                   latitudes=df_Boroughs['latitude'],
                                   longitudes=df_Boroughs['longitude']
                                  )

Frogner
Gamle Oslo
Grünerløkka


In [46]:
print(Oslo_venues.shape)
Oslo_venues.head()

(30, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Frogner,59.922224,10.706649,Vigeland-museet,59.922777,10.701362,Art Museum
1,Frogner,59.922224,10.706649,Frognerborgen lekeplass,59.924275,10.706791,Playground
2,Frogner,59.922224,10.706649,Vigelandsanlegget,59.926236,10.703319,Sculpture Garden
3,Frogner,59.922224,10.706649,eckers,59.92002,10.70772,Sandwich Place
4,Frogner,59.922224,10.706649,Vinmonopolet (Elisenberg),59.918818,10.707471,Wine Shop


In [47]:
Oslo_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Frogner,10,10,10,10,10,10
Gamle Oslo,10,10,10,10,10,10
Grünerløkka,10,10,10,10,10,10


Just as planned, the top 10 venues for each neighborhood (borough) have been identified.

## Let us arrange them differently, so that it is easy to see the top 10 venue categories in each borough

In [48]:
# one hot encoding
Oslo_onehot = pd.get_dummies(Oslo_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Oslo_onehot['Neighborhood'] = Oslo_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Oslo_onehot.columns[-1]] + list(Oslo_onehot.columns[:-1])
Oslo_onehot = Oslo_onehot[fixed_columns]

Oslo_onehot.head()

Unnamed: 0,Neighborhood,Art Museum,Asian Restaurant,Bathing Area,Botanical Garden,Café,Castle,Cocktail Bar,Dog Run,Gourmet Shop,...,Other Nightlife,Park,Playground,Sandwich Place,Scandinavian Restaurant,Sculpture Garden,Seafood Restaurant,Sushi Restaurant,Wine Shop,Yoga Studio
0,Frogner,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Frogner,0,0,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
2,Frogner,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
3,Frogner,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,0,0
4,Frogner,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0


In [49]:
Oslo_grouped = Oslo_onehot.groupby('Neighborhood').mean().reset_index()
Oslo_grouped

Unnamed: 0,Neighborhood,Art Museum,Asian Restaurant,Bathing Area,Botanical Garden,Café,Castle,Cocktail Bar,Dog Run,Gourmet Shop,...,Other Nightlife,Park,Playground,Sandwich Place,Scandinavian Restaurant,Sculpture Garden,Seafood Restaurant,Sushi Restaurant,Wine Shop,Yoga Studio
0,Frogner,0.1,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.1,0.1,0.1,0.1,0.1,0.0,0.0,0.1,0.0
1,Gamle Oslo,0.0,0.0,0.1,0.0,0.1,0.2,0.0,0.0,0.0,...,0.1,0.0,0.0,0.0,0.2,0.0,0.1,0.0,0.0,0.0
2,Grünerløkka,0.0,0.0,0.0,0.1,0.2,0.0,0.1,0.1,0.1,...,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.1
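The one-hot encoding followed by `groupby('Neighborhood').mean()` yields, for each borough, the share of its venues that fall into each category. A plain-Python sketch of the same calculation for Frogner, using the ten categories returned for that borough in this notebook:

```python
from collections import Counter

# Categories of the ten Frogner venues returned by Foursquare
frogner_categories = [
    "Art Museum", "Playground", "Sculpture Garden", "Sandwich Place", "Wine Shop",
    "Scandinavian Restaurant", "Park", "Movie Theater", "Asian Restaurant",
    "Italian Restaurant",
]

counts = Counter(frogner_categories)
total = len(frogner_categories)

# Share of venues per category, i.e. the mean of each one-hot column
frequencies = {category: count / total for category, count in counts.items()}
print(frequencies["Art Museum"])
```

Since every category occurs exactly once among ten venues, each share is one tenth, which matches the Frogner row above.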


In [50]:
num_top_venues = 10

for hood in Oslo_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Oslo_grouped[Oslo_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Frogner----
                     venue  freq
0               Art Museum   0.1
1         Asian Restaurant   0.1
2                Wine Shop   0.1
3         Sculpture Garden   0.1
4  Scandinavian Restaurant   0.1
5           Sandwich Place   0.1
6               Playground   0.1
7                     Park   0.1
8            Movie Theater   0.1
9       Italian Restaurant   0.1


----Gamle Oslo----
                     venue  freq
0                   Castle   0.2
1  Scandinavian Restaurant   0.2
2       Italian Restaurant   0.1
3             Bathing Area   0.1
4                     Café   0.1
5       Seafood Restaurant   0.1
6           History Museum   0.1
7          Other Nightlife   0.1
8                     Park   0.0
9                Wine Shop   0.0


----Grünerløkka----
                  venue  freq
0                  Café   0.2
1           Yoga Studio   0.1
2      Botanical Garden   0.1
3      Sushi Restaurant   0.1
4          Cocktail Bar   0.1
5               Dog Run   0.1
6    

In [51]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [52]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Oslo_grouped['Neighborhood']

for ind in np.arange(Oslo_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Oslo_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Frogner,Italian Restaurant,Playground,Asian Restaurant,Wine Shop,Movie Theater,Park,Art Museum,Sandwich Place,Scandinavian Restaurant,Sculpture Garden
1,Gamle Oslo,Scandinavian Restaurant,Castle,Italian Restaurant,Seafood Restaurant,Bathing Area,Café,Other Nightlife,History Museum,Gourmet Shop,Asian Restaurant
2,Grünerløkka,Café,Yoga Studio,Gym / Fitness Center,Sushi Restaurant,Botanical Garden,Park,Cocktail Bar,Dog Run,Gourmet Shop,Asian Restaurant
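Many categories share the same frequency, so their relative rank depends on how the sort happens to order ties. A minimal sketch of a deterministic alternative, assuming a hypothetical helper `top_categories` that breaks ties alphabetically, together with an `ordinal` helper that builds the column labels without relying on `try`/`except`:

```python
def ordinal(n):
    """Return '1st', '2nd', '3rd', '4th', ..., '11th', ... for a positive integer."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


def top_categories(frequencies, n):
    """Top-n categories by descending frequency; ties are broken alphabetically."""
    ranked = sorted(frequencies.items(), key=lambda kv: (-kv[1], kv[0]))
    return [category for category, _ in ranked[:n]]


# Non-zero frequencies for Gamle Oslo, copied from the printed output above
gamle_oslo = {
    "Castle": 0.2, "Scandinavian Restaurant": 0.2, "Italian Restaurant": 0.1,
    "Bathing Area": 0.1, "Café": 0.1, "Seafood Restaurant": 0.1,
    "History Museum": 0.1, "Other Nightlife": 0.1,
}

for rank, category in enumerate(top_categories(gamle_oslo, 3), start=1):
    print(ordinal(rank), "Most Common Venue:", category)
```

Alphabetical order is only one possible tie-break; the point is that the ranking becomes reproducible from run to run.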


By examining the list of the top 10 most common venue categories, we can see that:
1. Frogner has a movie theater, which is among the top 5 most common venue categories in this area, while the categories ranked above it are mainly restaurants.
2. Gamle Oslo has "Castle", a famous tourist attraction, as its second most common venue category; the most common one is Scandinavian restaurants.
3. Grünerløkka has a well-known coffee shop, "Oslovelo", in the café category, which is the most common venue category in this area. The others on the list are restaurants, one tourist attraction and one public park. However, the attraction and the park are seasonal, so they will not draw many visitors all year round.
These locations also meet the client's requirements. The list also shows that there are many restaurants around these venues, which is a good sign for business. Therefore, we will focus on these venues as the primary locations for the new restaurants.

## Now I need to find out whether there is public transportation close to these venues.

## Let us extract a new dataframe with only the potential location in each neighborhood

In [53]:
Oslo_venues

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Frogner,59.922224,10.706649,Vigeland-museet,59.922777,10.701362,Art Museum
1,Frogner,59.922224,10.706649,Frognerborgen lekeplass,59.924275,10.706791,Playground
2,Frogner,59.922224,10.706649,Vigelandsanlegget,59.926236,10.703319,Sculpture Garden
3,Frogner,59.922224,10.706649,eckers,59.92002,10.70772,Sandwich Place
4,Frogner,59.922224,10.706649,Vinmonopolet (Elisenberg),59.918818,10.707471,Wine Shop
5,Frogner,59.922224,10.706649,Feinschmecker,59.918456,10.708131,Scandinavian Restaurant
6,Frogner,59.922224,10.706649,Frognerparken,59.926894,10.701177,Park
7,Frogner,59.922224,10.706649,Gimle Kino,59.917495,10.709292,Movie Theater
8,Frogner,59.922224,10.706649,Sawan,59.918801,10.715841,Asian Restaurant
9,Frogner,59.922224,10.706649,Vineria Ventidue,59.920364,10.705681,Italian Restaurant


In [69]:
locations=Oslo_venues.take([7, 12, 29])
locations

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
7,Frogner,59.922224,10.706649,Gimle Kino,59.917495,10.709292,Movie Theater
12,Gamle Oslo,59.899237,10.734767,Akershus Festning (Akershus festning),59.906683,10.736754,Castle
29,Grünerløkka,59.925471,10.777421,Oslovelo,59.925457,10.761,Café


In [70]:
locations=locations.drop(['Neighborhood Latitude','Neighborhood Longitude'], axis=1)

In [71]:
locations.rename(columns={'Venue Category':'category', 'Venue Latitude':'latitude', 'Venue Longitude':'longitude' }, inplace=True)

In [72]:
locations

Unnamed: 0,Neighborhood,Venue,latitude,longitude,category
7,Frogner,Gimle Kino,59.917495,10.709292,Movie Theater
12,Gamle Oslo,Akershus Festning (Akershus festning),59.906683,10.736754,Castle
29,Grünerløkka,Oslovelo,59.925457,10.761,Café


In [73]:
locations=locations.set_index('Neighborhood')
locations

Unnamed: 0_level_0,Venue,latitude,longitude,category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Frogner,Gimle Kino,59.917495,10.709292,Movie Theater
Gamle Oslo,Akershus Festning (Akershus festning),59.906683,10.736754,Castle
Grünerløkka,Oslovelo,59.925457,10.761,Café


In [74]:
locations.to_csv('locations.csv')

## These are the category IDs obtained from the Foursquare website; they will be used to find out whether there are bus stops, cable cars or metro stations around these potential locations. The radius is 600 meters, which is roughly a 5-7 minute walk.

In [75]:
# Cable Car: '52f2ab2ebcbc57f1066b8b50'
# Bus Stop: '52f2ab2ebcbc57f1066b8b4f'
# Metro Station: '4bf58dd8d48988d1fd931735' (the ID used in the request URL in this notebook)

In [76]:
df_locations=pd.read_csv('locations.csv', index_col=False)
df_locations

Unnamed: 0,Neighborhood,Venue,latitude,longitude,category
0,Frogner,Gimle Kino,59.917495,10.709292,Movie Theater
1,Gamle Oslo,Akershus Festning (Akershus festning),59.906683,10.736754,Castle
2,Grünerløkka,Oslovelo,59.925457,10.761,Café


In [85]:
def getNearbyVenues(names, latitudes, longitudes, radius=600):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId=52f2ab2ebcbc57f1066b8b50,52f2ab2ebcbc57f1066b8b4f,4bf58dd8d48988d1fd931735'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius,
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'category_Id']
    
    return(nearby_venues)

In [86]:
Transportation_venues = getNearbyVenues(names=df_locations['Venue'],
                                   latitudes=df_locations['latitude'],
                                   longitudes=df_locations['longitude']
                                  )

Gimle Kino
Akershus Festning (Akershus festning)
Oslovelo


In [87]:
Transportation_venues

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | category_Id |
|---|---|---|---|---|---|---|---|
| 0 | Gimle Kino | 59.917495 | 10.709292 | Frogner kirke (B) | 59.917603 | 10.70759 | Bus Stop |
| 1 | Gimle Kino | 59.917495 | 10.709292 | Skovveien (B) | 59.915229 | 10.715638 | Bus Stop |
| 2 | Gimle Kino | 59.917495 | 10.709292 | Lapsetorvet (B) | 59.915226 | 10.717355 | Bus Stop |
| 3 | Gimle Kino | 59.917495 | 10.709292 | Riddervolds plass (b) | 59.918462 | 10.719029 | Bus Stop |
| 4 | Akershus Festning (Akershus festning) | 59.906683 | 10.736754 | Kongens gate (B) | 59.910858 | 10.742011 | Bus Stop |
| 5 | Akershus Festning (Akershus festning) | 59.906683 | 10.736754 | Kvadraturen (B) | 59.910661 | 10.743156 | Bus Stop |
| 6 | Oslovelo | 59.925457 | 10.761 | Sannergata (B) | 59.928497 | 10.759698 | Bus Stop |
| 7 | Oslovelo | 59.925457 | 10.761 | Telthusbakken (B) | 59.924611 | 10.750656 | Bus Stop |


## Exploring the nightlife spots around these locations within the same radius

In [88]:
def getNearbyVenues(names, latitudes, longitudes, radius=600):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId=4d4b7105d754a06376d81259'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'category_Id']
    
    return(nearby_venues)

In [89]:
Nightlife_venues = getNearbyVenues(names=df_locations['Venue'],
                                   latitudes=df_locations['latitude'],
                                   longitudes=df_locations['longitude']
                                  )

Gimle Kino
Akershus Festning (Akershus festning)
Oslovelo


In [90]:
Nightlife_venues

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | category_Id |
|---|---|---|---|---|---|---|---|
| 0 | Gimle Kino | 59.917495 | 10.709292 | Bokbacka | 59.917795 | 10.718433 | Scandinavian Restaurant |
| 1 | Gimle Kino | 59.917495 | 10.709292 | Forest & Brown | 59.916811 | 10.71463 | Bar |
| 2 | Gimle Kino | 59.917495 | 10.709292 | F6 Cocktail Bar & Lounge | 59.915716 | 10.716577 | Cocktail Bar |
| 3 | Gimle Kino | 59.917495 | 10.709292 | San Francisco Bread Bowl | 59.916364 | 10.715833 | American Restaurant |
| 4 | Gimle Kino | 59.917495 | 10.709292 | Fru Burums | 59.916009 | 10.716007 | Bar |
| 5 | Gimle Kino | 59.917495 | 10.709292 | Champagneria | 59.915261 | 10.717233 | Champagne Bar |
| 6 | Gimle Kino | 59.917495 | 10.709292 | BAR Bygdøy Allé Restaurant | 59.916412 | 10.711658 | Nightclub |
| 7 | Gimle Kino | 59.917495 | 10.709292 | Lille Andys Pub | 59.916684 | 10.713587 | Sports Bar |
| 8 | Gimle Kino | 59.917495 | 10.709292 | Bar | 59.916391 | 10.711525 | Bar |
| 9 | Gimle Kino | 59.917495 | 10.709292 | Bygdøy Allé 3 | 59.914954 | 10.716959 | Bar |


## Exploring the food service around these locations within the same radius

In [93]:
def getNearbyVenues(names, latitudes, longitudes, radius=600, limit=10):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId=4d4b7105d754a06374d81259'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            limit)  # use the function's own limit argument rather than the global LIMIT
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'category_Id']
    
    return(nearby_venues)

In [94]:
Food_venues = getNearbyVenues(names=df_locations['Venue'],
                                   latitudes=df_locations['latitude'],
                                   longitudes=df_locations['longitude']
                                  )

Gimle Kino
Akershus Festning (Akershus festning)
Oslovelo


In [95]:
Food_venues

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | category_Id |
|---|---|---|---|---|---|---|---|
| 0 | Gimle Kino | 59.917495 | 10.709292 | Feinschmecker | 59.918456 | 10.708131 | Scandinavian Restaurant |
| 1 | Gimle Kino | 59.917495 | 10.709292 | W. B. Samson | 59.918568 | 10.708996 | Bakery |
| 2 | Gimle Kino | 59.917495 | 10.709292 | Sawan | 59.918801 | 10.715841 | Asian Restaurant |
| 3 | Gimle Kino | 59.917495 | 10.709292 | eckers | 59.92002 | 10.70772 | Sandwich Place |
| 4 | Gimle Kino | 59.917495 | 10.709292 | Jewel of India | 59.915844 | 10.716844 | Indian Restaurant |
| 5 | Gimle Kino | 59.917495 | 10.709292 | Galt | 59.91658 | 10.713361 | Restaurant |
| 6 | Gimle Kino | 59.917495 | 10.709292 | El Camino | 59.916912 | 10.714733 | Mexican Restaurant |
| 7 | Gimle Kino | 59.917495 | 10.709292 | Bokbacka | 59.917795 | 10.718433 | Scandinavian Restaurant |
| 8 | Gimle Kino | 59.917495 | 10.709292 | Pizza Da Mimmo | 59.916366 | 10.716581 | Pizza Place |
| 9 | Gimle Kino | 59.917495 | 10.709292 | Listen To Baljit | 59.91513 | 10.717637 | Indian Restaurant |


## Map of the potential locations for opening a chain restaurant in Oslo

In [96]:
# Create a map of the potential locations from their latitude and longitude values.
# The new restaurants should be located within a 500-700 meter radius of these locations.
map_Oslo = folium.Map(location=[latitude, longitude], zoom_start=15)

# add markers to map
for lat, lng, label in zip(df_locations['latitude'], df_locations['longitude'], df_locations['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Oslo)  
    
map_Oslo

# Results and Discussion
The list of boroughs and their population figures was obtained from [citypopulation.de](https://www.citypopulation.de/php/norway-oslocity.php), and geolocation data was then extracted for each borough. The list was narrowed down to the three boroughs with a population above 55,000: Frogner, Gamle Oslo, and Grünerløkka.

Foursquare was then used to explore each borough and pick one potential restaurant location in each (Gimle Kino, Akershus Festning, and Oslovelo). Examining the top 10 most common venue categories in each borough shows that:

* In Frogner, "Movie Theater" is among the top 5 most common venue categories, and the categories ranked above it are mainly restaurants.
* In Gamle Oslo, "Castle", a well-known tourist attraction, is the second most common venue category, with Scandinavian restaurants ranked first.
* Grünerløkka has a well-known coffee shop, and coffee shops are the most common venue category in this area.

These locations also meet the client's requirements. The lists show many restaurants around these venues, which is a good sign for business, so I focused on these venues as the primary locations for the new restaurants.

After the target locations were chosen, Foursquare was used again to explore venues within a 600-meter radius of each one, which is roughly a 5-7 minute walk.
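To make the 600-meter radius concrete, the following minimal sketch computes the straight-line distance between Gimle Kino and one of the bus stops listed in the `Transportation_venues` output, and converts distances into walking time. The helper names `haversine_m` and `walking_minutes` are hypothetical, and the 5 km/h walking speed is an assumption rather than a figure from the analysis.

```python
import math

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points given in degrees."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def walking_minutes(distance_m, speed_kmh=5.0):
    """Walking time in minutes; the default speed of 5 km/h is an assumed value."""
    return distance_m / (speed_kmh * 1000 / 60)

# Coordinates copied from the Transportation_venues output
gimle_kino = (59.917495, 10.709292)
frogner_kirke_stop = (59.917603, 10.70759)

d = haversine_m(*gimle_kino, *frogner_kirke_stop)
print(f"Gimle Kino -> Frogner kirke (B): {d:.0f} m, {walking_minutes(d):.1f} min")
print(f"Full 600 m radius: {walking_minutes(600):.1f} min")
```

Straight-line distance underestimates the real walking route, so the walking times here are lower bounds.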

There are many bus stops, tram stops, and metro stations within this radius. This meets the client's requirement: Oslo is a green city where many people rely on public transportation, so easy access to it is a must.
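As a rough illustration of how the transportation results can be summarized per location, the sketch below counts the stops returned for each candidate. The rows are copied from the `Transportation_venues` output shown earlier; the variable names are hypothetical.

```python
from collections import Counter

# (location, stop name) pairs copied from the Transportation_venues output
transport_rows = [
    ("Gimle Kino", "Frogner kirke (B)"),
    ("Gimle Kino", "Skovveien (B)"),
    ("Gimle Kino", "Lapsetorvet (B)"),
    ("Gimle Kino", "Riddervolds plass (b)"),
    ("Akershus Festning (Akershus festning)", "Kongens gate (B)"),
    ("Akershus Festning (Akershus festning)", "Kvadraturen (B)"),
    ("Oslovelo", "Sannergata (B)"),
    ("Oslovelo", "Telthusbakken (B)"),
]

# Count how many stops were returned for each candidate location
stops_per_location = Counter(location for location, _ in transport_rows)

for location, n_stops in stops_per_location.most_common():
    print(f"{location}: {n_stops} stops")
```

With the DataFrame itself at hand, `Transportation_venues.groupby('Neighborhood')['Venue'].count()` should give the same counts.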

There are also many nightlife spots and food venues around these locations, which suggests that they are indeed hot spots for restaurants in Oslo. Opening restaurants here would give us good visibility and strong potential to attract customers.
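The nightlife and food queries in this notebook differ only in the `categoryId` passed to Foursquare, so one helper could build both request URLs. The sketch below is one possible way to do that, not the notebook's actual code: `build_explore_url` and `CATEGORY_IDS` are hypothetical names, and the credentials and version string are placeholders. The endpoint and the two category IDs are taken from the request URLs used above.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"

# Category IDs as they appear in the notebook's request URLs
CATEGORY_IDS = {
    "nightlife": "4d4b7105d754a06376d81259",
    "food": "4d4b7105d754a06374d81259",
}

def build_explore_url(lat, lng, category, client_id, client_secret, version,
                      radius=600, limit=10):
    """Build a venues/explore URL filtered to one category (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
        "categoryId": CATEGORY_IDS[category],
    }
    # safe="," keeps the comma in the ll parameter readable
    return f"{EXPLORE_ENDPOINT}?{urlencode(params, safe=',')}"

# Placeholder credentials and version; replace with real values before use
url = build_explore_url(59.917495, 10.709292, "food",
                        client_id="YOUR_CLIENT_ID",
                        client_secret="YOUR_CLIENT_SECRET",
                        version="YYYYMMDD")
print(url)
```

Passing the category as an argument avoids redefining `getNearbyVenues` for every venue type and keeps the radius and limit consistent across queries.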

# Conclusion

The purpose of this project was to find suitable locations for opening a chain of restaurants in Oslo. The analysis narrowed the search down to three boroughs (Frogner, Gamle Oslo, and Grünerløkka) and three potential locations in them (Gimle Kino, Akershus Festning, and Oslovelo), each with many relevant venues (transportation, food services, and nightlife spots) within a 600-meter radius. Opening restaurants within this radius therefore looks like a good choice for business. However, the final decision on the optimal restaurant locations will be made by stakeholders based on these results and other characteristics of each neighborhood.