# Starting a new Indian restaurant in Scotland

### Client's request: 
A client living in Scotland would like to invest in opening an Indian restaurant, but he does not know where to start. He has looked into several big cities: Glasgow, Edinburgh, Stirling, Dundee, Inverness, and Aberdeen. He picked these cities because they are major historical cities that attract a lot of tourists every year. He would like us to guide him in picking the most suitable city and the best area within it. 

### The suggested study: 
Scotland is a very outgoing country with a huge emphasis on tourism. According to the Office for National Statistics (ONS), 3.4 million overseas tourists visited Scotland between Q2 2017 and Q1 2018, spending £2.4 billion, which was an increase of 29% compared to the year before. Furthermore, the food and drink industry is worth around £14 billion each year and accounts for one in five manufacturing jobs in Scotland. 
On the other hand, according to the U.S. Bureau of Labor Statistics (BLS), approximately 20% of new businesses fail during the first two years of being open, 45% during the first five years, and 65% during the first 10 years.
ONS data from 2018 also suggest that the food and beverage service sector is very competitive: 18,820 new restaurants and mobile food services were born, and nearly as many, 15,930, closed. With an overall industry of 105,730 restaurants and mobile food services, that is roughly 15% of the market lost and 18% of the market newly born.
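
The two percentages quoted above follow directly from the ONS counts. A minimal sketch of the arithmetic, using only the figures given in the text (the variable names are illustrative):

```python
# Figures quoted in the text (ONS, 2018)
births = 18_820   # new restaurants and mobile food services
deaths = 15_930   # businesses that closed
total = 105_730   # all restaurants and mobile food services

birth_rate = births / total * 100
death_rate = deaths / total * 100
net_change = births - deaths

print(f"Birth rate: {birth_rate:.1f}%")
print(f"Death rate: {death_rate:.1f}%")
print(f"Net change: {net_change:+d} businesses")
```

The sector therefore grew slightly overall, but the high turnover is what matters for a new entrant.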

To secure a sustainable investment, several factors have to be taken into account, including: 

    1. The city where the restaurant is located (the ratio of restaurants to population)
    2. The area where the restaurant is located (the number of restaurants located in a given area)
    3. The type of food served (the cuisine should be unique or among the least represented in the chosen area)
    
With these criteria in mind, a set of data will be gathered and studied to identify the most viable place to start a new business. 

The client has decided to choose between 6 cities in Scotland (Glasgow, Edinburgh, Stirling, Dundee, Inverness, and Aberdeen), so data on restaurants in these cities will be collected, along with their locations on a map. Several criteria will then be studied, such as the number of restaurants relative to each city's population. Are there too many restaurants in a small city? Which city has the fewest restaurants per head? The number of Indian restaurants will also be compared to the population of each city. Finally, with the most appropriate city in hand, the restaurants will be clustered and the most suitable area to open the restaurant will be identified. 


In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import seaborn as sns

import json # library to handle JSON files

!pip install geopy  # comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (the pandas.io.json import path is deprecated since pandas 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!pip install folium # comment this line out if folium is already installed
import folium # map rendering library

!pip install lxml
!pip install html5lib
!pip install BeautifulSoup4
print('Libraries imported.')

Libraries imported.


#### First, let's create a dataframe with the 6 cities under study and their populations (as of 2018)

In [2]:
d = {'Cities': ['Glasgow', 'Edinburgh', 'Stirling', 'Dundee', 'Inverness', 'Aberdeen'], 
     'Population': [626410, 518500, 94330, 148750, 47380,227560], 
     'Lattitude': [55.86515, 55.95206, 56.11903, 56.46913, 57.47908, 57.14369], 
     'Longitude': [-4.25763, -3.19648, -3.93682, -2.97489, -4.22398, -2.09814]
    }
General_df = pd.DataFrame(data=d)
General_df

Unnamed: 0,Cities,Population,Lattitude,Longitude
0,Glasgow,626410,55.86515,-4.25763
1,Edinburgh,518500,55.95206,-3.19648
2,Stirling,94330,56.11903,-3.93682
3,Dundee,148750,56.46913,-2.97489
4,Inverness,47380,57.47908,-4.22398
5,Aberdeen,227560,57.14369,-2.09814


In [3]:
# The code was removed by Watson Studio for sharing.

In [4]:
LIMIT = 500
radius = 3000
query = 'Indian'

In [5]:
Gla_lat = 55.86515
Gla_long = -4.25763

In [6]:
# The code was removed by Watson Studio for sharing.

In [7]:
results_Gla = requests.get(Gla_url).json()['response']['venues']

In [8]:
results_Gla[0]

{'id': '513b8a7ee4b06b29ffe1c862',
 'name': "Rishi's Indian Aroma",
 'location': {'address': '61 Bath St',
  'lat': 55.863888699972144,
  'lng': -4.256816579488018,
  'labeledLatLngs': [{'label': 'display',
    'lat': 55.863888699972144,
    'lng': -4.256816579488018}],
  'distance': 149,
  'postalCode': 'G2 2DG',
  'cc': 'GB',
  'city': 'Glasgow',
  'state': 'Glasgow City',
  'country': 'United Kingdom',
  'formattedAddress': ['61 Bath St',
   'Glasgow',
   'Glasgow City',
   'G2 2DG',
   'United Kingdom']},
 'categories': [{'id': '4bf58dd8d48988d10f941735',
   'name': 'Indian Restaurant',
   'pluralName': 'Indian Restaurants',
   'shortName': 'Indian',
   'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/indian_',
    'suffix': '.png'},
   'primary': True}],
 'referralId': 'v-1589110149',
 'hasPerk': False}

In [9]:
len(results_Gla)

13

In [10]:
# define the dataframe columns
column_names = ['name', 'id', 'Latitude', 'Longitude', 'categorie'] 

# instantiate the dataframe
Resturants_Gla = pd.DataFrame(columns=column_names)
Resturants_Gla

Unnamed: 0,name,id,Latitude,Longitude,categorie


In [11]:
for data in results_Gla:
    Restaurant_name = data['name'] 
    Restaurant_ID = data['id']
    Restaurant_lat = data['location']['lat']
    Restaurant_long = data['location']['lng']
    Restaurant_categorie = data['categories'][0]['shortName']
    
        
    
    Resturants_Gla = Resturants_Gla.append({'name': Restaurant_name,
                                          'id': Restaurant_ID,
                                           'Latitude': Restaurant_lat,
                                           'Longitude': Restaurant_long,
                                           'categorie': Restaurant_categorie}, ignore_index = True)

In [12]:
Resturants_Gla

Unnamed: 0,name,id,Latitude,Longitude,categorie
0,Rishi's Indian Aroma,513b8a7ee4b06b29ffe1c862,55.863889,-4.256817,Indian
1,indian viliage,4c585a3cd12a20a10fd168bd,55.848537,-4.219788,Indian
2,Indian Gallery,4dd16936d22deadedd993c47,55.865956,-4.267554,Indian
3,Indian Cottage,53e8f21c498e2268f7d8950e,55.857654,-4.24473,Indian
4,Heera Indian Restaurant,559c7379498ebdcc20509d0b,55.864776,-4.272287,Indian
5,Heera Indian Restaurant,559c708b498ec60f8413a489,55.864728,-4.272716,Indian
6,Dakhin South Indian Kitchen,4bed97aae3562d7f8540fff8,55.858878,-4.245425,Indian
7,Indian Orchard,5c5495fcbed483002c40bf76,55.870692,-4.305811,Indian
8,Indian Grill,59666f022e26801ba33719d4,55.88901,-4.287657,Indian
9,Assam's,4b8e7009f964a520a82233e3,55.86311,-4.256551,Indian
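
The same parse-and-append pattern is repeated for every city. As a hedged sketch (not the notebook's own code), the parsing step could be factored into a small helper that turns the list of venue dictionaries into a dataframe in one go. The function name `venues_to_dataframe` is hypothetical; the field names are those visible in the Foursquare response shown above. Building the rows in a list first also avoids `DataFrame.append`, which was removed in pandas 2.0.

```python
import pandas as pd

COLUMN_NAMES = ['name', 'id', 'Latitude', 'Longitude', 'categorie']

def venues_to_dataframe(venues):
    """Convert a list of Foursquare venue dicts into a dataframe (hypothetical helper)."""
    rows = []
    for venue in venues:
        categories = venue.get('categories') or []
        rows.append({
            'name': venue['name'],
            'id': venue['id'],
            'Latitude': venue['location']['lat'],
            'Longitude': venue['location']['lng'],
            # venues without a category get None instead of raising an IndexError
            'categorie': categories[0]['shortName'] if categories else None,
        })
    return pd.DataFrame(rows, columns=COLUMN_NAMES)

# One venue trimmed from the response above, plus a made-up venue with no category
sample = [
    {'id': '513b8a7ee4b06b29ffe1c862', 'name': "Rishi's Indian Aroma",
     'location': {'lat': 55.863888699972144, 'lng': -4.256816579488018},
     'categories': [{'shortName': 'Indian'}]},
    {'id': 'example-id', 'name': 'Example venue',
     'location': {'lat': 55.86, 'lng': -4.25}, 'categories': []},
]
Restaurants_sample = venues_to_dataframe(sample)
print(Restaurants_sample)
```

With such a helper, each city would need only one request and one call, for example `venues_to_dataframe(results_Gla)`.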


In [13]:
Edi_lat = General_df.iloc[1,2]
Edi_long = General_df.iloc[1,3]

Str_lat = General_df.iloc[2,2]
Str_long = General_df.iloc[2,3]

Dun_lat = General_df.iloc[3,2]
Dun_long = General_df.iloc[3,3]

Inv_lat = General_df.iloc[4,2]
Inv_long = General_df.iloc[4,3]

Abe_lat = General_df.iloc[5,2]
Abe_long = General_df.iloc[5,3]

In [14]:
radius_Edi = 7000
url_Edi = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&query={}&radius={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    Edi_lat, 
    Edi_long, 
    query, 
    radius_Edi)

results_Edi = requests.get(url_Edi).json()['response']['venues']

In [15]:
results_Edi[0]

{'id': '5b1e600965211f002c79012b',
 'name': 'Indian Express',
 'location': {'lat': 55.952576,
  'lng': -3.1915596,
  'labeledLatLngs': [{'label': 'display',
    'lat': 55.952576,
    'lng': -3.1915596}],
  'distance': 311,
  'postalCode': 'EH2 2QP',
  'cc': 'GB',
  'city': 'Edinburgh',
  'state': 'Edinburgh',
  'country': 'United Kingdom',
  'formattedAddress': ['Edinburgh', 'EH2 2QP', 'United Kingdom']},
 'categories': [{'id': '4bf58dd8d48988d10f941735',
   'name': 'Indian Restaurant',
   'pluralName': 'Indian Restaurants',
   'shortName': 'Indian',
   'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/indian_',
    'suffix': '.png'},
   'primary': True}],
 'referralId': 'v-1589110150',
 'hasPerk': False}

In [16]:
len(results_Edi)

26

In [17]:
Resturants_Edi = pd.DataFrame(columns=column_names)

for data in results_Edi:
    Restaurant_name1 = data['name'] 
    Restaurant_ID1 = data['id']
    Restaurant_lat1 = data['location']['lat']
    Restaurant_long1 = data['location']['lng']
    Restaurant_categorie1 = data['categories'][0]['shortName']
        
    
    Resturants_Edi = Resturants_Edi.append({'name': Restaurant_name1,
                                          'id': Restaurant_ID1,
                                           'Latitude': Restaurant_lat1,
                                           'Longitude': Restaurant_long1,
                                           'categorie': Restaurant_categorie1}, ignore_index = True)


In [18]:
Resturants_Edi

Unnamed: 0,name,id,Latitude,Longitude,categorie
0,Indian Express,5b1e600965211f002c79012b,55.952576,-3.19156,Indian
1,Indian Lounge,5d0a9a20972c0d00230e8558,55.952026,-3.202587,Indian
2,Indian Thali,4de546fdc65b98fadae0e0c4,55.956117,-3.190878,Thai
3,Tippoo's Indian Restautant,4bf2cfd26a31d13af2ee932e,55.95199,-3.202601,Indian
4,Tipu's Indian Lounge,4d3f1d266b3d236af5d07864,55.951872,-3.202907,Indian
5,Indian Consulate,4e6dcbfa1f6e39092ed66cf3,55.948419,-3.20937,Embassy
6,Zest Indian Restaurant,4b058823f964a52026b422e3,55.955546,-3.19275,Indian
7,Shamoli Thai & Indian Restaurant (Halal),4c79794b93ef236ae3bfaf0f,55.950425,-3.186008,Thai
8,Kushi's Indian Cuisine,4c7ff084dc018cfaae0fb86c,55.956113,-3.185474,Indian
9,Indian Mela,4c6c451ca437224baee429b1,55.941792,-3.181698,Indian


In [19]:
Resturants_Edi.drop(axis=0, index=5, inplace = True)
Resturants_Edi.reset_index(drop = True, inplace = True)

In [20]:
Resturants_Edi

Unnamed: 0,name,id,Latitude,Longitude,categorie
0,Indian Express,5b1e600965211f002c79012b,55.952576,-3.19156,Indian
1,Indian Lounge,5d0a9a20972c0d00230e8558,55.952026,-3.202587,Indian
2,Indian Thali,4de546fdc65b98fadae0e0c4,55.956117,-3.190878,Thai
3,Tippoo's Indian Restautant,4bf2cfd26a31d13af2ee932e,55.95199,-3.202601,Indian
4,Tipu's Indian Lounge,4d3f1d266b3d236af5d07864,55.951872,-3.202907,Indian
5,Zest Indian Restaurant,4b058823f964a52026b422e3,55.955546,-3.19275,Indian
6,Shamoli Thai & Indian Restaurant (Halal),4c79794b93ef236ae3bfaf0f,55.950425,-3.186008,Thai
7,Kushi's Indian Cuisine,4c7ff084dc018cfaae0fb86c,55.956113,-3.185474,Indian
8,Indian Mela,4c6c451ca437224baee429b1,55.941792,-3.181698,Indian
9,Kahani Indian Resturant,5cb61f9965211f002cd9f13f,55.958127,-3.185121,Pakistani
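
Dropping the consulate by its row index works for this particular response, but the index changes whenever the search results change. A possible alternative, sketched here on a few rows copied from the table above, is to filter on the category column instead. The list of categories to exclude is an assumption made for illustration.

```python
import pandas as pd

# A few rows copied from the Edinburgh results
Resturants_Edi_sample = pd.DataFrame({
    'name': ['Indian Express', 'Indian Consulate', 'Zest Indian Restaurant'],
    'categorie': ['Indian', 'Embassy', 'Indian'],
})

# Categories that are clearly not restaurants (assumed list, extend as needed)
non_restaurant_categories = ['Embassy']

# keep only the rows whose category is not in the excluded list
is_restaurant = ~Resturants_Edi_sample['categorie'].isin(non_restaurant_categories)
Resturants_Edi_clean = Resturants_Edi_sample[is_restaurant].reset_index(drop=True)
print(Resturants_Edi_clean)
```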


In [21]:
radius_Str = 4000
url_Str = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&query={}&radius={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    Str_lat, 
    Str_long, 
    query, 
    radius_Str)

results_Str = requests.get(url_Str).json()['response']['venues']


Resturants_Str = pd.DataFrame(columns=column_names)

for data in results_Str:
    Restaurant_name = data['name'] 
    Restaurant_ID = data['id']
    Restaurant_lat = data['location']['lat']
    Restaurant_long = data['location']['lng']
    Restaurant_categorie = data['categories'][0]['shortName']
        
    
    Resturants_Str = Resturants_Str.append({'name': Restaurant_name,
                                          'id': Restaurant_ID,
                                           'Latitude': Restaurant_lat,
                                           'Longitude': Restaurant_long,
                                           'categorie': Restaurant_categorie}, ignore_index = True)

In [22]:
results_Str[0]

{'id': '4c967d0a533aa0931141d245',
 'name': 'Indian cottage',
 'location': {'address': '11 Dumbarton Rd',
  'lat': 56.11650445978077,
  'lng': -3.9374734677660785,
  'labeledLatLngs': [{'label': 'display',
    'lat': 56.11650445978077,
    'lng': -3.9374734677660785}],
  'distance': 284,
  'cc': 'GB',
  'city': 'Stirling',
  'state': 'Stirlingshire',
  'country': 'United Kingdom',
  'formattedAddress': ['11 Dumbarton Rd',
   'Stirling',
   'Stirlingshire',
   'United Kingdom']},
 'categories': [{'id': '4bf58dd8d48988d10f941735',
   'name': 'Indian Restaurant',
   'pluralName': 'Indian Restaurants',
   'shortName': 'Indian',
   'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/indian_',
    'suffix': '.png'},
   'primary': True}],
 'referralId': 'v-1589110150',
 'hasPerk': False}

In [23]:
len(results_Str)

1

In [24]:
Resturants_Str

Unnamed: 0,name,id,Latitude,Longitude,categorie
0,Indian cottage,4c967d0a533aa0931141d245,56.116504,-3.937473,Indian


In [25]:
radius_Dun = 5000
url_Dun = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&query={}&radius={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    Dun_lat, 
    Dun_long, 
    query, 
    radius_Dun)

results_Dun = requests.get(url_Dun).json()['response']['venues']


Resturants_Dun = pd.DataFrame(columns=column_names)

for data in results_Dun:
    Restaurant_name = data['name'] 
    Restaurant_ID = data['id']
    Restaurant_lat = data['location']['lat']
    Restaurant_long = data['location']['lng']
    Restaurant_categorie = data.get('categories', None)
    if Restaurant_categorie :
        Restaurant_categorie = Restaurant_categorie[0]['shortName']
        
    
    Resturants_Dun = Resturants_Dun.append({'name': Restaurant_name,
                                          'id': Restaurant_ID,
                                           'Latitude': Restaurant_lat,
                                           'Longitude': Restaurant_long,
                                           'categorie': Restaurant_categorie}, ignore_index = True)

In [26]:
Resturants_Dun

Unnamed: 0,name,id,Latitude,Longitude,categorie
0,Indian Express,5cd347fe0e3239002b66f077,56.458773,-2.972983,Indian
1,Taza Indian Buffet,4e301630922e47c929275f6e,56.461486,-2.962998,Indian
2,Mazaydar Indian Takeaway,5b85266d8194fc002c8f64c0,56.488645,-2.941085,Indian


In [27]:
len(results_Dun)

3

In [28]:
radius_Inv = 4000
url_Inv = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&query={}&radius={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    Inv_lat, 
    Inv_long, 
    query, 
    radius_Inv)

results_Inv = requests.get(url_Inv).json()['response']['venues']


Resturants_Inv = pd.DataFrame(columns=column_names)

for data in results_Inv:
    Restaurant_name = data['name'] 
    Restaurant_ID = data['id']
    Restaurant_lat = data['location']['lat']
    Restaurant_long = data['location']['lng']
    Restaurant_categorie = data.get('categories', None)
    if Restaurant_categorie :
        Restaurant_categorie = Restaurant_categorie[0]['shortName']
        
    
    Resturants_Inv = Resturants_Inv.append({'name': Restaurant_name,
                                          'id': Restaurant_ID,
                                           'Latitude': Restaurant_lat,
                                           'Longitude': Restaurant_long,
                                           'categorie': Restaurant_categorie}, ignore_index = True)

In [29]:
Resturants_Inv

Unnamed: 0,name,id,Latitude,Longitude,categorie
0,Indian cuisine,4c41dddcce54e21e91b20b1a,57.479537,-4.224891,[]
1,Indian Ocean Inverness,4c51b839250dd13a613eb67d,57.479663,-4.225683,Indian
2,Sams Indian Cuisine,4c48b71b3013a593877dc7e1,57.47937,-4.227251,Indian
3,Sam's Indian,4be1c9a9ae55a593358f5b62,57.478889,-4.233905,[]
4,Shapla,54140279498ec4528ff95081,57.476933,-4.226613,South Indian


In [30]:
Resturants_Inv.iat[0,4]= 'Indian'
Resturants_Inv.iat[3,4]= 'Indian'

In [31]:
Resturants_Inv

Unnamed: 0,name,id,Latitude,Longitude,categorie
0,Indian cuisine,4c41dddcce54e21e91b20b1a,57.479537,-4.224891,Indian
1,Indian Ocean Inverness,4c51b839250dd13a613eb67d,57.479663,-4.225683,Indian
2,Sams Indian Cuisine,4c48b71b3013a593877dc7e1,57.47937,-4.227251,Indian
3,Sam's Indian,4be1c9a9ae55a593358f5b62,57.478889,-4.233905,Indian
4,Shapla,54140279498ec4528ff95081,57.476933,-4.226613,South Indian


In [32]:
len(results_Inv)

5

In [33]:
radius_Abe = 7000
url_Abe = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&query={}&radius={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    Abe_lat, 
    Abe_long, 
    query, 
    radius_Abe)

results_Abe = requests.get(url_Abe).json()['response']['venues']


Resturants_Abe = pd.DataFrame(columns=column_names)

for data in results_Abe:
    Restaurant_name = data['name'] 
    Restaurant_ID = data['id']
    Restaurant_lat = data['location']['lat']
    Restaurant_long = data['location']['lng']
    Restaurant_categorie = data.get('categories', None)
    if Restaurant_categorie :
        Restaurant_categorie = Restaurant_categorie[0]['shortName']
        
    
    Resturants_Abe = Resturants_Abe.append({'name': Restaurant_name,
                                          'id': Restaurant_ID,
                                           'Latitude': Restaurant_lat,
                                           'Longitude': Restaurant_long,
                                           'categorie': Restaurant_categorie}, ignore_index = True)

In [34]:
Resturants_Abe

Unnamed: 0,name,id,Latitude,Longitude,categorie
0,Riksha Streetside Indian,5d425256ef53ee0007256b14,57.142266,-2.095972,Indian
1,Monsoona Indian,4de13fc51f6ece64738f7c98,57.145206,-2.101991,Indian
2,Rishis Indian Aroma,4dbb11b90437955ec00da471,57.151027,-2.102278,Indian
3,Shri Bheema's Indian Restaurant Bridge of Don,5230acc211d2ca6391f7fc38,57.181874,-2.115747,Indian


In [35]:
d_num = {'N_restaurants': [len(Resturants_Gla['name']), len(Resturants_Edi['name']), len(Resturants_Str['name']), len(Resturants_Dun['name']), len(Resturants_Inv['name']), len(Resturants_Abe['name'])]
    }

In [36]:
d_num

{'N_restaurants': [13, 25, 1, 3, 5, 4]}

In [37]:
General_df['N_restuaurants']= d_num['N_restaurants']

In [38]:
General_df

Unnamed: 0,Cities,Population,Lattitude,Longitude,N_restuaurants
0,Glasgow,626410,55.86515,-4.25763,13
1,Edinburgh,518500,55.95206,-3.19648,25
2,Stirling,94330,56.11903,-3.93682,1
3,Dundee,148750,56.46913,-2.97489,3
4,Inverness,47380,57.47908,-4.22398,5
5,Aberdeen,227560,57.14369,-2.09814,4


In [39]:
General_df['ratio']= General_df['Population']//General_df['N_restuaurants']

In [40]:
General_df

Unnamed: 0,Cities,Population,Lattitude,Longitude,N_restuaurants,ratio
0,Glasgow,626410,55.86515,-4.25763,13,48185
1,Edinburgh,518500,55.95206,-3.19648,25,20740
2,Stirling,94330,56.11903,-3.93682,1,94330
3,Dundee,148750,56.46913,-2.97489,3,49583
4,Inverness,47380,57.47908,-4.22398,5,9476
5,Aberdeen,227560,57.14369,-2.09814,4,56890


In [41]:
General_df.sort_values(by='ratio', ascending = False)

Unnamed: 0,Cities,Population,Lattitude,Longitude,N_restuaurants,ratio
2,Stirling,94330,56.11903,-3.93682,1,94330
5,Aberdeen,227560,57.14369,-2.09814,4,56890
3,Dundee,148750,56.46913,-2.97489,3,49583
0,Glasgow,626410,55.86515,-4.25763,13,48185
1,Edinburgh,518500,55.95206,-3.19648,25,20740
4,Inverness,47380,57.47908,-4.22398,5,9476


##### TripAdvisor data 
Stirling: 11
Glasgow: 192
Edinburgh: 131
Aberdeen: 59
Inverness: 14
Dundee: 29

In [42]:
General_df['tripadvisor_data']= [192, 131, 11, 29, 14, 59] 

In [43]:
General_df['ratio_tripadvisor']= General_df['Population']//General_df['tripadvisor_data']

In [44]:
General_df

Unnamed: 0,Cities,Population,Lattitude,Longitude,N_restuaurants,ratio,tripadvisor_data,ratio_tripadvisor
0,Glasgow,626410,55.86515,-4.25763,13,48185,192,3262
1,Edinburgh,518500,55.95206,-3.19648,25,20740,131,3958
2,Stirling,94330,56.11903,-3.93682,1,94330,11,8575
3,Dundee,148750,56.46913,-2.97489,3,49583,29,5129
4,Inverness,47380,57.47908,-4.22398,5,9476,14,3384
5,Aberdeen,227560,57.14369,-2.09814,4,56890,59,3856


In [45]:
General_df.sort_values(by='ratio_tripadvisor', ascending = True)

Unnamed: 0,Cities,Population,Lattitude,Longitude,N_restuaurants,ratio,tripadvisor_data,ratio_tripadvisor
0,Glasgow,626410,55.86515,-4.25763,13,48185,192,3262
4,Inverness,47380,57.47908,-4.22398,5,9476,14,3384
5,Aberdeen,227560,57.14369,-2.09814,4,56890,59,3856
1,Edinburgh,518500,55.95206,-3.19648,25,20740,131,3958
3,Dundee,148750,56.46913,-2.97489,3,49583,29,5129
2,Stirling,94330,56.11903,-3.93682,1,94330,11,8575
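
The `ratio` columns give residents per restaurant, so a low value means many restaurants relative to the population. The same comparison can be expressed as restaurants per 10,000 residents, which some readers may find easier to interpret. The sketch below recomputes both views from the population and TripAdvisor counts in the table above; the column names `residents_per_restaurant` and `restaurants_per_10k` are introduced here for illustration.

```python
import pandas as pd

cities = pd.DataFrame({
    'Cities': ['Glasgow', 'Edinburgh', 'Stirling', 'Dundee', 'Inverness', 'Aberdeen'],
    'Population': [626410, 518500, 94330, 148750, 47380, 227560],
    'tripadvisor_data': [192, 131, 11, 29, 14, 59],
})

# residents per restaurant (same floor division as in the notebook)
cities['residents_per_restaurant'] = cities['Population'] // cities['tripadvisor_data']

# restaurants per 10,000 residents (the inverse view)
cities['restaurants_per_10k'] = (
    cities['tripadvisor_data'] / cities['Population'] * 10_000
).round(2)

ranked = cities.sort_values(by='restaurants_per_10k', ascending=False)
print(ranked)
```

On these figures Glasgow has the most listed restaurants per 10,000 residents and Stirling the fewest.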


### The TripAdvisor data make more sense, and Glasgow is chosen as the place to study further: it has the lowest number of residents per Indian restaurant (3,262)

In [46]:
General_df

Unnamed: 0,Cities,Population,Lattitude,Longitude,N_restuaurants,ratio,tripadvisor_data,ratio_tripadvisor
0,Glasgow,626410,55.86515,-4.25763,13,48185,192,3262
1,Edinburgh,518500,55.95206,-3.19648,25,20740,131,3958
2,Stirling,94330,56.11903,-3.93682,1,94330,11,8575
3,Dundee,148750,56.46913,-2.97489,3,49583,29,5129
4,Inverness,47380,57.47908,-4.22398,5,9476,14,3384
5,Aberdeen,227560,57.14369,-2.09814,4,56890,59,3856


#### There are a few issues with the data from Foursquare. 
1. It is not up to date
2. It does not contain all current venues
3. Its coverage of Scotland is limited
4. A search returns a maximum of 30 results

Therefore it was decided to change the way this project is carried out. The TripAdvisor data show the number of residents per restaurant to be lowest in Glasgow.
So the areas of Glasgow are studied in more detail. For that, they are divided into sections to capture data on all restaurants within the region. 
Then all the data will be merged and analysed. 
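
Because each Foursquare search covers a radius around one point, the search centres need to be close enough for neighbouring circles to overlap. A quick way to check the spacing is the haversine distance between two centres. The sketch below is an illustration rather than part of the original analysis; it uses the centres of sections G1_1 and G1_2 and the 1,000 m default radius of the search function defined further down.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

# Centres of sections G1_1 and G1_2
distance = haversine_m(55.861598, -4.245269, 55.856487, -4.247578)
search_radius = 1000  # metres, per search

print(f"Distance between centres: {distance:.0f} m")
print("Circles overlap:", distance < 2 * search_radius)
```

Overlapping circles return some venues more than once, which is why duplicates have to be removed after the results are merged.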

In [47]:
G_pc = {'PostCode':['G1_1',
                    'G1_2', 
                    'G2_1',
                    'G2_2',
                    'G3_1',
                    'G3_2',
                    'G3_3',
                    'G3_4',
                    'G3_5',
                    'G3_6',
                    'G3_7',
                    'G3_8',
                    'G3_9',
                    'G3_10',
                    'G3_11',
                    'G4_1',
                    'G4_2',
                    'G4_3',
                    'G4_4',
                    'G4_5',
                    'G4_6', 
                    'G5_1',
                    'G5_2',
                    'G5_3',
                    'G5_4',
                    'G5_5',
                    'G5_6',
                    'G5_7'], 
        'Lattitude': [55.861598,
                      55.856487,
                      55.865950,
                      55.859977,
                      55.865954,
                      55.862729,
                      55.860294,
                      55.865069,
                      55.869649,
                      55.873021,
                      55.871403,
                      55.871910,
                      55.875749,
                      55.876957,
                      55.871379,
                      55.876118,
                      55.871110,
                      55.870099,
                      55.867739,
                      55.864031,
                      55.860274,
                      55.852342,
                      55.850048,
                      55.851794,
                      55.848915,
                      55.845271,
                      55.840775,
                      55.839934,], 
        'Longitude': [-4.245269,
                      -4.247578,
                      -4.261307,
                      -4.260471,
                      -4.297588,
                      -4.284020,
                      -4.277013,
                      -4.278244,
                      -4.275117,
                      -4.279078,
                      -4.286025,
                      -4.294293,
                      -4.291166,
                      -4.283454,
                      -4.267127,
                      -4.257704,
                      -4.255430,
                      -4.248091,
                      -4.241482,
                      -4.237706,
                      -4.233586,
                      -4.270117,
                      -4.262739,
                      -4.254192,
                      -4.247489,
                      -4.244340,
                      -4.243360,
                      -4.235486]}
Glasgow_df = pd.DataFrame(data= G_pc)
Glasgow_df

Unnamed: 0,PostCode,Lattitude,Longitude
0,G1_1,55.861598,-4.245269
1,G1_2,55.856487,-4.247578
2,G2_1,55.86595,-4.261307
3,G2_2,55.859977,-4.260471
4,G3_1,55.865954,-4.297588
5,G3_2,55.862729,-4.28402
6,G3_3,55.860294,-4.277013
7,G3_4,55.865069,-4.278244
8,G3_5,55.869649,-4.275117
9,G3_6,55.873021,-4.279078


In [48]:
def getrestaurants_Gla(names, latitudes, longitudes, query='restaurant', radius=1000):
    """Search Foursquare around each postcode centre and collect the venues found."""
    rest_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # create the API request URL
        url_Gla_central = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&query={}&radius={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng,
            query, 
            radius)

        # make the GET request
        results_for_G_central = requests.get(url_Gla_central).json()['response']['venues']

        for data in results_for_G_central:
            # use the first category's short name; a missing or empty category is kept as returned
            Restaurant_categorie = data.get('categories', None)
            if Restaurant_categorie:
                Restaurant_categorie = Restaurant_categorie[0]['shortName']

            rest_list.append((
                name,
                lat,
                lng,
                data['name'],
                data['location']['lat'],
                data['location']['lng'],
                data['id'],
                Restaurant_categorie))

    # one row per venue returned, labelled with the postcode centre that found it
    nearby_rest = pd.DataFrame(rest_list, columns=['Gla postcode',
                                                   'Gla Latitude',
                                                   'Gla Longitude',
                                                   'name',
                                                   'rest Latitude',
                                                   'rest Longitude',
                                                   'rest id',
                                                   'rest Category'])

    return nearby_rest

In [56]:
Gla_rest_All = getrestaurants_Gla (names = Glasgow_df['PostCode'],
                                   latitudes = Glasgow_df['Lattitude'],
                                   longitudes = Glasgow_df['Longitude'])

G1_1
G1_2
G2_1
G2_2
G3_1
G3_2
G3_3
G3_4
G3_5
G3_6
G3_7
G3_8
G3_9
G3_10
G3_11
G4_1
G4_2
G4_3
G4_4
G4_5
G4_6
G5_1
G5_2
G5_3
G5_4
G5_5
G5_6
G5_7


In [57]:
Gla_rest_All

Unnamed: 0,Gla postcode,Gla Latitude,Gla Longitude,name,rest Latitude,rest Longitude,rest id,rest Category
0,G1_1,55.861598,-4.245269,Windows Restaurant at the Carlton George,55.861947,-4.253082,4caf8658562d224bb9b90e88,Diner
1,G1_1,55.861598,-4.245269,Kezban Mediterranean Restaurant,55.857076,-4.243293,5323286511d2a3b706ddd6fd,Mediterranean
2,G1_1,55.861598,-4.245269,Tempus Bar and Restaurant | Grand Central Hotel,55.85994,-4.258971,4d98ed6de07ea35d311bfe02,Bar
3,G1_1,55.861598,-4.245269,The Restaurant Bar and Grill,55.85927,-4.253622,4be1550f2b27ef3b7de0746a,Restaurant
4,G1_1,55.861598,-4.245269,Scholars Restaurant,55.863466,-4.245876,52fb4e04498e8a45319c049b,Restaurant
5,G1_1,55.861598,-4.245269,Restaurant Jurys Inn,55.856651,-4.256972,5224c5d911d2dd44ddacd97c,Eastern European
6,G1_1,55.861598,-4.245269,Eda Restaurant,55.856729,-4.24342,5d3f30414788eb0008a8c59b,Turkish
7,G1_1,55.861598,-4.245269,Eliá Greek Restaurant,55.861239,-4.250873,4bec4684fd60a593c4c33af1,Greek
8,G1_1,55.861598,-4.245269,Bill's Restaurant,55.860905,-4.255365,55155ce6498e8ea2acc3d359,English
9,G1_1,55.861598,-4.245269,Citation Taverne & Restaurant,55.858793,-4.247261,4b88517bf964a520bfee31e3,Bar


In [58]:
Gla_rest_All.drop_duplicates (subset='rest id',
                              keep = 'first',
                              inplace = True)
Gla_rest_All.reset_index(drop= True, inplace = True)

In [65]:
Gla_rest_All

Unnamed: 0,Gla postcode,Gla Latitude,Gla Longitude,name,rest Latitude,rest Longitude,rest id,rest Category
0,G1_1,55.861598,-4.245269,Windows Restaurant at the Carlton George,55.861947,-4.253082,4caf8658562d224bb9b90e88,Diner
1,G1_1,55.861598,-4.245269,Kezban Mediterranean Restaurant,55.857076,-4.243293,5323286511d2a3b706ddd6fd,Mediterranean
2,G1_1,55.861598,-4.245269,Tempus Bar and Restaurant | Grand Central Hotel,55.85994,-4.258971,4d98ed6de07ea35d311bfe02,Bar
3,G1_1,55.861598,-4.245269,The Restaurant Bar and Grill,55.85927,-4.253622,4be1550f2b27ef3b7de0746a,Restaurant
4,G1_1,55.861598,-4.245269,Scholars Restaurant,55.863466,-4.245876,52fb4e04498e8a45319c049b,Restaurant
5,G1_1,55.861598,-4.245269,Restaurant Jurys Inn,55.856651,-4.256972,5224c5d911d2dd44ddacd97c,Eastern European
6,G1_1,55.861598,-4.245269,Eda Restaurant,55.856729,-4.24342,5d3f30414788eb0008a8c59b,Turkish
7,G1_1,55.861598,-4.245269,Eliá Greek Restaurant,55.861239,-4.250873,4bec4684fd60a593c4c33af1,Greek
8,G1_1,55.861598,-4.245269,Bill's Restaurant,55.860905,-4.255365,55155ce6498e8ea2acc3d359,English
9,G1_1,55.861598,-4.245269,Citation Taverne & Restaurant,55.858793,-4.247261,4b88517bf964a520bfee31e3,Bar


In [78]:
# Set the category of 'Scotts Restaurant' by label instead of looping over the dataframe
# (assumed intent: this venue came back without a category)
Gla_rest_All.loc[Gla_rest_All['name'] == 'Scotts Restaurant', 'rest Category'] = 'Restaurant'

# Manual corrections by row position, kept for reference:
#Gla_rest_All.iat[68,7]='italian'
#Gla_rest_All.iat[47,7]='Restaurant'
#Gla_rest_All.iat[48,7]='Indian'

In [76]:
Gla_rest_All.shape
Gla_rest_All

Unnamed: 0,Gla postcode,Gla Latitude,Gla Longitude,name,rest Latitude,rest Longitude,rest id,rest Category
0,G1_1,55.861598,-4.245269,Windows Restaurant at the Carlton George,55.861947,-4.253082,4caf8658562d224bb9b90e88,Diner
1,G1_1,55.861598,-4.245269,Kezban Mediterranean Restaurant,55.857076,-4.243293,5323286511d2a3b706ddd6fd,Mediterranean
2,G1_1,55.861598,-4.245269,Tempus Bar and Restaurant | Grand Central Hotel,55.85994,-4.258971,4d98ed6de07ea35d311bfe02,Bar
3,G1_1,55.861598,-4.245269,The Restaurant Bar and Grill,55.85927,-4.253622,4be1550f2b27ef3b7de0746a,Restaurant
4,G1_1,55.861598,-4.245269,Scholars Restaurant,55.863466,-4.245876,52fb4e04498e8a45319c049b,Restaurant
5,G1_1,55.861598,-4.245269,Restaurant Jurys Inn,55.856651,-4.256972,5224c5d911d2dd44ddacd97c,Eastern European
6,G1_1,55.861598,-4.245269,Eda Restaurant,55.856729,-4.24342,5d3f30414788eb0008a8c59b,Turkish
7,G1_1,55.861598,-4.245269,Eliá Greek Restaurant,55.861239,-4.250873,4bec4684fd60a593c4c33af1,Greek
8,G1_1,55.861598,-4.245269,Bill's Restaurant,55.860905,-4.255365,55155ce6498e8ea2acc3d359,English
9,G1_1,55.861598,-4.245269,Citation Taverne & Restaurant,55.858793,-4.247261,4b88517bf964a520bfee31e3,Bar


In [55]:
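# Venues without a category carry an empty list, which is unhashable and makes groupby fail,
# so every non-string entry is replaced first ('Unknown' is an arbitrary placeholder label)
Gla_rest_All['rest Category'] = Gla_rest_All['rest Category'].apply(
    lambda c: c if isinstance(c, str) else 'Unknown')
rest_by_category = Gla_rest_All.groupby(['rest Category']).count()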

In [None]:
rest_by_category

In [None]:
print('There are {} unique restaurants types in Glasgow.'.format(len(Gla_rest_All['rest Category'].unique())))


### There are a total of 7 Indian restaurants in the area of Glasgow, out of an overall count of 73 restaurants. 
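
The share of Indian restaurants follows from these two counts: 7 out of 73 is just under 10%. As an illustrative sketch, the same share can be computed for any category from a list of category labels. The helper `category_share` and the toy list below are hypothetical; only the counts 7 and 73 come from the analysis.

```python
from collections import Counter

def category_share(categories, target):
    """Return the fraction of entries in `categories` equal to `target`."""
    counts = Counter(categories)
    total = sum(counts.values())
    return counts[target] / total if total else 0.0

# Toy list reproducing the counts quoted above: 7 Indian venues among 73
toy_categories = ['Indian'] * 7 + ['Other'] * 66
share = category_share(toy_categories, 'Indian')
print(f"Indian share: {share:.1%}")
```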

In [None]:
Gla_lat = 55.86515
Gla_long = -4.25763
# create map of Glasgow using latitude and longitude values
map_glasgow = folium.Map(location=[Gla_lat, Gla_long], zoom_start=14)

# add markers to map
for lat, lng, name, rest_Category in zip(Gla_rest_All['rest Latitude'],
                                     Gla_rest_All['rest Longitude'],
                                     Gla_rest_All['name'],
                                     Gla_rest_All['rest Category']
                                     ):
    label = '{}, {}'.format(name, rest_Category)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_glasgow)  
    
map_glasgow

In [None]:
# one hot encoding
Glasgow_onehot = pd.get_dummies(Gla_rest_All['rest Category'], prefix="", prefix_sep="")

# add postcode column back to dataframe
Glasgow_onehot['Gla postcode'] = Gla_rest_All['Gla postcode'] 

# move postcode column to the first column
fixed_columns = [Glasgow_onehot.columns[-1]] + list(Glasgow_onehot.columns[:-1])
Glasgow_onehot = Glasgow_onehot[fixed_columns]

Glasgow_onehot.head()
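
The one-hot table is the usual input for the clustering step mentioned in the study outline. A minimal sketch of how that step might look is given below, assuming the rows are first averaged per postcode section and then grouped with k-means. The toy data, the choice of two clusters and the variable names are assumptions for illustration, not results from this analysis.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy stand-in for Gla_rest_All (hypothetical rows)
toy = pd.DataFrame({
    'Gla postcode': ['G1_1', 'G1_1', 'G2_1', 'G2_1', 'G3_1', 'G3_1'],
    'rest Category': ['Indian', 'Italian', 'Indian', 'Indian', 'Italian', 'Italian'],
})

# one-hot encode the categories and put the postcode column in front
onehot = pd.get_dummies(toy['rest Category'], prefix="", prefix_sep="", dtype=float)
onehot.insert(0, 'Gla postcode', toy['Gla postcode'])

# average per postcode: each value is the share of that category in the section
grouped = onehot.groupby('Gla postcode').mean()

# cluster the sections by their category mix (two clusters chosen arbitrarily)
kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(grouped)
grouped['Cluster'] = kmeans.labels_
print(grouped)
```

Sections that end up in the same cluster have a similar mix of cuisines, which is one way to look for areas where Indian food is under-represented.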