# Introduction / Business problem

Korean cuisine is currently trending. People all around the world want good Korean restaurants in their city, but in Spain there aren't many of them.  
The business problem is to decide which of the biggest cities in Spain would be the best place to open a Korean restaurant.  
For this analysis we will take into consideration the number of restaurants of the same kind and the number of restaurants in general.

# Data

I will use data from Foursquare to get the restaurants in each city and their categories.  
I will also need a dataset with the cities' coordinates.

# Analysis

In [2]:
import requests
!pip install folium
import folium # map rendering library
#!pip install geopy
from geopy.geocoders import Nominatim
import pandas as pd

Collecting folium
  Downloading folium-0.12.1-py2.py3-none-any.whl (94 kB)
Collecting branca>=0.3.0
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.2 folium-0.12.1


In [58]:
column_names = ['City','Latitude', 'Longitude','Density']
spain_df = pd.DataFrame(columns=column_names)
spain_df

Unnamed: 0,City,Latitude,Longitude,Density


Appending the data from https://www.geodatos.net/coordenadas/espana

In [63]:
# Build the table in one step; DataFrame.append was removed in pandas 2.0
cities = [
    {'City': 'Madrid', 'Latitude': 40.4165, 'Longitude': -3.70256, 'Density': 5418.47},
    {'City': 'Barcelona', 'Latitude': 41.38879, 'Longitude': 2.15899, 'Density': 15992.2},
    {'City': 'Valencia', 'Latitude': 39.46975, 'Longitude': -0.37739, 'Density': 5850.78},
    {'City': 'Zaragoza', 'Latitude': 41.65606, 'Longitude': -0.87734, 'Density': 682.84},
    {'City': 'Malaga', 'Latitude': 36.72016, 'Longitude': -4.42034, 'Density': 1428.76},
    {'City': 'Murcia', 'Latitude': 37.98704, 'Longitude': -1.13004, 'Density': 513.98},
    {'City': 'Bilbao', 'Latitude': 43.26271, 'Longitude': -2.92528, 'Density': 8295.91},
    {'City': 'Sevilla', 'Latitude': 37.38283, 'Longitude': -5.97317, 'Density': 4896.55},
    {'City': 'Valladolid', 'Latitude': 41.65518, 'Longitude': -4.72372, 'Density': 1514.4},
    {'City': 'Vigo', 'Latitude': 42.23282, 'Longitude': -8.72264, 'Density': 2686.47},
    {'City': 'A Coruña', 'Latitude': 43.37135, 'Longitude': -8.396, 'Density': 6452.52},
    {'City': 'Granada', 'Latitude': 37.18817, 'Longitude': -3.60667, 'Density': 2654.41},
    {'City': 'Oviedo', 'Latitude': 43.36029, 'Longitude': -5.84476, 'Density': 1180.29},
    {'City': 'Cartagena', 'Latitude': 37.60512, 'Longitude': -0.98623, 'Density': 383.77},
]
spain_df = pd.DataFrame(cities, columns=column_names)
spain_df

Unnamed: 0,City,Latitude,Longitude,Density
0,Madrid,40.4165,-3.70256,5418.47
1,Barcelona,41.38879,2.15899,15992.2
2,Valencia,39.46975,-0.37739,5850.78
3,Zaragoza,41.65606,-0.87734,682.84
4,Malaga,36.72016,-4.42034,1428.76
5,Murcia,37.98704,-1.13004,513.98
6,Bilbao,43.26271,-2.92528,8295.91
7,Sevilla,37.38283,-5.97317,4896.55
8,Valladolid,41.65518,-4.72372,1514.4
9,Vigo,42.23282,-8.72264,2686.47


In [64]:
CLIENT_ID = 'Y2ODIMTHIOERYBD10IHG4DSQ5MNZ3I1P4DL4EYJVBO50TP4M' # your Foursquare ID
CLIENT_SECRET = 'R2KQUMZ4DIW5BMCTPXRDRMH0OC1PM54EP2DRSQ22134XVKVW' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 200 # A default Foursquare API limit value

### Checking that each city has the correct coordinates

In [65]:
map_spain = folium.Map(location=[40.41650, -3.70256], zoom_start=7)

# add markers to map
for lat, lng, city in zip(spain_df['Latitude'], spain_df['Longitude'], spain_df['City']):
    label = city
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_spain)  
    
map_spain
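
The map is a visual check. The same idea can be approximated numerically by testing that every point falls inside a bounding box. This is only a sketch: the box limits below are rough, assumed values for peninsular Spain, and the three rows are a hand-typed subset of the table built earlier.

```python
import pandas as pd

# Approximate bounding box of peninsular Spain (assumed values, in degrees).
LAT_MIN, LAT_MAX = 35.9, 43.9
LON_MIN, LON_MAX = -9.4, 3.4

# A few of the rows defined earlier in the notebook.
cities = pd.DataFrame({
    'City': ['Madrid', 'Barcelona', 'Vigo'],
    'Latitude': [40.4165, 41.38879, 42.23282],
    'Longitude': [-3.70256, 2.15899, -8.72264],
})

# True where both coordinates fall inside the box.
inside = (cities['Latitude'].between(LAT_MIN, LAT_MAX)
          & cities['Longitude'].between(LON_MIN, LON_MAX))

# Rows outside the box probably have a typo or swapped latitude/longitude.
suspicious = cities[~inside]
print(suspicious)
```

An empty `suspicious` frame means no coordinate is obviously wrong; it does not prove that each point sits on the right city, which is what the map is for.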

### Counting the Korean restaurants in each city

In [66]:
def getNearbyKoreanRestaurants(names, latitudes, longitudes, radius=5000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId=4bf58dd8d48988d113941735'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['City', 
                  'City Latitude', 
                  'City Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [67]:
spain_korean = getNearbyKoreanRestaurants(names=spain_df['City'],
                                   latitudes=spain_df['Latitude'],
                                   longitudes=spain_df['Longitude']
                                  )

Madrid
Barcelona
Valencia
Zaragoza
Malaga
Murcia
Bilbao
Sevilla
Valladolid
Vigo
A Coruña
Granada
Oviedo
Cartagena


In [68]:
grouped_korean = spain_korean.groupby('City').count()
grouped_korean.drop(labels=['City Latitude', 'City Longitude','Venue Latitude', 'Venue Longitude', 'Venue Category'],axis = 1, inplace = True)
grouped_korean

City,Venue
Barcelona,27
Granada,1
Madrid,22
Malaga,3
Sevilla,3
Valencia,4


### Counting the Asian restaurants (including Korean) in each city

In [69]:
def getNearbyAsianRestaurants(names, latitudes, longitudes, radius=5000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId=4bf58dd8d48988d142941735'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['City', 
                  'City Latitude', 
                  'City Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
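
The two functions above differ only in the `categoryId` value, so they could be folded into one. The sketch below is one possible way to do that, not the notebook's own code: `build_explore_url` and `parse_venues` are hypothetical helper names, the credentials are placeholders, and the parsing assumes the same JSON layout that the code above reads (`response`, then `groups[0]`, then `items`). It makes no network call; the sample response is made up for illustration.

```python
import pandas as pd

KOREAN_CATEGORY = '4bf58dd8d48988d113941735'  # category ID used in the Korean query above
ASIAN_CATEGORY = '4bf58dd8d48988d142941735'   # category ID used in the Asian query above

COLUMNS = ['City', 'City Latitude', 'City Longitude',
           'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category']


def build_explore_url(lat, lng, category_id, client_id, client_secret,
                      version='20180605', radius=5000, limit=200):
    """Build the same explore URL as above, with the category as a parameter."""
    return ('https://api.foursquare.com/v2/venues/explore?'
            'client_id={}&client_secret={}&v={}&ll={},{}'
            '&radius={}&limit={}&categoryId={}').format(
                client_id, client_secret, version, lat, lng,
                radius, limit, category_id)


def parse_venues(city, lat, lng, response_json):
    """Turn one explore response into rows; an empty response gives no rows."""
    groups = response_json.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    return [(city, lat, lng,
             v['venue']['name'],
             v['venue']['location']['lat'],
             v['venue']['location']['lng'],
             v['venue']['categories'][0]['name']) for v in items]


# Made-up response with the layout the notebook code expects.
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Example Venue',
               'location': {'lat': 40.42, 'lng': -3.70},
               'categories': [{'name': 'Korean Restaurant'}]}}]}]}}

rows = parse_venues('Madrid', 40.4165, -3.70256, sample)
venues = pd.DataFrame(rows, columns=COLUMNS)
url = build_explore_url(40.4165, -3.70256, KOREAN_CATEGORY, 'ID', 'SECRET')
print(venues)
```

Passing the column names to the `DataFrame` constructor also avoids an error when a query returns no venues at all, since assigning seven column names to an empty frame would fail.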

In [70]:
spain_asian = getNearbyAsianRestaurants(names=spain_df['City'],
                                   latitudes=spain_df['Latitude'],
                                   longitudes=spain_df['Longitude']
                                  )

Madrid
Barcelona
Valencia
Zaragoza
Malaga
Murcia
Bilbao
Sevilla
Valladolid
Vigo
A Coruña
Granada
Oviedo
Cartagena


In [71]:
grouped_asian = spain_asian.groupby('City').count()
grouped_asian.drop(labels=['City Latitude', 'City Longitude','Venue Latitude', 'Venue Longitude', 'Venue Category'],axis = 1, inplace = True)
grouped_asian

City,Venue
A Coruña,21
Barcelona,100
Bilbao,35
Cartagena,5
Granada,33
Madrid,100
Malaga,59
Murcia,30
Oviedo,12
Sevilla,42


(The Asian restaurant query hit the limit of 100 results for Barcelona, Madrid and Valencia, so the real counts there are higher and those three cities look worse on this metric than they really are. I will take that into account.)
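
One way to keep track of this is to flag the affected cities explicitly. A minimal sketch, assuming the cap is 100 results per query (the largest value seen in the output above) and using a small hand-typed sample of the counts; `RESULT_CAP` and the `Capped` column are illustrative names, not part of the original notebook.

```python
import pandas as pd

RESULT_CAP = 100  # assumption: largest count observed in the Asian query output

# Hand-typed sample of the counts shown above.
sample = pd.DataFrame({
    'City': ['Barcelona', 'Madrid', 'Sevilla', 'Granada'],
    'Asian Restaurants': [100, 100, 42, 33],
    'Korean Restaurants': [27, 22, 3, 1],
})

# A count equal to the cap is only a lower bound on the real number.
sample['Capped'] = sample['Asian Restaurants'] >= RESULT_CAP

# Korean-to-Asian ratio; for capped cities this is an upper bound.
sample['Ratio'] = sample['Korean Restaurants'] / sample['Asian Restaurants']
print(sample)
```

For a capped city the real ratio can only be lower than the computed one, because the true denominator is at least as large as the cap.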

## Combining all the data in one DataFrame

In [72]:
# Inner merge: keep only the cities that returned Asian restaurants
asian_df = pd.merge(spain_df, grouped_asian, how="inner", on='City', sort=True)
asian_df.rename(columns={'Venue':'Asian Restaurants'},inplace = True)
# Left merge: keep cities with no Korean restaurants; their count becomes 0
df = pd.merge(asian_df, grouped_korean, how="left", on='City', sort=True)
df.rename(columns={'Venue':'Korean Restaurants'},inplace = True)
df['Korean Restaurants'] = df['Korean Restaurants'].fillna(0)
df

Unnamed: 0,City,Latitude,Longitude,Density,Asian Restaurants,Korean Restaurants
0,A Coruña,43.37135,-8.396,6452.52,21,0.0
1,Barcelona,41.38879,2.15899,15992.2,100,27.0
2,Bilbao,43.26271,-2.92528,8295.91,35,0.0
3,Cartagena,37.60512,-0.98623,383.77,5,0.0
4,Granada,37.18817,-3.60667,2654.41,33,1.0
5,Madrid,40.4165,-3.70256,5418.47,100,22.0
6,Malaga,36.72016,-4.42034,1428.76,59,3.0
7,Murcia,37.98704,-1.13004,513.98,30,0.0
8,Oviedo,43.36029,-5.84476,1180.29,12,0.0
9,Sevilla,37.38283,-5.97317,4896.55,42,3.0


In [73]:
values = df['Korean Restaurants'] / df['Asian Restaurants']
df['Ratio'] = values
df.sort_values(by=['Asian Restaurants'])

Unnamed: 0,City,Latitude,Longitude,Density,Asian Restaurants,Korean Restaurants,Ratio
3,Cartagena,37.60512,-0.98623,383.77,5,0.0,0.0
8,Oviedo,43.36029,-5.84476,1180.29,12,0.0,0.0
12,Vigo,42.23282,-8.72264,2686.47,12,0.0,0.0
11,Valladolid,41.65518,-4.72372,1514.4,16,0.0,0.0
0,A Coruña,43.37135,-8.396,6452.52,21,0.0,0.0
7,Murcia,37.98704,-1.13004,513.98,30,0.0,0.0
4,Granada,37.18817,-3.60667,2654.41,33,1.0,0.030303
2,Bilbao,43.26271,-2.92528,8295.91,35,0.0,0.0
9,Sevilla,37.38283,-5.97317,4896.55,42,3.0,0.071429
13,Zaragoza,41.65606,-0.87734,682.84,44,0.0,0.0


# Conclusion

With this data, there are several options to consider.  
  
  
## Option 1  
The first option is to start the business in a city that has no Korean restaurants yet but has enough Asian restaurants to suggest that people there are open to trying new cuisines. Bilbao and A Coruña fit that description. Zaragoza and Murcia aren't bad options either, but because of their low population density a restaurant would likely be more successful in the other two.
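
The reasoning behind this option can be written as a filter. This is a sketch under stated assumptions: the counts are copied by hand from the outputs in this notebook, Valencia's Asian count is assumed to be 100 because the text says its query hit the limit, and `MIN_ASIAN = 20` is a hypothetical threshold for "enough Asian restaurants" chosen for illustration.

```python
import pandas as pd

# Counts copied from the notebook outputs; Valencia's Asian count (100) is assumed.
data = pd.DataFrame({
    'City': ['A Coruña', 'Barcelona', 'Bilbao', 'Cartagena', 'Granada',
             'Madrid', 'Malaga', 'Murcia', 'Oviedo', 'Sevilla',
             'Valencia', 'Valladolid', 'Vigo', 'Zaragoza'],
    'Density': [6452.52, 15992.2, 8295.91, 383.77, 2654.41,
                5418.47, 1428.76, 513.98, 1180.29, 4896.55,
                5850.78, 1514.4, 2686.47, 682.84],
    'Asian Restaurants': [21, 100, 35, 5, 33, 100, 59, 30, 12, 42,
                          100, 16, 12, 44],
    'Korean Restaurants': [0, 27, 0, 0, 1, 22, 3, 0, 0, 3, 4, 0, 0, 0],
})

MIN_ASIAN = 20  # hypothetical threshold for "enough Asian restaurants"

# No Korean competition, enough Asian restaurants, densest cities first.
option_1 = (data[(data['Korean Restaurants'] == 0)
                 & (data['Asian Restaurants'] >= MIN_ASIAN)]
            .sort_values('Density', ascending=False))
print(option_1[['City', 'Density', 'Asian Restaurants']])
```

With this threshold the filter keeps Bilbao, A Coruña, Zaragoza and Murcia, in that order of density. A different threshold would change the list, so the cut-off should be treated as a judgment call rather than a result.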
  
## Option 2
The second choice could be to start a Korean restaurant in a place where there are already some Korean restaurants, but not that many.  
In that case the best option is Valencia. It doesn't have the best ratio (Granada is first), but we have to remember that the query for Asian restaurants reached the limit there, so its real ratio is lower than the computed one. Another option is Sevilla, as it has a similar density and number of Korean restaurants, but not as many Asian restaurants.

## Option 3
Opening a restaurant in Madrid or Barcelona. This is the riskiest option because there are already many Korean restaurants in those cities, and operating there is more expensive. But if the business plan includes a good, well-targeted marketing plan, it can be done and it can earn more money than in the other places. In that case I would recommend trying Barcelona.