# Data Extraction

The original Michelin restaurant data was extracted from Kaggle (https://www.kaggle.com/jackywang529/michelin-restaurants).

The data includes one-, two-, and three-star Michelin restaurants from the following regions:

Austria, California, Chicago, Croatia, Czech Republic, Denmark, Finland, Greece, Hong Kong, Hungary, Iceland, Macau, Norway, New York City, Poland, Ireland, Rio de Janeiro, Sao Paulo, South Korea, Singapore, Sweden, Taipei, Thailand, Washington DC, and the United Kingdom.

The following regions are not included in the dataset:

Belgium, France, Germany, Italy, Japan, Luxembourg, Netherlands, Portugal, China, Spain, and Switzerland.

#### Import Dependencies

In [17]:
# Dependencies
import pandas as pd
from sqlalchemy import create_engine
import requests
import json

# Google developer API key
from config import g_key

#### Load CSV Files into Pandas Dataframes

In [12]:
# Create file paths
one_star_csv = "Resources/data/one-star-michelin-restaurants.csv"
two_star_csv = "Resources/data/two-stars-michelin-restaurants.csv"
three_star_csv = "Resources/data/three-stars-michelin-restaurants.csv"

In [10]:
# Load one star restaurants
raw_one_star_df = pd.read_csv(one_star_csv)
raw_one_star_df.head()

| | name | year | latitude | longitude | city | region | zipCode | cuisine | price | url |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Kilian Stuba | 2019 | 47.34858 | 10.17114 | Kleinwalsertal | Austria | 87568 | Creative | $$$$$ | https://guide.michelin.com/at/en/vorarlberg/kl... |
| 1 | Pfefferschiff | 2019 | 47.83787 | 13.07917 | Hallwang | Austria | 5300 | Classic cuisine | $$$$$ | https://guide.michelin.com/at/en/salzburg-regi... |
| 2 | Esszimmer | 2019 | 47.80685 | 13.03409 | Salzburg | Austria | 5020 | Creative | $$$$$ | https://guide.michelin.com/at/en/salzburg-regi... |
| 3 | Carpe Diem | 2019 | 47.80001 | 13.04006 | Salzburg | Austria | 5020 | Market cuisine | $$$$$ | https://guide.michelin.com/at/en/salzburg-regi... |
| 4 | Edvard | 2019 | 48.216503 | 16.36852 | Wien | Austria | 1010 | Modern cuisine | $$$$ | https://guide.michelin.com/at/en/vienna/wien/r... |


In [13]:
# Load two star restaurants
raw_two_star_df = pd.read_csv(two_star_csv)
raw_two_star_df.head()

| | name | year | latitude | longitude | city | region | zipCode | cuisine | price | url |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SENNS.Restaurant | 2019 | 47.83636 | 13.06389 | Salzburg | Austria | 5020 | Creative | $$$$$ | https://guide.michelin.com/at/en/salzburg-regi... |
| 1 | Ikarus | 2019 | 47.79536 | 13.00695 | Salzburg | Austria | 5020 | Creative | $$$$$ | https://guide.michelin.com/at/en/salzburg-regi... |
| 2 | Mraz & Sohn | 2019 | 48.23129 | 16.37637 | Wien | Austria | 1200 | Creative | $$$$$ | https://guide.michelin.com/at/en/vienna/wien/r... |
| 3 | Konstantin Filippou | 2019 | 48.21056 | 16.37996 | Wien | Austria | 1010 | Modern cuisine | $$$$$ | https://guide.michelin.com/at/en/vienna/wien/r... |
| 4 | Silvio Nickol Gourmet Restaurant | 2019 | 48.20558 | 16.37693 | Wien | Austria | 1010 | Modern cuisine | $$$$$ | https://guide.michelin.com/at/en/vienna/wien/r... |


In [14]:
# Load three star restaurants
raw_three_star_df = pd.read_csv(three_star_csv)
raw_three_star_df.head()

| | name | year | latitude | longitude | city | region | zipCode | cuisine | price | url |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Amador | 2019 | 48.25406 | 16.35915 | Wien | Austria | 1190 | Creative | $$$$$ | https://guide.michelin.com/at/en/vienna/wien/r... |
| 1 | Manresa | 2019 | 37.22761 | -121.98071 | South San Francisco | California | 95030 | Contemporary | $$$$ | https://guide.michelin.com/us/en/california/so... |
| 2 | Benu | 2019 | 37.78521 | -122.39876 | San Francisco | California | 94105 | Asian | $$$$ | https://guide.michelin.com/us/en/california/sa... |
| 3 | Quince | 2019 | 37.79762 | -122.40337 | San Francisco | California | 94133 | Contemporary | $$$$ | https://guide.michelin.com/us/en/california/sa... |
| 4 | Atelier Crenn | 2019 | 37.79835 | -122.43586 | San Francisco | California | 94123 | Contemporary | $$$$ | https://guide.michelin.com/us/en/california/sa... |


In [16]:
# Row counts
print("In the raw data there are:")
print(f"{raw_one_star_df.shape[0]} rows for one star restaurants.")
print(f"{raw_two_star_df.shape[0]} rows for two star restaurants.")
print(f"{raw_three_star_df.shape[0]} rows for three star restaurants.")

In the raw data there are:
549 rows for one star restaurants.
110 rows for two star restaurants.
36 rows for three star restaurants.
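
The three files share the same columns, so they can also be loaded in one loop and stacked into a single DataFrame. The sketch below is one possible way to do that, not part of the original notebook: the `stars` column and the `combined_df` name are assumptions, and the in-memory CSV text is an abbreviated stand-in for the real files (one row each, taken from the previews above, with fewer columns).

```python
import io

import pandas as pd

# Abbreviated stand-ins for the three CSV files (one preview row each).
samples = {
    1: "name,year,latitude,longitude,city,region\n"
       "Kilian Stuba,2019,47.34858,10.17114,Kleinwalsertal,Austria\n",
    2: "name,year,latitude,longitude,city,region\n"
       "SENNS.Restaurant,2019,47.83636,13.06389,Salzburg,Austria\n",
    3: "name,year,latitude,longitude,city,region\n"
       "Amador,2019,48.25406,16.35915,Wien,Austria\n",
}

frames = []
for stars, text in samples.items():
    # With the real data, pass the file path instead of io.StringIO(text).
    df = pd.read_csv(io.StringIO(text))
    df["stars"] = stars  # hypothetical column recording the star rating
    frames.append(df)

# Stack the three frames and renumber the index from zero.
combined_df = pd.concat(frames, ignore_index=True)
print(combined_df[["name", "stars"]])
```

Keeping the rating in a column makes later filtering or grouping possible without tracking three separate DataFrames.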


#### Google API Calls

In [21]:
# Subset original data frames
one_star_hotel_df = raw_one_star_df[['name', 'latitude', 'longitude']]
two_star_hotel_df = raw_two_star_df[['name', 'latitude', 'longitude']]
three_star_hotel_df = raw_three_star_df[['name', 'latitude', 'longitude']]
one_star_hotel_df.head()

| | name | latitude | longitude |
|---|---|---|---|
| 0 | Kilian Stuba | 47.34858 | 10.17114 |
| 1 | Pfefferschiff | 47.83787 | 13.07917 |
| 2 | Esszimmer | 47.80685 | 13.03409 |
| 3 | Carpe Diem | 47.80001 | 13.04006 |
| 4 | Edvard | 48.216503 | 16.36852 |


In [23]:
# Set up url for Google Places API Calls
base_url = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"
params = {
    "location" : "",
    "rankby" : "distance",
    "type": "lodging",
    "key": g_key,
}
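
To see what a single request looks like once `location` is filled in, the query string can be built without touching the network. This is a minimal sketch using only the standard library. `build_nearby_url` and the `"YOUR_API_KEY"` placeholder are hypothetical names for illustration, and the coordinates are those of the first one-star row.

```python
from urllib.parse import urlencode

base_url = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"


def build_nearby_url(lat, lng, api_key):
    """Return the full Nearby Search URL for one coordinate pair."""
    params = {
        "location": f"{lat},{lng}",  # "latitude,longitude"
        "rankby": "distance",        # closest result first
        "type": "lodging",
        "key": api_key,
    }
    return f"{base_url}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder, not a working key.
url = build_nearby_url(47.34858, 10.17114, "YOUR_API_KEY")
print(url)
```

Printing the URL this way is a quick check that the parameters are what you expect before spending API quota.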

In [30]:
# Loop through rows for coordinates to make API call for one star
for index, row in one_star_hotel_df.iterrows():
    # The location parameter takes "latitude,longitude"
    params["location"] = f"{row.latitude},{row.longitude}"
    response = requests.get(base_url, params=params).json()
    try:
        # rankby=distance puts the closest lodging first
        one_star_hotel_df.loc[index, "Hotel_Name"] = response["results"][0]["name"]
    except (KeyError, IndexError):
        # No lodging came back for these coordinates; leave Hotel_Name empty
        print(f"No hotel found near {row['name']}. Skipping.")

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.obj[key] = _infer_fill_value(value)
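
The warning above is raised because the assignment targets a DataFrame that pandas considers a possible slice of another one. One common way to avoid it, shown here as a sketch rather than as what the notebook does, is to call `.copy()` when taking the subset. The `raw_df` and `hotel_df` names and the `"Example Hotel"` value are placeholders for illustration.

```python
import pandas as pd

# Small stand-in for one of the raw DataFrames.
raw_df = pd.DataFrame({
    "name": ["Kilian Stuba", "Pfefferschiff"],
    "year": [2019, 2019],
    "latitude": [47.34858, 47.83787],
    "longitude": [10.17114, 13.07917],
})

# .copy() makes the subset an independent DataFrame, so later
# assignments are unambiguous and leave raw_df untouched.
hotel_df = raw_df[["name", "latitude", "longitude"]].copy()

# "Example Hotel" is a placeholder, not a real API result.
hotel_df.loc[0, "Hotel_Name"] = "Example Hotel"
print(hotel_df)
```

Rows that are never assigned keep a missing value in `Hotel_Name`, which is easy to filter on afterwards.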


In [28]:
# Loop through rows for coordinates to make API call for two star
for index, row in two_star_hotel_df.iterrows():
    # The location parameter takes "latitude,longitude"
    params["location"] = f"{row.latitude},{row.longitude}"
    response = requests.get(base_url, params=params).json()
    try:
        # rankby=distance puts the closest lodging first
        two_star_hotel_df.loc[index, "Hotel_Name"] = response["results"][0]["name"]
    except (KeyError, IndexError):
        # No lodging came back for these coordinates; leave Hotel_Name empty
        print(f"No hotel found near {row['name']}. Skipping.")



In [29]:
two_star_hotel_df.head()

| | name | latitude | longitude | Hotel_Name |
|---|---|---|---|---|
| 0 | SENNS.Restaurant | 47.83636 | 13.06389 | Inovum |
| 1 | Ikarus | 47.79536 | 13.00695 | Kärntner Chalets |
| 2 | Mraz & Sohn | 48.23129 | 16.37637 | Wohnen im Kapellenhof |
| 3 | Konstantin Filippou | 48.21056 | 16.37996 | PuzzleHotel Apartments Postgasse |
| 4 | Silvio Nickol Gourmet Restaurant | 48.20558 | 16.37693 | Palais Coburg |
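
Each `Hotel_Name` above is the `name` of the first entry in the response's `results` list. A request can also come back with no results, in which case indexing `[0]` fails. The helper below is a sketch of a more defensive lookup. `first_result_name` is a hypothetical function, and the two payloads are hand-written stand-ins that only mimic the `status` and `results` fields of a response.

```python
def first_result_name(payload):
    """Return the name of the first result, or None if there is none."""
    if payload.get("status") != "OK":
        return None
    results = payload.get("results", [])
    return results[0].get("name") if results else None


# Hand-written stand-ins for parsed JSON responses.
ok_payload = {"status": "OK", "results": [{"name": "Palais Coburg"}]}
empty_payload = {"status": "ZERO_RESULTS", "results": []}

print(first_result_name(ok_payload))
print(first_result_name(empty_payload))
```

Returning `None` instead of raising lets the calling loop continue and leaves the gap visible in the final DataFrame.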


In [26]:
# Loop through rows for coordinates to make API call for three star
for index, row in three_star_hotel_df.iterrows():
    # The location parameter takes "latitude,longitude"
    params["location"] = f"{row.latitude},{row.longitude}"
    response = requests.get(base_url, params=params).json()
    try:
        # rankby=distance puts the closest lodging first
        three_star_hotel_df.loc[index, "Hotel_Name"] = response["results"][0]["name"]
    except (KeyError, IndexError):
        # No lodging came back for these coordinates; leave Hotel_Name empty
        print(f"No hotel found near {row['name']}. Skipping.")

In [27]:
three_star_hotel_df.head()

| | name | latitude | longitude | Hotel_Name |
|---|---|---|---|---|
| 0 | Amador | 48.25406 | 16.35915 | Zum Alten Stadttor |
| 1 | Manresa | 37.22761 | -121.98071 | Inspired Leadership Group |
| 2 | Benu | 37.78521 | -122.39876 | W San Francisco |
| 3 | Quince | 37.79762 | -122.40337 | SnapTravel |
| 4 | Atelier Crenn | 37.79835 | -122.43586 | The gula place |
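
The three loops differ only in the DataFrame they iterate over, so the lookup could be factored into one function. The sketch below assumes a callable that maps a coordinate pair to a hotel name. `add_nearest_hotel` and `fake_fetch` are hypothetical names, and `fake_fetch` is an offline stand-in for the Google Places request so the example runs without an API key.

```python
import pandas as pd


def add_nearest_hotel(df, fetch_hotel_name):
    """Return name and coordinates from df plus a Hotel_Name column.

    fetch_hotel_name(lat, lng) is any callable returning a name or None.
    """
    out = df[["name", "latitude", "longitude"]].copy()
    out["Hotel_Name"] = [
        fetch_hotel_name(lat, lng)
        for lat, lng in zip(out["latitude"], out["longitude"])
    ]
    return out


def fake_fetch(lat, lng):
    """Offline stand-in for the API call, for illustration only."""
    return f"lodging near {lat},{lng}"


sample_df = pd.DataFrame({
    "name": ["Amador", "Benu"],
    "latitude": [48.25406, 37.78521],
    "longitude": [16.35915, -122.39876],
})
result_df = add_nearest_hotel(sample_df, fake_fetch)
print(result_df)
```

Passing the lookup in as an argument keeps the DataFrame logic testable without network access. In the notebook, the callable would wrap the `requests.get` call.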
