# Michelin Rated Restaurants

In [1]:
import pandas as pd

## Initial Exploration & Preparation of Data

In [2]:
michelin = pd.read_csv("../data/Michelin/michelin_data_2024.csv")
michelin.head(10)

Unnamed: 0,Name,Address,Location,Price,Cuisine,Longitude,Latitude,PhoneNumber,Url,WebsiteUrl,Award,FacilitiesAndServices,Description
0,Taian,"1-21-2 Shimanouchi, Chuo-ku, Osaka, 542-0082, ...","Osaka, Japan",¥¥¥,Japanese,,,81661200000.0,https://guide.michelin.com/en/osaka-region/osa...,,3 Stars,"Air conditioning,Counter dining",Cuisine that does not change does not improve....
1,Hyotei,"35 Nanzenji Kusagawacho, Sakyo-ku, Kyoto, 606-...","Kyoto, Japan",¥¥¥¥,Japanese,135.786742,35.011355,81757710000.0,https://guide.michelin.com/en/kyoto-region/kyo...,http://hyotei.co.jp/,3 Stars,"Air conditioning,Car park,Shoes must be removed",This traditional restaurant began its 450-year...
2,HAJIME,"1-9-11 Edobori, Nishi-ku, Osaka, 550-0002, Japan","Osaka, Japan",¥¥¥¥,Innovative,135.496084,34.688612,81664480000.0,https://guide.michelin.com/en/osaka-region/osa...,http://www.hajime-artistes.com/,3 Stars,"Air conditioning,Interesting wine list,Restaur...",An artwork resembling a planet dominates the d...
3,Kikunoi Honten,"459 Shimokawaracho, Higashiyama-ku, Kyoto, 605...","Kyoto, Japan",¥¥¥¥,Japanese,135.782079,35.001535,81755610000.0,https://guide.michelin.com/en/kyoto-region/kyo...,https://kikunoi.jp/,3 Stars,"Air conditioning,Car park,Shoes must be removed",The elegant garden and well-appointed ceremoni...
4,Isshisoden Nakamura,"136 Matsushitacho, Nakagyo-ku, Kyoto, 604-8093...","Kyoto, Japan",¥¥¥¥,Japanese,135.765083,35.010397,81752220000.0,https://guide.michelin.com/en/kyoto-region/kyo...,http://www.kyoryori-nakamura.com/,3 Stars,"Air conditioning,Shoes must be removed",Isshisoden Nakamura started out as a business ...
5,Mizai,"613 Maruyamacho, Higashiyama-ku, Kyoto, 605-00...","Kyoto, Japan",¥¥¥¥,Japanese,135.782364,35.002993,81755510000.0,https://guide.michelin.com/en/kyoto-region/kyo...,https://mizai.jp/,3 Stars,"Air conditioning,Cash only,Counter dining,Nota...","‘Mizai’ means ‘not there yet’, the creed by wh..."
6,Gion Sasaki,"566-27 Komatsucho, Higashiyama-ku, Kyoto, 605-...","Kyoto, Japan",¥¥¥¥,Japanese,135.775049,34.998881,81755520000.0,https://guide.michelin.com/en/kyoto-region/kyo...,http://gionsasaki.com/,3 Stars,"Air conditioning,Counter dining,Shoes must be ...","Gion Sasaki is tackling new challenges, where ..."
7,Kashiwaya Osaka Senriyama,"2-5-18 Senriyamanishi, Suita, Osaka, 565-0851,...","Osaka, Japan",¥¥¥,Japanese,135.501289,34.770287,81663860000.0,https://guide.michelin.com/en/osaka-region/osa...,https://jp-kashiwaya.com/,3 Stars,"Air conditioning,Interesting wine list,Notable...","The interior, ceremonial space and dining-ware..."
8,ES:SENZ,"Mietenkamer Straße 65, Grassau, 83224, Germany","Grassau, Germany",€€€€,"Creative, Modern Cuisine",12.465618,47.78563,498641400000.0,https://guide.michelin.com/en/bayern/grassau/r...,https://www.das-achental.com/,3 Stars,"Air conditioning,Car park,Interesting wine list","Here in the restaurant of Das Achental hotel, ..."
9,Waldhotel Sonnora,"Auf'm Eichelfeld 1, Dreis, 54518, Germany","Dreis, Germany",€€€€,Classic French,6.810934,49.937732,49657900000.0,https://guide.michelin.com/en/rheinland-pfalz/...,https://www.hotel-sonnora.de/,3 Stars,"Car park,Garden or park,Interesting wine list",This legendary fine dining establishment has a...


In [3]:
print(f"Columns:\n{michelin.columns.to_list()}")

Columns:
['Name', 'Address', 'Location', 'Price', 'Cuisine', 'Longitude', 'Latitude', 'PhoneNumber', 'Url', 'WebsiteUrl', 'Award', 'FacilitiesAndServices', 'Description']


We aim to plot the restaurant coordinates on a map, look for a correlation with population density, and compare the UK with France.

We drop `Url`, `PhoneNumber`, `FacilitiesAndServices` and `Description`. `Url` (the Michelin Guide page) could perhaps become useful later.

In [4]:
# Take an explicit copy so later in-place edits do not act on a view of the original frame
michelin = michelin[['Name', 'Address', 'Location', 'Price', 'Cuisine', 'WebsiteUrl', 'Award', 'Longitude', 'Latitude']].copy()
michelin.head()

Unnamed: 0,Name,Address,Location,Price,Cuisine,WebsiteUrl,Award,Longitude,Latitude
0,Taian,"1-21-2 Shimanouchi, Chuo-ku, Osaka, 542-0082, ...","Osaka, Japan",¥¥¥,Japanese,,3 Stars,,
1,Hyotei,"35 Nanzenji Kusagawacho, Sakyo-ku, Kyoto, 606-...","Kyoto, Japan",¥¥¥¥,Japanese,http://hyotei.co.jp/,3 Stars,135.786742,35.011355
2,HAJIME,"1-9-11 Edobori, Nishi-ku, Osaka, 550-0002, Japan","Osaka, Japan",¥¥¥¥,Innovative,http://www.hajime-artistes.com/,3 Stars,135.496084,34.688612
3,Kikunoi Honten,"459 Shimokawaracho, Higashiyama-ku, Kyoto, 605...","Kyoto, Japan",¥¥¥¥,Japanese,https://kikunoi.jp/,3 Stars,135.782079,35.001535
4,Isshisoden Nakamura,"136 Matsushitacho, Nakagyo-ku, Kyoto, 604-8093...","Kyoto, Japan",¥¥¥¥,Japanese,http://www.kyoryori-nakamura.com/,3 Stars,135.765083,35.010397


In [5]:
# Columns are converted to lowercase for convenience
michelin.columns = michelin.columns.str.lower()

In [6]:
michelin.rename({'websiteurl': 'url'}, axis=1, inplace=True)
print(f"Columns:\n{michelin.columns.tolist()}")

Columns:
['name', 'address', 'location', 'price', 'cuisine', 'url', 'award', 'longitude', 'latitude']


In [7]:
michelin.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6700 entries, 0 to 6699
Data columns (total 9 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   name       6700 non-null   object 
 1   address    6700 non-null   object 
 2   location   6700 non-null   object 
 3   price      6698 non-null   object 
 4   cuisine    6700 non-null   object 
 5   url        5515 non-null   object 
 6   award      6700 non-null   object 
 7   longitude  6363 non-null   float64
 8   latitude   6363 non-null   float64
dtypes: float64(2), object(7)
memory usage: 471.2+ KB


There are missing values in `price`, `url`, `longitude` and `latitude`; these will be dealt with once the data is partitioned by location.
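As a rough sketch of how the missing values could be counted before partitioning, the snippet below uses a small hypothetical frame (`sample`, with made-up rows) rather than the real `michelin` data. It counts gaps per column and then rows without coordinates per location.

```python
import pandas as pd

# Hypothetical miniature of the `michelin` frame, for illustration only
sample = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "location": ["Osaka, Japan", "Kyoto, Japan", "Paris, France", "Lyon, France"],
    "price": ["¥¥¥", None, "€€€€", "€€"],
    "longitude": [None, 135.78, 2.34, None],
    "latitude": [None, 35.01, 48.86, None],
})

# Number of missing values in each column
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Rows lacking either coordinate, counted per location
missing_coords = sample["longitude"].isna() | sample["latitude"].isna()
missing_by_location = missing_coords.groupby(sample["location"]).sum()
print(missing_by_location)
```

The same two steps should carry over to the full frame once a `country` column exists to group on.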

Range of values for specific columns

In [8]:
# Columns skipped in the summary below
ignore = ['longitude', 'latitude', 'url', 'cuisine']

for column in michelin:
    if column in ignore:
        continue
    unique_values = michelin[column].unique()
    print(f"\nUnique {column}s: {unique_values}\nTotal Unique: {len(unique_values)} values")


Unique names: ['Taian' 'Hyotei' 'HAJIME' ... 'A Casa do Porco' 'Bistrot de Paris'
 'AE! Café & Cozinha']
Total Unique: 6571 values

Unique addresss: ['1-21-2 Shimanouchi, Chuo-ku, Osaka, 542-0082, Japan'
 '35 Nanzenji Kusagawacho, Sakyo-ku, Kyoto, 606-8437, Japan'
 '1-9-11 Edobori, Nishi-ku, Osaka, 550-0002, Japan' ...
 'Rua Cotoxó 493, São Paulo, 05021-000, Brazil'
 'Rua Araujo 124, São Paulo, 01220-020, Brazil'
 'Rua Áurea 285, São Paulo, 04015-070, Brazil']
Total Unique: 6587 values

Unique locations: ['Osaka, Japan' 'Kyoto, Japan' 'Grassau, Germany' ... 'Velp, Netherlands'
 'Veeningen, Netherlands' 'Meerssen, Netherlands']
Total Unique: 2612 values

Unique prices: ['¥¥¥' '¥¥¥¥' '€€€€' '$$$$' '$$$' '₩₩₩₩' '££££' '$$' '€€€' '฿฿฿฿' '¥¥'
 '₺₺₺₺' '€€' '₩₩' '₩₩₩' '£££' '££' '฿฿฿' '฿฿' '₺₺₺' '₺₺' '$' '₫₫' '₫₫₫₫'
 '¥' '€' '₩' '£' '฿' '₺' nan '₫']
Total Unique: 32 values

Unique awards: ['3 Stars' '2 Stars' '1 Star' 'Bib Gourmand']
Total Unique: 4 values


----
&nbsp;
## Separate `location` column

- `country` column, e.g. 'USA'
- `city` column, e.g. 'San Francisco'

`location` is comma-separated in the form "city, country"

In [9]:
# Are there non-comma separated entries in `location`?
no_comma = michelin[~michelin['location'].str.contains(',')]
no_comma['location'].unique().tolist()

['Hong Kong', 'Macau', 'Singapore', 'Dubai', 'Luxembourg', 'Abu Dhabi']

These are city-states, special administrative regions or emirates, plus Luxembourg, whose capital shares the country's name.

In [10]:
# Create a dictionary for special cases
special_cases = {'Hong Kong': 'Hong Kong, Hong Kong SAR China',
                 'Macau': 'Macau, Macau SAR China',
                 'Singapore': 'Singapore, Singapore',
                 'Dubai': 'Dubai, United Arab Emirates',
                 'Luxembourg': 'Luxembourg, Luxembourg',
                 'Abu Dhabi': 'Abu Dhabi, United Arab Emirates'}

In [11]:
# Apply special cases to the 'location' column
michelin['location'] = michelin['location'].replace(special_cases)

# Now split the 'location' column
locations = michelin['location'].str.split(',', expand=True)
locations.columns = ['city', 'country']

# Remove leading or trailing whitespace from 'city' and 'country' columns
locations['city'] = locations['city'].str.strip()
locations['country'] = locations['country'].str.strip()

# Replace the original 'location' column with the new 'country' and 'city' columns
michelin = michelin.drop('location', axis=1).join(locations)
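A plain `split(',')` assumes every location contains exactly one comma. As a hedged alternative, the sketch below splits only at the last comma with `str.rsplit(..., n=1)`, so a city name that itself contains a comma would still yield two columns. The sample values, including the multi-comma entry, are hypothetical and not taken from the dataset.

```python
import pandas as pd

# Hypothetical location strings; the last one is invented to show a multi-comma case
locations = pd.Series(["Osaka, Japan", "Hong Kong", "Dubai", "Washington, D.C., USA"])
special_cases = {"Hong Kong": "Hong Kong, Hong Kong SAR China",
                 "Dubai": "Dubai, United Arab Emirates"}

# Expand the comma-less entries first, as in the cell above
normalised = locations.replace(special_cases)

# Split at the LAST comma only, so the result always has at most two columns
parts = normalised.str.rsplit(",", n=1, expand=True)
parts.columns = ["city", "country"]
parts = parts.apply(lambda col: col.str.strip())

# Any location that still had no comma would show up here with a missing country
unsplit = parts[parts["country"].isna()]
print(parts)
```

Checking `unsplit` is a cheap way to confirm that the special-case dictionary covered every comma-less location.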

In [12]:
print(f"New Columns: {michelin.columns.tolist()}")

New Columns: ['name', 'address', 'price', 'cuisine', 'url', 'award', 'longitude', 'latitude', 'city', 'country']


In [13]:
print(f"Unique Countries: {michelin['country'].unique()}"
      f"\nTotal Unique = {len(michelin['country'].unique())} values")

Unique Countries: ['Japan' 'Germany' 'France' 'Hong Kong SAR China' 'Macau SAR China'
 'Belgium' 'South Korea' 'United Kingdom' 'Spain' 'China Mainland' 'Italy'
 'USA' 'Switzerland' 'Slovenia' 'Taiwan' 'Singapore' 'Denmark' 'Norway'
 'Sweden' 'Netherlands' 'Austria' 'Malta' 'Portugal' 'Luxembourg'
 'Ireland' 'Thailand' 'Argentina' 'Malaysia' 'Türkiye' 'Greece' 'Canada'
 'Hungary' 'Finland' 'Estonia' 'United Arab Emirates' 'Poland' 'Brazil'
 'Andorra' 'Latvia' 'Croatia' 'Iceland' 'Vietnam' 'Czech Republic'
 'Serbia']
Total Unique = 44 values


In [14]:
print(f"Unique Cities: {michelin['city'].unique()}"
      f"\nTotal Unique = {len(michelin['city'].unique())} values")

Unique Cities: ['Osaka' 'Kyoto' 'Grassau' ... 'Velp' 'Veeningen' 'Meerssen']
Total Unique = 2609 values


In [15]:
michelin = michelin[['name', 'address', 'city', 'country', 'price', 'cuisine', 'url', 'award', 'longitude', 'latitude']]
michelin.head()

Unnamed: 0,name,address,city,country,price,cuisine,url,award,longitude,latitude
0,Taian,"1-21-2 Shimanouchi, Chuo-ku, Osaka, 542-0082, ...",Osaka,Japan,¥¥¥,Japanese,,3 Stars,,
1,Hyotei,"35 Nanzenji Kusagawacho, Sakyo-ku, Kyoto, 606-...",Kyoto,Japan,¥¥¥¥,Japanese,http://hyotei.co.jp/,3 Stars,135.786742,35.011355
2,HAJIME,"1-9-11 Edobori, Nishi-ku, Osaka, 550-0002, Japan",Osaka,Japan,¥¥¥¥,Innovative,http://www.hajime-artistes.com/,3 Stars,135.496084,34.688612
3,Kikunoi Honten,"459 Shimokawaracho, Higashiyama-ku, Kyoto, 605...",Kyoto,Japan,¥¥¥¥,Japanese,https://kikunoi.jp/,3 Stars,135.782079,35.001535
4,Isshisoden Nakamura,"136 Matsushitacho, Nakagyo-ku, Kyoto, 604-8093...",Kyoto,Japan,¥¥¥¥,Japanese,http://www.kyoryori-nakamura.com/,3 Stars,135.765083,35.010397


----

## `award` column

The Michelin dataset has a column named `award` that records the level of recognition a restaurant has achieved under Michelin's rating system. The values are '3 Stars', '2 Stars', '1 Star', and 'Bib Gourmand', a separate award for good-quality, good-value restaurants.

To make the analysis more tractable and to work on a uniform scale, we transform these awards into numerical values stored in a new `stars` column. This allows quantitative analysis that would not be possible with the original text labels.

The '3 Stars', '2 Stars', and '1 Star' awards map directly to the numerical values 3, 2, and 1, respectively. The 'Bib Gourmand' award does not fit directly into this star system. In consideration of the prestige and value attached to this award, we map 'Bib Gourmand' to the value 0.5. This choice is somewhat arbitrary: the 'Bib Gourmand' recognizes a different aspect of restaurant quality and is not strictly comparable to the star awards.

In [16]:
award_dict = {'3 Stars': 3, '2 Stars': 2, '1 Star': 1, 'Bib Gourmand': 0.5}
# map() returns numeric values directly; any label missing from award_dict would become NaN
michelin['stars'] = michelin['award'].map(award_dict)
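One way to guard a label-to-number mapping like this is to check that every label is covered. The sketch below uses `Series.map`, which leaves `NaN` for any label absent from the dictionary. The `awards` series is hypothetical, and 'Green Star' is included only as an example of a label the dictionary does not handle.

```python
import pandas as pd

award_dict = {"3 Stars": 3, "2 Stars": 2, "1 Star": 1, "Bib Gourmand": 0.5}

# Hypothetical labels; 'Green Star' is deliberately absent from award_dict
awards = pd.Series(["3 Stars", "Bib Gourmand", "1 Star", "Green Star"])

# Labels not found in the dictionary become NaN
stars = awards.map(award_dict)

# Collect the labels that were not mapped
unmapped = awards[stars.isna()].unique().tolist()
print(unmapped)
```

An empty `unmapped` list would indicate that the dictionary covers every award present in the data.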

In [17]:
cols = michelin.columns.tolist()

cols.remove('stars')
# insert 'stars' at the desired position next to 'award' which we retain
cols.insert(-2, 'stars')

# reindex the DataFrame
michelin = michelin.reindex(columns=cols)

In [18]:
michelin.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6700 entries, 0 to 6699
Data columns (total 11 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   name       6700 non-null   object 
 1   address    6700 non-null   object 
 2   city       6700 non-null   object 
 3   country    6700 non-null   object 
 4   price      6698 non-null   object 
 5   cuisine    6700 non-null   object 
 6   url        5515 non-null   object 
 7   award      6700 non-null   object 
 8   stars      6700 non-null   float64
 9   longitude  6363 non-null   float64
 10  latitude   6363 non-null   float64
dtypes: float64(3), object(8)
memory usage: 575.9+ KB


----
&nbsp;
## `price` column

The `price` column also needs organising, as it lists several different currencies. It is unclear whether, for example, $ refers to USD, HKD, etc.

This attribute is easier to handle country by country.
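As a possible starting point, the sketch below separates each price string into a currency symbol and a symbol count. It assumes, without confirmation from the data source, that the number of repeated symbols encodes a relative price tier within a country. The names `price_level` and `currency_symbol` are hypothetical, and tiers are not meant to be compared across currencies.

```python
import pandas as pd

# Hypothetical price strings in the same style as the dataset, including a missing value
prices = pd.Series(["¥¥¥", "€€€€", "$", None, "££"])

# Number of repeated symbols, assumed here to be the price tier; missing stays NaN
price_level = prices.str.len()

# First character, i.e. the currency symbol; missing stays NaN
currency_symbol = prices.str[0]

print(pd.DataFrame({"price": prices, "symbol": currency_symbol, "level": price_level}))
```

The symbol alone does not resolve the ambiguity noted above ($ may denote several currencies), which is why pairing it with the country is likely still needed.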

----
&nbsp;
## Export `UK` and `France` datasets for further analysis

In [19]:
# Filter the DataFrame for records where country == 'United Kingdom'
uk_data = michelin[michelin['country'] == 'United Kingdom']
uk_data.head()

Unnamed: 0,name,address,city,country,price,cuisine,url,award,stars,longitude,latitude
60,L'Enclume,"Cavendish Street, Cartmel, LA11 6QA, United Ki...",Cartmel,United Kingdom,££££,Creative British,https://www.lenclume.co.uk/,3 Stars,3.0,,
61,Hélène Darroze at The Connaught,"Carlos Place, Mayfair, London, W1K 2AL, United...",London,United Kingdom,££££,Modern Cuisine,https://www.the-connaught.co.uk/restaurants-ba...,3 Stars,3.0,-0.14929,51.510188
62,Alain Ducasse at The Dorchester,"Park Lane, Mayfair, London, W1K 1QA, United Ki...",London,United Kingdom,££££,French,https://www.alainducasse-dorchester.com/,3 Stars,3.0,-0.152575,51.507338
63,Fat Duck,"High Street, Bray, SL6 2AQ, United Kingdom",Bray,United Kingdom,££££,Creative,https://thefatduck.co.uk/,3 Stars,3.0,-0.701753,51.507858
64,"Sketch, The Lecture Room and Library","9 Conduit Street, Mayfair, London, W1S 2XG, Un...",London,United Kingdom,££££,Modern French,https://sketch.london/,3 Stars,3.0,-0.141537,51.512678


In [20]:
# Export the UK data to a csv file
uk_data.to_csv('../data/UK/uk_data_2024.csv', index=False)

In [21]:
# Filter the DataFrame for records where country == 'France'
france_data = michelin[michelin['country'] == 'France']
france_data.head()

Unnamed: 0,name,address,city,country,price,cuisine,url,award,stars,longitude,latitude
18,La Table du Castellet,"3001 route des Hauts-du-Camp, au Circuit Paul ...",Le Castellet,France,€€€€,Creative,http://www.hotelducastellet.net/fr/restaurants...,3 Stars,3.0,5.783887,43.249929
19,Plénitude - Cheval Blanc Paris,"8 quai du Louvre, Paris, 75001, France",Paris,France,€€€€,Creative,https://www.chevalblanc.com/fr/maison/paris/,3 Stars,3.0,2.342159,48.858815
20,Le Petit Nice,"Anse de Maldormé, Marseille, 13007, France",Marseille,France,€€€€,Seafood,https://www.passedat.fr,3 Stars,3.0,,
21,Mirazur,"30 avenue Aristide-Briand, Menton, 06500, France",Menton,France,€€€€,Creative,https://www.mirazur.fr/,3 Stars,3.0,7.528051,43.78593
22,AM par Alexandre Mazzia,"9 rue François-Rocca, Marseille, 13008, France",Marseille,France,€€€€,Creative,https://www.alexandre-mazzia.com/,3 Stars,3.0,5.386233,43.27011


In [22]:
# Define and export Monaco
monaco = france_data[france_data['city'] == 'Monaco']
monaco.to_csv('../data/France/monaco_2024.csv', index=False)

Monaco is listed under France in the data, so we remove it from the metropolitan France set.

In [23]:
france_data = france_data[france_data['city'] != 'Monaco']
france_data.shape

(1017, 11)

In [24]:
# Export the France data to a csv file
france_data.to_csv('../data/France/france_master_2024.csv', index=False)

----
&nbsp;
## Restaurants could be further partitioned by country from this point
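A minimal sketch of such a partition, assuming a frame with a `country` column like the one built above. The `sample` frame is hypothetical, and the commented-out export path is only a placeholder, not a location used elsewhere in this notebook.

```python
import pandas as pd

# Hypothetical miniature of the cleaned frame
sample = pd.DataFrame({
    "name": ["A", "B", "C"],
    "country": ["Japan", "France", "Japan"],
    "stars": [3.0, 1.0, 0.5],
})

# One DataFrame per country, keyed by country name
by_country = {country: frame.reset_index(drop=True)
              for country, frame in sample.groupby("country")}

for country, frame in by_country.items():
    print(country, len(frame))
    # frame.to_csv(f"{country}.csv", index=False)  # placeholder path, left disabled
```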