# **Data Visualization**

# Objectives
Explore and visualize restaurant data across 31 European cities to uncover patterns and insights related to cuisine, ratings, rankings, pricing, and customer engagement. Through interactive visualizations and descriptive analysis, we aim to:

- Understand the distribution and popularity of different cuisine styles across cities.

- Analyze the relationship between restaurant rankings, ratings, and number of reviews.

- Examine how price ranges relate to ratings, cuisine, and city.

These insights can be useful for tourists, restaurant owners, and analysts seeking to better understand the European restaurant landscape.

# Input
* The input can be found [here](../data_set/processed/TA_restaurants_cleaned.csv)
* This is a CSV file containing the cleaned data produced by the ETL process.

# Outputs

- All visualizations are saved as PNG files in a dedicated folder, which can be found [here](../Images).

---

# Change working directory
Change the working directory from the notebook's folder to its parent folder, because the notebooks are stored in a subfolder of the project.
* os.getcwd() returns the current directory

In [1]:
import os
current_dir = os.getcwd()
current_dir

'c:\\Users\\amron\\Desktop\\euro-dine-insights\\jupyter_notebooks'

Make the parent of the current directory the new current directory.
* os.path.dirname() gets the parent directory
* os.chdir() sets the new current directory

In [2]:
os.chdir(os.path.dirname(current_dir))
print("You set a new current directory")

You set a new current directory


Confirm the new current directory

In [3]:
current_dir = os.getcwd()
current_dir

'c:\\Users\\amron\\Desktop\\euro-dine-insights'
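One caveat with the cells above: running the `os.chdir` cell a second time moves up one more level, out of the project folder. A minimal sketch of a guard against that, assuming the notebooks folder is named `jupyter_notebooks` as in the output above (the helper name `move_to_project_root` is hypothetical):

```python
import os
from pathlib import Path


def move_to_project_root(notebook_dir_name: str = "jupyter_notebooks") -> Path:
    """Move to the parent folder only while still inside the notebooks folder.

    The default folder name is an assumption taken from the path printed
    by the notebook; adjust it if the project layout differs.
    """
    cwd = Path.cwd()
    if cwd.name == notebook_dir_name:
        # Still inside the notebooks subfolder, so step up one level.
        os.chdir(cwd.parent)
    # Otherwise leave the working directory untouched.
    return Path.cwd()
```

Calling `move_to_project_root()` repeatedly should leave the working directory at the project root, so the cell can be re-run safely.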

Define the paths to the raw and processed data folders

In [4]:
# Path to the raw data directory
raw_data_dir = os.path.join(current_dir, 'data_set', 'raw')

# Path to the processed data directory
processed_data_dir = os.path.join(current_dir, 'data_set', 'processed')
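Because these paths are assembled by hand, a quick existence check can catch a wrong working directory before any file is read. The following is a sketch only; the helper names `dataset_dirs` and `missing_dirs` are hypothetical, and the folder layout (`data_set/raw`, `data_set/processed`) is taken from the cell above:

```python
import os


def dataset_dirs(project_root: str) -> dict:
    """Build the raw and processed data paths under the project root."""
    return {
        "raw": os.path.join(project_root, "data_set", "raw"),
        "processed": os.path.join(project_root, "data_set", "processed"),
    }


def missing_dirs(dirs: dict) -> list:
    """Return the names of any expected folders that do not exist on disk."""
    return [name for name, path in dirs.items() if not os.path.isdir(path)]
```

For example, `missing_dirs(dataset_dirs(os.getcwd()))` should return an empty list when the notebook is running from the project root and both folders are present.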


---

# Import packages

In [5]:
import pandas as pd # Import pandas
import matplotlib.pyplot as plt # Import matplotlib
import seaborn as sns # Import seaborn
import plotly.express as px # Import plotly
sns.set_style('whitegrid') # Set style for visuals

---

# Load the cleaned dataset

In [7]:
#load the cleaned dataset
df = pd.read_csv(os.path.join(processed_data_dir, 'TA_restaurants_cleaned.csv'))

#display first 5 rows of data
df.head() 

| Unnamed: 0 | Name | City | Cuisine | Ranking | Rating | Price_Range | Number_of_Reviews | Cuisine_Counts | Country_Name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Martine of Martine's Table | Amsterdam | ['French', 'Dutch', 'European'] | 1.0 | 5.0 | `$$ - $$$` | 136 | 3 | Netherlands |
| 1 | De Silveren Spiegel | Amsterdam | ['Dutch', 'European', 'Vegetarian Friendly', '... | 2.0 | 4.5 | `$$$$` | 812 | 4 | Netherlands |
| 2 | La Rive | Amsterdam | ['Mediterranean', 'French', 'International', '... | 3.0 | 4.5 | `$$$$` | 567 | 6 | Netherlands |
| 3 | Vinkeles | Amsterdam | ['French', 'European', 'International', 'Conte... | 4.0 | 5.0 | `$$$$` | 564 | 7 | Netherlands |
| 4 | Librije's Zusje Amsterdam | Amsterdam | ['Dutch', 'European', 'International', 'Vegeta... | 5.0 | 4.5 | `$$$$` | 316 | 6 | Netherlands |
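The `Unnamed: 0` column in this preview suggests that the DataFrame index was written to the CSV during the ETL step, and the `Cuisine` values appear to be string representations of Python lists rather than real lists. A minimal sketch of one way to handle both at load time, assuming those two readings are correct; it uses a one-row in-memory sample (copied from the first row above, with a reduced set of columns) instead of the real file:

```python
import ast
import io

import pandas as pd

# One-row stand-in for TA_restaurants_cleaned.csv (assumption: the real file
# has an unnamed first column holding the saved index).
sample_csv = io.StringIO(
    """,Name,City,Cuisine,Rating
0,Martine of Martine's Table,Amsterdam,"['French', 'Dutch', 'European']",5.0
"""
)

sample = pd.read_csv(
    sample_csv,
    index_col=0,  # treat the saved index as the index, not a data column
    converters={"Cuisine": ast.literal_eval},  # "['French', ...]" -> list
)

print(sample.columns.tolist())
print(type(sample.loc[0, "Cuisine"]))
```

With real lists in `Cuisine`, a later step such as `DataFrame.explode("Cuisine")` could produce one row per cuisine style for counting. Whether that fits depends on how the rest of the notebook uses the column.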


---