##  <p align="center" style="background-color:#16A085; font-family:newtimeroman; color:#FFF9ED; font-size:120%; border-radius:10px 10px;"> Description </p>



# Restaurant Dashboard
- Type of Challenge: `Learning`
- Duration: `8 days`
- Development Deadline: `03/05/2023 4:30 PM`
- Repo Deadline: `12/05/2023 4:00 PM`
- Challenge: Individual (or Team)

![restaurant_food](https://media.giphy.com/media/7JzHsh3UTip20/giphy.gif)


## Mission objectives

- Be able to use data visualization libraries `matplotlib`, `seaborn`, or data tools like PowerBI to explore the data.
- Be able to clean a dataset for analysis.
- Be able to use colors in visualizations correctly.
- Be able to establish conclusions about a dataset.
- Be able to find and answer creative questions about data.
- Be able to think outside the box.
- Be able to create a dashboard containing visualizations that bring business insights to the client.


## The Mission

You are a data analysis consultant at a European travel agency. Your mission is to help the company find insights in their data that will help them grow their business.

To do so, you will create a dashboard! What is important to include in the dashboard? Ideally, this dashboard would help travel agents recommend the best food destinations to travellers for their trips across Europe.

As a starting point, they provide you with data they scraped from TripAdvisor, a popular travel website.

Dataset: [TripAdvisor Restaurants Info for 31 Euro-Cities](https://www.kaggle.com/datasets/damienbeneschi/krakow-ta-restaurans-data-raw)

##  <p align="center" style="background-color:#16A085; font-family:newtimeroman; color:#FFF9ED; font-size:120%; border-radius:10px 10px;"> Imports </p>



In [90]:
import pandas as pd 
import numpy as np 
from skimpy import clean_columns

In [91]:
df = pd.read_csv('../TripAdvisor_Restaurants_31Cities/Assests/TA_restaurants_curated.csv', index_col=None)

In [92]:
df.drop(columns='Unnamed: 0', inplace= True)
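
An `Unnamed: 0` column like the one dropped above is most likely a row index that was saved along with the data. As a hedged alternative, passing `index_col=0` to `pd.read_csv` uses that first column as the index, so nothing has to be dropped afterwards. The sketch below runs on a tiny in-memory CSV (`sample_csv` and `sample_df` are illustrative names and the rows are a cut-down stand-in, not the real file):

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the real CSV (illustrative rows only).
sample_csv = io.StringIO(
    ",Name,City,Rating\n"
    "0,Martine of Martine's Table,Amsterdam,5.0\n"
    "1,De Silveren Spiegel,Amsterdam,4.5\n"
)

# index_col=0 uses the first (unnamed) column as the index,
# so no "Unnamed: 0" column is created.
sample_df = pd.read_csv(sample_csv, index_col=0)
print(sample_df.columns.tolist())
```

Whether this fits the real file depends on the first column really being a plain row counter, which is an assumption here.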

##  <p align="center" style="background-color:#16A085; font-family:newtimeroman; color:#FFF9ED; font-size:120%; border-radius:10px 10px;"> Data Preprocessing </p>

In [93]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 125527 entries, 0 to 125526
Data columns (total 10 columns):
 #   Column             Non-Null Count   Dtype  
---  ------             --------------   -----  
 0   Name               125527 non-null  object 
 1   City               125527 non-null  object 
 2   Cuisine Style      94176 non-null   object 
 3   Ranking            115876 non-null  float64
 4   Rating             115897 non-null  float64
 5   Price Range        77672 non-null   object 
 6   Number of Reviews  108183 non-null  float64
 7   Reviews            115911 non-null  object 
 8   URL_TA             125527 non-null  object 
 9   ID_TA              125527 non-null  object 
dtypes: float64(3), object(7)
memory usage: 9.6+ MB


In [99]:
df.head(7)

| | Name | City | Cuisine Style | Ranking | Rating | Price Range | Number of Reviews | Reviews |
|---|---|---|---|---|---|---|---|---|
| 0 | Martine of Martine's Table | Amsterdam | `['French', 'Dutch', 'European']` | 1.0 | 5.0 | `$$ - $$$` | 136.0 | `[['Just like home', 'A Warm Welcome to Wintry ...` |
| 1 | De Silveren Spiegel | Amsterdam | `['Dutch', 'European', 'Vegetarian Friendly', '...` | 2.0 | 4.5 | `$$$$` | 812.0 | `[['Great food and staff', 'just perfect'], ['0...` |
| 2 | La Rive | Amsterdam | `['Mediterranean', 'French', 'International', '...` | 3.0 | 4.5 | `$$$$` | 567.0 | `[['Satisfaction', 'Delicious old school restau...` |
| 3 | Vinkeles | Amsterdam | `['French', 'European', 'International', 'Conte...` | 4.0 | 5.0 | `$$$$` | 564.0 | `[['True five star dinner', 'A superb evening o...` |
| 4 | Librije's Zusje Amsterdam | Amsterdam | `['Dutch', 'European', 'International', 'Vegeta...` | 5.0 | 4.5 | `$$$$` | 316.0 | `[['Best meal.... EVER', 'super food experience...` |
| 5 | Ciel Bleu Restaurant | Amsterdam | `['Contemporary', 'International', 'Vegetarian ...` | 6.0 | 4.5 | `$$$$` | 745.0 | `[['A treat!', 'Wow just Wow'], ['01/01/2018', ...` |
| 6 | Zaza's | Amsterdam | `['French', 'International', 'Mediterranean', '...` | 7.0 | 4.5 | `$$ - $$$` | 1455.0 | `[['40th Birthday with my Family', 'One of the ...` |
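
The `Cuisine Style` values in this preview look like Python lists, but the column has dtype `object`, so they are most likely stored as strings. A minimal sketch of one way to turn them into real lists with `ast.literal_eval` follows; the `cuisine` Series, the `parse_list` helper, and the `'Italian'` row are made up for illustration, and treating missing values as empty lists is an assumption rather than a decision taken in this notebook:

```python
import ast

import pandas as pd

# Illustrative values; the first mirrors the preview above, None stands for a missing entry.
cuisine = pd.Series(
    ["['French', 'Dutch', 'European']", "['Italian']", None],
    name="Cuisine Style",
)


def parse_list(value):
    """Turn a list-like string into a Python list; missing values become []."""
    if pd.isna(value):
        return []
    return ast.literal_eval(value)


parsed = cuisine.apply(parse_list)

# explode() gives one row per cuisine, which is convenient for counting;
# empty lists become NaN and are removed by dropna().
counts = parsed.explode().dropna().value_counts()
print(counts)
```

With one row per cuisine, questions such as "which cuisines are most common in each city" become simple group-by counts.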


In [95]:
df.drop(['URL_TA', 'ID_TA'], axis=1, inplace=True)  # Dropping these two columns because they are not needed for the analysis.

In [96]:
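# Helper functions that report the number and the percentage of null values

def number_nulls(series):
    # Count the missing values in a Series (or per column for a DataFrame)
    return series.isnull().sum()

def percantage_nulls(serial):
    # Percentage of missing values in a Series
    return serial.isnull().sum() * 100 / serial.shape[0]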

In [97]:
for col in df.columns:
    
    print(col + " : " + str(percantage_nulls(df[col])))

Name : 0.0
City : 0.0
Cuisine Style : 24.975503278179197
Ranking : 7.688385765612179
Rating : 7.67165629705163
Price Range : 38.123272284050444
Number of Reviews : 13.816947748293195
Reviews : 7.660503318011265


In [100]:
for col in df.columns:
    
    print(col + " : " + str(number_nulls(df[col])))

Name : 0
City : 0
Cuisine Style : 31351
Ranking : 9651
Rating : 9630
Price Range : 47855
Number of Reviews : 17344
Reviews : 9616
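
The two loops above print counts and percentages separately. As a hedged alternative, both can be collected into one sorted table using `isnull().sum()` and `isnull().mean()`. The frame `demo` and the name `null_summary` below are illustrative; the column names follow the dataset, but the values are made up:

```python
import numpy as np
import pandas as pd

# Small illustrative frame; values are made up.
demo = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Rating": [4.5, np.nan, 5.0, np.nan],
    "Price Range": ["$$$$", None, None, None],
})

# isnull().sum() counts missing values per column;
# isnull().mean() is the fraction missing, so * 100 gives a percentage.
null_summary = pd.DataFrame({
    "missing": demo.isnull().sum(),
    "percent": demo.isnull().mean() * 100,
}).sort_values("percent", ascending=False)

print(null_summary)
```

Sorting by `percent` puts the columns that need the most attention at the top, which in the real data would be `Price Range` and `Cuisine Style`.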


In [101]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 125527 entries, 0 to 125526
Data columns (total 8 columns):
 #   Column             Non-Null Count   Dtype  
---  ------             --------------   -----  
 0   Name               125527 non-null  object 
 1   City               125527 non-null  object 
 2   Cuisine Style      94176 non-null   object 
 3   Ranking            115876 non-null  float64
 4   Rating             115897 non-null  float64
 5   Price Range        77672 non-null   object 
 6   Number of Reviews  108183 non-null  float64
 7   Reviews            115911 non-null  object 
dtypes: float64(3), object(5)
memory usage: 7.7+ MB
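
`Ranking` and `Number of Reviews` show up as `float64` here, most likely because NumPy integer columns cannot hold missing values. If whole-number columns are preferred, one option is pandas' nullable `Int64` dtype. This is a sketch on a made-up frame (`demo` is an illustrative name), not a step taken in this notebook, and it assumes every non-missing value is a whole number:

```python
import numpy as np
import pandas as pd

# Illustrative frame with the same float-plus-NaN pattern as the real columns.
demo = pd.DataFrame({
    "Ranking": [1.0, 2.0, np.nan],
    "Number of Reviews": [136.0, np.nan, 567.0],
})

# "Int64" (capital I) is pandas' nullable integer dtype:
# it stores whole numbers and keeps missing values as <NA>.
for col in ["Ranking", "Number of Reviews"]:
    demo[col] = demo[col].astype("Int64")

print(demo.dtypes)
```

The conversion raises an error if a column contains fractional values, so `Rating` (with values such as 4.5) should stay as a float.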
