# 🍽️ Zomato Dataset - Exploratory Data Analysis (EDA)
This notebook explores the Zomato dataset to understand cuisine trends, city-level patterns, and restaurant characteristics across countries. The analysis supports building an interactive Streamlit dashboard.

**Zomato Dataset Exploratory Analysis**

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
df = pd.read_csv('zomato.csv', encoding='Latin-1')
df.head()

In [None]:
df.columns

In [None]:
df.info()


In [None]:
df.describe()

In [None]:
df.shape

In [None]:
# Check whether the dataset contains any null values
df.isnull().sum()

In [None]:
df_country = pd.read_excel('Country-Code.xlsx')
df_country.head()

In [None]:
df.columns

In [None]:
# Left join keeps every restaurant row and attaches its country name
final_df = pd.merge(df, df_country, on='Country Code', how='left')
final_df.head()
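
A left merge silently leaves `NaN` in `Country` for any country code that is missing from the lookup table. The sketch below shows one way to check for that with the `indicator` and `validate` options of `pd.merge`. The two small frames and their values are made up for illustration and are not taken from `zomato.csv` or `Country-Code.xlsx`.

```python
import pandas as pd

# Toy stand-ins for the restaurant and country tables (hypothetical values).
restaurants = pd.DataFrame({
    "Restaurant Name": ["A", "B", "C"],
    "Country Code": [1, 2, 999],  # 999 is deliberately absent from the lookup
})
countries = pd.DataFrame({
    "Country Code": [1, 2],
    "Country": ["Country X", "Country Y"],
})

# indicator=True adds a `_merge` column saying where each row came from;
# validate="many_to_one" raises if a country code is duplicated in the lookup.
checked = pd.merge(
    restaurants,
    countries,
    on="Country Code",
    how="left",
    indicator=True,
    validate="many_to_one",
)

# Rows marked "left_only" found no matching country code.
unmatched = checked[checked["_merge"] == "left_only"]
print(f"{len(unmatched)} of {len(checked)} rows have no country match")
```

Applied to the notebook's `df` and `df_country`, an empty `unmatched` frame would suggest that every restaurant received a country name.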

In [None]:
# Plotting a pie chart of the top 3 countries by record count
country_names=final_df.Country.value_counts().index
country_vals=final_df.Country.value_counts().values


In [None]:
plt.pie(country_vals[:3],labels=country_names[:3],autopct='%1.2f%%')
plt.show()

**OBSERVATION:** Most of the records in the Zomato dataset come from India, followed by the USA and then the United Kingdom.
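
Note that `plt.pie` with `autopct` shows each slice as a percentage of the plotted slices only, so a top-3 pie reports shares among those three countries rather than shares of the whole dataset. The sketch below contrasts the two using `value_counts(normalize=True)`. The counts are invented for illustration and do not reflect the real dataset.

```python
import pandas as pd

# Hypothetical country column: 12 + 4 + 3 + 1 = 20 rows (made-up counts).
country = pd.Series(
    ["India"] * 12 + ["United States"] * 4 + ["United Kingdom"] * 3 + ["Brazil"],
    name="Country",
)

counts = country.value_counts()

# Share of ALL records, in percent.
overall_share = country.value_counts(normalize=True).mul(100).round(2)

# Share among the top 3 only, which is what a top-3 pie chart displays.
top3 = counts.head(3)
top3_share = (top3 / top3.sum() * 100).round(2)

print(overall_share.head(3))
print(top3_share)
```

In the notebook, `final_df.Country.value_counts(normalize=True)` would give the dataset-wide shares to quote alongside the chart.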

In [None]:
final_df.columns

In [None]:
ratings=final_df.groupby(['Aggregate rating', 'Rating color', 'Rating text']).size().reset_index().rename(columns={0:'Rating count'})

In [None]:
ratings

**Observation**
1. When the rating is between 4.5 and 4.9 ---> Excellent
2. When the rating is between 4.0 and 4.4 ---> Very good
3. When the rating is between 3.5 and 3.9 ---> Good
4. When the rating is between 3.0 and 3.4 ---> Average
5. When the rating is between 2.5 and 2.9 ---> Average
6. When the rating is between 2.0 and 2.4 ---> Poor
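
The bands in this observation can be written as a binning rule. The sketch below uses `pd.cut` with left-closed bins, reading the "Very good" band as 4.0 to 4.4 and merging the two "Average" bands into one. The bin edges are an assumption derived from the list above, not a rule documented by Zomato, and the sample ratings are made up.

```python
import pandas as pd

# Hypothetical sample of aggregate ratings (not taken from the dataset).
sample = pd.Series([0.0, 2.0, 2.4, 2.5, 3.4, 3.5, 3.9, 4.0, 4.4, 4.5, 4.9])

# Left-closed bins: [2.0, 2.5), [2.5, 3.5), [3.5, 4.0), [4.0, 4.5), [4.5, 5.0)
edges = [2.0, 2.5, 3.5, 4.0, 4.5, 5.0]
names = ["Poor", "Average", "Good", "Very good", "Excellent"]

# right=False makes each bin include its lower edge and exclude its upper edge.
labels = pd.cut(sample, bins=edges, labels=names, right=False)

# Ratings outside every band (such as 0.0) come out as NaN.
print(pd.DataFrame({"rating": sample, "label": labels}))
```

Comparing such labels against the `Rating text` column is one way to confirm that the observed bands hold for every row.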

In [None]:
ratings.head()

In [None]:
import matplotlib
matplotlib.rcParams['figure.figsize'] = (12, 6)
sns.barplot(x='Aggregate rating',y='Rating count',hue='Rating color',data=ratings,palette=['blue','red','orange','yellow','green','green'])
plt.show()

**OBSERVATION:**
 1. The "Not rated" count is very high.
 2. Most of the rated restaurants fall between 2.8 and 3.9.

In [None]:
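# Count plot: number of distinct aggregate-rating values per rating color
# (this counts rows of the grouped `ratings` table, not restaurants)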
sns.countplot(x='Rating color',data=ratings,palette=['blue','red','orange','yellow','green','green'])
plt.show()

In [None]:
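# Finding the countries whose restaurants have a 0 rating (rating color white)
# Compare case-insensitively so the filter matches 'White' as well as 'white'
final_df[final_df['Rating color'].str.lower() == 'white'].groupby('Country').size().reset_index()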

**Observation:** The largest number of zero ratings comes from India.

In [None]:
# Finding which currency is used in each country
final_df.groupby(['Country','Currency']).size().reset_index().rename(columns={0:'count'})

In [None]:
# Which countries have an online delivery option?
final_df[final_df['Has Online delivery']=='Yes'].Country.value_counts()

In [None]:
final_df.groupby(['Has Online delivery','Country']).size().reset_index()

**OBSERVATION:** Online delivery is available only in India and the UAE.
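
A cross-tabulation puts the Yes/No counts side by side for each country, which makes this kind of observation easier to verify than two separate outputs. The sketch below uses `pd.crosstab` on a small made-up frame. The rows are illustrative only, and the column names simply mirror those used in this notebook.

```python
import pandas as pd

# Hypothetical rows (not real records from the dataset).
toy = pd.DataFrame({
    "Country": ["India", "India", "UAE", "United States"],
    "Has Online delivery": ["Yes", "No", "Yes", "No"],
})

# One row per country, one column per delivery flag, cells hold row counts.
table = pd.crosstab(toy["Country"], toy["Has Online delivery"])

# Countries with at least one restaurant offering online delivery.
countries_with_delivery = table.index[table["Yes"] > 0].tolist()

print(table)
print(countries_with_delivery)
```

With the real data, the same call would be `pd.crosstab(final_df['Country'], final_df['Has Online delivery'])`.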

In [None]:
final_df.columns

In [None]:
# Creating a pie chart of the top 5 cities by record count


In [None]:
final_df.City.value_counts().index

In [None]:
city_value = final_df.City.value_counts().values
city_name = final_df.City.value_counts().index

In [None]:
plt.pie(city_value[:5],labels=city_name[:5],autopct='%1.2f%%')
plt.show()