# Project 4: Zomato EDA (Exploratory Data Analysis)

This project explores the Zomato restaurant dataset through visualizations to uncover insights about cities, cuisines, ratings, price distributions, and customer preferences.

**Dataset Example:** `zomato.csv` (commonly available on Kaggle)

**Objectives:**
- Analyze restaurant distribution across major cities
- Explore the most popular cuisines
- Understand the relationship between pricing and ratings
- Identify trends for business decisions


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# set_theme is the current name for the older sns.set alias (seaborn 0.11+)
sns.set_theme(style='whitegrid')

In [None]:
df = pd.read_csv('zomato.csv', encoding='latin-1')
print('Shape:', df.shape)
df.head()

In [None]:
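# Drop rows without a restaurant name; copy so later column assignments are safe
df = df.dropna(subset=['name']).copy()

if 'rate' in df.columns:
    # Strip the '/5' suffix; non-numeric markers such as 'NEW' are coerced to NaN
    df['rate'] = df['rate'].astype(str).str.replace('/5', '', regex=False).str.strip()
    df['rate'] = pd.to_numeric(df['rate'], errors='coerce')

# Preview only the columns that exist in this version of the dataset
preview_cols = [c for c in ['name', 'city', 'cuisines', 'rate'] if c in df.columns]
df[preview_cols].head()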
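
To make the rating cleanup easier to follow, here is a minimal sketch on a few made-up values. The sample strings (`'4.1/5'`, `'NEW'`, `'-'`) are assumptions about how the `rate` column may look in some Kaggle versions of the dataset, and `clean_rate` is a hypothetical helper introduced only for this illustration.

```python
import numpy as np
import pandas as pd


def clean_rate(series: pd.Series) -> pd.Series:
    """Turn rating strings such as '4.1/5' into floats; anything else becomes NaN."""
    cleaned = series.astype(str).str.replace('/5', '', regex=False).str.strip()
    return pd.to_numeric(cleaned, errors='coerce')


# Made-up sample values, not taken from zomato.csv
sample = pd.Series(['4.1/5', '3.8 /5', 'NEW', '-', np.nan])
rates = clean_rate(sample)
print(rates)
```

Because `errors='coerce'` turns every unparseable value into `NaN`, placeholders do not need to be listed one by one.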

In [None]:
top_cities = df['city'].value_counts().head(10)
plt.figure(figsize=(8,5))
sns.barplot(x=top_cities.values, y=top_cities.index)
plt.title('Top 10 Cities - Restaurant Count')
plt.xlabel('Count')
plt.ylabel('City')
plt.show()

In [None]:
# Split comma-separated cuisine strings so each cuisine is counted once per restaurant
cuisine_freq = (
    df['cuisines']
    .dropna()
    .str.split(',', expand=True)
    .stack()
    .str.strip()
    .value_counts()
    .head(15)
)
plt.figure(figsize=(8,5))
sns.barplot(x=cuisine_freq.values, y=cuisine_freq.index)
plt.title('Most Popular Cuisines')
plt.xlabel('Frequency')
plt.ylabel('Cuisine')
plt.show()
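
The cuisine count relies on each restaurant listing several cuisines in one comma-separated string. A minimal sketch of that idea on made-up rows follows; it uses `explode()`, which should give the same counts as the `split(..., expand=True).stack()` chain. The sample values are assumptions for illustration only.

```python
import pandas as pd

# Made-up rows; real 'cuisines' values are assumed to be comma-separated strings
cuisines = pd.Series(['North Indian, Chinese', 'Chinese', None, 'Cafe,  Chinese '])

counts = (
    cuisines.dropna()      # skip restaurants with no cuisine listed
    .str.split(',')        # each row becomes a list of cuisines
    .explode()             # one cuisine per row
    .str.strip()           # remove stray spaces around names
    .value_counts()
)
print(counts)
```

Stripping whitespace before counting matters: without it, `'Chinese'` and `' Chinese'` would be tallied as different cuisines.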

In [None]:
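if 'approx_cost(for two people)' in df.columns and 'rate' in df.columns:
    # Remove thousands separators (e.g. '1,200') before converting to numbers
    df['approx_cost(for two people)'] = pd.to_numeric(
        df['approx_cost(for two people)'].astype(str).str.replace(',', '', regex=False),
        errors='coerce',
    )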

    plt.figure(figsize=(7,5))
    sns.scatterplot(x='rate', y='approx_cost(for two people)', data=df, alpha=0.4)
    plt.title('Price vs Rating')
    plt.xlabel('Ratings')
    plt.ylabel('Cost for Two')
    plt.show()

### Insights & Observations
- Some cities have much denser restaurant competition than others.
- Understanding popular cuisines helps in demand forecasting.
- Higher prices do not always mean better ratings; customer expectations vary.
- Additional analyses such as word clouds, geo-maps, and sentiment analysis could deepen these insights.
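
One way to put a number on the price and rating observation is to compute a correlation and compare ratings across price bands. The sketch below uses made-up figures, not values from `zomato.csv`; the column names `cost` and `rate` and the band edges are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up numbers purely for illustration
demo = pd.DataFrame({
    'cost': [200, 300, 500, 800, 1200, 2000],
    'rate': [3.9, 3.5, 4.1, 3.7, 4.3, 3.8],
})

# Pearson correlation: values near 0 suggest a weak linear relationship
corr = demo['cost'].corr(demo['rate'])

# Median rating per price band (band edges are arbitrary assumptions)
bands = pd.cut(demo['cost'], bins=[0, 500, 1000, 5000],
               labels=['budget', 'mid', 'premium'])
median_by_band = demo.groupby(bands, observed=True)['rate'].median()

print(f'correlation: {corr:.2f}')
print(median_by_band)
```

On the real data, the same two steps could be applied to `approx_cost(for two people)` and `rate` after dropping rows where either value is missing.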

### Next Enhancements
- Use NLP on reviews to extract opinion trends.
- Add geospatial visualizations using Folium or Plotly.
- Build a dashboard in Power BI or Tableau.
