# Cognifyz Data Science Internship
## Level 3 Tasks

This notebook focuses on customer preference analysis and data visualization to gain insights from restaurant data.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
# Load dataset (note: the filename as given contains a space before ".csv")
df = pd.read_csv("Dataset .csv")
df.head()

In [None]:
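# Basic preprocessing
# Drop rows without a rating; .copy() avoids a SettingWithCopyWarning on the assignment below
df = df.dropna(subset=["Aggregate rating"]).copy()
# Treat missing vote counts as zero votes
df["Votes"] = df["Votes"].fillna(0)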

## Task 1: Customer Preference Analysis
Analyze the relationship between cuisine type, customer votes, and ratings.

In [None]:
# Top 10 cuisine groups by average rating
# (each distinct value of the "Cuisines" column is treated as one group)
top_cuisines = df.groupby("Cuisines")["Aggregate rating"].mean().sort_values(ascending=False).head(10)
top_cuisines

In [None]:
# Top 10 cuisine groups by total votes
popular_cuisines = df.groupby("Cuisines")["Votes"].sum().sort_values(ascending=False).head(10)
popular_cuisines
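
Grouping on the raw `Cuisines` value treats every distinct string as its own group. If a restaurant lists several cuisines in one cell (assumed here to be comma-separated, which should be checked against the data), each combination is ranked separately, and a single highly rated restaurant can top the average-rating list. The following is a minimal sketch of one alternative: split the cell into individual cuisines and require a minimum number of restaurants per cuisine. The helper name `cuisine_summary`, the `min_restaurants` threshold, and the toy data are illustrative assumptions, not part of the original analysis.

```python
import pandas as pd


def cuisine_summary(frame, min_restaurants=2):
    """Summarise rating and votes per individual cuisine.

    Assumes "Cuisines" holds comma-separated cuisine names
    (an assumption about the dataset, not confirmed by the notebook).
    """
    exploded = (
        frame.dropna(subset=["Cuisines"])
        # Turn "A, B" into the list ["A", " B"]
        .assign(Cuisine=lambda d: d["Cuisines"].str.split(","))
        # One row per (restaurant, cuisine) pair
        .explode("Cuisine")
        .reset_index(drop=True)
        # Remove the whitespace left over from splitting on ","
        .assign(Cuisine=lambda d: d["Cuisine"].str.strip())
    )
    summary = exploded.groupby("Cuisine").agg(
        avg_rating=("Aggregate rating", "mean"),
        total_votes=("Votes", "sum"),
        restaurants=("Aggregate rating", "size"),
    )
    # Drop cuisines backed by too few restaurants to give a stable average
    summary = summary[summary["restaurants"] >= min_restaurants]
    return summary.sort_values("avg_rating", ascending=False)


# Hypothetical toy data, for illustration only
toy = pd.DataFrame({
    "Cuisines": ["Italian, Pizza", "Italian", "Pizza, Fast Food", "Sushi"],
    "Aggregate rating": [4.0, 3.0, 2.0, 5.0],
    "Votes": [100, 50, 10, 5],
})
toy_summary = cuisine_summary(toy, min_restaurants=2)
print(toy_summary)
```

On the notebook's data this would be called as `cuisine_summary(df, min_restaurants=...)`, with the threshold chosen after inspecting how many restaurants each cuisine has. Note that a restaurant listing several cuisines contributes its votes to each of them, so `total_votes` summed across cuisines exceeds the dataset total.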

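## Task 2: Data Visualization
Visualize the rating distribution, the average rating by city, and the relationship between votes and ratings.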

In [None]:
# Distribution of ratings
plt.hist(df["Aggregate rating"], bins=20)
plt.xlabel("Aggregate Rating")
plt.ylabel("Restaurant Count")
plt.title("Distribution of Restaurant Ratings")
plt.show()

In [None]:
# Average rating by city (Top 10)
city_rating = df.groupby("City")["Aggregate rating"].mean().sort_values(ascending=False).head(10)
city_rating.plot(kind="bar")
plt.ylabel("Average Rating")
plt.title("Top 10 Cities by Average Rating")
plt.show()
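
A city with only one or two restaurants in the data can reach the top of an average-rating ranking by chance. The sketch below reports the restaurant count alongside the mean and keeps only cities above a threshold. `top_cities_by_rating`, `min_restaurants`, and the toy frame are hypothetical names and values chosen for illustration.

```python
import pandas as pd


def top_cities_by_rating(frame, min_restaurants=5, n=10):
    """Return the n best-rated cities that have at least min_restaurants rows."""
    stats = frame.groupby("City")["Aggregate rating"].agg(["mean", "count"])
    # Keep only cities with enough restaurants for the mean to be meaningful
    stats = stats[stats["count"] >= min_restaurants]
    return stats.sort_values("mean", ascending=False).head(n)


# Hypothetical toy data: city "B" has the highest mean but only one restaurant
toy_cities = pd.DataFrame({
    "City": ["A", "A", "A", "B", "C", "C"],
    "Aggregate rating": [4.0, 3.0, 3.5, 5.0, 2.0, 3.0],
})
toy_top = top_cities_by_rating(toy_cities, min_restaurants=2)
print(toy_top)
```

With the notebook's data, `top_cities_by_rating(df)["mean"].plot(kind="bar")` would draw the same kind of chart as above, restricted to cities that pass the threshold.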

In [None]:
# Votes vs Rating relationship
plt.scatter(df["Votes"], df["Aggregate rating"], alpha=0.5)
plt.xlabel("Votes")
plt.ylabel("Aggregate Rating")
plt.title("Votes vs Aggregate Rating")
plt.show()
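
The scatter plot shows the shape of the relationship but does not quantify it, and vote counts are often strongly right-skewed, which crowds most points near zero on a linear axis. One way to put a number on the association is a rank (Spearman) correlation, computed here as the Pearson correlation of the ranks so that only pandas is needed. The helper name `spearman_votes_rating` and the toy data are illustrative assumptions.

```python
import pandas as pd


def spearman_votes_rating(frame):
    """Spearman correlation between votes and rating (Pearson on ranks)."""
    pair = frame[["Votes", "Aggregate rating"]].dropna()
    # rank() assigns average ranks to ties; corr() defaults to Pearson
    return pair["Votes"].rank().corr(pair["Aggregate rating"].rank())


# Hypothetical toy data in which rating rises monotonically with votes
toy_votes = pd.DataFrame({
    "Votes": [1, 10, 100, 1000, 10000],
    "Aggregate rating": [2.0, 2.5, 3.0, 4.0, 4.5],
})
rho = spearman_votes_rating(toy_votes)
print(round(rho, 3))
```

On the notebook's data this would be called as `spearman_votes_rating(df)`. A value near 0 would indicate little monotonic association between votes and ratings, while a value near 1 would indicate that restaurants with more votes tend to be rated higher.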

## Key Insights
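- Average ratings differ across cuisine groups; the top-10 table in Task 1 shows which groups rank highest.
- Vote totals measure customer engagement. The scatter plot in Task 2 shows how votes relate to ratings, but the notebook does not test whether ratings backed by more votes are more reliable.
- Average ratings also differ by city. These are descriptive associations from grouped means, not evidence that city or cuisine causes better restaurant performance.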