In [1]:
import pandas as pd
import numpy as np

To determine the number of cities (including subregions) where Swiggy has its restaurants listed, I'll start by loading the data from the provided CSV file. After that, I'll inspect the columns to identify the relevant column(s) that contain city/subregion information. Once identified, I'll count the unique entries in that column to determine the total number of cities and subregions.

I first cleaned the raw data and saved the result to a CSV file, which is included with this submission.

##### Loading the data:

In [2]:
data = pd.read_csv("SwiggyCleanedData.csv")



In [3]:
data

Unnamed: 0,city,sub_region,resturant_name,rating,rating_count,cost,cuisine
0,Ahmedabad,Vastrapur,M.A.D By Tomato'S,4.3,100+ ratings,₹ 1200,"Indian,Chinese"
1,Ahmedabad,Vastrapur,Tea Post,4.0,100+ ratings,₹ 150,Fast Food
2,Ahmedabad,Vastrapur,Shanghai Chicken Lolipops,--,Too Few Ratings,₹ 300,"Chinese,Fast Food"
3,Ahmedabad,Vastrapur,Ministry Of Momos,--,Too Few Ratings,₹ 300,Chinese
4,Ahmedabad,Vastrapur,Sizzling - The Cake Room,--,Too Few Ratings,₹ 350,Desserts
...,...,...,...,...,...,...,...
181399,,Yavatmal,The Food Delight,--,--,₹ 200,"Fast Food,Snacks"
181400,,Yavatmal,MAITRI FOODS & BEVERAGES,--,--,₹ 300,Pizzas
181401,,Yavatmal,Cafe Bella Ciao,--,--,₹ 300,"Fast Food,Snacks"
181402,,Yavatmal,GRILL ZILLA,--,--,₹ 250,Continental


The dataset contains a "city" column and a "sub_region" column that represent the cities and their subregions, respectively. To determine the number of unique cities and subregions where Swiggy has its restaurants listed, we can count the unique combinations of "city" and "sub_region".

### 1. In how many cities (including subregions) does Swiggy have restaurants listed?

In [4]:

# Count the unique combinations of city and sub_region
unique_city_subregions = data.drop_duplicates(subset=['city', 'sub_region'])

# Determine the number of unique combinations
num_unique_city_subregions = unique_city_subregions.shape[0]

num_unique_city_subregions



865

#### Swiggy has its restaurants listed in 865 unique city-subregion combinations.

### 2. In how many cities (excluding subregions) does Swiggy have restaurants listed?

In [5]:
# Count the unique cities
unique_cities = data['city'].unique()

# Determine the number of unique cities
num_unique_cities = len(unique_cities)

num_unique_cities



27

#### Swiggy has its restaurants listed in 27 unique cities.
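One caveat on the two counts above: the data preview shows rows with an empty city field (the Yavatmal rows), which pandas reads as NaN. `Series.unique()` keeps NaN as a distinct value, whereas `Series.nunique()` excludes it by default, so the figure of 27 may include one missing-value entry. The sketch below illustrates the difference on a small made-up frame; the rows are hypothetical and not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical rows) with one missing city, similar to the Yavatmal rows
toy = pd.DataFrame({
    "city": ["Delhi", "Delhi", "Pune", np.nan],
    "sub_region": ["Rohini", "Dwarka", "Baner", "Yavatmal"],
})

count_with_nan = len(toy["city"].unique())         # NaN is counted as a value
count_without_nan = toy["city"].nunique()          # NaN is excluded by default
missing_city_rows = int(toy["city"].isna().sum())  # rows that have no city at all

print(count_with_nan, count_without_nan, missing_city_rows)
```

Running the same three expressions on `data` would show whether the reported city count includes a NaN entry and how many rows are affected.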

### 3. Which subregion of Delhi has the maximum number of restaurants listed on Swiggy?


In [6]:

# Filter the data for restaurants in Delhi
delhi_data = data[data['city'] == 'Delhi']

# Count the number of restaurants for each subregion of Delhi
subregion_counts = delhi_data['sub_region'].value_counts()

# Get the subregion with the maximum number of restaurants
max_subregion = subregion_counts.idxmax()
max_count = subregion_counts.max()

max_subregion, max_count



('Indirapuram', 1279)

#### The subregion of Delhi with the maximum number of restaurants listed on Swiggy is "Indirapuram" with 1,279 restaurants.

### 4. Name the top 5 most expensive cities in the dataset.

To determine the top 5 most expensive cities, we can calculate the average cost of restaurants for each city. The "cost" column seems to contain the restaurant's average cost, which we can use for this computation.

However, before computing the average cost, we need to clean the "cost" column as it contains currency symbols and might also have commas or other characters that could prevent direct numerical operations.

##### First clean the "cost" column, then compute the average cost for each city to determine the top 5 most expensive cities.

In [7]:
# Strip the currency symbol and thousands separators, then convert to numeric (unparseable values become NaN)
data['numeric_cost'] = pd.to_numeric(data['cost'].str.replace('₹', '').str.replace(',', ''), errors='coerce')

# Calculate the average cost for each city
avg_city_cost = data.groupby('city')['numeric_cost'].mean()

# Get the top 5 most expensive cities based on average cost
top_5_expensive_cities = avg_city_cost.sort_values(ascending=False).head(5)

top_5_expensive_cities


city
Mumbai       366.408890
Delhi        316.988494
Guwahati     315.705210
Kolkata      310.537793
Bangalore    302.755718
Name: numeric_cost, dtype: float64

#### The top 5 most expensive cities based on the average cost of restaurants listed on Swiggy are:

##### * Mumbai: ₹366.41 (approx.)
##### * Delhi: ₹316.99 (approx.)
##### * Guwahati: ₹315.71 (approx.)
##### * Kolkata: ₹310.54 (approx.)
##### * Bangalore: ₹302.76 (approx.)
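The cost-cleaning and averaging steps above can be checked on a small example. This is a minimal sketch on made-up rows (the values, including the "--" placeholder in the cost field, are assumptions for illustration). It uses a single regular expression instead of chained replacements, and reports the median and row count next to the mean, since a few extreme prices can shift a city's average.

```python
import pandas as pd

# Toy rows (hypothetical values) in the same shape as the city and cost columns
toy = pd.DataFrame({
    "city": ["Mumbai", "Mumbai", "Pune", "Pune"],
    "cost": ["₹ 1,200", "₹ 300", "₹ 150", "--"],
})

# Keep only digits and the decimal point; anything left unparseable becomes NaN
cleaned = toy["cost"].str.replace(r"[^\d.]", "", regex=True)
toy["numeric_cost"] = pd.to_numeric(cleaned, errors="coerce")

# Mean, median and number of valid prices per city
summary = toy.groupby("city")["numeric_cost"].agg(["mean", "median", "count"])
print(summary)
```

The `count` column makes it visible when an average rests on very few valid prices.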

### 5. List out the top 5 Restaurants with Maximum & minimum ratings throughout the dataset.

To determine the top 5 restaurants with the maximum and minimum ratings, we'll first need to clean and convert the "rating" column to a numeric format, since it appears to have string values (based on the initial data inspection).

Once the ratings are in a numeric format, we can identify the restaurants with the maximum and minimum ratings.



#### Cleaning the "rating" column:

In [8]:
# Convert rating column to numeric values, setting errors='coerce' to handle non-numeric values
data['numeric_rating'] = pd.to_numeric(data['rating'], errors='coerce')

# Get the top 5 restaurants with maximum ratings
top_5_max_ratings = data.sort_values(by='numeric_rating', ascending=False).drop_duplicates(subset='resturant_name').head(5)

# Get the top 5 restaurants with minimum ratings (excluding NaN values)
top_5_min_ratings = data[data['numeric_rating'].notna()].sort_values(by='numeric_rating').drop_duplicates(subset='resturant_name').head(5)

top_5_max_ratings[['resturant_name', 'numeric_rating']], top_5_min_ratings[['resturant_name', 'numeric_rating']]



(                      resturant_name  numeric_rating
 48995   Dessert Studio by Third Wave             5.0
 88280                Wraps and Rolls             5.0
 102450                 HRX by Eatfit             5.0
 38455                Regalo Delights             5.0
 88254                  Hungry Buddha             5.0,
                            resturant_name  numeric_rating
 21958             Ice Cream and Shakes Co             1.0
 13244                     Persian Delight             1.1
 24051   SHAWARMA WRAP - ROLL YOUR SECRETS             1.2
 180072   Champaran Mutton Hundy & Biryani             1.2
 38067                      THE TARI STORY             1.2)

#### Here are the top 5 restaurants with:

#### Maximum Ratings:

1. Dessert Studio by Third Wave: Rating 5.0
2. Wraps and Rolls: Rating 5.0
3. HRX by Eatfit: Rating 5.0
4. Regalo Delights: Rating 5.0
5. Hungry Buddha: Rating 5.0

#### Minimum Ratings:

1. Ice Cream and Shakes Co: Rating 1.0
2. Persian Delight: Rating 1.1
3. SHAWARMA WRAP - ROLL YOUR SECRETS: Rating 1.2
4. Champaran Mutton Hundy & Biryani: Rating 1.2
5. THE TARI STORY: Rating 1.2
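
Both lists contain ties: all five maximum-rated restaurants share a rating of 5.0, and three of the minimum-rated ones share 1.2. Which tied restaurants appear, and in what order, therefore depends on the sort order rather than on the ratings themselves.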
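Because of these ties, a plain `head(5)` cuts the list at an arbitrary point. One way to surface every tied row is `DataFrame.nlargest` (or `nsmallest`) with `keep="all"`, which returns more than `n` rows when the last position is tied. A minimal sketch on made-up ratings:

```python
import pandas as pd

# Toy ratings (hypothetical) where three restaurants tie for the top score
toy = pd.DataFrame({
    "resturant_name": ["A", "B", "C", "D"],
    "numeric_rating": [5.0, 5.0, 5.0, 4.8],
})

# Ask for the top 2, but keep every row tied with the last selected value
top_with_ties = toy.nlargest(2, "numeric_rating", keep="all")
print(top_with_ties)
```

On the full dataset this could return a long list of 5.0-rated restaurants, in which case a secondary criterion such as rating count might be a sensible tie-breaker.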


### 6. Name of top 5 cities with the highest number of restaurants listed.

In [9]:
# Count the number of restaurants for each city
city_restaurant_counts = data['city'].value_counts()

# Get the top 5 cities with the highest number of restaurants listed
top_5_cities_with_most_restaurants = city_restaurant_counts.head(5)

top_5_cities_with_most_restaurants


city
Bangalore    16701
Delhi        14386
Pune         13145
Hyderabad    12439
Chennai      10987
Name: count, dtype: int64

#### The top 5 cities with the highest number of restaurants listed on Swiggy are:

* Bangalore: 16,701 restaurants
* Delhi: 14,386 restaurants
* Pune: 13,145 restaurants
* Hyderabad: 12,439 restaurants
* Chennai: 10,987 restaurants.

### 7. Top 10 cities as per the number of restaurants listed?

In [10]:
# Get the top 10 cities with the highest number of restaurants listed
top_10_cities_with_most_restaurants = city_restaurant_counts.head(10)

top_10_cities_with_most_restaurants



city
Bangalore    16701
Delhi        14386
Pune         13145
Hyderabad    12439
Chennai      10987
Kolkata       9651
Mumbai        7221
Jaipur        6250
Ahmedabad     4736
Gurgaon       4106
Name: count, dtype: int64

#### The top 10 cities with the highest number of restaurants listed on Swiggy are:

1. Bangalore: 16,701 restaurants
2. Delhi: 14,386 restaurants
3. Pune: 13,145 restaurants
4. Hyderabad: 12,439 restaurants
5. Chennai: 10,987 restaurants
6. Kolkata: 9,651 restaurants
7. Mumbai: 7,221 restaurants
8. Jaipur: 6,250 restaurants
9. Ahmedabad: 4,736 restaurants
10. Gurgaon: 4,106 restaurants.

### 8. Name the top 5 Most Popular Restaurants in Pune.

To determine the top 5 most popular restaurants in Pune, we'll consider the "rating_count" column as an indicator of popularity. A higher rating count would suggest that the restaurant is more popular, as more people have rated it.

However, before we can sort by this column, we need to clean and convert the "rating_count" column to a numeric format, since it appears to contain string values like "100+ ratings".

#### Clean the "rating_count" column, then identify the top 5 most popular restaurants in Pune:

In [11]:
# Strip the " ratings" suffix, "+" and commas, then convert to numeric; labels such as "Too Few Ratings" become NaN
data['numeric_rating_count'] = pd.to_numeric(data['rating_count'].str.replace(' ratings', '', regex=False).str.replace('+', '', regex=False).str.replace(',', '', regex=False).replace('Too Few Ratings', float('nan')), errors='coerce')

# Filter for restaurants in Pune and sort by numeric_rating_count to get the most popular restaurants
pune_popular_restaurants = data[data['city'] == 'Pune'].sort_values(by='numeric_rating_count', ascending=False).drop_duplicates(subset='resturant_name')

# Get the top 5 most popular restaurants in Pune
top_5_popular_pune_restaurants = pune_popular_restaurants.head(5)

top_5_popular_pune_restaurants[['resturant_name', 'numeric_rating_count']]


Unnamed: 0,resturant_name,numeric_rating_count
98025,Yaron Da Adda - Dil se Punjabi,500.0
98900,Bedekar Misal,500.0
95113,Delhi Kitchen (Pimple Saudagar),500.0
95107,Mad Over Bowls (MOB),500.0
95095,Bird Valley,500.0


#### The top 5 most popular restaurants in Pune, based on the number of ratings, are:

* Yaron Da Adda - Dil se Punjabi: 500 ratings
* Bedekar Misal: 500 ratings
* Delhi Kitchen (Pimple Saudagar): 500 ratings
* Mad Over Bowls (MOB): 500 ratings
* Bird Valley: 500 ratings

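Because the source column stores counts as labels such as "100+ ratings", the parsed value of 500 is best read as a lower bound rather than an exact count. Many restaurants share this value, so the five shown here are tied and their order is not meaningful.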
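The parsing of the rating-count labels can be sketched with a single pattern match. In the example below, "100+ ratings", "Too Few Ratings" and "--" appear in the data preview, while "500+ ratings" and "1,000+ ratings" are assumed label formats used only for illustration.

```python
import pandas as pd

# Sample labels; the last two are assumed formats, not confirmed from the dataset
raw = pd.Series([
    "100+ ratings", "Too Few Ratings", "--", "500+ ratings", "1,000+ ratings",
])

# Pull out the leading number (digits with optional commas); no match yields NaN
digits = raw.str.extract(r"(\d[\d,]*)", expand=False)

# Drop thousands separators and convert; the result is a lower bound per label
parsed = pd.to_numeric(digits.str.replace(",", "", regex=False), errors="coerce")
print(parsed)
```

Extracting the number directly avoids having to list every non-numeric label that should map to NaN.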

### 9. Which subregion in Delhi has the least expensive restaurant in terms of cost?

To determine the subregion in Delhi with the least expensive restaurant, we'll look for the minimum value in the numeric_cost column for restaurants in Delhi and then identify the corresponding subregion.


In [12]:
# Filter data for Delhi
delhi_data = data[data['city'] == 'Delhi']

# Find the row with the minimum cost in Delhi
least_expensive_row = delhi_data[delhi_data['numeric_cost'] == delhi_data['numeric_cost'].min()]

# Extract the subregion of the least expensive restaurant
least_expensive_subregion = least_expensive_row['sub_region'].iloc[0]
least_expensive_cost = least_expensive_row['numeric_cost'].iloc[0]

least_expensive_subregion, least_expensive_cost


('Dilshad Gardens', 1.0)

#### The subregion in Delhi with the least expensive restaurant, in terms of cost, is "Dilshad Gardens" with a cost of ₹1.0.

### 10. Top 5 most popular restaurant chains in India?

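To determine the top 5 most popular restaurant chains in India, we'll consider the "numeric_rating_count" column as an indicator of popularity. A chain is taken to be a restaurant name that appears in multiple cities or subregions. Note that the code below groups by name only and does not explicitly filter out names that appear in a single location.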

We'll aggregate the total rating counts for each restaurant name across the dataset and then identify the top 5 chains based on these aggregated counts.

In [13]:
# Group by restaurant name and aggregate the total rating counts
restaurant_chain_popularity = data.groupby('resturant_name')['numeric_rating_count'].sum()

# Sort the restaurants by total rating counts to get the most popular chains
top_5_popular_chains = restaurant_chain_popularity.sort_values(ascending=False).head(5)

top_5_popular_chains


resturant_name
KFC                       41216.9
Subway                    37744.6
NIC Natural Ice Creams    37175.9
Pizza Hut                 34515.0
Domino's Pizza            32043.8
Name: numeric_rating_count, dtype: float64

#### The top 5 most popular restaurant chains in India, based on the number of ratings, are:

1. KFC: 41,216.9 ratings
2. Subway: 37,744.6 ratings
3. NIC Natural Ice Creams: 37,175.9 ratings
4. Pizza Hut: 34,515.0 ratings
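5. Domino's Pizza: 32,043.8 ratings

The fractional totals show that some parsed rating_count values are not whole numbers, so these figures should be read as approximate.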
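To apply the stated chain definition explicitly, the aggregation can also count how many distinct locations each name appears in and keep only names with more than one. This is a minimal sketch on made-up rows; the `location` helper column and the `locations` and `total_ratings` names are hypothetical.

```python
import pandas as pd

# Toy rows (hypothetical) with one multi-location name and two single-location names
toy = pd.DataFrame({
    "city": ["Delhi", "Pune", "Pune", "Delhi"],
    "sub_region": ["Rohini", "Baner", "Baner", "Dwarka"],
    "resturant_name": ["KFC", "KFC", "Local Diner", "Subway"],
    "numeric_rating_count": [500.0, 100.0, 1000.0, 50.0],
})

# Combine city and sub_region into one key; missing parts become empty strings
toy["location"] = toy["city"].fillna("") + " / " + toy["sub_region"].fillna("")

# Per name: number of distinct locations and summed rating counts
per_name = toy.groupby("resturant_name").agg(
    locations=("location", "nunique"),
    total_ratings=("numeric_rating_count", "sum"),
)

# Keep only names present in more than one location, most-rated first
chains = per_name[per_name["locations"] > 1].sort_values(
    "total_ratings", ascending=False
)
print(chains)
```

In this toy example "Local Diner" has the most ratings but is excluded because it appears in only one location.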

### 11. Which restaurant in Pune has the most number of people visiting?

The dataset has no visit counts, so the "numeric_rating_count" column is used as a proxy: a higher rating count suggests that more people have visited and rated the restaurant.

To determine which restaurant in Pune has the most number of people visiting, we'll filter for restaurants in Pune and then identify the one with the highest rating count.

In [14]:
# Filter for restaurants in Pune and sort by the numeric_rating_count to get the most visited restaurant
most_visited_pune_restaurant = data[data['city'] == 'Pune'].sort_values(by='numeric_rating_count', ascending=False).head(1)

# Extract the restaurant name and its rating count
most_visited_restaurant_name = most_visited_pune_restaurant['resturant_name'].iloc[0]
most_visited_restaurant_count = most_visited_pune_restaurant['numeric_rating_count'].iloc[0]

most_visited_restaurant_name, most_visited_restaurant_count


('Yaron Da Adda - Dil se Punjabi', 500.0)

#### The restaurant in Pune with the most visitors (based on rating count) is "Yaron Da Adda - Dil se Punjabi" with 500 ratings. As question 8 shows, several other Pune restaurants share this count.

### 12. Top 10 Restaurants with Maximum Ratings in Bangalore

In [15]:
# Filter for restaurants in Bangalore and sort by the numeric_rating to get the top 10 restaurants with maximum ratings
top_10_rated_bangalore_restaurants = data[data['city'] == 'Bangalore'].sort_values(by='numeric_rating', ascending=False).drop_duplicates(subset='resturant_name').head(10)

top_10_rated_bangalore_restaurants[['resturant_name', 'numeric_rating']]


Unnamed: 0,resturant_name,numeric_rating
113027,SHAWARMA INDIAH,5.0
14272,BigBites,5.0
11269,Joao's Croissant Fusion,5.0
11291,Hot Chillies Fast Food,5.0
11299,PEDRO DE GOA,5.0
8227,Mamaji PavBhaji,5.0
11557,ESCOBAR PANINI s,5.0
11562,Mels SubHub,5.0
11742,HRX by Eatfit,5.0
8134,The Lassi Pub,5.0


#### The top 10 restaurants in Bangalore with the maximum ratings are:

1. SHAWARMA INDIAH: Rating 5.0
2. BigBites: Rating 5.0
3. Joao's Croissant Fusion: Rating 5.0
4. Hot Chillies Fast Food: Rating 5.0
5. PEDRO DE GOA: Rating 5.0
6. Mamaji PavBhaji: Rating 5.0
7. ESCOBAR PANINI s: Rating 5.0
8. Mels SubHub: Rating 5.0
9. HRX by Eatfit: Rating 5.0
10. The Lassi Pub: Rating 5.0

Multiple restaurants have a rating of 5.0, indicating a tie for the top positions.

### 13. Top 10 restaurants in Patna by rating

In [16]:

patna_data = data[data['city'] == 'Patna']
len(patna_data)


0

In [17]:
# Filter for restaurants in Patna and sort by the numeric_rating to get the top 10 restaurants based on ratings
top_10_rated_patna_restaurants = data[data['city'] == 'Patna'].sort_values(by='numeric_rating', ascending=False).drop_duplicates(subset='resturant_name').head(10)

top_10_rated_patna_restaurants[['resturant_name', 'numeric_rating']]


Unnamed: 0,resturant_name,numeric_rating


#### It appears that there are no restaurants listed for Patna in the dataset. Thus, we cannot provide the top 10 restaurants in Patna with respect to ratings.
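Since some rows in the preview have a sub_region but no city (the Yavatmal rows), an exact match on the city column alone may miss a location that is stored only as a sub_region or with different capitalisation. A broader, case-insensitive search over both columns is one way to double-check; the sketch below uses made-up rows, so it says nothing about whether Patna is actually present in the dataset.

```python
import pandas as pd

# Toy rows (hypothetical): one location appears only in sub_region, with no city
toy = pd.DataFrame({
    "city": ["Delhi", None, None],
    "sub_region": ["Rohini", "Patna", "Yavatmal"],
    "resturant_name": ["A", "B", "C"],
})

# Case-insensitive substring match in either column; missing values count as no match
mask = (
    toy["city"].str.contains("patna", case=False, na=False)
    | toy["sub_region"].str.contains("patna", case=False, na=False)
)
matches = toy[mask]
print(matches)
```

If the same mask returns no rows on `data`, the conclusion that Patna has no listings holds for both columns.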

### Conclusion:

#### Geographical Presence:
   * Swiggy has its restaurants listed in 27 unique cities and 865 unique city-subregion combinations.
   * The top cities with the highest number of restaurant listings are Bangalore, Delhi, Pune, Hyderabad, and Chennai.
   * The subregion in Delhi with the most restaurants listed on Swiggy is Indirapuram.
#### Restaurant Popularity:
   * The top 5 most popular restaurant chains in India, based on the number of ratings, are KFC, Subway, NIC Natural Ice Creams, Pizza Hut, and Domino's Pizza.
   * In Pune, the most visited restaurant by rating count is "Yaron Da Adda - Dil se Punjabi", tied with several others at 500 ratings.
   * In Bangalore, multiple restaurants have a perfect rating of 5.0, making them the top-rated ones.
#### Cost Analysis:
   * Mumbai is the most expensive city in terms of average restaurant cost, followed by Delhi, Guwahati, Kolkata, and Bangalore.
   * The subregion in Delhi with the least expensive restaurant (cost of ₹1.0) is Dilshad Gardens.
#### Ratings:
   * Several restaurants have achieved the maximum rating of 5.0, including at least ten in Bangalore.
   * There were no listings for Patna in the dataset, so we couldn't determine the top-rated restaurants there.

##### This dataset provides valuable insights into the Indian restaurant landscape as seen through Swiggy. It showcases consumer preferences, restaurant popularity, and geographical distribution. Such insights can be instrumental for restaurant owners to understand market dynamics, for consumers to make informed dining decisions, and for platforms like Swiggy to refine their strategies and offerings.