Let's import the pandas library.

In [None]:
import pandas as pd

Let's load a dataset containing information about restaurants available on the Swiggy food delivery platform. Note that this data is compiled by third-party sources and isn't affiliated with Swiggy or any of the mentioned restaurants.

In [None]:
df = pd.read_csv('swiggy.csv')
df.head()

In [None]:
df = pd.read_csv('swiggy.csv', index_col='ID')
df.head()
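
The two `read_csv` calls above differ only in `index_col`. As a small sketch of what that argument changes, here is the same comparison on an in-memory CSV. The two-row table is invented for illustration; only the `ID` column name mirrors the call above.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for swiggy.csv
csv_text = "ID,Restaurant\n10,A\n20,B\n"

# Without index_col, pandas numbers the rows 0, 1, ... and keeps ID as a column
default_idx = pd.read_csv(io.StringIO(csv_text))

# With index_col='ID', the ID values become the row labels instead
by_id = pd.read_csv(io.StringIO(csv_text), index_col='ID')

print(default_idx.index.tolist(), list(default_idx.columns))
print(by_id.index.tolist(), list(by_id.columns))
```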

In [None]:
df.info()

The 'Price' column denotes the average food prices at the restaurant. The 'Delivery time' column denotes the average delivery times at the restaurant.

Let's begin by finding all restaurants with an average rating that is 4.0 or higher.

In [None]:
df['Avg ratings'] >= 4.0

In [None]:
df[df['Avg ratings'] >= 4.0]
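
The two cells above can be read as a two-step recipe: the comparison builds a boolean Series, and indexing with that Series keeps only the rows marked `True`. Here is a minimal sketch on a three-row frame whose names and ratings are invented for illustration, not taken from `swiggy.csv`.

```python
import pandas as pd

# Hypothetical toy data, not from swiggy.csv
toy = pd.DataFrame({
    'Restaurant': ['A', 'B', 'C'],
    'Avg ratings': [4.2, 3.8, 4.0],
})

# Step 1: the comparison runs element-wise and yields a boolean Series
is_high = toy['Avg ratings'] >= 4.0

# Step 2: indexing with the boolean Series keeps only the rows marked True
high_rated = toy[is_high]

print(is_high.tolist())
print(high_rated['Restaurant'].tolist())
```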

Let's restrict this to restaurants that also have at least 500 ratings.

In [None]:
df[(df['Avg ratings'] >= 4.0) & (df['Total ratings'] >= 500)]
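
Two details in the cell above are easy to miss: each comparison sits in its own parentheses, and the conditions are joined with `&` rather than Python's `and`. The sketch below, on invented toy values, shows the working form and what happens with `and`.

```python
import pandas as pd

# Hypothetical toy data, not from swiggy.csv
toy = pd.DataFrame({
    'Avg ratings': [4.2, 4.6, 3.9],
    'Total ratings': [100, 800, 900],
})

# & combines two boolean Series element-wise. The parentheses are needed
# because & binds more tightly than >= in Python.
both = (toy['Avg ratings'] >= 4.0) & (toy['Total ratings'] >= 500)

# `and` asks for a single True/False from a whole Series, which pandas refuses
and_error = None
try:
    (toy['Avg ratings'] >= 4.0) and (toy['Total ratings'] >= 500)
except ValueError as err:
    and_error = err

print(both.tolist())
print(type(and_error).__name__)
```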

Let's find fast and affordable restaurants: those with a delivery time of less than 30 minutes and an average price of less than 300.

In [None]:
df[(df['Delivery time'] < 30) & (df['Price'] < 300)]

Let's view just the restaurant name, city, and average rating for the restaurants returned by our previous query.

In [None]:
df[(df['Delivery time'] < 30) & (df['Price'] < 300)][['Restaurant', 'City', 'Avg ratings']]

In [None]:
df.loc[(df['Delivery time'] < 30) & (df['Price'] < 300), ['Restaurant', 'City', 'Avg ratings']]
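
The two cells above select the same rows and columns. The first filters rows and then picks columns in a second indexing step, while `.loc` does both in one step. A small sketch on invented toy data suggests the results match; the single-step `.loc` form is also the one to reach for when assigning values.

```python
import pandas as pd

# Hypothetical toy data, not from swiggy.csv
toy = pd.DataFrame({
    'Restaurant': ['A', 'B', 'C'],
    'City': ['X', 'Y', 'X'],
    'Delivery time': [25, 40, 28],
    'Price': [250, 200, 350],
})

fast_cheap = (toy['Delivery time'] < 30) & (toy['Price'] < 300)

# Two indexing steps: filter rows, then pick columns from the result
chained = toy[fast_cheap][['Restaurant', 'City']]

# One indexing step: rows and columns together
single = toy.loc[fast_cheap, ['Restaurant', 'City']]

print(chained.equals(single))
```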

Let's find restaurants in the Koramangala area that serve Chinese cuisine.

In [None]:
df[(df['Area'] == 'Koramangala') & (df['Food type'].str.contains('Chinese'))]

What if we wanted restaurants in Koramangala which exclusively serve Chinese cuisine?

In [None]:
df[(df['Area'] == 'Koramangala') & (df['Food type'].str.contains('Chinese')) & (~df['Food type'].str.contains(','))]
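
The query above treats a restaurant as exclusively Chinese when its 'Food type' contains 'Chinese' and has no comma. Because `str.contains` matches substrings, a single label that merely includes the word would also pass. The sketch below uses invented labels (whether a value like 'Indo-Chinese' actually occurs in the dataset is an assumption) to compare that approach with an exact match.

```python
import pandas as pd

# Hypothetical labels; only the comma-separated layout mirrors the query above
toy = pd.DataFrame({
    'Food type': ['Chinese', 'Chinese,Thai', 'Indo-Chinese', 'Pizzas'],
})
food = toy['Food type']

# Approach from the cell above: contains 'Chinese' and has no comma
contains_only = food.str.contains('Chinese') & ~food.str.contains(',')

# Stricter alternative: the whole label, ignoring outer whitespace, is 'Chinese'
exact = food.str.strip() == 'Chinese'

print(contains_only.tolist())
print(exact.tolist())
```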

There have been a lot of complaints about restaurants not serving their food on time. We want to increase every delivery time by five minutes to account for possible delays. Note that the following operation modifies our DataFrame in place.

In [None]:
df['Delivery time'] += 5

Let's rerun our previous filtering query to inspect the change.

In [None]:
df[(df['Area'] == 'Koramangala') & (df['Food type'].str.contains('Chinese')) & (~df['Food type'].str.contains(','))]

So that we can repeat this check after later modifications, let's save the filter condition as a boolean mask.

In [None]:
(df['Area'] == 'Koramangala') & (df['Food type'].str.contains('Chinese')) & (~df['Food type'].str.contains(','))

In [None]:
mask = (df['Area'] == 'Koramangala') & (df['Food type'].str.contains('Chinese')) & (~df['Food type'].str.contains(','))
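
A saved mask is just a boolean Series computed once. It keeps selecting the same rows after other columns change, but it is not recomputed if the columns it was built from change. The sketch below illustrates both points on an invented two-row frame.

```python
import pandas as pd

# Hypothetical toy data, not from swiggy.csv
toy = pd.DataFrame({
    'Area': ['Koramangala', 'Indiranagar'],
    'Delivery time': [20, 30],
})

mask = toy['Area'] == 'Koramangala'

# Changing another column does not affect the mask; it shows the new values
toy['Delivery time'] += 5
after_update = toy[mask]['Delivery time'].tolist()

# Changing a column the mask was built from leaves the saved mask stale
toy.loc[1, 'Area'] = 'Koramangala'
stale = mask.tolist()
fresh = (toy['Area'] == 'Koramangala').tolist()

print(after_update, stale, fresh)
```

In this notebook the later cells modify columns other than 'Area' and 'Food type', so reusing `mask` selects the same rows each time.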

There's a platform-wide discount running for restaurants rated 4.0 or higher. Let's reduce the prices at those restaurants by 15%.

In [None]:
# df.loc['Avg ratings' >= 4.0, 'Price'] *= 0.85  # why won't this work?
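
One way to answer the question in the commented-out line: `'Avg ratings' >= 4.0` compares the column's name, a plain string, with a number, so Python raises an error before pandas is involved at all. A minimal sketch:

```python
message = None
try:
    # A string literal compared with a float, not a column compared with a float
    'Avg ratings' >= 4.0
except TypeError as err:
    message = str(err)

print(message)
```

Writing `df['Avg ratings'] >= 4.0` instead compares the column's values and produces the boolean Series that `.loc` expects.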

In [None]:
df.loc[df['Avg ratings'] >= 4.0, 'Price']

In [None]:
df.loc[df['Avg ratings'] >= 4.0, 'Price'] *= 0.85
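
To check that a conditional update like the one above touches only the matching rows, it can help to try it on a tiny frame first. The values below are invented, and the prices are stored as floats so that multiplying by 0.85 does not change the column's dtype.

```python
import pandas as pd

# Hypothetical toy data, not from swiggy.csv
toy = pd.DataFrame({
    'Avg ratings': [4.5, 3.5],
    'Price': [200.0, 100.0],
})

# Rows: rating of 4.0 or higher. Column: 'Price'. Only those cells are scaled.
toy.loc[toy['Avg ratings'] >= 4.0, 'Price'] *= 0.85

prices = toy['Price'].tolist()
print(prices)
```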

In [None]:
df[mask]

Let's create a new column to indicate if a restaurant has an 'Address' which is different from its 'Area'. Let's call this 'Detailed address available'.

In [None]:
df['Detailed address available'] = df['Area'] != df['Address']

In [None]:
df[mask]

Let's find the average 'Delivery time' across all restaurants.

In [None]:
df['Delivery time'].mean()

Let's make a new column 'Quality' to categorise each restaurant by its average rating, using the following criteria:
*   'Excellent' if the average rating is at least 4.5
*   'Good' if the average rating is at least 4.0 but less than 4.5
*   'Average' if the average rating is at least 3.0 but less than 4.0
*   'Poor' if the average rating is less than 3.0




In [None]:
df['Avg ratings'] >= 4.5

In [None]:
def categorise_quality(rating):
    # `rating` is a single number (one restaurant's average rating), not a Series
    if rating >= 4.5:
        return 'Excellent'
    elif rating >= 4.0:
        return 'Good'
    elif rating >= 3.0:
        return 'Average'
    else:
        return 'Poor'

In [None]:
df['Avg ratings']

In [None]:
# categorise_quality(df['Avg ratings'])  # why won't this work?
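
One way to answer the question in the commented-out line: the function's `if` needs a single `True` or `False`, but comparing a whole Series gives one boolean per row, and pandas refuses to collapse that into one value. The sketch below repeats the function on an invented two-value Series.

```python
import pandas as pd

def categorise_quality(rating):
    # Same thresholds as the function defined above
    if rating >= 4.5:
        return 'Excellent'
    elif rating >= 4.0:
        return 'Good'
    elif rating >= 3.0:
        return 'Average'
    else:
        return 'Poor'

# Hypothetical ratings, not from swiggy.csv
ratings = pd.Series([4.7, 3.2])

direct_error = None
try:
    # `ratings >= 4.5` is a Series of booleans, which `if` cannot evaluate
    categorise_quality(ratings)
except ValueError as err:
    direct_error = err

# .apply calls the function once per element, so each call sees one number
labels = ratings.apply(categorise_quality).tolist()

print(type(direct_error).__name__, labels)
```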

In [None]:
# df['Quality']

In [None]:
df['Quality'] = df['Avg ratings'].apply(categorise_quality)

In [None]:
df['Quality']
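
As an aside, the same binning can be sketched without a Python-level function by using `pd.cut`. This is an alternative technique, not what the notebook uses, and the ratings below are invented. With `right=False` each bin includes its lower edge and excludes its upper edge, which matches the "at least" wording of the criteria.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings chosen to sit on and around the thresholds
ratings = pd.Series([4.5, 4.0, 3.0, 2.9, 4.49])

quality = pd.cut(
    ratings,
    bins=[-np.inf, 3.0, 4.0, 4.5, np.inf],
    labels=['Poor', 'Average', 'Good', 'Excellent'],
    right=False,  # intervals are [lower, upper)
)
quality_list = quality.astype(str).tolist()

print(quality_list)
```

One difference worth knowing: a missing rating falls through to 'Poor' in `categorise_quality`, because every comparison with NaN is false, whereas `pd.cut` leaves it missing.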

In [None]:
df[mask]

Let's try writing another function to do the same thing. This time, however, we will change how the function works internally: it takes a whole row and looks up the rating itself.

In [None]:
def categorise_quality_2(row):
    if row['Avg ratings'] >= 4.5:
        return 'Excellent'
    elif row['Avg ratings'] >= 4.0:
        return 'Good'
    elif row['Avg ratings'] >= 3.0:
        return 'Average'
    else:
        return 'Poor'

In [None]:
# df['Quality 2'] = df['Avg ratings'].apply(categorise_quality_2)  # why won't this work?

In [None]:
# df['Quality 2'] = df.apply(categorise_quality_2)  # why won't this work?

In [None]:
df['Quality 2'] = df.apply(categorise_quality_2, axis=1)
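
The three `apply` calls above differ in what gets passed to the function. `Series.apply` passes one value at a time, `DataFrame.apply` with the default axis passes one column at a time, and `axis=1` passes one row at a time. Only a row can be indexed with `'Avg ratings'`. The sketch below reproduces this on an invented two-row frame.

```python
import pandas as pd

def categorise_quality_2(row):
    # Same logic as the function defined above
    if row['Avg ratings'] >= 4.5:
        return 'Excellent'
    elif row['Avg ratings'] >= 4.0:
        return 'Good'
    elif row['Avg ratings'] >= 3.0:
        return 'Average'
    else:
        return 'Poor'

# Hypothetical toy data, not from swiggy.csv
toy = pd.DataFrame({'Restaurant': ['A', 'B'], 'Avg ratings': [4.6, 3.1]})

errors = {}
try:
    # Each call receives a single number, which cannot be indexed by a label
    toy['Avg ratings'].apply(categorise_quality_2)
except (TypeError, IndexError) as err:
    errors['series_apply'] = type(err).__name__

try:
    # Each call receives a whole column, whose labels are row labels (0, 1)
    toy.apply(categorise_quality_2)
except KeyError as err:
    errors['default_axis'] = type(err).__name__

# Each call receives one row, whose labels are the column names
by_row = toy.apply(categorise_quality_2, axis=1).tolist()

print(errors, by_row)
```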

In [None]:
df[mask]

Let's count the number of unique words used in restaurant names across our dataset.

In [None]:
def unique_name_words(restaurant_name_series):
    # Split each name on whitespace, flatten to one word per row, then count distinct words
    words = restaurant_name_series.str.split().explode()
    return words.nunique()

In [None]:
df['Restaurant']

In [None]:
# df['Restaurant'].apply(unique_name_words)  # why won't this work?

In [None]:
unique_name_words(df['Restaurant'])
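
One way to answer the question in the commented-out line: this function expects the whole Series because it relies on the `.str` accessor, whereas `.apply` would hand it one restaurant name at a time, and a plain Python string has no `.str` attribute. The sketch below uses a hypothetical helper, `count_unique_words`, as a compact stand-in for `unique_name_words`, with two invented names.

```python
import pandas as pd

def count_unique_words(name_series):
    # Hypothetical stand-in: split names into words, flatten, count distinct words
    return name_series.str.split().explode().nunique()

# Invented names, not from swiggy.csv
names = pd.Series(['Pizza Hut', 'Pizza Point'])

# Called on the whole Series, the .str accessor is available
whole = count_unique_words(names)

apply_error = None
try:
    # .apply passes each element (a plain str) to the function
    names.apply(count_unique_words)
except AttributeError as err:
    apply_error = str(err)

print(whole)
print(apply_error)
```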