# Advanced Aggregations and Business Insights

## 1. Data Overview
- loading data
- quick check

In [128]:
import pandas as pd

df = pd.read_csv("tips.csv")
df.head()


,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4


In [129]:
df.info()
df.describe()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   total_bill  244 non-null    float64
 1   tip         244 non-null    float64
 2   sex         244 non-null    object 
 3   smoker      244 non-null    object 
 4   day         244 non-null    object 
 5   time        244 non-null    object 
 6   size        244 non-null    int64  
dtypes: float64(2), int64(1), object(4)
memory usage: 13.5+ KB


,total_bill,tip,size
count,244.0,244.0,244.0
mean,19.785943,2.998279,2.569672
std,8.902412,1.383638,0.9511
min,3.07,1.0,1.0
25%,13.3475,2.0,2.0
50%,17.795,2.9,2.0
75%,24.1275,3.5625,3.0
max,50.81,10.0,6.0


## 2. Advanced Groupby & Aggregations


- multiple aggregations

In [130]:
df.groupby("day")["total_bill"].agg(
    mean_bill="mean",
    max_bill="max",
    count_orders="count" 
)

day,mean_bill,max_bill,count_orders
Fri,17.151579,40.17,19
Sat,20.441379,50.81,87
Sun,21.41,48.17,76
Thur,17.682742,43.11,62


- multiple columns

In [131]:
df.groupby("day").agg(
    avg_bill=("total_bill", "mean"),
    avg_tip=("tip", "mean"),
    max_tip=("tip", "max")
)

day,avg_bill,avg_tip,max_tip
Fri,17.151579,2.734737,4.73
Sat,20.441379,2.993103,10.0
Sun,21.41,3.255132,6.5
Thur,17.682742,2.771452,6.7


## 3. Sorting and Rankings

- best vs worst days

In [132]:
summary_days = df.groupby("day")["total_bill"].mean()
summary_days.sort_values(ascending=False)

day
Sun     21.410000
Sat     20.441379
Thur    17.682742
Fri     17.151579
Name: total_bill, dtype: float64

- lunch vs dinner

In [133]:
summary_time = df.groupby("time")["total_bill"].mean()
summary_time.sort_values(ascending=False)

time
Dinner    20.797159
Lunch     17.168676
Name: total_bill, dtype: float64

- preview first/last rows (head/tail), not a ranking

head()/tail() return the first/last rows in the current index order.
This is not the same as “top/bottom by value” unless the Series is sorted first.

In [134]:
print(summary_days.head(2))
print(summary_days.tail(1))

day
Fri    17.151579
Sat    20.441379
Name: total_bill, dtype: float64
day
Thur    17.682742
Name: total_bill, dtype: float64
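
To get a true top/bottom ranking without sorting the whole Series first, `nlargest` and `nsmallest` are one option. A minimal sketch, with `summary_days` rebuilt by hand from the per-day means shown above so that it runs without `tips.csv`:

```python
import pandas as pd

# Per-day average bills copied from the groupby output above
# (rebuilt by hand here so the sketch runs without tips.csv).
summary_days = pd.Series(
    {"Fri": 17.151579, "Sat": 20.441379, "Sun": 21.410000, "Thur": 17.682742},
    name="total_bill",
)
summary_days.index.name = "day"

# head(2) follows index order (Fri, Sat), which is not a ranking.
first_two = summary_days.head(2)

# nlargest/nsmallest rank by value regardless of index order.
top_two = summary_days.nlargest(2)
bottom_one = summary_days.nsmallest(1)

print(top_two)
print(bottom_one)
```

Here `top_two` holds Sun and Sat, while `head(2)` returned Fri and Sat only because of alphabetical index order.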


## 4. Mini-tasks

- Average bill and average tip by day, sorted in descending order by bill

In [135]:
df.groupby("day").agg(
    av_bill=("total_bill", "mean"),
    av_tip=("tip", "mean")
).sort_values(by="av_bill", ascending=False)

day,av_bill,av_tip
Sun,21.41,3.255132
Sat,20.441379,2.993103
Thur,17.682742,2.771452
Fri,17.151579,2.734737


- Average bill: Lunch vs Dinner

In [136]:
df.groupby("time").agg(
    av_bill=("total_bill", "mean"),
)

time,av_bill
Dinner,20.797159
Lunch,17.168676


- Day with the highest number of orders

In [137]:
df.groupby("day").agg(
    number_orders=("total_bill", "count"),
).sort_values(by="number_orders", ascending=False).head(1)

day,number_orders
Sat,87


- Which gender leaves a higher average tip

In [138]:
df.groupby("sex").agg(
    avg_tip=("tip", "mean"),
).sort_values(by="avg_tip", ascending=False).head(1)

sex,avg_tip
Male,3.089618


- Top 3 days with the highest average tip

In [139]:
df.groupby("day").agg(
    avg_tip=("tip", "mean"),
).sort_values(by="avg_tip", ascending=False).head(3)

day,avg_tip
Sun,3.255132
Sat,2.993103
Thur,2.771452


## 5. Mini day project 

Analysis: “When does the restaurant earn the most?”

Calculate:

- average bill by day
- average bill by time of day
- number of orders by day
- average tip by day

In [140]:
df.groupby("day").agg(
    mean_bill=("total_bill", "mean"),
).sort_values(by="mean_bill", ascending=False)

day,mean_bill
Sun,21.41
Sat,20.441379
Thur,17.682742
Fri,17.151579


In [141]:
df.groupby("time").agg(
    mean_bill=("total_bill", "mean"),
).sort_values("mean_bill", ascending=False)

time,mean_bill
Dinner,20.797159
Lunch,17.168676


In [142]:
df.groupby("day").size().sort_values(ascending=False).to_frame("orders")


day,orders
Sat,87
Sun,76
Thur,62
Fri,19


In [143]:
df.groupby("day").agg(
    mean_tip=("tip", "mean"),
)

day,mean_tip
Fri,2.734737
Sat,2.993103
Sun,3.255132
Thur,2.771452


In [144]:
df.groupby(["day","time"]).agg(
    avg_bill=("total_bill","mean"),
    avg_tip=("tip","mean"),
    orders=("total_bill","count")
).sort_values(by="avg_bill", ascending=False)


day,time,avg_bill,avg_tip,orders
Sun,Dinner,21.41,3.255132,76
Sat,Dinner,20.441379,2.993103,87
Fri,Dinner,19.663333,2.94,12
Thur,Dinner,18.78,3.0,1
Thur,Lunch,17.664754,2.767705,61
Fri,Lunch,12.845714,2.382857,7


In [145]:
# Tip as a share of the bill for each order, averaged per day
df.assign(tip_rate=df["tip"]/df["total_bill"]).groupby("day")["tip_rate"].mean().sort_values(ascending=False)


day
Fri     0.169913
Sun     0.166897
Thur    0.161276
Sat     0.153152
Name: tip_rate, dtype: float64
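
Average bill alone does not fully answer “When does the restaurant earn the most?”, because a day with many orders can out-earn a day with larger bills. One way to estimate total takings per day is to multiply the average bill by the number of orders. A minimal sketch, using figures copied by hand from the outputs above (the column name `est_revenue` is made up for this illustration):

```python
import pandas as pd

# Per-day averages and order counts copied from the outputs above.
by_day = pd.DataFrame(
    {
        "avg_bill": [17.151579, 20.441379, 21.410000, 17.682742],
        "orders": [19, 87, 76, 62],
    },
    index=pd.Index(["Fri", "Sat", "Sun", "Thur"], name="day"),
)

# mean * count recovers the per-day sum of total_bill (up to rounding).
by_day["est_revenue"] = by_day["avg_bill"] * by_day["orders"]
ranked = by_day.sort_values("est_revenue", ascending=False)
print(ranked.round(2))
```

On the full data the same totals should come directly from `df.groupby("day")["total_bill"].sum()`. Note that this measures billed amounts only; the dataset contains no cost information, so it says nothing about profit.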

## 6. Conclusions

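Sundays have the highest average bill (21.41), followed closely by Saturdays (20.44). Average bill measures spending per order, not total takings: Saturday has more orders (87 vs 76), so average bill multiplied by order count puts Saturday ahead in total (about 1,778 vs 1,627). Either way, the weekend brings in more than Thursday or Friday.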

Dinner has a higher average bill than lunch (20.80 vs 17.17, about 21% more), so customers tend to spend more per order during dinner service. No statistical test was run here, so this is a descriptive difference rather than a tested one.

Average tips are highest on Sundays (3.26), in line with the higher bills on that day. As a share of the bill, however, Sunday (16.7%) comes second to Friday (17.0%), so the larger Sunday tips mostly reflect larger bills rather than greater generosity.

Fridays have the lowest average tip (2.73) but the highest tip rate (17.0%), so the low absolute figure reflects smaller bills, not less generous customers. With only 19 Friday orders, both figures are less reliable than those for the other days.
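
The difference between absolute tips and tip rate can be made explicit by ranking the days on both measures. A rough sketch, with the figures copied by hand from the per-day outputs earlier in this notebook rather than recomputed from `tips.csv`:

```python
import pandas as pd

# Values copied from the per-day outputs above (average tip and mean tip rate).
tips_by_day = pd.DataFrame(
    {
        "avg_tip": [2.734737, 2.993103, 3.255132, 2.771452],
        "tip_rate": [0.169913, 0.153152, 0.166897, 0.161276],
    },
    index=pd.Index(["Fri", "Sat", "Sun", "Thur"], name="day"),
)

# Rank 1 = highest value in each column (no ties in this data).
ranks = tips_by_day.rank(ascending=False).astype(int)
print(tips_by_day.join(ranks, rsuffix="_rank"))
```

Friday ranks last on average tip but first on tip rate, while Saturday drops from second to last, so the two measures support different readings of "generosity".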

Overall, the restaurant does best at weekend dinners, which combine the highest average bills, the highest average tips and the most orders (every Saturday and Sunday order in this dataset is a dinner). Optimizing staffing and promotions for these periods could further increase revenue.