# Explanation of data

## 1-total_bill:

- Description: The total amount of the bill (in dollars) for the meal, including the cost of food and drinks.
- Data Type: Float
- Example: 16.99

## 2-tip:

- Description: The amount of the tip (in dollars) given by the customer to the server.
- Data Type: Float
- Example: 1.01

## 3-sex:

- Description: The gender of the person who paid for the meal.
- Data Type: Categorical (string)
- Possible Values: 'Male', 'Female'
- Example: 'Female'

## 4-smoker:

- Description: Indicates whether there were smokers in the party.
- Data Type: Categorical (string)
- Possible Values: 'Yes', 'No'
- Example: 'No'

## 5-day:

- Description: The day of the week when the meal was served.
- Data Type: Categorical (string)
- Possible Values: 'Thur' (Thursday), 'Fri' (Friday), 'Sat' (Saturday), 'Sun' (Sunday)
- Example: 'Sun'

## 6-time:

- Description: The time of day when the meal was served.
- Data Type: Categorical (string)
- Possible Values: 'Lunch', 'Dinner'
- Example: 'Dinner'

## 7-size:

- Description: The number of people in the party.
- Data Type: Integer
- Example: 2

## Load the dataset

In [1]:
import numpy as np
import pandas as pd

df = pd.read_csv("tips.csv")

## Display the first few rows

In [2]:
df.head(10)

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
5       25.29  4.71    Male     No  Sun  Dinner     4
6        8.77  2.00    Male     No  Sun  Dinner     2
7       26.88  3.12    Male     No  Sun  Dinner     4
8       15.04  1.96    Male     No  Sun  Dinner     2
9       14.78  3.23    Male     No  Sun  Dinner     2


## Question 1: What is the average total bill amount?

In [3]:
df["total_bill"].mean()

19.78594262295082

## Question 2: How many records are in the dataset?


In [4]:
df.info()
# The RangeIndex line reports 244 entries, so the dataset has 244 records

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   total_bill  244 non-null    float64
 1   tip         244 non-null    float64
 2   sex         244 non-null    object 
 3   smoker      244 non-null    object 
 4   day         244 non-null    object 
 5   time        244 non-null    object 
 6   size        244 non-null    int64  
dtypes: float64(2), int64(1), object(4)
memory usage: 13.5+ KB


## Question 3: What is the total amount of tips given by male customers?


In [6]:
filter1 = df["sex"] == "Male"
df["tip"].where(filter1, inplace=False).sum()

485.06999999999994
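The `where` call above keeps the Series at its full length and replaces rows that fail the condition with NaN; the sum is still correct because pandas aggregations skip NaN by default (and `inplace=False` is already the default). As a sketch of the difference between `where` and boolean indexing with `.loc`, the snippet below uses only the `sex` and `tip` columns of the ten rows displayed by `df.head(10)`, so its numbers describe that sample and not the full dataset. The names `sample`, `is_male`, `masked`, and `selected` are illustrative.

```python
import pandas as pd

# The ten rows shown by df.head(10); only the two columns needed here.
sample = pd.DataFrame({
    "sex": ["Female", "Male", "Male", "Male", "Female",
            "Male", "Male", "Male", "Male", "Male"],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71, 2.00, 3.12, 1.96, 3.23],
})

is_male = sample["sex"] == "Male"

# where(): same length as the input, non-matching rows become NaN
masked = sample["tip"].where(is_male)

# .loc with a boolean mask: only the matching rows are kept
selected = sample.loc[is_male, "tip"]

print(len(masked), masked.isna().sum())  # full length, with NaN for the female rows
print(len(selected))                     # only the male rows remain
print(masked.sum(), selected.sum())      # equal up to floating-point rounding
```

Both forms give the same total; the `.loc` form simply avoids carrying the NaN placeholders, which matters for operations such as `unique()` that do not skip NaN.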

## Question 4: What is the maximum tip given?


In [7]:
df["tip"].max()

10.0

## Question 5: What are the unique days on which the tips were recorded?


In [9]:
filter2 = df["tip"] != 0.00
df["day"].where(filter2, inplace=False).unique()

array(['Sun', 'Sat', 'Thur', 'Fri'], dtype=object)

## Question 6: What is the average tip amount given by female customers?


In [10]:
filter3 = df["sex"] == "Female"
df["tip"].where(filter3, inplace=False).mean()

2.833448275862069

## Question 7: How many customers are non-smokers?


In [11]:
df["smoker"].value_counts()
# 151 non-smokers

smoker
No     151
Yes     93
Name: count, dtype: int64

## Question 8: What is the average total bill for dinners?


In [12]:
filter4 = df["time"] == "Dinner"
df["total_bill"].where(filter4, inplace=False).mean()

20.79715909090909

## Question 9: What is the smallest size of a group recorded?


In [13]:
df["size"].min()

1

## Question 10: What is the standard deviation of the total bill amounts?


In [14]:
df["total_bill"].describe()
# std = 8.902412 (describe() reports the sample standard deviation)

count    244.000000
mean      19.785943
std        8.902412
min        3.070000
25%       13.347500
50%       17.795000
75%       24.127500
max       50.810000
Name: total_bill, dtype: float64

## Question 11: How many male smokers are there compared to female smokers?


In [58]:
filter14 = df["smoker"] == "Yes"
filter15 = df["sex"] == "Male"
filter16 = df["sex"] == "Female"

print(df["sex"].where(filter14 & filter15, inplace=False).count())
# male smokers = 60
print(df["sex"].where(filter14 & filter16, inplace=False).count())
# female smokers = 33

60
33
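The two counts can also be read from a single contingency table. The sketch below uses `pd.crosstab` on a small hypothetical frame (`toy` is made up for illustration and is not taken from tips.csv); applied to `df["sex"]` and `df["smoker"]`, the same call would presumably show the 60 and 33 found above in its `Yes` column.

```python
import pandas as pd

# Hypothetical toy rows for illustration; not taken from tips.csv.
toy = pd.DataFrame({
    "sex":    ["Male", "Male", "Female", "Female", "Male", "Female"],
    "smoker": ["Yes",  "No",   "Yes",    "No",     "Yes",  "No"],
})

# Rows are indexed by sex, columns by smoker status; cells hold record counts.
counts = pd.crosstab(toy["sex"], toy["smoker"])
print(counts)

# Male versus female smokers in the toy data
print(counts.loc["Male", "Yes"], counts.loc["Female", "Yes"])
```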


## Question 12: How much more do male customers tip on average than female customers?
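No cell answers this question in the notebook. One possible approach, sketched here, is to group by `sex`, take the mean tip of each group, and subtract. The snippet runs on the `sex` and `tip` columns of the ten rows displayed by `df.head(10)`, so the gap it prints is for that sample only; running the same two lines on the full `df` would give the actual answer. The names `sample`, `mean_tip_by_sex`, and `tip_gap` are illustrative.

```python
import pandas as pd

# The ten rows shown by df.head(10); only the two columns needed here.
sample = pd.DataFrame({
    "sex": ["Female", "Male", "Male", "Male", "Female",
            "Male", "Male", "Male", "Male", "Male"],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71, 2.00, 3.12, 1.96, 3.23],
})

# Mean tip per group, indexed by the values of "sex"
mean_tip_by_sex = sample.groupby("sex")["tip"].mean()

# Positive value: male customers tip more on average in this sample
tip_gap = mean_tip_by_sex["Male"] - mean_tip_by_sex["Female"]

print(mean_tip_by_sex)
print(f"Male minus female mean tip: {tip_gap:.4f}")
```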

## Question 13: What is the median tip amount given by customers on Thursdays?

In [15]:
filter5 = df["day"] == "Thur" 
df["tip"].where(filter5, inplace=False).median()

2.3049999999999997

## Question 14: What is the total tip amount given by customers on Friday?


In [16]:
filter6 = df["day"] == "Fri"
df["tip"].where(filter6, inplace=False).sum()

51.96

## Question 15: What is the average tip amount for non-smokers during dinner time?

In [20]:
filter7 = df["time"] == "Dinner" 
filter8 = df["smoker"] == "No"
df["tip"].where(filter7 & filter8, inplace=False).mean()

3.1268867924528303

## Question 16: How much does the total bill vary between smokers and non-smokers?

In [53]:
# Total bills of parties with and without smokers
smoker_bills = df.loc[df["smoker"] == "Yes", "total_bill"]
non_smoker_bills = df.loc[df["smoker"] == "No", "total_bill"]

# Summed bills: smokers = 1930.3400000000001, non-smokers = 2897.4299999999994
print(non_smoker_bills.sum() - smoker_bills.sum())

967.0899999999992
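The difference of sums depends on group size as well as on bill amounts: Question 7 shows 151 non-smoker records against 93 smoker records. A complementary view, sketched below, compares the per-group mean and standard deviation. The frame `toy` is hypothetical and made up for illustration (it is not taken from tips.csv); the same `groupby` call on `df` would summarize the real data.

```python
import pandas as pd

# Hypothetical toy rows for illustration; not taken from tips.csv.
toy = pd.DataFrame({
    "smoker": ["Yes", "Yes", "Yes", "No", "No", "No", "No"],
    "total_bill": [10.0, 20.0, 30.0, 12.0, 16.0, 20.0, 24.0],
})

# One row per group: record count, summed bills, mean bill, sample std
summary = toy.groupby("smoker")["total_bill"].agg(["count", "sum", "mean", "std"])
print(summary)
```

In the toy data the non-smoker group has the larger sum only because it has more rows; its mean bill is lower, which is the kind of distinction the sum alone hides.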


## Question 17: What is the most common day for customers to visit the restaurant?

In [37]:
df["day"].value_counts()
# Most common day: Sat (Saturday), with 87 records

day
Sat     87
Sun     76
Thur    62
Fri     19
Name: count, dtype: int64

## Question 18: What is the range of tip amounts for female customers?


In [46]:
filter11 = df.loc[(df["sex"] == "Female"), 'tip']
print(f"({filter11.min()}, {filter11.max()})")

(1.0, 6.5)


## Question 19: Calculate the total bill for parties of 5 or more people (size).


In [29]:
filter9 = df["size"] >= 5
# Number of parties with 5 or more people: 9
print(df["total_bill"].where(filter9, inplace=False).count())
# Sum of their total bills: 289.65999999999997
df["total_bill"].where(filter9, inplace=False).sum()

9


289.65999999999997

## Question 20: What is the highest total bill recorded during lunch?


In [30]:
filter10 = df["time"] == "Lunch"
df["total_bill"].where(filter10, inplace=False).max()

43.11

## Question 21: How many customers visited the restaurant on weekends (Saturday and Sunday) and ordered a total bill of more than $20?



```
# Hint: use isin()

```
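Following the hint, one way to approach this is to build a weekend mask with `Series.isin`, combine it with a mask on `total_bill`, and count the rows where both are true. The sketch below wraps that in a hypothetical helper, `count_weekend_bills_over`, and runs it on a made-up frame `toy` that is not taken from tips.csv; calling the helper on `df` would answer the question for the real data.

```python
import pandas as pd

def count_weekend_bills_over(frame: pd.DataFrame, threshold: float = 20.0) -> int:
    """Count rows whose day is Sat or Sun and whose total_bill exceeds threshold."""
    weekend = frame["day"].isin(["Sat", "Sun"])    # True for weekend rows
    large_bill = frame["total_bill"] > threshold   # strictly greater than threshold
    return int((weekend & large_bill).sum())       # True counts as 1 when summed

# Hypothetical toy rows for illustration; not taken from tips.csv.
toy = pd.DataFrame({
    "day": ["Sat", "Sun", "Thur", "Fri", "Sat"],
    "total_bill": [25.0, 18.5, 30.0, 22.0, 20.0],
})

print(count_weekend_bills_over(toy))  # only the first row qualifies
```

Note that each row of the dataset is one party's bill, so the count is a number of records rather than of individual diners.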

