# Explanation of data

## 1-total_bill:

Description: The total amount of the bill (in dollars) for the meal, including the cost of food and drinks.
Data Type: Float
Example: 16.99

## 2-tip:

Description: The amount of the tip (in dollars) given by the customer to the server.
Data Type: Float
Example: 1.01

## 3-sex:

Description: The gender of the person who paid for the meal.
Data Type: Categorical (string)
Possible Values: 'Male', 'Female'
Example: 'Female'

## 4-smoker:

Description: Indicates whether there were smokers in the party.
Data Type: Categorical (string)
Possible Values: 'Yes', 'No'
Example: 'No'

## 5-day:

Description: The day of the week when the meal was served.
Data Type: Categorical (string)
Possible Values: 'Thur' (Thursday), 'Fri' (Friday), 'Sat' (Saturday), 'Sun' (Sunday)
Example: 'Sun'

## 6-time:

Description: The time of day when the meal was served.
Data Type: Categorical (string)
Possible Values: 'Lunch', 'Dinner'
Example: 'Dinner'

## 7-size:

Description: The number of people in the party.
Data Type: Integer
Example: 2
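
The value lists above lend themselves to a quick sanity check before any analysis. The following is a minimal sketch and not part of the original notebook: `EXPECTED_VALUES` and `check_schema` are hypothetical names, and `sample` holds only five illustrative records rather than the full file.

```python
import pandas as pd

# Allowed values for each categorical column, copied from the descriptions above.
EXPECTED_VALUES = {
    "sex": {"Male", "Female"},
    "smoker": {"Yes", "No"},
    "day": {"Thur", "Fri", "Sat", "Sun"},
    "time": {"Lunch", "Dinner"},
}


def check_schema(frame):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    for column, allowed in EXPECTED_VALUES.items():
        unexpected = set(frame[column].unique()) - allowed
        if unexpected:
            problems.append(f"{column}: unexpected values {sorted(unexpected)}")
    for column in ("total_bill", "tip"):
        if not pd.api.types.is_float_dtype(frame[column]):
            problems.append(f"{column}: expected float, got {frame[column].dtype}")
    if not pd.api.types.is_integer_dtype(frame["size"]):
        problems.append(f"size: expected integer, got {frame['size'].dtype}")
    return problems


# Five illustrative records in the documented layout (not the full dataset).
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "smoker": ["No", "No", "No", "No", "No"],
    "day": ["Sun", "Sun", "Sun", "Sun", "Sun"],
    "time": ["Dinner", "Dinner", "Dinner", "Dinner", "Dinner"],
    "size": [2, 3, 3, 2, 4],
})

print(check_schema(sample))  # an empty list means every check passed
```

In a real session the same function could be called on the loaded DataFrame instead of `sample`.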

## Load the dataset

In [50]:
import numpy as np
import pandas as pd

# Load the dataset
data1 = pd.read_csv("tips.csv")
# Work on a copy so the raw data in data1 stays unchanged
df = data1.copy()


## Display the first few rows

In [51]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4
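
The `Unnamed: 0` column in the output above is typically what pandas shows when a CSV was saved together with its row index. A minimal sketch, assuming that is the case for `tips.csv`: passing `index_col=0` to `pd.read_csv` reads that first column back as the index instead of as data. The CSV text below is an in-memory stand-in built from the five rows shown above, not the real file.

```python
import io

import pandas as pd

# In-memory stand-in for tips.csv: the header starts with an empty column name,
# which is how a saved row index usually appears in a CSV file.
csv_text = """,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4
"""

# Default read: the unnamed first column becomes a data column.
with_extra = pd.read_csv(io.StringIO(csv_text))

# index_col=0: the first column is used as the row index instead.
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(with_extra.columns))
print(list(clean.columns))
```

The rest of the notebook works either way, because none of the questions use the extra column.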


## Question 1: What is the average total bill amount?

In [52]:
average_total_bill = df['total_bill'].mean()
average_total_bill

19.78594262295082

## Question 2: How many records are in the dataset?


In [53]:
num_records = df.shape[0]
num_records

244

## Question 3: What is the total amount of tips given by male customers?


In [54]:
male_tips = df[df['sex'] == 'Male']['tip'].sum()
male_tips

485.07

## Question 4: What is the maximum tip given?


In [55]:
max_tip = df['tip'].max()
max_tip

10.0

## Question 5: What are the unique days on which the tips were recorded?


In [56]:
unique_days = df['day'].unique()
unique_days

array(['Sun', 'Sat', 'Thur', 'Fri'], dtype=object)

## Question 6: What is the average tip amount given by female customers?


In [57]:
female_average_tip = df[df['sex'] == 'Female']['tip'].mean()
female_average_tip

2.8334482758620685

## Question 7: How many customers are non-smokers, and what percentage of all customers is that?


In [58]:
total_customers = df.shape[0]
non_smokers = df[df['smoker'] == 'No'].shape[0]
print("no of non smokers:", non_smokers)
# Share of non-smokers among all records, as a percentage
percentage_non_smokers = (non_smokers / total_customers) * 100
print("percentage of non smokers:", round(percentage_non_smokers, 2), "%")

no of non smokers: 151
percentage of non smokers: 61.89 %
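
The count and the percentage can also be read from `value_counts`, which tallies every category in one call. This is a minimal sketch on a made-up four-row frame (`toy` is hypothetical and not taken from the dataset), so the numbers differ from the output above.

```python
import pandas as pd

# Hypothetical toy data, only to illustrate the calls.
toy = pd.DataFrame({"smoker": ["No", "Yes", "No", "No"]})

# Absolute counts per category.
counts = toy["smoker"].value_counts()

# normalize=True returns fractions; multiply by 100 for percentages.
shares = toy["smoker"].value_counts(normalize=True) * 100

print("non-smokers:", counts["No"])
print("percentage:", round(shares["No"], 2), "%")
```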


## Question 8: What is the average total bill for dinners?


In [59]:
average_dinner_bill = df[(df['time'] == 'Dinner')]['total_bill'].mean()
average_dinner_bill

20.79715909090909

## Question 9: What is the smallest size of a group recorded?


In [60]:
smallest_group_size = df['size'].min()
smallest_group_size

1

## Question 10: What is the standard deviation of the total bill amounts?


In [61]:
std_dev_total_bill = df['total_bill'].std()
std_dev_total_bill

8.902411954856856

## Question 11: How many more male smokers are there than female smokers?


In [62]:
male_smokers = df[(df['sex'] == 'Male') & (df['smoker'] == 'Yes')].shape[0]
female_smokers = df[(df['sex'] == 'Female') & (df['smoker'] == 'Yes')].shape[0]
difference_in_smokers_no = male_smokers - female_smokers
print(difference_in_smokers_no)

27
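
The single number above hides the two counts behind it. A minimal sketch of one way to see both at once, using `pd.crosstab` on a made-up five-row frame (`toy` is hypothetical, so its counts are not those of the dataset):

```python
import pandas as pd

# Hypothetical toy data, only to illustrate the call.
toy = pd.DataFrame({
    "sex": ["Male", "Male", "Male", "Female", "Female"],
    "smoker": ["Yes", "Yes", "No", "Yes", "No"],
})

# Rows are sex, columns are smoker status, cells are record counts.
counts = pd.crosstab(toy["sex"], toy["smoker"])

male_smokers = int(counts.loc["Male", "Yes"])
female_smokers = int(counts.loc["Female", "Yes"])

print(counts)
print("difference:", male_smokers - female_smokers)
```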


## Question 12: How much more do male customers tip on average than female customers?

In [63]:
average_male_tip = df[df['sex'] == 'Male']['tip'].mean()
average_female_tip = df[df['sex'] == 'Female']['tip'].mean()
difference_in_average_tips = average_male_tip - average_female_tip
difference_in_average_tips

0.25616955853283585
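
The two filtered means can also be computed in one pass with `groupby`. A minimal sketch, using only the five rows printed by `df.head()` earlier (so the result differs from the full-dataset value above); `head_rows` is a hypothetical name:

```python
import pandas as pd

# The sex and tip columns of the five rows shown by df.head().
head_rows = pd.DataFrame({
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
})

# One mean per category of "sex".
mean_tip_by_sex = head_rows.groupby("sex")["tip"].mean()

difference = mean_tip_by_sex["Male"] - mean_tip_by_sex["Female"]

print(mean_tip_by_sex)
print(round(difference, 4))
```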

## Question 13: What is the median tip amount given by customers on Thursdays?

In [64]:
thursday_median_tip = df[(df['day'] == 'Thur')]['tip'].median()
thursday_median_tip

2.3049999999999997

## Question 14: What is the total tip amount given by customers on Friday?


In [65]:
friday_total_tip = df[(df['day'] == 'Fri')]['tip'].sum()
friday_total_tip

51.959999999999994

## Question 15: What is the average tip amount for non-smokers during dinner time?

In [66]:
non_smoker_dinner_average_tip = df[(df['smoker'] == 'No') & (df['time'] == 'Dinner')]['tip'].mean()
non_smoker_dinner_average_tip

3.1268867924528303

## Question 16: How much does the summed total bill differ between smokers and non-smokers?

In [67]:
smoker_total_bill = df[df['smoker'] == 'Yes']['total_bill'].sum()
non_smoker_total_bill = df[df['smoker'] == 'No']['total_bill'].sum()
difference_in_total_bill = smoker_total_bill - non_smoker_total_bill
print(f"The difference in total bill between smokers and non-smokers is : {difference_in_total_bill}")

The difference in total bill between smokers and non-smokers is : -967.0899999999992
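
A difference of sums depends on how many records each group has: 151 of the 244 records are non-smokers, so the non-smoker sum can be larger simply because that group is bigger. A minimal sketch of comparing count, sum, and mean side by side, on made-up data (`toy` is hypothetical and chosen so that the two gaps point in opposite directions):

```python
import pandas as pd

# Hypothetical toy data: three small non-smoker bills, one large smoker bill.
toy = pd.DataFrame({
    "smoker": ["No", "No", "No", "Yes"],
    "total_bill": [10.0, 20.0, 30.0, 40.0],
})

# Count, sum, and mean of total_bill for each smoker category.
summary = toy.groupby("smoker")["total_bill"].agg(["count", "sum", "mean"])

sum_gap = summary.loc["Yes", "sum"] - summary.loc["No", "sum"]
mean_gap = summary.loc["Yes", "mean"] - summary.loc["No", "mean"]

print(summary)
print("gap in sums:", sum_gap)
print("gap in means:", mean_gap)
```

In this toy frame the smoker sum is lower while the smoker mean is higher, which is why the per-record mean is often the more informative comparison when group sizes differ.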


## Question 17: What is the most common day for customers to visit the restaurant?

In [68]:
most_common_day = df['day'].mode()[0]
print(f"The most common day for customers to visit the restaurant is: {most_common_day}")


The most common day for customers to visit the restaurant is: Sat


## Question 18: What is the range of tip amounts for female customers?


In [69]:
female_min_tip = df[(df['sex'] == 'Female')]['tip'].min()
female_max_tip = df[(df['sex'] == 'Female')]['tip'].max()
female_tip_range = female_max_tip - female_min_tip
print(f"The range of tip amounts for female customers is: {female_tip_range}")

The range of tip amounts for female customers is: 5.5


## Question 19: Calculate the total bill for parties of 5 or more people (size).


In [70]:
party_of_five_or_more = df[df['size'] >= 5]['total_bill'].sum()
print(f"The total bill for parties of 5 or more people is: {party_of_five_or_more}")

The total bill for parties of 5 or more people is: 289.65999999999997


## Question 20: What is the highest total bill recorded during lunch?


In [71]:
highest_lunch_bill = df[(df['time'] == 'Lunch')]['total_bill'].max()
print(f"The highest total bill recorded during lunch is: {highest_lunch_bill}")

The highest total bill recorded during lunch is: 43.11


## Question 21: How many customers visited the restaurant on weekends (Saturday and Sunday) and had a total bill of more than $20?

In [72]:
weekend_customers = df[(df['day'].isin(['Sat', 'Sun'])) & (df['total_bill'] > 20)].shape[0]
print(f"Weekend customers with a total bill over $20: {weekend_customers}")

Weekend customers with a total bill over $20: 75




```
# Hint: isin()

```
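
To make the hint concrete: `Series.isin` tests each value against a list and returns a boolean mask, which can then be combined with other conditions using `&`. A minimal sketch on made-up data (`toy` is hypothetical, not the dataset):

```python
import pandas as pd

# Hypothetical toy data, only to illustrate the calls.
toy = pd.DataFrame({
    "day": ["Sat", "Sun", "Fri", "Sat", "Thur"],
    "total_bill": [25.0, 18.0, 30.0, 21.5, 40.0],
})

# True where the day is one of the listed values.
is_weekend = toy["day"].isin(["Sat", "Sun"])

# The same mask written out with == and |, for comparison.
is_weekend_long = (toy["day"] == "Sat") | (toy["day"] == "Sun")

over_20 = toy["total_bill"] > 20

# Summing a boolean mask counts the rows where it is True.
matching_rows = int((is_weekend & over_20).sum())
print(matching_rows)
```

`isin` stays readable as the list of values grows, whereas the chained `==` form needs one extra clause per value.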

