In [3]:
import pandas as pd
import tests

# Data Description
In these exercises you'll use a dataset that contains information about tips.

Details for the columns:
* **total_bill**: a numeric vector, **the bill amount (dollars)**
* **tip**: a numeric vector, **the tip amount (dollars)**
* **sex**: a factor with levels **\[Female, Male\], gender of the payer of the bill**
* **smoker**: a factor with levels **\[No, Yes\], whether the party included smokers**
* **day**: a factor with levels **\['Sun', 'Sat', 'Thur', 'Fri'\], day of the week**
* **time**: a factor with levels **\[Dinner, Lunch\], rough time of day**
* **size**: a numeric vector, **number of people in party**

# Data Source
https://github.com/mwaskom/seaborn-data/

# Loading the Data
Go to this [link](https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv) to download the tips data.

After you download the data, load it **as a pandas DataFrame** into a variable called **tips_data**. Then use the DataFrame method **head** to **see the first 5 rows** of the data.

In [4]:
tips_data = pd.read_csv("tips.csv")
tips_data.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4
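As a side note, `pd.read_csv` accepts a file-like object as well as a file path, so you can try it on a small in-memory sample. The sketch below assumes a hypothetical `sample_csv` string holding the first three rows shown above; it is not the full data set.

```python
import io

import pandas as pd

# Hypothetical in-memory sample: the first three rows of the output above.
sample_csv = """total_bill,tip,sex,smoker,day,time,size
16.99,1.01,Female,No,Sun,Dinner,2
10.34,1.66,Male,No,Sun,Dinner,3
21.01,3.5,Male,No,Sun,Dinner,3
"""

# read_csv accepts a path or any file-like object.
sample = pd.read_csv(io.StringIO(sample_csv))

# head(n) returns the first n rows (n defaults to 5).
first_two = sample.head(2)
print(first_two)

# shape is a (rows, columns) tuple.
print(sample.shape)
```

Calling `head()` without an argument on this sample would return all three rows, because the sample has fewer than five.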


Let's see how many rows and columns our DataFrame has. This is solved for you.

In [5]:
tips_data.shape

(244, 7)

The DataFrame has 244 rows and 7 columns.

# Exploratory Data Analysis
This part will be solved for you.

## Size

In [6]:
def print_min_max(column):
    print("The min value for the column \"" + column + "\" is:",  tips_data[column].min())
    print("The max value for the column \"" + column + "\" is:",  tips_data[column].max())

In [7]:
print_min_max("tip")
print()
print_min_max("total_bill")
print()
print_min_max("size")

The min value for the column "tip" is: 1.0
The max value for the column "tip" is: 10.0

The min value for the column "total_bill" is: 3.07
The max value for the column "total_bill" is: 50.81

The min value for the column "size" is: 1
The max value for the column "size" is: 6


# Subsetting / Filtering the Data
If you want **to prevent very long outputs** you can use the **.head() method** to select only the **first 5 rows** of your result.
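Before starting the tasks, it may help to see how filtering works in general. The sketch below uses a hypothetical toy DataFrame called `toy` (not the real tips data): a comparison on a column produces a boolean Series, and indexing with that Series keeps only the rows where it is `True`.

```python
import pandas as pd

# Hypothetical toy frame, not the real tips data.
toy = pd.DataFrame({
    "total_bill": [16.99, 50.81, 21.01],
    "tip": [1.01, 10.0, 3.5],
})

# A comparison on a column yields a boolean Series (the mask).
mask = toy["tip"] > 3
print(mask.tolist())

# Indexing with the mask keeps only the rows where it is True.
subset = toy[mask]
print(subset)

# head() limits how many rows of the result are displayed.
print(subset.head(1))
```

Note that the filtered result keeps the original row labels, which is why the outputs in the tasks below show index values such as 170 or 125.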

## Task 1
Get all records where **tip is equal to 10**.

In [8]:
answer = tips_data[tips_data.tip == 10]
tests.assert_task1(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
170,50.81,10.0,Male,Yes,Sat,Dinner,3


## Task 2
Get all records where **the number of people (the size column) is equal to 5 or 6**.

Be careful: the pandas DataFrame object has an attribute called **"size"**, so `tips_data.size` returns the number of elements, not the column. In this case, **we must use the square brackets syntax** to get the **"size" column**.
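A minimal sketch of the difference, using a hypothetical toy DataFrame called `toy` rather than the real data: attribute access resolves to `DataFrame.size` (the number of cells), while square brackets select the column.

```python
import pandas as pd

# Hypothetical toy frame with a column named "size".
toy = pd.DataFrame({"tip": [1.0, 5.0, 2.0], "size": [2, 6, 3]})

# Attribute access resolves to DataFrame.size: rows * columns.
cell_count = toy.size
print(cell_count)

# Square brackets always select the column.
size_column = toy["size"]
print(size_column.tolist())
```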

In [12]:
answer = tips_data[tips_data["size"].between(5, 6)]
tests.assert_task2(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
125,29.8,4.2,Female,No,Thur,Lunch,6
141,34.3,6.7,Male,No,Thur,Lunch,6
142,41.19,5.0,Male,No,Thur,Lunch,5
143,27.05,5.0,Female,No,Thur,Lunch,6
155,29.85,5.14,Female,No,Sun,Dinner,5


## Task 3
Get all records where the **party includes smokers**.

In [14]:
answer = tips_data[tips_data.smoker == "Yes"]
tests.assert_task3(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
56,38.01,3.0,Male,Yes,Sat,Dinner,4
58,11.24,1.76,Male,Yes,Sat,Dinner,2
60,20.29,3.21,Male,Yes,Sat,Dinner,2
61,13.81,2.0,Male,Yes,Sat,Dinner,2
62,11.02,1.98,Male,Yes,Sat,Dinner,2


## Task 4
Get all records where the party **doesn't include smokers**.

In [16]:
answer = tips_data[tips_data.smoker == "No"]
tests.assert_task4(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4


## Task 5
Get all records where the **tip is equal to 5, 7 or 10**.

In [17]:
answer = tips_data[tips_data.tip.isin([5, 7, 10])]
tests.assert_task5(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
11,35.26,5.0,Female,No,Sun,Dinner,4
39,31.27,5.0,Male,No,Sat,Dinner,3
46,22.23,5.0,Male,No,Sun,Dinner,2
73,25.28,5.0,Female,Yes,Sat,Dinner,2
83,32.68,5.0,Male,Yes,Thur,Lunch,2


## Task 6
Get all records where the **day is Saturday or Sunday**. Remember that **the values in the day column** are written like this: **'Sun', 'Sat', 'Thur', 'Fri'**

In [18]:
answer = tips_data[tips_data.day.isin(["Sat", "Sun"])]
tests.assert_task6(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4
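For comparison, `isin` is shorthand for several equality checks joined with `|`. The sketch below checks this on a hypothetical toy DataFrame called `toy`; both forms should select the same rows.

```python
import pandas as pd

# Hypothetical toy frame, not the real tips data.
toy = pd.DataFrame({
    "day": ["Sun", "Thur", "Sat", "Fri"],
    "tip": [1.01, 2.0, 3.5, 4.0],
})

# Membership test against a list of allowed values.
weekend_isin = toy[toy["day"].isin(["Sat", "Sun"])]

# The same selection written as two comparisons joined with "or".
weekend_or = toy[(toy["day"] == "Sat") | (toy["day"] == "Sun")]

print(weekend_isin.equals(weekend_or))
print(weekend_isin)
```

The `isin` form tends to stay readable as the list of accepted values grows.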


## Task 7
Get all records where **the number of people is greater than 4** and **the payer of the bill** (the sex column) **is female**.

In [20]:
answer = tips_data[(tips_data["size"] > 4) & (tips_data.sex == "Female")]
tests.assert_task7(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
125,29.8,4.2,Female,No,Thur,Lunch,6
143,27.05,5.0,Female,No,Thur,Lunch,6
155,29.85,5.14,Female,No,Sun,Dinner,5


## Task 8
Get all records where **the total bill is greater than 40** and **the tip is greater than 5**.

In [21]:
answer = tips_data[(tips_data.total_bill > 40) & (tips_data.tip > 5)]
tests.assert_task8(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
59,48.27,6.73,Male,No,Sat,Dinner,4
170,50.81,10.0,Male,Yes,Sat,Dinner,3
212,48.33,9.0,Male,No,Sat,Dinner,4
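As an alternative to boolean masks, pandas also offers `DataFrame.query`, which takes the condition as a string. This is only a sketch on a hypothetical toy DataFrame called `toy`; the exercises themselves expect the mask-based form.

```python
import pandas as pd

# Hypothetical toy frame, not the real tips data.
toy = pd.DataFrame({
    "total_bill": [48.27, 16.99, 45.0],
    "tip": [6.73, 1.01, 3.0],
})

# Mask-based filtering: each condition needs its own parentheses.
with_mask = toy[(toy["total_bill"] > 40) & (toy["tip"] > 5)]

# query() expresses the same conditions as a string.
with_query = toy.query("total_bill > 40 and tip > 5")

print(with_query)
print(with_mask.equals(with_query))
```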


## Task 9
Get all records where **the number of people is greater than 4** or **it's a dinner** party (the time column).

In [22]:
answer = tips_data[(tips_data["size"] > 4) | (tips_data.time == "Dinner")]
tests.assert_task9(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
0,16.99,1.01,Female,No,Sun,Dinner,2
1,10.34,1.66,Male,No,Sun,Dinner,3
2,21.01,3.5,Male,No,Sun,Dinner,3
3,23.68,3.31,Male,No,Sun,Dinner,2
4,24.59,3.61,Female,No,Sun,Dinner,4


## Task 10
Get all records where **the party doesn't include smokers**, the **total bill is greater than 30**, and the **day is Thursday or Friday**.

In [26]:
smoker_filter = tips_data.smoker == "No"
total_bill_filter = tips_data.total_bill > 30
day_filter = tips_data.day.isin(["Thur", "Fri"])
all_filters = smoker_filter & total_bill_filter & day_filter

answer = tips_data[all_filters]
tests.assert_task10(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
85,34.83,5.17,Female,No,Thur,Lunch,4
141,34.3,6.7,Male,No,Thur,Lunch,6
142,41.19,5.0,Male,No,Thur,Lunch,5
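Storing each condition in its own variable, as in the solution above, also makes the conditions easy to inspect. The sketch below uses a hypothetical toy DataFrame called `toy`: `sum()` counts the rows a mask keeps, and `~` inverts a mask.

```python
import pandas as pd

# Hypothetical toy frame, not the real tips data.
toy = pd.DataFrame({
    "smoker": ["No", "Yes", "No", "No"],
    "total_bill": [34.83, 40.0, 12.5, 41.19],
    "day": ["Thur", "Fri", "Thur", "Sat"],
})

smoker_filter = toy["smoker"] == "No"
total_bill_filter = toy["total_bill"] > 30
day_filter = toy["day"].isin(["Thur", "Fri"])

# True counts as 1, so sum() gives the number of rows each mask keeps.
kept_counts = (smoker_filter.sum(), total_bill_filter.sum(), day_filter.sum())
print(kept_counts)

# Combine the named masks with & (logical "and").
all_filters = smoker_filter & total_bill_filter & day_filter
print(toy[all_filters])

# ~ inverts a mask: here it selects the parties that include smokers.
smokers = toy[~smoker_filter]
print(smokers)
```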


## Task 11
Get all records where the **day is Friday or Saturday**, the **time is lunch**, and the **number of people is less than 4**.

In [27]:
day_filter = tips_data.day.isin(["Fri", "Sat"])
time_filter = tips_data.time == "Lunch"
size_filter = tips_data["size"] < 4
all_filters = day_filter & time_filter & size_filter

answer = tips_data[all_filters]
tests.assert_task11(answer)

Correct answer!


Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
220,12.16,2.2,Male,Yes,Fri,Lunch,2
221,13.42,3.48,Female,Yes,Fri,Lunch,2
222,8.58,1.92,Male,Yes,Fri,Lunch,1
223,15.98,3.0,Female,No,Fri,Lunch,3
224,13.42,1.58,Male,Yes,Fri,Lunch,2


## Task 12

Get all records where 

the **day is Friday**, the **number of people is less than 3** and the **payer of the bill is female**

or

the **day is Sunday**, the **number of people is greater than 4** and the **payer of the bill is male**

In [17]:
first_filter = (tips_data.day == "Fri") & (tips_data["size"] < 3) & (tips_data.sex == "Female")
second_filter = (tips_data.day == "Sun") & (tips_data["size"] > 4) & (tips_data.sex == "Male")
all_filters = first_filter | second_filter

answer = tips_data[all_filters]
tests.assert_task12(answer)

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size
92,5.75,1.0,Female,Yes,Fri,Dinner,2
93,16.32,4.3,Female,Yes,Fri,Dinner,2
94,22.75,3.25,Female,No,Fri,Dinner,2
100,11.35,2.5,Female,Yes,Fri,Dinner,2
101,15.38,3.0,Female,Yes,Fri,Dinner,2
156,48.17,5.0,Male,No,Sun,Dinner,6
185,20.69,5.0,Male,No,Sun,Dinner,5
187,30.46,2.0,Male,Yes,Sun,Dinner,5
221,13.42,3.48,Female,Yes,Fri,Lunch,2
225,16.27,2.5,Female,Yes,Fri,Lunch,2
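The solution above builds the two groups separately and only then joins them with `|`, which keeps the grouping explicit. The sketch below, using hypothetical boolean Series `a`, `b` and `c`, shows why this matters: `&` binds more tightly than `|`, so moving the parentheses changes the result.

```python
import pandas as pd

# Hypothetical boolean masks for illustration only.
a = pd.Series([True, True, False, False])
b = pd.Series([True, False, True, False])
c = pd.Series([False, False, True, True])

# & binds more tightly than |, so this parses as (a & b) | c.
implicit = a & b | c
explicit = (a & b) | c

# Grouping the "or" first gives a different mask.
regrouped = a & (b | c)

print(implicit.tolist())
print(regrouped.tolist())
```

Writing the parentheses out, or naming each group as the solution does, avoids relying on operator precedence.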
