# Ex1 - Filtering and Sorting Data

This time we are going to pull data directly from the internet.
Special thanks to: https://github.com/justmarkham for sharing the dataset and materials.

### Step 1. Import the necessary libraries

In [24]:
import pandas as pd

### Step 2. Import the dataset from this [address](https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv). 

### Step 3. Assign it to a variable called chipo.

In [25]:
chipo = pd.read_csv("https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv", delimiter="\t")
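
To see what the tab delimiter does without touching the network, here is a minimal sketch that parses a small inline sample. The two rows are made up for illustration and only mimic the column layout of the file. `sep` is used here; `delimiter` is documented as an alias for it.

```python
import io

import pandas as pd

# Hypothetical two-row sample laid out like chipotle.tsv (tab-separated).
sample_tsv = (
    "order_id\tquantity\titem_name\tchoice_description\titem_price\n"
    "1\t1\tChips and Fresh Tomato Salsa\tNULL\t$2.39 \n"
    "1\t1\tIzze\t[Clementine]\t$3.39 \n"
)

# sep="\t" splits each line on tabs; "NULL" is read as a missing value by default.
sample = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

# item_price is still text at this point because of the "$" prefix.
print(sample.dtypes)
```

The price column stays a string column after loading, which is why Step 4 needs an explicit conversion.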

### Step 4. How many products cost more than $10.00?

Tip: which is the fastest way to convert item prices from str to float?

In [26]:
%timeit chipo["item_price"].str.replace("$", "").astype("float")

2.2 ms ± 20.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [27]:
%timeit chipo["item_price"].apply(lambda s: float(s.replace("$", "")))

2.86 ms ± 24.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [28]:
%timeit [float(s[1:-1]) for s in chipo["item_price"]]

2.2 ms ± 13 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [29]:
chipo["item_price"] = chipo["item_price"].str.replace("$", "", regex=False).astype("float")
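
One caveat on the conversion: `$` is also the regular-expression end-of-string anchor, and the default of the `regex` argument of `Series.str.replace` has changed across pandas releases (to my knowledge it became `False` in pandas 2.0). A minimal sketch that spells the intent out with `regex=False`. The sample prices are hypothetical, and the trailing space is an assumption suggested by the `s[1:-1]` variant in the timing cells.

```python
import pandas as pd

# Hypothetical raw prices: a "$" prefix and (assumed) one trailing space.
prices = pd.Series(["$2.39 ", "$10.98 ", "$44.25 "])

# regex=False treats "$" as a literal character, whatever the pandas version.
# str.strip() removes surrounding whitespace before the numeric cast.
as_float = (
    prices.str.replace("$", "", regex=False)
    .str.strip()
    .astype(float)
)
```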

In [30]:
chipo[chipo["item_price"] > 10]["item_name"].nunique()
# Open question: should this be restricted to rows with quantity == 1, as the reference solution does?
# "Product" is ambiguous here: it could mean an item, or a combination of an item and its choices.

31
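
The two readings raised in the comments can give different counts. The sketch below uses hypothetical rows and assumes that `item_price` is the total for the row rather than a per-unit price. It compares filtering on the row total with filtering on a derived unit price.

```python
import pandas as pd

# Hypothetical rows; item_price is assumed to be the total for the row.
orders = pd.DataFrame({
    "quantity": [1, 2, 1, 10],
    "item_name": ["Chicken Bowl", "Chicken Bowl", "Canned Soda", "Canned Soda"],
    "item_price": [10.98, 21.96, 1.09, 10.90],
})

# Reading 1: items that appear in any row whose total exceeds $10.
by_row = orders.loc[orders["item_price"] > 10, "item_name"].nunique()

# Reading 2: items whose per-unit price exceeds $10.
unit_price = orders["item_price"] / orders["quantity"]
by_unit = orders.loc[unit_price > 10, "item_name"].nunique()
```

Under the first reading the ten-can row makes Canned Soda count as a product over $10. Under the second it does not.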

### Step 5. What is the price of each item? 
###### Print a data frame with only two columns: item_name and item_price

In [34]:
# Omit rows whose quantity is more than 1,
# then drop duplicates to see the price of every item/choice combination available in the data.
chipo_items = chipo[chipo["quantity"] == 1][["item_name", "choice_description", "item_price"]].drop_duplicates()
chipo_items

Unnamed: 0,item_name,choice_description,item_price
0,Chips and Fresh Tomato Salsa,,2.39
1,Izze,[Clementine],3.39
2,Nantucket Nectar,[Apple],3.39
3,Chips and Tomatillo-Green Chili Salsa,,2.39
5,Chicken Bowl,"[Fresh Tomato Salsa (Mild), [Rice, Cheese, Sou...",10.98
...,...,...,...
4602,Barbacoa Burrito,[Tomatillo Green Chili Salsa],9.25
4607,Steak Burrito,"[Tomatillo Green Chili Salsa, [Rice, Cheese, S...",11.75
4610,Steak Burrito,"[Fresh Tomato Salsa, [Rice, Sour Cream, Cheese...",11.75
4611,Veggie Burrito,"[Tomatillo Green Chili Salsa, [Rice, Fajita Ve...",11.25
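
The step asks for only two columns, so a narrower variant is worth sketching. The rows below are hypothetical, and `price_list` is a name introduced here for illustration. When every selected column takes part in the comparison, `drop_duplicates()` needs no `subset` argument.

```python
import pandas as pd

# Hypothetical rows in the shape of the cleaned chipo frame.
rows = pd.DataFrame({
    "quantity": [1, 1, 2, 1],
    "item_name": ["Izze", "Izze", "Izze", "Chicken Bowl"],
    "choice_description": ["[Clementine]", "[Clementine]", "[Clementine]", "[Rice]"],
    "item_price": [3.39, 3.39, 6.78, 10.98],
})

# Keep single-unit rows and only the two requested columns.
single = rows.loc[rows["quantity"] == 1, ["item_name", "item_price"]]

# Duplicates are judged on all remaining columns by default.
price_list = single.drop_duplicates().reset_index(drop=True)
```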


### Step 6. Sort by the name of the item

In [35]:
chipo_items.sort_values(by="item_name")

Unnamed: 0,item_name,choice_description,item_price
357,6 Pack Soft Drink,[Coke],6.49
341,6 Pack Soft Drink,[Diet Coke],6.49
298,6 Pack Soft Drink,[Sprite],6.49
721,6 Pack Soft Drink,[Nestea],6.49
3141,6 Pack Soft Drink,[Lemonade],6.49
...,...,...,...
2384,Veggie Soft Tacos,"[Roasted Chili Corn Salsa, [Fajita Vegetables,...",8.75
781,Veggie Soft Tacos,"[Fresh Tomato Salsa, [Black Beans, Cheese, Sou...",8.75
1395,Veggie Soft Tacos,"[Fresh Tomato Salsa (Mild), [Pinto Beans, Rice...",8.49
2851,Veggie Soft Tacos,"[Roasted Chili Corn Salsa (Medium), [Black Bea...",8.49
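
Sorting on `item_name` alone does not order the rows within one item, as the mixed Veggie Soft Tacos prices above show. If a deterministic order within each item is wanted, one option is a second sort key. This is a sketch on hypothetical rows, not part of the exercise.

```python
import pandas as pd

# Hypothetical rows with a repeated item name.
items = pd.DataFrame({
    "item_name": ["Veggie Soft Tacos", "6 Pack Soft Drink", "Veggie Soft Tacos"],
    "item_price": [8.49, 6.49, 8.75],
})

# Sort by name ascending, then by price descending within each name.
ordered = items.sort_values(
    by=["item_name", "item_price"],
    ascending=[True, False],
)
```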


### Step 7. What was the quantity of the most expensive item ordered?

In [39]:
chipo.sort_values(by="item_price", ascending=False).head(1)

Unnamed: 0,order_id,quantity,item_name,choice_description,item_price
3598,1443,15,Chips and Fresh Tomato Salsa,,44.25
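
An alternative to sorting the whole frame is to look up the row with the largest price directly. The sketch below uses `idxmax` on a small frame. The 44.25 / 15 row mirrors the output above; the other rows are hypothetical.

```python
import pandas as pd

orders = pd.DataFrame({
    "quantity": [1, 15, 2],
    "item_name": ["Chicken Bowl", "Chips and Fresh Tomato Salsa", "Steak Burrito"],
    "item_price": [10.98, 44.25, 23.50],
})

# idxmax returns the index label of the largest item_price.
top = orders.loc[orders["item_price"].idxmax()]
top_quantity = int(top["quantity"])
```

If several rows tie for the maximum, `idxmax` returns only the first, so sorting may still be preferable when ties matter.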


### Step 8. How many times was a Veggie Salad Bowl ordered?

In [44]:
chipo[chipo["item_name"] == "Veggie Salad Bowl"]["order_id"].nunique()

18
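
"How many times" can be read in at least three ways: matching rows, distinct orders, or total units. The cell above counts distinct orders. A sketch on hypothetical rows shows how the three counts can differ.

```python
import pandas as pd

# Hypothetical rows: order 1 contains two separate Veggie Salad Bowl rows.
orders = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "quantity": [1, 2, 1, 1],
    "item_name": [
        "Veggie Salad Bowl",
        "Veggie Salad Bowl",
        "Veggie Salad Bowl",
        "Canned Soda",
    ],
})

bowls = orders[orders["item_name"] == "Veggie Salad Bowl"]

n_rows = len(bowls)                      # matching rows
n_orders = bowls["order_id"].nunique()   # distinct orders
n_units = int(bowls["quantity"].sum())   # total units ordered
```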

### Step 9. How many times did someone order more than one Canned Soda?

In [49]:
chipo[(chipo["item_name"] == "Canned Soda") & (chipo["quantity"] > 1)]
# The reference solution gives 20 (the number of rows in this table), but only 18 distinct orders are involved,
# because 2 orders each contain more than one Canned Soda row with quantity > 1.

Unnamed: 0,order_id,quantity,item_name,choice_description,item_price
18,9,2,Canned Soda,[Sprite],2.18
51,23,2,Canned Soda,[Mountain Dew],2.18
162,73,2,Canned Soda,[Diet Coke],2.18
171,76,2,Canned Soda,[Diet Dr. Pepper],2.18
350,150,2,Canned Soda,[Diet Coke],2.18
352,151,2,Canned Soda,[Coca Cola],2.18
698,287,2,Canned Soda,[Coca Cola],2.18
700,288,2,Canned Soda,[Coca Cola],2.18
909,376,2,Canned Soda,[Mountain Dew],2.18
1091,450,2,Canned Soda,[Dr. Pepper],2.18
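
The gap between rows and orders can be reproduced on a small hypothetical frame. The sketch also shows a third possible reading, which sums the Canned Soda quantity per order first. That reading is an assumption of mine, not something the exercise asks for.

```python
import pandas as pd

# Hypothetical rows: order 23 has two multi-can rows, order 40 has two single-can rows.
orders = pd.DataFrame({
    "order_id": [9, 23, 23, 40, 40, 51],
    "quantity": [2, 2, 3, 1, 1, 1],
    "item_name": ["Canned Soda"] * 6,
})

soda = orders[orders["item_name"] == "Canned Soda"]
multi = soda[soda["quantity"] > 1]

n_rows = len(multi)                      # rows with quantity > 1
n_orders = multi["order_id"].nunique()   # distinct orders among those rows

# Third reading: orders whose total Canned Soda quantity exceeds 1.
per_order = soda.groupby("order_id")["quantity"].sum()
n_orders_by_total = int((per_order > 1).sum())
```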
