# Review of last week's exercises

### Starting with number 5

Last week I told you to:

>Calculate, for each order number, how much the order costs. 
>
>In other words, you're going to iterate through the DataFrame and, for each order number, add up the `item_price * quantity` for each item in the order. 
>
>Store the order prices in a list. Then, tell me how much the 100th, 500th, and 1000th order cost. 

There are a bunch of ways to do this. Here's one:

In [None]:
# import pandas
import pandas as pd
# read in chipotle data
chip = pd.read_csv("chipotle.tsv", sep='\t')

In [None]:
# make a copy so we can keep our original data safe
# (plain `newchip = chip` would only give the same DataFrame a second name)
newchip = chip.copy()

# make a new column that contains the total item price of `quantity * item_price`
newchip['total_item_price'] = newchip['quantity'] * newchip['item_price']

# group by the order id and sum up the total item prices
newchip = newchip.groupby("order_id")['total_item_price'].sum()

# display the totals for order ids 100, 500, and 1000
# (newchip is indexed by order_id, so we look these up by label)
print(newchip.loc[[100, 500, 1000]])
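
The exercise asked for the order prices in a list, so here is a minimal sketch of that last step on a small made-up DataFrame. The `toy` data and the names `order_totals` and `order_prices` are hypothetical, and the sketch assumes `item_price` is already numeric and that order ids are consecutive and start at 1.

```python
import pandas as pd

# hypothetical toy data standing in for chipotle.tsv (values are made up)
toy = pd.DataFrame({
    "order_id":   [1, 1, 2, 3, 3],
    "quantity":   [1, 2, 1, 3, 1],
    "item_price": [2.50, 1.00, 8.00, 1.50, 4.00],
})

# total price of each row
toy["total_item_price"] = toy["quantity"] * toy["item_price"]

# one total per order, then converted to a plain Python list
order_totals = toy.groupby("order_id")["total_item_price"].sum()
order_prices = order_totals.tolist()

# assuming order ids start at 1, the n-th order sits at list position n - 1
print(order_prices[2 - 1])
```

With the real data, the same idea would mean reading positions 99, 499, and 999 of the list, as long as the order ids really are consecutive.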

# Okay cool but what the **heck** was _Exercise 6_?

This is what it said:

>Using a filter function, drop every row where only one item was ordered. In other words, if `quantity > 1`, keep the row. If not, drop the row. 
>
>Then, print the first ten items in the new DataFrame.
>
>Hint: this is very similar to the filter we wrote when dealing with community center attendance.

So... what was this asking for?

There are lots of entries in `chip` where a single item was ordered. This question wants us to remove those rows and keep only the rows where _more than one_ of an item was ordered.

The hint was **not** meant to say that you _need_ to use a filter function, though you absolutely can. It was just meant to remind you that we have done something similar to this before.

Let's move on to some different ways to solve this.

## Method 1!

Though you didn't _have_ to use a filter, this is one way that you can. 

In [None]:
# define the filter function
# return True only when the group's quantity is above the lower bound
def filter_func(df, lower_bound):
    # each group here is a single row; .all() collapses the comparison to one True/False
    return (df['quantity'] > lower_bound).all()

# groupby + filter
# remember how you got the columns with chip.columns? we can do the same with the rows
# since we want to look at each ROW, we group it by the INDEX
# this way, each row will be analyzed on its own
new_chip = chip.groupby(chip.index).filter(filter_func, lower_bound=1)

# display the first ten
new_chip.head(10)

---
## Method 2!

If you don't wanna use a filter for this, that makes sense! In fact, it's _easier_ without a filter.

Remember how we can index values in a DataFrame by using a list? Well, we can extend that idea to a _boolean series_.

In [None]:
# get a boolean Series where each row is True if and only if the quantity is greater than 1
(chip['quantity'] > 1)[:20]

In [None]:
# if we .loc using this boolean series, we get only the rows we want!
big_orders = chip.loc[chip['quantity'] > 1]

# you can cross-reference these indices with the boolean Series above. Look at 4 and 18!
big_orders.head(10)
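
To check that the two methods agree, here is a small sketch that runs both on a made-up DataFrame and compares the results. The `toy` data and the names `keep_group`, `via_filter`, and `via_mask` are hypothetical, not part of the exercise.

```python
import pandas as pd

# hypothetical toy data (values are made up)
toy = pd.DataFrame({
    "item_name": ["Chips", "Izze", "Chicken Bowl", "Canned Soda"],
    "quantity":  [1, 1, 2, 2],
})

def keep_group(df, lower_bound):
    # True only if every row in the group is above the bound
    return bool((df["quantity"] > lower_bound).all())

# Method 1: group each row on its own, then filter the groups
via_filter = toy.groupby(toy.index).filter(keep_group, lower_bound=1)

# Method 2: index with a boolean Series
via_mask = toy.loc[toy["quantity"] > 1]

# both should keep the same rows
print(via_filter.equals(via_mask))
```

The boolean mask does the comparison on the whole column at once, which is why it tends to be the simpler choice here.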

---

# Alright, how about Exercise 7?

It said:

>Make a new dictionary. Then, iterate through the DataFrame. For each item_name, do the following:
>
>* if dict[item_name] exists, add quantity to it;
>* if not, set dict[item_name] to quantity;
>
>You're essentially counting the number of times that a given item has been ordered.

Okay, let's go step by step. First, we need to make a dictionary.

In [None]:
# make the dictionary
chip_dict = dict()

You could also do `chip_dict = {}`, but it's up to personal preference.

Next, we need to iterate through our `chip` DataFrame. Our first instinct might be to do something like this:

In [None]:
for entry in chip:
    print(entry)

As we can see, that gives us the columns. But what we really want is each row, since we have to pull out item information from each one. So, let's change our approach:

In [None]:
for i in range(len(chip)):
    # do stuff
    pass

`len(chip)` returns the number of rows in the DataFrame.

`for i in range(len(chip))` iterates over the row positions, from 0 up to `len(chip) - 1`.
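
Here is a small sketch of what that loop sees, using a made-up three-row DataFrame. The `toy` data and the name `positions` are hypothetical; the sketch assumes the default index (0, 1, 2, ...), which is what lets `.loc[i, column]` pick out row `i`.

```python
import pandas as pd

# hypothetical toy data (values are made up)
toy = pd.DataFrame({
    "item_name": ["Chips", "Izze", "Chips"],
    "quantity":  [1, 2, 1],
})

# number of rows
print(len(toy))

# the values that i takes in the loop
positions = list(range(len(toy)))
print(positions)

# with the default index, labels match positions, so .loc[i, column] reads row i
for i in range(len(toy)):
    print(i, toy.loc[i, "item_name"], toy.loc[i, "quantity"])
```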

Now, remember how we were supposed to populate the dict? If the item is already in the dict, add the quantity to its entry. Otherwise, set the item's entry to the quantity.
Let's put it all together:

In [None]:
# make dictionary
chip_dict = dict()

# iterate through the dataframe
for i in range(len(chip)):
    # get the item name and quantity
    item = chip.loc[i, 'item_name']
    num = chip.loc[i, 'quantity']
    
    if item in chip_dict: # if it's in the dict
        chip_dict[item] += num # add the quantity to it
    else:
        chip_dict[item] = num # otherwise, set it to the quantity

chip_dict
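
The exercise asked for a loop, but as a cross-check, the same counts can also be sketched with `groupby`. This is an alternative, not what the exercise required, and the `toy` data and the names `loop_counts` and `grouped_counts` are hypothetical.

```python
import pandas as pd

# hypothetical toy data (values are made up)
toy = pd.DataFrame({
    "item_name": ["Chips", "Izze", "Chips", "Chicken Bowl"],
    "quantity":  [1, 2, 3, 2],
})

# the loop from above, written with dict.get to handle the "not there yet" case
loop_counts = {}
for i in range(len(toy)):
    item = toy.loc[i, "item_name"]
    num = toy.loc[i, "quantity"]
    loop_counts[item] = loop_counts.get(item, 0) + num

# the same totals: sum the quantity per item name, then turn the result into a dict
grouped_counts = toy.groupby("item_name")["quantity"].sum().to_dict()

# both approaches should give the same totals
print(loop_counts == grouped_counts)
```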