# Analyzing Chipotle Data

In this workbook I explore [order data from Chipotle](https://github.com/TheUpshot/chipotle), courtesy of _The New York Times'_ "The Upshot."

---

## Basic Level

### Part 1: Read in the file with `csv.reader()` and store it in an object called `file_nested_list`.

Hint: This is a TSV (tab-separated value) file, and `csv.reader()` needs to be told [how to handle it](https://docs.python.org/2/library/csv.html).

In [20]:
import csv
from collections import namedtuple   # Convenient to store the data rows

DATA_FILE = './datasets/chipotle.tsv'

### Part 2: Separate `file_nested_list` into the `header` and the `data`.


In [23]:
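# Read every row of the TSV into a nested list.
# The exercise calls this object `file_nested_list`; it is named `chipotle_data` here.
with open(DATA_FILE, mode='r', newline='') as f:
    chipotle_data = list(csv.reader(f, delimiter='\t'))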

In [38]:
# ['order_id', 'quantity', 'item_name', 'choice_description', 'item_price']
chipotle_header = chipotle_data[0]
# ['1', '1', 'Chips and Fresh Tomato Salsa', 'NULL', '$2.39 ']
chipotle_rows = chipotle_data[1:]
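
The `namedtuple` import above is never used, so here is a minimal sketch of how the header/data split could feed it. The inline `SAMPLE_TSV` string and the `OrderLine` name are assumptions made for illustration; the real notebook reads `./datasets/chipotle.tsv` instead.

```python
import csv
import io
from collections import namedtuple

# Hypothetical two-row sample in the same tab-separated layout as the data file.
SAMPLE_TSV = (
    "order_id\tquantity\titem_name\tchoice_description\titem_price\n"
    "1\t1\tChips and Fresh Tomato Salsa\tNULL\t$2.39 \n"
    "1\t1\tIzze\t[Clementine]\t$3.39 \n"
)

rows = list(csv.reader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
header, data = rows[0], rows[1:]

# Build a record type whose field names come from the header row.
OrderLine = namedtuple("OrderLine", header)
order_lines = [OrderLine(*row) for row in data]

# Columns can now be read by name instead of by position (row[2]).
print(order_lines[0].item_name)
```

Named fields make later cells easier to read than index lookups such as `item[4]`, at the cost of one extra pass over the rows.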

---

## Intermediate Level

### Part 3: Calculate the average price of an order.

Hint: Examine the data to see if the `quantity` column is relevant to this calculation.

Hint: Think carefully about the simplest way to do this!

In [43]:
def returnOrderItems(order_id):
    orderItems = []
    for item in chipotle_rows:
        if (item[0] == str(order_id)):
            orderItems.append(item)
    return orderItems

[['1', '1', 'Chips and Fresh Tomato Salsa', 'NULL', '$2.39 '],
 ['1', '1', 'Izze', '[Clementine]', '$3.39 '],
 ['1', '1', 'Nantucket Nectar', '[Apple]', '$3.39 '],
 ['1', '1', 'Chips and Tomatillo-Green Chili Salsa', 'NULL', '$2.39 ']]

In [66]:
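import statistics

def returnAvgPrice(order_id):
    """Return the mean line-item cost for a single order."""
    order = returnOrderItems(order_id)
    itemsCost = []
    for item in order:
        # item_price looks like '$2.39 ': drop surrounding whitespace, then the '$'.
        price = float(item[4].strip().lstrip('$'))
        # Treats item_price as a unit price; drop the multiplication
        # if the data shows it is already the line total.
        itemsCost.append(int(item[1]) * price)
    return statistics.mean(itemsCost)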

In [77]:
print('Order 4 includes, ', returnOrderItems(4))
print('Order 4 avg price is, $', returnAvgPrice(4))

Order 4 includes,  [['4', '1', 'Steak Burrito', '[Tomatillo Red Chili Salsa, [Fajita Vegetables, Black Beans, Pinto Beans, Cheese, Sour Cream, Guacamole, Lettuce]]', '$11.75 '], ['4', '1', 'Steak Soft Tacos', '[Tomatillo Green Chili Salsa, [Pinto Beans, Cheese, Sour Cream, Lettuce]]', '$9.25 ']]
Order 4 avg price is, $ 10.5
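
The function above averages the line items inside one order. Part 3 can also be read as asking for the average total per order across the whole file, which is sketched below. This is only a sketch under two assumptions: `item_price` is already the line total (so `quantity` is not multiplied in), and the hypothetical `sample_rows` list (descriptions shortened) stands in for `chipotle_rows`.

```python
# Hypothetical sample rows in the same column order as chipotle_rows.
sample_rows = [
    ['1', '1', 'Chips and Fresh Tomato Salsa', 'NULL', '$2.39 '],
    ['1', '1', 'Izze', '[Clementine]', '$3.39 '],
    ['4', '1', 'Steak Burrito', 'NULL', '$11.75 '],
    ['4', '1', 'Steak Soft Tacos', 'NULL', '$9.25 '],
]

def parse_price(text):
    """Convert a string such as '$2.39 ' to a float."""
    return float(text.strip().lstrip('$'))

# Assumption: item_price is the line total, so quantity is not applied here.
total_revenue = sum(parse_price(row[4]) for row in sample_rows)

# Each distinct order_id counts as one order.
order_count = len({row[0] for row in sample_rows})

average_order_price = total_revenue / order_count
print(f"{average_order_price:.2f}")   # 13.39 for the sample rows
```

Dividing one grand total by the number of distinct order IDs avoids grouping rows by order at all, which is likely the "simplest way" the hint is pointing at.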


### Part 4: Create a list (or set) named `unique_sodas` containing all of the unique sodas and soft drinks that Chipotle sells.

Note: Just look for `'Canned Soda'` and `'Canned Soft Drink'`, and ignore other drinks like `'Izze'`.

In [97]:
unique_sodas = []
for item in chipotle_rows:
    if (item[2] == 'Canned Soft Drink' or item[2] == 'Canned Soda'):
        if (item[3][1:][:-1] not in unique_sodas):
            unique_sodas.append(item[3][1:][:-1])
        
print(unique_sodas)

['Sprite', 'Dr. Pepper', 'Mountain Dew', 'Diet Dr. Pepper', 'Coca Cola', 'Diet Coke', 'Coke', 'Lemonade', 'Nestea']
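
Since the task allows a set, the same result can be sketched with a set comprehension, which removes duplicates without the explicit `not in` check. The `sample_rows` values below are hypothetical stand-ins for `chipotle_rows`, and a set does not preserve the first-seen order shown in the list above.

```python
# Hypothetical sample rows in the same column order as chipotle_rows.
sample_rows = [
    ['1', '1', 'Izze', '[Clementine]', '$3.39 '],
    ['9', '2', 'Canned Soda', '[Sprite]', '$2.18 '],
    ['10', '1', 'Canned Soft Drink', '[Coke]', '$1.25 '],
    ['11', '1', 'Canned Soda', '[Sprite]', '$1.09 '],
]

SODA_ITEMS = {'Canned Soda', 'Canned Soft Drink'}

# Strip the surrounding brackets from the description; the set drops repeats.
unique_sodas_set = {
    row[3].strip('[]') for row in sample_rows if row[2] in SODA_ITEMS
}

print(sorted(unique_sodas_set))   # ['Coke', 'Sprite']
```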


---

## Advanced Level


### Part 5: Calculate the average number of toppings per burrito.

Note: Let's ignore the `quantity` column to simplify this task.

Hint: Think carefully about the easiest way to count the number of toppings!


In [116]:
def averageBurritoToppings(burritoType):
    """Return the mean number of toppings across all rows of the given item."""
    toppingsCount = []
    for item in chipotle_rows:
        if item[2] == burritoType:
            # Strip the outer brackets and split on ', '; every piece is one topping.
            # (Pieces are always strings, so no list check is needed.)
            toppings = item[3].strip('][').split(', ')
            toppingsCount.append(len(toppings))
    return statistics.mean(toppingsCount)

averageBurritoToppings('Steak Burrito')

5.407608695652174
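
One possibly easier way to count toppings is to count the commas in `choice_description` and add one. The sketch below assumes every topping is separated by a comma and that no topping name contains a comma itself; `count_toppings` is a hypothetical helper name, and the two descriptions are the order 4 rows shown earlier.

```python
import statistics

def count_toppings(choice_description):
    """Count toppings as (number of commas) + 1, ignoring the brackets."""
    return choice_description.count(',') + 1

descriptions = [
    '[Tomatillo Red Chili Salsa, [Fajita Vegetables, Black Beans, Pinto Beans, '
    'Cheese, Sour Cream, Guacamole, Lettuce]]',
    '[Tomatillo Green Chili Salsa, [Pinto Beans, Cheese, Sour Cream, Lettuce]]',
]

counts = [count_toppings(d) for d in descriptions]
print(counts)                    # [8, 5]
print(statistics.mean(counts))   # 6.5
```

Counting separators sidesteps the nested brackets entirely, so no parsing of the inner list is needed.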

### Part 6: Create a dictionary. Let the keys represent chip orders and the values represent the total number of orders.

Expected output: `{'Chips and Roasted Chili-Corn Salsa': 18, ... }`

Note: Please take the `quantity` column into account!

Optional: Learn how to use `collections.defaultdict` to simplify your code.

In [123]:
chipsCount = {}

True
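
The cell above only creates an empty dictionary, so here is a minimal sketch of how the count could be finished with `collections.defaultdict`. It assumes a chip order is any row whose `item_name` contains `'Chips'`; the `sample_rows` list (including the made-up quantity-2 row) and the `chips_count` name are illustrative, not taken from the notebook.

```python
from collections import defaultdict

# Hypothetical sample rows in the same column order as chipotle_rows.
sample_rows = [
    ['1', '1', 'Chips and Fresh Tomato Salsa', 'NULL', '$2.39 '],
    ['1', '1', 'Chips and Tomatillo-Green Chili Salsa', 'NULL', '$2.39 '],
    ['2', '2', 'Chips and Fresh Tomato Salsa', 'NULL', '$4.78 '],  # made-up row
    ['4', '1', 'Steak Burrito', 'NULL', '$11.75 '],
]

# defaultdict(int) starts every missing key at 0, so no "if key in dict" check.
chips_count = defaultdict(int)
for row in sample_rows:
    if 'Chips' in row[2]:                    # assumption: this marks a chip order
        chips_count[row[2]] += int(row[1])   # add the quantity, not just 1

print(dict(chips_count))
```

For the sample rows this yields a count of 3 for `'Chips and Fresh Tomato Salsa'` and 1 for `'Chips and Tomatillo-Green Chili Salsa'`.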


---

## Bonus: Craft a problem statement about this data that interests you, and then answer it!
