<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Project 2: Analyzing Chipotle Data

_Author: Joseph Nelson (DC)_

---

For Project 2, you will complete a series of exercises exploring [order data from Chipotle](https://github.com/TheUpshot/chipotle), courtesy of _The New York Times_' "The Upshot."

For these exercises, you will conduct basic exploratory data analysis (Pandas not required) to understand the essentials of Chipotle's order data: how many orders are being made, the average price per order, how many different ingredients are used, etc. These exercises let you practice business analysis skills while also becoming comfortable with Python.

---

## Basic Level

### Part 1: Read in the file with `csv.reader()` and store it in an object called `file_nested_list`.

Hint: This is a TSV (tab-separated value) file, and `csv.reader()` needs to be told [how to handle it](https://docs.python.org/2/library/csv.html).

In [93]:
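import csv

chipotle = './data/chipotle.tsv'

# Read every row of the tab-separated file into a list of lists.
# Wrapping the reader in list() keeps the rows available after the file is closed.
with open(chipotle, newline='') as tsvfile:
    file_nested_list = list(csv.reader(tsvfile, delimiter='\t'))

# Working copy of the rows; the header row is removed from it in Part 2
data = list(file_nested_list)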

### Part 2: Separate `file_nested_list` into the `header` and the `data`.


In [94]:
header = data.pop(0)
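
The read-and-split steps above can also be tried without the file on disk. The following is a minimal sketch, assuming a tiny made-up TSV string in place of `chipotle.tsv`; the names `sample_tsv`, `sample_header`, and `sample_data` are hypothetical and used only for illustration.

```python
import csv
import io

# Hypothetical in-memory TSV standing in for chipotle.tsv (values are made up)
sample_tsv = "order_id\tquantity\titem_name\n1\t1\tChips\n2\t2\tIzze\n"

# csv.reader accepts any iterable of lines; delimiter='\t' makes it split on tabs
rows = list(csv.reader(io.StringIO(sample_tsv), delimiter='\t'))

# Unpack the first row as the header and the rest as the data rows
sample_header, *sample_data = rows

print(sample_header)
print(sample_data)
```

Unpacking with `*` leaves `rows` unchanged, whereas `pop(0)` modifies the list in place; either approach separates the header from the data.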

---

## Intermediate Level

### Part 3: Calculate the average price of an order.

Hint: Examine the data to see if the `quantity` column is relevant to this calculation.

Hint: Think carefully about the simplest way to do this!

In [95]:
header

['order_id', 'quantity', 'item_name', 'choice_description', 'item_price']

In [100]:
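# In this dataset `item_price` appears to be the total for the row already
# (compare rows where quantity > 1), so it is not multiplied by `quantity`.
prices = [float(row[4].strip().strip('$')) for row in data]
order_ids = {row[0] for row in data}

# Average price of an order = total revenue / number of distinct orders
round(sum(prices) / len(order_ids), 2)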
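
The `quantity` hint can be explored by totaling prices per order. The following is a minimal sketch on made-up rows in the same column layout as `header`. It assumes that `item_price` already holds the total for the row, so it is not multiplied by `quantity`; check this against the real data before relying on it. The helper name `average_order_price` and the sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows in the same layout as `header`
# (order_id, quantity, item_name, choice_description, item_price);
# the values are made up for illustration.
sample_rows = [
    ['1', '1', 'Chips', 'NULL', '$2.00 '],
    ['1', '1', 'Izze', '[Clementine]', '$3.00 '],
    ['2', '2', 'Chicken Bowl', '[Salsa, [Rice]]', '$16.00 '],
]

def average_order_price(rows):
    """Average of per-order totals, assuming item_price is the row total."""
    order_totals = defaultdict(float)
    for order_id, _quantity, _item, _choice, price in rows:
        # Remove surrounding whitespace and the leading dollar sign
        order_totals[order_id] += float(price.strip().lstrip('$'))
    return sum(order_totals.values()) / len(order_totals)

sample_average = average_order_price(sample_rows)
print(round(sample_average, 2))
```

Grouping by `order_id` first makes the per-order totals available for inspection, which helps when checking whether a few large orders dominate the average.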

### Part 4: Create a list (or set) named `unique_sodas` containing all of the unique sodas and soft drinks that Chipotle sells.

Note: Just look for `'Canned Soda'` and `'Canned Soft Drink'`, and ignore other drinks like `'Izze'`.

In [102]:
header

['order_id', 'quantity', 'item_name', 'choice_description', 'item_price']

In [127]:
items = [row[2] for row in data]
choices = [row[3] for row in data]

In [128]:
# Keep the choice description of every canned soda / soft drink row
soda = [choice for item, choice in zip(items, choices)
        if item in ('Canned Soda', 'Canned Soft Drink')]

In [130]:
unique_sodas = set(soda)
unique_sodas

{'[Coca Cola]',
 '[Coke]',
 '[Diet Coke]',
 '[Diet Dr. Pepper]',
 '[Dr. Pepper]',
 '[Lemonade]',
 '[Mountain Dew]',
 '[Nestea]',
 '[Sprite]'}
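
The same filtering can be written as a single set comprehension. The following is a minimal sketch on made-up rows. The names `sample_rows`, `SODA_ITEMS`, and `unique_sodas_sample` are hypothetical, and stripping the square brackets is an optional cleanup that the exercise does not require.

```python
# Hypothetical sample rows in the same layout as `header`; values are made up
sample_rows = [
    ['1', '1', 'Canned Soda', '[Sprite]', '$1.09 '],
    ['2', '1', 'Izze', '[Clementine]', '$3.39 '],
    ['3', '2', 'Canned Soft Drink', '[Coke]', '$2.50 '],
    ['4', '1', 'Canned Soda', '[Sprite]', '$1.09 '],
]

# Item names that count as sodas for this exercise
SODA_ITEMS = {'Canned Soda', 'Canned Soft Drink'}

# A set comprehension filters the rows and removes duplicates in one step;
# strip('[]') drops the surrounding brackets from the choice description
unique_sodas_sample = {
    row[3].strip('[]') for row in sample_rows if row[2] in SODA_ITEMS
}

print(sorted(unique_sodas_sample))
```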

---

## Advanced Level


### Part 5: Calculate the average number of toppings per burrito.

Note: Let's ignore the `quantity` column to simplify this task.

Hint: Think carefully about the easiest way to count the number of toppings!


In [180]:
# Find the item name and the number of comma-separated toppings for each row

names = [row[2] for row in data]
toppings = [len(row[3].split(',')) for row in data]

In [181]:
# If a burrito was ordered, record the number of toppings

burrito_top = [
    num_toppings for name, num_toppings in zip(names, toppings)
    if 'Burrito' in name.split(' ')
]

In [188]:
# find the average

round(sum(burrito_top)/len(burrito_top), 2)

5.4
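
The counting rule used above treats every comma-separated entry in `choice_description` as one topping. The following is a minimal sketch of that rule on made-up rows. The helper names `count_toppings` and `average_burrito_toppings` are hypothetical, and the sample descriptions are illustrative strings in the same bracketed style as the data.

```python
def count_toppings(choice_description):
    """Count comma-separated entries; n commas separate n + 1 entries."""
    return choice_description.count(',') + 1

def average_burrito_toppings(rows):
    """Mean topping count over rows whose item name contains the word 'Burrito'."""
    counts = [
        count_toppings(row[3]) for row in rows if 'Burrito' in row[2].split(' ')
    ]
    return sum(counts) / len(counts)

# Hypothetical sample rows in the same layout as `header`; values are made up
sample_rows = [
    ['1', '1', 'Chicken Burrito', '[Salsa (Hot), [Black Beans, Rice, Cheese, Sour Cream]]', '$8.49 '],
    ['2', '1', 'Steak Burrito', '[Salsa, [Rice, Cheese]]', '$8.99 '],
    ['3', '1', 'Chips', 'NULL', '$2.15 '],
]

sample_toppings_average = average_burrito_toppings(sample_rows)
print(sample_toppings_average)
```

Counting commas gives the same number as `len(description.split(','))` without building the intermediate list. Note that the salsa entry is counted as a topping under this rule.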

### Part 6: Create a dictionary. Let the keys represent chip orders and the values represent the total number of orders.

Expected output: `{'Chips and Roasted Chili-Corn Salsa': 18, ... }`

Note: Please take the `quantity` column into account!

Optional: Learn how to use `collections.defaultdict` to simplify your code.

In [249]:
# Collect the item name of every row that is a chip order

names = [row[2] for row in data if 'Chips' in row[2]]

In [270]:
# Create a list of elements that signifies each unique chip order

new_names = [name.replace('-', ' ') for name in names]
set_names = list(set(new_names))
list_names = [name.split(' ') for name in set_names]

In [272]:
# Iterate over all orders for each type of chip order and count the matching rows.
# Note: this counts rows and does not add up the `quantity` column.

num_orders = []
for name in list_names:
    order_count = 0
    for order in new_names:
        # new_names is the list of orders from above, with '-' replaced by ' ' in each order
        order = order.split(' ')
        if order == name:
            order_count += 1
    num_orders.append(order_count)

num_orders

[40, 101, 479, 211, 74, 110, 1, 68]

In [273]:
# Create dictionary

full_names = [' '.join(name) for name in list_names]
final_dict = dict(zip(full_names,num_orders))

final_dict

{'Chips': 211,
 'Chips and Fresh Tomato Salsa': 110,
 'Chips and Guacamole': 479,
 'Chips and Mild Fresh Tomato Salsa': 1,
 'Chips and Roasted Chili Corn Salsa': 40,
 'Chips and Tomatillo Green Chili Salsa': 74,
 'Chips and Tomatillo Red Chili Salsa': 68,
 'Side of Chips': 101}
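
The note for this part asks for the `quantity` column to be taken into account, and the optional hint suggests `defaultdict`. The following is a minimal sketch of both on made-up rows. The helper name `chip_order_totals` and the sample values are hypothetical. Replacing `'-'` with `' '` mirrors the normalization used above, so hyphenated and unhyphenated spellings of the same item are merged.

```python
from collections import defaultdict

# Hypothetical sample rows in the same layout as `header`; values are made up
sample_rows = [
    ['1', '1', 'Chips and Roasted Chili-Corn Salsa', 'NULL', '$2.95 '],
    ['2', '2', 'Chips and Roasted Chili Corn Salsa', 'NULL', '$5.90 '],
    ['3', '1', 'Chicken Burrito', '[Salsa, [Rice]]', '$8.49 '],
    ['4', '3', 'Chips', 'NULL', '$6.45 '],
]

def chip_order_totals(rows):
    """Sum the quantity column for every item whose name contains 'Chips'."""
    totals = defaultdict(int)  # missing keys start at 0
    for row in rows:
        name = row[2].replace('-', ' ')
        if 'Chips' in name:
            totals[name] += int(row[1])
    return dict(totals)

chip_totals_sample = chip_order_totals(sample_rows)
print(chip_totals_sample)
```

Because `defaultdict(int)` supplies 0 for unseen keys, the loop needs no separate step to collect the unique names first, and a single pass over the rows is enough.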

---

## Bonus: Craft a problem statement about this data that interests you, and then answer it!
