# Chipotle Sales
### Data in Motion: Data Analysis Challenge 1

## Scenario
You are a financial data analyst at Chipotle, and your manager has asked you to analyze the most recent sales numbers. She has provided the following set of questions she would like answered.

## Link to dataset
https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv

## Challenge Questions

- Which was the most-ordered item?
- For the most-ordered item, how many items were ordered?
- What was the most-ordered item in the choice_description column?
- How many items were ordered in total?
- Turn the item price into a float
- How much was the revenue for the period in the dataset?
- How many orders were made in the period?
- What is the average revenue amount per order?
- How many different items are sold?

### Import Libraries

In [2]:
import pandas as pd

### Load the dataset

In [3]:
url = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv'
chipo = pd.read_csv(url, sep='\t')
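The same loading call can be tried offline. The sketch below is a minimal illustration, assuming a hypothetical two-row sample (`sample_tsv`) typed in the same tab-separated layout as the file; it is not data pulled from the URL.

```python
import io

import pandas as pd

# Hypothetical two-row sample in the same tab-separated layout as chipotle.tsv.
# The empty field in the first row stands in for a missing choice_description.
sample_tsv = (
    "order_id\tquantity\titem_name\tchoice_description\titem_price\n"
    "1\t1\tChips and Fresh Tomato Salsa\t\t$2.39\n"
    "1\t1\tIzze\t[Clementine]\t$3.39\n"
)

# sep="\t" tells read_csv to split on tabs instead of commas.
sample = pd.read_csv(io.StringIO(sample_tsv), sep="\t")
print(sample.shape)
print(sample["choice_description"].isna().sum())
```

Note that the prices come back as strings because of the dollar sign, which is why the price column needs an explicit conversion before any arithmetic.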

### Explore the dataset

In [4]:
chipo.head()

| | order_id | quantity | item_name | choice_description | item_price |
|---|---|---|---|---|---|
| 0 | 1 | 1 | Chips and Fresh Tomato Salsa | NaN | $2.39 |
| 1 | 1 | 1 | Izze | [Clementine] | $3.39 |
| 2 | 1 | 1 | Nantucket Nectar | [Apple] | $3.39 |
| 3 | 1 | 1 | Chips and Tomatillo-Green Chili Salsa | NaN | $2.39 |
| 4 | 2 | 2 | Chicken Bowl | [Tomatillo-Red Chili Salsa (Hot), [Black Beans... | $16.98 |


In [5]:
chipo.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4622 entries, 0 to 4621
Data columns (total 5 columns):
 #   Column              Non-Null Count  Dtype 
---  ------              --------------  ----- 
 0   order_id            4622 non-null   int64 
 1   quantity            4622 non-null   int64 
 2   item_name           4622 non-null   object
 3   choice_description  3376 non-null   object
 4   item_price          4622 non-null   object
dtypes: int64(2), object(3)
memory usage: 180.7+ KB


In [58]:
# Which was the most-ordered item? -- Chicken Bowl
# Sum only the quantity column; summing order_id or the price strings is not meaningful
most_ordered_item = chipo.groupby('item_name')[['quantity']].sum().sort_values('quantity', ascending=False)
most_ordered_item.head(1)

| item_name | quantity |
|---|---|
| Chicken Bowl | 761 |


In [59]:
# For the most-ordered item, how many items were ordered? (1st way) -- 761
most_ordered_item['quantity'].head(1)

item_name
Chicken Bowl    761
Name: quantity, dtype: int64

In [24]:
# For the most-ordered item, how many items were ordered? (2nd way) -- 761
chipo.loc[chipo['item_name'] == 'Chicken Bowl', 'quantity'].sum()

761
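A third way to get both the item and its count is to reduce the grouped totals directly with `idxmax()` and `max()`. This is a minimal sketch on a hypothetical toy frame (`toy`), not on the real dataset, so the numbers it prints are not the challenge answers.

```python
import pandas as pd

# Hypothetical toy orders, not rows from the real dataset.
toy = pd.DataFrame({
    "item_name": ["Chicken Bowl", "Izze", "Chicken Bowl", "Izze", "Steak Burrito"],
    "quantity": [2, 1, 1, 1, 1],
})

# Total quantity per item, as a Series indexed by item_name.
qty_by_item = toy.groupby("item_name")["quantity"].sum()

top_item = qty_by_item.idxmax()  # index label with the largest total
top_qty = qty_by_item.max()      # the largest total itself
print(top_item, top_qty)
```

This avoids sorting the whole result when only the top entry is needed.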

In [26]:
# What was the most-ordered item in the choice_description column? -- Diet Coke
# Sum only the quantity column; summing order_id is not meaningful
most_ordered_choice = chipo.groupby('choice_description')[['quantity']].sum().sort_values('quantity', ascending=False)
most_ordered_choice.head(1)

| choice_description | quantity |
|---|---|
| [Diet Coke] | 159 |
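Since `choice_description` has only 3376 non-null values out of 4622 rows, it may help to see how `groupby` treats missing keys. The sketch below uses a hypothetical toy frame (`toy`) and shows that rows with a missing key are skipped by default, and kept as their own group only when `dropna=False` is passed.

```python
import pandas as pd

# Hypothetical toy rows; None marks a missing choice_description.
toy = pd.DataFrame({
    "choice_description": ["[Diet Coke]", None, "[Sprite]", "[Diet Coke]", None, None],
    "quantity": [1, 5, 1, 2, 1, 1],
})

# Default behaviour: rows whose group key is missing are left out.
by_choice = toy.groupby("choice_description")["quantity"].sum()

# dropna=False keeps the missing key as a separate group.
with_missing = toy.groupby("choice_description", dropna=False)["quantity"].sum()

print(by_choice.idxmax())
print(len(by_choice), len(with_missing))
```

In this toy data the missing group would have the largest total if it were kept, so the default of dropping it is what makes the answer a real choice.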


In [35]:
# How many items were ordered in total? -- 4972
chipo.quantity.sum()

4972

In [38]:
# Turn the item price into a float
# Strip the dollar sign, then cast the remaining text to float
chipo['item_price'] = chipo['item_price'].str.replace('$', '', regex=False).astype(float)
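The conversion can be checked in isolation before applying it to the full column. This is a minimal sketch, assuming a hypothetical Series (`prices`) of strings in the same `$x.xx` form as `item_price`; it uses `pd.to_numeric` as one possible alternative to `astype(float)`.

```python
import pandas as pd

# Hypothetical price strings in the same "$x.xx" form as item_price.
prices = pd.Series(["$2.39", "$3.39", "$16.98"])

# regex=False treats "$" as a literal character rather than a regex anchor.
as_float = pd.to_numeric(prices.str.replace("$", "", regex=False))
print(as_float.dtype)
```

Passing `errors="coerce"` to `pd.to_numeric` would turn any unparseable string into NaN instead of raising, which can be useful when the column is not known to be clean.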

In [43]:
# How much was the revenue for the period in the dataset? -- 39237.02
chipo['total'] = chipo['item_price'] * chipo['quantity']
chipo.total.sum()

39237.02

In [53]:
# How many orders were made in the period? -- 1834
chipo.order_id.nunique()

1834

In [55]:
# What is the average revenue amount per order? -- 21.39
total = chipo.total.sum()
orders = chipo.order_id.nunique()
avg_amt_per_order = total/orders
print(avg_amt_per_order)

21.39423118865867
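The same average can also be reached by first totalling each order and then taking the mean of those totals. The sketch below uses a hypothetical toy frame (`toy`) with a precomputed `total` column, and checks that both routes agree.

```python
import pandas as pd

# Hypothetical toy line items; "total" plays the role of chipo['total'].
toy = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3, 3],
    "total": [2.0, 3.0, 10.0, 1.0, 1.0, 1.0],
})

# Route 1: revenue per order, then the mean across orders.
per_order = toy.groupby("order_id")["total"].sum()
avg_per_order = per_order.mean()

# Route 2: overall revenue divided by the number of distinct orders.
avg_by_division = toy["total"].sum() / toy["order_id"].nunique()

print(avg_per_order, avg_by_division)
```

Keeping the per-order totals around has the side benefit that other summaries, such as the median or the largest order, come from the same Series.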


In [56]:
# How many different items are sold? -- 50
chipo.item_name.nunique()

50

## Solutions

- Which was the most-ordered item?
    ```Chicken Bowl```
- For the most-ordered item, how many items were ordered?
    ```761```
- What was the most-ordered item in the choice_description column?
    ```Diet Coke```
- How many items were ordered in total?
    ```4972```
- Turn the item price into a float
- How much was the revenue for the period in the dataset?
    ```39237.02```
- How many orders were made in the period?
    ```1834```
- What is the average revenue amount per order?
    ```21.39```
- How many different items are sold?
    ```50```