# **Challenge #1: Chipotle Sales**

### **Questions to Answer**
* Which was the most-ordered item?
* For the most-ordered item, how many items were ordered?
* What was the most ordered item in the choice_description column?
* How many items were ordered in total?
* Turn the item price into a float
* How much was the revenue for the period in the dataset?
* How many orders were made in the period?
* What is the average revenue amount per order?
* How many different items are sold?

In [None]:
#import libraries
import pandas as pd
import matplotlib.pyplot as plt


In [None]:
url = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv'
chipo = pd.read_csv(url, sep = '\t')


# **Data Exploration**

In [None]:
chipo.head()

Unnamed: 0,order_id,quantity,item_name,choice_description,item_price
0,1,1,Chips and Fresh Tomato Salsa,,$2.39
1,1,1,Izze,[Clementine],$3.39
2,1,1,Nantucket Nectar,[Apple],$3.39
3,1,1,Chips and Tomatillo-Green Chili Salsa,,$2.39
4,2,2,Chicken Bowl,"[Tomatillo-Red Chili Salsa (Hot), [Black Beans...",$16.98


In [None]:
chipo.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4622 entries, 0 to 4621
Data columns (total 5 columns):
 #   Column              Non-Null Count  Dtype 
---  ------              --------------  ----- 
 0   order_id            4622 non-null   int64 
 1   quantity            4622 non-null   int64 
 2   item_name           4622 non-null   object
 3   choice_description  3376 non-null   object
 4   item_price          4622 non-null   object
dtypes: int64(2), object(3)
memory usage: 180.7+ KB


## **Which was the most-ordered item?**
## **For the most-ordered item, how many items were ordered?**

In [None]:
most_ordered = chipo[['item_name','quantity']]
most_ordered = most_ordered.groupby(by='item_name')['quantity'].sum().sort_values(ascending=False).reset_index()
most_ordered.head()

Unnamed: 0,item_name,quantity
0,Chicken Bowl,761
1,Chicken Burrito,591
2,Chips and Guacamole,506
3,Steak Burrito,386
4,Canned Soft Drink,351


Chicken Bowl is the most-ordered item, with 761 ordered in total.
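The same ranking logic can be checked on a small made-up table. This is a minimal sketch, assuming a hypothetical DataFrame `toy` (not the Chipotle data) with the same two columns; `idxmax` and `max` read the top item and its total directly from the grouped sums, without sorting.

```python
import pandas as pd

# Hypothetical toy data, not the Chipotle dataset
toy = pd.DataFrame({
    'item_name': ['Bowl', 'Burrito', 'Bowl', 'Chips'],
    'quantity': [2, 1, 1, 1],
})

# Sum quantities per item, as in the cell above
totals = toy.groupby('item_name')['quantity'].sum()

top_item = totals.idxmax()    # label with the largest total
top_quantity = totals.max()   # the largest total itself
print(top_item, top_quantity)
```

Sorting, as done above, is still the better choice when the full ranking is of interest rather than just the top entry.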

## **What was the most ordered item in the choice_description column?**

In [None]:
most_ordered_choice = chipo[['choice_description','quantity']]
most_ordered_choice = most_ordered_choice.groupby(by='choice_description')['quantity'].sum().sort_values(ascending=False).reset_index()
most_ordered_choice.head()

Unnamed: 0,choice_description,quantity
0,[Diet Coke],159
1,[Coke],143
2,[Sprite],89
3,"[Fresh Tomato Salsa, [Rice, Black Beans, Chees...",49
4,"[Fresh Tomato Salsa, [Rice, Black Beans, Chees...",42


Diet Coke was the most-ordered entry in the `choice_description` column, with 159 ordered.

## **How many items were ordered in total?**

In [None]:
total = chipo['quantity'].sum()
print(f'There were {total} items ordered in total.')

There were 4972 items ordered in total.


## **Turn the item price into a float**

### **Before**

In [None]:
chipo['item_price'].head()

0     $2.39 
1     $3.39 
2     $3.39 
3     $2.39 
4    $16.98 
Name: item_price, dtype: object

In [None]:
#remove '$'
chipo['item_price'] = chipo['item_price'].str.replace('$', '', regex=False)
#change data type
chipo['item_price'] = chipo['item_price'].astype('float')


### **After**

In [None]:
chipo['item_price'].head()

0     2.39
1     3.39
2     3.39
3     2.39
4    16.98
Name: item_price, dtype: float64

`item_price` is now a float instead of a string.
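A minimal sketch of the same conversion on a hypothetical Series `prices` (made-up values), assuming every entry is a dollar-prefixed string with possible trailing whitespace. `pd.to_numeric` is used here only as an alternative to `astype`; it is not what the notebook cells above run.

```python
import pandas as pd

# Hypothetical price strings shaped like the item_price column
prices = pd.Series(['$2.39 ', '$3.39 ', '$16.98 '])

# Remove the literal '$' and surrounding whitespace, then parse as numbers
cleaned = prices.str.replace('$', '', regex=False).str.strip()
as_float = pd.to_numeric(cleaned)

print(as_float.dtype)
```

Passing `regex=False` matters because `$` is a special character in regular expressions; treating it as a literal avoids relying on the default of whichever pandas version is installed.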

## **How much was the revenue for the period in the dataset?**

In [None]:
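# Total quantity per item, paired with the first price listed for that item.
# This assumes each item has a single unit price across all of its rows.
revenue = chipo[['item_name','item_price','quantity']].groupby(by='item_name').agg({'quantity': 'sum', 'item_price': 'first'}).reset_index()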

In [None]:
def total_revenue(df):
    # Sum quantity * price over every row of the per-item table
    total = 0
    for _, row in df.iterrows():
        total += row['quantity'] * row['item_price']
    return total

print(f'Total revenue is ${total_revenue(revenue)}')


Total revenue is $40361.88
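The aggregation above applies each item's first listed price to its total quantity. In the `head()` output, the Chicken Bowl row with quantity 2 is priced at $16.98, which suggests `item_price` may already be a line total rather than a unit price. That is an assumption, not something the dataset documents. Under that assumption, the sketch below uses a hypothetical table `toy` with made-up values to show how the two approaches can diverge.

```python
import pandas as pd

# Hypothetical toy data: item_price is ASSUMED to be the line total
toy = pd.DataFrame({
    'item_name': ['Bowl', 'Bowl', 'Chips'],
    'quantity': [2, 1, 1],
    'item_price': [16.0, 8.0, 2.0],
})

# Approach used above: first listed price per item times total quantity
per_item = toy.groupby('item_name').agg({'quantity': 'sum', 'item_price': 'first'})
first_price_revenue = (per_item['quantity'] * per_item['item_price']).sum()

# Alternative under the line-total assumption: sum the prices directly
line_total_revenue = toy['item_price'].sum()

print(first_price_revenue, line_total_revenue)
```

If the two figures differ on the real data as well, it is worth checking how `item_price` is defined before reporting a revenue number.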


## **How many orders were made in the period?**

In [None]:
# Count distinct orders; count() would count order lines (rows) instead
num_orders = chipo['order_id'].nunique()
print(f'There were {num_orders} orders made during that period.')

There were 1834 orders made during that period.


## **What is the average revenue amount per order?**


In [None]:
avg_revenue = total_revenue(revenue) / num_orders
print(f'The average revenue per order is ${round(avg_revenue,2)}')

The average revenue per order is $22.01
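Average revenue per order can also be computed by totalling each order first and then averaging those totals. A minimal sketch on hypothetical toy data, assuming a `line_total` column (not present in the original dataset) that already holds the revenue of each row:

```python
import pandas as pd

# Hypothetical toy data; line_total is an assumed per-row revenue column
toy = pd.DataFrame({
    'order_id': [1, 1, 2, 3],
    'line_total': [5.0, 3.0, 10.0, 6.0],
})

# Revenue per order: rows sharing an order_id are summed together
order_totals = toy.groupby('order_id')['line_total'].sum()

# Mean of the per-order totals
avg_per_order = order_totals.mean()
print(round(avg_per_order, 2))
```

Grouping by `order_id` makes the unit of analysis explicit, so the denominator is the number of distinct orders rather than the number of rows.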


## **How many different items are sold?**

In [None]:
# Count distinct item names directly
num_diff_items = chipo['item_name'].nunique()
print(f'There are {num_diff_items} different items sold at Chipotle.')


There are 50 different items sold at Chipotle.
