# Pandas DataFrame Exploration: FoodHub Online Food Orders

FoodHub is a restaurant that started last January offering free delivery for all its online orders. You are a restaurant manager interested in knowing whether the average online order price per month is increasing over time. Over one year, each time an online order is placed, you record the MONTH the order was placed, the corresponding MONTH NUMBER, and the TOTAL ORDER PRICE in a CSV file. Your data is located in the data directory.

**Question 1)** Import your data into a Pandas DataFrame called *orders* from the PATH: *data/orders.csv*.  Then PRINT the FIRST 10 rows of your *orders* DataFrame to the screen.

In [19]:
import pandas as pd
orders = pd.read_csv('data/orders.csv')
df = pd.DataFrame(orders)
df.head(10)

| | May | 5 | 24.9245 |
|---|---|---|---|
| 0 | December | 12 | 46.03949 |
| 1 | May | 5 | 21.8279 |
| 2 | December | 12 | 21.04339 |
| 3 | February | 2 | 57.25843 |
| 4 | September | 9 | 42.96091 |
| 5 | July | 7 | 17.41394 |
| 6 | April | 4 | 17.55809 |
| 7 | November | 11 | 16.63762 |
| 8 | February | 2 | 16.33314 |
| 9 | September | 9 | 17.68876 |


**Question 2)** After examining the FIRST 10 rows of your dataset, you notice there are no column LABELS.  Create them now using the following labels:  Month, Month_Number, Total_Price.  Re-examine the FIRST 10 rows by printing them to the screen.

In [20]:
df.columns = ['Month', 'Month_Number', 'Total_Price']
df.head(10)

| | Month | Month_Number | Total_Price |
|---|---|---|---|
| 0 | December | 12 | 46.03949 |
| 1 | May | 5 | 21.8279 |
| 2 | December | 12 | 21.04339 |
| 3 | February | 2 | 57.25843 |
| 4 | September | 9 | 42.96091 |
| 5 | July | 7 | 17.41394 |
| 6 | April | 4 | 17.55809 |
| 7 | November | 11 | 16.63762 |
| 8 | February | 2 | 16.33314 |
| 9 | September | 9 | 17.68876 |
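Note that in the Question 1 output the header row reads `May, 5, 24.9245`: by default `read_csv` treats the first line of the file as column labels, so assigning new labels afterwards overwrites that first order. The sketch below shows one way to keep it, by telling pandas the file has no header row. It uses a hypothetical three-line stand-in for *data/orders.csv* (values copied from the output above) and assumes the file holds exactly three comma-separated fields per line.

```python
import io
import pandas as pd

# Hypothetical stand-in for data/orders.csv: three records, no header row.
sample_csv = io.StringIO(
    "May,5,24.9245\n"
    "December,12,46.03949\n"
    "May,5,21.8279\n"
)

# header=None: the file has no header row, so keep the first line as data.
# names=...: supply the column labels in the same step.
orders = pd.read_csv(
    sample_csv,
    header=None,
    names=["Month", "Month_Number", "Total_Price"],
)

print(orders.head(10))
```

With the real file, the same two arguments would be passed together with the path `'data/orders.csv'`; the averages computed later in this exercise were produced without them.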


**Question 3)** To ensure that you recorded your data correctly (i.e., there are no typos in month names and there are no months not accounted for), list the UNIQUE month names in the Month column. Do the month names look good? Are there any months with no data recorded?

In [21]:
df['Month'].unique()

array(['December', 'May', 'February', 'September', 'July', 'April',
       'November', 'June', 'March', 'October', 'January', 'August'],
      dtype=object)
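Checking twelve names by eye works here, but the same check can be done programmatically. The following is a minimal sketch, assuming English month names (the names in `calendar.month_name` depend on the active locale); the `recorded` list is copied from the output above, and the variable names are illustrative.

```python
import calendar

# Unique month names copied from the df['Month'].unique() output above.
recorded = [
    "December", "May", "February", "September", "July", "April",
    "November", "June", "March", "October", "January", "August",
]

# calendar.month_name[0] is an empty string, so skip the first entry.
expected = set(list(calendar.month_name)[1:])

missing = expected - set(recorded)      # months with no orders recorded
unexpected = set(recorded) - expected   # typos or unknown names

print("missing:", sorted(missing))
print("unexpected:", sorted(unexpected))
```

Two empty lists mean every calendar month appears and no name is misspelled.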

**Question 4)** If the data looks good, GROUP the data by Month_Number and compute the AVERAGE Total_Price for each group. Save this as a new Pandas object, *orders_grouped*.


In [16]:
orders_grouped = df.groupby('Month_Number').Total_Price.mean()
orders_grouped

Month_Number
1     18.618593
2     28.540582
3     18.397830
4     24.859380
5     23.787443
6     18.823300
7     18.065253
8     21.136897
9     27.925567
10    21.858637
11    23.623683
12    25.293015
Name: Total_Price, dtype: float64

**IMPORTANT!** Your new Pandas object is not a DataFrame. Because a single column (Total_Price) was selected from the grouping before taking the mean, **orders_grouped** is a Pandas Series indexed by Month_Number.
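To make the distinction concrete, here is a small sketch on a hypothetical four-row dataset (not the FoodHub data) that checks the type of the grouped result and shows one way to turn it back into a DataFrame with `reset_index()`.

```python
import pandas as pd

# Hypothetical mini dataset using the same column names as the exercise.
df = pd.DataFrame({
    "Month_Number": [1, 1, 2, 2],
    "Total_Price": [10.0, 20.0, 30.0, 50.0],
})

# Selecting one column before aggregating yields a Series.
grouped = df.groupby("Month_Number").Total_Price.mean()
print(type(grouped).__name__)

# reset_index() moves Month_Number from the index back into a regular column,
# which produces a two-column DataFrame.
grouped_df = grouped.reset_index()
print(type(grouped_df).__name__)
print(grouped_df)
```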

**Question 5)** The average order prices do not appear to have a clear increasing trend. It may be useful to FILTER *orders_grouped* to see which Month_Number had the LEAST OR the GREATEST AVERAGE Total_Price.
<br><br>
**Hint:** Recall that *orders_grouped* is not a DataFrame; it is a Pandas Series. Therefore, it is filtered with a boolean mask, like a NumPy array. To find both the MIN and MAX month numbers, you will need a two-criteria filter whose conditions are each wrapped in parentheses and joined by the OR logical operator (`|`).

In [36]:
orders_grouped[(orders_grouped == orders_grouped.min()) | (orders_grouped == orders_grouped.max())]


Month_Number
2    28.540582
7    18.065253
Name: Total_Price, dtype: float64
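Comparing against values typed in from the printed output is fragile, because the display is rounded and a mistyped number silently matches nothing. As an alternative sketch, `idxmin()` and `idxmax()` return the index labels of the smallest and largest values directly. The Series below is rebuilt from the Question 4 output so the block runs on its own; in the notebook you would use *orders_grouped* as is.

```python
import pandas as pd

# Monthly averages copied from the Question 4 output.
orders_grouped = pd.Series(
    [18.618593, 28.540582, 18.397830, 24.859380, 23.787443, 18.823300,
     18.065253, 21.136897, 27.925567, 21.858637, 23.623683, 25.293015],
    index=pd.Index(range(1, 13), name="Month_Number"),
    name="Total_Price",
)

lowest_month = orders_grouped.idxmin()    # label of the smallest average
highest_month = orders_grouped.idxmax()   # label of the largest average

# Select both rows by label.
extremes = orders_grouped.loc[[lowest_month, highest_month]]
print(extremes)
```

Note that `idxmin()` and `idxmax()` each return only the first matching label, so a boolean filter is still the better fit if ties matter.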

Great job! You did not uncover a clear increasing trend in the average online order price per month, so you can't say whether FoodHub's online orders are doing better over time. But perhaps more data collection and analysis could uncover a seasonal pattern of higher average total order prices in the winter months and lower average total order prices in the summer months.
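As a rough first look at the seasonal idea, the monthly averages can be bucketed into seasons. This is only a sketch under stated assumptions: the `season_of` mapping is a hypothetical Northern Hemisphere convention (December to February as winter, and so on), and the result is an unweighted mean of monthly means, not a mean over individual orders, so it ignores how many orders each month had.

```python
import pandas as pd

# Monthly averages copied from the Question 4 output.
monthly_avg = pd.Series(
    [18.618593, 28.540582, 18.397830, 24.859380, 23.787443, 18.823300,
     18.065253, 21.136897, 27.925567, 21.858637, 23.623683, 25.293015],
    index=pd.Index(range(1, 13), name="Month_Number"),
    name="Total_Price",
)

# Assumed season for each month number (hypothetical convention).
season_of = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}

# Group the monthly averages by season label and average within each group.
seasons = monthly_avg.index.map(season_of)
seasonal_avg = monthly_avg.groupby(seasons).mean()
print(seasonal_avg.round(2))
```

One year of data gives only three monthly values per season, so any difference seen here is a prompt for further data collection rather than evidence of a pattern.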
<br><br>
Understanding these types of trends and patterns could give you, as a manager, insight into issues such as how to staff your delivery drivers. If your average online food orders are higher in the winter months, then you may need to have more drivers on staff during that time.
<br><br>
You now have a very good foundation in using Python for data science and are ready to end this workshop with a project that demonstrates the use of data science in the real world.