# <center>Nutrition Facts for McDonald's Menu</center>

In this notebook I try to answer the following questions about the McDonald's nutrition dataset:

1. How many calories does the average McDonald's value meal contain? 
2. How much do beverages, like soda or coffee, contribute to the overall caloric intake? 
3. Does ordering grilled chicken instead of crispy increase a sandwich's nutritional value?
4. What about ordering egg whites instead of whole eggs? 
5. What is the least number of items you could order from the menu to meet one day's nutritional requirements?

## Read the Data

In [None]:
import pandas as pd
df=pd.read_csv('../input/nutrition-facts/menu.csv')
df.head()

## Analyze the data

In [None]:
df.info()

In [None]:
df.describe()

In [None]:
#checking all the categories in our dataset
df['Category'].unique()

All items in the dataset are grouped into 9 categories: 'Breakfast', 'Beef & Pork', 'Chicken & Fish', 'Salads', 'Snacks & Sides', 'Desserts', 'Beverages', 'Coffee & Tea', and 'Smoothies & Shakes'.

In [None]:
#checking the missing value
df.isnull().sum()

Since there are no missing values and the dataset shows no other obvious problems, we can go on to answer the questions.

### How many calories does the average McDonald's value meal contain?

In [None]:
#average calories per category for the first question
df1=df.groupby('Category', as_index=False)['Calories'].mean()
df1

A bar plot makes these averages easier to read and to compare across categories.

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt
plt.figure(figsize=(16, 6))
#recent seaborn versions require x and y to be passed as keywords
sns.barplot(x='Category', y='Calories', data=df1);

This answers the first question: the plot shows the average calories for each category. Chicken & Fish has the highest average.
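
Sorting the averages makes the ranking readable without a plot. The following is a minimal sketch on a toy table; the calorie values are made up and do not come from menu.csv, and `toy`, `avg_by_category`, and `top_category` are names introduced only for this illustration.

```python
import pandas as pd

# Toy stand-in for menu.csv; the calorie values are made up for illustration.
toy = pd.DataFrame({
    'Category': ['Breakfast', 'Breakfast', 'Salads', 'Salads', 'Desserts'],
    'Calories': [300, 500, 140, 380, 330],
})

# Mean calories per category, highest first.
avg_by_category = (
    toy.groupby('Category')['Calories']
       .mean()
       .sort_values(ascending=False)
)
print(avg_by_category)

# Label of the category with the highest average.
top_category = avg_by_category.idxmax()
print('Highest average:', top_category)
```

The same two calls (`sort_values` and `idxmax`) should work on the real per-category averages as well.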

### How much do beverages, like soda or coffee, contribute to the overall caloric intake?

To answer this question I collect the items that are sodas or coffees, calculate their average calories, and divide that by the average daily calorie requirement. The ideal daily calorie intake varies with age, metabolism, and level of physical activity, among other things. A commonly cited recommendation is 2,000 calories a day for women and 2,500 for men.

In [None]:
male_calories = 2500
female_calories = 2000

In [None]:
df_soda=df[df['Category']=='Beverages']
df_soda

The result shows the carbonated drinks (Coca-Cola, Coke, Sprite, and Dr Pepper) at index 110 to 129. We drop the other items, at index 130 to 136, and then move on to coffee.
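
Slicing by index range works only while the row order of the file stays the same. As a hedged alternative, the sketch below selects sodas by name. It uses a toy table and a hypothetical keyword list (`soda_keywords`); neither is taken from menu.csv, so the keywords would need checking against the real item names.

```python
import pandas as pd

# Toy stand-in for the Beverages rows; names and calories are illustrative only.
toy = pd.DataFrame({
    'Item': ['Coca-Cola Classic (Small)', 'Diet Coke (Small)', 'Sprite (Small)',
             'Dr Pepper (Small)', 'Orange Juice (Small)', 'Water Bottle'],
    'Calories': [140, 0, 140, 140, 150, 0],
})

# Hypothetical list of soda brand keywords; adjust to match the real item names.
soda_keywords = ['Coca-Cola', 'Coke', 'Sprite', 'Dr Pepper']
pattern = '|'.join(soda_keywords)

# Keep rows whose Item matches any keyword, regardless of row position.
toy_soda = toy[toy['Item'].str.contains(pattern, case=False, regex=True)]
print(toy_soda)
```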

In [None]:
#.loc slices by index label and includes both endpoints
df_soda=df_soda.loc[110:129]
df_soda

In [None]:
#calculate the average calories of soda
average_soda_calories = df_soda['Calories'].mean()
print('The average calories of soda =',average_soda_calories)
#calculate the male caloric intake from soda
print('Male calories intake from soda =', average_soda_calories/male_calories*100,'%')
#calculate the female caloric intake from soda
print('Female calories intake from soda =', average_soda_calories/female_calories*100,'%')

From the calculation, the average soda accounts for 4.28% of the daily calorie requirement for males and 5.35% for females.
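
The percentage used here is simply the drink's calories divided by the daily requirement, times 100. A small sketch of that arithmetic follows; `share_of_daily_intake` is a helper name introduced for illustration, and the 100 kcal drink is a hypothetical value rather than a figure from the dataset.

```python
def share_of_daily_intake(calories, daily_requirement):
    """Return `calories` as a percentage of `daily_requirement`."""
    if daily_requirement <= 0:
        raise ValueError('daily_requirement must be positive')
    return calories / daily_requirement * 100


# Hypothetical 100 kcal drink against the two reference intakes used above.
drink_calories = 100
for label, requirement in [('male', 2500), ('female', 2000)]:
    share = share_of_daily_intake(drink_calories, requirement)
    print(f'{label}: {share:.2f}% of daily calories')
```

Because the female reference is lower, the same drink always takes up a larger share of it.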


In [None]:
df_coffee=df[df['Item'].str.contains('Coffee')]
df_coffee

These are the items that contain coffee; their index runs from 145 to 210. Now we can calculate the average calorie intake from coffee.

In [None]:
#calculate the average calories of coffee
average_coffee_calories = df_coffee['Calories'].mean()
print('The average calories of coffee =',average_coffee_calories)
#calculate the male caloric intake from coffee
print('Male calories intake from coffee =', average_coffee_calories/male_calories*100,'%')
#calculate the female caloric intake from coffee
print('Female calories intake from coffee =', average_coffee_calories/female_calories*100,'%')

From the calculation, the average coffee item accounts for 5.82% of the daily calorie requirement for males and 7.27% for females. We can conclude that the calorie contribution from coffee is higher than from soda.

### Does ordering grilled chicken instead of crispy increase a sandwich's nutritional value?

For this question I take the items whose names contain 'Sandwich' and compare the nutritional values of grilled and crispy chicken.

In [None]:
#copy so the relabeling below does not write into a slice of df
df3=df[df['Item'].str.contains('Sandwich')].copy()
df3

In [None]:
df3.loc[df3['Item'].str.contains('Grilled'), 'Item']='Grilled Chicken'
df3.loc[df3['Item'].str.contains('Crispy'), 'Item']='Crispy Chicken'
df3

In [None]:
#numeric_only skips text columns such as Category, which cannot be averaged
df3=df3.groupby('Item').mean(numeric_only=True)
df3

To compare nutritional value we need to compare like with like, so I use only the '% Daily Value' columns. Before that we must calculate the % Daily Value of calories, because it is a very important column. As the reference we take 2,250 calories, the average of the male and female requirements.
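
As a possible shortcut for isolating the '% Daily Value' columns, `DataFrame.filter(like=...)` can pick them by name instead of dropping every other column one by one. This is only a sketch on a toy table with made-up numbers; `toy`, `pct`, and `daily_calories` are illustrative names, not part of the notebook.

```python
import pandas as pd

# Toy per-group averages; the numbers are made up for illustration.
toy = pd.DataFrame(
    {
        'Calories': [450.0, 600.0],
        'Total Fat': [12.0, 28.0],
        'Total Fat (% Daily Value)': [18.0, 43.0],
        'Sodium (% Daily Value)': [45.0, 50.0],
    },
    index=['Grilled Chicken', 'Crispy Chicken'],
)

# Average of the 2,000 and 2,500 kcal references used in this notebook.
daily_calories = (2000 + 2500) / 2

toy['Calories (% Daily Value)'] = toy['Calories'] / daily_calories * 100

# Keep only the columns whose name contains '% Daily Value' ...
pct = toy.filter(like='% Daily Value')
# ... and strip the suffix so the labels stay short in a plot.
pct.columns = pct.columns.str.replace(' (% Daily Value)', '', regex=False)
print(pct)
```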

In [None]:
df3['Calories (% Daily Value)'] = (df3['Calories']/2250)*100
df3

In [None]:
#dropping columns
df3 = df3.drop(['Calories','Calories from Fat', 'Total Fat', 'Saturated Fat', 'Trans Fat', 'Cholesterol', 'Sodium', 'Carbohydrates', 'Dietary Fiber', 'Sugars', 'Protein'], axis=1)

df3

In [None]:
df3_new = df3.rename(columns={'Total Fat (% Daily Value)' : 'Total Fat', 
                                    'Saturated Fat (% Daily Value)' : 'Saturated Fat',
                                    'Cholesterol (% Daily Value)' : 'Cholesterol',
                                    'Sodium (% Daily Value)' : 'Sodium',
                                    'Carbohydrates (% Daily Value)' : 'Carbohydrates',
                                    'Dietary Fiber (% Daily Value)' : 'Dietary Fiber',
                                    'Vitamin A (% Daily Value)' : 'Vitamin A',
                                    'Vitamin C (% Daily Value)' : 'Vitamin C',
                                    'Calcium (% Daily Value)' : 'Calcium',
                                    'Iron (% Daily Value)' : 'Iron',
                                    'Calories (% Daily Value)' : 'Calories'
                                   })
df3_new

In [None]:
import numpy as np

col_ = list(df3_new.columns)
row_ = list(df3_new.index)
ncol, nrow = len(col_), len(row_)
fig, ax = plt.subplots(figsize = (16, 8))
pos = np.arange(ncol)
width = 1/1.5/nrow
def autolabel(rects):
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(int(height*10)/10),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')
        
for i, lbl in enumerate(df3_new.index):
    c = f'C{i}'  # i-th color of the default color cycle
    p_ = ax.bar(pos + i*width, [df3_new[col][lbl] for col in col_], width, color = c, label = lbl)
    autolabel(p_)

ax.set_xticks(pos + nrow*width/4)
ax.set_xticklabels(col_)   
ax.set_title('Crispy vs Grilled Chicken Nutritional Value (% Daily Value)')
ax.legend(loc = 'upper right')

plt.show()

The plot lets you compare the nutritional values of crispy and grilled chicken. Crispy is higher in calories, sodium, and total fat, while grilled is higher in vitamins and fiber. Is one better than the other? I think it depends on what the customer cares about: some may look only at calories, others at vitamins or fiber. So we can't conclude that crispy is better than grilled; it is up to you.
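
A grouped bar chart like this one can also be drawn with pandas' built-in plotting, which handles bar positions and colors itself. The sketch below assumes matplotlib 3.4 or newer for `Axes.bar_label` and uses a toy table with made-up percentages, not the notebook's results.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs outside a notebook; omit in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Toy % Daily Value table; the numbers are made up for illustration.
toy = pd.DataFrame(
    {'Total Fat': [18.0, 43.0], 'Sodium': [45.0, 50.0], 'Vitamin C': [15.0, 8.0]},
    index=['Grilled Chicken', 'Crispy Chicken'],
)

fig, ax = plt.subplots(figsize=(8, 4))
# Transpose so nutrients sit on the x-axis and each group becomes one bar series.
toy.T.plot.bar(ax=ax, rot=0)
# bar_label annotates every bar in a series with its height.
for container in ax.containers:
    ax.bar_label(container, fmt='%.1f', padding=3)
ax.set_ylabel('% Daily Value')
ax.set_title('Toy example: grouped bars per nutrient')
```

In a notebook the figure renders at the end of the cell; the explicit backend line is only there so the sketch also runs as a plain script.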

### What about ordering egg whites instead of whole eggs?
The process for answering this question is the same as for the previous one, but comparing Whole Egg with Egg Whites Only.
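
The comparison depends on labeling each item as whole egg or whites only. One way to do that in a single pass, sketched here with `numpy.where` on a toy table, is shown below; the calorie values are illustrative and not read from menu.csv, and the `Egg Type` column is a name introduced for this example.

```python
import numpy as np
import pandas as pd

# Toy rows; calorie values are illustrative, not read from menu.csv.
toy = pd.DataFrame({
    'Item': ['Egg McMuffin', 'Egg White Delight',
             'Sausage McMuffin with Egg', 'Sausage McMuffin with Egg Whites'],
    'Calories': [300, 250, 450, 400],
})

# One pass: anything mentioning 'White' is whites only, the rest is whole egg.
is_white = toy['Item'].str.contains('White')
toy['Egg Type'] = np.where(is_white, 'Whites Only', 'Whole Egg')

print(toy.groupby('Egg Type')['Calories'].mean())
```

Writing the label to a separate column keeps the original item names available for later checks.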

In [None]:
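#'Egg' also matches the 'Egg White' items; copy so the relabeling below does not write into a slice of df
df4 = df[df['Item'].str.contains('Egg')].copy()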
df4

In [None]:
#order matters: label the whites first, because 'Whites Only' no longer contains 'Egg'
df4.loc[df4['Item'].str.contains('White'), 'Item']='Whites Only'
df4.loc[df4['Item'].str.contains('Egg'), 'Item']='Whole Egg'
df4

In [None]:
#numeric_only skips text columns such as Category, which cannot be averaged
df4=df4.groupby('Item').mean(numeric_only=True)
df4

In [None]:
df4['Calories (% Daily Value)'] = (df4['Calories']/2250)*100
df4

In [None]:
df4 = df4.drop(['Calories','Calories from Fat', 'Total Fat', 'Saturated Fat', 'Trans Fat', 'Cholesterol', 'Sodium', 'Carbohydrates', 'Dietary Fiber', 'Sugars', 'Protein'], axis=1)
df4

In [None]:
df4_new = df4.rename(columns={'Total Fat (% Daily Value)' : 'Total Fat', 
                                    'Saturated Fat (% Daily Value)' : 'Saturated Fat',
                                    'Cholesterol (% Daily Value)' : 'Cholesterol',
                                    'Sodium (% Daily Value)' : 'Sodium',
                                    'Carbohydrates (% Daily Value)' : 'Carbohydrates',
                                    'Dietary Fiber (% Daily Value)' : 'Dietary Fiber',
                                    'Vitamin A (% Daily Value)' : 'Vitamin A',
                                    'Vitamin C (% Daily Value)' : 'Vitamin C',
                                    'Calcium (% Daily Value)' : 'Calcium',
                                    'Iron (% Daily Value)' : 'Iron',
                                    'Calories (% Daily Value)' : 'Calories'
                                   })
df4_new

In [None]:
col_ = list(df4_new.columns)
row_ = list(df4_new.index)
ncol, nrow = len(col_), len(row_)
fig, ax = plt.subplots(figsize = (16, 8))
pos = np.arange(ncol)
width = 1/1.5/nrow
def autolabel(rects):
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(int(height*10)/10),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')
        
for i, lbl in enumerate(df4_new.index):
    c = f'C{i}'  # i-th color of the default color cycle
    p_ = ax.bar(pos + i*width, [df4_new[col][lbl] for col in col_], width, color = c, label = lbl)
    autolabel(p_)

ax.set_xticks(pos + nrow*width/4)
ax.set_xticklabels(col_)   
ax.set_title('Egg Whites Only vs Whole Egg Nutritional Value (% Daily Value)')
ax.legend(loc = 'upper right')

plt.show()

In general, the nutritional values of the whole egg are higher than those of egg whites only, which is expected because the yolk adds nutrients.

### What is the least number of items you could order from the menu to meet one day's nutritional requirements?

For this question I use only the daily calorie requirement as the reference. Let's say it is 2,500 calories; we then look for items that fulfill this requirement. I give only one example because there are many possible combinations of items.
We usually eat three times a day, so each meal must provide at least 833 calories (2,500 / 3).

In [None]:
#checking the items that have 833 or more calories
df5=df[df['Calories']>=833]
df5

The result above is only an example; you can build other combinations, such as coffee plus another item or soda plus another item. This question has many possible answers.
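
If calories are the only requirement and each item is ordered at most once, one way to find the smallest number of items is to take the most caloric items first until the target is reached. The sketch below does this on a toy menu with invented items and values; `target`, `ranked`, `n_items`, and `chosen` are illustrative names, and the approach assumes the whole menu adds up to at least the target.

```python
import pandas as pd

# Toy menu; items and calories are invented for illustration.
toy = pd.DataFrame({
    'Item': ['Item A', 'Item B', 'Item C', 'Item D', 'Item E'],
    'Calories': [1150, 1880, 850, 20, 290],
})

target = 2500  # daily calorie reference used above

# Sort from most to least caloric and accumulate until the target is reached.
ranked = toy.sort_values('Calories', ascending=False).reset_index(drop=True)
ranked['Cumulative'] = ranked['Calories'].cumsum()

# Rows still below the target, plus the one row that crosses it.
n_items = int((ranked['Cumulative'] < target).sum()) + 1
chosen = ranked.head(n_items)
print(chosen[['Item', 'Calories', 'Cumulative']])
```

Other nutrients (protein, vitamins, sodium limits) are ignored here, so this answers only the calorie part of the question.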