![logo](./images/logo.png "McDonald's Logo")

<center><h1>Exploratory Data Analysis of the McDonald's Menu</h1></center>
<br/>
<center>Practical Data Science</center>
<center>Shannon Lu, Calvin Lui</center>

For Project 1, we were feeling hungry, so we decided to explore the world of fast food. In particular, our goal was to analyze data related to McDonald's. Most of our work focuses on the McDonald's menu, but we also incorporated related datasets, including tables of daily recommended values and menus from other major fast food chains. The following notebook is a visual guide to the GOOD, the BAD, and the REALLY BAD of McDonald's and fast food.

## Technologies

We carried out our exploratory data analysis in two main ways. The first was Tableau, whose visualizations are embedded throughout the notebook (look for %%HTML). The second was Plotly, a browser-based graphing package for Python. Since both depend on an internet connection, some cells may need to be rerun before their visualizations display.

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
import plotly
from plotly.graph_objs import Bar, Scatter, Figure, Layout, Histogram, Pie, Scatterpolar
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.figure_factory as ff

init_notebook_mode(connected=True)

## 0. What does the McDonald's menu look like?

Source: https://www.kaggle.com/mcdonalds/nutrition-facts

In [3]:
mcdonalds = pd.read_csv('data/mcdonalds.csv')
mcdonalds.head()

Unnamed: 0,Category,Item,Serving Size,Calories,Calories from Fat,Total Fat,Total Fat (% Daily Value),Saturated Fat,Saturated Fat (% Daily Value),Trans Fat,...,Carbohydrates,Carbohydrates (% Daily Value),Dietary Fiber,Dietary Fiber (% Daily Value),Sugars,Protein,Vitamin A (% Daily Value),Vitamin C (% Daily Value),Calcium (% Daily Value),Iron (% Daily Value)
0,Breakfast,Egg McMuffin,4.8 oz (136 g),300,120,13.0,20,5.0,25,0.0,...,31,10,4,17,3,17,10,0,25,15
1,Breakfast,Egg White Delight,4.8 oz (135 g),250,70,8.0,12,3.0,15,0.0,...,30,10,4,17,3,18,6,0,25,8
2,Breakfast,Sausage McMuffin,3.9 oz (111 g),370,200,23.0,35,8.0,42,0.0,...,29,10,4,17,2,14,8,0,25,10
3,Breakfast,Sausage McMuffin with Egg,5.7 oz (161 g),450,250,28.0,43,10.0,52,0.0,...,30,10,4,17,2,21,15,0,30,15
4,Breakfast,Sausage McMuffin with Egg Whites,5.7 oz (161 g),400,210,23.0,35,8.0,42,0.0,...,30,10,4,17,2,21,6,0,25,10


In [4]:
%%HTML
<div class='tableauPlaceholder' id='viz1520299809449' style='position: relative'><noscript><a href='#'><img alt='Daily Percentages of Cholesterol and Sodium, by Category ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;un&#47;unhealthy&#47;bubble&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='unhealthy&#47;bubble' /><param name='tabs' value='no' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;un&#47;unhealthy&#47;bubble&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='filter' value='publish=yes' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1520299809449');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

## 1. Which menu items are the "healthiest"?

#### Assumptions:

a) Healthiness ~ low amounts of: calories, saturated fat, trans fat, cholesterol, sodium, and sugars  
b) Each nutritional attribute plays the same role in determining healthiness (no weighted values)

#### Approach A: Count of Max/Min Select Nutrition Values

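Healthiness is determined by how often an item appears among the ten lowest (or ten highest) values for each of the six nutrition attributes listed above. Items with a value of zero for an attribute are left out of that attribute's "lowest" list.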

NOTE: this approach is focused on extracting the extremes, or the very best/worst food items
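
The counting idea can be sketched in plain Python, independent of pandas. The item names below come from the menu, but the three shortlists are made-up placeholders rather than real top-10 results, and `count_appearances` is a hypothetical helper, not part of the notebook.

```python
from collections import Counter


def count_appearances(shortlists):
    """Count how many shortlists each item appears in."""
    counts = Counter()
    for shortlist in shortlists:
        # set() guards against an item being listed twice in one shortlist
        counts.update(set(shortlist))
    return counts


# Placeholder shortlists (illustrative only, not computed from the dataset)
low_calories = ["Apple Slices", "Side Salad", "Kids Ice Cream Cone"]
low_sodium = ["Apple Slices", "Side Salad"]
low_sugars = ["Side Salad", "Kids French Fries"]

counts = count_appearances([low_calories, low_sodium, low_sugars])
top_items = counts.most_common(2)  # items that show up in the most shortlists
print(top_items)
```

An item that shows up in many shortlists is a candidate for the healthiest (or unhealthiest) label, which is what the pie charts below summarize.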

In [5]:
df = mcdonalds

### Healthiest: ten lowest non-zero values per attribute (where() turns zero rows into NaN, which sort last) ###
lowCalorieItems = df.where(df['Calories'] != 0).sort_values(by=['Calories'],ascending=True).head(10)
lowSatFat = df.where(df['Saturated Fat'] != 0).sort_values(by=['Saturated Fat'],ascending=True).head(10)
lowTransFat = df.where(df['Trans Fat'] != 0).sort_values(by=['Trans Fat'],ascending=True).head(10)
lowCholesterol = df.where(df['Cholesterol'] != 0).sort_values(by=['Cholesterol'],ascending=True).head(10)
lowSugars = df.where(df['Sugars'] != 0).sort_values(by=['Sugars'],ascending=True).head(10)
lowSodium = df.where(df['Sodium'] != 0).sort_values(by=['Sodium'],ascending=True).head(10)

components = [lowCalorieItems, lowSatFat, lowTransFat, lowCholesterol, lowSodium, lowSugars]
potentialHealthiest = pd.concat(components)

### Unhealthiest: ten highest values per attribute ###
highCalorieItems = df.sort_values(by=['Calories'],ascending=False).head(10)
highSatFat = df.sort_values(by=['Saturated Fat'],ascending=False).head(10)
highTransFat = df.sort_values(by=['Trans Fat'],ascending=False).head(10)
highCholesterol = df.sort_values(by=['Cholesterol'],ascending=False).head(10)
highSugars = df.sort_values(by=['Sugars'],ascending=False).head(10)
highSodium = df.sort_values(by=['Sodium'],ascending=False).head(10)

components = [highCalorieItems, highSatFat, highTransFat, highCholesterol, highSodium, highSugars]
potentialUnhealthiest = pd.concat(components)

In [6]:
potentialHealthiest.head()

Unnamed: 0,Category,Item,Serving Size,Calories,Calories from Fat,Total Fat,Total Fat (% Daily Value),Saturated Fat,Saturated Fat (% Daily Value),Trans Fat,...,Carbohydrates,Carbohydrates (% Daily Value),Dietary Fiber,Dietary Fiber (% Daily Value),Sugars,Protein,Vitamin A (% Daily Value),Vitamin C (% Daily Value),Calcium (% Daily Value),Iron (% Daily Value)
101,Snacks & Sides,Apple Slices,1.2 oz (34 g),15.0,0.0,0.0,0.0,0.0,0.0,0.0,...,4.0,1.0,0.0,0.0,3.0,0.0,0.0,160.0,2.0,0.0
100,Snacks & Sides,Side Salad,3.1 oz (87 g),20.0,0.0,0.0,0.0,0.0,0.0,0.0,...,4.0,1.0,1.0,6.0,2.0,1.0,45.0,25.0,2.0,4.0
106,Desserts,Kids Ice Cream Cone,1 oz (29 g),45.0,10.0,1.5,2.0,1.0,4.0,0.0,...,7.0,2.0,0.0,0.0,6.0,1.0,2.0,0.0,4.0,0.0
132,Beverages,Minute Maid 100% Apple Juice Box,6 fl oz (177 ml),80.0,0.0,0.0,0.0,0.0,0.0,0.0,...,21.0,7.0,0.0,0.0,19.0,0.0,0.0,100.0,10.0,0.0
208,Coffee & Tea,Iced Coffee with Sugar Free French Vanilla Syr...,16 fl oz cup,80.0,40.0,4.5,7.0,3.0,15.0,0.0,...,9.0,3.0,0.0,0.0,1.0,1.0,4.0,0.0,4.0,0.0


In [7]:
healthiest_counts = potentialHealthiest.groupby('Item').size().reset_index(name='Count').sort_values(by='Count', ascending=False).head(5)

trace = Pie(
    labels=healthiest_counts['Item'],
    values=healthiest_counts['Count'],
    hoverinfo='label',
    textinfo='value',
    textfont=dict(size=20),
    hole=.4,
)

data = [trace]
layout = Layout(title="Items with the Highest Healthiest Counts")
figure = Figure(data=data, layout=layout)
iplot(figure)

In [8]:
potentialUnhealthiest.head()

Unnamed: 0,Category,Item,Serving Size,Calories,Calories from Fat,Total Fat,Total Fat (% Daily Value),Saturated Fat,Saturated Fat (% Daily Value),Trans Fat,...,Carbohydrates,Carbohydrates (% Daily Value),Dietary Fiber,Dietary Fiber (% Daily Value),Sugars,Protein,Vitamin A (% Daily Value),Vitamin C (% Daily Value),Calcium (% Daily Value),Iron (% Daily Value)
82,Chicken & Fish,Chicken McNuggets (40 piece),22.8 oz (646 g),1880,1060,118.0,182,20.0,101,1.0,...,118,39,6,24,1,87,0,15,8,25
32,Breakfast,Big Breakfast with Hotcakes (Large Biscuit),15.3 oz (434 g),1150,540,60.0,93,20.0,100,0.0,...,116,39,7,28,17,36,15,2,30,40
31,Breakfast,Big Breakfast with Hotcakes (Regular Biscuit),14.8 oz (420 g),1090,510,56.0,87,19.0,96,0.0,...,111,37,6,23,17,36,15,2,25,40
34,Breakfast,Big Breakfast with Hotcakes and Egg Whites (La...,15.4 oz (437 g),1050,450,50.0,77,16.0,81,0.0,...,115,38,7,28,18,35,4,2,25,30
33,Breakfast,Big Breakfast with Hotcakes and Egg Whites (Re...,14.9 oz (423 g),990,410,46.0,70,16.0,78,0.0,...,110,37,6,23,17,35,0,2,25,30


In [9]:
unhealthiest_counts = potentialUnhealthiest.groupby('Item').size().reset_index(name='Count').sort_values(by='Count', ascending=False).head(5)

trace = Pie(
    labels=unhealthiest_counts['Item'],
    values=unhealthiest_counts['Count'],
    hoverinfo='label',
    textinfo='value',
    textfont=dict(size=20),
    hole=.4,
)

data = [trace]
layout = Layout(title="Items with the Highest Unhealthiest Counts")
figure = Figure(data=data, layout=layout)
iplot(figure)

#### Approach B: Healthiness Index

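Healthiness Index = sum of the standardized values (z-scores) of calories, total fat, saturated fat, trans fat, cholesterol, sodium, and sugars  
(Negative = Relatively Healthy, Positive = Relatively Unhealthy)  
Beverages and Coffee & Tea items are excluded from the ranking.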
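
As a minimal sketch of how such an index behaves, the snippet below standardizes two attributes for the first three breakfast rows shown in the menu preview and adds the z-scores. It uses only the standard library, and the three-row sample is an assumption made for brevity: the notebook itself standardizes over the full menu, so the numbers here will not match the real index.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1), the same default as pandas' Series.std()
    return [(v - avg) / sd for v in values]


# Calories and sugars for three rows shown in the menu preview
items = ["Egg McMuffin", "Egg White Delight", "Sausage McMuffin"]
calories = [300, 250, 370]
sugars = [3, 3, 2]

z_calories = standardize(calories)
z_sugars = standardize(sugars)

# Index = sum of z-scores per item; lower means relatively healthier
index = {item: c + s for item, c, s in zip(items, z_calories, z_sugars)}
for item, score in sorted(index.items(), key=lambda pair: pair[1]):
    print(f"{item}: {score:+.3f}")
```

Because each attribute is centered on its own mean, the index values across all standardized items sum to roughly zero, which is why the sign separates relatively healthy from relatively unhealthy items.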

In [10]:
def standardize(df, label):
    """
    Return the column ``label`` of the pd.DataFrame ``df`` as z-scores:
    (value - mean) / standard deviation. ``df`` itself is not modified.
    """
    series = df[label]
    avg = series.mean()
    stdv = series.std()  # sample standard deviation (ddof=1), the pandas default
    return (series - avg) / stdv

In [11]:
scaled = df[['Item', 'Category', 'Calories', 'Total Fat', 'Saturated Fat', 'Trans Fat', 'Cholesterol', 'Sodium', 'Sugars']].copy()
scaled.Calories = standardize(scaled, 'Calories')
scaled['Total Fat'] = standardize(scaled, 'Total Fat')
scaled['Saturated Fat'] = standardize(scaled, 'Saturated Fat')
scaled['Trans Fat'] = standardize(scaled, 'Trans Fat')
scaled.Cholesterol = standardize(scaled, 'Cholesterol')
scaled.Sodium = standardize(scaled, 'Sodium')
scaled.Sugars = standardize(scaled, 'Sugars')

scaled = scaled[scaled.Category != "Beverages"]
scaled = scaled[scaled.Category != "Coffee & Tea"]
scaled['Healthiness Index'] = scaled.sum(axis=1, numeric_only=True)  # skip the Item and Category text columns

In [12]:
th = scaled.sort_values('Healthiness Index').head()

# the healthiest items have negative z-scores; use magnitudes so the radar chart radii are positive
nutrient_cols = ['Calories', 'Total Fat', 'Saturated Fat', 'Trans Fat', 'Cholesterol', 'Sodium', 'Sugars']
th[nutrient_cols] = th[nutrient_cols].abs()

th

Unnamed: 0,Item,Category,Calories,Total Fat,Saturated Fat,Trans Fat,Cholesterol,Sodium,Sugars,Healthiness Index
101,Apple Slices,Snacks & Sides,1.470302,0.997141,1.128868,0.475019,0.629572,0.859146,0.921313,-6.481361
100,Side Salad,Snacks & Sides,1.449492,0.997141,1.128868,0.475019,0.629572,0.841816,0.956181,-6.478089
106,Kids Ice Cream Cone,Desserts,1.345442,0.891552,0.940964,0.475019,0.572278,0.824486,0.81671,-5.866451
99,Kids French Fries,Snacks & Sides,1.074913,0.645177,0.940964,0.475019,0.629572,0.7465,1.025917,-5.538062
87,Premium Southwest Salad (without Chicken),Salads,0.950053,0.680374,0.75306,0.475019,0.514984,0.599193,0.81671,-4.789393


In [13]:
health_axes = ['(Low) Calories','(Low) Total Fat','(Low) Saturated Fat', '(Low) Trans Fat', '(Low) Cholesterol', '(Low) Sodium', '(Low) Sugars']

trace0 = Scatterpolar(
         r = th.iloc[0, 2:9],
         theta = health_axes,
         fill = 'toself',
         fillcolor = 'rgba(0, 155, 0, 0.7)',
         marker=dict(color='rgba(0, 100, 0, 0.7)'),
         name = th.iloc[0, 0]
         )

trace4 = Scatterpolar(
         r = th.iloc[4, 2:9],
         theta = health_axes,
         fill = 'toself',
         fillcolor = 'rgba(0, 255, 0, 0.7)',
         marker=dict(color='rgba(0, 100, 0, 0.7)'),
         name = th.iloc[4, 0]
         )

In [14]:
data = [trace0]
layout = Layout(title="Apple Slices", polar=dict(radialaxis = dict(visible = True,range = [0, 1.5])))
figure = Figure(data=data, layout=layout)
iplot(figure)

Apple Slices has the best (lowest) Healthiness Index score among the non-beverage menu items.

In [15]:
data = [trace4]
layout = Layout(title="Premium Southwest Salad (without Chicken)", polar=dict(radialaxis = dict(visible = True,range = [0, 1.5])))
figure = Figure(data=data, layout=layout)
iplot(figure)

Premium Southwest Salad (without Chicken) has the 5th best Healthiness Index Score on the menu.

In [16]:
tuh = scaled.sort_values('Healthiness Index', ascending=False).head()
tuh

Unnamed: 0,Item,Category,Calories,Total Fat,Saturated Fat,Trans Fat,Cholesterol,Sodium,Sugars,Healthiness Index
82,Chicken McNuggets (40 piece),Chicken & Fish,6.291803,7.309209,2.629207,1.855262,2.407007,5.379737,-0.991049,24.881177
32,Big Breakfast with Hotcakes (Large Biscuit),Breakfast,3.253553,3.226427,2.629207,-0.475019,5.959231,3.057486,-0.433165,17.217721
31,Big Breakfast with Hotcakes (Regular Biscuit),Breakfast,3.003834,2.944856,2.441304,-0.475019,5.959231,2.866854,-0.433165,16.307894
47,Double Quarter Pounder with Cheese,Beef & Pork,1.588758,2.029749,2.441304,5.350683,1.203834,1.359123,-0.677239,13.296212
28,Big Breakfast (Large Biscuit),Breakfast,1.796858,2.663285,2.2534,-0.475019,5.730056,2.052333,-0.921313,13.099598


In [17]:
unhealth_axes = ['Calories','Total Fat','Saturated Fat', 'Trans Fat', 'Cholesterol', 'Sodium', 'Sugars']
trace0 = Scatterpolar(
         r = tuh.iloc[0, 2:9],
         theta = unhealth_axes,
         fill = 'toself',
         fillcolor = 'rgba(155, 0, 0, 0.7)',
         marker=dict(color='rgba(100, 0, 0, 0.7)'),
         name = tuh.iloc[0, 0]
         )

trace4 = Scatterpolar(
         r = tuh.iloc[4, 2:9],
         theta = unhealth_axes,
         fill = 'toself',
         fillcolor = 'rgba(255, 0, 0, 0.7)',
         marker=dict(color='rgba(100, 0, 0, 0.7)'),
         name = tuh.iloc[4, 0]
         )

In [18]:
data = [trace0]
layout = Layout(title="Chicken McNuggets (40 piece)", polar=dict(radialaxis = dict(visible = True,range = [0, 7.5])))
figure = Figure(data=data, layout=layout)
iplot(figure)

Chicken McNuggets (40 piece) has the worst (highest) Healthiness Index score among the non-beverage menu items.

In [19]:
data = [trace4]
layout = Layout(title="Big Breakfast (Large Biscuit)", polar=dict(radialaxis = dict(visible = True,range = [0, 7.5])))
figure = Figure(data=data, layout=layout)
iplot(figure)

Big Breakfast (Large Biscuit) has the 5th worst Healthiness Index Score on the menu.

#### Observations:
* Healthiest food options at McDonald’s include Apple Slices, Side Salad, and Kids Ice Cream Cone  
* Unhealthiest food options include Big Breakfast with Hotcakes and 40 pc Chicken McNuggets

## 2. McDonald's Items vs. Daily Recommended Values

Source: https://www.ncbi.nlm.nih.gov/books/NBK56068/table/summarytables.t2  
(scraped and converted to csv via Scrapy)

In [20]:
vitamins = pd.read_csv('data/vitamins.csv', encoding='utf-8')
vitamins.set_index('Unnamed: 0', inplace=True)
vitamins.index.names = ['index']
vitamins.head()

index,life_stage_group,vitamin_a_ug,vitamin_a,vitamin_c,vitamin_d_ug,vitamin_d,vitamin_e,vitamin_k_ug,vitamin_k,vitamin_b_6,vitamin_b_12_ug,vitamin_b_12,life_stage
1,0–6 mo,400,0.4,40,10,0.01,4,2.0,0.002,0.1,0.4,0.0004,Infants
2,6–12 mo,500,0.5,50,10,0.01,5,2.5,0.0025,0.3,0.5,0.0005,Infants
4,1–3 y,300,0.3,15,15,0.015,6,30.0,0.03,0.5,0.9,0.0009,Children
5,4–8 y,400,0.4,25,15,0.015,7,55.0,0.055,0.6,1.2,0.0012,Children
7,9–13 y,600,0.6,45,15,0.015,11,60.0,0.06,1.0,1.8,0.0018,Males


Source: https://www.ncbi.nlm.nih.gov/books/NBK56068/table/summarytables.t4  
(scraped and converted to csv via Scrapy)

In [21]:
macronutrients = pd.read_csv('data/macronutrients.csv', encoding='utf-8')
macronutrients.set_index('Unnamed: 0', inplace=True)
macronutrients.index.names = ['index']
macronutrients.head()

index,life_stage_group,carbohydrate,total_fiber,fat,linoleic_acid,protein,life_stage
1,0–6 mo,60,ND,31,4.4,9.1,Infants
2,6–12 mo,95,ND,30,4.6,11.0,Infants
4,1–3 y,130,19,"ND,c",7.0,13.0,Children
5,4–8 y,130,25,ND,10.0,19.0,Children
7,9–13 y,130,31,ND,12.0,34.0,Males


In [22]:
%%HTML
<div class='tableauPlaceholder' id='viz1520280080951' style='position: relative'><noscript><a href='#'><img alt='Daily Recommended Values for Carbohydrates and Protein, by Life Stage ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;ma&#47;macronutrients_eda&#47;carbs_protein&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='macronutrients_eda&#47;carbs_protein' /><param name='tabs' value='no' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;ma&#47;macronutrients_eda&#47;carbs_protein&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='filter' value='publish=yes' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1520280080951');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

In [23]:
carbohy = macronutrients[['life_stage', 'life_stage_group', 'carbohydrate']]
protein = macronutrients[['life_stage', 'life_stage_group', 'protein']]

NOTE: Daily Recommended Values for Total Fat are not determined ("ND") for many demographic groups.
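
If the fat column were to be used numerically, the "ND" markers would first need to be handled. The sketch below is one possible way to do that in plain Python; `parse_recommended` is a hypothetical helper, and treating footnoted cells such as "ND,c" the same as "ND" is our assumption.

```python
def parse_recommended(raw):
    """Convert a raw table cell to float, or None when it is marked 'ND'."""
    text = str(raw).strip()
    if text.upper().startswith("ND"):
        return None  # not determined (also covers footnoted forms such as "ND,c")
    return float(text)


# The first five cells of the fat column as shown in the table above
fat_column = ["31", "30", "ND,c", "ND", "ND"]
parsed = [parse_recommended(cell) for cell in fat_column]
determined = [value for value in parsed if value is not None]
print(parsed, len(determined))
```

Only two of the five rows shown carry a usable value, which is why the comparisons below stick to carbohydrates and protein.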

#### 4 Most Popular McDonald's Items
1. French Fries

2. Big Mac

3. Snack Wrap

4. Happy Meal

Source: https://money.howstuffworks.com/10-popular-mcdonalds-menu-items6.htm

In [24]:
# Number 1 French Fries + Number 2 Big Mac + Beverage = Big Mac Meal
big_mac = mcdonalds.loc[mcdonalds['Item'] == "Big Mac"]
med_fries = mcdonalds.loc[mcdonalds['Item'] == "Medium French Fries"]
med_drink = mcdonalds.loc[mcdonalds['Item'] == "Coca-Cola Classic (Medium)"]

In [25]:
colors = ['rgba(255, 0, 0, 1)', 'rgba(205, 0, 0, 1)', 'rgba(155, 0, 0, 1)', 'rgba(105, 0, 0, 1)']
count = 0

meal_carbohy = big_mac.Carbohydrates.values[0] + med_fries.Carbohydrates.values[0] + med_drink.Carbohydrates.values[0]

trace1 = Bar(
    x=carbohy.life_stage,
    y=[meal_carbohy] * len(carbohy.life_stage),
    marker=dict(
        color=colors[count]),
    name='1 Big Mac Meal'
)
count = count + 1

trace2 = Bar(
    x=carbohy.life_stage,
    y=[meal_carbohy * 2] * len(carbohy.life_stage),
    marker=dict(
        color=colors[count]),
    name='2 Big Mac Meals'
)
count = count + 1

trace3 = Bar(
    x=carbohy.life_stage,
    y=carbohy.carbohydrate,
    marker=dict(
        color='rgb(0, 191, 255)'),
    name='Daily Recommended Values'
)

data = [trace1, trace2, trace3]
layout = Layout(title="Daily Recommended Carbohydrates vs. Big Mac Medium Meal", yaxis={'title':'Carbohydrates (g)'})
figure = Figure(data=data, layout=layout)
iplot(figure)

In [26]:
colors = ['rgba(255, 0, 0, 1)', 'rgba(205, 0, 0, 1)', 'rgba(155, 0, 0, 1)', 'rgba(105, 0, 0, 1)']
count = 0

meal_protein = big_mac.Protein.values[0] + med_fries.Protein.values[0] + med_drink.Protein.values[0]

trace1 = Bar(
    x=protein.life_stage,
    y=[meal_protein] * len(protein.life_stage),
    marker=dict(
        color=colors[count]),
    name='1 Big Mac Meal'
)
count = count + 1

trace2 = Bar(
    x=protein.life_stage,
    y=[meal_protein * 2] * len(protein.life_stage),
    marker=dict(
        color=colors[count]),
    name='2 Big Mac Meals'
)
count = count + 1

trace3 = Bar(
    x=protein.life_stage,
    y=protein.protein,
    marker=dict(
        color='rgb(0, 191, 255)'),
    name='Daily Recommended Values'
)

data = [trace1, trace2, trace3]
layout = Layout(title="Daily Recommended Protein vs. Big Mac Medium Meal", yaxis={'title':'Protein (g)'})
figure = Figure(data=data, layout=layout)
iplot(figure)

#### "McDonald’s Is Taking Cheeseburgers and Chocolate Milk Off the Happy Meal Menu"
Source: http://people.com/food/mcdonalds-happy-meal-cheeseburgers-chocolate-milk/

In [27]:
# Happy Meal combination whose cheeseburger and chocolate milk are proposed to be removed by June 2018
cheeseburger = mcdonalds.loc[mcdonalds['Item'] == "Cheeseburger"]
kid_fries = mcdonalds.loc[mcdonalds['Item'] == "Kids French Fries"]
choco_milk = mcdonalds.loc[mcdonalds['Item'] == "Fat Free Chocolate Milk Jug"]

In [28]:
colors = ['rgba(255, 0, 0, 1)', 'rgba(205, 0, 0, 1)', 'rgba(155, 0, 0, 1)', 'rgba(105, 0, 0, 1)']
count = 0

meal_carbohy = cheeseburger.Carbohydrates.values[0] + kid_fries.Carbohydrates.values[0] + choco_milk.Carbohydrates.values[0]

trace1 = Bar(
    x=carbohy.life_stage[0:3],  # same three youngest groups as y
    y=[meal_carbohy] * len(carbohy.life_stage[0:3]),
    marker=dict(
        color=colors[count]),
    name='1 Happy Meal'
)
count = count + 1

trace2 = Bar(
    x=carbohy.life_stage[0:3],  # same three youngest groups as y
    y=[meal_carbohy * 2] * len(carbohy.life_stage[0:3]),
    marker=dict(
        color=colors[count]),
    name='2 Happy Meals'
)
count = count + 1

trace3 = Bar(
    x=carbohy.life_stage[0:3],
    y=carbohy.carbohydrate[0:3],  # match the three groups on the x axis
    marker=dict(
        color='rgb(0, 191, 255)'),
    name='Daily Recommended Values'
)

data = [trace1, trace2, trace3]
layout = Layout(title="Daily Recommended Carbohydrates vs. Happy Meal", yaxis={'title':'Carbohydrates (g)'})
figure = Figure(data=data, layout=layout)
iplot(figure)

In [29]:
colors = ['rgba(255, 0, 0, 1)', 'rgba(205, 0, 0, 1)', 'rgba(155, 0, 0, 1)', 'rgba(105, 0, 0, 1)']
count = 0

meal_protein = cheeseburger.Protein.values[0] + kid_fries.Protein.values[0] + choco_milk.Protein.values[0]

trace1 = Bar(
    x=protein.life_stage[0:3],  # same three youngest groups as y
    y=[meal_protein] * len(protein.life_stage[0:3]),
    marker=dict(
        color=colors[count]),
    name='1 Happy Meal'
)
count = count + 1

trace2 = Bar(
    x=protein.life_stage[0:3],  # use the protein table, same three groups as y
    y=[meal_protein * 2] * len(protein.life_stage[0:3]),
    marker=dict(
        color=colors[count]),
    name='2 Happy Meals'
)
count = count + 1

trace3 = Bar(
    x=protein.life_stage[0:3],
    y=protein.protein[0:3],  # match the three groups on the x axis
    marker=dict(
        color='rgb(0, 191, 255)'),
    name='Daily Recommended Values'
)

data = [trace1, trace2, trace3]
layout = Layout(title="Daily Recommended Protein vs. Happy Meal", yaxis={'title':'Protein (g)'})
figure = Figure(data=data, layout=layout)
iplot(figure)

In [30]:
# Number 3 Snack Wrap vs. Premium McWrap
all_wraps = mcdonalds[mcdonalds['Item'].str.contains("Wrap")]
snack_wraps = mcdonalds[mcdonalds['Item'].str.contains("Snack Wrap")]
premium_wraps = mcdonalds[mcdonalds['Item'].str.contains("Premium McWrap")]

all_wraps[["Item", "Calories", "Total Fat", "Carbohydrates", "Protein"]].sort_values(by=['Calories', 'Total Fat'], ascending=False)

Unnamed: 0,Item,Calories,Total Fat,Carbohydrates,Protein
74,Premium McWrap Southwest Chicken (Crispy Chicken),670,33.0,68,27
70,Premium McWrap Chicken & Bacon (Crispy Chicken),630,32.0,56,32
72,Premium McWrap Chicken & Ranch (Crispy Chicken),610,31.0,56,27
76,Premium McWrap Chicken Sweet Chili (Crispy Chi...,540,23.0,61,23
75,Premium McWrap Southwest Chicken (Grilled Chic...,520,20.0,55,31
71,Premium McWrap Chicken & Bacon (Grilled Chicken),480,19.0,42,36
73,Premium McWrap Chicken & Ranch (Grilled Chicken),450,18.0,42,30
77,Premium McWrap Chicken Sweet Chili (Grilled Ch...,380,10.0,47,27
94,Ranch Snack Wrap (Crispy Chicken),360,20.0,32,15
90,Chipotle BBQ Snack Wrap (Crispy Chicken),340,15.0,37,14


In [31]:
snack_cal = snack_wraps["Calories"]
premium_cal = premium_wraps["Calories"]
snack_fat = snack_wraps["Total Fat (% Daily Value)"]
premium_fat = premium_wraps["Total Fat (% Daily Value)"]
snack_carb = snack_wraps["Carbohydrates"]
premium_carb = premium_wraps["Carbohydrates"]
snack_protein = snack_wraps["Protein"]
premium_protein = premium_wraps["Protein"]

trace0 = Scatter(
    x=snack_cal,
    y=snack_fat,
    name='Snack Wraps',
    mode='markers',
    marker=dict(
        color='rgb(255, 144, 14)',
        opacity=snack_protein / all_wraps["Protein"].max(),  # marker opacity must lie in [0, 1]
        size=snack_carb,
    )
)

trace1 = Scatter(
    x=premium_cal,
    y=premium_fat,
    name='Premium McWraps',
    mode='markers',
    marker=dict(
        color='rgb(93, 164, 214)',
        opacity=premium_protein / all_wraps["Protein"].max(),  # marker opacity must lie in [0, 1]
        size=premium_carb,
    )
)

data = [trace0, trace1]
layout = Layout(title="Snack Wraps vs. Premium McWraps", xaxis={'title':'Calories'}, yaxis={'title':'Total Fat (% Daily Value)'})
figure = Figure(data=data, layout=layout)
iplot(figure)
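
Plotly expects marker opacity to be a number between 0 and 1, so raw protein grams need rescaling before they can drive opacity. The following is a minimal sketch of one way to do that; `scale_to_unit` is a hypothetical helper, dividing by a shared maximum is our choice, and the protein values are a handful of rows read off the wrap table above.

```python
def scale_to_unit(values, max_value=None):
    """Scale non-negative values into [0, 1] by dividing by a maximum."""
    top = max(values) if max_value is None else max_value
    if top <= 0:
        raise ValueError("maximum must be positive")
    return [min(v / top, 1.0) for v in values]


# Protein (g) for a few wraps listed in the table above
snack_protein_g = [15, 14]
premium_protein_g = [27, 32, 36]

# A shared maximum keeps the two groups comparable on one chart
shared_max = max(snack_protein_g + premium_protein_g)
snack_opacity = scale_to_unit(snack_protein_g, shared_max)
premium_opacity = scale_to_unit(premium_protein_g, shared_max)
print(snack_opacity, premium_opacity)
```

With a shared maximum, the most protein-rich wrap is fully opaque and every other marker fades in proportion.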

In [32]:
%%HTML
<div class='tableauPlaceholder' id='viz1520280102932' style='position: relative'><noscript><a href='#'><img alt='Daily Recommended Values for Vitamins, by Life Stage ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;vi&#47;vitamins_eda&#47;vitamins&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='vitamins_eda&#47;vitamins' /><param name='tabs' value='no' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;vi&#47;vitamins_eda&#47;vitamins&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='filter' value='publish=yes' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1520280102932');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

NOTE: McDonald's only provides data for Vitamin A and Vitamin C.

In [157]:
mcdonalds_vitamin = mcdonalds[["Item", "Vitamin A (% Daily Value)", "Vitamin C (% Daily Value)"]].sort_values(by=['Vitamin C (% Daily Value)', 'Vitamin A (% Daily Value)'], ascending=True)

trace0 = Bar(
    y=mcdonalds_vitamin["Item"],
    x=mcdonalds_vitamin["Vitamin A (% Daily Value)"],
    name='Vitamin A',
    orientation='h',
)

trace1 = Bar(
    y=mcdonalds_vitamin["Item"],
    x=mcdonalds_vitamin["Vitamin C (% Daily Value)"],
    name='Vitamin C',
    orientation='h',
)

data = [trace0, trace1]
layout = Layout(title="Vitamins in McDonald's Items", bargap=.1, xaxis={'title':'% of Daily Recommended Values'}, margin=dict(l=350, r=10, t=140, b=80))
figure = Figure(data=data, layout=layout)
iplot(figure)

#### Observations:
* Multiple McDonald’s Meals per Day = BAD  
* Happy Meal for Children = BAD  
* Snack Wraps are lower in calories and total fat than Premium McWraps
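
The comparisons behind the first two observations boil down to a simple ratio: how much of a daily recommended value a given amount covers. As a rough sketch, the snippet below applies that ratio to figures that appear in the tables earlier in this notebook (an Egg McMuffin and the recommended values for children aged 1–3 y); `share_of_daily_value` is a hypothetical helper, and we assume both tables report grams.

```python
def share_of_daily_value(amount, recommended):
    """Return amount as a percentage of the recommended daily value."""
    if recommended <= 0:
        raise ValueError("recommended must be positive")
    return 100.0 * amount / recommended


# Figures taken from tables shown earlier in this notebook:
#   Egg McMuffin: 31 g carbohydrates, 17 g protein
#   Children 1-3 y: 130 g carbohydrates, 13 g protein recommended per day
carb_share = share_of_daily_value(31, 130)
protein_share = share_of_daily_value(17, 13)

print(f"Carbohydrates: {carb_share:.1f}% of the daily recommended value")
print(f"Protein: {protein_share:.1f}% of the daily recommended value")
```

Any share above 100% means a single item already exceeds the recommendation for that group, and the same ratio applies to a whole meal once its items are summed.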

## 3. Is there a correlation between menu categories and their nutritional facts?

#### Tableau Views Included Below: Distribution of Calories, Fat, Cholesterol, Sodium, Carbohydrates, Fiber, Sugar, Protein

In [158]:
%%HTML
<div class='tableauPlaceholder' id='viz1520301270407' style='position: relative'><noscript><a href='#'><img alt=' ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;di&#47;distribution_values&#47;calories_box&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='distribution_values&#47;calories_box' /><param name='tabs' value='yes' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;di&#47;distribution_values&#47;calories_box&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /><param name='filter' value='publish=yes' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1520301270407');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

#### Tableau Views Included Below: Average Nutrition Values, Daily Percentages

In [159]:
%%HTML
<div class='tableauPlaceholder' id='viz1520301176734' style='position: relative'><noscript><a href='#'><img alt=' ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;da&#47;daily_bar&#47;values_bar&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='daily_bar&#47;values_bar' /><param name='tabs' value='yes' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;da&#47;daily_bar&#47;values_bar&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1520301176734');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

#### Observations:
* Breakfast foods from McDonald’s tend to have a high number of unhealthy values  
* Desserts from McDonald’s are better than expected (compared to other categories)

## 4. McDonald's Locations vs. Health Across the U.S.

#### Tableau Views Included Below: Diabetes Map, Scatter

In [160]:
%%HTML
<div class='tableauPlaceholder' id='viz1520277605349' style='position: relative'><noscript><a href='#'><img alt=' ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;di&#47;diabetes_eda&#47;map&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='diabetes_eda&#47;map' /><param name='tabs' value='yes' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;di&#47;diabetes_eda&#47;map&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1520277605349');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

#### Tables Included Below: Obese Map, Scatter

In [161]:
%%HTML
<div class='tableauPlaceholder' id='viz1520277507495' style='position: relative'><noscript><a href='#'><img alt=' ' src='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;ob&#47;obesity_eda&#47;map&#47;1_rss.png' style='border: none' /></a></noscript><object class='tableauViz'  style='display:none;'><param name='host_url' value='https%3A%2F%2Fpublic.tableau.com%2F' /> <param name='embed_code_version' value='3' /> <param name='site_root' value='' /><param name='name' value='obesity_eda&#47;map' /><param name='tabs' value='yes' /><param name='toolbar' value='yes' /><param name='static_image' value='https:&#47;&#47;public.tableau.com&#47;static&#47;images&#47;ob&#47;obesity_eda&#47;map&#47;1.png' /> <param name='animate_transition' value='yes' /><param name='display_static_image' value='yes' /><param name='display_spinner' value='yes' /><param name='display_overlay' value='yes' /><param name='display_count' value='yes' /></object></div>                <script type='text/javascript'>                    var divElement = document.getElementById('viz1520277507495');                    var vizElement = divElement.getElementsByTagName('object')[0];                    vizElement.style.width='100%';vizElement.style.height=(divElement.offsetWidth*0.75)+'px';                    var scriptElement = document.createElement('script');                    scriptElement.src = 'https://public.tableau.com/javascripts/api/viz_v1.js';                    vizElement.parentNode.insertBefore(scriptElement, vizElement);                </script>

#### Observations:
* There exists a mild correlation between diabetes/obesity prevalence and McDonald’s locations
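
One way to put a number on "mild" is a Pearson correlation coefficient between a per-state restaurant density and the prevalence figure. The sketch below only illustrates the call: the helper `pearson_r`, the column names `mcd_per_100k` and `diabetes_pct`, and the four rows are placeholders, not values taken from the Tableau workbooks above.

```python
import pandas as pd

def pearson_r(frame, x_col, y_col):
    """Pearson correlation between two columns, ignoring rows with missing values."""
    pair = frame[[x_col, y_col]].dropna()
    return pair[x_col].corr(pair[y_col])

# Placeholder rows purely to show the call; NOT real state-level data.
toy_states = pd.DataFrame({
    "mcd_per_100k": [3.0, 4.0, 5.0, 6.0],
    "diabetes_pct": [8.0, 9.5, 9.0, 11.0],
})
r = pearson_r(toy_states, "mcd_per_100k", "diabetes_pct")
print(f"r = {r:.2f}")
```

Values near 0 indicate little linear relationship and values near +1 or -1 a strong one. A correlation of this kind does not by itself say anything about cause.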

## 5. McDonald's vs. Other Fast Food Chains

In [162]:
# reformatting base mcdonalds dataframe
mcd = df[['Item', 'Calories', 'Total Fat', 'Saturated Fat', 'Trans Fat', 'Cholesterol', 'Sodium', 'Sugars']].copy()
mcd['restaurant'] = 'McDonalds'
mcd = mcd.rename(index=str, columns={"Item": "item", "Calories": "calories", "Total Fat": "total_fat", "Saturated Fat": "saturated_fat", "Trans Fat": "trans_fat", "Cholesterol": "cholesterol", "Sodium": "sodium", "Sugars": "sugars"})
mcd.head()

Unnamed: 0,item,calories,total_fat,saturated_fat,trans_fat,cholesterol,sodium,sugars,restaurant
0,Egg McMuffin,300,13.0,5.0,0.0,260,750,3,McDonalds
1,Egg White Delight,250,8.0,3.0,0.0,25,770,3,McDonalds
2,Sausage McMuffin,370,23.0,8.0,0.0,45,780,2,McDonalds
3,Sausage McMuffin with Egg,450,28.0,10.0,0.0,285,860,2,McDonalds
4,Sausage McMuffin with Egg Whites,400,23.0,8.0,0.0,50,880,2,McDonalds


From: https://fastfoodnutrition.org/burger-king  
(scraped and converted to csv using pandas.read_html)

In [163]:
bk = pd.read_csv('data/burger_king.csv')
bk['restaurant'] = 'Burger King'
bk = bk[['item', 'calories', 'total_fat', 'saturated_fat', 'trans_fat', 'cholesterol', 'sodium', 'sugars', 'restaurant']].copy()
bk.head()

Unnamed: 0,item,calories,total_fat,saturated_fat,trans_fat,cholesterol,sodium,sugars,restaurant
0,Bacon Cheeseburger,330,16,7.0,0.0,55,830,7,Burger King
1,Bacon Cheeseburger Deluxe,290,14,6.0,0.5,40,720,7,Burger King
2,Bacon King,1040,48,28.0,2.5,220,1900,10,Burger King
3,Bacon King Jr,730,39,9.0,0.0,90,1930,16,Burger King
4,BBQ Bacon King,1100,75,29.0,3.0,220,1850,13,Burger King
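
The scrape-and-convert step noted above is not shown in the notebook. Here is a minimal sketch of how it could look, assuming the page exposes its nutrition data as HTML tables that `pandas.read_html` can parse. The helper names `to_snake_case` and `scrape_menu` are hypothetical, and the real page headers may differ from the `'Total Fat (g)'` style assumed here.

```python
import re
import pandas as pd

def to_snake_case(name):
    """Turn a header such as 'Total Fat (g)' into 'total_fat'."""
    name = re.sub(r"\(.*?\)", "", str(name))            # drop unit suffixes like '(g)'
    name = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())  # other separators -> '_'
    return name.strip("_").lower()

def scrape_menu(url):
    """Read every HTML table on the page and stack them into one frame.

    Assumes all tables share the same columns. Needs network access and an
    HTML parser (lxml, or bs4 + html5lib) to be installed.
    """
    tables = pd.read_html(url)
    menu = pd.concat(tables, ignore_index=True)
    return menu.rename(columns=to_snake_case)

# Not executed here because it requires network access:
# scrape_menu("https://fastfoodnutrition.org/burger-king").to_csv(
#     "data/burger_king.csv", index=False)
```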


From: https://fastfoodnutrition.org/wendys  
(scraped and converted to csv using pandas.read_html)

In [164]:
wd = pd.read_csv('data/wendys.csv')
wd['restaurant'] = 'Wendys'
wd = wd[['item', 'calories', 'total_fat', 'saturated_fat', 'trans_fat', 'cholesterol', 'sodium', 'sugars', 'restaurant']].copy()
wd.head()

Unnamed: 0,item,calories,total_fat,saturated_fat,trans_fat,cholesterol,sodium,sugars,restaurant
0,Asiago Ranch Club w/ Homestyle Chicken,690,36,12.0,0.0,95,1630,9,Wendys
1,Asiago Ranch Club w/SpicyChicken,710,37,12.0,0.0,110,1630,9,Wendys
2,Asiago Ranch Club w/Ultimate Chicken Grill,570,27,10.0,0.0,125,1530,9,Wendys
3,Bacon Queso Burger,550,32,14.0,1.5,110,1140,7,Wendys
4,Bacon Queso Chicken Sandwich,590,27,9.0,0.0,105,1600,7,Wendys


From: https://fastfoodnutrition.org/taco-bell  
(scraped and converted to csv using pandas.read_html)

In [165]:
tb = pd.read_csv('data/taco_bell.csv')
tb['restaurant'] = 'Taco Bell'
tb = tb[['item', 'calories', 'total_fat', 'saturated_fat', 'trans_fat', 'cholesterol', 'sodium', 'sugars', 'restaurant']].copy()
tb.head()

Unnamed: 0,item,calories,total_fat,saturated_fat,trans_fat,cholesterol,sodium,sugars,restaurant
0,1/2 lb.* Cheesy Potato Burrito,540,26,7.0,1.0,45,1360,4,Taco Bell
1,1/2 lb.* Combo Burrito,460,18,7.0,1.0,45,1320,3,Taco Bell
2,7-Layer Burrito,510,19,7.0,0.0,20,1090,4,Taco Bell
3,Bean Burrito,370,11,4.0,0.0,5,960,3,Taco Bell
4,Beefy 5-Layer Burrito,550,22,8.0,0.0,35,1270,5,Taco Bell


From: https://fastfoodnutrition.org/chick-fil-a  
(scraped and converted to csv using pandas.read_html)

In [166]:
cfa = pd.read_csv('data/chick_fil_a.csv')
cfa['restaurant'] = 'Chick Fil A'
cfa = cfa[['item', 'calories', 'total_fat', 'saturated_fat', 'trans_fat', 'cholesterol', 'sodium', 'sugars', 'restaurant']].copy()
cfa.head()

Unnamed: 0,item,calories,total_fat,saturated_fat,trans_fat,cholesterol,sodium,sugars,restaurant
0,Chargrilled Chicken Club Sandwich,430.0,16.0,8.0,0.0,85.0,1120.0,7.0,Chick Fil A
1,Chargrilled Chicken Sandwich,310.0,6.0,2.0,0.0,55.0,820.0,7.0,Chick Fil A
2,Chick-n-Strips 1 Piece,120.0,6.0,3.0,0.0,25.0,320.0,1.0,Chick Fil A
3,Chick-n-Strips 2 Piece,230.0,12.0,3.0,0.0,55.0,630.0,1.0,Chick Fil A
4,Chick-n-Strips 3 Piece,350.0,17.0,3.0,0.0,70.0,940.0,3.0,Chick Fil A


In [167]:
mcd_cal = mcd["calories"].dropna(axis=0, how='any')
bk_cal = bk["calories"].dropna(axis=0, how='any')
wd_cal = wd["calories"].dropna(axis=0, how='any')
tb_cal = tb["calories"].dropna(axis=0, how='any')
cfa_cal = cfa["calories"].dropna(axis=0, how='any')

hist_data = [mcd_cal, bk_cal, wd_cal, tb_cal, cfa_cal]
group_labels = ["McDonald's", 'Burger King', "Wendy's", 'Taco Bell', 'Chick-fil-A']
colors = ['#E74C3C', '#F39C12', '#F4D03F', '#2ECC71', '#3498DB']

figure = ff.create_distplot(hist_data, group_labels, show_hist=False, colors=colors, bin_size=[50]*5)
figure['layout'].update(title='Distribution Plot of Calories, Fast Food Items')
iplot(figure)

In [168]:
mcd_fat = mcd["total_fat"].dropna(axis=0, how='any')
bk_fat = bk["total_fat"].dropna(axis=0, how='any')
wd_fat = wd["total_fat"].dropna(axis=0, how='any')
tb_fat = tb["total_fat"].dropna(axis=0, how='any')
cfa_fat = cfa["total_fat"].dropna(axis=0, how='any')

hist_data = [mcd_fat, bk_fat, wd_fat, tb_fat, cfa_fat]
group_labels = ["McDonald's", 'Burger King', "Wendy's", 'Taco Bell', 'Chick-fil-A']
colors = ['#E74C3C', '#F39C12', '#F4D03F', '#2ECC71', '#3498DB']

figure = ff.create_distplot(hist_data, group_labels, show_hist=False, colors=colors, bin_size=[50]*5)
figure['layout'].update(title='Distribution Plot of Total Fat, Fast Food Items')
iplot(figure)

In [169]:
mcd_chol = mcd["cholesterol"].dropna(axis=0, how='any')
bk_chol = bk["cholesterol"].dropna(axis=0, how='any')
wd_chol = wd["cholesterol"].dropna(axis=0, how='any')
tb_chol = tb["cholesterol"].dropna(axis=0, how='any')
cfa_chol = cfa["cholesterol"].dropna(axis=0, how='any')

hist_data = [mcd_chol, bk_chol, wd_chol, tb_chol, cfa_chol]
group_labels = ["McDonald's", 'Burger King', "Wendy's", 'Taco Bell', 'Chick-fil-A']
colors = ['#E74C3C', '#F39C12', '#F4D03F', '#2ECC71', '#3498DB']

figure = ff.create_distplot(hist_data, group_labels, show_hist=False, colors=colors, bin_size=[50]*5)
figure['layout'].update(title='Distribution Plot of Cholesterol, Fast Food Items')
iplot(figure)

In [170]:
mcd_sod = mcd["sodium"].dropna(axis=0, how='any')
bk_sod = bk["sodium"].dropna(axis=0, how='any')
wd_sod = wd["sodium"].dropna(axis=0, how='any')
tb_sod = tb["sodium"].dropna(axis=0, how='any')
cfa_sod = cfa["sodium"].dropna(axis=0, how='any')

hist_data = [mcd_sod, bk_sod, wd_sod, tb_sod, cfa_sod]
group_labels = ["McDonald's", 'Burger King', "Wendy's", 'Taco Bell', 'Chick-fil-A']
colors = ['#E74C3C', '#F39C12', '#F4D03F', '#2ECC71', '#3498DB']

figure = ff.create_distplot(hist_data, group_labels, show_hist=False, colors=colors, bin_size=[50]*5)
figure['layout'].update(title='Distribution Plot of Sodium, Fast Food Items')
iplot(figure)

In [171]:
mcd_sug = mcd["sugars"].dropna(axis=0, how='any')
bk_sug = bk["sugars"].dropna(axis=0, how='any')
wd_sug = wd["sugars"].dropna(axis=0, how='any')
tb_sug = tb["sugars"].dropna(axis=0, how='any')
cfa_sug = cfa["sugars"].dropna(axis=0, how='any')

hist_data = [mcd_sug, bk_sug, wd_sug, tb_sug, cfa_sug]
group_labels = ["McDonald's", 'Burger King', "Wendy's", 'Taco Bell', 'Chick-fil-A']
colors = ['#E74C3C', '#F39C12', '#F4D03F', '#2ECC71', '#3498DB']

figure = ff.create_distplot(hist_data, group_labels, show_hist=False, colors=colors, bin_size=[50]*5)
figure['layout'].update(title='Distribution Plot of Sugar, Fast Food Items')
iplot(figure)
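
The five cells above differ only in the column name and the plot title. A possible refactor is sketched below, assuming each restaurant frame carries the snake_case columns built earlier. `collect_series` and `nutrient_distplot` are hypothetical helper names, and the two small frames are placeholders so the sketch runs on its own.

```python
import pandas as pd

def collect_series(frames, column):
    """Return one NaN-free Series of `column` per restaurant, in dict order."""
    return [frame[column].dropna() for frame in frames.values()]

def nutrient_distplot(frames, column, title, colors=None):
    """Build a curve-only distribution plot of `column` for every restaurant."""
    import plotly.figure_factory as ff  # imported lazily; the curves also need scipy
    figure = ff.create_distplot(collect_series(frames, column),
                                list(frames.keys()),
                                show_hist=False, colors=colors)
    figure['layout'].update(title=title)
    return figure

# Tiny stand-in frames (placeholder values, not menu data).
frames = {
    "A": pd.DataFrame({"calories": [100.0, None, 300.0]}),
    "B": pd.DataFrame({"calories": [250.0, 400.0]}),
}
cleaned = collect_series(frames, "calories")
print([len(s) for s in cleaned])
```

In the notebook, `frames` would map the five legend labels to `mcd`, `bk`, `wd`, `tb`, and `cfa`, and each cell would reduce to a single `iplot(nutrient_distplot(frames, 'sodium', 'Distribution Plot of Sodium, Fast Food Items', colors))` call.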

#### Revisiting the Healthiness Index...

In [172]:
restaurants = pd.concat([mcd, bk, cfa, tb, wd])

# standardize each nutrient column, then sum them into one score per item
nutrient_cols = ['calories', 'total_fat', 'saturated_fat', 'trans_fat', 'cholesterol', 'sodium', 'sugars']

scaledRestData = restaurants.copy()
for col in nutrient_cols:
    scaledRestData[col] = standardize(scaledRestData, col)

# sum only the nutrient columns; 'item' and 'restaurant' hold strings
scaledRestData['Healthiness Index'] = scaledRestData[nutrient_cols].sum(axis=1)
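
The `standardize` helper is defined earlier in the notebook. For readers of this section alone, here is a minimal sketch of what such a helper could look like, assuming it computes a z-score (subtract the column mean, divide by the column standard deviation). `zscore` is a hypothetical stand-in name and the three rows are placeholders.

```python
import pandas as pd

def zscore(frame, column):
    """Standardize one column: (value - mean) / sample standard deviation."""
    values = frame[column]
    return (values - values.mean()) / values.std()

# Placeholder menu; not rows from any of the datasets above.
toy = pd.DataFrame({
    "item": ["a", "b", "c"],
    "calories": [100.0, 200.0, 300.0],
    "sodium": [10.0, 30.0, 20.0],
})
cols = ["calories", "sodium"]
scaled = toy.copy()
for col in cols:
    scaled[col] = zscore(toy, col)

# A lower index means the item sits below the menu average on the summed nutrients.
scaled["Healthiness Index"] = scaled[cols].sum(axis=1)
print(scaled)
```

Under this assumption every nutrient carries equal weight in the index, and a negative index means an item is below the all-menu average overall.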

In [173]:
# drop rows with missing values, then remove beverages by name
noBeverages = scaledRestData.dropna(axis=0, how='any')
beverage_pattern = 'Water|Coffee|Tea|Coke|Lemonade|Pepsi|Dr Pepper|Mountain Dew'
noBeverages = noBeverages[~noBeverages['item'].str.contains(beverage_pattern)].copy()

# take absolute values so below-average (negative) scores plot as positive radii on the radar chart
nutrient_cols = ['calories', 'total_fat', 'saturated_fat', 'trans_fat', 'cholesterol', 'sodium', 'sugars']
noBeverages[nutrient_cols] = noBeverages[nutrient_cols].abs()

# keep the ten lowest (healthiest) index scores
noBeverages = noBeverages.sort_values('Healthiness Index').head(10)
noBeverages

Unnamed: 0,item,calories,total_fat,saturated_fat,trans_fat,cholesterol,sodium,sugars,restaurant,Healthiness Index
175,Border Sauce Mild,1.469456,1.020191,0.982463,0.38441,0.681726,0.967239,0.728437,Taco Bell,-6.233921
174,Border Sauce Hot,1.469456,1.020191,0.982463,0.38441,0.681726,0.949601,0.728437,Taco Bell,-6.216283
173,Border Sauce Fire,1.469456,1.020191,0.982463,0.38441,0.681726,0.923144,0.728437,Taco Bell,-6.189826
190,Salsa Verde,1.447971,1.020191,0.982463,0.38441,0.681726,0.931963,0.728437,Taco Bell,-6.177159
182,Pico de Gallo,1.447971,1.020191,0.982463,0.38441,0.681726,0.967239,0.692999,Taco Bell,-6.176998
101,Apple Slices,1.404999,1.020191,0.982463,0.38441,0.681726,1.028973,0.622124,McDonalds,-6.124885
100,Side Salad,1.383514,1.020191,0.982463,0.38441,0.681726,1.011335,0.657561,McDonalds,-6.121199
187,Pickles (2),1.469456,1.020191,0.982463,0.38441,0.681726,0.852591,0.728437,Burger King,-6.119273
188,Salsa,1.447971,1.020191,0.982463,0.38441,0.681726,0.887867,0.692999,Taco Bell,-6.097626
183,Pizza Sauce,1.426485,1.020191,0.982463,0.38441,0.681726,0.887867,0.692999,Taco Bell,-6.07614


In [174]:
health_axes = ['(Low) Calories','(Low) Total Fat','(Low) Saturated Fat', '(Low) Trans Fat', '(Low) Cholesterol', '(Low) Sodium', '(Low) Sugars']

trace0 = Scatterpolar(
         r = noBeverages.iloc[4, 1:8],
         theta = health_axes,
         fill = 'toself',
         fillcolor = 'rgba(0, 255, 0, 0.7)',
         marker=dict(color='rgba(0, 100, 0, 0.7)'),
         name = noBeverages.iloc[4, 0]
         )

In [175]:
data = [trace0]
layout = Layout(title="TB Pico de Gallo", polar=dict(radialaxis = dict(visible = True,range = [0, 1.5])))
figure = Figure(data=data, layout=layout)
iplot(figure)

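Taco Bell's Pico de Gallo has one of the best Healthiness Index scores across all menus. It ranks fifth in the table above, behind four Taco Bell sauces (Border Sauce Mild, Hot, and Fire, and Salsa Verde).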

In [176]:
unhealthiest = scaledRestData.sort_values('Healthiness Index', ascending = False).head()
unhealthiest

Unnamed: 0,item,calories,total_fat,saturated_fat,trans_fat,cholesterol,sodium,sugars,restaurant,Healthiness Index
70,Bk Ultimate Breakfast Platter,4.76139,4.833817,4.604431,1.79964,6.078975,4.121382,0.724507,Burger King,26.924142
82,Chicken McNuggets (40 piece),6.609158,7.203296,2.742133,1.79964,2.865969,5.32078,-0.692999,McDonalds,25.847977
16,Rodeo King,3.901963,4.694436,4.790661,7.259765,2.397405,2.974899,-0.23231,Burger King,25.786819
11,Farmhouse King,3.773049,4.555055,4.231971,6.16774,3.803096,2.586858,-0.196872,Burger King,24.920897
14,Dave's Triple Cheeseburger,3.085507,3.649077,4.604431,8.35179,2.732094,2.533944,-0.37406,Wendys,24.582782


In [177]:
unhealth_axes = ['Calories','Total Fat','Saturated Fat', 'Trans Fat', 'Cholesterol', 'Sodium', 'Sugars']
trace0 = Scatterpolar(
         r = unhealthiest.iloc[0, 1:8],
         theta = unhealth_axes,
         fill = 'toself',
         fillcolor = 'rgba(255, 0, 0, 0.7)',
         marker=dict(color='rgba(100, 0, 0, 0.7)'),
         name = unhealthiest.iloc[0, 0]
         )

trace1 = Scatterpolar(
         r = unhealthiest.iloc[1, 1:8],
         theta = unhealth_axes,
         fill = 'toself',
         fillcolor = 'rgba(255, 165, 0, 0.7)',
         marker=dict(color='rgba(100, 0, 0, 0.7)'),
         name = unhealthiest.iloc[1, 0]
         )

trace2 = Scatterpolar(
         r = unhealthiest.iloc[2, 1:8],
         theta = unhealth_axes,
         fill = 'toself',
         fillcolor = 'rgba(255, 100, 255, 0.7)',
         marker=dict(color='rgba(100, 0, 0, 0.7)'),
         name = unhealthiest.iloc[2, 0]
         )

In [178]:
data = [trace0]
layout = Layout(title="BK Ultimate Breakfast Platter", polar=dict(radialaxis = dict(visible = True,range = [0, 7.5])))
figure = Figure(data=data, layout=layout)
iplot(figure)

Burger King's Ultimate Breakfast Platter has the worst Healthiness Index Score across all menus.

In [179]:
data = [trace0, trace1, trace2]
layout = Layout(title="Top 3 Unhealthiest Fast Food Items", polar=dict(radialaxis = dict(visible = True,range = [0, 7.5])))
figure = Figure(data=data, layout=layout)
iplot(figure)

#### Observations:
* No major fast food chain is markedly healthier than the others  
* Burger King accounts for 7 of the Top 10 unhealthiest items across all fast food chains
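
A count like the one in the second bullet can be reproduced by taking the ten highest index values and tallying the `restaurant` column. The sketch below uses placeholder rows, and `top_n_by_restaurant` is a hypothetical helper name.

```python
import pandas as pd

def top_n_by_restaurant(frame, n=10, score_col="Healthiness Index"):
    """Count how many of the n highest-scoring items each restaurant has."""
    top = frame.nlargest(n, score_col)
    return top["restaurant"].value_counts()

# Placeholder scores; not the notebook's data.
toy = pd.DataFrame({
    "item": ["a", "b", "c", "d", "e"],
    "restaurant": ["X", "X", "Y", "Z", "Y"],
    "Healthiness Index": [9.0, 7.5, 8.0, -1.0, 3.0],
})
counts = top_n_by_restaurant(toy, n=3)
print(counts)
```

In the notebook the call would be `top_n_by_restaurant(scaledRestData)`, since a higher index marks a less healthy item.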

## Findings (Recap of Observations)

#### From Question 1:
* Healthiest food options at McDonald’s include Apple Slices, Side Salad and Kid’s Ice Cream Cone  
* Unhealthiest food options include Big Breakfast with Hotcakes and 40 pc Chicken McNuggets

#### From Question 2:
* Multiple McDonald’s Meals per Day = BAD  
* Happy Meal for Children = BAD  
* Snack Wraps are healthier than Premium McWraps

#### From Question 3:
* Breakfast foods from McDonald’s tend to have unhealthy values for a large number of nutrients  
* Desserts from McDonald’s are better than expected (compared to other categories)

#### From Question 4:
* There exists a mild correlation between diabetes/obesity prevalence and McDonald’s locations

#### From Question 5:
* No major fast food chain is markedly healthier than the others  
* Burger King accounts for 7 of the Top 10 unhealthiest items across all fast food chains

## Implications

* It is important to be aware of the nutritional value of fast food
* Parents in particular should watch what their children order
* The public image of McDonald’s is generally accurate
* There is an opportunity for a major fast food chain to brand itself as a champion of health


## Questions for Future Analysis

* How healthy is eating at McDonald’s compared to eating at a sit-down restaurant?
* What are the long-term health implications of eating at McDonald’s?
* Are the price and convenience of McDonald’s worth the health tradeoffs?