# Introduction
The dataset I want to analyze is called “Nutrition Facts for McDonald’s Menu”. It records the nutrition analysis of every item on the United States McDonald’s menu and contains 24 columns and 260 rows. 
Existing visualizations of this dataset mostly use bar charts to compare variables rather than scatterplots, which would reveal relationships between variables. I use them as inspiration for this project, looking at the data from different angles and adding user interaction to my visualizations. 
The questions that motivate me include: What is the average calorie count of a McDonald’s meal? How does a McDonald’s meal fit into the recommended daily nutrient intake? Are there any correlations or relationships between different nutrients, for example, a positive relationship between total fat and sodium? If you have certain conditions (high blood pressure, high cholesterol, etc.), what type of meal would you want to avoid at McDonald’s?

# The Dataset

In [205]:
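import pandas as pd

# Load the menu data; read_csv already returns a DataFrame
data = pd.read_csv('menu.csv')
# Work on a copy so `data` keeps every row for the dropdown options
df = data.copy()
df.head()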

Unnamed: 0,Category,Item,Serving Size,Calories,Calories from Fat,Total Fat,Total Fat (% Daily Value),Saturated Fat,Saturated Fat (% Daily Value),Trans Fat,...,Carbohydrates,Carbohydrates (% Daily Value),Dietary Fiber,Dietary Fiber (% Daily Value),Sugars,Protein,Vitamin A (% Daily Value),Vitamin C (% Daily Value),Calcium (% Daily Value),Iron (% Daily Value)
0,Breakfast,Egg McMuffin,4.8 oz (136 g),300,120,13.0,20,5.0,25,0.0,...,31,10,4,17,3,17,10,0,25,15
1,Breakfast,Egg White Delight,4.8 oz (135 g),250,70,8.0,12,3.0,15,0.0,...,30,10,4,17,3,18,6,0,25,8
2,Breakfast,Sausage McMuffin,3.9 oz (111 g),370,200,23.0,35,8.0,42,0.0,...,29,10,4,17,2,14,8,0,25,10
3,Breakfast,Sausage McMuffin with Egg,5.7 oz (161 g),450,250,28.0,43,10.0,52,0.0,...,30,10,4,17,2,21,15,0,30,15
4,Breakfast,Sausage McMuffin with Egg Whites,5.7 oz (161 g),400,210,23.0,35,8.0,42,0.0,...,30,10,4,17,2,21,6,0,25,10


# Data Visualization 

First, I would like to explore the relationship between calories and total fat. Calories and total fat are the two most important factors for health-conscious users when deciding what to eat. The key element I chose to visualize this interaction is categorization: it helps with decision-making because users can read the chart more easily by their chosen category. For color, I chose hue because we are dealing with categorical data, and it helps users distinguish between categories quickly.

In [206]:
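import altair as alt

# Dropdown widget listing every menu category
dropdown = alt.binding_select(options=data["Category"].unique(), name="Select a Category:")
# Single-value selection on Category, driven by the dropdown
selection = alt.selection(type="single", fields=["Category"], bind=dropdown)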

alt.Chart(df).mark_circle().encode(
    x = "Calories",
    y = "Total Fat",
    color=alt.Color('Category', scale=alt.Scale(scheme='set2')),
    tooltip=["Item", "Calories"],
    opacity=alt.condition(selection,alt.value(1),alt.value(.2))
).add_selection(selection)
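
To complement the scatterplot, the strength of the linear relationship between calories and total fat can be summarized with a Pearson correlation coefficient. The sketch below is only illustrative: it uses the five breakfast rows from the `df.head()` preview above rather than the full `menu.csv`, and the name `sample` is my own. On the full dataset, the same method could presumably be called on `df`.

```python
import pandas as pd

# The five rows shown in the df.head() preview (NOT the full dataset)
sample = pd.DataFrame({
    "Item": [
        "Egg McMuffin",
        "Egg White Delight",
        "Sausage McMuffin",
        "Sausage McMuffin with Egg",
        "Sausage McMuffin with Egg Whites",
    ],
    "Calories": [300, 250, 370, 450, 400],
    "Total Fat": [13.0, 8.0, 23.0, 28.0, 23.0],
})

# Pearson correlation between the two plotted columns
r = sample["Calories"].corr(sample["Total Fat"])
print(f"Pearson r on the 5-row sample: {r:.3f}")
```

A value close to 1 on this small sample only hints at a positive relationship; the full dataset would be needed to draw a conclusion.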

It can be seen that Chicken McNuggets (40 piece) is an outlier. Because it is a family meal and not comparable to the other items, 
I decided to remove this data point from the dataset. The following graph shows the relationship between calories and total fat without the outlier.

In [207]:
import altair as alt

# Remove the family-size outlier before re-plotting
df = df[df.Item != 'Chicken McNuggets (40 piece)']

dropdown = alt.binding_select(options=data["Category"].unique(), name="Select a Category:")
selection = alt.selection(type="single", fields=["Category"], bind=dropdown)

alt.Chart(df).mark_circle().encode(
    x = "Calories",
    y = "Total Fat",
    color=alt.Color('Category', scale=alt.Scale(scheme='set2')),
    tooltip=["Item", "Calories from Fat"],
    opacity=alt.condition(selection,alt.value(1),alt.value(.2))
).add_selection(selection).properties(title='Calories vs. Total Fat')


Next, I would like to compare the means between categories to see which category has the highest calories on average. I chose to visualize this using a bar chart sorted in ascending order, so users can immediately tell which categories have the lowest and highest mean calories. Since mean calories are sequential data, I used lightness and saturation for color to communicate the magnitude of the data, which helps with the flow of decision-making.

In [208]:
# Bar chart of mean calories per category, sorted in ascending order
base = alt.Chart(df).mark_bar().encode(
    y='mean(Calories):Q',
    color=alt.Color('mean(Calories):Q')
).properties(width=400, height=400)

base.encode(
    alt.X(field='Category', type='nominal', sort='y')
).properties(
    title='Mean of calories based on category'
)
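
The same ranking can be checked numerically. As a rough sketch, the bar chart's aggregation corresponds to a pandas `groupby` followed by an ascending sort. The frame below is a stand-in: the Breakfast calories come from the `df.head()` preview, while the "Example A" and "Example B" categories and their values are made up purely for illustration and are not taken from `menu.csv`.

```python
import pandas as pd

# Stand-in data: only the Breakfast values are real (from the df.head() preview)
toy = pd.DataFrame({
    "Category": ["Breakfast"] * 5 + ["Example A"] * 2 + ["Example B"] * 2,
    "Calories": [
        300, 250, 370, 450, 400,  # Breakfast rows from the preview
        100, 200,                 # made-up values for "Example A"
        500, 700,                 # made-up values for "Example B"
    ],
})

# Mean calories per category, lowest first (same order as the sorted bars)
mean_calories = toy.groupby("Category")["Calories"].mean().sort_values()
print(mean_calories)
```

With the real data, replacing `toy` with `df` should give the values that the bar heights encode.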


I will compare the total amount of sugar and protein in each menu item by implementing filtering with dynamic queries, using a selection bound to a dynamic query widget. The drop-down lets users easily select a category and makes the visualization easier to read by making the chosen category pop out.

In [209]:
dropdown = alt.binding_select (options=data["Category"].unique(), name="Select a Category:")
selection = alt.selection(type="single", fields=["Category"], bind=dropdown)
a1=alt.Chart(df).mark_circle().encode(
    x = "Category",
    y = "Sugars",
    color=alt.Color('Category', scale=alt.Scale(scheme='set2')),
    tooltip=["Item", "Sugars"],
    opacity=alt.condition(selection,alt.value(1),alt.value(.2))
).add_selection(selection).properties(title='Total Sugars')

a2=alt.Chart(df).mark_circle().encode(
    x = "Category",
    y = "Protein",
    color=alt.Color('Category', scale=alt.Scale(scheme='set2')),
    tooltip=["Item", "Protein"],
    opacity=alt.condition(selection,alt.value(1),alt.value(.2))
).add_selection(selection).properties(title='Total Protein')

a1 | a2

For the % Daily Value of Total Fat, Cholesterol, Sodium, and Carbohydrates, I implemented the same key visualization elements as above. However, instead of showing the total amount of each nutrient, I show the % Daily Value, and users can hover over a point to see the name of the item and the amount of the nutrient.

In [210]:
dropdown = alt.binding_select (options=data["Category"].unique(), name="Select a Category:")
selection = alt.selection(type="single", fields=["Category"], bind=dropdown)

b1=alt.Chart(df).mark_circle().encode(
    x = "Category",
    y = "Total Fat (% Daily Value)",
    color=alt.Color('Category', scale=alt.Scale(scheme='set2')),
    tooltip=["Item", "Total Fat"],
    opacity=alt.condition(selection,alt.value(1),alt.value(.2))
).add_selection(selection).properties(title='%DV of Total Fat')

b2=alt.Chart(df).mark_circle().encode(
    x = "Category",
    y = "Cholesterol (% Daily Value)",
    color=alt.Color('Category', scale=alt.Scale(scheme='set2')),
    tooltip=["Item", "Cholesterol"],
    opacity=alt.condition(selection,alt.value(1),alt.value(.2))
).add_selection(selection).properties(title='%DV of Cholesterol')

b3=alt.Chart(df).mark_circle().encode(
    x = "Category",
    y = "Sodium (% Daily Value)",
    color=alt.Color('Category', scale=alt.Scale(scheme='set2')),
    tooltip=["Item", "Sodium"],
    opacity=alt.condition(selection,alt.value(1),alt.value(.2))
).add_selection(selection).properties(title='%DV of Sodium')

b4=alt.Chart(df).mark_circle().encode(
    x = "Category",
    y = "Carbohydrates (% Daily Value)",
    color=alt.Color('Category', scale=alt.Scale(scheme='set2')),
    tooltip=["Item", "Carbohydrates"],
    opacity=alt.condition(selection,alt.value(1),alt.value(.2))
).add_selection(selection).properties(title='%DV of Carbohydrates')

b1 | b2 | b3 | b4
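
Because % Daily Value figures are approximately additive across items (each figure is rounded), they also give a quick way to see how a whole order fits into the daily recommendation, which is one of the questions from the introduction. The sketch below sums the Total Fat and Saturated Fat % Daily Value for a hypothetical two-item order. The values are copied from the `df.head()` preview; the helper name `order_daily_value` and the choice of items are assumptions for illustration.

```python
import pandas as pd

# Three rows from the df.head() preview (NOT the full dataset)
preview = pd.DataFrame({
    "Item": ["Egg McMuffin", "Egg White Delight", "Sausage McMuffin"],
    "Total Fat (% Daily Value)": [20, 12, 35],
    "Saturated Fat (% Daily Value)": [25, 15, 42],
})

def order_daily_value(menu, items):
    """Hypothetical helper: sum every '% Daily Value' column over the chosen items.

    Each item is counted once, even if it is listed twice in `items`.
    """
    dv_cols = [c for c in menu.columns if c.endswith("(% Daily Value)")]
    chosen = menu[menu["Item"].isin(items)]
    return chosen[dv_cols].sum()

totals = order_daily_value(preview, ["Egg McMuffin", "Sausage McMuffin"])
print(totals)
```

On this sample, the two sandwiches together account for roughly half of the daily total fat allowance and about two thirds of the saturated fat allowance.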

# Conclusion

The evaluation approach I used was insight-based evaluation, specifically think-aloud studies. The target population for evaluating these visualizations is McDonald's customers. However, with limited access to a large pool of McDonald's customers, I recruited family members and friends who frequent McDonald's. The main objective was to see whether the visualizations help them with the flow of decision-making, whether it is easier to order using the visualizations, and how fast they make decisions compared with the traditional menu. I created two scenarios. In the first, I gave my users a traditional menu and asked them to order a meal as if they were health-conscious people who read the nutrition facts for everything they eat. In the second, I showed my users the visualizations and asked them the same question. I let my users explore and interact with the data for 15 minutes, and I asked them to write down questions they would like to pursue as well as any observations.

The feedback I received from most users concerned the colors when using the drop-down: when a category is selected, some light colors blend in and are hard to distinguish from other colors. Moreover, since some points overlap and are not spread out, the tooltip is less effective when hovering over a point to see its information. Otherwise, the users found the visualizations much easier to use than the traditional menu; being able to compare the nutrition facts of every item on the menu helped them make decisions faster and better. They also found the % Daily Value visualizations helpful for not going over their daily limits.

My main objective was to see whether the visualizations help my users with the flow of decision-making, and my users stated that the visualizations help them make decisions faster and better than the traditional menu, so I consider my visualizations successful. In future iterations, I would choose a more distinguishable set of colors for my points and make the graphs bigger so the points are more spread out and clearer.