# Making McDonald's Healthy

## A mathematical optimization approach with Python and PuLP



McDonald’s is known for being unhealthy, but is there a way to create a healthy combination of menu items that meets nutritional guidelines? To answer this question, we can use the power of linear optimization and Python.

First, we need data. A full menu in tabular format with information on calories, macronutrients, and food type is readily available on Kaggle. We also need nutritional guidelines to aim for. A quick Google search leads us to the NHS, which provides recommendations for daily intake of energy, total fat, carbohydrates, sugars, protein, and sodium.

Using linear programming, we can search for the combination of McDonald’s menu items that meets these nutritional guidelines at the lowest calorie count. The technique is implemented in the PuLP Python package, which lets us define decision variables (how many of each menu item to order) and constraints (the nutritional limits), and then solve for the best combination.

While McDonald’s is typically viewed as an unhealthy fast food option, by using the power of data analysis and optimization, we can potentially create a healthier menu combination.

In [None]:
# Import the libraries needed for this analysis

import pandas as pd
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot
from pulp import LpMinimize, LpProblem, LpStatus, LpVariable, lpSum, value

# Enable Plotly rendering inside the notebook
init_notebook_mode(connected=True)

In [7]:
#loading the data using pandas and looking at the first 5 rows of the data to view the dataset

Mcdata = pd.read_csv('menu.csv')
Mcdata.head()

Unnamed: 0,Category,Item,Serving Size,Calories,Calories from Fat,Total Fat,Total Fat (% Daily Value),Saturated Fat,Saturated Fat (% Daily Value),Trans Fat,...,Carbohydrates,Carbohydrates (% Daily Value),Dietary Fiber,Dietary Fiber (% Daily Value),Sugars,Protein,Vitamin A (% Daily Value),Vitamin C (% Daily Value),Calcium (% Daily Value),Iron (% Daily Value)
0,Breakfast,Egg McMuffin,4.8 oz (136 g),300,120,13.0,20,5.0,25,0.0,...,31,10,4,17,3,17,10,0,25,15
1,Breakfast,Egg White Delight,4.8 oz (135 g),250,70,8.0,12,3.0,15,0.0,...,30,10,4,17,3,18,6,0,25,8
2,Breakfast,Sausage McMuffin,3.9 oz (111 g),370,200,23.0,35,8.0,42,0.0,...,29,10,4,17,2,14,8,0,25,10
3,Breakfast,Sausage McMuffin with Egg,5.7 oz (161 g),450,250,28.0,43,10.0,52,0.0,...,30,10,4,17,2,21,15,0,30,15
4,Breakfast,Sausage McMuffin with Egg Whites,5.7 oz (161 g),400,210,23.0,35,8.0,42,0.0,...,30,10,4,17,2,21,6,0,25,10


In [20]:
# Build a Scatter trace (go.Scatter) for one menu category.
# The rows are filtered once so that the hover text (item names)
# lines up with the plotted points of that category.

def make_scatter(Mcdata,category,x_cat,y_cat):
    subset = Mcdata[Mcdata['Category'] == category]
    return go.Scatter(
        x=subset[x_cat],
        y=subset[y_cat],
        mode='markers',
        name=category,
        text=subset['Item'])

In [24]:
# Define our categories to plot
x_cat = 'Calories'; y_cat = 'Carbohydrates'

# Create a list of scatter plots to view all at once
data = [make_scatter(Mcdata,cat,x_cat,y_cat) for cat in Mcdata.Category.unique().tolist()]

# Define the plot layout (title, ticks etc.)
layout = dict(title = 'McDonalds Nutrition',
   xaxis= dict(title= 'Calories',ticklen=5,zeroline= False,range=[0,1250]),
   yaxis= dict(title= 'Carbohydrates(g)',ticklen= 5,zeroline=False))

# Finally we will plot the data with the layout
fig = dict(data = data, layout = layout)
iplot(fig)

### Next, let's compare Sodium vs. Total Fat


In [25]:
# Define our categories to plot
x_cat = 'Total Fat'; y_cat = 'Sodium'

# Create a list of scatter plots to view all at once
data = [make_scatter(Mcdata,cat,x_cat,y_cat) for cat in Mcdata.Category.unique().tolist()]

# Define the plot layout (title, ticks etc.)
layout = dict(title = 'McDonalds Nutrition',
   xaxis= dict(title= 'Total Fat (g)',ticklen=5,zeroline= False,range=[0,65]),
   yaxis= dict(title= 'Sodium (mg)',ticklen= 5,zeroline=False))

# Finally we will plot the data with the layout
fig = dict(data = data, layout = layout)
iplot(fig)

We're getting into the fun stuff now - optimizing our meal plan to meet our nutritional goals while minimizing calories.

So, to start, let's define our objective function. Basically, this is what we want to achieve - in this case, we want to minimize calories. After all, who doesn't love tasty food that doesn't pack on the pounds?

But we can't just focus on calories alone. We also want to make sure we're getting all the nutrients we need to keep our bodies happy and healthy. So, we'll need to set some constraints to make sure we're not missing out on anything important.

To do that, we'll use the NHS guidelines [2] to work out how much of each nutrient we should be getting each day. Then, we'll convert our data into dictionaries keyed by item name so we can feed it into the optimization function.

It might seem like we're trying to achieve the impossible - getting all the nutrients we need without consuming any calories - but don't worry, we're not expecting miracles here. We just want to get as close as we can to our goals while still enjoying our food. After all, life is too short to eat boring meals!
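Written out, the problem described above has roughly the following shape. This is a sketch in my own notation, not something taken from the PuLP documentation: $x_i$ is the number of servings of menu item $i$, $c_i$ its calories, $a_{ki}$ the amount of nutrient $k$ in one serving of item $i$, and $L_k$, $U_k$ the lower and upper daily limits for nutrient $k$ (either may be absent).

$$
\begin{aligned}
\min_{x} \quad & \sum_{i} c_i \, x_i \\
\text{subject to} \quad & L_k \le \sum_{i} a_{ki} \, x_i \le U_k \quad \text{for each nutrient } k, \\
& x_i \in \{0, 1, 2, \dots\} \quad \text{for each item } i.
\end{aligned}
$$

The objective and every constraint are linear in the quantities $x_i$, which is what makes the problem suitable for a linear (integer) programming solver.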

In [26]:
# Convert the item names to a list
MenuItems = Mcdata.Item.tolist()


# Convert each nutrient column to a dictionary keyed by item name
Calories = Mcdata.set_index('Item')['Calories'].to_dict()
TotalFat = Mcdata.set_index('Item')['Total Fat'].to_dict()
SaturatedFat = Mcdata.set_index('Item')['Saturated Fat'].to_dict()
Carbohydrates = Mcdata.set_index('Item')['Carbohydrates'].to_dict()
Sugars = Mcdata.set_index('Item')['Sugars'].to_dict()
Protein = Mcdata.set_index('Item')['Protein'].to_dict()
Sodium = Mcdata.set_index('Item')['Sodium'].to_dict()
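To make the shape of these dictionaries concrete, here is a small pandas-free sketch using the first three rows from the `head()` output above. The names `rows` and `column_to_dict` are hypothetical helpers introduced only for illustration. The duplicate check is there because, as far as I know, `to_dict()` keeps only the last value when an index label repeats, so a repeated item name would silently overwrite an earlier one.

```python
# First three menu rows as shown in the head() output: (Item, Calories, Sugars)
rows = [
    ("Egg McMuffin", 300, 3),
    ("Egg White Delight", 250, 3),
    ("Sausage McMuffin", 370, 2),
]


def column_to_dict(rows, column_index):
    """Map item name -> value for one column, refusing duplicate item names."""
    out = {}
    for row in rows:
        item = row[0]
        if item in out:
            raise ValueError(f"duplicate item name: {item}")
        out[item] = row[column_index]
    return out


calories = column_to_dict(rows, 1)
sugars = column_to_dict(rows, 2)
print(calories)
print(sugars)
```

Each dictionary lets the optimizer look up a nutrient value by item name, which is how the constraint expressions are built later on.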

In [27]:
Calories

{'Egg McMuffin': 300,
 'Egg White Delight': 250,
 'Sausage McMuffin': 370,
 'Sausage McMuffin with Egg': 450,
 'Sausage McMuffin with Egg Whites': 400,
 'Steak & Egg McMuffin': 430,
 'Bacon, Egg & Cheese Biscuit (Regular Biscuit)': 460,
 'Bacon, Egg & Cheese Biscuit (Large Biscuit)': 520,
 'Bacon, Egg & Cheese Biscuit with Egg Whites (Regular Biscuit)': 410,
 'Bacon, Egg & Cheese Biscuit with Egg Whites (Large Biscuit)': 470,
 'Sausage Biscuit (Regular Biscuit)': 430,
 'Sausage Biscuit (Large Biscuit)': 480,
 'Sausage Biscuit with Egg (Regular Biscuit)': 510,
 'Sausage Biscuit with Egg (Large Biscuit)': 570,
 'Sausage Biscuit with Egg Whites (Regular Biscuit)': 460,
 'Sausage Biscuit with Egg Whites (Large Biscuit)': 520,
 'Southern Style Chicken Biscuit (Regular Biscuit)': 410,
 'Southern Style Chicken Biscuit (Large Biscuit)': 470,
 'Steak & Egg Biscuit (Regular Biscuit)': 540,
 'Bacon, Egg & Cheese McGriddles': 460,
 'Bacon, Egg & Cheese McGriddles with Egg Whites': 400,
 ...

In [29]:
Sugars

{'Egg McMuffin': 3,
 'Egg White Delight': 3,
 'Sausage McMuffin': 2,
 'Sausage McMuffin with Egg': 2,
 'Sausage McMuffin with Egg Whites': 2,
 'Steak & Egg McMuffin': 3,
 'Bacon, Egg & Cheese Biscuit (Regular Biscuit)': 3,
 'Bacon, Egg & Cheese Biscuit (Large Biscuit)': 4,
 'Bacon, Egg & Cheese Biscuit with Egg Whites (Regular Biscuit)': 3,
 'Bacon, Egg & Cheese Biscuit with Egg Whites (Large Biscuit)': 4,
 'Sausage Biscuit (Regular Biscuit)': 2,
 'Sausage Biscuit (Large Biscuit)': 3,
 'Sausage Biscuit with Egg (Regular Biscuit)': 2,
 'Sausage Biscuit with Egg (Large Biscuit)': 3,
 'Sausage Biscuit with Egg Whites (Regular Biscuit)': 3,
 'Sausage Biscuit with Egg Whites (Large Biscuit)': 3,
 'Southern Style Chicken Biscuit (Regular Biscuit)': 3,
 'Southern Style Chicken Biscuit (Large Biscuit)': 4,
 'Steak & Egg Biscuit (Regular Biscuit)': 3,
 'Bacon, Egg & Cheese McGriddles': 15,
 'Bacon, Egg & Cheese McGriddles with Egg Whites': 16,
 'Sausage McGriddles': 15,
 ...

#### Now that all of the data is in the right format, we can go ahead and set up the optimizer!

## Setting Up the Optimizer


In [30]:
# Set it up as a minimization problem
# (PuLP does not permit spaces in the problem name, so use an underscore)
prob = LpProblem('McOptimization_Problem', LpMinimize)



In addition, we can tell the optimizer that we are only interested in integer solutions, i.e. it is impossible to order 0.5 of an item (no half cheeseburgers). On top of this, we can set a minimum and maximum quantity for each item; here, every item may appear between 0 and 10 times:

In [32]:
MenuItems_vars = LpVariable.dicts('MenuItems',MenuItems,lowBound=0,
   upBound=10,cat='Integer')
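To see what integer quantities with a lower and upper bound mean in practice, here is a brute-force sketch on a toy menu. Everything in it is an assumption made for illustration: the three items, their numbers, the bound of 3, and the single protein constraint are made up and do not come from the Kaggle data. A real solver does not enumerate every combination like this, but the answer it looks for is the same kind of object.

```python
from itertools import product

# Hypothetical toy menu (made-up numbers): item -> (calories, protein in g)
toy_menu = {
    "burger": (300, 15),
    "salad": (50, 2),
    "shake": (500, 10),
}

MAX_QTY = 3        # upper bound per item, playing the role of upBound
MIN_PROTEIN = 30   # illustrative lower limit on protein

best = None
# Try every whole-number quantity from 0 to MAX_QTY for each item
for qty in product(range(MAX_QTY + 1), repeat=len(toy_menu)):
    calories = sum(q * toy_menu[item][0] for q, item in zip(qty, toy_menu))
    protein = sum(q * toy_menu[item][1] for q, item in zip(qty, toy_menu))
    # Keep the feasible combination with the fewest calories
    if protein >= MIN_PROTEIN and (best is None or calories < best[0]):
        best = (calories, dict(zip(toy_menu, qty)))

print(best)
```

With three items and quantities 0 to 3 there are only 64 combinations to check. With a full menu and quantities 0 to 10 the number of combinations is astronomically large, which is why a dedicated solver is needed.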

In [36]:
# The first expression added has no comparison, so PuLP treats it as the objective:
# total calories, which we want to minimize
prob += lpSum([Calories[i]*MenuItems_vars[i] for i in MenuItems]),'Calories'

# Total Fat must be <= 70 g
prob += lpSum([TotalFat[i]*MenuItems_vars[i] for i in MenuItems]) <= 70, 'TotalFat'

# Saturated Fat must be <= 20 g
prob += lpSum([SaturatedFat[i]*MenuItems_vars[i] for i in MenuItems]) <= 20, 'Saturated_Fat'

# Carbohydrates must be at least 260 g
prob += lpSum([Carbohydrates[i]*MenuItems_vars[i] for i in MenuItems]) >= 260, 'Carbohydrates_lower'

# Sugars between 80 and 100 g
prob += lpSum([Sugars[i]*MenuItems_vars[i] for i in MenuItems]) >= 80, 'Sugars_lower'
prob += lpSum([Sugars[i]*MenuItems_vars[i] for i in MenuItems]) <= 100, 'Sugars_upper'

# Protein between 45 and 55 g
prob += lpSum([Protein[i]*MenuItems_vars[i] for i in MenuItems]) >= 45, 'Protein_lower'
prob += lpSum([Protein[i]*MenuItems_vars[i] for i in MenuItems]) <= 55, 'Protein_upper'

# Sodium <= 6000 mg
# Note: the NHS figure of 6 g per day refers to salt, which is roughly 2400 mg of sodium.
# The solution found below stays under both limits.
prob += lpSum([Sodium[i]*MenuItems_vars[i] for i in MenuItems]) <= 6000, 'Sodium'

In [37]:
prob.solve()

Welcome to the CBC MILP Solver 
Version: 2.10.3 
Build Date: Dec 15 2019 

command line - /Users/zishanvisram/opt/anaconda3/lib/python3.9/site-packages/pulp/solverdir/cbc/osx/64/cbc /var/folders/z3/4grcssw54_sdsl_mwrvm44rw0000gn/T/67a4c219f8474b368e8e35b6ab4a1840-pulp.mps timeMode elapsed branch printingOptions all solution /var/folders/z3/4grcssw54_sdsl_mwrvm44rw0000gn/T/67a4c219f8474b368e8e35b6ab4a1840-pulp.sol (default strategy 1)
At line 2 NAME          MODEL
At line 3 ROWS
At line 13 COLUMNS
At line 2612 RHS
At line 2621 BOUNDS
At line 2878 ENDATA
Problem MODEL has 8 rows, 256 columns and 1842 elements
Coin0008I MODEL read with 0 errors
Option for timeMode changed from cpu to elapsed
Continuous objective value is 1359.9 - 0.00 seconds
Cgl0004I processed model has 6 rows, 237 columns (237 integer (97 of which binary)) and 1303 elements
Cutoff increment increased from 1e-05 to 4.9999
Cbc0038I Initial state - 3 integers unsatisfied sum - 0.996193
Cbc0038I Solution found of 1359.9
...

1

In [38]:
print("Status:", LpStatus[prob.status])

Status: Optimal


In [43]:
# Get the total calories (minimized)
print("Total Calories = ", value(prob.objective))

# Loop over the constraint set and get the final solution
results = {}
for constraint in prob.constraints:
    s = 0
    for var, coefficient in prob.constraints[constraint].items():
        s += var.varValue * coefficient
    results[prob.constraints[constraint].name.replace('_lower','').replace('_upper','')] = s


Total Calories =  1370.0


In [46]:
results

{'TotalFat': 20.0,
 'Saturated_Fat': 7.5,
 'Carbohydrates': 261.0,
 'Sugars': 100.0,
 'Protein': 55.0,
 'Sodium': 1575.0}
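As a sanity check, the totals above can be compared against the limits from the constraint cell. The sketch below hard-codes both sets of numbers as they appear in this notebook. The `limits` layout and the `check` helper are hypothetical names introduced here, not part of PuLP.

```python
# Totals reported by the solver (copied from the output above)
results = {
    'TotalFat': 20.0,
    'Saturated_Fat': 7.5,
    'Carbohydrates': 261.0,
    'Sugars': 100.0,
    'Protein': 55.0,
    'Sodium': 1575.0,
}

# (lower, upper) limits as used in the constraint cell; None means no limit
limits = {
    'TotalFat': (None, 70),
    'Saturated_Fat': (None, 20),
    'Carbohydrates': (260, None),
    'Sugars': (80, 100),
    'Protein': (45, 55),
    'Sodium': (None, 6000),
}


def check(results, limits):
    """Return {nutrient: (within_limits, sits_exactly_on_a_limit)}."""
    report = {}
    for name, total in results.items():
        lower, upper = limits[name]
        ok = (lower is None or total >= lower) and (upper is None or total <= upper)
        binding = total in (lower, upper)
        report[name] = (ok, binding)
    return report


report = check(results, limits)
for name, (ok, binding) in report.items():
    print(name, "OK" if ok else "VIOLATED", "(at limit)" if binding else "")
```

Every total falls within its limits, and Sugars and Protein sit exactly on their upper limits, which suggests those two constraints are the ones shaping the solution most directly.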

In [59]:
# Loop over the decision variables and print the items with non-zero quantity
for item in MenuItems_vars:
    if MenuItems_vars[item].value() is not None and MenuItems_vars[item].value() > 0:
        print(item, ":", MenuItems_vars[item].value())


Fruit & Maple Oatmeal without Brown Sugar : 5.0
Side Salad : 2.0
Apple Slices : 2.0
Diet Dr Pepper (Large) : 7.0
