# Minimizing refined sugar consumption from my favorite restaurants

## Background
I recently read [Michael Grothaus's article](https://www.fastcompany.com/3050319/lessons-learned/how-giving-up-refined-sugar-changed-my-brain) on his positive experiences with giving up refined sugar. Reading it convinced me that abstaining from sugar was a good idea, and my doctor was enthusiastic about it as well.

In the pursuit of minimizing my refined/added sugar consumption, I have two additional considerations. First, I don't cook and I have no interest in learning. Second, I'm not concerned about minimizing calories because I'm about 6'2", around 200 lbs, and my favorite way to exercise is lifting weights (bro). I think I need around 3000 calories per day to maintain my regimen, but I've never rigorously calculated it. I usually just eat until I feel full.

As anyone who has tried to quit refined sugar knows, it is hard to find food that does not contain it -- especially at restaurants. Scouring the internet, I've found endless articles and blog posts about how to cook food without sugar. To belabor the point, I'm happy for these people but I don't care to learn how to cook. Changing a single lifestyle habit such as consuming no refined/added sugar is difficult enough on its own. Changing two (eliminating sugar and cooking instead of going out) is a recipe for failure.

Instead of manually looking over menus and nutrition data for the places I patronize, I decided to build some tools to help with the following analysis.

I should make the disclaimer that I am absolutely **not** a dietitian or a physician, so you should take the following analysis with a grain of salt (heh). The responsible thing to do is to talk to your doctor if you have concerns about how your diet might be affecting your health.

[The American Heart Association recommends](http://www.heart.org/HEARTORG/GettingHealthy/NutritionCenter/HealthyEating/Frequently-Asked-Questions-About-Sugar_UCM_306725_Article.jsp) that women consume no more than 100 calories per day of added sugars and men consume no more than 150 calories per day of added sugars. These values work out to about 20g for women and 30g for men, assuming (let's call it) a specific energy of [5cal/g](http://www.nutritionix.com/i/usda/sugars-granulated-1-serving-1-cube/463d6237e531d7b3a164a862) for sugar.
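
The conversion is a single division, so a tiny sketch makes the arithmetic explicit. The function name is my own, and the 5 cal/g default is just the rounded figure assumed above.

```python
def calories_to_grams(calories, cal_per_gram=5.0):
    """Convert a calorie budget for added sugar into grams of sugar."""
    return calories / cal_per_gram


# Daily added-sugar limits in calories, as quoted above.
for label, limit_cal in [("women", 100), ("men", 150)]:
    print("{}: {:.0f} g".format(label, calories_to_grams(limit_cal)))
```

For comparison, the conventional 4 cal/g figure for carbohydrates would give 25g and 37.5g, so the rounded numbers above err on the strict side.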

It's also very important to note that the American Heart Association makes a distinction between naturally occurring sugars and added sugar:

> **Are all sugars bad?**
>
> No, but added sugars add calories and zero nutrients to food. Adding a limited amount of sugars to foods that provide important nutrients—such as whole-grain cereal, flavored milk or yogurt—to improve their taste, especially for children, is a better use of added sugars than nutrient-poor, highly sweetened foods.

My conclusion from this statement: you could do worse than to eat 1000 calories of fresh blueberries.

Unfortunately, if you are someone who wants to limit their added/refined sugar consumption, there is no US Federal regulation requiring manufacturers to include information about the amount of added vs. naturally occurring sugar in a food product. From what I've read, one can get an idea of the source of sugar in a food product by reading the list of ingredients; if the word "sugar" appears in the list, chances are that sugar is refined. For example, [a giant 32oz container of Dannon plain yogurt](http://www.amazon.com/Dannon-Natural-Quart-Plain-Yogurt/dp/B00RASDV2E/) contains 12g of sugar per serving, which seems like a lot in light of the American Heart Association recommendations above. However, reading the ingredients list shows this yogurt is made only from milk and yogurt cultures. Thus, this food product contains no added sugar (nevertheless, you probably shouldn't eat the whole 32oz in one sitting).
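
The ingredient-list heuristic is easy to express in code. This is only a rough sketch: the function name is mine, the second ingredient list is hypothetical, and a plain word match will miss sweeteners that go by other names (corn syrup, for example).

```python
import re


def mentions_sugar(ingredients):
    """Return True if any ingredient contains the word 'sugar' or 'sugars'."""
    pattern = re.compile(r"\bsugars?\b", re.IGNORECASE)
    return any(pattern.search(ingredient) for ingredient in ingredients)


# Ingredients of the plain yogurt, as described above.
plain_yogurt = ["milk", "yogurt cultures"]
# Hypothetical ingredient list, for illustration only.
sweetened_oatmeal = ["oats", "Cane Sugar", "salt"]

print(mentions_sugar(plain_yogurt))
print(mentions_sugar(sweetened_oatmeal))
```

The first call prints `False` and the second prints `True`, which matches the reading of the yogurt label above.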


## Analysis
The six restaurants I frequently patronize fall into two main categories: burger places and "Mexican". There is a mix of the more legacy fast-food places (McDonald's, Wendy's, and to perhaps a lesser extent Taco Bell) and upstart fast-casual places (Five Guys, Chipotle, and Qdoba).

I first wanted to get a sense of the sugar in each restaurant's menu. I downloaded the nutrition information of each restaurant's entire menu using the [Nutritionix API](http://www.nutritionix.com/api) and attempted to plot a histogram, but I soon found that each restaurant's menu didn't yield an apples-to-apples comparison. Some restaurants included beverages, condiments, etc. while others did not. Therefore, I categorized the menu items of each restaurant by adding category data to each database element. The categories I used were beverage, dessert, condiment, side, and entree. Note that I did not categorize menu items according to meal, i.e. breakfast, lunch, and dinner. Once the items were categorized, I was able to plot histograms of the entree menu items for each restaurant; those histograms are given below.
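
To picture the kind of filtering the cells below rely on, here is a stand-in sketch that works on a flat list of dicts. It is not the actual `minimum_sugar` code: the helper names `filter_items` and `extract` are my own, the records are made up, and I am assuming each record carries the `brand_name`, `menu_category`, and `nf_sugars` fields used elsewhere in this notebook.

```python
def filter_items(items, key, value):
    """Keep only the records whose `key` field equals `value`."""
    return [item for item in items if item.get(key) == value]


def extract(items, key):
    """Pull a single field out of every record."""
    return [item[key] for item in items]


# Made-up records for illustration only; not real menu data.
sample_menu = [
    {"brand_name": "Brand A", "item_name": "Item 1", "menu_category": "entree", "nf_sugars": 4},
    {"brand_name": "Brand A", "item_name": "Item 2", "menu_category": "beverage", "nf_sugars": 40},
    {"brand_name": "Brand B", "item_name": "Item 3", "menu_category": "entree", "nf_sugars": 9},
]

entrees = filter_items(sample_menu, "menu_category", "entree")
x_max = max(extract(entrees, "nf_sugars"))
print(x_max)
```

Sharing one `x_max` across all the histograms is what makes the per-restaurant plots comparable at a glance.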

In [1]:
# Initialize environment
%matplotlib inline
import matplotlib.pyplot as plt
import json
import minimum_sugar

# Load menu data from file
with open("menu_data.json", "r") as f:
    menu_data = json.load(f)
    
# I want the names in the following order
restaurant_names = ["Wendy's",
                    "McDonald's",
                    "Five Guys",
                    "Qdoba",
                    "Taco Bell",
                    "Chipotle",]

In [17]:
entree_items = minimum_sugar.filter_menu_items(menu_data, "menu_category", "entree")
x_max = max(minimum_sugar.extract_variable(entree_items, "nf_sugars"))

# for restaurant_name in restaurant_names:
#     restaurant_menu_items = minimum_sugar.filter_menu_items(menu_data, "brand_name", restaurant_name)
#     restaurant_entree_items = minimum_sugar.filter_menu_items(restaurant_menu_items, "menu_category", "entree")
#     f = minimum_sugar.menu_histogram(restaurant_entree_items, "nf_sugars", title=restaurant_name, param_name="Sugar [g]")
#     ax = f.axes[0]
#     ax.set_xlim([0, x_max])
#     plt.show()

### Maximum sugar
Perhaps not surprisingly, the individual menu items featuring the maximum amount of sugar are offered by Wendy's and McDonald's. Wendy's is the winner in this category with a healthy-sounding menu item: "[Steel-Cut Oatmeal with Cranberries and Pecans](http://www.nutritionix.com/i//steel-cut-oatmeal-with-cranberries-and-pecans/ae4920268b167116cadff337)" containing 33g of sugar. I wanted to give Wendy's the benefit of the doubt and believe that the sugar comes from the fruit. Unfortunately, Wendy's website features so much unnecessary HTML bling that I wasn't able to find the ingredients for this menu item within a single click. This poor website design choice soured my positive feelings about Wendy's and now I assume they are trying to obfuscate their nutrition information because they have something to hide.

In [2]:
minimum_sugar.print_max_sugar_menu_item(menu_data, "Wendy's")

Max sugar: 33
Iten name: Steel-Cut Oatmeal with Cranberries and Pecans


The most sugar-rich menu item for McDonald's is the [Big Breakfast With Hotcakes And Egg Whites (Large Biscuit)](http://www.nutritionix.com/i/mcdonalds/big-breakfast-with-hotcakes-and-egg-whites-large-biscuit-/521b95c74a56d006d578b11b) with 18g. Note that Wendy's has six entree menu items with sugar greater than McDonald's Big Breakfast, listed below.

In [3]:
minimum_sugar.print_max_sugar_menu_item(menu_data, "McDonald's")

Max sugar: 18
Iten name: Big Breakfast With Hotcakes And Egg Whites (Large Biscuit)


In [12]:
wendys_menu_items = minimum_sugar.filter_menu_items(menu_data, "brand_name", "Wendy's")
wendys_entree_items = minimum_sugar.filter_menu_items(wendys_menu_items, "menu_category", "entree")
wendys_high_sugar_menu_items = [menu_item for menu_item in wendys_entree_items if menu_item["nf_sugars"] > 18]
for menu_item in wendys_high_sugar_menu_items:
    print("{}: {}".format(menu_item["item_name"], menu_item["nf_sugars"]))

Steel-Cut Oatmeal with Summer Berries: 20
Steel-Cut Oatmeal with Cranberries and Pecans: 33
Double Chocolate Chip Cookie: 28
Steel-Cut Oatmeal with Apples and Caramel: 26
Pulled Pork Sandwich w/ Spicy BBQ: 25
Pulled Pork Sandwich w/ Sweet BBQ: 23


Of the burger places, Five Guys has the lowest ceiling on sugar. Unfortunately for the vegetarians out there, the two menu items with 14g sugar are the [Veggie Sandwich](http://www.nutritionix.com/i//veggie-sandwich/521b95cb4a56d006d578b9bc) and [Cheese Veggie Sandwich](http://www.nutritionix.com/i/five-guys/cheese-veggie-sandwich/521b95cb4a56d006d578b9a7). I've never ordered either of these sandwiches (nor did I realize they existed), but my guess is the sugar comes from the [bun](http://www.nutritionix.com/i//bun/521b95cb4a56d006d578b9a3) and ketchup -- removing those components should lower the total sugar. On the other hand, removing the bun from these sandwiches leaves you with a pile of vegetables, at which point you should probably just head over to Chopt. In fact, are there any vegetarians who go to Five Guys for food?

In [4]:
minimum_sugar.print_max_sugar_menu_item(menu_data, "Five Guys")

Max sugar: 14
Iten name: Veggie Sandwich
Iten name: Cheese Veggie Sandwich


Looking at the histograms, it seems that Chipotle's menu has the lowest amount of sugar per menu item. This conclusion is a little deceiving: the menu items listed for Chipotle are actually components used to assemble menu items such as delicious burritos. The [Sofritas](http://www.nutritionix.com/i//sofritas/52cdcbe1051cb9eb320014de) has the most sugar at 5g, but there's no mention of sugar in the [list of ingredients](http://chipotle.com/ingredient-statement) for this item. My guess is the sugar comes in with the soybeans.

In [5]:
minimum_sugar.print_max_sugar_menu_item(menu_data, "Chipotle")

Max sugar: 5
Iten name: Sofritas


Taco Bell has 9 entree menu items tied for the most sugar (7g). In list form:

* [Biscuit Taco - Bacon, Egg & Cheese](http://www.nutritionix.com/i/taco-bell/biscuit-taco-bacon-egg-cheese/463d0791647dd7e10220bcec)
* [Biscuit Taco - Sausage, Egg & Cheese](http://www.nutritionix.com/i/taco-bell/biscuit-taco-sausage-egg-cheese/463d0791de7395dfeb525dcf)
* [Biscuit Taco](http://www.nutritionix.com/i//biscuit-taco/463d0791fd4113f3304e4ad8)
* [Fiesta Taco Salad](http://www.nutritionix.com/i//fiesta-taco-salad/463d07918aba11ca9cce9a9a)
* [Biscuit Taco - Sausage & Cheese](http://www.nutritionix.com/i//biscuit-taco-sausage-cheese/463d07919d093ab2244fdab0)
* [Fiesta Taco Salad - Chicken](http://www.nutritionix.com/i//fiesta-taco-salad-chicken/463d07911a4c50142dc1e860)
* [Fiesta Taco Salad - Beef](http://www.nutritionix.com/i//fiesta-taco-salad-beef/463d079126dd0a381c93e2d8)
* [Fiesta Taco Salad - Steak](http://www.nutritionix.com/i//fiesta-taco-salad-steak/463d079195e49a2a64314a35)
* [Biscuit Taco - Egg & Cheese](http://www.nutritionix.com/i//biscuit-taco-egg-cheese/463d0791dd2a297b4b709929)

Generally speaking, the burger places (McDonald's, Wendy's, and Five Guys) have a wider distribution in terms of sugar than the "Mexican" places.

In [6]:
minimum_sugar.print_max_sugar_menu_item(menu_data, "Taco Bell")

Max sugar: 7
Iten name: Biscuit Taco - Bacon, Egg & Cheese
Iten name: Biscuit Taco - Sausage, Egg & Cheese
Iten name: Biscuit Taco
Iten name: Fiesta Taco Salad
Iten name: Biscuit Taco - Sausage & Cheese
Iten name: Fiesta Taco Salad - Chicken
Iten name: Fiesta Taco Salad - Beef
Iten name: Fiesta Taco Salad - Steak
Iten name: Biscuit Taco - Egg & Cheese


### Minimum sugar vs. calories
It's possible to calculate things like average and standard deviation for the histograms above, but I don't see the utility in that information. Nobody goes to a restaurant and selects random items off the menu. My rule of thumb is to not order anything with more than 3g of sugar, and zero sugar is preferred. The count below uses a slightly stricter cutoff (less than 3g); by that measure, the restaurants I considered have the following number of entree items at that part of the distribution:

In [16]:
for brand_name in restaurant_names:
    restaurant_menu_items = minimum_sugar.filter_menu_items(menu_data, "brand_name", brand_name)
    restaurant_entree_items = minimum_sugar.filter_menu_items(restaurant_menu_items, "menu_category", "entree")
    low_sugar_menu_items = [menu_item for menu_item in restaurant_entree_items if menu_item["nf_sugars"] < 3]
    print("{}: {}".format(brand_name, len(low_sugar_menu_items)))

Wendy's: 24
McDonald's: 14
Five Guys: 12
Qdoba: 67
Taco Bell: 33
Chipotle: 19


As I mentioned above, I am interested in consuming more calories than someone smaller or with a different exercise regimen. Thus, my visualization of interest is the histogram of calories for the menu items containing less than three grams of sugar.
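
Before plotting anything, the selection itself can be sketched as a filter followed by a sort. This is a rough illustration, not the notebook's plotting code: the function name is mine, the records are made up, and I am assuming each record has an `nf_calories` field alongside `nf_sugars`.

```python
def low_sugar_by_calories(items, sugar_limit=3):
    """Return entrees under the sugar limit, highest-calorie first."""
    picks = [
        item for item in items
        if item["menu_category"] == "entree" and item["nf_sugars"] < sugar_limit
    ]
    return sorted(picks, key=lambda item: item["nf_calories"], reverse=True)


# Made-up records for illustration only; not real menu data.
sample_menu = [
    {"item_name": "Item 1", "menu_category": "entree", "nf_sugars": 0, "nf_calories": 700},
    {"item_name": "Item 2", "menu_category": "entree", "nf_sugars": 2, "nf_calories": 920},
    {"item_name": "Item 3", "menu_category": "entree", "nf_sugars": 9, "nf_calories": 540},
    {"item_name": "Item 4", "menu_category": "beverage", "nf_sugars": 0, "nf_calories": 0},
]

ranked = low_sugar_by_calories(sample_menu)
for item in ranked:
    print("{}: {} cal, {} g sugar".format(
        item["item_name"], item["nf_calories"], item["nf_sugars"]))
```

Sorting by calories puts the most filling low-sugar options first, which is the ordering I care about given my calorie target.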

In [18]:
# Code to plot figure

(ADDITIONAL ANALYSIS)

## The worst offenders? Beverages.
Perhaps not surprisingly, beverages are a great way to consume a lot of sugar over a short period. To make this point, consider the following histogram plots in which each restaurant's entree items appear along with beverages. Please note that Nutritionix had no beverages listed in its database for some of the restaurants in this analysis. I still plot their histograms with the others for the sake of comparison.
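
A numeric summary can stand in for the comparison until the plots exist. The sketch below is only an illustration: the function name is mine, the records are made up, and it needs Python 3.4 or later for the `statistics` module. It returns `None` for a category with no items, which is how I would handle the restaurants with no beverages listed.

```python
from statistics import median


def sugar_summary(items, category):
    """Summarize sugar content for one menu category, or None if it is empty."""
    values = [item["nf_sugars"] for item in items if item["menu_category"] == category]
    if not values:
        return None
    return {"count": len(values), "median": median(values), "max": max(values)}


# Made-up records for illustration only; not real menu data.
sample_menu = [
    {"item_name": "Item 1", "menu_category": "entree", "nf_sugars": 4},
    {"item_name": "Item 2", "menu_category": "entree", "nf_sugars": 9},
    {"item_name": "Item 3", "menu_category": "entree", "nf_sugars": 6},
    {"item_name": "Item 4", "menu_category": "beverage", "nf_sugars": 0},
    {"item_name": "Item 5", "menu_category": "beverage", "nf_sugars": 39},
    {"item_name": "Item 6", "menu_category": "beverage", "nf_sugars": 52},
]

for category in ["entree", "beverage", "dessert"]:
    print(category, sugar_summary(sample_menu, category))
```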

In [19]:
# Code to plot figure

(ADDITIONAL ANALYSIS)

## Remarks
Rules of thumb: 

* If Ron Swanson wouldn't drink it (water, black coffee, scotch), it probably contains sugar. The sugar histograms show that beverages contain lots of sugar.
* Avoid ketchup. And most other sauces. [Mustard](http://www.nutritionix.com/i/frenchs/mustard-classic-yellow/51d2fc51cc9bff111580e297) is the superior condiment anyway so use it instead. I didn't show the analysis, but typically sauces and dressings contain too much sugar. Eastern NC style barbecue sauce, The One True Barbecue Sauce, is the [exception](http://uncpress.unc.edu/HolySmoke/samplerecipes.html).
* Order your burger without ketchup and the bun and you should have enough calories with minimal sugar.

During the course of this analysis, I was struck by the fact that the legacy/non-fast casual restaurants have a **ton** of menu items. McDonald's has >350 where Chipotle has only about 25. Granted, Nutritionix had no data on beverage offerings from Chipotle whereas they did have beverages from McDonald's. Nonetheless, McDonald's still has 92 items I categorized as entrees vs. Chipotle's 25.

I can't imagine the amount of complexity that number of menu items adds to the management of the company. I also can't see how McDonald's gets rid of this complexity (i.e. sheds menu items) without alienating the customers these items were intended to serve. It seems like this amount of complexity is an accretion over many years and is likely a result of their success. It seems eminently plausible that over the years executives at McDonald's thought, "We are dominating this part of the market which is basically tapped out. In order to experience even more growth, we need to expand into other markets. How do we expand into other markets while leveraging the power of this brand to crush the competition?"

In [20]:
restaurant_menu_items = minimum_sugar.filter_menu_items(menu_data, "brand_name", "McDonald's")
num_mcds_entree_items = len(minimum_sugar.filter_menu_items(restaurant_menu_items, "menu_category", "entree"))
print("Number of McDonald's entree menu items: {}".format(num_mcds_entree_items))

Number of McDonald's entree menu items: 92
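
The same count can be taken for every restaurant and category in a single pass. This is a sketch under my own assumptions rather than part of `minimum_sugar`: the `count_by` helper is hypothetical and the records are made up.

```python
from collections import Counter


def count_by(items, *keys):
    """Count records grouped by the given fields."""
    return Counter(tuple(item[key] for key in keys) for item in items)


# Made-up records for illustration only; not real menu data.
sample_menu = [
    {"brand_name": "Brand A", "menu_category": "entree"},
    {"brand_name": "Brand A", "menu_category": "entree"},
    {"brand_name": "Brand A", "menu_category": "beverage"},
    {"brand_name": "Brand B", "menu_category": "entree"},
]

counts = count_by(sample_menu, "brand_name", "menu_category")
for (brand, category), n in sorted(counts.items()):
    print("{} / {}: {}".format(brand, category, n))
```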


## Todo
Either write these in the report or transfer them to the issue tracker:

* Plot beverage histograms on entree histograms. In this way I will be able to make the point that many of these beverages contain a large amount of sugar.
* List menu items containing less than 4g sugar for each restaurant.
* Add links to menu item names in the final report.

## Notes
It would be great if the FDA required restaurants and manufacturers of food to include the amount of added and/or refined sugars on their labels.

It would also be great if ingredient information were readily available for these restaurants.

## Misc. observations during development
This section contains some notes on observations I made during development. I intend to include these observations in the report, but rewritten into the body itself and not this section.

Executing this project has made me vaguely aware that some of my development practices may not be suited for data science projects. For example, I think the workflow I am using to perform this analysis may not be the most effective. The workflow is: download all data from the server, then write filtering code to eventually get the data I want. I feel like some incarnation of this workflow is what a seasoned data scientist might do, but I suspect most of the filtering would be done by the server or at the database as opposed to on the data scientist's local machine.

As of commit [e297afb9](https://github.com/jrsmith3/minimum_sugar/commit/e297afb990153e07a80e8442aedcd4babb6b458b), I switched to a flat data structure. I thought this approach was going to make things easier, but I didn't realize how much easier it made things. I now just consider the menu item data to be a big pile of data, and I let the computer extract what I need based on queries I submit. I suspect this situation is how things are when one has a well-constructed database. Based on this experience, I should learn how to use SQL.
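
As a taste of what the SQL version might look like, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The table name, column layout, and rows are all made up for illustration; the query mirrors the "entrees with less than 3g of sugar per restaurant" count from earlier.

```python
import sqlite3

# Made-up rows for illustration only; not real menu data.
rows = [
    ("Brand A", "Item 1", "entree", 0, 700),
    ("Brand A", "Item 2", "entree", 2, 920),
    ("Brand A", "Item 3", "beverage", 40, 150),
    ("Brand B", "Item 4", "entree", 9, 540),
    ("Brand B", "Item 5", "entree", 1, 300),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE menu_items ("
    "brand_name TEXT, item_name TEXT, menu_category TEXT, "
    "nf_sugars REAL, nf_calories REAL)"
)
conn.executemany("INSERT INTO menu_items VALUES (?, ?, ?, ?, ?)", rows)

# Let the database do the filtering and counting.
query = """
    SELECT brand_name, COUNT(*)
    FROM menu_items
    WHERE menu_category = 'entree' AND nf_sugars < 3
    GROUP BY brand_name
    ORDER BY brand_name
"""
low_sugar_counts = conn.execute(query).fetchall()
conn.close()

print(low_sugar_counts)
```

With a flat table like this, each question in the analysis becomes a different `WHERE` clause instead of another round of list comprehensions.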

## Scratch

In [5]:
restaurant_name = "Wendy's"

restaurant_menu_items = minimum_sugar.filter_menu_items(menu_data, "brand_name", restaurant_name)
restaurant_entree_items = minimum_sugar.filter_menu_items(restaurant_menu_items, "menu_category", "entree")
max_sugar = max(minimum_sugar.extract_variable(restaurant_entree_items, "nf_sugars"))

print("Max sugar: {}".format(max_sugar))
print("")

menu_items = minimum_sugar.filter_menu_items(restaurant_entree_items, "nf_sugars", max_sugar)

print("Number of items: {}".format(len(menu_items)))
for menu_item in menu_items:
    print("* [" + menu_item["item_name"] + "]()")

Max sugar: 33

Number of items: 1
* [Steel-Cut Oatmeal with Cranberries and Pecans]()


In [40]:
for restaurant_name in restaurant_names:
    print("* {}".format(restaurant_name))

* Wendy's
* McDonald's
* Five Guys
* Qdoba
* Taco Bell
* Chipotle
