# Overview of sugar content of menu items at various restaurants

This project fundamentally relies on data from the [Nutritionix API](http://www.nutritionix.com/api). I am very grateful for the use of their data.

In [1]:
import requests
import json
import os
import minimum_sugar

# Load credential data from file
with open("credentials.json", "r") as f:
    credentials = json.load(f)
    
# Load menu data from file
with open("menu_data.json", "r") as f:
    menu_data = json.load(f)

## Restaurant IDs
Nutritionix identifies each restaurant by a unique ID, but as far as I can tell it does not publish a list of those IDs. The code in this section creates a mapping between the names of restaurants I frequent and their Nutritionix restaurant IDs.

In [2]:
restaurant_names = ["McDonalds",
                    "Wendy's",
                    "Taco Bell",
                    "Qdoba",
                    "Chipotle",
                    "Five Guys",]
                    #"Costco",]

restaurant_ids = [minimum_sugar.fetch_restaurant_id(name, credentials) for name in restaurant_names]

In [3]:
# Save the data to a file for reading at a later date.
# with open("restaurant_ids.json", "w") as f:
#     f.write(json.dumps(restaurant_ids, indent=4, separators=(',', ': ')))

## Restaurant menus
Download restaurant menu nutrition data.

In [4]:
# Fetch menu data and add to each dict in `restaurant_ids`
# for restaurant in restaurant_ids:
#     restaurant["menu"] = minimum_sugar.fetch_menu_item_data(restaurant["id"], credentials)

In [10]:
# Could do some list comprehension here, but I think code readability would suffer.
menu_data = []
for restaurant in restaurant_ids:
    menu_data.extend(minimum_sugar.fetch_menu_item_data(restaurant["id"], credentials))

In [20]:
# Write the data to a file for future analysis
# with open("menu_data.json", "w") as f:
#     f.write(json.dumps(menu_data, indent=4, separators=(',', ': ')))

## Categorizing menu items
It turns out that the menu items of these various restaurants are not categorized according to any universal scheme (e.g. beverage, condiment, etc.). Moreover, some restaurants (like McDonald's) list beverages on their menu whereas others (e.g. Chipotle) do not. Beverages really skew the histogram of sugar content, and the data I really wanted was simply the sugar content of the entree items.

I ended up categorizing the menu items partially by hand. Some of the intermediate details can be found in `menu_item_categorization.ipynb`. The goal was to sort the menu items into the following categories:

* beverage
* dessert
* condiment
* side order
* entree

The overall workflow was to first create a list of `item_name`s from the list of menu item data. The `item_names` list was written to a file (`item_names.dat`) with one `item_name` per line. I started with the category "beverage" and deleted everything from the `item_names.dat` file that looked like a drink. I then turned the remainder and the original `item_names` list into Python `set` objects and took their difference to recover the list of beverage item names. The following code is an example of the process (not exactly what happened, but it captures the essence):

```python
# Remove beverage item names from the `item_names.dat` file by hand.

# Bring the result back into memory.
with open("item_names.dat", "r") as f:
    remainder = [line.strip() for line in f]

item_names_set = set(item_names)
remainder_set = set(remainder)

beverage_item_names_set = item_names_set - remainder_set
beverage_item_names = list(beverage_item_names_set)
beverage_item_names.sort()

with open("beverage_item_names.json", "w") as f:
    f.write(json.dumps(beverage_item_names, indent=4, separators=(',', ': ')))
```

I repeated this process of elimination to generate lists of menu items according to each category.
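
The repeated elimination can be sketched as a loop over manual passes. This is only an illustration under stated assumptions: the helper `split_off_category`, the made-up item names, and the order of the passes are all hypothetical, and treating whatever survives every pass as an entree is an assumption rather than a record of what was actually done.

```python
def split_off_category(current_names, remainder):
    """Return the names deleted by hand (sorted) and the names still left.

    Hypothetical helper: `remainder` stands in for the contents of the
    hand-edited `item_names.dat` file after one manual pass.
    """
    removed = sorted(set(current_names) - set(remainder))
    return removed, list(remainder)


# Made-up item names, for illustration only.
item_names = ["Burger", "Cola", "Fries", "Ketchup", "Milkshake"]

# Each entry pairs a category with the names left after deleting that
# category's items by hand (assumed order of passes).
manual_passes = [
    ("beverage", ["Burger", "Fries", "Ketchup", "Milkshake"]),
    ("dessert", ["Burger", "Fries", "Ketchup"]),
    ("condiment", ["Burger", "Fries"]),
    ("side order", ["Burger"]),
]

categories = {}
current = item_names
for category, remainder in manual_passes:
    categories[category], current = split_off_category(current, remainder)

# Assumption: anything that survives every pass is an entree.
categories["entree"] = sorted(current)
```

Because each pass starts from the previous remainder, every item name ends up in exactly one category.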

## Misc. observations during development
This section contains some notes on observations I made during development. I intend to fold these observations into the body of the report rather than leave them in this section.

First, the legacy/non-fast casual restaurants have a **ton** of menu items. McDonald's has >350 whereas Chipotle only has like 25. I can't imagine the amount of complexity that number of menu items adds to the management of the company. I also can't see how McDonald's gets rid of this complexity (i.e. sheds menu items) without alienating the customers these items were intended to serve. It seems like this amount of complexity is an accretion over many years and is likely a result of their success. It seems eminently plausible that over the years executives at McDonald's thought, "We are dominating this part of the market which is basically tapped out. In order to experience even more growth, we need to expand into other markets. How do we expand into other markets while leveraging the power of this brand to crush the competition?"

Second, executing this project has made me vaguely aware that some of my development practices may not be suited for data science projects. For example, I think the workflow I am using to perform this analysis may not be the most effective. The workflow is: download all data from the server, then write filtering code to eventually get the data I want. I feel like some incarnation of this workflow is what a seasoned data scientist might do, but I suspect most of the filtering would be done by the server or at the database as opposed to on the data scientist's local machine.

As of commit [e297afb9](https://github.com/jrsmith3/minimum_sugar/commit/e297afb990153e07a80e8442aedcd4babb6b458b), I switched to a flat data structure. I thought this approach was going to make things easier, but I didn't realize how much easier it made things. I now just consider the menu item data to be a big pile of data, and I let the computer extract what I need based on queries I submit. I suspect this situation is how things are when one has a well-constructed database. Based on this experience, I should learn how to use SQL.
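
To make the "big pile of data plus queries" idea concrete, here is a minimal sketch of querying a flat list of records. The helper names `filter_items` and `extract_field` are hypothetical stand-ins and are not the functions in `minimum_sugar`; the records are made up, and only the `brand_name` and `item_name` keys are taken from this notebook.

```python
def filter_items(items, key, value):
    """Keep only the records whose `key` field equals `value`."""
    return [item for item in items if item.get(key) == value]


def extract_field(items, key):
    """Pull one field out of every record that has it."""
    return [item[key] for item in items if key in item]


# Made-up flat records: one dict per menu item, with no nesting by restaurant.
toy_menu_data = [
    {"brand_name": "Chipotle", "item_name": "Chips"},
    {"brand_name": "Chipotle", "item_name": "Guacamole"},
    {"brand_name": "Five Guys", "item_name": "Fries"},
]

chipotle_names = extract_field(
    filter_items(toy_menu_data, "brand_name", "Chipotle"), "item_name"
)
```

With a flat list, every question about the data becomes a filter followed by a field extraction, with no need to walk a per-restaurant hierarchy first.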

In [3]:
chipotle_menu = minimum_sugar.filter_menu_items(menu_data, "brand_name", "Chipotle")
minimum_sugar.extract_variable(chipotle_menu, "item_name")

[u'Chips',
 u'Black Beans',
 u'Chipotle Vinaigrette',
 u'Flour Tortilla (burrito)',
 u'Steak',
 u'Crispy Taco Shell',
 u'Red Tomatillo Salsa',
 u'Fajita Veggies',
 u'Soft Corn Tortilla',
 u'Chicken',
 u'Pinto Beans',
 u'Flour Tortilla (taco)',
 u'Cheese',
 u'Lettuce',
 u'Cilantro-Lime Rice',
 u'Brown Rice',
 u'Tomato Salsa',
 u'Corn Salsa',
 u'Sour Cream',
 u'Green Tomatillo Salsa',
 u'Sofritas',
 u'Carnitas',
 u'Barbacoa',
 u'Romaine Lettuce (salad)',
 u'Guacamole']
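
Since the notes above mention learning SQL, here is a rough sketch of how the same kind of query might look against a database, using Python's built-in `sqlite3` module. The table name `menu_items`, its two columns, and the rows are assumptions made for illustration; nothing in this project currently stores its data this way.

```python
import sqlite3

# Made-up rows: (brand_name, item_name).
rows = [
    ("Chipotle", "Chips"),
    ("Chipotle", "Guacamole"),
    ("Five Guys", "Fries"),
]

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE menu_items (brand_name TEXT, item_name TEXT)")
conn.executemany("INSERT INTO menu_items VALUES (?, ?)", rows)

# The filter and the field extraction collapse into a single query.
cursor = conn.execute(
    "SELECT item_name FROM menu_items WHERE brand_name = ? ORDER BY item_name",
    ("Chipotle",),
)
sql_names = [row[0] for row in cursor]
conn.close()
```

Here the database does the filtering, which is closer to the server-side or database-side workflow speculated about earlier in this section.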