This notebook contains the code to pull data from the [Nutritionix API](http://www.nutritionix.com/api) as well as sort it into categories.

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import requests
import json
import os
import minimum_sugar

# Load credential data from file
with open("credentials.json", "r") as f:
    credentials = json.load(f)
    
# Load menu data from file
with open("menu_data.json", "r") as f:
    menu_data = json.load(f)

## Restaurant IDs
Nutritionix identifies each restaurant by a unique ID, but as far as I can tell it does not publish a list of those IDs. The code in this section creates a mapping between the names of restaurants I frequent and their Nutritionix restaurant IDs.

In [2]:
restaurant_names = ["Wendy's",
                    "McDonald's",
                    "Five Guys",
                    "Qdoba",
                    "Taco Bell",
                    "Chipotle",]

In [None]:
restaurant_ids = [minimum_sugar.fetch_restaurant_id(name, credentials) for name in restaurant_names]

In [3]:
# Save the data to a file for reading at a later date.
# with open("restaurant_ids.json", "w") as f:
#     f.write(json.dumps(restaurant_ids, indent=4, separators=(',', ': ')))

## Restaurant menus
Download restaurant menu nutrition data.

In [4]:
# Fetch menu data and add to each dict in `restaurant_ids`
# for restaurant in restaurant_ids:
#     restaurant["menu"] = minimum_sugar.fetch_menu_item_data(restaurant["id"], credentials)

In [6]:
# Could use a list comprehension here, but I think code readability would suffer.
menu_data = []
for restaurant in restaurant_ids:
    menu_data.extend(minimum_sugar.fetch_menu_item_data(restaurant["id"], credentials))

In [7]:
# Write the data to a file for future analysis
# with open("menu_data.json", "w") as f:
#     f.write(json.dumps(menu_data, indent=4, separators=(',', ': ')))

## Categorizing menu items
It turns out that the menu items of these various restaurants are not categorized according to any universal scheme (e.g. beverage, condiment, etc.). Moreover, some restaurants (like McDonald's) list beverages on their menu whereas others (e.g. Chipotle) do not. Beverages heavily skew the histogram of sugar content, and the data I really wanted was simply the sugar content of the entree items.

I ended up categorizing the menu items partially by hand. Some of the intermediate details can be found in `menu_item_categorization.ipynb`. The goal was to categorize menu items according to the following categories:

* beverage
* dessert
* condiment
* side order (stored as `side` in the code)
* entree

The overall workflow was to first create a list of `item_name`s from the list of menu item data. The `item_names` list was written to a file (`item_names.dat`) with one `item_name` per line. I started with the category "beverage" and deleted every line from `item_names.dat` that looked like a drink. I then built Python `set` objects from the remainder and from the original `item_names` list, and took their difference to recover the list of beverage item names. The following code is an example of the process (not exactly what happened, but it captures the essence):

```python
# Remove beverage item names from the `item_names.dat` file by hand.

# Bring the result back into memory.
with open("item_names.dat", "r") as f:
    remainder = [line.strip() for line in f]

item_names_set = set(item_names)
remainder_set = set(remainder)

beverage_item_names_set = item_names_set - remainder_set
beverage_item_names = list(beverage_item_names_set)
beverage_item_names.sort()

with open("beverage_item_names.json", "w") as f:
    f.write(json.dumps(beverage_item_names, indent=4, separators=(',', ': ')))
```

I repeated this process of elimination to generate lists of menu items according to each category.
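
The example above starts from an already edited `item_names.dat`. As a rough, self-contained sketch of one full round of the process, the code below also writes the file first and then recovers the removed names with a set difference. The helper names (`write_item_names`, `read_item_names`, `recover_removed`), the sample item names, and the temporary directory are all assumptions made for illustration, and the hand edit is simulated in code.

```python
import os
import tempfile


def write_item_names(item_names, path):
    """Write unique item names, one per line, so the file is easy to edit by hand."""
    with open(path, "w") as f:
        for name in sorted(set(item_names)):
            f.write(name + "\n")


def read_item_names(path):
    """Read item names back, skipping blank lines left over from editing."""
    with open(path, "r") as f:
        return [line.strip() for line in f if line.strip()]


def recover_removed(before, after):
    """Return the sorted names that are in `before` but no longer in `after`."""
    return sorted(set(before) - set(after))


# Hypothetical item names standing in for the real menu data.
item_names = ["Cola", "Cheeseburger", "Fries", "Lemonade"]

path = os.path.join(tempfile.mkdtemp(), "item_names.dat")
write_item_names(item_names, path)

# Simulate the manual step: delete everything that looks like a drink.
remainder = [name for name in read_item_names(path)
             if name not in ("Cola", "Lemonade")]
write_item_names(remainder, path)

beverage_item_names = recover_removed(item_names, read_item_names(path))
print(beverage_item_names)
```

For the next category, the same steps would be repeated with the shrunken file as the new starting point, so each round only has to consider the names that are still uncategorized.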

## Update the "database" with `menu_category` field and check results
Once this process was completed, I had several files of item names; each file contained the items of a particular category. For maximum utility, these categories should be included in the makeshift database in `menu_data.json`. Once the `menu_category` field is populated for each item in the `menu_data.json` list, I can write some simple code to check my categorization, broken down by each restaurant.

The code to check the categorization and update `menu_data.json` is as follows:

```python
categories = ["beverage",
    "dessert",
    "condiment",
    "side",
    "entree",]

categorized_item_names = {}

for category in categories:
    filename = category + "_item_names.json"
    with open(filename, "r") as f:
        item_names = json.load(f)

    categorized_items = {item_name: category for item_name in item_names}
    categorized_item_names.update(categorized_items)
    
# Check that items weren't dropped during categorization.
# The best way to do this check is to create two sets:
# One set of all of the menu item names in `menu_data`.
# The other set from the keys of `categorized_item_names`.
# I can then use set operations to compare these two sets.

assert set(minimum_sugar.extract_variable(menu_data, "item_name")) == set(categorized_item_names.keys())

# Categorize each menu item in `menu_data` by adding 
# the "menu_category" field and data from above.
for item in menu_data:
    # I could fold the following line into the assignment on the `item` dict,
    # but I'm separating the operations for the sake of code readability.
    item_name = item["item_name"]
    item["menu_category"] = categorized_item_names[item_name]
    
# Reformat data into a dict so I'm not viewing superfluous data and so that I can categorize
# according to restaurant and menu category.
categorized_item_names = {}

for restaurant_name in restaurant_names:
    restaurant_menu_data = minimum_sugar.filter_menu_items(menu_data, "brand_name", restaurant_name)

    subdict = {}
    for category in categories:
        restaurant_categorized_menu_items = minimum_sugar.filter_menu_items(restaurant_menu_data, "menu_category", category)
        restr_categ_menu_item_names = minimum_sugar.extract_variable(restaurant_categorized_menu_items, "item_name")
        restr_categ_menu_item_names.sort()

        subdict[category] = restr_categ_menu_item_names

    categorized_item_names[restaurant_name] = subdict
    
# Print the result to check categorization
print(json.dumps(categorized_item_names, indent=4, separators=(',', ': ')))

# Everything is categorized properly, save the result.
with open("menu_data.json", "w") as f:
    f.write(json.dumps(menu_data, indent=4, separators=(',', ': ')))
```
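
The code above relies on two helpers from the `minimum_sugar` module, whose source is not shown in this notebook. Judging only from how they are called, they might look roughly like the sketch below. These bodies are guesses rather than the module's actual implementation, and the sample records are made up for illustration.

```python
def filter_menu_items(menu_items, key, value):
    """Return the items whose `key` field equals `value` (assumed behavior)."""
    return [item for item in menu_items if item.get(key) == value]


def extract_variable(menu_items, key):
    """Return the `key` field of every item, in order (assumed behavior)."""
    return [item[key] for item in menu_items]


# Made-up records using the field names that appear in this notebook.
sample_menu_data = [
    {"item_name": "Cola", "brand_name": "McDonald's", "menu_category": "beverage"},
    {"item_name": "Cheeseburger", "brand_name": "McDonald's", "menu_category": "entree"},
    {"item_name": "Burrito", "brand_name": "Chipotle", "menu_category": "entree"},
]

mcdonalds_items = filter_menu_items(sample_menu_data, "brand_name", "McDonald's")
mcdonalds_entrees = filter_menu_items(mcdonalds_items, "menu_category", "entree")
print(extract_variable(mcdonalds_entrees, "item_name"))
```

Chaining two filters like this mirrors the nested loop above: first narrow by restaurant, then by category, then keep only the item names.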