# The Diet Project

This project recommends a diet to fitness enthusiasts. Its output is not guaranteed to be accurate: the nutrient values and recommended ranges may be wrong, which can result in an unsuitable diet. Do not rely on the diet proposed by this software; consult an expert instead.


## Quick Start

Install the requirements.

```shell
pip install -r requirements.txt
```

Run the project.

```shell
python3 manage.py runserver 8000
```

Open the UI in a browser at [http://localhost:8000](http://localhost:8000).


## Implementation Documentation

### User-Friendly UI Using Django

In this project, we used Django, a Python web framework, to implement the UI for the bonus section.

The project is named `diet_lp`, and it contains an app named `diet` ([diet](diet)) that holds most of the project code.

The UI-related files are `forms.py`, `views.py`, `widgets.py`, and the `templates` directory.

A more detailed description is available in [gui_description.mp4](gui_description.mp4).

### Data Preprocess

We downloaded the SR Legacy dataset from the USDA, released in April 2018. It contains many fields for each food, most of which we do not need, so we created a Python script that extracts the required data into a single, simpler CSV file.

To run the preprocessing script on the dataset, use the following command.

```shell
python3 manage.py preprocess diet/data/usda diet/data/usda.csv
```

Given the directory `usda`, it generates the file `usda.csv`.

We follow these steps in the script.

In [249]:
import pathlib

import pandas as pd

input_dir = pathlib.Path("diet/data/usda")

foods = pd.read_csv(input_dir / "food.csv")

print(foods)

      fdc_id       data_type  \
0     167512  sr_legacy_food   
1     167513  sr_legacy_food   
2     167514  sr_legacy_food   
3     167515  sr_legacy_food   
4     167516  sr_legacy_food   
...      ...             ...   
7788  175300  sr_legacy_food   
7789  175301  sr_legacy_food   
7790  175302  sr_legacy_food   
7791  175303  sr_legacy_food   
7792  175304  sr_legacy_food   

                                            description  food_category_id  \
0     Pillsbury Golden Layer Buttermilk Biscuits, Ar...                18   
1     Pillsbury, Cinnamon Rolls with Icing, refriger...                18   
2     Kraft Foods, Shake N Bake Original Recipe, Coa...                18   
3        George Weston Bakeries, Thomas English Muffins                18   
4            Waffles, buttermilk, frozen, ready-to-heat                18   
...                                                 ...               ...   
7788         Game meat, buffalo, water, cooked, roasted                17   

#### Drop Unwanted Columns

In [210]:
# We will not need these columns

foods = foods.drop(['data_type', 'publication_date'], axis=1)

print(foods)

      fdc_id                                        description  \
0     167512  Pillsbury Golden Layer Buttermilk Biscuits, Ar...   
1     167513  Pillsbury, Cinnamon Rolls with Icing, refriger...   
2     167514  Kraft Foods, Shake N Bake Original Recipe, Coa...   
3     167515     George Weston Bakeries, Thomas English Muffins   
4     167516         Waffles, buttermilk, frozen, ready-to-heat   
...      ...                                                ...   
7788  175300         Game meat, buffalo, water, cooked, roasted   
7789  175301                                Game meat, elk, raw   
7790  175302                    Game meat, elk, cooked, roasted   
7791  175303                               Game meat, goat, raw   
7792  175304                   Game meat, goat, cooked, roasted   

      food_category_id  
0                   18  
1                   18  
2                   18  
3                   18  
4                   18  
...                ...  
7788                

#### Filter Unwanted Data

In [211]:
# We define 5 meals.

MEAL_BREAKFAST = "breakfast"
MEAL_MORNING_SNACK = "morning_snack"
MEAL_LUNCH = "lunch"
MEAL_AFTERNOON_SNACK = "afternoon_snack"
MEAL_DINNER = "dinner"

In [212]:
# We define all food categories based on the file food_category.csv

FOOD_CATEGORY_DAIRY_AND_EGG_PRODUCTS = "dairy_and_egg_products"
FOOD_CATEGORY_SPICES_AND_HERBS = "spices_and_herbs"
FOOD_CATEGORY_BABY_FOODS = "baby_foods"
FOOD_CATEGORY_FATS_AND_OILS = "fats_and_oils"
FOOD_CATEGORY_POULTRY_PRODUCTS = "poultry_products"
FOOD_CATEGORY_SOUPS_SAUCES_AND_GRAVIES = "soups_sauces_and_gravies"
FOOD_CATEGORY_SAUSAGES_AND_LUNCHEON_MEATS = "sausages_and_luncheon_meats"
FOOD_CATEGORY_BREAKFAST_CEREALS = "breakfast_cereals"
FOOD_CATEGORY_FRUITS_AND_FRUIT_JUICES = "fruits_and_fruit_juices"
FOOD_CATEGORY_PORK_PRODUCTS = "pork_products"
FOOD_CATEGORY_VEGETABLES_AND_VEGETABLE_PRODUCTS = "vegetables_and_vegetable_products"
FOOD_CATEGORY_NUT_AND_SEED_PRODUCTS = "nut_and_seed_products"
FOOD_CATEGORY_BEEF_PRODUCTS = "beef_products"
FOOD_CATEGORY_BEVERAGES = "beverages"
FOOD_CATEGORY_FINFISH_AND_SHELLFISH_PRODUCTS = "finfish_and_shellfish_products"
FOOD_CATEGORY_LEGUMES_AND_LEGUME_PRODUCTS = "legumes_and_legume_products"
FOOD_CATEGORY_LAMB_VEAL_AND_GAME_PRODUCTS = "lamb_veal_and_game_products"
FOOD_CATEGORY_BAKED_PRODUCTS = "baked_products"
FOOD_CATEGORY_SWEETS = "sweets"
FOOD_CATEGORY_CEREAL_GRAINS_AND_PASTA = "cereal_grains_and_pasta"
FOOD_CATEGORY_FAST_FOODS = "fast_foods"
FOOD_CATEGORY_MEALS_ENTREES_AND_SIDE_DISHES = "meals_entrees_and_side_dishes"
FOOD_CATEGORY_SNACKS = "snacks"
FOOD_CATEGORY_AMERICAN_INDIAN_ALASKA_NATIVE_FOODS = "american_indian_alaska_native_foods"
FOOD_CATEGORY_RESTAURANT_FOODS = "restaurant_foods"

# From food_category.csv
FOOD_CATEGORY_ID_MAPPING = {
    FOOD_CATEGORY_DAIRY_AND_EGG_PRODUCTS: 1,
    FOOD_CATEGORY_SPICES_AND_HERBS: 2,
    FOOD_CATEGORY_BABY_FOODS: 3,
    FOOD_CATEGORY_FATS_AND_OILS: 4,
    FOOD_CATEGORY_POULTRY_PRODUCTS: 5,
    FOOD_CATEGORY_SOUPS_SAUCES_AND_GRAVIES: 6,
    FOOD_CATEGORY_SAUSAGES_AND_LUNCHEON_MEATS: 7,
    FOOD_CATEGORY_BREAKFAST_CEREALS: 8,
    FOOD_CATEGORY_FRUITS_AND_FRUIT_JUICES: 9,
    FOOD_CATEGORY_PORK_PRODUCTS: 10,
    FOOD_CATEGORY_VEGETABLES_AND_VEGETABLE_PRODUCTS: 11,
    FOOD_CATEGORY_NUT_AND_SEED_PRODUCTS: 12,
    FOOD_CATEGORY_BEEF_PRODUCTS: 13,
    FOOD_CATEGORY_BEVERAGES: 14,
    FOOD_CATEGORY_FINFISH_AND_SHELLFISH_PRODUCTS: 15,
    FOOD_CATEGORY_LEGUMES_AND_LEGUME_PRODUCTS: 16,
    FOOD_CATEGORY_LAMB_VEAL_AND_GAME_PRODUCTS: 17,
    FOOD_CATEGORY_BAKED_PRODUCTS: 18,
    FOOD_CATEGORY_SWEETS: 19,
    FOOD_CATEGORY_CEREAL_GRAINS_AND_PASTA: 20,
    FOOD_CATEGORY_FAST_FOODS: 21,
    FOOD_CATEGORY_MEALS_ENTREES_AND_SIDE_DISHES: 22,
    FOOD_CATEGORY_SNACKS: 23,
    FOOD_CATEGORY_AMERICAN_INDIAN_ALASKA_NATIVE_FOODS: 24,
    FOOD_CATEGORY_RESTAURANT_FOODS: 25,
}

In [213]:
# Now we select some food categories for each meal.

import typing
from functools import reduce

MEAL_FOOD_CATEGORY_MAPPING: typing.Dict[str, set] = {
    MEAL_BREAKFAST: {
        FOOD_CATEGORY_DAIRY_AND_EGG_PRODUCTS,
        FOOD_CATEGORY_SAUSAGES_AND_LUNCHEON_MEATS,
        FOOD_CATEGORY_BREAKFAST_CEREALS,
        FOOD_CATEGORY_FRUITS_AND_FRUIT_JUICES,
        FOOD_CATEGORY_VEGETABLES_AND_VEGETABLE_PRODUCTS,
        FOOD_CATEGORY_NUT_AND_SEED_PRODUCTS,
        FOOD_CATEGORY_BEVERAGES,
        FOOD_CATEGORY_LEGUMES_AND_LEGUME_PRODUCTS,
        FOOD_CATEGORY_BAKED_PRODUCTS,
    },
    MEAL_MORNING_SNACK: {
        FOOD_CATEGORY_SAUSAGES_AND_LUNCHEON_MEATS,
        FOOD_CATEGORY_FRUITS_AND_FRUIT_JUICES,
        FOOD_CATEGORY_VEGETABLES_AND_VEGETABLE_PRODUCTS,
        FOOD_CATEGORY_NUT_AND_SEED_PRODUCTS,
        FOOD_CATEGORY_BEVERAGES,
        FOOD_CATEGORY_LEGUMES_AND_LEGUME_PRODUCTS,
        FOOD_CATEGORY_BAKED_PRODUCTS,
        FOOD_CATEGORY_SNACKS,
    },
    MEAL_LUNCH: {
        FOOD_CATEGORY_POULTRY_PRODUCTS,
        FOOD_CATEGORY_SOUPS_SAUCES_AND_GRAVIES,
        FOOD_CATEGORY_SAUSAGES_AND_LUNCHEON_MEATS,
        FOOD_CATEGORY_FRUITS_AND_FRUIT_JUICES,
        FOOD_CATEGORY_VEGETABLES_AND_VEGETABLE_PRODUCTS,
        FOOD_CATEGORY_NUT_AND_SEED_PRODUCTS,
        FOOD_CATEGORY_BEEF_PRODUCTS,
        FOOD_CATEGORY_BEVERAGES,
        FOOD_CATEGORY_FINFISH_AND_SHELLFISH_PRODUCTS,
        FOOD_CATEGORY_LEGUMES_AND_LEGUME_PRODUCTS,
        FOOD_CATEGORY_LAMB_VEAL_AND_GAME_PRODUCTS,
        FOOD_CATEGORY_BAKED_PRODUCTS,
        FOOD_CATEGORY_CEREAL_GRAINS_AND_PASTA,
        FOOD_CATEGORY_FAST_FOODS,
        FOOD_CATEGORY_RESTAURANT_FOODS,
    },
    MEAL_AFTERNOON_SNACK: {
        FOOD_CATEGORY_SOUPS_SAUCES_AND_GRAVIES,
        FOOD_CATEGORY_SAUSAGES_AND_LUNCHEON_MEATS,
        FOOD_CATEGORY_FRUITS_AND_FRUIT_JUICES,
        FOOD_CATEGORY_VEGETABLES_AND_VEGETABLE_PRODUCTS,
        FOOD_CATEGORY_NUT_AND_SEED_PRODUCTS,
        FOOD_CATEGORY_BEVERAGES,
        FOOD_CATEGORY_LEGUMES_AND_LEGUME_PRODUCTS,
        FOOD_CATEGORY_BAKED_PRODUCTS,
        FOOD_CATEGORY_SNACKS,
    },
    MEAL_DINNER: {
        FOOD_CATEGORY_POULTRY_PRODUCTS,
        FOOD_CATEGORY_SOUPS_SAUCES_AND_GRAVIES,
        FOOD_CATEGORY_SAUSAGES_AND_LUNCHEON_MEATS,
        FOOD_CATEGORY_FRUITS_AND_FRUIT_JUICES,
        FOOD_CATEGORY_VEGETABLES_AND_VEGETABLE_PRODUCTS,
        FOOD_CATEGORY_NUT_AND_SEED_PRODUCTS,
        FOOD_CATEGORY_BEEF_PRODUCTS,
        FOOD_CATEGORY_BEVERAGES,
        FOOD_CATEGORY_FINFISH_AND_SHELLFISH_PRODUCTS,
        FOOD_CATEGORY_LEGUMES_AND_LEGUME_PRODUCTS,
        FOOD_CATEGORY_LAMB_VEAL_AND_GAME_PRODUCTS,
        FOOD_CATEGORY_BAKED_PRODUCTS,
        FOOD_CATEGORY_CEREAL_GRAINS_AND_PASTA,
        FOOD_CATEGORY_FAST_FOODS,
        FOOD_CATEGORY_RESTAURANT_FOODS,
    },
}


def category_id_mapper(cat):
    return FOOD_CATEGORY_ID_MAPPING[cat]


wanted_category_ids = list(map(
    category_id_mapper,
    reduce(
        lambda x, y: x.union(y),
        [
            MEAL_FOOD_CATEGORY_MAPPING[MEAL_BREAKFAST],
            MEAL_FOOD_CATEGORY_MAPPING[MEAL_MORNING_SNACK],
            MEAL_FOOD_CATEGORY_MAPPING[MEAL_LUNCH],
            MEAL_FOOD_CATEGORY_MAPPING[MEAL_AFTERNOON_SNACK],
            MEAL_FOOD_CATEGORY_MAPPING[MEAL_DINNER],
        ],
        set(),
    )
))
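
The union of the per-meal category sets can also be written without `reduce`. The following is a minimal sketch on a made-up two-meal mapping (the shortened category names and the variable names are illustrative assumptions, not the real tables): `set().union(*...)` merges all per-meal sets, and keeping the ids in a set gives constant-time membership checks in the later filter step.

```python
# Illustrative data only; the real mappings are defined in the cells above.
category_ids = {"dairy": 1, "poultry": 5, "beverages": 14, "snacks": 23}
meal_categories = {
    "breakfast": {"dairy", "beverages"},
    "lunch": {"poultry", "beverages"},
}

# Merge every meal's category set into one set of names.
wanted_names = set().union(*meal_categories.values())

# Translate names to ids; a set makes `in` checks O(1).
wanted_ids = {category_ids[name] for name in wanted_names}

print(sorted(wanted_ids))
```

A category that no meal uses (here `snacks`) never enters `wanted_ids`, which is what step 1 of the filter relies on.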

In [214]:
# We are going to need this helper function.

import re


def _contains_any_of_words(text: str, words: typing.List[str]):
    for word in words:
        if re.search(r"\b" + re.escape(word) + r"\b", text, flags=re.IGNORECASE):
            return True

    return False
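
To make the word-boundary behaviour of this helper concrete, here is a small self-contained sketch. The function is re-declared under a public name so the snippet runs on its own, and the food descriptions are made up for illustration.

```python
import re


def contains_any_of_words(text, words):
    """Return True if any entry of `words` occurs in `text` as a whole word."""
    return any(
        re.search(r"\b" + re.escape(word) + r"\b", text, flags=re.IGNORECASE)
        for word in words
    )


# "ham" is not matched inside "graham" because of the \b word boundaries.
print(contains_any_of_words("Crackers, graham, plain", ["ham"]))
# A standalone word matches regardless of case.
print(contains_any_of_words("Ham, sliced, regular", ["ham"]))
# Multi-word phrases match when the whole phrase appears.
print(contains_any_of_words("Guinea hen, meat only", ["guinea hen"]))
# Plurals are not matched: "Sauces" does not contain the whole word "sauce".
print(contains_any_of_words("Sauces, assorted", ["sauce"]))
```

The last case is worth keeping in mind when choosing the word lists for the filters: a rule written for `sauce` will not catch a description that only uses `sauces`.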

In [215]:
# 1. We drop any food that does not fall into any of these categories.

COLUMN_FOOD_CATEGORY_ID = "food_category_id"

for index, row in foods.iterrows():
    if row[COLUMN_FOOD_CATEGORY_ID] not in wanted_category_ids:
        foods.drop(index, inplace=True)

print(foods)

      fdc_id                                        description  \
0     167512  Pillsbury Golden Layer Buttermilk Biscuits, Ar...   
1     167513  Pillsbury, Cinnamon Rolls with Icing, refriger...   
2     167514  Kraft Foods, Shake N Bake Original Recipe, Coa...   
3     167515     George Weston Bakeries, Thomas English Muffins   
4     167516         Waffles, buttermilk, frozen, ready-to-heat   
...      ...                                                ...   
7788  175300         Game meat, buffalo, water, cooked, roasted   
7789  175301                                Game meat, elk, raw   
7790  175302                    Game meat, elk, cooked, roasted   
7791  175303                               Game meat, goat, raw   
7792  175304                   Game meat, goat, cooked, roasted   

      food_category_id  
0                   18  
1                   18  
2                   18  
3                   18  
4                   18  
...                ...  
7788                

In [216]:
# 2. Remove raw foods.

COLUMN_DESCRIPTION = "description"

for index, row in foods.iterrows():
    if _contains_any_of_words(row[COLUMN_DESCRIPTION], ["raw", "unprepared"]):
        foods.drop(index, inplace=True)

print(foods)

      fdc_id                                        description  \
0     167512  Pillsbury Golden Layer Buttermilk Biscuits, Ar...   
1     167513  Pillsbury, Cinnamon Rolls with Icing, refriger...   
2     167514  Kraft Foods, Shake N Bake Original Recipe, Coa...   
3     167515     George Weston Bakeries, Thomas English Muffins   
4     167516         Waffles, buttermilk, frozen, ready-to-heat   
...      ...                                                ...   
7783  175295                 Game meat, beaver, cooked, roasted   
7786  175298             Game meat, boar, wild, cooked, roasted   
7788  175300         Game meat, buffalo, water, cooked, roasted   
7790  175302                    Game meat, elk, cooked, roasted   
7792  175304                   Game meat, goat, cooked, roasted   

      food_category_id  
0                   18  
1                   18  
2                   18  
3                   18  
4                   18  
...                ...  
7783                

In [217]:
# 3. In the category 6 (Soups, Sauces, and Gravies), remove items containing the word `sauce`,
# since sauces are not independent items.

for index, row in foods.iterrows():
    if row[COLUMN_FOOD_CATEGORY_ID] == category_id_mapper(FOOD_CATEGORY_SOUPS_SAUCES_AND_GRAVIES):
        if _contains_any_of_words(row[COLUMN_DESCRIPTION], ["sauce"]):
            foods.drop(index, inplace=True)

print(foods)

      fdc_id                                        description  \
0     167512  Pillsbury Golden Layer Buttermilk Biscuits, Ar...   
1     167513  Pillsbury, Cinnamon Rolls with Icing, refriger...   
2     167514  Kraft Foods, Shake N Bake Original Recipe, Coa...   
3     167515     George Weston Bakeries, Thomas English Muffins   
4     167516         Waffles, buttermilk, frozen, ready-to-heat   
...      ...                                                ...   
7783  175295                 Game meat, beaver, cooked, roasted   
7786  175298             Game meat, boar, wild, cooked, roasted   
7788  175300         Game meat, buffalo, water, cooked, roasted   
7790  175302                    Game meat, elk, cooked, roasted   
7792  175304                   Game meat, goat, cooked, roasted   

      food_category_id  
0                   18  
1                   18  
2                   18  
3                   18  
4                   18  
...                ...  
7783                

In [218]:
# 4. Remove items containing these words since those items are not easily accessible in our geographic area.

for index, row in foods.iterrows():
    if _contains_any_of_words(
            row[COLUMN_DESCRIPTION],
            [
                "quail", "pheasant", "dove", "squab", "goose", "duck", "guinea hen", "grouse", "emu", "ostrich",
                "ham", "pork", "bacon", "alcohol", "alcoholic", "beer", "wine", "whiskey", "gin", "rum",
                "vodka", "liquor",
            ],
    ):
        foods.drop(index, inplace=True)

print(foods)

      fdc_id                                        description  \
0     167512  Pillsbury Golden Layer Buttermilk Biscuits, Ar...   
1     167513  Pillsbury, Cinnamon Rolls with Icing, refriger...   
3     167515     George Weston Bakeries, Thomas English Muffins   
4     167516         Waffles, buttermilk, frozen, ready-to-heat   
5     167517  Waffle, buttermilk, frozen, ready-to-heat, toa...   
...      ...                                                ...   
7783  175295                 Game meat, beaver, cooked, roasted   
7786  175298             Game meat, boar, wild, cooked, roasted   
7788  175300         Game meat, buffalo, water, cooked, roasted   
7790  175302                    Game meat, elk, cooked, roasted   
7792  175304                   Game meat, goat, cooked, roasted   

      food_category_id  
0                   18  
1                   18  
3                   18  
4                   18  
5                   18  
...                ...  
7783                

In [219]:
# 5. In the category 15 (Finfish and Shellfish Products), remove items not containing any of these words
# since those items are not easily accessible in our geographic area.

for index, row in foods.iterrows():
    if row[COLUMN_FOOD_CATEGORY_ID] == category_id_mapper(FOOD_CATEGORY_FINFISH_AND_SHELLFISH_PRODUCTS):
        if not _contains_any_of_words(row[COLUMN_DESCRIPTION], ["salmon", "tuna", "shrimp"]):
            foods.drop(index, inplace=True)

print(foods)

      fdc_id                                        description  \
0     167512  Pillsbury Golden Layer Buttermilk Biscuits, Ar...   
1     167513  Pillsbury, Cinnamon Rolls with Icing, refriger...   
3     167515     George Weston Bakeries, Thomas English Muffins   
4     167516         Waffles, buttermilk, frozen, ready-to-heat   
5     167517  Waffle, buttermilk, frozen, ready-to-heat, toa...   
...      ...                                                ...   
7783  175295                 Game meat, beaver, cooked, roasted   
7786  175298             Game meat, boar, wild, cooked, roasted   
7788  175300         Game meat, buffalo, water, cooked, roasted   
7790  175302                    Game meat, elk, cooked, roasted   
7792  175304                   Game meat, goat, cooked, roasted   

      food_category_id  
0                   18  
1                   18  
3                   18  
4                   18  
5                   18  
...                ...  
7783                

In [220]:
# 6. In the category 17 (Lamb, Veal, and Game Products), remove items containing the word `game`,
# since those items are not easily accessible in our geographic area.

for index, row in foods.iterrows():
    if row[COLUMN_FOOD_CATEGORY_ID] == category_id_mapper(FOOD_CATEGORY_LAMB_VEAL_AND_GAME_PRODUCTS):
        if _contains_any_of_words(row[COLUMN_DESCRIPTION], ["game"]):
            foods.drop(index, inplace=True)

print(foods)

      fdc_id                                        description  \
0     167512  Pillsbury Golden Layer Buttermilk Biscuits, Ar...   
1     167513  Pillsbury, Cinnamon Rolls with Icing, refriger...   
3     167515     George Weston Bakeries, Thomas English Muffins   
4     167516         Waffles, buttermilk, frozen, ready-to-heat   
5     167517  Waffle, buttermilk, frozen, ready-to-heat, toa...   
...      ...                                                ...   
7773  175285  Veal, shoulder, blade, separable lean only, co...   
7775  175287  Veal, sirloin, separable lean and fat, cooked,...   
7776  175288  Veal, sirloin, separable lean and fat, cooked,...   
7777  175289  Veal, cubed for stew (leg and shoulder), separ...   
7779  175291                      Veal, ground, cooked, broiled   

      food_category_id  
0                   18  
1                   18  
3                   18  
4                   18  
5                   18  
...                ...  
7773                

In [221]:
# 7. In the category 18 (Baked Products), remove items not containing the word `bread`,
# since those items are not easily accessible in our geographic area.

for index, row in foods.iterrows():
    if row[COLUMN_FOOD_CATEGORY_ID] == category_id_mapper(FOOD_CATEGORY_BAKED_PRODUCTS):
        if not _contains_any_of_words(row[COLUMN_DESCRIPTION], ["bread"]):
            foods.drop(index, inplace=True)

print(foods)

      fdc_id                                        description  \
14    167526  Bread, salvadoran sweet cheese (quesadilla sal...   
15    167527    Bread, pound cake type, pan de torta salvadoran   
20    167532                                 Bread, white wheat   
24    167536             Snacks, beef jerky, chopped and formed   
25    167537         Snacks, corn-based, extruded, chips, plain   
...      ...                                                ...   
7773  175285  Veal, shoulder, blade, separable lean only, co...   
7775  175287  Veal, sirloin, separable lean and fat, cooked,...   
7776  175288  Veal, sirloin, separable lean and fat, cooked,...   
7777  175289  Veal, cubed for stew (leg and shoulder), separ...   
7779  175291                      Veal, ground, cooked, broiled   

      food_category_id  
14                  18  
15                  18  
20                  18  
24                  23  
25                  23  
...                ...  
7773                

In [222]:
# 8. In the category 20 (Cereal Grains and Pasta), remove items not containing any of these words,
# since those items are not usually consumed directly.

for index, row in foods.iterrows():
    if row[COLUMN_FOOD_CATEGORY_ID] == category_id_mapper(FOOD_CATEGORY_CEREAL_GRAINS_AND_PASTA):
        if not _contains_any_of_words(
                row[COLUMN_DESCRIPTION],
                ["rice", "pasta", "macaroni", "noodles", "spaghetti"],
        ):
            foods.drop(index, inplace=True)

print(foods)

      fdc_id                                        description  \
14    167526  Bread, salvadoran sweet cheese (quesadilla sal...   
15    167527    Bread, pound cake type, pan de torta salvadoran   
20    167532                                 Bread, white wheat   
24    167536             Snacks, beef jerky, chopped and formed   
25    167537         Snacks, corn-based, extruded, chips, plain   
...      ...                                                ...   
7773  175285  Veal, shoulder, blade, separable lean only, co...   
7775  175287  Veal, sirloin, separable lean and fat, cooked,...   
7776  175288  Veal, sirloin, separable lean and fat, cooked,...   
7777  175289  Veal, cubed for stew (leg and shoulder), separ...   
7779  175291                      Veal, ground, cooked, broiled   

      food_category_id  
14                  18  
15                  18  
20                  18  
24                  23  
25                  23  
...                ...  
7773                
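
Steps 3 and 5 to 8 share one pattern: within a single category, either drop the items that match a word list or keep only those that match. As a rough sketch of that pattern, and not the way the script itself is written, the rules can be expressed as data. `CATEGORY_RULES`, `has_word`, and `keep` are hypothetical names, and the sample rows reuse descriptions from the outputs above with assumed category ids.

```python
import re

# Hypothetical rule table mirroring steps 3 and 5-8:
# category id -> (keep_if_match, words).
# keep_if_match=True keeps only matching items; False drops matching items.
CATEGORY_RULES = {
    6: (False, ["sauce"]),
    15: (True, ["salmon", "tuna", "shrimp"]),
    17: (False, ["game"]),
    18: (True, ["bread"]),
    20: (True, ["rice", "pasta", "macaroni", "noodles", "spaghetti"]),
}


def has_word(text, words):
    """Whole-word, case-insensitive match against any of `words`."""
    return any(
        re.search(r"\b" + re.escape(w) + r"\b", text, flags=re.IGNORECASE)
        for w in words
    )


def keep(food):
    """Decide whether one food row survives the per-category rules."""
    rule = CATEGORY_RULES.get(food["food_category_id"])
    if rule is None:
        return True  # no special rule for this category
    keep_if_match, words = rule
    return has_word(food["description"], words) == keep_if_match


# Sample rows reduced to the two columns the rules need.
foods = [
    {"description": "Bread, white wheat", "food_category_id": 18},
    {"description": "Waffles, buttermilk, frozen, ready-to-heat", "food_category_id": 18},
    {"description": "Game meat, elk, cooked, roasted", "food_category_id": 17},
    {"description": "Veal, ground, cooked, broiled", "food_category_id": 17},
]

kept = [f["description"] for f in foods if keep(f)]
print(kept)
```

Keeping the rules in one table makes it easier to see which categories are restricted and to adjust the word lists for a different geographic area.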

#### Add Required Columns

In [223]:
# 1. Add five boolean columns to express whether this food can be consumed in each of five meals.

COLUMN_IS_BREAKFAST = "is_breakfast"
COLUMN_IS_MORNING_SNACK = "is_morning_snack"
COLUMN_IS_LUNCH = "is_lunch"
COLUMN_IS_AFTERNOON_SNACK = "is_afternoon_snack"
COLUMN_IS_DINNER = "is_dinner"

foods[COLUMN_IS_BREAKFAST] = 0
foods[COLUMN_IS_MORNING_SNACK] = 0
foods[COLUMN_IS_LUNCH] = 0
foods[COLUMN_IS_AFTERNOON_SNACK] = 0
foods[COLUMN_IS_DINNER] = 0

for index, row in foods.iterrows():
    if row[COLUMN_FOOD_CATEGORY_ID] in list(map(category_id_mapper, MEAL_FOOD_CATEGORY_MAPPING[MEAL_BREAKFAST])):
        foods.at[index, COLUMN_IS_BREAKFAST] = 1

    if row[COLUMN_FOOD_CATEGORY_ID] in list(map(category_id_mapper, MEAL_FOOD_CATEGORY_MAPPING[MEAL_MORNING_SNACK])):
        foods.at[index, COLUMN_IS_MORNING_SNACK] = 1

    if row[COLUMN_FOOD_CATEGORY_ID] in list(map(category_id_mapper, MEAL_FOOD_CATEGORY_MAPPING[MEAL_LUNCH])):
        foods.at[index, COLUMN_IS_LUNCH] = 1

    if row[COLUMN_FOOD_CATEGORY_ID] in list(map(category_id_mapper, MEAL_FOOD_CATEGORY_MAPPING[MEAL_AFTERNOON_SNACK])):
        foods.at[index, COLUMN_IS_AFTERNOON_SNACK] = 1

    if row[COLUMN_FOOD_CATEGORY_ID] in list(map(category_id_mapper, MEAL_FOOD_CATEGORY_MAPPING[MEAL_DINNER])):
        foods.at[index, COLUMN_IS_DINNER] = 1

print(foods)

      fdc_id                                        description  \
14    167526  Bread, salvadoran sweet cheese (quesadilla sal...   
15    167527    Bread, pound cake type, pan de torta salvadoran   
20    167532                                 Bread, white wheat   
24    167536             Snacks, beef jerky, chopped and formed   
25    167537         Snacks, corn-based, extruded, chips, plain   
...      ...                                                ...   
7773  175285  Veal, shoulder, blade, separable lean only, co...   
7775  175287  Veal, sirloin, separable lean and fat, cooked,...   
7776  175288  Veal, sirloin, separable lean and fat, cooked,...   
7777  175289  Veal, cubed for stew (leg and shoulder), separ...   
7779  175291                      Veal, ground, cooked, broiled   

      food_category_id  is_breakfast  is_morning_snack  is_lunch  \
14                  18             1                 1         1   
15                  18             1                 1     
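
The meal flags depend only on the food category, so the id set of each meal can be built once instead of once per row. A minimal sketch under that assumption follows; the shortened category names and the helper `meal_flags` are hypothetical, and only three of the five meals are shown.

```python
# Illustrative subset of the mappings above (names shortened).
CATEGORY_IDS = {"poultry": 5, "beverages": 14, "baked": 18, "snacks": 23}
MEAL_CATEGORIES = {
    "breakfast": {"beverages", "baked"},
    "morning_snack": {"beverages", "baked", "snacks"},
    "lunch": {"poultry", "beverages", "baked"},
}

# Build each meal's id set once, instead of once per row.
MEAL_CATEGORY_IDS = {
    meal: {CATEGORY_IDS[name] for name in names}
    for meal, names in MEAL_CATEGORIES.items()
}


def meal_flags(food_category_id):
    """Return {"is_<meal>": 0 or 1} for one food category id."""
    return {
        f"is_{meal}": int(food_category_id in ids)
        for meal, ids in MEAL_CATEGORY_IDS.items()
    }


print(meal_flags(18))  # baked products
print(meal_flags(23))  # snacks
```

With these sample mappings, baked products (category 18) are flagged for all three meals, which agrees with the `1 1 1` row shown in the output above.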

In [224]:
# 2. Add a boolean column to express whether this food is a beverage or not.

COLUMN_IS_BEVERAGE = "is_beverage"

foods[COLUMN_IS_BEVERAGE] = 0

for index, row in foods.iterrows():
    if (
            (
                    row[COLUMN_FOOD_CATEGORY_ID] ==
                    category_id_mapper(FOOD_CATEGORY_FRUITS_AND_FRUIT_JUICES) and
                    _contains_any_of_words(row[COLUMN_DESCRIPTION], ["juice", "nectar"])
            ) or
            (
                    row[COLUMN_FOOD_CATEGORY_ID] ==
                    category_id_mapper(FOOD_CATEGORY_BEVERAGES)
            )
    ):
        foods.at[index, COLUMN_IS_BEVERAGE] = 1

print(foods)

      fdc_id                                        description  \
14    167526  Bread, salvadoran sweet cheese (quesadilla sal...   
15    167527    Bread, pound cake type, pan de torta salvadoran   
20    167532                                 Bread, white wheat   
24    167536             Snacks, beef jerky, chopped and formed   
25    167537         Snacks, corn-based, extruded, chips, plain   
...      ...                                                ...   
7773  175285  Veal, shoulder, blade, separable lean only, co...   
7775  175287  Veal, sirloin, separable lean and fat, cooked,...   
7776  175288  Veal, sirloin, separable lean and fat, cooked,...   
7777  175289  Veal, cubed for stew (leg and shoulder), separ...   
7779  175291                      Veal, ground, cooked, broiled   

      food_category_id  is_breakfast  is_morning_snack  is_lunch  \
14                  18             1                 1         1   
15                  18             1                 1     

In [225]:
# 3. Using the file `food_nutrient.csv`, find the values of the following nutrients.

NUTRITIONAL_ITEM_CALORIES = "calories"
NUTRITIONAL_ITEM_PROTEIN = "protein"
NUTRITIONAL_ITEM_FAT = "fat"
NUTRITIONAL_ITEM_CHOLESTEROL = "cholesterol"
NUTRITIONAL_ITEM_CARBOHYDRATE = "carbohydrate"
NUTRITIONAL_ITEM_SUGAR = "sugar"
NUTRITIONAL_ITEM_SODIUM = "sodium"
NUTRITIONAL_ITEM_CALCIUM = "calcium"
NUTRITIONAL_ITEM_IRON = "iron"
NUTRITIONAL_ITEM_POTASSIUM = "potassium"
NUTRITIONAL_ITEM_VITAMIN_C = "vitamin_c"

NUTRITIONAL_ITEMS = [
    NUTRITIONAL_ITEM_CALORIES,
    NUTRITIONAL_ITEM_PROTEIN,
    NUTRITIONAL_ITEM_FAT,
    NUTRITIONAL_ITEM_CHOLESTEROL,
    NUTRITIONAL_ITEM_CARBOHYDRATE,
    NUTRITIONAL_ITEM_SUGAR,
    NUTRITIONAL_ITEM_SODIUM,
    NUTRITIONAL_ITEM_CALCIUM,
    NUTRITIONAL_ITEM_IRON,
    NUTRITIONAL_ITEM_POTASSIUM,
    NUTRITIONAL_ITEM_VITAMIN_C,
]

COLUMN_CALORIES = "calories"
COLUMN_PROTEIN = "protein"
COLUMN_FAT = "fat"
COLUMN_CHOLESTEROL = "cholesterol"
COLUMN_CARBOHYDRATE = "carbohydrate"
COLUMN_SUGAR = "sugar"
COLUMN_SODIUM = "sodium"
COLUMN_CALCIUM = "calcium"
COLUMN_IRON = "iron"
COLUMN_POTASSIUM = "potassium"
COLUMN_VITAMIN_C = "vitamin_c"
COLUMN_FDC_ID = "fdc_id"
COLUMN_NUTRIENT_ID = "nutrient_id"
COLUMN_AMOUNT = "amount"

# From nutrients.csv
NUTRITIONAL_ITEM_ID_MAPPING = {
    NUTRITIONAL_ITEM_CALORIES: 1008,
    NUTRITIONAL_ITEM_PROTEIN: 1003,
    NUTRITIONAL_ITEM_FAT: 1004,
    NUTRITIONAL_ITEM_CHOLESTEROL: 1253,
    NUTRITIONAL_ITEM_CARBOHYDRATE: 1005,
    NUTRITIONAL_ITEM_SUGAR: 2000,
    NUTRITIONAL_ITEM_SODIUM: 1093,
    NUTRITIONAL_ITEM_CALCIUM: 1087,
    NUTRITIONAL_ITEM_IRON: 1089,
    NUTRITIONAL_ITEM_POTASSIUM: 1092,
    NUTRITIONAL_ITEM_VITAMIN_C: 1162,
}

food_nutrients = pd.read_csv(input_dir / "food_nutrient.csv")

for index, row in foods.iterrows():
    fdc_id = row[COLUMN_FDC_ID]
    current_food_nutrients = food_nutrients[food_nutrients[COLUMN_FDC_ID] == fdc_id]
    food_nutrients = food_nutrients[food_nutrients[COLUMN_FDC_ID] != fdc_id]

    for _, nutrients_row in current_food_nutrients.iterrows():
        for key in NUTRITIONAL_ITEMS:
            if nutrients_row[COLUMN_NUTRIENT_ID] == NUTRITIONAL_ITEM_ID_MAPPING[key]:
                foods.at[index, key] = nutrients_row[COLUMN_AMOUNT]

print(foods)

      fdc_id                                        description  \
14    167526  Bread, salvadoran sweet cheese (quesadilla sal...   
15    167527    Bread, pound cake type, pan de torta salvadoran   
20    167532                                 Bread, white wheat   
24    167536             Snacks, beef jerky, chopped and formed   
25    167537         Snacks, corn-based, extruded, chips, plain   
...      ...                                                ...   
7773  175285  Veal, shoulder, blade, separable lean only, co...   
7775  175287  Veal, sirloin, separable lean and fat, cooked,...   
7776  175288  Veal, sirloin, separable lean and fat, cooked,...   
7777  175289  Veal, cubed for stew (leg and shoulder), separ...   
7779  175291                      Veal, ground, cooked, broiled   

      food_category_id  is_breakfast  is_morning_snack  is_lunch  \
14                  18             1                 1         1   
15                  18             1                 1     
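
The nutrient lookup is essentially a pivot from long rows `(fdc_id, nutrient_id, amount)` to one record per food. The sketch below shows the same idea in plain Python with a reverse id-to-column mapping and a single pass over the rows. The sample rows are made up, only three of the eleven nutrients are included, and the names `ID_TO_COLUMN` and `nutrients_by_food` are assumptions for illustration.

```python
from collections import defaultdict

# Subset of the nutrient ids listed above.
NUTRIENT_IDS = {"calories": 1008, "protein": 1003, "fat": 1004}
# Reverse lookup: nutrient id -> column name.
ID_TO_COLUMN = {nid: name for name, nid in NUTRIENT_IDS.items()}

# Made-up rows in the shape (fdc_id, nutrient_id, amount).
food_nutrient_rows = [
    (1, 1008, 250.0),
    (1, 1003, 9.0),
    (1, 1004, 3.5),
    (1, 9999, 1.0),  # a nutrient we do not track; ignored
    (2, 1008, 120.0),
]

# One pass over the nutrient rows, grouping amounts by food.
nutrients_by_food = defaultdict(dict)
for fdc_id, nutrient_id, amount in food_nutrient_rows:
    column = ID_TO_COLUMN.get(nutrient_id)
    if column is not None:
        nutrients_by_food[fdc_id][column] = amount

# Foods that have a value for every tracked nutrient.
complete = [
    fdc_id
    for fdc_id, values in nutrients_by_food.items()
    if set(values) == set(NUTRIENT_IDS)
]

print(dict(nutrients_by_food))
print(complete)
```

Food `2` lacks protein and fat in this sample, so it does not appear in `complete`; this is the kind of row that the missing-value step removes.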

#### Remove Missing Values

In [226]:
# For any food, if we don't know any of the values mentioned above, we drop that row from foods for more accuracy.

foods = foods.dropna(how='any')

foods = foods.reset_index(drop=True)

print(foods)

      fdc_id                                        description  \
0     167526  Bread, salvadoran sweet cheese (quesadilla sal...   
1     167532                                 Bread, white wheat   
2     167536             Snacks, beef jerky, chopped and formed   
3     167537         Snacks, corn-based, extruded, chips, plain   
4     167542                  Snacks, granola bars, hard, plain   
...      ...                                                ...   
2873  175274   Veal, loin, separable lean only, cooked, roasted   
2874  175276  Veal, rib, separable lean and fat, cooked, bra...   
2875  175278  Veal, shoulder, whole (arm and blade), separab...   
2876  175288  Veal, sirloin, separable lean and fat, cooked,...   
2877  175291                      Veal, ground, cooked, broiled   

      food_category_id  is_breakfast  is_morning_snack  is_lunch  \
0                   18             1                 1         1   
1                   18             1                 1     

### LP

After preprocessing the data, we can take a person's personal information and preferences and find a diet for them.

In [227]:
# We define the following parameters
# - Gender
#     - Male
#     - Female
# - Pregnant (boolean)
# - Lactating (boolean)
# - Weight (kg)
# - Height (cm)
# - Age (years)
# - Activity Level
#     - Sedentary (desk job; little or no exercise)
#     - Slightly Active (exercise 1-3 times a week)
#     - Moderately Active (exercise 3-5 times a week)
#     - Very Active (exercise 6-7 times a week)
#     - Extremely Active (physical job or regular training)
# - Goal
#     - Maintenance (maintaining current weight)
#     - Muscle Gain (building muscle mass)
#     - Fat Loss (losing body fat)

GENDER_MALE = "male"
GENDER_FEMALE = "female"

LIFESTYLE_SEDENTARY = "sedentary"
LIFESTYLE_SLIGHTLY_ACTIVE = "slightly_active"
LIFESTYLE_MODERATELY_ACTIVE = "moderately_active"
LIFESTYLE_VERY_ACTIVE = "very_active"
LIFESTYLE_EXTREMELY_ACTIVE = "extremely_active"

GOAL_MAINTENANCE = 'maintenance'
GOAL_MUSCLE_GAIN = 'muscle_gain'
GOAL_FAT_LOSS = 'fat_loss'

# We consider these values as an example

gender = GENDER_FEMALE
weight = 65
height = 165
age = 23
lifestyle = LIFESTYLE_SEDENTARY
goal = GOAL_MAINTENANCE
pregnant = False
lactating = False

For each of the nutrients, we create a function to calculate the healthy range of intake per day, based on the parameters.

In [228]:
# Basal Metabolic Rate (BMR)

def calculate_bmr(gender, weight, height, age):
    if gender == GENDER_FEMALE:
        return 447.593 + (9.247 * weight) + (3.098 * height) - (4.330 * age)
    if gender == GENDER_MALE:
        return 88.362 + (13.397 * weight) + (4.799 * height) - (5.677 * age)


bmr = calculate_bmr(gender, weight, height, age)

print(bmr)

1460.2279999999998


In [229]:
# Total Daily Energy Expenditure (TDEE)

def calculate_tdee(bmr, lifestyle, pregnant, lactating):
    # Extra energy demand during pregnancy and lactation
    factor = 1

    if pregnant:
        factor *= 1.2

    if lactating:
        factor *= 1.5

    # Apply the activity multiplier together with the pregnancy/lactation factor
    if lifestyle == LIFESTYLE_SEDENTARY:
        return bmr * 1.2 * factor
    if lifestyle == LIFESTYLE_SLIGHTLY_ACTIVE:
        return bmr * 1.375 * factor
    if lifestyle == LIFESTYLE_MODERATELY_ACTIVE:
        return bmr * 1.55 * factor
    if lifestyle == LIFESTYLE_VERY_ACTIVE:
        return bmr * 1.725 * factor
    if lifestyle == LIFESTYLE_EXTREMELY_ACTIVE:
        return bmr * 1.9 * factor


tdee = calculate_tdee(bmr, lifestyle, pregnant, lactating)

print(tdee)

1752.2735999999998


In [230]:
def calculate_calories(tdee, goal):
    if goal == GOAL_MAINTENANCE:
        return tdee * 0.9, tdee * 1.1

    if goal == GOAL_MUSCLE_GAIN:
        return tdee * 1.1, tdee * 1.2

    if goal == GOAL_FAT_LOSS:
        return tdee * 0.8, tdee * 0.9


calories_range = calculate_calories(tdee, goal)

print(calories_range)

(1577.04624, 1927.5009599999998)


In [231]:
def calculate_protein(gender, weight, lifestyle, goal, pregnant, lactating):
    min_per_kg_body_weight, max_per_kg_body_weight = {
        (GENDER_MALE, LIFESTYLE_SEDENTARY, GOAL_MAINTENANCE): (0.8, 1.0),
        (GENDER_MALE, LIFESTYLE_SEDENTARY, GOAL_MUSCLE_GAIN): (1.2, 1.5),
        (GENDER_MALE, LIFESTYLE_SEDENTARY, GOAL_FAT_LOSS): (1.2, 1.5),
        (GENDER_MALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MAINTENANCE): (0.8, 1.2),
        (GENDER_MALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MUSCLE_GAIN): (1.2, 1.7),
        (GENDER_MALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_FAT_LOSS): (1.2, 1.7),
        (GENDER_MALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MAINTENANCE): (1.0, 1.2),
        (GENDER_MALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MUSCLE_GAIN): (1.5, 1.7),
        (GENDER_MALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_FAT_LOSS): (1.5, 1.7),
        (GENDER_MALE, LIFESTYLE_VERY_ACTIVE, GOAL_MAINTENANCE): (1.0, 1.5),
        (GENDER_MALE, LIFESTYLE_VERY_ACTIVE, GOAL_MUSCLE_GAIN): (1.5, 2.0),
        (GENDER_MALE, LIFESTYLE_VERY_ACTIVE, GOAL_FAT_LOSS): (1.5, 1.7),
        (GENDER_MALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MAINTENANCE): (1.2, 1.5),
        (GENDER_MALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MUSCLE_GAIN): (1.7, 2.0),
        (GENDER_MALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_FAT_LOSS): (1.5, 1.7),

        (GENDER_FEMALE, LIFESTYLE_SEDENTARY, GOAL_MAINTENANCE): (0.8, 1.0),
        (GENDER_FEMALE, LIFESTYLE_SEDENTARY, GOAL_MUSCLE_GAIN): (1.0, 1.2),
        (GENDER_FEMALE, LIFESTYLE_SEDENTARY, GOAL_FAT_LOSS): (1.0, 1.2),
        (GENDER_FEMALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MAINTENANCE): (0.8, 1.0),
        (GENDER_FEMALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MUSCLE_GAIN): (1.0, 1.5),
        (GENDER_FEMALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_FAT_LOSS): (1.0, 1.5),
        (GENDER_FEMALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MAINTENANCE): (0.8, 1.0),
        (GENDER_FEMALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MUSCLE_GAIN): (1.2, 1.5),
        (GENDER_FEMALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_FAT_LOSS): (1.2, 1.5),
        (GENDER_FEMALE, LIFESTYLE_VERY_ACTIVE, GOAL_MAINTENANCE): (0.8, 1.2),
        (GENDER_FEMALE, LIFESTYLE_VERY_ACTIVE, GOAL_MUSCLE_GAIN): (1.2, 1.7),
        (GENDER_FEMALE, LIFESTYLE_VERY_ACTIVE, GOAL_FAT_LOSS): (1.2, 1.5),
        (GENDER_FEMALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MAINTENANCE): (1.0, 1.2),
        (GENDER_FEMALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MUSCLE_GAIN): (1.5, 1.7),
        (GENDER_FEMALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_FAT_LOSS): (1.2, 1.5),
    }[gender, lifestyle, goal]

    if pregnant:
        min_per_kg_body_weight += 0.3
        max_per_kg_body_weight += 0.4

    if lactating:
        min_per_kg_body_weight += 0.2
        max_per_kg_body_weight += 0.3

    return min_per_kg_body_weight * weight, max_per_kg_body_weight * weight


protein_range = calculate_protein(gender, weight, lifestyle, goal, pregnant, lactating)

print(protein_range)

(52.0, 65.0)


In [232]:
def calculate_fat(gender, lifestyle, goal, calories_range):
    fat_percentage = {
        (GENDER_MALE, LIFESTYLE_SEDENTARY, GOAL_MAINTENANCE): (20, 25),
        (GENDER_MALE, LIFESTYLE_SEDENTARY, GOAL_MUSCLE_GAIN): (20, 30),
        (GENDER_MALE, LIFESTYLE_SEDENTARY, GOAL_FAT_LOSS): (20, 30),
        (GENDER_MALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MAINTENANCE): (25, 30),
        (GENDER_MALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MUSCLE_GAIN): (20, 30),
        (GENDER_MALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_FAT_LOSS): (20, 30),
        (GENDER_MALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MAINTENANCE): (30, 35),
        (GENDER_MALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MUSCLE_GAIN): (20, 30),
        (GENDER_MALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_FAT_LOSS): (20, 30),
        (GENDER_MALE, LIFESTYLE_VERY_ACTIVE, GOAL_MAINTENANCE): (35, 40),
        (GENDER_MALE, LIFESTYLE_VERY_ACTIVE, GOAL_MUSCLE_GAIN): (20, 30),
        (GENDER_MALE, LIFESTYLE_VERY_ACTIVE, GOAL_FAT_LOSS): (20, 30),
        (GENDER_MALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MAINTENANCE): (40, 45),
        (GENDER_MALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MUSCLE_GAIN): (20, 30),
        (GENDER_MALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_FAT_LOSS): (20, 30),

        (GENDER_FEMALE, LIFESTYLE_SEDENTARY, GOAL_MAINTENANCE): (20, 25),
        (GENDER_FEMALE, LIFESTYLE_SEDENTARY, GOAL_MUSCLE_GAIN): (20, 30),
        (GENDER_FEMALE, LIFESTYLE_SEDENTARY, GOAL_FAT_LOSS): (20, 30),
        (GENDER_FEMALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MAINTENANCE): (25, 30),
        (GENDER_FEMALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MUSCLE_GAIN): (20, 30),
        (GENDER_FEMALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_FAT_LOSS): (20, 30),
        (GENDER_FEMALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MAINTENANCE): (30, 35),
        (GENDER_FEMALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MUSCLE_GAIN): (20, 30),
        (GENDER_FEMALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_FAT_LOSS): (20, 30),
        (GENDER_FEMALE, LIFESTYLE_VERY_ACTIVE, GOAL_MAINTENANCE): (35, 40),
        (GENDER_FEMALE, LIFESTYLE_VERY_ACTIVE, GOAL_MUSCLE_GAIN): (20, 30),
        (GENDER_FEMALE, LIFESTYLE_VERY_ACTIVE, GOAL_FAT_LOSS): (20, 30),
        (GENDER_FEMALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MAINTENANCE): (40, 45),
        (GENDER_FEMALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MUSCLE_GAIN): (20, 30),
        (GENDER_FEMALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_FAT_LOSS): (20, 30),
    }[gender, lifestyle, goal]

    # Convert calories to grams (1 gram of fat = 9 calories)
    return (fat_percentage[0] / 100) * calories_range[0] / 9, (fat_percentage[1] / 100) * calories_range[1] / 9


fat_range = calculate_fat(gender, lifestyle, goal, calories_range)

print(fat_range)

(35.045472, 53.54169333333333)
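
To make the conversion explicit, the following sketch reproduces the fat range above by hand and applies the same arithmetic to the carbohydrate shares used later in this section. The helper name `percent_range_to_grams` is ours and not part of the project; it only mirrors how the lower percentage is applied to the lower calorie bound and the upper percentage to the upper calorie bound.

```python
def percent_range_to_grams(percent_range, calories_range, kcal_per_gram):
    """Turn a (min %, max %) share of calories into a (min g, max g) range."""
    low_pct, high_pct = percent_range
    low_kcal, high_kcal = calories_range
    return (
        (low_pct / 100) * low_kcal / kcal_per_gram,
        (high_pct / 100) * high_kcal / kcal_per_gram,
    )


# Example values from this section: sedentary female, maintenance goal.
calories_range = (1577.04624, 1927.5009599999998)

# Fat: 20-25 % of calories, 9 kcal per gram.
fat_grams = percent_range_to_grams((20, 25), calories_range, kcal_per_gram=9)
# Carbohydrate: 50-65 % of calories, 4 kcal per gram.
carb_grams = percent_range_to_grams((50, 65), calories_range, kcal_per_gram=4)

print(fat_grams)
print(carb_grams)
```

Because the two ends of the range use different calorie bounds, the resulting gram range is wider than the percentage range alone would suggest.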


In [233]:
def calculate_cholesterol():
    return 200, 300


cholesterol_range = calculate_cholesterol()

print(cholesterol_range)

(200, 300)


In [234]:
def calculate_carbohydrate(gender, lifestyle, goal, calories_range):
    carbohydrate_percentage = {
        (GENDER_MALE, LIFESTYLE_SEDENTARY, GOAL_MAINTENANCE): (50, 65),
        (GENDER_MALE, LIFESTYLE_SEDENTARY, GOAL_MUSCLE_GAIN): (45, 65),
        (GENDER_MALE, LIFESTYLE_SEDENTARY, GOAL_FAT_LOSS): (45, 65),
        (GENDER_MALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MAINTENANCE): (45, 60),
        (GENDER_MALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MUSCLE_GAIN): (45, 65),
        (GENDER_MALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_FAT_LOSS): (45, 65),
        (GENDER_MALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MAINTENANCE): (40, 55),
        (GENDER_MALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MUSCLE_GAIN): (45, 65),
        (GENDER_MALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_FAT_LOSS): (45, 65),
        (GENDER_MALE, LIFESTYLE_VERY_ACTIVE, GOAL_MAINTENANCE): (35, 50),
        (GENDER_MALE, LIFESTYLE_VERY_ACTIVE, GOAL_MUSCLE_GAIN): (45, 65),
        (GENDER_MALE, LIFESTYLE_VERY_ACTIVE, GOAL_FAT_LOSS): (45, 65),
        (GENDER_MALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MAINTENANCE): (30, 45),
        (GENDER_MALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MUSCLE_GAIN): (45, 65),
        (GENDER_MALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_FAT_LOSS): (45, 65),

        (GENDER_FEMALE, LIFESTYLE_SEDENTARY, GOAL_MAINTENANCE): (50, 65),
        (GENDER_FEMALE, LIFESTYLE_SEDENTARY, GOAL_MUSCLE_GAIN): (45, 65),
        (GENDER_FEMALE, LIFESTYLE_SEDENTARY, GOAL_FAT_LOSS): (45, 65),
        (GENDER_FEMALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MAINTENANCE): (45, 60),
        (GENDER_FEMALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_MUSCLE_GAIN): (45, 65),
        (GENDER_FEMALE, LIFESTYLE_SLIGHTLY_ACTIVE, GOAL_FAT_LOSS): (45, 65),
        (GENDER_FEMALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MAINTENANCE): (40, 55),
        (GENDER_FEMALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_MUSCLE_GAIN): (45, 65),
        (GENDER_FEMALE, LIFESTYLE_MODERATELY_ACTIVE, GOAL_FAT_LOSS): (45, 65),
        (GENDER_FEMALE, LIFESTYLE_VERY_ACTIVE, GOAL_MAINTENANCE): (35, 50),
        (GENDER_FEMALE, LIFESTYLE_VERY_ACTIVE, GOAL_MUSCLE_GAIN): (45, 65),
        (GENDER_FEMALE, LIFESTYLE_VERY_ACTIVE, GOAL_FAT_LOSS): (45, 65),
        (GENDER_FEMALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MAINTENANCE): (30, 45),
        (GENDER_FEMALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_MUSCLE_GAIN): (45, 65),
        (GENDER_FEMALE, LIFESTYLE_EXTREMELY_ACTIVE, GOAL_FAT_LOSS): (45, 65),
    }[gender, lifestyle, goal]

    # Convert calories to grams (1 gram of carbohydrate = 4 calories)
    return (
        (carbohydrate_percentage[0] / 100) * calories_range[0] / 4,
        (carbohydrate_percentage[1] / 100) * calories_range[1] / 4
    )


carbohydrate_range = calculate_carbohydrate(gender, lifestyle, goal, calories_range)

print(carbohydrate_range)

(197.13078, 313.218906)


In [235]:
def calculate_sugar(calories_range):
    # It is advised to limit sugar intake to no more than 10% of total daily calorie intake.
    # Convert calories to grams (1 gram of carbohydrate = 4 calories)
    return 0, (10 / 100) * calories_range[1] / 4


sugar_range = calculate_sugar(calories_range)

print(sugar_range)

(0, 48.187523999999996)


In [236]:
def calculate_sodium():
    return 200, 2300


sodium_rage = calculate_sodium()

print(sodium_rage)

(200, 2300)


In [237]:
def calculate_calcium():
    return 1000, 1200


calcium_range = calculate_calcium()

print(calcium_range)

(1000, 1200)


In [238]:
def calculate_iron(gender, age, pregnant, lactating):
    if age <= 3:
        return 7, 11

    if age <= 8:
        return 10, 14

    if age <= 13:
        return 8, 12

    if age <= 18:
        if gender == GENDER_MALE:
            return 11, 15

        if gender == GENDER_FEMALE:
            if pregnant:
                return 27, 31

            if lactating:
                return 10, 14

            return 15, 19

    if age <= 50:
        if gender == GENDER_MALE:
            return 8, 12

        if gender == GENDER_FEMALE:
            if pregnant:
                return 27, 31

            if lactating:
                return 9, 13

            return 18, 22

    return 8, 12


iron_range = calculate_iron(gender, age, pregnant, lactating)

print(iron_range)

(18, 22)


In [239]:
def calculate_potassium(age):
    if age <= 3:
        return 2000, 2500

    if age <= 8:
        return 2300, 2900

    if age <= 13:
        return 2500, 3000

    if age <= 18:
        return 2600, 3000

    return 2600, 3400


potassium_range = calculate_potassium(age)

print(potassium_range)

(2600, 3400)


In [240]:
def calculate_vitamin_c(age, pregnant, lactating):
    if age <= 3:
        return 15, 35

    if age <= 8:
        return 25, 45

    if age <= 13:
        return 45, 65

    if pregnant and lactating:
        return 120, 130

    if pregnant:
        return 80, 95

    if lactating:
        return 115, 120

    return 75, 90


vitamin_c_rage = calculate_vitamin_c(age, pregnant, lactating)

print(vitamin_c_rage)

(75, 90)


Now we can define an LP and add constraints.

In [241]:
ranges = {
    NUTRITIONAL_ITEM_CALORIES: calories_range,
    NUTRITIONAL_ITEM_PROTEIN: protein_range,
    NUTRITIONAL_ITEM_FAT: fat_range,
    NUTRITIONAL_ITEM_CHOLESTEROL: cholesterol_range,
    NUTRITIONAL_ITEM_CARBOHYDRATE: carbohydrate_range,
    NUTRITIONAL_ITEM_SUGAR: sugar_range,
    NUTRITIONAL_ITEM_SODIUM: sodium_rage,
    NUTRITIONAL_ITEM_CALCIUM: calcium_range,
    NUTRITIONAL_ITEM_IRON: iron_range,
    NUTRITIONAL_ITEM_POTASSIUM: potassium_range,
    NUTRITIONAL_ITEM_VITAMIN_C: vitamin_c_rage,
}

foods_mapping = {}
for _, row in foods.iterrows():
    foods_mapping[row[COLUMN_FDC_ID]] = row[COLUMN_DESCRIPTION]

food_ids = foods[COLUMN_FDC_ID].tolist()

Based on the user's goal, we choose one of the following objectives.

- If the goal is Muscle Gain or Maintenance, we maximize protein intake.
- If the goal is Fat Loss, we minimize calorie intake.

In [242]:
import pulp

prob = pulp.LpProblem(
    "DietOptimization",
    pulp.LpMinimize if goal == GOAL_FAT_LOSS else pulp.LpMaximize
)

Before we define the objective, we need to add the variables.

- `b_i` is a continuous variable representing the amount of the `i`-th food in breakfast, in grams.
- `m_i` is a continuous variable representing the amount of the `i`-th food in the morning snack, in grams.
- `l_i` is a continuous variable representing the amount of the `i`-th food in lunch, in grams.
- `a_i` is a continuous variable representing the amount of the `i`-th food in the afternoon snack, in grams.
- `d_i` is a continuous variable representing the amount of the `i`-th food in dinner, in grams.
- `x_i` is a continuous variable representing the amount of the `i`-th food over the whole day, in grams.
- `bc_i` is a binary variable indicating whether `b_i` is non-zero.
- `mc_i` is a binary variable indicating whether `m_i` is non-zero.
- `lc_i` is a binary variable indicating whether `l_i` is non-zero.
- `ac_i` is a binary variable indicating whether `a_i` is non-zero.
- `dc_i` is a binary variable indicating whether `d_i` is non-zero.


In [243]:
v = {
    "b": pulp.LpVariable.dicts("b", food_ids, lowBound=0, cat=pulp.const.LpContinuous),
    "m": pulp.LpVariable.dicts("m", food_ids, lowBound=0, cat=pulp.const.LpContinuous),
    "l": pulp.LpVariable.dicts("l", food_ids, lowBound=0, cat=pulp.const.LpContinuous),
    "a": pulp.LpVariable.dicts("a", food_ids, lowBound=0, cat=pulp.const.LpContinuous),
    "d": pulp.LpVariable.dicts("d", food_ids, lowBound=0, cat=pulp.const.LpContinuous),
    "x": pulp.LpVariable.dicts("x", food_ids, lowBound=0, cat=pulp.const.LpContinuous),

    "bc": pulp.LpVariable.dicts("bc", food_ids, lowBound=0, cat=pulp.const.LpBinary),
    "mc": pulp.LpVariable.dicts("mc", food_ids, lowBound=0, cat=pulp.const.LpBinary),
    "lc": pulp.LpVariable.dicts("lc", food_ids, lowBound=0, cat=pulp.const.LpBinary),
    "ac": pulp.LpVariable.dicts("ac", food_ids, lowBound=0, cat=pulp.const.LpBinary),
    "dc": pulp.LpVariable.dicts("dc", food_ids, lowBound=0, cat=pulp.const.LpBinary),
}
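
As a rough size check, the number of decision variables can be counted by hand. The sketch below assumes the preprocessed table holds 2878 foods (row index 0 to 2877 in the table preview); the resulting total agrees with the 31658 columns that the CBC log reports in the solve step.

```python
# Assumption: the preprocessed table has 2878 foods (row index 0..2877).
n_foods = 2878

continuous_families = ["b", "m", "l", "a", "d", "x"]
binary_families = ["bc", "mc", "lc", "ac", "dc"]

n_continuous = len(continuous_families) * n_foods
n_binary = len(binary_families) * n_foods
n_variables = n_continuous + n_binary

print(n_continuous, n_binary, n_variables)  # 17268 14390 31658
```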

Now we can define the objective function.

In [244]:
if goal == GOAL_FAT_LOSS:
    # minimize calories
    prob += pulp.lpSum((foods[COLUMN_CALORIES][i] / 100) * v["x"][fid] for i, fid in enumerate(food_ids))
else:
    # maximize protein
    prob += pulp.lpSum((foods[COLUMN_PROTEIN][i] / 100) * v["x"][fid] for i, fid in enumerate(food_ids))

Now, we need to relate the variables.

In [245]:
# constraints relating food variables and meal variables
for fid in food_ids:
    prob += pulp.lpSum((v["b"][fid], v["m"][fid], v["l"][fid], v["a"][fid], v["d"][fid])) == v["x"][fid]

# constraints for either not having a food or having between MIN_FOOD and MAX_FOOD grams of the food in each meal
# for each meal variable y in {b, m, l, a, d}, the indicator yc is 1 if y is non-zero and 0 if y is zero
MIN_FOOD = 50
MAX_FOOD = 500

for fid in food_ids:
    prob += v["b"][fid] <= MAX_FOOD * v["bc"][fid]
    prob += v["b"][fid] >= MIN_FOOD * v["bc"][fid]

    prob += v["m"][fid] <= MAX_FOOD * v["mc"][fid]
    prob += v["m"][fid] >= MIN_FOOD * v["mc"][fid]

    prob += v["l"][fid] <= MAX_FOOD * v["lc"][fid]
    prob += v["l"][fid] >= MIN_FOOD * v["lc"][fid]

    prob += v["a"][fid] <= MAX_FOOD * v["ac"][fid]
    prob += v["a"][fid] >= MIN_FOOD * v["ac"][fid]

    prob += v["d"][fid] <= MAX_FOOD * v["dc"][fid]
    prob += v["d"][fid] >= MIN_FOOD * v["dc"][fid]
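
The two inequalities per meal make each amount semi-continuous: it is either exactly zero or lies between `MIN_FOOD` and `MAX_FOOD`. The sketch below checks that behaviour in plain Python for a single amount `y` and its indicator `c`; the helper `satisfies_link` is a hypothetical name used only for illustration.

```python
MIN_FOOD = 50
MAX_FOOD = 500


def satisfies_link(y, c):
    """Check MIN_FOOD * c <= y <= MAX_FOOD * c for an amount y >= 0 and a 0/1 indicator c."""
    return y >= 0 and MIN_FOOD * c <= y <= MAX_FOOD * c


# With c = 0 the only feasible amount is 0 grams.
print(satisfies_link(0, 0))    # True
print(satisfies_link(30, 0))   # False

# With c = 1 the amount must lie in [MIN_FOOD, MAX_FOOD].
print(satisfies_link(30, 1))   # False: below the minimum portion
print(satisfies_link(240, 1))  # True
print(satisfies_link(600, 1))  # False: above the maximum portion
```

Note that `c = 1` with `y = 0` is also infeasible, so the indicator can only be switched on when the food is actually served.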


We add some constraints to achieve a more realistic and diverse diet.

In [246]:
# constraints for not repeating a food in two meals
for fid in food_ids:
    prob += pulp.lpSum((v["bc"][fid], v["mc"][fid], v["lc"][fid], v["ac"][fid], v["dc"][fid])) <= 1

# constraints for excluding non-relevant foods in meals
for i, fid in enumerate(food_ids):
    if not foods[COLUMN_IS_BREAKFAST][i]:
        prob += v["b"][fid] == 0

    if not foods[COLUMN_IS_MORNING_SNACK][i]:
        prob += v["m"][fid] == 0

    if not foods[COLUMN_IS_LUNCH][i]:
        prob += v["l"][fid] == 0

    if not foods[COLUMN_IS_AFTERNOON_SNACK][i]:
        prob += v["a"][fid] == 0

    if not foods[COLUMN_IS_DINNER][i]:
        prob += v["d"][fid] == 0

# constraints for food and beverage intake per meal and per day
is_beverage = foods[COLUMN_IS_BEVERAGE]

MIN_MEAL_FOOD = 200
MAX_MEAL_FOOD = 800

MIN_SNACK_FOOD = 100
MAX_SNACK_FOOD = 700

MIN_MEAL_BEVERAGE = 240
MAX_MEAL_BEVERAGE = 360

MIN_SNACK_BEVERAGE = 240
MAX_SNACK_BEVERAGE = 360

# (min food, max food, min beverage, max beverage) in grams for each meal variable
meal_bounds = {
    "b": (MIN_MEAL_FOOD, MAX_MEAL_FOOD, MIN_MEAL_BEVERAGE, MAX_MEAL_BEVERAGE),
    "m": (MIN_SNACK_FOOD, MAX_SNACK_FOOD, MIN_SNACK_BEVERAGE, MAX_SNACK_BEVERAGE),
    "l": (MIN_MEAL_FOOD, MAX_MEAL_FOOD, MIN_MEAL_BEVERAGE, MAX_MEAL_BEVERAGE),
    "a": (MIN_SNACK_FOOD, MAX_SNACK_FOOD, MIN_SNACK_BEVERAGE, MAX_SNACK_BEVERAGE),
    "d": (MIN_MEAL_FOOD, MAX_MEAL_FOOD, MIN_MEAL_BEVERAGE, MAX_MEAL_BEVERAGE),
}

for meal, (min_food, max_food, min_beverage, max_beverage) in meal_bounds.items():
    food_total = pulp.lpSum((1 - is_beverage[i]) * v[meal][fid] for i, fid in enumerate(food_ids))
    beverage_total = pulp.lpSum(is_beverage[i] * v[meal][fid] for i, fid in enumerate(food_ids))

    prob += food_total >= min_food
    prob += food_total <= max_food
    prob += beverage_total >= min_beverage
    prob += beverage_total <= max_beverage

Finally, we add constraints to limit the amount of nutrients per day.

In [247]:
for key in NUTRITIONAL_ITEMS:
    prob += pulp.lpSum((foods[key][i] / 100) * v["x"][fid] for i, fid in enumerate(food_ids)) >= ranges[key][0]
    prob += pulp.lpSum((foods[key][i] / 100) * v["x"][fid] for i, fid in enumerate(food_ids)) <= ranges[key][1]
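
The division by 100 reflects that the nutrient columns are expressed per 100 g of food, as is standard for USDA SR Legacy, while the `x` variables are in grams. A minimal sketch with made-up numbers follows; the two foods, their protein values, and the helpers `daily_total` and `within_range` are illustrative assumptions, not taken from the dataset or the project code.

```python
def daily_total(per_100g, grams):
    """Sum nutrient intake over foods: (value per 100 g / 100) * grams eaten."""
    return sum((per_100g[food] / 100) * grams[food] for food in grams)


def within_range(value, value_range):
    """Return True if value lies inside the inclusive (low, high) range."""
    low, high = value_range
    return low <= value <= high


# Hypothetical protein content in grams per 100 g of food.
protein_per_100g = {"food_a": 20.0, "food_b": 4.0}
# Hypothetical amounts eaten over the day, in grams.
grams_eaten = {"food_a": 250, "food_b": 200}

protein_total = daily_total(protein_per_100g, grams_eaten)
print(protein_total)                               # 58.0
print(within_range(protein_total, (52.0, 65.0)))  # True
```

The range `(52.0, 65.0)` is the protein range computed for the example person earlier in this section.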

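Now we can solve the problem. Because of the binary indicator variables it is a mixed-integer program, which PuLP passes to its bundled CBC solver by default.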

In [248]:
prob.solve()

print(pulp.LpStatus[prob.status])

Welcome to the CBC MILP Solver 
Version: 2.10.3 
Build Date: Dec 15 2019 

command line - /home/kamyar/repos/uni/ais/CS-SBU-eAdvancedAlgorithms-MSc-2023/venv/lib/python3.10/site-packages/pulp/solverdir/cbc/linux/64/cbc /tmp/8432a85caea94d06ac354a3f8a69e44e-pulp.mps max timeMode elapsed branch printingOptions all solution /tmp/8432a85caea94d06ac354a3f8a69e44e-pulp.sol (default strategy 1)
At line 2 NAME          MODEL
At line 3 ROWS
At line 40147 COLUMNS
At line 248808 RHS
At line 288951 BOUNDS
At line 303342 ENDATA
Problem MODEL has 40142 rows, 31658 columns and 177096 elements
Coin0008I MODEL read with 0 errors
Option for timeMode changed from cpu to elapsed
Continuous objective value is 65 - 0.11 seconds
Cgl0008I 2878 inequality constraints converted to equality constraints
Cgl0003I 0 fixed, 0 tightened bounds, 4 strengthened rows, 2 substitutions
Cgl0004I processed model has 22314 rows, 22295 columns (11152 integer (11152 of which binary)) and 91512 elements
Cbc0038I Initial state -

In [164]:
meals = (
    ("Breakfast", "b"),
    ("Morning Snack", "m"),
    ("Lunch", "l"),
    ("Afternoon Snack", "a"),
    ("Dinner", "d"),
)

# Print every food with a non-zero amount, grouped by meal.
# round() avoids truncating values such as 239.999 that a solver may return.
print(
    "\n\n".join(
        f"{title}:\n" + "\n".join(
            f"{round(v[key][food_id].varValue)} grams {foods_mapping[food_id]}"
            for food_id in food_ids
            if v[key][food_id].varValue
        )
        for title, key in meals
    )
)

Breakfast:
240 grams Beverages, AMBER, hard cider
500 grams Cereals, QUAKER, Instant Oatmeal, fruit and cream variety, dry

Morning Snack:
240 grams Orange-grapefruit juice, canned or bottled, unsweetened
500 grams Onions, frozen, whole, cooked, boiled, drained, without salt

Lunch:
240 grams Beverages, ABBOTT, ENSURE PLUS, ready-to-drink
500 grams Okra, frozen, cooked, boiled, drained, without salt

Afternoon Snack:
500 grams Artichokes, (globe or french), cooked, boiled, drained, with salt
240 grams Carbonated beverage, cream soda

Dinner:
240 grams Orange juice, canned, unsweetened
500 grams Chicken, broilers or fryers, meat only, cooked, fried
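
As a sanity check, the per-meal totals of this plan can be compared against the bounds defined in the constraints. The sketch below hard-codes the amounts printed above and assumes that the 240 g item in each meal (cider, juices, Ensure Plus, cream soda) is the one flagged as a beverage; that flag comes from the dataset and is not shown in the output.

```python
# Per-meal bounds from the constraints section: (min, max) in grams.
FOOD_BOUNDS = {"meal": (200, 800), "snack": (100, 700)}
BEVERAGE_BOUNDS = {"meal": (240, 360), "snack": (240, 360)}

# (kind, food grams, beverage grams) as printed above.
# Assumption: the 240 g item in each meal is the beverage.
plan = {
    "Breakfast": ("meal", 500, 240),
    "Morning Snack": ("snack", 500, 240),
    "Lunch": ("meal", 500, 240),
    "Afternoon Snack": ("snack", 500, 240),
    "Dinner": ("meal", 500, 240),
}


def meal_ok(kind, food_grams, beverage_grams):
    """Check one meal against its food and beverage bounds."""
    food_low, food_high = FOOD_BOUNDS[kind]
    bev_low, bev_high = BEVERAGE_BOUNDS[kind]
    return food_low <= food_grams <= food_high and bev_low <= beverage_grams <= bev_high


results = {name: meal_ok(*entry) for name, entry in plan.items()}
total_grams = sum(food + bev for _, food, bev in plan.values())

print(results)
print(total_grams)  # 3700
```

In this plan every beverage sits at the 240 g minimum and every food at the 500 g `MAX_FOOD` cap, which is typical of an optimizer pushing amounts to the edges of the feasible region.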
