## Introduction

Access to a nutritious diet is not convenient for the average American, for several reasons. It requires knowledge of the optimal combination of macronutrients, a generous budget to buy them, and sufficient time to choose ingredients and adjust their quantities to reach macro goals. This project seeks to address these barriers.

According to Gabrielle Lyon, a doctor, health coach and nutritionist, Americans on average do not meet their daily protein and fibre goals and often exceed their sugar and saturated fat goals. Unsurprisingly, these issues are more prevalent among lower-income households. Even individuals who have spent considerable time educating themselves on optimal macro intakes for their health goals, and who plan their meals and groceries accordingly, may find that committing to a fixed plan is not financially sustainable in times of high price volatility and inflation. They may end up substituting high-quality proteins and fibres with lower-quality products that unintentionally bring saturated fats and large amounts of sugar into their diet.

This project aims to produce a model, as well as a basis for policy recommendations on web-scraping legislation, that would ultimately smooth this process for ordinary citizens and thus improve access to nutritious diets on a wide scale.

## Overview of steps (not finalized)

1. Calculate caloric intake and dietary needs
    1. Partner with dieticians or firms specialised in this; many (including MyFitnessPal) have already automated it
2. Input location information to detect the nearest Walmart or other budget grocery store with online price availability
3. Scrape the internet for recipes and load them into a dataframe
    1. Could include or exclude items for certain dietary restrictions
    2. Which website? What kind of recipes?
    3. Potential partnerships for a large-scale project: companies that work on compiling recipes
4. Make a dataset of the ingredients of each recipe, for all recipes
5. Scrape the Instacart/Walmart website for the prices of all ingredients
    1. Legality of scraping for prices under the competition act --> competitor price scraping could be illegal
    2. Think about the implications of legalizing web scraping for prices --> this could have groundbreaking implications for reducing search costs, promoting competitive markets and helping consumers with household item costs
6. Adjust the quantities of the main ingredients (protein, vegetables, carbs) to minimise cost subject to the given nutrition targets
7. Make a plan that minimises repetition of ingredients and minimises waste
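
Steps 3 to 5 hinge on turning scraped recipes into a flat ingredient table and a list of items to price. The following is a minimal sketch of that reshaping, using plain Python structures rather than a dataframe. The recipe names, quantities, and the helper names `ingredient_rows` and `unique_ingredients` are hypothetical placeholders, not part of the project.

```python
# Hypothetical scraped recipes: names and gram quantities are placeholders, not real data.
recipes = [
    {"name": "beef_rice_bowl",
     "ingredients": {"rice": 150, "lean_gr_beef": 100, "broccoli": 50}},
    {"name": "broccoli_rice",
     "ingredients": {"rice": 120, "broccoli": 150}},
]


def ingredient_rows(recipes):
    """Flatten recipes into one (recipe, ingredient, grams) row per ingredient (step 4)."""
    return [
        (recipe["name"], ingredient, grams)
        for recipe in recipes
        for ingredient, grams in recipe["ingredients"].items()
    ]


def unique_ingredients(recipes):
    """Sorted list of distinct ingredients, i.e. the items whose prices must be looked up (step 5)."""
    return sorted({ing for recipe in recipes for ing in recipe["ingredients"]})


rows = ingredient_rows(recipes)
to_price = unique_ingredients(recipes)
print(rows)
print(to_price)
```

The rows could be passed to `pd.DataFrame(rows, columns=[...])` if a dataframe is preferred for the later steps.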

## Aims

- We could automate the meal-planning phase: take the user’s information to determine caloric/dietary needs and location, then return a recipe or list of recipes that hits these goals at the lowest cost while (optionally) maintaining some variety in dining options, and optionally provide a grocery list for a given number of portions
- This planning phase could reduce decision time and allocate the budget to food that maximises nutrition intake, optimally and in a way that is impossible to do manually
- Target: low-income households and anyone who wishes to achieve their dietary goals at the lowest cost
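
The optional grocery list mentioned in the first aim amounts to summing ingredient quantities over the chosen recipes and portions. A rough sketch, assuming each recipe is stored as grams per portion; the function name `grocery_list` and the plan contents are illustrative only.

```python
from collections import Counter


def grocery_list(plan):
    """Sum ingredient quantities (grams) over a meal plan.

    `plan` is a list of (ingredients_per_portion, portions) pairs, where
    ingredients_per_portion maps ingredient name -> grams per portion.
    """
    totals = Counter()
    for ingredients, portions in plan:
        for name, grams in ingredients.items():
            totals[name] += grams * portions
    return dict(totals)


# Hypothetical plan: two portions of one dish, three portions of another.
plan = [
    ({"rice": 150, "lean_gr_beef": 100}, 2),
    ({"rice": 100, "broccoli": 200}, 3),
]
shopping = grocery_list(plan)
print(shopping)
```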

## Diet optimization problem, example (step 6)
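
The example in this section is an instance of the classic diet problem. As a sketch, in notation that does not appear in the notebook itself: let $x_i$ be the quantity of ingredient $i$ in units of 100 g, $p_i$ its price per 100 g, $a_{ij}$ the amount of nutrient $j$ per 100 g of ingredient $i$, and $g_j$ the goal for nutrient $j$. The model then reads:

$$
\begin{aligned}
\min_{x} \quad & \sum_{i} p_i x_i \\
\text{s.t.} \quad & \sum_{i} a_{ij} x_i \ge g_j && \text{for goals with a minimum (calories, carbs, protein, fibre)} \\
& \sum_{i} a_{ij} x_i \le g_j && \text{for goals with a maximum (fat, sugar, saturated fat)} \\
& \sum_{i} x_i \le 3 && \text{(total weight at most 300 g)} \\
& 0 \le x_i \le 3 && \text{for every ingredient } i
\end{aligned}
$$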

In [19]:
# !pip install PuLP==2.0
# PuLP is the linear programming (LP) library used here to solve optimization problems

# Import everything from the PuLP module (LpProblem, LpVariable, LpMinimize, ...)
from pulp import *
import pandas as pd

In [69]:
example_per_100g = {'product_name': ['rice', 'lean_gr_beef', 'broccoli'],
                    'price': [0.38, 1.43, 1.01],
                    'calories': [129, 230, 35],
                    'carbs': [27.9, 0, 7.4],
                    'fat': [0.28, 12, 0.4],
                    'protein': [2.66, 14, 2.8],
                    'fibre': [0.4, 0, 2.6],
                    'sugar': [0.5, 0, 1.7],
                    'sat_fat': [0.08, 4.7, 0]}

example_per_100g = pd.DataFrame(example_per_100g)
example_per_100g

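| | product_name | price | calories | carbs | fat | protein | fibre | sugar | sat_fat |
|---|---|---|---|---|---|---|---|---|---|
| 0 | rice | 0.38 | 129 | 27.9 | 0.28 | 2.66 | 0.4 | 0.5 | 0.08 |
| 1 | lean_gr_beef | 1.43 | 230 | 0.0 | 12.0 | 14.0 | 0.0 | 0.0 | 4.7 |
| 2 | broccoli | 1.01 | 35 | 7.4 | 0.4 | 2.8 | 2.6 | 1.7 | 0.0 |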
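
Because every value in the table is per 100 g, the totals for a meal are the per-100 g values scaled by each quantity and summed. A small sketch of that arithmetic in plain Python; the function name `meal_totals` and the sample meal are assumptions for illustration.

```python
# Values per 100 g, copied from the example table above.
PER_100G = {
    "rice":         {"price": 0.38, "calories": 129, "carbs": 27.9, "fat": 0.28,
                     "protein": 2.66, "fibre": 0.4, "sugar": 0.5, "sat_fat": 0.08},
    "lean_gr_beef": {"price": 1.43, "calories": 230, "carbs": 0.0, "fat": 12.0,
                     "protein": 14.0, "fibre": 0.0, "sugar": 0.0, "sat_fat": 4.7},
    "broccoli":     {"price": 1.01, "calories": 35, "carbs": 7.4, "fat": 0.4,
                     "protein": 2.8, "fibre": 2.6, "sugar": 1.7, "sat_fat": 0.0},
}


def meal_totals(grams):
    """Total price and nutrients for a meal, given grams of each product."""
    totals = {}
    for product, amount in grams.items():
        for field, per_100g in PER_100G[product].items():
            totals[field] = totals.get(field, 0.0) + per_100g * amount / 100
    return totals


# Hypothetical meal: 150 g rice, 100 g lean ground beef, 50 g broccoli.
totals = meal_totals({"rice": 150, "lean_gr_beef": 100, "broccoli": 50})
print(totals)
```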


In [44]:
# Assume the following are my nutrition goals for dinner
goals = {'calories': ['>500'],
         'carbs': ['>50'], 'fat (g)': ['<35'], 'protein (g)': ['>38'],
         'fibre (g)': ['>7'], 'sugar (g)': ['<5'], 'sat_fat (g)': ['<5']}

goals = pd.DataFrame(goals)
goals

| | calories | carbs | fat (g) | protein (g) | fibre (g) | sugar (g) | sat_fat (g) |
|---|---|---|---|---|---|---|---|
| 0 | >500 | >50 | <35 | >38 | >7 | <5 | <5 |
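
The goals are stored as strings such as `>500`, while the constraints further down are typed in by hand. One possible way to bridge the two is to parse each string into a comparison symbol and a number. This is only a sketch: the helpers `parse_goal` and `meets_goal` are hypothetical, and `>` and `<` are read as "at least" and "at most" to match the `>=` and `<=` constraints used in the model.

```python
import operator

# '>' and '<' are treated as >= and <= here, matching the LP constraints in the notebook.
_OPS = {">": operator.ge, "<": operator.le}


def parse_goal(text):
    """Split a goal such as '>500' into (symbol, numeric threshold)."""
    text = text.strip()
    symbol, number = text[0], text[1:]
    if symbol not in _OPS:
        raise ValueError(f"unsupported goal: {text!r}")
    return symbol, float(number)


def meets_goal(total, text):
    """Return True if a nutrient total satisfies the goal string."""
    symbol, threshold = parse_goal(text)
    return _OPS[symbol](total, threshold)


goals = {"calories": ">500", "carbs": ">50", "fat": "<35", "protein": ">38",
         "fibre": ">7", "sugar": "<5", "sat_fat": "<5"}
parsed = {name: parse_goal(text) for name, text in goals.items()}
print(parsed)
```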


In [88]:
# Create the problem object that will hold the objective and constraints
model = LpProblem("Balanced_Diet_Problem", LpMinimize)

# Define decision variables: quantity of each ingredient in units of 100 g,
# with a lower bound of 0 and an upper bound of 3 (i.e. 0 to 300 g)
x1 = LpVariable(example_per_100g['product_name'][0], 0, 3, LpContinuous)
x2 = LpVariable(example_per_100g['product_name'][1], 0, 3, LpContinuous)
x3 = LpVariable(example_per_100g['product_name'][2], 0, 3, LpContinuous)

# Define Prices
price_1 = example_per_100g['price'][0]
price_2 = example_per_100g['price'][1]
price_3 = example_per_100g['price'][2]

# Define Objective
model += (price_1 * x1) + (price_2 * x2) + (price_3 * x3)

# Define Constraints
model += x1 + x2 + x3 <= 3 # we don't want our dinner ingredients exceeding 300g total

model += example_per_100g['calories'][0]*x1 + example_per_100g['calories'][1]*x2 + example_per_100g['calories'][2]*x3 >= 500 # Calories
model += example_per_100g['carbs'][0]*x1 + example_per_100g['carbs'][1]*x2 + example_per_100g['carbs'][2]*x3 >= 50 # Carbs
model += example_per_100g['fat'][0]*x1 + example_per_100g['fat'][1]*x2 + example_per_100g['fat'][2]*x3 <= 35 # Fat
model += example_per_100g['protein'][0]*x1 + example_per_100g['protein'][1]*x2 + example_per_100g['protein'][2]*x3 >= 38 # protein
model += example_per_100g['fibre'][0]*x1 + example_per_100g['fibre'][1]*x2 + example_per_100g['fibre'][2]*x3 >= 7 # fibre
model += example_per_100g['sugar'][0]*x1 + example_per_100g['sugar'][1]*x2 + example_per_100g['sugar'][2]*x3 <= 5 # sugar
model += example_per_100g['sat_fat'][0]*x1 + example_per_100g['sat_fat'][1]*x2 + example_per_100g['sat_fat'][2]*x3 <= 5 # sat_fat


# The problem is solved using PuLP's default solver
model.solve()

# Note: the values printed below are only meaningful if LpStatus[model.status] is 'Optimal'

# Print each variable's optimized value (converted from units of 100 g to grams)
for v in model.variables():
    print(v.name, "=", v.varValue * 100, 'grams')

# Print the optimised objective function value
print("Value of Objective Function (i.e. price) = ", value(model.objective))

broccoli = 311.07023 grams
lean_gr_beef = 260.88626 grams
rice = -271.95649 grams
Value of Objective Function (i.e. price) =  5.839048179000001


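To be fixed --> the solution above violates the variable bounds (rice is negative and broccoli exceeds 300 g). The bounds of 0 and 3 are passed to `LpVariable` correctly, so the likely cause is that the goals cannot all be met at once: with saturated fat capped at 5 g and total weight capped at 300 g, these three ingredients cannot supply 38 g of protein. In that case the solver reports the model as infeasible and the printed values are not meaningful. Check `LpStatus[model.status]` after solving, then relax the goals or add more protein-dense ingredients.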
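
One way to test whether the out-of-bounds values above come from an infeasible model rather than from ignored bounds is to ask how much protein is attainable at all under the weight and saturated fat limits. The sketch below does this with a coarse grid search in plain Python instead of PuLP; the function name `max_protein` and the 5 g step are assumptions made for illustration, and the grid only approximates the true maximum.

```python
from itertools import product

# Protein and saturated fat per 100 g, copied from the example table.
foods = {
    "rice":         {"protein": 2.66, "sat_fat": 0.08},
    "lean_gr_beef": {"protein": 14.0, "sat_fat": 4.7},
    "broccoli":     {"protein": 2.8,  "sat_fat": 0.0},
}
names = list(foods)


def max_protein(step=0.05, max_total=3.0, max_sat_fat=5.0):
    """Coarse grid search over quantities (units of 100 g) for the most protein
    obtainable within the total-weight and saturated-fat limits."""
    n = int(round(max_total / step))
    best = 0.0
    for counts in product(range(n + 1), repeat=len(names)):
        if sum(counts) > n:  # total weight limit: at most 300 g
            continue
        qty = [c * step for c in counts]
        sat_fat = sum(q * foods[f]["sat_fat"] for q, f in zip(qty, names))
        if sat_fat > max_sat_fat:  # saturated fat limit: at most 5 g
            continue
        protein = sum(q * foods[f]["protein"] for q, f in zip(qty, names))
        best = max(best, protein)
    return best


best = max_protein()
print(f"max protein within limits: {best:.1f} g (goal: 38 g)")
```

If the attainable maximum falls short of the 38 g goal, no choice of quantities can satisfy every constraint, and the fix lies in the goals or the ingredient list rather than in the variable bounds.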

Web scraping --> BeautifulSoup
- Walmart's own web scraping API --> does not scrape prices, but other info 
- perhaps useful for obtaining nutritional info though?
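
As a rough illustration of the price-extraction step, the sketch below pulls a price out of an HTML fragment. It uses Python's built-in `html.parser` as a stand-in for BeautifulSoup so that it runs without extra packages, and it parses an inline string rather than fetching a live page. The markup, the class names, and the `PriceParser` class are hypothetical; real store pages are structured differently, and the legality questions raised in step 5 still apply.

```python
from html.parser import HTMLParser

# Hypothetical product-page fragment; real store pages use different markup.
SAMPLE_HTML = """
<div class="product">
  <span class="product-name">rice</span>
  <span class="price">$0.38</span>
</div>
"""


class PriceParser(HTMLParser):
    """Collect the text of every <span class="price"> element as a float."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_data(self, data):
        if self._in_price:
            # Drop the currency symbol and convert to a number.
            self.prices.append(float(data.strip().lstrip("$")))

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False


parser = PriceParser()
parser.feed(SAMPLE_HTML)
prices = parser.prices
print(prices)
```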