# Food Rec v2 for Minimum Viable Product
Experiment with recommending foods that satisfy nutritional requirements not yet met by the current food selection.

Recommend the top 3 foods to satisfy a missing nutritional need.

Note: "Food set" here means the user's selected foods plus the food currently being considered as a recommendation (see below).

This version works with the updated dataset.

## Import Libraries

In [2]:
import pandas as pd
import random

## Look at Data

In [3]:
data = pd.read_csv("./nutrient_foodname_amount.tsv", sep="\t")
display(data)

Unnamed: 0,nutrient,unit,food,nutrient_value
0,Protein,G,"Butter, salted",0.85
1,Protein,G,"Butter, whipped, with salt",0.49
2,Protein,G,"Butter oil, anhydrous",0.28
3,Protein,G,"Cheese, blue",21.40
4,Protein,G,"Cheese, brick",23.24
...,...,...,...,...
146670,"Sugars, total including NLEA",G,"REDUCED SODIUM: Turkey breast, sliced, prepack...",0.91
146671,"Sugars, total including NLEA",G,"REDUCED SODIUM: Chicken breast, deli, rotisser...",0.75
146672,"Sugars, total including NLEA",G,"REDUCED SODIUM: Bologna, meat and poultry",1.97
146673,"Sugars, total including NLEA",G,"REDUCED SODIUM: Nuts, almond butter, plain, wi...",6.27


## Get List of All Foods and Nutrients in Our Dataset

In [4]:
uniqueFoods = data["food"].unique()
uniqueNutrients = data["nutrient"].unique()
display(uniqueFoods, uniqueNutrients)

array(['Butter, salted', 'Butter, whipped, with salt',
       'Butter oil, anhydrous', ...,
       'REDUCED SODIUM: Bologna, meat and poultry',
       'REDUCED SODIUM: Nuts, almond butter, plain, with salt added',
       'Vitamin D as ingredient'], dtype=object)

array(['Protein', 'Total lipid (fat)', 'Carbohydrate, by difference',
       'Energy', 'Alcohol, ethyl', 'Water', 'Caffeine', 'Theobromine',
       'Fiber, total dietary', 'Calcium, Ca', 'Iron, Fe', 'Magnesium, Mg',
       'Phosphorus, P', 'Potassium, K', 'Sodium, Na', 'Zinc, Zn',
       'Copper, Cu', 'Selenium, Se', 'Retinol', 'Vitamin A, RAE',
       'Carotene, beta', 'Carotene, alpha',
       'Vitamin E (alpha-tocopherol)', 'Vitamin D (D2 + D3)',
       'Cryptoxanthin, beta', 'Lycopene', 'Lutein + zeaxanthin',
       'Vitamin C, total ascorbic acid', 'Thiamin', 'Riboflavin',
       'Niacin', 'Vitamin B-6', 'Folate, total', 'Vitamin B-12',
       'Choline, total', 'Vitamin K (phylloquinone)', 'Folic acid',
       'Folate, food', 'Folate, DFE', 'Vitamin E, added',
       'Vitamin B-12, added', 'Cholesterol',
       'Fatty acids, total saturated', 'SFA 4:0', 'SFA 6:0', 'SFA 8:0',
       'SFA 10:0', 'SFA 12:0', 'SFA 14:0', 'SFA 16:0', 'SFA 18:0',
       'MUFA 18:1', 'PUFA 18:2', 'PUFA 1

## Simulation Data


In [9]:
def get_info(food):
    '''Return a DataFrame containing nutrient data of only the food passed in'''
    return data[data["food"] == food]

In [17]:
def get_nutrient(food, nutrient):
    '''Return the specific nutrient amount for a particular food and nutrient of interest'''
    thisFood = get_info(food)

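    nutInfo = thisFood[thisFood["nutrient"] == nutrient]["nutrient_value"]

    # No row for this food/nutrient pair: treat the amount as 0
    if nutInfo.empty:
        return 0

    # Take the first match; float() on a multi-element array would raise
    return float(nutInfo.iloc[0])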
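
Each call to `get_nutrient` filters the full long-format table twice. As a rough sketch of an alternative, the table could be pivoted once into a food-by-nutrient grid and then read by label. The `toy` frame, its values, and the name `get_nutrient_fast` are hypothetical and only mirror the column names of the real dataset.

```python
import pandas as pd

# Hypothetical miniature of the long-format table (values are made up for illustration)
toy = pd.DataFrame({
    "nutrient": ["Protein", "Protein", "Iron, Fe"],
    "food": ["Food A", "Food B", "Iron-rich Food"],
    "nutrient_value": [10.0, 2.5, 1.2],
})

# Pivot to one row per food and one column per nutrient; missing pairs become 0
wide = toy.pivot_table(index="food", columns="nutrient",
                       values="nutrient_value", aggfunc="first", fill_value=0)

def get_nutrient_fast(food, nutrient, table=wide):
    '''Look up a nutrient amount in the pivoted table, returning 0 when the food or nutrient is absent'''
    if food not in table.index or nutrient not in table.columns:
        return 0.0
    return float(table.at[food, nutrient])

display_value = get_nutrient_fast("Food A", "Protein")
```

The pivot costs more up front, so it only pays off if many lookups follow. `aggfunc="first"` is an assumption that mirrors taking the first matching row when a food/nutrient pair appears more than once.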



In [18]:
simSelectedFoods = ["REDUCED SODIUM: Bologna, meat and poultry", "Butter", "REDUCED SODIUM: Nuts, almond butter, plain, with salt added"]
nutrientsOfInterest = ["Iron, Fe", "Cholesterol", "Calcium, Ca"]

nutrientData = {nutrient:0 for nutrient in nutrientsOfInterest}

# Go through each food of interest and gather data
for food in simSelectedFoods:
    
    # Nutrient data
    nutrientData = {nut:amt+get_nutrient(food, nut) for (nut,amt) in nutrientData.items()}
    
display(nutrientData)

{'Iron, Fe': 4.73, 'Cholesterol': 92.0, 'Calcium, Ca': 472.0}
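
The same totals could probably be computed without a Python loop by filtering once and grouping by nutrient. The following is a minimal sketch on a made-up table; `toy`, its values, and `total_nutrients` are hypothetical names, not part of the notebook.

```python
import pandas as pd

# Hypothetical miniature of the long-format table (values are made up for illustration)
toy = pd.DataFrame({
    "nutrient": ["Iron, Fe", "Iron, Fe", "Cholesterol", "Calcium, Ca"],
    "food": ["Food A", "Food B", "Food A", "Food C"],
    "nutrient_value": [1.5, 2.0, 30.0, 100.0],
})

def total_nutrients(table, foods, nutrients):
    '''Sum each nutrient of interest over the selected foods'''
    mask = table["food"].isin(foods) & table["nutrient"].isin(nutrients)
    totals = table[mask].groupby("nutrient")["nutrient_value"].sum()
    # Reindex so nutrients with no matching rows still appear, with a total of 0
    return totals.reindex(nutrients, fill_value=0.0).to_dict()

toy_totals = total_nutrients(toy, ["Food A", "Food B"],
                             ["Iron, Fe", "Cholesterol", "Calcium, Ca"])
```

One difference from the loop: `isin` counts a food once even if it is listed twice in the selection, and duplicate rows for the same food/nutrient pair are summed rather than rejected.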

## Rec System
For now, this version recommends foods for a single nutrient: the one whose total across the currently selected foods is furthest from its daily recommended value.
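
One way to pick that nutrient, sketched under the assumption that "furthest" means the smallest fraction of the recommended amount already covered (an absolute difference would be another reasonable reading). The function name and the example amounts are made up for illustration and are not real dietary reference values.

```python
def find_lowest_nutrient(consumed, recommended):
    '''Return the nutrient whose consumed amount covers the smallest fraction of its recommended amount'''
    coverage = {}
    for nut, target in recommended.items():
        if target <= 0:
            continue  # skip nutrients without a positive target to avoid dividing by zero
        coverage[nut] = consumed.get(nut, 0) / target
    if not coverage:
        return None
    # The nutrient with the lowest coverage ratio is the most under-served
    return min(coverage, key=coverage.get)

# Made-up amounts, for illustration only
example_consumed = {"Iron, Fe": 4.0, "Calcium, Ca": 900.0}
example_recommended = {"Iron, Fe": 16.0, "Calcium, Ca": 1000.0}
example_lowest = find_lowest_nutrient(example_consumed, example_recommended)
```

Using a ratio rather than a raw difference keeps nutrients measured in different units comparable.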

In [19]:
# Placeholder: random stand-ins for the daily recommended amounts
recAmounts = {nut:random.random() * random.randint(100, 2000) for nut in nutrientsOfInterest}
display(recAmounts)

{'Iron, Fe': 228.00038157258365,
 'Cholesterol': 280.3960267490736,
 'Calcium, Ca': 221.18241042043087}

In [52]:
# Placeholder: pick a random nutrient to stand in for the most-needed one
lowestNutrient = uniqueNutrients[random.randint(0, len(uniqueNutrients) - 1)]
display(lowestNutrient)

'Protein'

In [53]:
def rec_foods(nutrient, numRecs = 3):
    '''Recommend up to numRecs foods to satisfy the missing nutritional need of argument nutrient'''
    possibleFoods = data[data["nutrient"] == nutrient]
    # sort_values returns a new DataFrame; sorting the filtered slice in place triggers SettingWithCopyWarning
    possibleFoods = possibleFoods.sort_values(by = ["nutrient_value"], ascending = False)

    # Slicing returns fewer than numRecs foods when not enough foods have this nutrient
    return possibleFoods.iloc[:numRecs]["food"].values

In [55]:
rec_foods(lowestNutrient, 5)

array(['Soy protein isolate', 'Gelatins, dry powder, unsweetened',
       'Beverages, Protein powder whey based',
       'Beverages, ABBOTT, EAS whey protein powder',
       'Fish, cod, Atlantic, dried and salted'], dtype=object)
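
`rec_foods` ranks every food in the dataset, so it can return foods the user has already selected. A minimal sketch of one possible variation that filters those out first is shown here; `rec_new_foods`, the `toy` frame, and its values are hypothetical.

```python
import pandas as pd

# Hypothetical miniature of the long-format table (values are made up for illustration)
toy = pd.DataFrame({
    "nutrient": ["Protein"] * 4,
    "food": ["Food A", "Food B", "Food C", "Food D"],
    "nutrient_value": [80.0, 60.0, 40.0, 20.0],
})

def rec_new_foods(table, nutrient, selected, numRecs=3):
    '''Recommend up to numRecs foods richest in nutrient, skipping foods already selected'''
    candidates = table[(table["nutrient"] == nutrient) & ~table["food"].isin(selected)]
    # nlargest returns at most numRecs rows, ordered from highest to lowest value
    return candidates.nlargest(numRecs, "nutrient_value")["food"].tolist()

toy_recs = rec_new_foods(toy, "Protein", selected=["Food A"], numRecs=2)
```

Returning a plain list keeps the result type the same whether zero, one, or several foods qualify.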