# Do Blue Zone diets work in Berkeley?

##### Blue Zones are regions of the world where people live much longer than the rest of the population. There are only five Blue Zones in the world, located in Greece, Italy, Japan, the USA, and Costa Rica. The longevity of these populations reflects their culture, community, and, most importantly, diet, since residents tend to lead healthy lifestyles.
##### Our group wants to see how feasible it would be for Berkeley students to eat a Blue Zone diet instead of their typical diet. To do this, we compare a Mediterranean diet, as eaten in Ikaria, Greece, and an Okinawa, Japan, diet against a typical Berkeley student's diet to see which costs the least (at Berkeley Safeway prices) and provides the most nutritional value.

## Dietary Reference Intakes

In [8]:
# Check that we are in the correct working directory
!pwd

# Install the necessary packages, including access to FDC data
!pip install -r requirements.txt #--upgrade

from  scipy.optimize import linprog as lp
import numpy as np
import pandas as pd

import fooddatacentral as fdc
import warnings

/home/jovyan/EEP153_Materials/Project2


In [9]:
# Read the minimum dietary requirements
diet_min = pd.read_csv("diet_minimums.csv")
# Drop the unneeded index column left over from the CSV export
diet_min = diet_min.drop(columns=["Unnamed: 0"])
# Set "Nutrition" as the index of the DataFrame
diet_min = diet_min.set_index('Nutrition')

diet_min

Nutrition,Source,C 1-3,F 4-8,M 4-8,F 9-13,M 9-13,F 14-18,M 14-18,F 19-30,M 19-30,F 31-50,M 31-50,F 51+,M 51+
Energy,---,1000.0,1200.0,1400.0,1600.0,1800.0,1800.0,2200.0,2000.0,2400.0,1800.0,2200.0,1600.0,2000.0
Protein,RDA,13.0,19.0,19.0,34.0,34.0,46.0,52.0,46.0,56.0,46.0,56.0,46.0,56.0
"Fiber, total dietary",---,14.0,16.8,19.6,22.4,25.2,25.2,30.8,28.0,33.6,25.2,30.8,22.4,28.0
"Folate, DFE",RDA,150.0,200.0,200.0,300.0,300.0,400.0,400.0,400.0,400.0,400.0,400.0,400.0,400.0
"Calcium, Ca",RDA,700.0,1000.0,1000.0,1300.0,1300.0,1300.0,1300.0,1000.0,1000.0,1000.0,1000.0,1200.0,1000.0
"Carbohydrate, by difference",RDA,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0
"Iron, Fe",RDA,7.0,10.0,10.0,8.0,8.0,15.0,11.0,18.0,8.0,18.0,8.0,8.0,8.0
"Magnesium, Mg",RDA,80.0,130.0,130.0,240.0,240.0,360.0,410.0,310.0,400.0,320.0,420.0,320.0,420.0
Niacin,RDA,6.0,8.0,8.0,12.0,12.0,14.0,16.0,14.0,16.0,14.0,16.0,14.0,16.0
"Phosphorus, P",RDA,460.0,500.0,500.0,1250.0,1250.0,1250.0,1250.0,700.0,700.0,700.0,700.0,700.0,700.0


In [10]:
def dietary_ref_intake(age,sex,df):
    """Takes in age and sex ('F' or 'M'), and returns the dietary reference intake for the chosen population"""

    if age <= 3:
        col = 'C 1-3'  # Children aged 1-3 share one column regardless of sex
    elif age >= 51:
        col = sex + ' 51+'  # The oldest bracket is open-ended in the table
    else:
        age_ranges = [(4,8),(9,13),(14,18),(19,30),(31,50)]
        for age_range in age_ranges:
            if age >= age_range[0] and age <= age_range[1]:
                col = sex + ' ' + str(age_range[0]) + '-' + str(age_range[1])
    return pd.Series(df[col]) 

In [11]:
# Example of minimum dietary requirements for a female aged 19
dietary_ref_intake(age=19,sex='F',df=diet_min)

Nutrition
Energy                            2000.0
Protein                             46.0
Fiber, total dietary                28.0
Folate, DFE                        400.0
Calcium, Ca                       1000.0
Carbohydrate, by difference        130.0
Iron, Fe                            18.0
Magnesium, Mg                      310.0
Niacin                              14.0
Phosphorus, P                      700.0
Potassium, K                      4700.0
Riboflavin                           1.1
Thiamin                              1.1
Vitamin A, RAE                     700.0
Vitamin B-12                         2.4
Vitamin B-6                          1.3
Vitamin C, total ascorbic acid      75.0
Vitamin E (alpha-tocopherol)        15.0
Vitamin K (phylloquinone)           90.0
Zinc, Zn                             8.0
Name: F 19-30, dtype: float64
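
The lookup behind `dietary_ref_intake` is a mapping from an age and sex to one of the column labels of the table (`C 1-3`, `F 4-8`, ..., `M 51+`). A minimal sketch of that mapping on its own, assuming whole-number ages; `dri_column` is a hypothetical helper introduced here for illustration and is not part of the notebook:

```python
def dri_column(age, sex):
    """Map an age in whole years and a sex ('F' or 'M') to a DRI column label.

    Hypothetical helper for illustration; the labels follow the table above.
    """
    if age < 1:
        raise ValueError("the table starts at age 1")
    if age <= 3:
        return 'C 1-3'            # children share one column regardless of sex
    if age >= 51:
        return sex + ' 51+'       # open-ended top bracket
    for lo, hi in [(4, 8), (9, 13), (14, 18), (19, 30), (31, 50)]:
        if lo <= age <= hi:
            return '%s %d-%d' % (sex, lo, hi)

print(dri_column(19, 'F'))
print(dri_column(50, 'M'))
print(dri_column(72, 'F'))
```

Returning the label directly from each branch keeps the boundary ages (3, 50, 51) unambiguous, since each age can match only one bracket.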

## Function to Solve Lowest Cost
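
The subsistence problem is a linear program. As a sketch, in notation introduced here for illustration (these symbols do not appear in the notebook): let $p$ be the vector of food prices, $x$ the quantities of each food, $A_{\min}$ and $A_{\max}$ the nutrient contents of the foods for the nutrients that have minimum and maximum intakes, and $b_{\min}$, $b_{\max}$ the corresponding intake bounds. The cheapest adequate diet then solves

$$
\min_{x \ge 0} \; p^\top x \quad \text{subject to} \quad A_{\min}\, x \ge b_{\min}, \qquad A_{\max}\, x \le b_{\max}.
$$

Because `linprog` accepts only constraints of the form $A_{ub}\, x \le b_{ub}$, the minimum constraints are passed to it multiplied by $-1$.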

In [12]:
def solve_subsistence_problem(FoodNutrients,Prices,dietmin,dietmax,max_weight=None,tol=1e-6):
    """Solve Stigler's Subsistence Cost Problem.

    Inputs:
       - FoodNutrients : A pd.DataFrame with rows corresponding to nutrients, columns to foods.
       - Prices : A pd.Series of prices for different foods, indexed by food name.
       - dietmin : A pd.Series of DRIs, with index corresponding to rows of FoodNutrients,
                    describing minimum intakes.
       - dietmax : A pd.Series of DRIs, with index corresponding to rows of FoodNutrients,
                    describing maximum intakes.
       - max_weight : Maximum weight (in hectograms) allowed for diet.
       - tol : Solution values smaller than this in absolute value treated as zeros.
       
    """
    try: 
        p = Prices.apply(lambda x:x.magnitude)
    except AttributeError:  # Maybe not passing in prices with units?
        warnings.warn("Prices have no units.  BE CAREFUL!  We're assuming prices are per hectogram or deciliter!")
        p = Prices

    p = p.dropna()

    # Compile list that we have both prices and nutritional info for; drop if either missing
    use = p.index.intersection(FoodNutrients.columns)
    p = p[use]

    # Drop nutritional information for foods we don't know the price of,
    # and replace missing nutrients with zeros.
    Aall = FoodNutrients[p.index].fillna(0)

    # Drop rows of A that we don't have constraints for.
    Amin = Aall.loc[Aall.index.intersection(dietmin.index)]
    Amin = Amin.reindex(dietmin.index,axis=0)
    idx = Amin.index.to_frame()
    idx['type'] = 'min'
    #Amin.index = pd.MultiIndex.from_frame(idx)
    #dietmin.index = Amin.index
    
    Amax = Aall.loc[Aall.index.intersection(dietmax.index)]
    Amax = Amax.reindex(dietmax.index,axis=0)
    idx = Amax.index.to_frame()
    idx['type'] = 'max'
    #Amax.index = pd.MultiIndex.from_frame(idx)
    #dietmax.index = Amax.index

    # Stack the constraints in the form A x >= b; maximum constraints enter with a negative sign.
    A = pd.concat([Amin,
                   -Amax])

    b = pd.concat([dietmin,
                   -dietmax]) # Note sign change for max constraints

    # Make sure order of p, A, b are consistent
    A = A.reindex(p.index,axis=1)
    A = A.reindex(b.index,axis=0)

    if max_weight is not None:
        # Add up weights of foods consumed
        A.loc['Hectograms'] = -1
        b.loc['Hectograms'] = -max_weight
        
    # Now solve problem!  (Note that the linear program solver we'll use assumes
    # "less-than-or-equal" constraints.  We can switch back and forth by
    # multiplying $A$ and $b$ by $-1$.)

    result = lp(p, -A, -b, method='highs')

    result.A = A
    result.b = b
    
    if result.success:
        result.diet = pd.Series(result.x,index=p.index)
    else: # No feasible solution?
        warnings.warn(result.message)
        result.diet = pd.Series(result.x,index=p.index)*np.nan  

    return result
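
To make the sign conventions concrete, here is a minimal toy version of the same kind of problem, with two foods and two nutrients. All numbers are hypothetical and chosen only so the answer can be checked by hand; the sketch calls `scipy.optimize.linprog` directly rather than going through `solve_subsistence_problem`.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy data: two foods, two nutrients.
p = np.array([2.0, 3.0])            # price per unit of each food
A_min = np.array([[1.0, 2.0],       # nutrient A per unit of each food
                  [3.0, 1.0]])      # nutrient B per unit of each food
b_min = np.array([4.0, 6.0])        # minimum intakes of A and B
A_max = np.array([[3.0, 1.0]])      # nutrient B also has an upper bound
b_max = np.array([8.0])

# linprog expects A_ub @ x <= b_ub, so the minimum constraints are negated.
A_ub = np.vstack([-A_min, A_max])
b_ub = np.concatenate([-b_min, b_max])

# Default bounds already restrict every quantity to be non-negative.
toy = linprog(p, A_ub=A_ub, b_ub=b_ub, method='highs')
print(toy.success, round(toy.fun, 2), np.round(toy.x, 2))
```

Both minimum constraints bind at the optimum: 1.6 units of the first food and 1.2 of the second supply exactly 4 of nutrient A and 6 of nutrient B, at a cost of 6.8. The upper bound on nutrient B is slack.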

In [14]:
%pip install gnupg

Collecting gnupg
  Using cached gnupg-2.3.1-py3-none-any.whl
Installing collected packages: gnupg
Successfully installed gnupg-2.3.1
Note: you may need to restart the kernel to use updated packages.


In [15]:
from eep153_tools.sheets import read_sheets

DRI_url = "https://docs.google.com/spreadsheets/d/1y95IsQ4HKspPW3HHDtH7QMtlDA66IUsCHJLutVL-MMc/"

DRIs = read_sheets(DRI_url)

# Define *minimums*
diet_min = DRIs['diet_minimums'].set_index('Nutrition')

# Define *maximums*
diet_max = DRIs['diet_maximums'].set_index('Nutrition')

Key available for students@eep153.iam.gserviceaccount.com.




## Generic Berkeley Student Diet

In [16]:
from  scipy.optimize import linprog as lp
import numpy as np
import warnings

In [17]:
apikey = "sCD07VKZEF2pe7ewJNYSSWlOHY0nRMda34HLcp80"

In [18]:
%pip install pandas
%pip install gnupg

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


#### Prices for Generic Berkeley Diet

In [19]:
SHEETs = [# BERKELEY DIET foods, Berkeley prices
          ("https://docs.google.com/spreadsheets/d/11Ou4aZ8bE12J6dY9hmyUeCFFCNpplexnOGtfJVKdgbY/edit#gid=628663795","GENERIC"),
         ]

In [20]:
import pandas as pd
from eep153_tools.sheets import read_sheets

df = read_sheets(SHEETs[0][0])[SHEETs[0][1]]
df
df['FDC'] = pd.to_numeric(df['FDC'], errors='coerce').fillna(0).astype(int)

print(df)

Key available for students@eep153.iam.gserviceaccount.com.
        FDC                  Food  Quantity Units  Price                      \
0   2646170        Chicken Breast         1   lbs   4.99 NaN NaN NaN NaN NaN   
1   2100593        Chicken Thighs         1   lbs   2.99 NaN NaN NaN NaN NaN   
2   1990910  Ground Chicken (96%)         1   lbs   8.69 NaN NaN NaN NaN NaN   
3   2033779   Ground Turkey (93%)         1   lbs   3.99 NaN NaN NaN NaN NaN   
4   2546569  Smoked Turkey Breast         1   lbs  10.99 NaN NaN NaN NaN NaN   
..      ...                   ...       ...   ...    ...  ..  ..  ..  ..  ..   
58  2343973                  oats         1   lbs   1.92 NaN NaN NaN NaN NaN   
59  1727861    honey nut cheerios         1   lbs   5.92 NaN NaN NaN NaN NaN   
60  2423848          plain bagels         1   lbs   2.56 NaN NaN NaN NaN NaN   
61  2341163    plain cream cheese         1   lbs   7.04 NaN NaN NaN NaN NaN   
62  2093809                 bacon         1   lbs   7.68 NaN 

#### Nutritional Information for Berkeley Diet Foods

In [21]:
import fooddatacentral as fdc
import warnings

D = {}
# Look up the nutrient profile of each food by its FDC ID
for food, FDC in zip(df.Food, df.FDC):
    try:
        D[food] = fdc.nutrients(apikey,FDC).Quantity
    except AttributeError: 
        warnings.warn("Couldn't find FDC Code %s for food %s." % (FDC, food))        

FoodNutrients = pd.DataFrame(D,dtype=float)
FoodNutrients

Unnamed: 0,Chicken Breast,Chicken Thighs,Ground Chicken (96%),Ground Turkey (93%),Smoked Turkey Breast,Roasted Turkey Breast,Boneless Pork Loin chop,Ground Beef (80%),Steak,Pink Salmon,...,bananas,white bread,sliced cheddar cheese,potato chips,vanilla ice cream,oats,honey nut cheerios,plain bagels,plain cream cheese,bacon
Alanine,,,,,,,,,,,...,,,,,,,,,,
"Alcohol, ethyl",,,,,,,,,,,...,0.00,0.00,,,0.00,0.00,,,0.0,
Amino acids,,,,,,,,,,,...,,,,,,,,,,
Arginine,,,,,,,,,,,...,,,,,,,,,,
Ash,1.1290,,,,,,1.074,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Vitamin K (Menaquinone-4),,,,,,,,,,,...,,,,,,,,,,
Vitamin K (phylloquinone),,,,,,,,,,,...,0.10,0.20,,,0.30,2.00,,,2.1,
Vitamins and Other Components,,,,,,,,,,,...,,,,,,,,,,
Water,74.7800,,,,,,68.820,,,,...,75.60,35.70,,,61.00,10.80,,,52.6,


In [37]:
# Unit Conversion
# Convert food quantities to FDC units
df['FDC Quantity'] = df[['Quantity','Units']].T.apply(lambda x : fdc.units(x['Quantity'],x['Units']))

# Now may want to filter df by time or place--need to get a unique set of food names.
df['FDC Price'] = df['Price']/df['FDC Quantity']

df = df.dropna(subset=['FDC Price']) # Drop foods with a missing price (dropna is not in-place)

# To use minimum price observed
Prices = df.groupby('Food',sort=False)['FDC Price'].min()
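
The solver's warning message assumes prices per hectogram or deciliter, so a price quoted per pound has to be rescaled. The arithmetic can be sketched by hand as follows. This is not how `fdc.units` is implemented; `price_per_hectogram` is a hypothetical helper, and the sketch assumes all quantities are in pounds, as in the price sheet above.

```python
GRAMS_PER_POUND = 453.59237   # exact definition of the avoirdupois pound

def price_per_hectogram(price, quantity_lbs):
    """Convert a price for `quantity_lbs` pounds into dollars per 100 g."""
    hectograms = quantity_lbs * GRAMS_PER_POUND / 100.0
    return price / hectograms

chicken = price_per_hectogram(4.99, 1)   # chicken breast at $4.99 per lb
print(round(chicken, 2))
```

One pound is about 4.54 hectograms, so the per-hectogram price is a little under a quarter of the per-pound price.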



#### Berkeley Result

In [38]:
group = 'M 19-30'
tol = 1e-6

result = solve_subsistence_problem(FoodNutrients,Prices,diet_min[group],diet_max[group],tol=tol)

print("Cost of diet for %s is $%4.2f per day.\n" % (group,result.fun))

# Put back into nice series
diet = result.diet

print("\nDiet (in 100s of grams or milliliters):")
print(diet[diet >= tol])  # Drop items with quantities less than precision of calculation.
print()

tab = pd.DataFrame({"Outcome":np.abs(result.A).dot(diet),"Recommendation":np.abs(result.b)})
print("\nWith the following nutritional outcomes of interest:")
print(tab)
print()

print("\nConstraining nutrients are:")
excess = tab.diff(axis=1).iloc[:,1]
print(excess.loc[np.abs(excess) < tol*100].index.tolist())



TypeError: must be real number, not NoneType
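
This `TypeError` is what the `%4.2f` format raises when it is given `None`, which suggests that `result.fun` was `None` here, most likely because the solver found no feasible diet. A minimal sketch of a guard, assuming the result object exposes `success`, `fun`, and `message` as `linprog` results do; `describe_cost` is a hypothetical helper and the two stand-in results are for illustration only:

```python
from types import SimpleNamespace

def describe_cost(result, group):
    """Return a one-line cost summary, tolerating a failed solve.

    Hypothetical helper for illustration; not part of the notebook.
    """
    if not result.success or result.fun is None:
        return "No feasible diet found for %s: %s" % (group, result.message)
    return "Cost of diet for %s is $%4.2f per day." % (group, result.fun)

# Stand-ins that mimic the attributes of a solver result
failed = SimpleNamespace(success=False, fun=None, message="infeasible")
solved = SimpleNamespace(success=True, fun=40.05, message="ok")

print(describe_cost(failed, 'M 19-30'))
print(describe_cost(solved, 'M 19-30'))
```

Checking `result.success` before formatting turns the crash into a readable message, so it is clear that the food list cannot meet the constraints as given.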

## Mediterranean Diet

#### Prices for Mediterranean Diet

In [None]:
mSHEETs = [# MEDITERRANEAN foods, Berkeley prices
          ("https://docs.google.com/spreadsheets/d/11Ou4aZ8bE12J6dY9hmyUeCFFCNpplexnOGtfJVKdgbY/edit#gid=628663795","MED"),
         ]

In [None]:
import pandas as pd
from eep153_tools.sheets import read_sheets

mdf = read_sheets(mSHEETs[0][0])[mSHEETs[0][1]]
mdf
mdf['FDC'] = pd.to_numeric(mdf['FDC'], errors='coerce').fillna(0).astype(int)

print(mdf)

In [None]:
from eep153_tools.sheets import read_sheets

DRI_url = "https://docs.google.com/spreadsheets/d/1y95IsQ4HKspPW3HHDtH7QMtlDA66IUsCHJLutVL-MMc/"

DRIs = read_sheets(DRI_url)

# Define *minimums*
diet_min = DRIs['diet_minimums'].set_index('Nutrition')

# Define *maximums*
diet_max = DRIs['diet_maximums'].set_index('Nutrition')

#### Nutritional Information for Med Diet

In [None]:
import fooddatacentral as fdc
import warnings

D = {}
# Look up the nutrient profile of each food by its FDC ID
for food, FDC in zip(mdf.Food, mdf.FDC):
    try:
        D[food] = fdc.nutrients(apikey,FDC).Quantity
    except AttributeError: 
        warnings.warn("Couldn't find FDC Code %s for food %s." % (FDC, food))        

FoodNutrients = pd.DataFrame(D,dtype=float)
FoodNutrients

In [None]:
# Unit Conversion
# Convert food quantities to FDC units
mdf['FDC Quantity'] = mdf[['Quantity','Units']].T.apply(lambda x : fdc.units(x['Quantity'],x['Units']))

# Now may want to filter df by time or place--need to get a unique set of food names.
mdf['FDC Price'] = mdf['Price']/mdf['FDC Quantity']

mdf = mdf.dropna(subset=['FDC Price']) # Drop foods with a missing price (dropna is not in-place)

# To use minimum price observed
Prices = mdf.groupby('Food',sort=False)['FDC Price'].min()

#### Mediterranean Result

In [None]:
group = 'M 19-30'
tol = 1e-6

result = solve_subsistence_problem(FoodNutrients,Prices,diet_min[group],diet_max[group],tol=tol)

print("Cost of diet for %s is $%4.2f per day.\n" % (group,result.fun))

# Put back into nice series
diet = result.diet

print("\nDiet (in 100s of grams or milliliters):")
print(diet[diet >= tol])  # Drop items with quantities less than precision of calculation.
print()

tab = pd.DataFrame({"Outcome":np.abs(result.A).dot(diet),"Recommendation":np.abs(result.b)})
print("\nWith the following nutritional outcomes of interest:")
print(tab)
print()

print("\nConstraining nutrients are:")
excess = tab.diff(axis=1).iloc[:,1]
print(excess.loc[np.abs(excess) < tol*100].index.tolist())
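
The last step of the result cell flags the nutrients whose outcome sits exactly on its recommended bound, that is, the constraints that bind and therefore drive the cost. A minimal sketch of that step on a toy table; the numbers are hypothetical and are not taken from the notebook's data:

```python
import numpy as np
import pandas as pd

tol = 1e-6

# Hypothetical outcomes for three nutrients
toy_tab = pd.DataFrame(
    {"Outcome": [2400.0, 63.5, 1000.0],
     "Recommendation": [2400.0, 56.0, 1000.0]},
    index=["Energy", "Protein", "Calcium, Ca"])

# diff(axis=1) subtracts each column from the one to its right,
# so column 1 holds Recommendation - Outcome.
gap = toy_tab.diff(axis=1).iloc[:, 1]

# A nutrient is binding when its gap is zero up to numerical tolerance.
binding = gap.loc[np.abs(gap) < tol * 100].index.tolist()
print(binding)
```

Here energy and calcium are binding, while protein exceeds its minimum by 7.5 and is not.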

## Okinawa Diet

#### Prices for Okinawa Diet

In [31]:
oSHEETs = [# OKINAWA foods, Berkeley prices
          ("https://docs.google.com/spreadsheets/d/11Ou4aZ8bE12J6dY9hmyUeCFFCNpplexnOGtfJVKdgbY/edit#gid=628663795","OKINAWA"),
         ]

In [32]:
import pandas as pd
from eep153_tools.sheets import read_sheets

odf = read_sheets(oSHEETs[0][0])[oSHEETs[0][1]]
odf
odf['FDC'] = pd.to_numeric(odf['FDC'], errors='coerce').fillna(0).astype(int)

print(odf)

Key available for students@eep153.iam.gserviceaccount.com.




        FDC                Food  Quantity Units   Price
0    451884      Sweet potatoes         1   lbs    1.99
1   2345512             Seaweed         1   lbs   79.84
2   2029705                Kelp         1   lbs  184.00
3   2029502       Bamboo shoots         1   lbs   19.36
4   2345503      Daikon raddish         1   lbs    2.49
5   1548192        Bitter melon         1   lbs    5.99
6    169975             Cabbage         1   lbs    1.49
7   2079038             Carrots         1   lbs    1.49
8    169260        Chinese okra         1   lbs    6.72
9   2653425             Pumpkin         1   lbs    4.64
10   169926        Green papaya         1   lbs    3.99
11  2343861              Millet         1   lbs   17.45
12  2343200               Wheat         1   lbs    4.00
13   356554                Rice         1   lbs    0.80
14  2008214   Buckwheat noodles         1   lbs    7.52
15  2294522                Tofu         1   lbs    3.36
16  2342914                Miso         1   lbs 

#### Nutritional Information for Okinawa Diet

In [33]:
import fooddatacentral as fdc
import warnings

D = {}
# Look up the nutrient profile of each food by its FDC ID
for food, FDC in zip(odf.Food, odf.FDC):
    try:
        D[food] = fdc.nutrients(apikey,FDC).Quantity
    except AttributeError: 
        warnings.warn("Couldn't find FDC Code %s for food %s." % (FDC, food))        

FoodNutrients = pd.DataFrame(D,dtype=float)
FoodNutrients

Unnamed: 0,Sweet potatoes,Seaweed,Kelp,Bamboo shoots,Daikon raddish,Bitter melon,Cabbage,Carrots,Chinese okra,Pumpkin,...,Eggplant,Barley,Organic Goji Berry,Pineapple,Mango,Shittake Mushrooms,Spinach,Ginger,Okra,Black Plum
"Ergosta-5,7-dienol",,,,,,,,,,,...,,,,,,3.6070,,,,
"Ergosta-7,22-dienol",,,,,,,,,,,...,,,,,,1.5070,,,,
Alanine,,,,,,,0.042,,0.073,,...,,,,,,,,,,
"Alcohol, ethyl",,0.00,,,0.00,,0.000,,0.000,,...,0.00,,,,,,,,,0.0
Amino acids,,,,,,,0.000,,0.000,,...,,,,,,0.0000,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Vitamin K (Dihydrophylloquinone),,,,,,,0.000,,,,...,,,,,,,,,,
Vitamin K (phylloquinone),1.20,25.00,,,3.50,,76.000,,31.300,,...,3.50,,,,,,,,,6.4
Vitamins and Other Components,,,,,,,0.000,,0.000,,...,,,,,,0.0000,,,,
Water,,6.68,,,92.60,,92.180,,89.580,,...,92.30,,,,,88.6000,,,,87.2


In [39]:
# Unit Conversion
# Convert food quantities to FDC units
odf['FDC Quantity'] = odf[['Quantity','Units']].T.apply(lambda x : fdc.units(x['Quantity'],x['Units']))

# Now may want to filter df by time or place--need to get a unique set of food names.
odf['FDC Price'] = odf['Price']/odf['FDC Quantity']

odf = odf.dropna(subset=['FDC Price']) # Drop foods with a missing price (dropna is not in-place)

# To use minimum price observed
Prices = odf.groupby('Food',sort=False)['FDC Price'].min()



In [40]:
from eep153_tools.sheets import read_sheets

DRI_url = "https://docs.google.com/spreadsheets/d/1y95IsQ4HKspPW3HHDtH7QMtlDA66IUsCHJLutVL-MMc/"

DRIs = read_sheets(DRI_url)

# Define *minimums*
diet_min = DRIs['diet_minimums'].set_index('Nutrition')

# Define *maximums*
diet_max = DRIs['diet_maximums'].set_index('Nutrition')

Key available for students@eep153.iam.gserviceaccount.com.




#### Okinawa Result

In [41]:
group = 'M 19-30'
tol = 1e-6

result = solve_subsistence_problem(FoodNutrients,Prices,diet_min[group],diet_max[group],tol=tol)

print("Cost of diet for %s is $%4.2f per day.\n" % (group,result.fun))

# Put back into nice series
diet = result.diet

print("\nDiet (in 100s of grams or milliliters):")
print(diet[diet >= tol])  # Drop items with quantities less than precision of calculation.
print()

tab = pd.DataFrame({"Outcome":np.abs(result.A).dot(diet),"Recommendation":np.abs(result.b)})
print("\nWith the following nutritional outcomes of interest:")
print(tab)
print()

print("\nConstraining nutrients are:")
excess = tab.diff(axis=1).iloc[:,1]
print(excess.loc[np.abs(excess) < tol*100].index.tolist())

Cost of diet for M 19-30 is $40.05 per day.


Diet (in 100s of grams or milliliters):
Daikon raddish    12.252199
Chinese okra       5.343837
Green papaya       3.328007
White fish         0.916031
Eggplant           6.988189
Black Plum        21.821757
dtype: float64


With the following nutritional outcomes of interest:
                                     Outcome  Recommendation
Nutrition                                                   
Energy                           3100.000000          2400.0
Protein                            63.462306            56.0
Fiber, total dietary               92.651216            33.6
Folate, DFE                       977.652075           400.0
Calcium, Ca                      1060.761876          1000.0
Carbohydrate, by difference       406.111461           130.0
Iron, Fe                           13.899282             8.0
Magnesium, Mg                     772.328634           400.0
Niacin                             24.545408            16.0
Phosp