# **`Project 2: Team Thomas Allinson`**

### **Objective**: Analyze the comparative costs of a vegan diet versus an omnivorous diet within the American population, with a specific focus on their environmental impact.

#### Group Members:
> Johann: johann.dicken@berkeley.edu <br>
> Laure: laureho@berkeley.edu <br>
> Reily: reilyjean@berkeley.edu <br>
> Carmen: carmenvega@berkeley.edu <br>
> Steven: k1519632@berkeley.edu <br>

### **[A]: Description of population of interest**

The project looks at the cost differences between vegan and omnivorous diets in the U.S., focusing on their environmental impact. It examines factors like water consumption, CO2 emissions, and land use tied to different diets. Specifically, we are concentrating on the demographic of women ages in 19-30 age range.

### **[A]: Dietary Reference Intakes**

In [1]:
import pandas as pd
import numpy as np

In [21]:
# Import Dietary Requirements spreadsheet data as a pd.DataFrame
diet_min = pd.read_csv('Dietary_Requirements.csv')
diet_min.head()

Unnamed: 0,Nutrition,Source,C 1-3,F 4-8,M 4-8,F 9-13,M 9-13,F 14-18,M 14-18,F 19-30,M 19-30,F 31-50,M 31-50,F 51+,M 51+
0,Energy,---,1000.0,1200.0,1400.0,1600.0,1800.0,1800.0,2200.0,2000.0,2400.0,1800.0,2200.0,1600.0,2000.0
1,Protein,RDA,13.0,19.0,19.0,34.0,34.0,46.0,52.0,46.0,56.0,46.0,56.0,46.0,56.0
2,"Fiber, total dietary",---,14.0,16.8,19.6,22.4,25.2,25.2,30.8,28.0,33.6,25.2,30.8,22.4,28.0
3,"Folate, DFE",RDA,150.0,200.0,200.0,300.0,300.0,400.0,400.0,400.0,400.0,400.0,400.0,400.0,400.0
4,"Calcium, Ca",RDA,700.0,1000.0,1000.0,1300.0,1300.0,1300.0,1300.0,1000.0,1000.0,1000.0,1000.0,1200.0,1000.0


Dietary function takes 2 arguments: `age`, a positive integer, and `sex`, a string (not case-senstitive) with the classification of male, female, or child.

In [22]:
def dietary_ref(age, sex):

    # Validate age input
    if not isinstance(age, int) or age <= 0:
        return "Incorrect age input. Please enter a positive integer for the age."
    
    # Normalize and validate sex input
    sex = sex.lower()
    if sex not in ['male', 'female', 'child']:
        return "Incorrect sex input. Input must be Male, Female, or Child."
    
    # Determine the appropriate column based on age and sex
    if sex == 'child':
        if age <= 3:
            col_name = 'C 1-3'
        elif age <= 8:
            col_name = 'C 4-8'
        else:
            return "Age out of range for child category."
    else:
        if age <= 8:
            col_name = f"{'F' if sex == 'female' else 'M'} 4-8"
        elif age <= 13:
            col_name = f"{'F' if sex == 'female' else 'M'} 9-13"
        elif age <= 18:
            col_name = f"{'F' if sex == 'female' else 'M'} 14-18"
        elif age <= 30:
            col_name = f"{'F' if sex == 'female' else 'M'} 19-30"
        elif age <= 50:
            col_name = f"{'F' if sex == 'female' else 'M'} 31-50"
        else:
            col_name = f"{'F' if sex == 'female' else 'M'} 51+"
    
    # Extract and return the relevant nutrient recommendations
    if col_name in diet_min.columns:
        return diet_min[['Nutrition', col_name]].set_index('Nutrition')[col_name]
    else:
        return "Matching column not found in DataFrame. Check the column names."

In [23]:
# Example usage
dietary_ref(15, 'Male')

Nutrition
Energy                            2200.0
Protein                             52.0
Fiber, total dietary                30.8
Folate, DFE                        400.0
Calcium, Ca                       1300.0
Carbohydrate, by difference        130.0
Iron, Fe                            11.0
Magnesium, Mg                      410.0
Niacin                              16.0
Phosphorus, P                     1250.0
Potassium, K                      4700.0
Riboflavin                           1.3
Thiamin                              1.2
Vitamin A, RAE                     900.0
Vitamin B-12                         2.4
Vitamin B-6                          1.3
Vitamin C, total ascorbic acid      75.0
Vitamin E (alpha-tocopherol)        15.0
Vitamin K (phylloquinone)           75.0
Zinc, Zn                            11.0
Name: M 14-18, dtype: float64

### **[A]: Data on prices for different foods**

In [24]:
apikey = "KNqUDtV7Kcktiuheo3EoNhB0zDlCevFAdqZrKgdj" 
%pip install -r requirements.txt --upgrade
import fooddatacentral as fdc

Collecting gspread (from -r requirements.txt (line 20))
  Using cached gspread-6.2.0-py3-none-any.whl.metadata (11 kB)
Note: you may need to restart the kernel to use updated packages.


> ### Omnivore Diet

In [25]:
import re

# Load the CSV file into a DataFrame
df = pd.read_csv('food_and_prices1.csv')

# Define a regex pattern for common animal products
animal_product_pattern = r'\b(butter|cheese|milk|kefir|whey|eggnog|mascarpone|mozzarella|stracchino|parmigiano|beef|turbot|cod|ricotta|chicken|carp|salmon|bacon|trout|mealworms|dulce|pork|egg|fish|lamb|turkey|turtle|breast|mollusks|frog|thigh|yogurt|honey|gelatin|cream|lard|sausage|anchovy|shellfish|shrimp|mayo|ham|meat)\b'

# Create a new column 'animal product' that marks items based on the pattern
df['animal product'] = df['Food commodity ITEM'].apply(
    lambda x: 'animal product' if re.search(animal_product_pattern, str(x), re.IGNORECASE) else 'plant-based'
)

# Display the updated DataFrame
df.rename(columns={'Food commodity ITEM': 'Food'}, inplace=True)
df = df[df['Food'] != 'YEAST COMPRESSED*']
df['Average Price per 100g (USD)'] = df['Average Price per kg (USD)']/10
#df.drop('YEAST COMPRESSED*'
#df[df['animal product'] == 'plant-based'].head(100)
df.set_index('Food', inplace=True)
df.head()

Unnamed: 0_level_0,Carbon Footprint kg CO2eq/kg or l of food ITEM,Water Footprint liters water/kg o liter of food ITEM,FDC ID,FDC Food Name,Average Price per kg (USD),animal product,Average Price per 100g (USD)
Food,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
CHOCOLATE OR CREAM FILLED COOKIES**,1.53,2902.0,2707915,"Cookie, chocolate or fudge",0.17,animal product,0.017
SIMPLE COOKIES**,1.39,1723.0,2707964,"Cookie, shortbread",0.12,plant-based,0.012
BREAD MULTICEREAL**,0.7,771.0,2707777,"Bread, multigrain",0.14,plant-based,0.014
BREAD PLAIN**,0.89,1031.0,174929,"Bread, sticks, plain",0.14,plant-based,0.014
BREAD WHOLE**,0.77,887.0,2707709,"Bread, whole wheat",0.08,plant-based,0.008


### **[A]: Nutritional content of different foods**

In [26]:
D = {}
count = 0
for food in df.index:
        try:
            FDC = df.loc[df.index==food,:]['FDC ID'][0]
            count+=1
            D[food] = fdc.nutrients(apikey,FDC).Quantity
            #print(D[food])
            #print(food)
        except AttributeError:
            warnings.warn(f"Couldn't find FDC Code {FDC} for food {food}.")
    
D = pd.DataFrame(D,dtype=float)
D

Unnamed: 0,CHOCOLATE OR CREAM FILLED COOKIES**,SIMPLE COOKIES**,BREAD MULTICEREAL**,BREAD PLAIN**,BREAD WHOLE**,FLAVORED CRACKERS**,PLAIN CRACKERS**,WHOLEGRAIN CRACKERS**,CRISPBREAD**,KETCHUP,...,ARTICHOKE,BROCCOLI,CAULIFLOWER,CUCUMBER,PEPPER,TOMATO,ZUCCHINI,COD,SALMON,TROUT
Alanine,,,,0.395,,,,,,,...,,,,,,,,,,
"Alcohol, ethyl",0.00,0.00,0.0,0.000,0.00,,0.00,0.00,,0.00,...,0.0,,,0.00,,,,,,0.00
Amino acids,,,,0.000,,,,,,,...,,,,,,,,,,
Arginine,,,,0.432,,,,,,,...,,,,,,,,,,
Ash,,,,3.900,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Vitamin K (Dihydrophylloquinone),,,,,,,,,,,...,,,,,,,,,,
Vitamin K (phylloquinone),2.40,11.00,1.4,2.200,7.80,,69.30,36.00,,3.00,...,14.8,,,14.60,,,,,,12.00
Vitamins and Other Components,,,,0.000,,,,,,,...,,,,,,,,,,
Water,4.50,3.60,36.9,6.100,38.70,,3.14,2.50,,68.50,...,84.1,,,90.40,,,,,,53.20


### **[A]: Solution**

In [27]:
import warnings
import fooddatacentral as fdc

bmin = diet_min.set_index('Nutrition', inplace= True)

bmin = diet_min
bmin = bmin.drop('Source',axis=1)
#bmax
bmax = pd.read_csv('diet_max.csv')

bmax = bmax.drop('Source',axis=1) #
bmax = bmax.set_index('Nutrition')

#bmin

b = pd.concat([bmin,-bmax]) # Note sign change for max constraints

b

Unnamed: 0_level_0,C 1-3,F 4-8,M 4-8,F 9-13,M 9-13,F 14-18,M 14-18,F 19-30,M 19-30,F 31-50,M 31-50,F 51+,M 51+
Nutrition,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1
Energy,1000.0,1200.0,1400.0,1600.0,1800.0,1800.0,2200.0,2000.0,2400.0,1800.0,2200.0,1600.0,2000.0
Protein,13.0,19.0,19.0,34.0,34.0,46.0,52.0,46.0,56.0,46.0,56.0,46.0,56.0
"Fiber, total dietary",14.0,16.8,19.6,22.4,25.2,25.2,30.8,28.0,33.6,25.2,30.8,22.4,28.0
"Folate, DFE",150.0,200.0,200.0,300.0,300.0,400.0,400.0,400.0,400.0,400.0,400.0,400.0,400.0
"Calcium, Ca",700.0,1000.0,1000.0,1300.0,1300.0,1300.0,1300.0,1000.0,1000.0,1000.0,1000.0,1200.0,1000.0
"Carbohydrate, by difference",130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0,130.0
"Iron, Fe",7.0,10.0,10.0,8.0,8.0,15.0,11.0,18.0,8.0,18.0,8.0,8.0,8.0
"Magnesium, Mg",80.0,130.0,130.0,240.0,240.0,360.0,410.0,310.0,400.0,320.0,420.0,320.0,420.0
Niacin,6.0,8.0,8.0,12.0,12.0,14.0,16.0,14.0,16.0,14.0,16.0,14.0,16.0
"Phosphorus, P",460.0,500.0,500.0,1250.0,1250.0,1250.0,1250.0,700.0,700.0,700.0,700.0,700.0,700.0


In [29]:
from  scipy.optimize import linprog as lp
import numpy as np

def get_grub(sex_age_group, diet, df):
    if (diet == 'vegan') or (diet == 'plant-based'):
        df = df[df['animal product'] == 'plant-based']
    D = {}
    count = 0
    
    
    for food in df.index:
        try:
            FDC = df.loc[df.index==food,:]['FDC ID'][0]
            count+=1
            D[food] = fdc.nutrients(apikey,FDC).Quantity
            #print(D[food])
            #print(food)
        except AttributeError:
            warnings.warn(f"Couldn't find FDC Code {FDC} for food {food}.")
    
    D = pd.DataFrame(D,dtype=float)
    df.dropna(how='any') # Drop food with any missing data

    Prices = df.groupby('Food')['Average Price per 100g (USD)'].min()
    p = Prices.apply(lambda x:x).dropna()
    
    # Compile list that we have both prices and nutritional info for; drop if either missing
    use = p.index.intersection(D.columns)
    p = p[use]
    tol = 1e-6 # Numbers in solution smaller than this (in absolute value) treated as zeros

    Aall = D[p.index].fillna(0)

    # Drop rows of A that we don't have constraints for.
    Amin = Aall.loc[bmin.index]
    
    Amax = Aall.loc[bmax.index]
    
    # Maximum requirements involve multiplying constraint by -1 to make <=.
    A = pd.concat([Amin,-Amax])
    ## Choose sex/age group!
    result =  lp(p, -A, -b[sex_age_group], method='highs')
    print(f"Cost of diet for {group} is ${result.fun:.2f} per day.")
    # Put back into nice series
    diet = pd.Series(result.x,index=p.index)
    
    print("\nYou'll be eating (in 100s of grams or milliliters):")
    print(diet[diet >= tol])  # Drop items with quantities less than precision of calculation.
    return diet

group = "F 19-30"
diet = 'not vegan'
#result = get_grub(group, diet, df)

solution = get_grub(group, diet, df)

Cost of diet for F 19-30 is $0.07 per day.

You'll be eating (in 100s of grams or milliliters):
AVOCADO                 1.422686
BEEF BONE FREE MEAT*    0.579366
CARROT                  0.809701
GARLIC                  0.585595
MANGO                   2.295733
ONION                   0.001165
PEPPER                  0.171361
SOY BURGER              0.233006
SOYBEAN                 1.300371
SUNFLOWER SEED          0.892521
TOMATO & BASIL          0.798631
dtype: float64


### **[B]: Is your solution edible?**

Yes, our solution is edible! Here is our recipe:

**Veggie Burger with Potato bun & Overnight oats**

*Total cost/person: $3.49*

*Ingredients:*

> *Beyond Burger* <br>
> *Carrot* <br>
> *Potato* <br>
> *Peas* <br>
> *Soy oil* <br>
> *Oats* <br>
> *Milk* <br>
> *Lettuce* <br>

### **[B]: What is total cost for population of interest?**

In [30]:
%pip install wbdata
import wbdata
import warnings
warnings.simplefilter("ignore")

Note: you may need to restart the kernel to use updated packages.


Population function:

In [31]:
countries = wbdata.get_countries()
country_dict = {}

for country in countries:
    country_code = country['id']
    country_name = country['name']
    country_dict[country_name] = country_code

def int_to_str(num):
    """Convert the integer to the proper format."""
    if 0 <= num < 10:
        return f"0{num}"
    else:
        return str(num)

def population_range(year, sex, age_range, place):
    """Return the population for a certain age range by calling wbdata."""
    sex_codes = {"people": "", "females": "FE", "males": "MA"}
    sex_used = sex_codes[sex]
    lower, upper = int_to_str(age_range[0]), int_to_str(age_range[1])
    range_string = lower + upper
    country_code = country_dict.get(place)
    df = wbdata.get_dataframe({"SP.POP." + range_string + "." + sex_used: "Population"},
                              country= {country_code: place}).squeeze()
    df = df.to_frame().reset_index()
    population_total = int(df[df["date"] == str(year)]["Population"].iloc[0])
    return population_total

def over_80_pop(year, sex, place):
    sex_codes = {"people": "", "females": "FE", "males": "MA"}
    sex_used = sex_codes[sex]
    country_code = country_dict.get(place)

    df = wbdata.get_dataframe({"SP.POP." + "80UP" + "." + sex_used: "Population"},
                              country= {country_code: place}).squeeze()
    df = df.to_frame().reset_index()
    population_total = int(df[df["date"] == str(year)]["Population"].iloc[0])
    return population_total

def dict_helper(year, sex, age_range, place):
    """Expand function to include every age specified possible."""
    if len(age_range) == 1:
        age_range = [age_range[0], age_range[0]]
    elif age_range[1] < age_range[0]:
      raise ValueError(f"Please ensure that the second value in the range is greater than the first.")

    minimum_age, maximum_age = age_range
    possible_minimums = [i for i in range(0, 76, 5)]
    possible_maximums = [i for i in range(4, 80, 5)]

    my_dict = {}
    for age in range(minimum_age, maximum_age + 1):
        """Find the index in the possible ranges that includes the current age."""
        range_index = next((i for i, min_val in enumerate(possible_minimums) if
                            min_val <= age and age <= possible_maximums[i]), None)
        if range_index is not None:
            popl_value = population_range(year, sex, [possible_minimums[range_index], possible_maximums[range_index]], place) // 5
            my_dict[age] = popl_value
        elif age >= 80 and age <= 100:
            my_dict[age] = over_80_pop(year, sex, place) / 20
        else:
            raise ValueError(f"No age range available for age {age}")
    return my_dict

def population(year, sex = "all", age_range = [0,100], place = "World"):
    if place not in country_dict:
        valid_regions = ", ".join(country_dict.keys())
        raise ValueError(f"The region '{place}' is not valid. Please choose from the following regions: {valid_regions}")
    if sex in ["all", "people", "p", "P", "People", "All", "Everyone"]:
      female_dict = dict_helper(year, "females", age_range, place)
      male_dict = dict_helper(year, "males", age_range, place)
      return sum(female_dict.values()) + sum(male_dict.values())
    elif sex in ["female", "females", "f", "Female", "Females", "F", "FE"]:
      female_dict = dict_helper(year, "females", age_range, place)
      return sum(female_dict.values())
    elif sex in ["male", "males", "m", "Male", "Males", "M", "MA"]:
      male_dict = dict_helper(year, "males", age_range, place)
      return sum(male_dict.values())

Dataframe:

In [32]:
def create_population_dataframe(regions, years, age_range):
    data = []

    if len(age_range) == 1:
        full_age_range = [age_range[0]]
    else:
        full_age_range = list(range(age_range[0], age_range[1] + 1))

    for region in regions:
        for year in years:
            row = {'Region': region, 'Year': year}
            for age in full_age_range:
                male_population = population(year, 'male', [age], region)
                female_population = population(year, 'female', [age], region)

                row[f'Male Population Age {age}'] = male_population
                row[f'Female Population Age {age}'] = female_population
            data.append(row)

    df = pd.DataFrame(data)
    df.set_index(['Region', 'Year'], inplace=True)
    return df

usa_pop_df = create_population_dataframe(["United States"], [2023], [0, 80])

groupings = {
    "C 1-3": [
        "Male Population Age 1", "Male Population Age 2", "Male Population Age 3",
        "Female Population Age 1", "Female Population Age 2", "Female Population Age 3"
    ],
    "F 4-8": [
        "Female Population Age 4", "Female Population Age 5", "Female Population Age 6",
        "Female Population Age 7", "Female Population Age 8"
    ],
    "M 4-8": [
        "Male Population Age 4", "Male Population Age 5", "Male Population Age 6",
        "Male Population Age 7", "Male Population Age 8"
    ],
    "F 9-13": [
        "Female Population Age 9", "Female Population Age 10", "Female Population Age 11",
        "Female Population Age 12", "Female Population Age 13"
    ],
    "M 9-13": [
        "Male Population Age 9", "Male Population Age 10", "Male Population Age 11",
        "Male Population Age 12", "Male Population Age 13"
    ],
    "F 14-18": [
        "Female Population Age 14", "Female Population Age 15", "Female Population Age 16",
        "Female Population Age 17", "Female Population Age 18"
    ],
    "M 14-18": [
        "Male Population Age 14", "Male Population Age 15", "Male Population Age 16",
        "Male Population Age 17", "Male Population Age 18"
    ],
    "F 19-30": [
        "Female Population Age 19", "Female Population Age 20", "Female Population Age 21",
        "Female Population Age 22", "Female Population Age 23", "Female Population Age 24",
        "Female Population Age 25", "Female Population Age 26", "Female Population Age 27",
        "Female Population Age 28", "Female Population Age 29", "Female Population Age 30"
    ],
    "M 19-30": [
        "Male Population Age 19", "Male Population Age 20", "Male Population Age 21",
        "Male Population Age 22", "Male Population Age 23", "Male Population Age 24",
        "Male Population Age 25", "Male Population Age 26", "Male Population Age 27",
        "Male Population Age 28", "Male Population Age 29", "Male Population Age 30"
    ],
    "F 31-50": [
        "Female Population Age 31", "Female Population Age 32", "Female Population Age 33",
        "Female Population Age 34", "Female Population Age 35", "Female Population Age 36",
        "Female Population Age 37", "Female Population Age 38", "Female Population Age 39",
        "Female Population Age 40", "Female Population Age 41", "Female Population Age 42",
        "Female Population Age 43", "Female Population Age 44", "Female Population Age 45",
        "Female Population Age 46", "Female Population Age 47", "Female Population Age 48",
        "Female Population Age 49", "Female Population Age 50"
    ],
    "M 31-50": [
        "Male Population Age 31", "Male Population Age 32", "Male Population Age 33",
        "Male Population Age 34", "Male Population Age 35", "Male Population Age 36",
        "Male Population Age 37", "Male Population Age 38", "Male Population Age 39",
        "Male Population Age 40", "Male Population Age 41", "Male Population Age 42",
        "Male Population Age 43", "Male Population Age 44", "Male Population Age 45",
        "Male Population Age 46", "Male Population Age 47", "Male Population Age 48",
        "Male Population Age 49", "Male Population Age 50"
    ],
    "F 51+": [
        "Female Population Age 51", "Female Population Age 52", "Female Population Age 53",
        "Female Population Age 54", "Female Population Age 55", "Female Population Age 56",
        "Female Population Age 57", "Female Population Age 58", "Female Population Age 59",
        "Female Population Age 60", "Female Population Age 61", "Female Population Age 62",
        "Female Population Age 63", "Female Population Age 64", "Female Population Age 65",
        "Female Population Age 66", "Female Population Age 67", "Female Population Age 68",
        "Female Population Age 69", "Female Population Age 70", "Female Population Age 71",
        "Female Population Age 72", "Female Population Age 73", "Female Population Age 74",
        "Female Population Age 75", "Female Population Age 76", "Female Population Age 77",
        "Female Population Age 78", "Female Population Age 79", "Female Population Age 80"
    ],
    "M 51+": [
        "Male Population Age 51", "Male Population Age 52", "Male Population Age 53",
        "Male Population Age 54", "Male Population Age 55", "Male Population Age 56",
        "Male Population Age 57", "Male Population Age 58", "Male Population Age 59",
        "Male Population Age 60", "Male Population Age 61", "Male Population Age 62",
        "Male Population Age 63", "Male Population Age 64", "Male Population Age 65",
        "Male Population Age 66", "Male Population Age 67", "Male Population Age 68",
        "Male Population Age 69", "Male Population Age 70", "Male Population Age 71",
        "Male Population Age 72", "Male Population Age 73", "Male Population Age 74",
        "Male Population Age 75", "Male Population Age 76", "Male Population Age 77",
        "Male Population Age 78", "Male Population Age 79", "Male Population Age 80"
    ]
}

# Loop through each group and sum the relevant columns
for group, cols in groupings.items():
    usa_pop_df[group] = usa_pop_df[cols].sum(axis=1)

# After summing the populations, drop the original age columns
columns_to_drop = [col for sublist in groupings.values() for col in sublist]
usa_pop_df.drop(columns=columns_to_drop, inplace=True)
usa_pop_df

Unnamed: 0_level_0,Unnamed: 1_level_0,Male Population Age 0,Female Population Age 0,C 1-3,F 4-8,M 4-8,F 9-13,M 9-13,F 14-18,M 14-18,F 19-30,M 19-30,F 31-50,M 31-50,F 51+,M 51+
Region,Year,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1
United States,2023,1859555,1776227,10907346,9500935,9992427,10058653,10646674,10579645,11242472,25700622,27479410,43051959,45568239,53186441.55,50890359.2


Total cost function:

In [33]:
def total_cost_pop(sex_age_group, diet, df, parameter = 'price'):
    
    if parameter == 'price':
        var1 = get_grub(sex_age_group, diet, df)
    elif parameter == 'co2':
        var1 = get_grub_cotwo(sex_age_group, diet, df)
    else:
        return "Invalid parameter."
    
    cost_match = re.search(r"\$([0-9]+\.[0-9]+)", var1)
    cost = float(cost_match.group(1))
    
    return cost * usa_pop_df[sex_age_group].index(0)

In [34]:
#Total cost F 19-30:
total_cost_pop('F 19-30', 'not vegan', df, 'price')

Cost of diet for M 51+ is $0.07 per day.

You'll be eating (in 100s of grams or milliliters):
AVOCADO                 1.422686
BEEF BONE FREE MEAT*    0.579366
CARROT                  0.809701
GARLIC                  0.585595
MANGO                   2.295733
ONION                   0.001165
PEPPER                  0.171361
SOY BURGER              0.233006
SOYBEAN                 1.300371
SUNFLOWER SEED          0.892521
TOMATO & BASIL          0.798631
dtype: float64


TypeError: expected string or bytes-like object, got 'Series'

### **[C]: Sensitivity of Solution**

In [17]:
# Code here

### **[X]: Cost comparison: Co2 emissions**

NameError: name 'Aall' is not defined

In [24]:
alpha = 0.1


#social cost of a ton of carbon is $185 so 0.0185 per 100 g
df['Price with Co2 Social Cost per 100g (USD)'] = (0.0185 * df['Carbon Footprint kg CO2eq/kg or l of food ITEM']) + df['Average Price per 100g (USD)']

def get_grub_cotwo(sex_age_group, diet, df):
    if (diet == 'vegan') or (diet == 'plant-based'):
        df = df[df['animal product'] == 'plant-based']
    D = {}
    count = 0
    
    
    for food in df.index:
        try:
            FDC = df.loc[df.index==food,:]['FDC ID'][0]
            count+=1
            D[food] = fdc.nutrients(apikey,FDC).Quantity
            #print(D[food])
            #print(food)
        except AttributeError:
            warnings.warn(f"Couldn't find FDC Code {FDC} for food {food}.")
    
    D = pd.DataFrame(D,dtype=float)
    df.dropna(how='any') # Drop food with any missing data

    Prices = df.groupby('Food')['Price with Co2 Social Cost per 100g (USD)'].min()
    p = Prices.apply(lambda x:x).dropna()
    
    # Compile list that we have both prices and nutritional info for; drop if either missing
    use = p.index.intersection(D.columns)
    p = p[use]
    tol = 1e-6 # Numbers in solution smaller than this (in absolute value) treated as zeros

    Aall = D[p.index].fillna(0)
    display(Aall)
    # Drop rows of A that we don't have constraints for.
    Amin = Aall.loc[bmin.index]
    
    Amax = Aall.loc[bmax.index]
    
    # Maximum requirements involve multiplying constraint by -1 to make <=.
    A = pd.concat([Amin,-Amax])
    ## Choose sex/age group!
    result =  lp(p, -A, -b[sex_age_group], method='highs')
    print(f"Cost of diet for {group} considering Co2 social cost is ${result.fun:.2f} per day.")
    # Put back into nice series
    diet = pd.Series(result.x,index=p.index)
    
    print("\nYou'll be eating (in 100s of grams or milliliters):")
    print(diet[diet >= tol])  # Drop items with quantities less than precision of calculation.
    return diet

group = "F 19-30"
diet = 'vegan'
#result = get_grub(group, diet, df)

solution = get_grub_cotwo(group, diet, df)


  FDC = df.loc[df.index==food,:]['FDC ID'][0]


Unnamed: 0,APPLE,APRICOT,ARTICHOKE,AVOCADO,BANANA,BARLEY,BEAN,BEANS (F),BEET SUGAR,BLACKBERRY,...,TOMATO,TOMATO & BASIL,TOMATO ARRABBIATA,TOMATO PEELED,TOMATO PUREE,TURNIP,WATERMELON,WHEAT,WHOLEGRAIN CRACKERS**,ZUCCHINI
Alanine,0.0,0.0,0.0,0.00,0.0,0.00,0.00,0.00,0.00,0.0,...,0.0,0.00,0.0,0.0,0.0,0.147,0.0,0.00,0.00,0.0
"Alcohol, ethyl",0.0,0.0,0.0,0.00,0.0,0.00,0.00,0.00,0.00,0.0,...,0.0,0.00,0.0,0.0,0.0,0.000,0.0,0.00,0.00,0.0
Amino acids,0.0,0.0,0.0,0.00,0.0,0.00,0.00,0.00,0.00,0.0,...,0.0,0.00,0.0,0.0,0.0,0.000,0.0,0.00,0.00,0.0
Arginine,0.0,0.0,0.0,0.00,0.0,0.00,0.00,0.00,0.00,0.0,...,0.0,0.00,0.0,0.0,0.0,0.128,0.0,0.00,0.00,0.0
Ash,0.0,0.0,0.0,0.00,0.0,0.00,0.00,0.00,0.00,0.0,...,0.0,0.00,0.0,0.0,0.0,0.830,0.0,0.00,0.00,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Vitamin K (Dihydrophylloquinone),0.0,0.0,0.0,0.00,0.0,0.00,0.00,0.00,0.00,0.0,...,0.0,0.00,0.0,0.0,0.0,0.000,0.0,0.00,0.00,0.0
Vitamin K (phylloquinone),0.0,0.0,14.8,21.00,0.0,2.10,1.70,1.70,0.30,0.0,...,0.0,27.20,0.0,0.0,0.0,0.000,0.0,4.90,36.00,0.0
Vitamins and Other Components,0.0,0.0,0.0,0.00,0.0,0.00,0.00,0.00,0.00,0.0,...,0.0,0.00,0.0,0.0,0.0,0.000,0.0,0.00,0.00,0.0
Water,0.0,0.0,84.1,73.20,0.0,67.10,73.00,73.00,81.90,0.0,...,0.0,76.70,0.0,0.0,0.0,93.130,0.0,35.20,2.50,0.0


Cost of diet for F 19-30 considering Co2 social cost is $3.49 per day.

You'll be eating (in 100s of grams or milliliters):
CARROT           4.737367
COWPEA           2.232558
LETTUCE          0.073666
MANGO            0.759613
OAT              0.962603
ORANGE JUICE     2.212507
SOY BURGER       1.194030
SOYBEAN          1.197894
SUNFLOWER OIL    0.157119
dtype: float64


### **[X]: Cost comparison: Water usage**