# 3. REFINED RESULTS AND STATISTICS

Nutrient values taken from the USDA database are assumed to be given per 100 g of the food item.
In the IASI database, FOOD_PORTION is given in grams.
To obtain the nutrient amount actually contained in each portion, we therefore scale the nutrient values by the portion weight:

$$ \text{Nutrient Value (scaled)} = \frac{\text{Nutrient Value (per 100g)} \times \text{FOOD\_PORTION (g)}}{100} $$
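
As a quick check of the formula, the sketch below applies it to a single value from the first output table in this section: whole milk has 31.0 µg of retinol per 100 g, and the recorded portion is 293 g. The helper name `scale_to_portion` is hypothetical and used only for illustration; it is not part of the notebook.

```python
def scale_to_portion(value_per_100g: float, portion_g: float) -> float:
    """Scale a per-100 g nutrient value to the actual portion weight in grams."""
    return value_per_100g * portion_g / 100.0


# Whole milk: 31.0 ug retinol per 100 g, portion of 293 g
retinol_scaled = scale_to_portion(31.0, 293.0)
print(round(retinol_scaled, 2))  # 90.83
```

The result matches the Retinol (UG) value reported for the same row after scaling.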

In [1]:
# Imports
import pandas as pd

In [4]:
# Load the IASI food items together with their raw (per 100 g) nutrient values
refined_iasi_df = pd.read_csv("./data/PROCESSED_iasi_with_nutrients_raw.csv")

# Drop the FOOD_CODE, fdc_ids and double_check_desc columns
refined_iasi_df = refined_iasi_df.drop(columns=['FOOD_CODE', 'fdc_ids', 'double_check_desc'])
refined_iasi_df.head()

| Unnamed: 0 | ID | MEAL_ID | FOOD_PORTION | DESC | Retinol (UG) | Lycopene (UG) | cis_Lycopene (UG) | trans_Lycopene (UG) | Carotene_beta (UG) | cis_beta_Carotene (UG) | ... | Cryptoxanthin_beta (UG) | Choline_total (MG) | Carotene_alpha (UG) | Vitamin_K_phylloquinone (UG) | Zeaxanthin (UG) | Lutein (UG) | Lutein_plus_zeaxanthin (UG) | cis_Lutein/Zeaxanthin (UG) | Vitamin_D_D2_plus_D3 (UG) | Vitamin_A_RAE (UG) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 293.0 | Whole milk, average | 31.0 | 0.0 | 0.0 | 0.0 | 7.0 | 7.0 | ... | 0.0 | 17.8 | 0.0 | 0.3 | 0.0 | 0.0 | 0.0 | 0.0 | 1.1 | 32.0 |
| 1 | 1 | 1 | 0.56 | Beef, average, fat, cooked | 3.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 86.1 | 0.0 | 1.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 3.0 |
| 2 | 1 | 1 | 6.93 | Beef, rump steak, grilled, lean | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 77.2 | 0.0 | 1.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2 | 1.0 |
| 3 | 1 | 2 | 14.0 | Beefburgers, chilled/frozen, fried | 3.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 79.4 | 0.0 | 1.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 |
| 4 | 1 | 3 | 4.945 | Pork, fat, cooked | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 75.7 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.6 | 1.0 |


In [6]:
# Every column except the identifiers, the portion weight and the description is a nutrient.
nutrient_columns = refined_iasi_df.columns.difference(['ID', 'MEAL_ID', 'FOOD_PORTION', 'DESC'])

# USDA nutrient values are assumed to be per 100 g; IASI FOOD_PORTION is given in grams.
# Scale each nutrient value to the actual portion: value * FOOD_PORTION / 100.
# Note: the columns are overwritten in place, so re-running this cell without
# reloading the CSV would apply the scaling twice.
refined_iasi_df[nutrient_columns] = (
    refined_iasi_df[nutrient_columns].mul(refined_iasi_df['FOOD_PORTION'], axis=0) / 100.0
)

refined_iasi_df.to_csv('./data/PROCESSED_iasi_with_nutrients_scaled_with_food_portion.csv', index=False)
refined_iasi_df.head()

| Unnamed: 0 | ID | MEAL_ID | FOOD_PORTION | DESC | Retinol (UG) | Lycopene (UG) | cis_Lycopene (UG) | trans_Lycopene (UG) | Carotene_beta (UG) | cis_beta_Carotene (UG) | ... | Cryptoxanthin_beta (UG) | Choline_total (MG) | Carotene_alpha (UG) | Vitamin_K_phylloquinone (UG) | Zeaxanthin (UG) | Lutein (UG) | Lutein_plus_zeaxanthin (UG) | cis_Lutein/Zeaxanthin (UG) | Vitamin_D_D2_plus_D3 (UG) | Vitamin_A_RAE (UG) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 293.0 | Whole milk, average | 90.83 | 0.0 | 0.0 | 0.0 | 20.51 | 20.51 | ... | 0.0 | 52.154 | 0.0 | 0.879 | 0.0 | 0.0 | 0.0 | 0.0 | 3.223 | 93.76 |
| 1 | 1 | 1 | 0.56 | Beef, average, fat, cooked | 0.0168 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.48216 | 0.0 | 0.00896 | 0.0 | 0.0 | 0.0 | 0.0 | 0.00056 | 0.0168 |
| 2 | 1 | 1 | 6.93 | Beef, rump steak, grilled, lean | 0.0693 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 5.34996 | 0.0 | 0.11088 | 0.0 | 0.0 | 0.0 | 0.0 | 0.01386 | 0.0693 |
| 3 | 1 | 2 | 14.0 | Beefburgers, chilled/frozen, fried | 0.42 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 11.116 | 0.0 | 0.266 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.42 |
| 4 | 1 | 3 | 4.945 | Pork, fat, cooked | 0.04945 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 3.743365 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.02967 | 0.04945 |
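
One way to sanity-check the scaled values is to undo the scaling and compare the result against the raw per-100 g values. The sketch below does this on a two-row toy frame copied from the first two rows of the tables in this section, not on the real CSV files; the names `raw`, `scaled` and `recovered` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Toy data: per-100 g values from the first two rows of the raw table
raw = pd.DataFrame({
    "FOOD_PORTION": [293.0, 0.56],
    "Retinol (UG)": [31.0, 3.0],
    "Choline_total (MG)": [17.8, 86.1],
})
nutrient_cols = raw.columns.difference(["FOOD_PORTION"])

# Scale the per-100 g values to the portion weight, row by row
scaled = raw.copy()
scaled[nutrient_cols] = raw[nutrient_cols].mul(raw["FOOD_PORTION"], axis=0) / 100.0

# Undo the scaling and confirm that the raw values are recovered
recovered = scaled[nutrient_cols].div(scaled["FOOD_PORTION"], axis=0) * 100.0
assert np.allclose(recovered, raw[nutrient_cols])
print(scaled)
```

Rows with a FOOD_PORTION of 0 would have to be excluded from such a check, since the reverse division is undefined for them.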


### PLOTS AND STATISTICS - TBD