# Completion

- Fixed a bug: the code assumed that every meal contains all the food groups, which is not necessarily the case.
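
The kind of fix described above can be illustrated with a minimal sketch. The meal structure (one dict of food group to servings per meal) and the name `total_servings` are hypothetical and only show the idea of not assuming that every group is present; this is not the project's actual code.

```python
from collections import defaultdict


def total_servings(meals):
    """Sum servings per food group; a group missing from a meal simply adds nothing."""
    totals = defaultdict(float)
    for meal in meals:
        # Iterate over the groups the meal actually has instead of a fixed list.
        for group, serving in meal.items():
            totals[group] += serving
    return dict(totals)


# Hypothetical meals: neither of them contains every food group.
meals = [
    {"Vegetables": 1.5, "Grains": 2.0},
    {"Fruit": 1.0, "Vegetables": 0.5},
]
totals = total_servings(meals)

# Reading a group that never appeared: use .get() with a default rather than totals["Dairy"].
dairy = totals.get("Dairy", 0.0)
```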


# Clarifications

- Reducing whitespace
-- "Vegetables/Grains" vs " Vegetables / Grains " \
-- Reason: The code is sensitive to whitespace. \
-- "Vegetables" and " Vegetables " were captured as two separate entities.

- The following alcohol items don't have any serving size (Column N, HEIFA composition file): \
-- 02E10483, 02F40291, 02E60309, 10A10502, 02F40294 \
-- They do have an alcohol amount in the Intake24 file \
-- VERDICT: On hold
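
The whitespace issue noted above ("Vegetables/Grains" vs " Vegetables / Grains ") can be avoided by normalising each group name before using it as a key. A minimal sketch, assuming the groups arrive as one "/"-separated string; `normalise_groups` is a hypothetical helper name:

```python
def normalise_groups(raw):
    """Split a '/'-separated food-group string and tidy the whitespace of each name."""
    groups = []
    for part in raw.split("/"):
        # split() + join() strips the ends and collapses repeated inner spaces.
        name = " ".join(part.split())
        if name:
            groups.append(name)
    return groups


compact = normalise_groups("Vegetables/Grains")
spaced = normalise_groups(" Vegetables / Grains ")
```

Both calls give the same list, so "Vegetables" and " Vegetables " end up as a single entity.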

# Questions to Ask

- Edge case: Sodium example (1610.29) \
-- Intuition: Keep a lower range (0.00 - 1610.0) and an upper range (1610.1 - 2300.00) \
-- We round to 1 decimal place so that a value such as 1610.29 cannot fall between the two ranges (I think it is best to do this for all groups, even if their values are integers)
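
The sodium edge case above can be sketched as follows. The two ranges come from the note; the names `SODIUM_RANGES`, `round_to_1dp` and `find_range` are hypothetical, and rounding half up via `decimal` is an assumption made here (the built-in `round()` on floats can behave differently on exact halves).

```python
from decimal import Decimal, ROUND_HALF_UP

# Ranges from the note above; both bounds are treated as inclusive.
SODIUM_RANGES = [
    (Decimal("0.0"), Decimal("1610.0")),
    (Decimal("1610.1"), Decimal("2300.0")),
]


def round_to_1dp(value):
    """Round to 1 decimal place (half up), going through str() to avoid float noise."""
    return Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def find_range(value, ranges=SODIUM_RANGES):
    """Return the (lower, upper) range containing the rounded value, or None."""
    rounded = round_to_1dp(value)
    for lower, upper in ranges:
        if lower <= rounded <= upper:
            return (lower, upper)
    return None


# 1610.29 rounds to 1610.3, so it lands in the upper range instead of the gap.
matched = find_range(1610.29)
```

Without the rounding step, any value strictly between 1610.0 and 1610.1 would match neither range.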



In [None]:
from utils import *
from file_loaders import *

from pprint import pprint

import nest_asyncio
import asyncio

# Only run nest_asyncio in a Jupyter Notebook environment
nest_asyncio.apply()

In [None]:
# Load the respective files (Extract)

async def get_all_dataframes():
    return await asyncio.gather(
        load_intake24(),
        load_latrobe_file(),
        load_heifa_ingredients(),
        load_heifa_recipes(),
        load_heifa_scores()
    )

intake24_df, latrobe_df, heifa_food_df, heifa_recipes_df, heifa_scores_df = asyncio.run(get_all_dataframes())

In [None]:
# Convert the data to objects

user_dict = create_user_objects(intake24_df)
food_composition_dict = create_food_objects(heifa_food_df)
recipe_dict = create_recipe_objects(heifa_recipes_df)
heifa_scores_dict = create_scores_objects(heifa_scores_df)

In [None]:
#for user_id in user_dict.keys():

#    print(f"Printing for User {user_id}")
#    user_obj = user_dict[user_id]
#    user_obj.print_information()

In [None]:
# Print the food composition objects (debugging)

#for key, food_comp_obj in food_composition_dict.items():
#    food_comp_obj.print_full_details()

In [None]:
#for id, recipe_obj in recipe_dict.items():
#    print(f"Printing for ID {id}\n")
#    recipe_obj.print_ingredients_information()

In [None]:
# Find the daily intake
user_daily_intake = calculate_user_servings(user_dict, food_composition_dict, recipe_dict)

In [None]:
pprint(user_daily_intake)

# Test with Samara's CSV file and post the updates here

**Assumption**: This file should have the same format as the Intake24 file.

## Errors encountered

**Column names between Intake24 and Latrobe**
- "Start date (AEST)" -> Differs from Intake24, which uses 'Start Time'. -> "RESOLVED"
- 'Nutrient table code (original)' -> Differs from Intake24, which uses 'Nutrient table code'. -> "RESOLVED"
- 'Energy, with dietary fibre (kJ)' -> Differs from Intake24, which uses 'Energy, with dietary fibre'. -> "RESOLVED"
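
One way to resolve the column-name differences listed above is to map the Latrobe headers onto the Intake24 ones before any further processing. This is a sketch on plain dict rows; `LATROBE_TO_INTAKE24` and `rename_columns` are hypothetical names, and the mapping direction assumes the Intake24 names are the ones the rest of the code expects. On a pandas DataFrame the same mapping could be passed to `DataFrame.rename(columns=...)`.

```python
# Latrobe header -> Intake24 header, taken from the list above.
LATROBE_TO_INTAKE24 = {
    "Start date (AEST)": "Start Time",
    "Nutrient table code (original)": "Nutrient table code",
    "Energy, with dietary fibre (kJ)": "Energy, with dietary fibre",
}


def rename_columns(rows, mapping=LATROBE_TO_INTAKE24):
    """Return new rows with mapped column names; unmapped columns are kept as-is."""
    return [
        {mapping.get(column, column): value for column, value in row.items()}
        for row in rows
    ]


# Hypothetical row, for illustration only.
latrobe_rows = [{"Start date (AEST)": "2023-01-01", "Nutrient table code (original)": "02E10483"}]
renamed_rows = rename_columns(latrobe_rows)
```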

**Nutrient ID related**
- Values of "N/A" are still present in the Nutrient ID column. -> "RESOLVED"
- Unknown codes still present (8416) -> "RESOLVED"
- Row count before and after dropping: 6028 vs 5613 (415 rows dropped) -> "RESOLVED"

**Inside the file**:
- Some Nutrient ID cells contain the food description instead of the ID (Example: Porridge, made with light milk) -> "RESOLVED"
- Rows 3072 to 3294 have a description in the Nutrient ID column instead of the ID (same reason as above) -> "RESOLVED"
- Some energy values are missing; they are shown as #VALUE! (the #VALUE! is present in the Google Sheet) -> "RESOLVED"
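
The Nutrient ID and `#VALUE!` problems above could be caught with a row filter like the one below. This is only a sketch of one possible cleaning step, not necessarily how the issues were resolved: the ID pattern (two digits, a capital letter, five digits) is inferred from codes such as 02E10483, the column names are assumed to be the Intake24 ones, and `is_valid_row` is a hypothetical helper.

```python
import re

# Assumed ID shape, inferred from codes such as 02E10483 and 10A10502.
NUTRIENT_ID_PATTERN = re.compile(r"\d{2}[A-Z]\d{5}")


def is_valid_row(row):
    """Keep a row only if it has a well-formed nutrient ID and a usable energy value."""
    nutrient_id = (row.get("Nutrient table code") or "").strip()
    energy = (row.get("Energy, with dietary fibre") or "").strip()
    # Rejects "N/A", empty cells and food descriptions sitting in the ID column.
    if not NUTRIENT_ID_PATTERN.fullmatch(nutrient_id):
        return False
    # Rejects spreadsheet error values carried over from the Google Sheet.
    if energy == "#VALUE!":
        return False
    return True


# Hypothetical rows, for illustration only.
rows = [
    {"Nutrient table code": "02E10483", "Energy, with dietary fibre": "250.0"},
    {"Nutrient table code": "N/A", "Energy, with dietary fibre": "100.0"},
    {"Nutrient table code": "Porridge, made with light milk", "Energy, with dietary fibre": "300.0"},
    {"Nutrient table code": "10A10502", "Energy, with dietary fibre": "#VALUE!"},
]
cleaned = [row for row in rows if is_valid_row(row)]
dropped = len(rows) - len(cleaned)
```

Comparing `len(rows)` with `len(cleaned)` gives the same kind of before/after row count as noted above.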

**Alcohol in Intake24 but not in HEIFA composition**:
- 02E10483, 02F40291, 02E60309, 10A10502, 02F40289, 02F40294
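
A list like the one above can be produced with a set difference between the codes seen in Intake24 and the codes in the HEIFA composition file. A minimal sketch; `missing_codes` is a hypothetical helper, and the composition code `00X00000` is a made-up placeholder, not a real entry.

```python
def missing_codes(intake_codes, composition_codes):
    """Return the sorted, de-duplicated codes that appear in the intake data only."""
    return sorted(set(intake_codes) - set(composition_codes))


# Intake codes taken from the note above (repeats are intentional, to show de-duplication).
intake_codes = ["02E10483", "02F40291", "02E60309", "02F40291"]
# Placeholder composition codes, for illustration only.
composition_codes = ["02E60309", "00X00000"]

not_in_composition = missing_codes(intake_codes, composition_codes)
```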


In [None]:
user_latrobe_dict = create_user_objects(latrobe_df)

#for user_id in user_latrobe_dict.keys():

#    print(f"Printing for User {user_id}")
#    user_obj = user_latrobe_dict[user_id]
#    user_obj.print_information()

In [None]:
latrobe_user_daily_intake = calculate_user_servings(user_latrobe_dict, food_composition_dict, recipe_dict)

latrobe_user_heifa_scores = calculate_heifa_scores(heifa_scores_dict, latrobe_user_daily_intake)

In [None]:
# Units for the individual daily intake; groups not listed here default to 'serves'
first_layer_mapping = {
    'Water': 'ml',
    'Non-Alcohol': 'ml',
    'Sodium': 'mg',
    'Sugar': 'g',
    'Saturated Fat': 'g',
    'Unsaturated Fat': 'g'
}

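# Units for the totals used in the HEIFA score conversion; groups not listed here default to 'serves'
heifa_layer_mapping = {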
    'Water': '%',
    'Non-Alcohol': 'ml',
    'Sugar': '%',
    'Saturated Fat': '%',
    'Sodium': 'mg'
}

In [None]:
# Display the intake breakdown and HEIFA scores for each Latrobe user and survey
for user_id, daily_intake_dict in latrobe_user_daily_intake.items():

    for survey_id, food_group_dict in daily_intake_dict.items():
        print(f"Breakdown of User {user_id} for Survey ID {survey_id}:")

        individual_dict = food_group_dict['individual']
        total_dict = food_group_dict['total']
        variations_dict = food_group_dict['variations']

        # Sort so the food groups display in alphabetical order
        individual_dict = dict(sorted(individual_dict.items()))

        for food_group, total_serving in individual_dict.items():

            metric = first_layer_mapping.get(food_group, 'serves')

            print(f"- {food_group}: {total_serving:.2f} {metric}")

        print("")
        
        print("***HEIFA SCORES CONVERSION (START)***\n")
        total_dict = dict(sorted(total_dict.items()))

        for food_group, total_serving in total_dict.items():
            
            metric = heifa_layer_mapping.get(food_group, 'serves')
            
            print(f"> {food_group}: {total_serving:.2f} {metric}")

            if food_group not in latrobe_user_heifa_scores[user_id][survey_id]['breakdown']:
                print("* No score")
                print("")
                continue

            if food_group in variations_dict:

                variations = variations_dict[food_group]
                for sub_group, serving_size in variations.items():
                    print(f"-- {sub_group}: {serving_size:.2f} {metric}")

            gender_scores = latrobe_user_heifa_scores[user_id][survey_id]['breakdown'][food_group]

            male_score = gender_scores['male_score']
            female_score = gender_scores['female_score']
        
            print(f"* Male score: {male_score}")
            print(f"* Female score: {female_score}")
            print("")
        
        
        total_male_heifa = latrobe_user_heifa_scores[user_id][survey_id]['male_total']
        total_female_heifa = latrobe_user_heifa_scores[user_id][survey_id]['female_total']

        print(f"HEIFA Total (Male): {total_male_heifa}")
        print(f"HEIFA Total (Female): {total_female_heifa}")
        print("")
        
        print("***HEIFA SCORES CONVERSION (END)***")
        print("")
        print("=" * 20)

In [None]:
# Calculate the HEIFA scores for the Intake24 (non-Latrobe) users
user_heifa_scores = calculate_heifa_scores(heifa_scores_dict, user_daily_intake)

pprint(user_heifa_scores)

In [None]:
# Display the intake breakdown and HEIFA scores for each Intake24 user and survey
for user_id, daily_intake_dict in user_daily_intake.items():

    for survey_id, food_group_dict in daily_intake_dict.items():
        print(f"Breakdown of User {user_id} for Survey ID {survey_id}:")

        individual_dict = food_group_dict['individual']
        total_dict = food_group_dict['total']
        variations_dict = food_group_dict['variations']

        # Sort so the food groups display in alphabetical order
        individual_dict = dict(sorted(individual_dict.items()))

        for food_group, total_serving in individual_dict.items():

            metric = first_layer_mapping.get(food_group, 'serves')

            print(f"- {food_group}: {total_serving:.2f} {metric}")

        print("")
        
        print("***HEIFA SCORES CONVERSION (START)***\n")
        total_dict = dict(sorted(total_dict.items()))

        for food_group, total_serving in total_dict.items():

            metric = heifa_layer_mapping.get(food_group, 'serves')
            
            print(f"> {food_group}: {total_serving:.2f} {metric}")

            if food_group not in user_heifa_scores[user_id][survey_id]['breakdown']:
                print("* No score")
                print("")
                continue

            if food_group in variations_dict:

                variations = variations_dict[food_group]
                for sub_group, serving_size in variations.items():
                    print(f"-- {sub_group}: {serving_size:.2f} {metric}")

            gender_scores = user_heifa_scores[user_id][survey_id]['breakdown'][food_group]

            male_score = gender_scores['male_score']
            female_score = gender_scores['female_score']
        
            print(f"* Male score: {male_score}")
            print(f"* Female score: {female_score}")
            print("")
        
        
        total_male_heifa = user_heifa_scores[user_id][survey_id]['male_total']
        total_female_heifa = user_heifa_scores[user_id][survey_id]['female_total']

        print(f"HEIFA Total (Male): {total_male_heifa}")
        print(f"HEIFA Total (Female): {total_female_heifa}")
        print("")
        
        print("***HEIFA SCORES CONVERSION (END)***")
        print("")
        print("=" * 20)

In [None]:
_ = create_heifa_csv(
    heifa_scores_dict, food_composition_dict, 
    user_daily_intake, user_heifa_scores,
    'intake24_breakdown'
)

column_names = create_heifa_csv(
    heifa_scores_dict, food_composition_dict, 
    latrobe_user_daily_intake, latrobe_user_heifa_scores,
    'cleaned_intake24_breakdown'
)

In [None]:
pprint(column_names)