# Completion (2nd October 2023)

- Iterate by date (Some surveys have the same date)

# Clarifications

- Will the HEIFA scores be rounded to 1 decimal place? What should happen if a score falls outside the expected range?


# Questions to Ask

- Is it okay to use the CSV file from your end as well? It would be more convenient, and I wouldn't have to hardcode the values on my end.
- Need some examples of vegetable and fruit variations.

In [None]:
from utils import *

import nest_asyncio
import asyncio

# Only run nest_asyncio in a Jupyter Notebook environment
nest_asyncio.apply()

In [None]:
# Load the respective files

async def get_all_dataframes():
    return await asyncio.gather(
        load_intake24(),
        load_heifa_ingredients(),
        load_heifa_recipes(),
        load_heifa_scores()
    )

intake24_df, heifa_food_df, heifa_recipes_df, heifa_scores_df = asyncio.run(get_all_dataframes())

# Breakdown of Intake24

The file has many users.

Each user has many surveys.

Each survey has many meal intakes.

Each intake consists of many food components.

Every food component is marked with a "Nutrition ID code".
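The hierarchy above could be modelled roughly as nested containers. This is only a minimal sketch: the class and field names (`FoodComponent`, `Meal`, `Survey`, `User`) are hypothetical and are not necessarily what `create_user_objects` in `utils` builds, and the nutrient ID in the example is made up.

```python
from dataclasses import dataclass, field


@dataclass
class FoodComponent:
    nutrient_id: str   # the "Nutrition ID code" of this component
    energy_kj: float
    weight_g: float


@dataclass
class Meal:
    name: str
    components: list = field(default_factory=list)


@dataclass
class Survey:
    survey_id: str
    date: str
    meals: list = field(default_factory=list)


@dataclass
class User:
    user_id: str
    surveys: list = field(default_factory=list)

    def all_components(self):
        """Yield every food component across all surveys and meals."""
        for survey in self.surveys:
            for meal in survey.meals:
                yield from meal.components


# Made-up example: one user, one survey, one meal, one component
component = FoodComponent(nutrient_id="12345678", energy_kj=850.0, weight_g=250.0)
meal = Meal(name="Breakfast", components=[component])
survey = Survey(survey_id="s1", date="2023-10-02", meals=[meal])
example_user = User(user_id="u1", surveys=[survey])
```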

In [None]:
user_dict = create_user_objects(intake24_df)

#for user_id in user_dict.keys():

#    print(f"Printing for User {user_id}")
#    user_obj = user_dict[user_id]
#    user_obj.print_information()

# Breakdown of HEIFA (Food Composition)

Every row in the file is a unique ingredient.

Every ingredient:
- has its own attributes.
- can be mapped to an 8-digit code (for HEIFA Recipe)
- is used as a divisor for either energy (kilojoules, kJ) or weight (grams, g)
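To illustrate the "divisor" idea, here is a minimal sketch of turning an intake into a number of serves. It assumes each ingredient carries a serving size and a serving measure (`"kJ"` or `"g"`); the function name, parameter names, and the numbers are all hypothetical.

```python
def calculate_serves(energy_kj, weight_g, serving_size, serving_measure):
    """Divide the intake by the ingredient's serving size.

    serving_measure decides which quantity is divided:
    "kJ" uses the energy of the intake, "g" uses its weight.
    """
    if serving_size <= 0:
        raise ValueError("serving_size must be positive")
    if serving_measure == "kJ":
        return energy_kj / serving_size
    if serving_measure == "g":
        return weight_g / serving_size
    raise ValueError(f"Unknown serving measure: {serving_measure!r}")


# Made-up example: 150 g eaten of a food whose serve is defined as 75 g
serves_by_weight = calculate_serves(energy_kj=400.0, weight_g=150.0,
                                    serving_size=75.0, serving_measure="g")
```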

In [None]:
# Create the objects
food_composition_dict = create_food_objects(heifa_food_df)

#for key, food_comp_obj in food_composition_dict.items():
#    food_comp_obj.print_full_details()

# Breakdown of HEIFA (Recipes)

- Every recipe has multiple ingredients
- Keys are repeated across rows (similar to Survey ID of Intake24)
- Every ingredient has its own proportion of the recipe
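Because the recipe key repeats across rows, the rows need to be grouped by that key before use. A minimal sketch of the grouping, with hypothetical field names (`recipe_id`, `ingredient_id`, `proportion`) and made-up rows; the real logic lives in `create_recipe_objects`.

```python
from collections import defaultdict

# Made-up rows: recipe "R1" spans two rows, "R2" spans one
recipe_rows = [
    {"recipe_id": "R1", "ingredient_id": "A", "proportion": 0.6},
    {"recipe_id": "R1", "ingredient_id": "B", "proportion": 0.4},
    {"recipe_id": "R2", "ingredient_id": "A", "proportion": 1.0},
]


def group_recipes(rows):
    """Collect (ingredient_id, proportion) pairs under each recipe ID."""
    recipes = defaultdict(list)
    for row in rows:
        recipes[row["recipe_id"]].append((row["ingredient_id"], row["proportion"]))
    return dict(recipes)


grouped = group_recipes(recipe_rows)
```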

In [None]:
recipe_dict = create_recipe_objects(heifa_recipes_df)

#for id, recipe_obj in recipe_dict.items():
#    print(f"Printing for ID {id}\n")
#    recipe_obj.print_ingredients_information()

## Mapping between Intake24 and HEIFA Ingredients

- For each user, extract the given nutrients and store in an array.
- This is from ALL the survey data.
- We don't care about the order here.
- The array will contain a list of dictionaries/JSON.

In the array:

- Use the HEIFA ID (from user) to map to the HEIFA Ingredients' HEIFA ID.
- Check if a result is found or not.
- Check if it requires a recipe or not.
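The lookup described above has three possible outcomes per intake. A minimal sketch, assuming each food composition entry exposes a `requires_recipe` flag; that flag name, the function name, and the sample IDs are hypothetical.

```python
def classify_intake(heifa_id, food_composition):
    """Return 'not_found', 'recipe', or 'ingredient' for a HEIFA ID."""
    food = food_composition.get(heifa_id)
    if food is None:
        return "not_found"      # no matching HEIFA ingredient
    if food["requires_recipe"]:
        return "recipe"         # must be broken down via HEIFA Recipes
    return "ingredient"         # can be converted to serves directly


# Made-up food composition table keyed by HEIFA ID
sample_food_composition = {
    "A": {"requires_recipe": False},
    "B": {"requires_recipe": True},
}
outcomes = [classify_intake(i, sample_food_composition) for i in ["A", "B", "Z"]]
```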

## Mapping between Intake24 and HEIFA Recipes

This applies when a recipe is found (the second step).

- For the given recipe, extract each ingredient's nutrient ID and proportion, and store them in an array.
- We don't care about the order here.
- The array will contain a list of dictionaries.

In the array:

- Use the HEIFA ID (from the recipes) to map to the HEIFA Ingredients' HEIFA ID.
- Check the energy and serving size.
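One way to picture the recipe step is to split the energy and weight of the intake across the recipe's ingredients by proportion. This is a sketch under that assumption only; the function and field names are hypothetical and the numbers are made up.

```python
def split_by_proportion(energy_kj, weight_g, ingredients):
    """Share an intake's energy and weight across recipe ingredients.

    ingredients: list of (heifa_id, proportion) pairs, proportions in [0, 1].
    """
    return [
        {
            "heifa_id": heifa_id,
            "energy_kj": energy_kj * proportion,
            "weight_g": weight_g * proportion,
        }
        for heifa_id, proportion in ingredients
    ]


# Made-up example: a 1000 kJ / 200 g intake of a two-ingredient recipe
portions = split_by_proportion(1000.0, 200.0, [("A", 0.25), ("B", 0.75)])
```

Each resulting portion can then be looked up in the food composition table to check its energy and serving size.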

In [None]:
user_daily_intake = calculate_user_servings(user_dict, food_composition_dict, recipe_dict)

# Test with Samara's CSV file and post the updates here

**Assumption**: This should be the same as the Intake24 file format.

## Errors encountered

**Column names between Intake24 and Latrobe**
- 'Start date (AEST)' -> Different from Intake24 (used 'Start Time').
- 'Nutrient table code (original)' -> Different from Intake24 (used 'Nutrient table code').
- 'Energy, with dietary fibre (kJ)' -> Different from Intake24 (used 'Energy, with dietary fibre').
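A possible fix for the column mismatch is to rename the Latrobe columns to the Intake24 names before processing. The sketch below reads the list above as "Latrobe name -> Intake24 name", which is an assumption; `COLUMN_ALIASES` and `normalise_columns` are hypothetical names.

```python
# Assumed direction: Latrobe column name -> Intake24 column name
COLUMN_ALIASES = {
    "Start date (AEST)": "Start Time",
    "Nutrient table code (original)": "Nutrient table code",
    "Energy, with dietary fibre (kJ)": "Energy, with dietary fibre",
}


def normalise_columns(row):
    """Rename known Latrobe columns; leave every other column untouched."""
    return {COLUMN_ALIASES.get(name, name): value for name, value in row.items()}


# Made-up row for illustration
latrobe_row = {"Start date (AEST)": "2023-10-02", "Survey ID": "s1"}
normalised_row = normalise_columns(latrobe_row)
```

With a pandas DataFrame, the same mapping can be passed as `df.rename(columns=COLUMN_ALIASES)`.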

**Nutrient ID related**
- Values of "N/A" are still present in the Nutrient ID column.
- Unknown codes are still present (8416)
- Row count before and after dropping: 6028 vs 5613 (415 rows dropped)

**Inside the file**:
- Some Nutrient ID cells contain the food description instead of the ID (Example: Porridge, made with light milk)
- Rows 3072 to 3294 have no Nutrient ID; they contain the description instead (same as the previous point)
- Some energy values are missing; they appear as #VALUE! (the #VALUE! error is present in the Google Sheet)
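The nutrient ID and energy problems listed above could be filtered out with a single validity check per row. A minimal sketch, assuming a set of known IDs is available (for example, the keys of the food composition dictionary); the field names and sample rows are made up.

```python
def is_usable(row, known_ids):
    """Keep a row only if its nutrient ID is known and its energy is numeric."""
    # Rejects "N/A", unknown codes, and descriptions used in place of an ID
    if row["nutrient_id"] not in known_ids:
        return False
    # Rejects spreadsheet errors such as "#VALUE!" and empty cells
    try:
        float(row["energy_kj"])
    except (TypeError, ValueError):
        return False
    return True


# Made-up data: only the first row should survive
known_ids = {"12345678"}
sample_rows = [
    {"nutrient_id": "12345678", "energy_kj": "850.5"},
    {"nutrient_id": "N/A", "energy_kj": "120"},
    {"nutrient_id": "Porridge, made with light milk", "energy_kj": "300"},
    {"nutrient_id": "12345678", "energy_kj": "#VALUE!"},
]
usable_rows = [row for row in sample_rows if is_usable(row, known_ids)]
```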


In [None]:
#from utils import *

#latrobe_df = load_latrobe_file()

#print(latrobe_df)

In [None]:

#user_latrobe_dict = create_user_objects(latrobe_df)

#for user_id in user_latrobe_dict.keys():

#    print(f"Printing for User {user_id}")
#    user_obj = user_latrobe_dict[user_id]
#    user_obj.print_information()

In [None]:
#user_daily_intake = calculate_user_servings(user_latrobe_dict, food_composition_dict, recipe_dict)

# Calculating the HEIFA Scores

HEIFA scores are to be calculated on a **daily basis**.

To calculate them, let's break them down:

- Break down by user
- Break down by date
- Break down by major food group (Example: Vegetables/Green -> Vegetables is the major food group)
- Break down by sub-food group of the major (Example: Vegetables/Green -> Green is the sub-food group)
- Compare the scores by gender (male and female)

There are some exceptions to the rule, based on the HEIFA scores guideline:

- Grains and cereals/Wholegrains -> This is to be calculated separately as "Grains and cereals" and "Wholegrains".
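The breakdown above can be sketched as a nested aggregation keyed by user, date, and food group. This is one possible reading only: it assumes labels look like `Major/Sub`, and that the wholegrains exception means the serves count towards both "Grains and cereals" and "Wholegrains". The helper names and the sample records (including "Vegetables/Orange") are made up, and gender-specific scoring is left out.

```python
from collections import defaultdict

# Assumption: this label feeds two separately scored groups
SPLIT_EXCEPTIONS = {
    "Grains and cereals/Wholegrains": ("Grains and cereals", "Wholegrains"),
}


def scoring_groups(label):
    """Return the group names that a food-group label counts towards."""
    if label in SPLIT_EXCEPTIONS:
        return SPLIT_EXCEPTIONS[label]
    major, _, _sub = label.partition("/")
    return (major,)


def daily_totals(records):
    """Sum serves per user, per date, per scoring group.

    records: iterable of (user_id, date, label, serves) tuples.
    """
    totals = defaultdict(lambda: defaultdict(lambda: defaultdict(float)))
    for user_id, date, label, serves in records:
        for group in scoring_groups(label):
            totals[user_id][date][group] += serves
    return totals


sample_records = [
    ("u1", "2023-10-02", "Vegetables/Green", 1.5),
    ("u1", "2023-10-02", "Vegetables/Orange", 0.5),
    ("u1", "2023-10-02", "Grains and cereals/Wholegrains", 2.0),
]
sample_totals = daily_totals(sample_records)
```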

In [None]:
user_daily_intake

In [None]:
# Create the HEIFA scores list
heifa_scores_dict = create_scores_objects(heifa_scores_df)

user_heifa_scores = calculate_heifa_scores(heifa_scores_dict, user_daily_intake)

In [None]:
user_heifa_scores

In [None]:
# Display
for user_id, daily_intake_dict in user_daily_intake.items():

    for date, food_group_dict in daily_intake_dict.items():
        print(f"Breakdown of User {user_id} on {date}:")

        individual_dict = food_group_dict['individual']
        total_dict = food_group_dict['total']

        for food_group, total_serving in individual_dict.items():
            print(f"- {food_group}: {total_serving:.2f} serves")

        print("")
        
        for food_group, total_serving in total_dict.items():

            # Skip nested dicts; only numeric totals can be formatted with :.2f
            if not isinstance(total_serving, dict):
                print(f"> {food_group}: {total_serving:.2f} serves")

            if food_group not in user_heifa_scores[user_id][date]:
                print("")
                continue

            gender_scores = user_heifa_scores[user_id][date][food_group]

            male_score = gender_scores['male_score']
            female_score = gender_scores['female_score']
        
            print(f"* Male score: {male_score}")
            print(f"* Female score: {female_score}")
            print("")
        
        print("")