# Test Notebook for Lookup Utils

This notebook demonstrates how to use the utility functions in `lookup_utils.py` to swap foreign key IDs for their names and vice versa.

In [None]:
import pandas as pd
from sqlmodel import Session, select
import sys
import os

# Add the project root to the Python path to allow for module imports
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
if project_root not in sys.path:
    sys.path.insert(0, project_root)

from src.database import engine
from src.models.biomass import Biomass, BiomassType
from src.utils.lookup_utils import replace_id_with_name_df, replace_name_with_id_df

## 1. Create a Database Session

First, we create a database session using the engine from our `database.py` module. Every query in this notebook, including the calls to the lookup utilities, runs through this session.

In [None]:
db = Session(engine)

## 2. Load Tables into Pandas DataFrames

Next, we query the `Biomass` and `BiomassType` tables and load their contents into pandas DataFrames. This simulates the data you would be working with in a transformation script.

In [None]:
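# Load Biomass table
# Note: on SQLModel versions built on Pydantic v2, record.model_dump()
# supersedes the deprecated record.dict() used below.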
biomass_statement = select(Biomass)
biomass_records = db.exec(biomass_statement).all()
biomass_df = pd.DataFrame([record.dict() for record in biomass_records])

print("Original Biomass DataFrame:")
display(biomass_df.head())

In [None]:
# Load BiomassType table
biomass_type_statement = select(BiomassType)
biomass_type_records = db.exec(biomass_type_statement).all()
biomass_type_df = pd.DataFrame([record.dict() for record in biomass_type_records])

print("Biomass Type Reference DataFrame:")
display(biomass_type_df.head())

## 3. Example Usage

Now you can use the utility functions to transform the DataFrames. The cells below provide examples that you can run and modify.

### Example 1: Replace `biomass_type_id` with `biomass_type` Name

In [None]:
# To run this, uncomment the following lines:
# df_with_names = replace_id_with_name_df(
#     db=db,
#     df=biomass_df,
#     ref_model=BiomassType,
#     id_column_name="biomass_type_id",
#     name_column_name="biomass_type"
# )
#
# print("DataFrame with Biomass Type Names:")
# display(df_with_names.head())
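
The source of `replace_id_with_name_df` is not shown in this notebook, so the following is only a sketch of the kind of transformation it presumably performs, written in plain pandas. The sample rows and the reference table's `id` and `name` column names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for biomass_df and biomass_type_df.
sample_biomass = pd.DataFrame({
    "id": [1, 2, 3],
    "biomass_name": ["corn stover", "almond shells", "rice straw"],
    "biomass_type_id": [10, 20, 10],
})
sample_types = pd.DataFrame({"id": [10, 20], "name": ["ag_residue", "nut_shell"]})

# Build an id -> name lookup from the reference table.
id_to_name = sample_types.set_index("id")["name"]

# Map each foreign key to its name, then drop the original ID column.
sample_with_names = sample_biomass.assign(
    biomass_type=sample_biomass["biomass_type_id"].map(id_to_name)
).drop(columns=["biomass_type_id"])

print(sample_with_names)
```

With `Series.map`, any ID that is missing from the reference table becomes `NaN`. The real utility may handle that case differently.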

### Example 2: Replace `biomass_type` Name with `biomass_type_id`

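This example also demonstrates the 'get or create' functionality. The DataFrame below includes `new_test_type`, a biomass type that is assumed not to exist in the database yet, so you can check whether the function adds it to the `BiomassType` table. Note that running this example writes to the database.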

In [None]:
# Create a dummy DataFrame for this example
data = {
    'biomass_name': ['Test Biomass 1', 'Test Biomass 2'],
    'biomass_type': ['ag_residue', 'new_test_type'],
}
name_df = pd.DataFrame(data)

# To run this, uncomment the following lines:
# df_with_ids = replace_name_with_id_df(
#     db=db,
#     df=name_df,
#     ref_model=BiomassType,
#     name_column_name="biomass_type",
#     id_column_name="biomass_type_id"
# )
#
# print("DataFrame with Biomass Type IDs:")
# display(df_with_ids.head())
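
To make the 'get or create' idea concrete without touching the database, here is a small sketch in which a plain dictionary stands in for the `BiomassType` table. The helper `get_or_create_ids` and the ID numbering scheme are hypothetical; the real `replace_name_with_id_df` may work differently.

```python
import pandas as pd


def get_or_create_ids(names, known):
    """Return an ID for each name, adding unseen names to `known` in place."""
    next_id = max(known.values(), default=0) + 1
    ids = []
    for name in names:
        if name not in known:
            # "Create" branch: register the new name under the next free ID.
            known[name] = next_id
            next_id += 1
        # "Get" branch: the name is now guaranteed to have an ID.
        ids.append(known[name])
    return ids


# Assumption: the reference table starts with a single known type.
known_types = {"ag_residue": 1}

sample_names = pd.DataFrame({
    "biomass_name": ["Test Biomass 1", "Test Biomass 2"],
    "biomass_type": ["ag_residue", "new_test_type"],
})
sample_names["biomass_type_id"] = get_or_create_ids(
    sample_names["biomass_type"], known_types
)
sample_with_ids = sample_names.drop(columns=["biomass_type"])

print(sample_with_ids)
print(known_types)
```

The existing name reuses its ID, while the unseen name gets a new entry in the lookup. The real utility is expected to have the same effect on the `BiomassType` table.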

### Verify New Type was Added

If you ran the cell above, you can run this cell to query the `BiomassType` table again and see the new `new_test_type` entry.

In [None]:
# To run this, uncomment the following lines:
# updated_biomass_type_statement = select(BiomassType)
# updated_biomass_type_records = db.exec(updated_biomass_type_statement).all()
# updated_biomass_type_df = pd.DataFrame([record.dict() for record in updated_biomass_type_records])
#
# print("Updated Biomass Type Table:")
# display(updated_biomass_type_df)
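
Rather than scanning the full table by eye, you can compare the reference table before and after the call and list only the new rows. This is a sketch with hypothetical data; it assumes the reference table has a `name` column.

```python
import pandas as pd

# Hypothetical snapshots of the BiomassType table before and after Example 2.
types_before = pd.DataFrame({"id": [1], "name": ["ag_residue"]})
types_after = pd.DataFrame({"id": [1, 2], "name": ["ag_residue", "new_test_type"]})

# Keep only the rows whose name was absent from the earlier snapshot.
added_types = types_after[~types_after["name"].isin(types_before["name"])]

print(added_types)
```

In this notebook, `biomass_type_df` and `updated_biomass_type_df` would play the roles of the two snapshots.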