# Example 3 - MIX

This is another simple traceability example, this time with two populations that are mixed into one.

We have two cages, C1 and C2.

* 2025-09-01 12:00
    * C1 is stocked with 200 fish => this becomes population P1.

* 2025-09-03 12:00
    * C2 is stocked with 1800 fish => this becomes population P2.

* 2025-09-05 12:00
    * The fish in cage C2 are moved to cage C1, mixing P1 and P2.
    * This ends populations P1 and P2 and creates one new population in C1: P3.

O2 [mg/L] is logged daily in each cage.
Cage C1 has deliberately low levels (4 to 5 mg/L, against 9 to 11 mg/L in C2), which makes the effect of different aggregations visible when tracing.

The example shows how to:

* load model data (populations, containers, transfers)
* load cage-related timeseries (daily O2 per cage)
* map cage data to populations
* perform a trace backward (P3)
* map population data (mapped from cage data) to trace
* aggregate timeseries along the trace

In [1]:

from datetime import datetime
from pathlib import Path

from aqua_tracekit import SdtModel, SdtSchema
from IPython.display import HTML

Load base model

In [2]:
base_path = Path("data")
model = SdtModel(base_path=str(base_path.resolve()))
df_containers = model.load_containers("containers.csv")
df_populations = model.load_populations("populations.csv")
df_transfers = model.load_transfers("transfers.csv")
HTML(model.visualize_trace())

Load cage data and map to populations

In [3]:

df_cage_o2 = model.load_container_timeseries("O2.csv")
df_cage_o2 = model.parse_float(df_cage_o2, "O2_mg_per_litre")
df_pop_o2 = model.map_container_data_to_populations(df_cage_o2, include_unmatched=False, allow_multiple=False)
df_pop_o2.head(11)

| container_id | date_time | O2_mg_per_litre | population_id |
|---|---|---|---|
| str | datetime[μs] | f64 | str |
| "C1" | 2025-09-01 16:00:00 | 4.0 | "P1" |
| "C1" | 2025-09-02 16:00:00 | 5.0 | "P1" |
| "C1" | 2025-09-03 16:00:00 | 4.0 | "P1" |
| "C1" | 2025-09-04 16:00:00 | 5.0 | "P1" |
| "C1" | 2025-09-05 16:00:00 | 4.0 | "P3" |
| … | … | … | … |
| "C1" | 2025-09-07 16:00:00 | 4.0 | "P3" |
| "C1" | 2025-09-08 16:00:00 | 5.0 | "P3" |
| "C1" | 2025-09-09 16:00:00 | 4.0 | "P3" |
| "C1" | 2025-09-10 16:00:00 | 5.0 | "P3" |

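The mapping step can be pictured as an interval lookup: a reading taken in a cage at time t belongs to the population that occupied that cage at t. The sketch below is a plain-Python illustration of that idea, not aqua_tracekit's implementation. The Stay record, the population_at helper, and the half-open period convention are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Stay:
    """Hypothetical record: one population occupying one cage for a period."""
    population_id: str
    container_id: str
    start: datetime
    end: Optional[datetime]  # None means the population is still active


# Occupancy periods taken from the timeline at the top of this example.
STAYS = [
    Stay("P1", "C1", datetime(2025, 9, 1, 12), datetime(2025, 9, 5, 12)),
    Stay("P2", "C2", datetime(2025, 9, 3, 12), datetime(2025, 9, 5, 12)),
    Stay("P3", "C1", datetime(2025, 9, 5, 12), None),
]


def population_at(container_id: str, when: datetime) -> Optional[str]:
    """Return the population in the cage at `when`, or None if it was empty.

    Assumes half-open periods [start, end), so a reading taken exactly at a
    transfer time would belong to the new population.
    """
    for stay in STAYS:
        if stay.container_id != container_id:
            continue
        if stay.start <= when and (stay.end is None or when < stay.end):
            return stay.population_id
    return None


print(population_at("C1", datetime(2025, 9, 4, 16)))  # P1
print(population_at("C1", datetime(2025, 9, 5, 16)))  # P3
print(population_at("C2", datetime(2025, 9, 1, 16)))  # None
```

This agrees with the output above, where the C1 reading at 2025-09-05 16:00 is assigned to P3 because the mix happened at 12:00 that day.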

Pivot and display data on population level

In [4]:
pd_pop_o2 = df_pop_o2.to_pandas()
pd_pivot = (
    pd_pop_o2
    .pivot_table(
        index=SdtSchema.TimeSeries.DATE_TIME,
        columns=SdtSchema.Population.POPULATION_ID,
        values="O2_mg_per_litre"
    )
    .reset_index()
    .fillna("")
)

pd_pivot.head(11)

| population_id | date_time | P1 | P2 | P3 |
|---|---|---|---|---|
| 0 | 2025-09-01 16:00:00 | 4.0 | | |
| 1 | 2025-09-02 16:00:00 | 5.0 | | |
| 2 | 2025-09-03 16:00:00 | 4.0 | 11.0 | |
| 3 | 2025-09-04 16:00:00 | 5.0 | 9.0 | |
| 4 | 2025-09-05 16:00:00 | | | 4.0 |
| 5 | 2025-09-06 16:00:00 | | | 5.0 |
| 6 | 2025-09-07 16:00:00 | | | 4.0 |
| 7 | 2025-09-08 16:00:00 | | | 5.0 |
| 8 | 2025-09-09 16:00:00 | | | 4.0 |
| 9 | 2025-09-10 16:00:00 | | | 5.0 |


Create the traceability index for P3

In [5]:

# Trace the populations that exist at 2025-09-10 16:00:00
trace_time = datetime(2025,9,10,16,0,0)
df_origin_populations = model.get_populations_active_at(trace_time)
df_traceability_index = model.trace_populations(df_origin_populations)
df_traceability_index.head(11)


| origin_population_id | traced_population_id | direction | share_count_forward | share_biomass_forward | share_count_backward | share_biomass_backward |
|---|---|---|---|---|---|---|
| str | str | str | f64 | f64 | f64 | f64 |
| "P3" | "P3" | "identity" | 1.0 | 1.0 | 1.0 | 1.0 |
| "P3" | "P1" | "backward" | 1.0 | 1.0 | 0.1 | 0.1 |
| "P3" | "P2" | "backward" | 1.0 | 1.0 | 0.9 | 0.9 |

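The count shares in this index can be checked by hand from the stocking numbers. The sketch below assumes that share_count_backward is the fraction of P3's fish that came from each source, and that share_count_forward is the fraction of each source that ended up in P3. That reading matches the numbers shown here, but it is an interpretation rather than the library's documented definition. The dictionary names are made up for the example.

```python
# Fish in each source population when it ended (from the timeline).
source_counts = {"P1": 200, "P2": 1800}

# Fish each source moved into P3. Both populations were moved in full.
moved_to_p3 = {"P1": 200, "P2": 1800}

total_in_p3 = sum(moved_to_p3.values())  # 2000 fish after the mix

# Backward share: how much of P3 originates from each source.
share_count_backward = {pop: n / total_in_p3 for pop, n in moved_to_p3.items()}

# Forward share: how much of each source ended up in P3.
share_count_forward = {pop: moved_to_p3[pop] / source_counts[pop] for pop in source_counts}

print(share_count_backward)  # {'P1': 0.1, 'P2': 0.9}
print(share_count_forward)   # {'P1': 1.0, 'P2': 1.0}
```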

Map the O2 data to the traceability index and display it

In [6]:
df_traced_data = model.add_data_to_trace(df_pop_o2, df_traceability_index)
df_traced_data = df_traced_data.sort(SdtSchema.TimeSeries.DATE_TIME)
df_traced_data.head(11)


| origin_population_id | traced_population_id | direction | share_count_forward | share_biomass_forward | share_count_backward | share_biomass_backward | container_id | date_time | O2_mg_per_litre |
|---|---|---|---|---|---|---|---|---|---|
| str | str | str | f64 | f64 | f64 | f64 | str | datetime[μs] | f64 |
| "P3" | "P1" | "backward" | 1.0 | 1.0 | 0.1 | 0.1 | "C1" | 2025-09-01 16:00:00 | 4.0 |
| "P3" | "P1" | "backward" | 1.0 | 1.0 | 0.1 | 0.1 | "C1" | 2025-09-02 16:00:00 | 5.0 |
| "P3" | "P1" | "backward" | 1.0 | 1.0 | 0.1 | 0.1 | "C1" | 2025-09-03 16:00:00 | 4.0 |
| "P3" | "P2" | "backward" | 1.0 | 1.0 | 0.9 | 0.9 | "C2" | 2025-09-03 16:00:00 | 11.0 |
| "P3" | "P1" | "backward" | 1.0 | 1.0 | 0.1 | 0.1 | "C1" | 2025-09-04 16:00:00 | 5.0 |
| … | … | … | … | … | … | … | … | … | … |
| "P3" | "P3" | "identity" | 1.0 | 1.0 | 1.0 | 1.0 | "C1" | 2025-09-05 16:00:00 | 4.0 |
| "P3" | "P3" | "identity" | 1.0 | 1.0 | 1.0 | 1.0 | "C1" | 2025-09-06 16:00:00 | 5.0 |
| "P3" | "P3" | "identity" | 1.0 | 1.0 | 1.0 | 1.0 | "C1" | 2025-09-07 16:00:00 | 4.0 |
| "P3" | "P3" | "identity" | 1.0 | 1.0 | 1.0 | 1.0 | "C1" | 2025-09-08 16:00:00 | 5.0 |

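The shape of this result is what an inner join would produce: each trace row is repeated once per reading of its traced population. The plain-Python sketch below shows that join for the two readings on 2025-09-03. It is meant to illustrate the result, not how add_data_to_trace is implemented, and the list and key names are chosen for the example.

```python
# Two rows of the traceability index (subset of columns).
trace_index = [
    {"origin_population_id": "P3", "traced_population_id": "P1", "share_count_backward": 0.1},
    {"origin_population_id": "P3", "traced_population_id": "P2", "share_count_backward": 0.9},
]

# Population-level O2 readings for 2025-09-03 16:00 (from the mapped cage data).
readings = [
    {"population_id": "P1", "container_id": "C1", "O2_mg_per_litre": 4.0},
    {"population_id": "P2", "container_id": "C2", "O2_mg_per_litre": 11.0},
]

# Inner join: attach every reading of a traced population to its trace row.
traced = [
    {**t, "container_id": r["container_id"], "O2_mg_per_litre": r["O2_mg_per_litre"]}
    for t in trace_index
    for r in readings
    if r["population_id"] == t["traced_population_id"]
]

for row in traced:
    print(row["origin_population_id"], row["traced_population_id"],
          row["share_count_backward"], row["O2_mg_per_litre"])
# P3 P1 0.1 4.0
# P3 P2 0.9 11.0
```

Every joined row keeps origin_population_id, so later aggregations can group by the origin (P3) while still seeing which source each reading came from.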

A simple pivot_table aggregates the values with mean() by default

In [7]:
pd_traced_data = df_traced_data.to_pandas()
pd_pivot_mean = (
    pd_traced_data
    .pivot_table(
        index=SdtSchema.TimeSeries.DATE_TIME,
        columns=SdtSchema.TraceabilityIndex.ORIGIN_POPULATION_ID,
        values="O2_mg_per_litre"
    )
    .reset_index()
)

pd_pivot_mean.head(11)


| origin_population_id | date_time | P3 |
|---|---|---|
| 0 | 2025-09-01 16:00:00 | 4.0 |
| 1 | 2025-09-02 16:00:00 | 5.0 |
| 2 | 2025-09-03 16:00:00 | 7.5 |
| 3 | 2025-09-04 16:00:00 | 7.0 |
| 4 | 2025-09-05 16:00:00 | 4.0 |
| 5 | 2025-09-06 16:00:00 | 5.0 |
| 6 | 2025-09-07 16:00:00 | 4.0 |
| 7 | 2025-09-08 16:00:00 | 5.0 |
| 8 | 2025-09-09 16:00:00 | 4.0 |
| 9 | 2025-09-10 16:00:00 | 5.0 |


On 2025-09-03 there are two O2 readings behind P3, 4.0 (P1 in C1) and 11.0 (P2 in C2):

>C1,2025-09-03 16:00:00,4.0
>
>C2,2025-09-03 16:00:00,11.0

Since pivot_table aggregates with the mean by default, the result is MEAN(4, 11) = 7.5:

>2025-09-03 16:00:00	7.5
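
The unweighted mean treats the 200 fish from C1 and the 1800 fish from C2 as equally important. One possible alternative, sketched here as an assumption about what a different aggregation could look like, weights each reading by share_count_backward. The minimum is shown as well, since for O2 the worst case is often the interesting number. The variable names are chosen for the example.

```python
# Readings on 2025-09-03 16:00 with the backward count shares from the trace.
rows = [
    {"traced_population_id": "P1", "share_count_backward": 0.1, "O2_mg_per_litre": 4.0},
    {"traced_population_id": "P2", "share_count_backward": 0.9, "O2_mg_per_litre": 11.0},
]

values = [r["O2_mg_per_litre"] for r in rows]
weights = [r["share_count_backward"] for r in rows]

# Unweighted mean: what the default pivot_table aggregation gives.
plain_mean = sum(values) / len(values)

# Weighted mean: dividing by the weight sum keeps the result sensible on
# days where only some of the source populations have a reading.
weighted_mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)

print(f"{plain_mean:.1f}")     # 7.5
print(f"{weighted_mean:.1f}")  # 10.3
print(min(values))             # 4.0
```

With these weights the 90% of the fish that experienced 11.0 mg/L dominate, so the weighted value (10.3) sits much closer to the C2 reading than the plain mean (7.5) does.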

