In [1]:
# # Run this and then restart the kernel at the start of each session to install
# # 'teotil3' in development mode
# !pip install -e /home/jovyan/projects/teotil3/

In [2]:
import nivapy3 as nivapy
import pandas as pd
import teotil3 as teo

In [3]:
# eng = nivapy.da.connect_postgis(admin=True)
eng = nivapy.da.connect_postgis()

Connection successful.


# Task 2.9: Improve the workflow for aquaculture

From the proposal text:

> **Task 2.9: Improve the workflow for aquaculture**
>
> A brief literature review will be undertaken to identify typical proportions of DIN, TON, TDP and TPP in discharges from aquaculture. The workflow for total N and P from facilities in seawater will be recoded to make key model parameters more obvious and easier to update. The calculation for TOC will also be implemented, as described in section 6.2.4 of the main report. SS is not considered relevant for aquaculture.

## 1. Clarify calculations for TOTN and TOTP

The existing aquaculture workflow for TOTN and TOTP is described [here](https://niva.brage.unit.no/niva-xmlui/bitstream/handle/11250/2985726/7726-2022+high.pdf?sequence=1#page=29). The old model already includes functions to estimate fluxes of TOTN and TOTP using this method and the approach will not change in the new version. However, the code needs restructuring and clarifying because parameters in the original version are neither obvious nor easily changed.

The `preprocessing` module of TEOTIL3 includes a function named `estimate_aquaculture_nutrient_inputs` with the following call signature:

    estimate_aquaculture_nutrient_inputs(
        df, year, eng, cu_tonnes=None, species_ids=[71401, 71101]
    )

By default, the function considers only species IDs 71401 and 71101 (salmon and rainbow trout), but it could be easily adapted to include other species too, if desired. In addition, all parameters involved in the calculations for N, P and C have been moved to a parameter file named `aquaculture_productivity_coefficients.csv`, which is hosted online [here](https://github.com/NIVANorge/teotil3/blob/main/data/aquaculture_productivity_coefficients.csv). The new model reads values directly from the most recent version of this file hosted online, so as long as it is kept up-to-date the model will always use the latest coefficients for the aquaculture calculations.

## 2. Subdivide TOTN and TOTP

The SINTEF report by [Broch and Ellingsen (2020)](https://www.sintef.no/globalassets/sintef-ocean/arrangement/slam/l1.1-delrapport-1-kvantifisering-av-utslipp.pdf) estimates discharges of C, N and P from salmon and rainbow trout farms in seawater, including subfractions of N and P. The methodology is similar - but not identical - to that used in TEOTIL and overall the two workflows seem compatible. Table 3 of this report shows that typical proportions of N and P subfractions remain approximately fixed throughout the year. In the context of TEOTIL, these results can be summarised as follows:

 * **TOTN**: 69% DIN and 31% TON
 * **TOTP**: 71% TPP and 29% TDP

These fractions are included in the file [here](https://github.com/NIVANorge/teotil3/blob/main/data/point_source_treatment_types.csv), which is used to subdivide N and P from different point sources.
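As a minimal sketch of how these fixed proportions can be applied, the snippet below splits a total load into its subfractions. The function name `split_total` and the example loads are hypothetical and used only for illustration; this is not the TEOTIL3 implementation, which reads the fractions from the CSV file linked above.

```python
# Fractions taken from the summary above (Broch and Ellingsen, 2020, Table 3)
N_FRACTIONS = {"DIN": 0.69, "TON": 0.31}
P_FRACTIONS = {"TPP": 0.71, "TDP": 0.29}


def split_total(total_kg, fractions):
    """Split a total load (kg) into subfractions using fixed proportions.

    Hypothetical helper for illustration only.
    """
    # Guard against proportions that do not sum to one
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("Fractions must sum to 1.")
    return {name: total_kg * frac for name, frac in fractions.items()}


# Made-up example: a site losing 1000 kg of TOTN and 200 kg of TOTP in a year
n_parts = split_total(1000.0, N_FRACTIONS)
p_parts = split_total(200.0, P_FRACTIONS)
print(n_parts)
print(p_parts)
```

Because the proportions sum to one, the subfractions always add back up to the original total.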

## 3. Estimate TOC

The proposed new workflow for TOC is described [here](https://niva.brage.unit.no/niva-xmlui/bitstream/handle/11250/2985726/7726-2022+high.pdf?sequence=1#page=43) and implemented by the functions `calculate_aquaculture_toc_loss` and `estimate_aquaculture_nutrient_inputs` within the `preprocessing` module.

## 4. TEOTIL3 workflow

### 4.1. Raw data

Raw monthly data are provided each year by Fiskeridirektoratet in an encrypted Excel file. TEOTIL attempts to parse these data correctly, but it is a good idea to double-check the file manually first.

### 4.2. TEOTIL3 functions overview

`teo.preprocessing` contains the following new functions for handling data from aquaculture:

 * `read_raw_aquaculture_data` reads the raw Excel file and identifies any site IDs that are not already in the database. This function returns two objects: (i) a geodataframe of point co-ordinates for new sites to be added to the database, and (ii) a dataframe of raw monthly data for all aquaculture sites in the specified year. The geodataframe is reprojected to EPSG 25833, ready for upload to `teotil3.point_source_locations`. The dataframe should be passed to the function below to estimate nutrient losses.
 
 * `estimate_aquaculture_nutrient_inputs` is the main function for estimating nutrient losses from aquaculture. By default, **the function estimates losses of TOTN, DIN, TON, TOTP, TDP, TPP and TOC from fish farms producing salmon and rainbow trout** (following the approach outlined in the forprosjekt report - see the links above). Optionally, the user can specify the total annual amount of copper used nationally by the aquaculture industry (keyword argument `cu_tonnes`), and this will be divided among all active sites in proportion to the estimated losses of TOTP. If desired, users can also specify a list of species IDs to consider instead of just salmon and trout (keyword argument `species_ids`). Note, however, that at present the same feed and productivity coefficients, and the same Feed Conversion Ratio, are assumed for all species.
 
 * `get_annual_copper_usage_aquaculture` is a convenience function that reads the file [here](https://github.com/NIVANorge/teotil3/blob/main/data/aquaculture_annual_copper_usage.csv) and returns the total annual copper usage in aquaculture, as reported by Miljødirektoratet.
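The copper apportionment described for `cu_tonnes` above can be sketched as follows. This is an illustration under stated assumptions rather than the code in `teo.preprocessing`: the helper name `apportion_copper`, the site IDs, the TOTP values and the choice to return kilograms are all hypothetical.

```python
def apportion_copper(cu_tonnes, totp_by_site):
    """Share a national copper total among sites in proportion to TOTP loss.

    Hypothetical helper for illustration only.
    cu_tonnes: national annual copper total (tonnes).
    totp_by_site: dict mapping site ID to estimated TOTP loss (kg).
    Returns a dict mapping site ID to copper (kg, an assumed output unit).
    """
    total_p = sum(totp_by_site.values())
    if total_p <= 0:
        raise ValueError("Total TOTP loss must be positive.")
    cu_kg = cu_tonnes * 1000.0  # tonnes to kg
    return {site: cu_kg * p / total_p for site, p in totp_by_site.items()}


# Made-up TOTP losses (kg) for three fictional sites
totp = {1: 100.0, 2: 300.0, 3: 600.0}
cu = apportion_copper(10.0, totp)
print(cu)
```

Each site's share equals its fraction of the national TOTP loss, so the per-site values sum to the national total.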
 
Key parameters in the aquaculture calculations are the proportion of nutrients by mass in the feed and produced fish ($k_{feed}$ and $k_{prod}$, respectively), as well as the national **Feed Conversion Ratio** (FCR). These values are provided by Miljødirektoratet and the ones currently used by the TEOTIL model are defined [here](https://github.com/NIVANorge/teotil3/blob/main/data/aquaculture_productivity_coefficients.csv).
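To illustrate how these three parameters might interact, the sketch below uses a simple mass balance: nutrient supplied in feed minus nutrient retained in produced fish, with the FCR used to estimate feed where it is not reported. The form of the calculation is an assumption made for illustration (see the linked report for the actual method), and the function name `estimate_loss` and all numerical values are made up.

```python
def estimate_loss(production_kg, k_feed, k_prod, feed_kg=None, fcr=None):
    """Estimate nutrient loss (kg) with a simple mass balance.

    Hypothetical helper; the balance below is an assumed form,
    not a copy of the TEOTIL3 code.
    production_kg: fish produced (kg).
    k_feed, k_prod: nutrient mass fractions in feed and in produced fish.
    feed_kg: feed used (kg); if missing, estimated as fcr * production_kg.
    """
    if feed_kg is None:
        if fcr is None:
            raise ValueError("Provide either feed_kg or fcr.")
        feed_kg = fcr * production_kg
    # Nutrient in feed minus nutrient retained in fish
    return feed_kg * k_feed - production_kg * k_prod


# Made-up coefficients, NOT the values in aquaculture_productivity_coefficients.csv
loss = estimate_loss(production_kg=1000.0, k_feed=0.05, k_prod=0.03, fcr=1.2)
print(round(loss, 3))
```

Keeping $k_{feed}$, $k_{prod}$ and the FCR as explicit arguments mirrors the motivation for moving them into a single parameter file: updating a coefficient does not require touching the calculation itself.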

### 4.3. Example processing

The code below processes the data provided by Fiskeridirektoratet (but does not add it to the database - see the notebook for [Task 2.1e](https://nbviewer.org/github/NIVANorge/teotil3/blob/main/notebooks/development/T2-1e_annual_data_upload.ipynb) for a complete annual update workflow).

In [4]:
# Year of interest
year = 2021

xl_path = (
    f"/home/jovyan/shared/teotil3/point_data/{year}/fiske_oppdret_{year}_raw.xlsx"
)
xl_sheet = f"fiskeoppdrett_{year}"

cu_tonnes = teo.preprocessing.get_annual_copper_usage_aquaculture(year)
nidb_gdf, df = teo.preprocessing.read_raw_aquaculture_data(
    xl_path, xl_sheet, year, eng
)
df = teo.preprocessing.estimate_aquaculture_nutrient_inputs(
    df, year, eng, cu_tonnes=cu_tonnes, species_ids=[71401, 71101]
)

if nidb_gdf is not None:
    print(f"\n{len(nidb_gdf)} locations should be added to the database:")
    display(nidb_gdf)
    
df.head()

0 locations do not have co-ordinates in this year's data.
0 locations are not in the database.
The total annual copper lost to water from aquaculture is 1308.1 tonnes.


| | site_id | in_par_id | year | value |
|---|---|---|---|---|
| 0 | 10029 | 107 | 2021 | 5354.154754 |
| 1 | 10041 | 107 | 2021 | 12420.154769 |
| 2 | 10050 | 107 | 2021 | 12225.427231 |
| 3 | 10054 | 107 | 2021 | 49565.907688 |
| 4 | 10080 | 107 | 2021 | 70920.013671 |
