In [1]:
# # Run this and then restart the kernel at the start of each session to install
# # 'teotil3' in development mode
# !pip install -e /home/jovyan/projects/teotil3/

In [2]:
import nivapy3 as nivapy
import pandas as pd
import teotil3 as teo

In [3]:
# eng = nivapy.da.connect_postgis(admin=True)
eng = nivapy.da.connect_postgis()

Connection successful.


# Task 2.9: Improve the workflow for aquaculture

## Part A: TOTN, TOTP and TOC

From the proposal text:

> **Task 2.9: Improve the workflow for aquaculture**
>
> A brief literature review will be carried out to identify typical proportions of DIN, TON, TDP and TPP in discharges from aquaculture. The workflow for total N and P from marine facilities will be recoded to make key model parameters more explicit and easier to update. The calculation for TOC will also be implemented, as described in Section 6.2.4 of the main report. SS is not considered relevant for aquaculture.

The old model already includes functions to estimate fluxes of TOTN and TOTP from aquaculture. For the updated model, the calculations for N and P remain the same, but the code needs clarifying (I now have a better understanding of what the calculations actually mean). In addition, a new workflow is required for TOC, and new functions are needed to process the raw data from Fiskeridirektoratet and add them to the TEOTIL database.

The existing workflow for TOTN and TOTP is described [here](https://niva.brage.unit.no/niva-xmlui/bitstream/handle/11250/2985726/7726-2022+high.pdf?sequence=1#page=29) and the proposed new workflow for TOC is [here](https://niva.brage.unit.no/niva-xmlui/bitstream/handle/11250/2985726/7726-2022+high.pdf?sequence=1#page=43).

## 1. Raw data

Raw monthly data are provided each year by Fiskeridirektoratet in an encrypted Excel file. TEOTIL attempts to parse these data correctly, but it is a good idea to double-check the file manually first.

## 2. TEOTIL3 workflow

`teo.preprocessing` contains the following new functions for handling data from aquaculture:

 * `read_raw_aquaculture_data` reads the raw Excel file and identifies any site IDs that are not already in the database. This function returns two objects: (i) a geodataframe of point co-ordinates for new sites to be added to the database, and (ii) a dataframe of raw monthly data for all aquaculture sites in the specified year. The geodataframe is reprojected to EPSG 25833, ready for upload to `teotil3.point_source_locations`. The dataframe should be passed to `estimate_aquaculture_nutrient_inputs` to estimate nutrient losses.
 
 * `estimate_aquaculture_nutrient_inputs` is the main function for estimating nutrient losses from aquaculture. By default, **the function estimates losses of TOTN, TOTP and TOC from fish farms producing salmon and rainbow trout** (following the approach outlined in the pilot project (forprosjekt) report; see the links above). Optionally, the user can specify the total annual amount of copper used nationally by the aquaculture industry (keyword argument `cu_tonnes`), and this will be divided among all active sites in proportion to the estimated losses of TOTP. If desired, users can also specify a list of species IDs to consider instead of just salmon and trout (keyword argument `species_ids`). Note, however, that at present the same feed and productivity coefficients, and the same Feed Conversion Ratio, are assumed for all species.
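
The proportional division of copper described for `cu_tonnes` can be illustrated with a minimal sketch. This is not the `teotil3` implementation: the function name `allocate_copper`, the site labels and the loss values are all hypothetical, and the result is kept in tonnes because the units used internally by `teotil3` are not stated here.

```python
def allocate_copper(totp_losses, cu_tonnes):
    """Split a national copper total among sites in proportion to TOTP losses.

    totp_losses: dict mapping a site label to its estimated TOTP loss
                 (any consistent unit; only the ratios matter).
    cu_tonnes:   national annual copper use, in tonnes.
    Returns a dict mapping each site label to its share of copper, in tonnes.
    """
    total = sum(totp_losses.values())
    if total <= 0:
        raise ValueError("Total TOTP loss must be positive to apportion copper.")
    return {site: cu_tonnes * loss / total for site, loss in totp_losses.items()}


# Hypothetical example: two sites, one with three times the TOTP loss of the other
example_losses = {"site_a": 100.0, "site_b": 300.0}
shares = allocate_copper(example_losses, cu_tonnes=2.0)
print(shares)
```

Because the split depends only on each site's fraction of the total TOTP loss, the shares always sum to `cu_tonnes`.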
 
Key parameters in the aquaculture calculations are the proportion of nutrients by mass in the feed and produced fish ($k_{feed}$ and $k_{prod}$, respectively), as well as the national **Feed Conversion Ratio** (FCR). These values are provided by Miljødirektoratet and the ones currently used by the TEOTIL model are defined [here](https://github.com/NIVANorge/teotil3/blob/main/data/aquaculture_productivity_coefficients.csv).
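
One common way to combine these parameters is a simple mass balance: nutrient supplied in feed minus nutrient retained in produced fish, with the FCR (mass of feed used per unit mass of fish produced) used to estimate feed where it is not reported. The sketch below assumes that formulation rather than taking it from the `teotil3` source. The function name and the coefficient values are hypothetical and are not the values in the linked CSV.

```python
def estimate_nutrient_loss(production_kg, k_feed, k_prod, fcr, feed_kg=None):
    """Mass-balance estimate of nutrient loss from one site (assumed formulation).

    production_kg: mass of fish produced (kg).
    k_feed:        proportion of the nutrient by mass in the feed.
    k_prod:        proportion of the nutrient by mass in the produced fish.
    fcr:           Feed Conversion Ratio (kg feed per kg fish produced).
    feed_kg:       mass of feed used (kg); if None, estimated from the FCR.
    """
    if feed_kg is None:
        # Assumption: fall back to the national FCR when feed use is not reported
        feed_kg = production_kg * fcr
    # Nutrient supplied in feed minus nutrient retained in fish biomass
    return feed_kg * k_feed - production_kg * k_prod


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow
loss_reported_feed = estimate_nutrient_loss(
    production_kg=800.0, k_feed=0.01, k_prod=0.004, fcr=1.2, feed_kg=1000.0
)
loss_estimated_feed = estimate_nutrient_loss(
    production_kg=1000.0, k_feed=0.01, k_prod=0.004, fcr=1.2
)
print(loss_reported_feed, loss_estimated_feed)
```

Keeping $k_{feed}$, $k_{prod}$ and the FCR as explicit arguments mirrors the aim of making key model parameters easy to see and update.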

## 3. Process data for 2021

The code below processes the 2021 data from Fiskeridirektoratet. The steps that add new sites and values to the database are included as commented-out cells.

In [4]:
year = 2021
xl_path = f"/home/jovyan/shared/teotil3/point_data/{year}/fiske_oppdret_{year}_raw.xlsx"
xl_sheet = f"fiskeoppdrett_{year}"

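# Read the raw Excel data; nidb_df holds any sites not yet in the database
nidb_df, df = teo.preprocessing.read_raw_aquaculture_data(xl_path, xl_sheet, year, eng)

# Estimate nutrient losses for the listed species IDs (no national copper total supplied)
df = teo.preprocessing.estimate_aquaculture_nutrient_inputs(
    df, year, eng, cu_tonnes=None, species_ids=[71401, 71101]
)

# List any new sites so they can be checked before upload
if nidb_df is not None:
    print(f"\n{len(nidb_df)} locations should be added to the database:")
    display(nidb_df)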

0 locations do not have co-ordinates in this year's data.
27 locations are not in the database.

27 locations should be added to the database:


| | site_id | name | type | geom |
|---|---|---|---|---|
| 0 | 40118 | Svindalen | Aquaculture | POINT (517203.608 7646375.293) |
| 1 | 37637 | TRØVIKA | Aquaculture | POINT (-53151.398 6783390.344) |
| 2 | 45020 | Rekvika | Aquaculture | POINT (468.514 6920551.041) |
| 3 | 45066 | Lingaholmane | Aquaculture | POINT (6104.903 6712741.463) |
| 4 | 39437 | Hovden | Aquaculture | POINT (-31943.198 6883009.233) |
| 5 | 39957 | Flatøyan | Aquaculture | POINT (179655.749 7099338.070) |
| 6 | 40117 | OTERVIKA | Aquaculture | POINT (357948.510 7229891.098) |
| 7 | 45010 | Holand | Aquaculture | POINT (510949.349 7613251.093) |
| 8 | 45058 | URDANESET | Aquaculture | POINT (85039.449 6935532.475) |
| 9 | 45072 | BLEKET | Aquaculture | POINT (-18992.910 6904070.869) |


In [5]:
# # Add new aquaculture sites to the database
# if nidb_df is not None:
#     nidb_df.to_sql(
#         "point_source_locations",
#         con=eng,
#         schema="teotil3",
#         if_exists="append",
#         index=False,
#     )

In [6]:
df.head()

| | site_id | in_par_id | year | value |
|---|---|---|---|---|
| 0 | 10029 | 107 | 2021 | 5354.154754 |
| 1 | 10041 | 107 | 2021 | 12420.154769 |
| 2 | 10050 | 107 | 2021 | 12225.427231 |
| 3 | 10054 | 107 | 2021 | 49565.907688 |
| 4 | 10080 | 107 | 2021 | 70920.013671 |
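
Before appending this long-format table to `teotil3.point_source_values`, it may be worth checking for repeated (site, parameter, year) combinations and for missing or negative values. The sketch below is one possible check and not part of `teotil3`. The function name `find_upload_problems` is hypothetical, and the last two example rows are invented to trigger the checks (the first two are taken from the output above).

```python
import math
from collections import Counter


def find_upload_problems(rows):
    """Flag suspicious rows in (site_id, in_par_id, year, value) tuples.

    Returns (duplicates, bad_values), where duplicates is a sorted list of
    (site_id, in_par_id, year) keys that occur more than once, and bad_values
    is a list of rows whose value is missing, NaN or negative.
    """
    rows = list(rows)
    key_counts = Counter((site_id, par_id, year) for site_id, par_id, year, _ in rows)
    duplicates = sorted(key for key, n in key_counts.items() if n > 1)
    bad_values = [
        row for row in rows
        if row[3] is None or math.isnan(row[3]) or row[3] < 0
    ]
    return duplicates, bad_values


rows = [
    (10029, 107, 2021, 5354.154754),
    (10041, 107, 2021, 12420.154769),
    (10041, 107, 2021, 12420.154769),  # hypothetical duplicate row
    (10050, 107, 2021, -1.0),  # hypothetical negative value
]
duplicates, bad_values = find_upload_problems(rows)
print(duplicates)
print(bad_values)
```

With a dataframe, the rows could be supplied as `df[["site_id", "in_par_id", "year", "value"]].itertuples(index=False, name=None)`.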


In [7]:
# # Add aquaculture data to database
# df.to_sql(
#     "point_source_values",
#     con=eng,
#     schema="teotil3",
#     if_exists="append",
#     index=False,
# )