In [1]:
import pandas as pd

# Task 2.8: Improve workflow for wastewater treatment and industry

From the proposal text:

> **Task 2.8: Improve the workflow for wastewater treatment and industry**
>
> A literature review will be carried out to identify typical proportions of DIN, TON, TDP and TPP in discharges from different types of industry and wastewater treatment plants. Typical ratios between BOF and TOC for different plants will also be identified and used to convert reported discharges of BOF to TOC (for consistency with other sources).
>
> The TEOTIL code will be updated to make new model parameters clearly visible and easy to update (for example, within a single parameter file or Excel workbook).
>
> Note: This task will require Miljødirektoratet to supply raw data from the databases for industry and treatment plants. SSB's workflow for processing data from wastewater plants must also be updated to include SS and BOF. This is not included in the time estimate given here.

This notebook provides an overview of the new method, which is implemented by functions in `teo.preprocessing`. For an example of how these functions are used, see [notebook 2.1e](https://nbviewer.org/github/NIVANorge/teotil3/blob/main/notebooks/development/T2-1e_annual_data_upload.ipynb), which illustrates the annual workflow to update the TEOTIL3 database.

## 1. Wastewater treatment

The wastewater dataset comes from SSB and is split into two parts: discharges from "large" sites (>50 p.e.) and discharges from "small" sites (≤50 p.e.).

### 1.1. Data for "small" sites

The data for small sites (often called the "spredt dataset") is aggregated to kommune level and includes estimates of TOTN and TOTP from 14 different types of small treatment plant:

 * Direkte utslipp
 * Slamavskiller
 * Infiltrasjonsanlegg
 * Sandfilteranlegg
 * Biologisk
 * Kjemisk
 * Biologisk og kjemisk
 * Tett tank (for alt avløpsvann)
 * Tett tank for svartvann
 * Biologisk toalett
 * Konstruert våtmark
 * Tett tank for svartvann, gråvannsfilter
 * Biologisk toalett, gråvannsfilter
 * Annen løsning
 
`Tett tank (for alt avløpsvann)` does not need to be considered by TEOTIL3, as these tanks must be emptied periodically into the kommunal network (i.e. these discharges are reported as part of the dataset for "large" sites).
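
As a rough illustration of this exclusion, the sketch below drops the sealed-tank category from a kommune-level table before further processing. The column names (`komnr`, `type`, `totn_kg`, `totp_kg`) and all values are hypothetical and are not taken from SSB's files or from `teo.preprocessing`.

```python
import pandas as pd

# Hypothetical layout for the "spredt" data: one row per kommune and plant type.
# Column names and values are illustrative only.
spredt = pd.DataFrame(
    {
        "komnr": ["0301", "0301", "1103"],
        "type": ["Slamavskiller", "Tett tank (for alt avløpsvann)", "Biologisk"],
        "totn_kg": [120.0, 80.0, 45.0],
        "totp_kg": [15.0, 10.0, 4.0],
    }
)

# Sealed tanks are emptied into the kommunal network, so their discharges
# are already counted in the dataset for "large" sites.
EXCLUDED_TYPES = ["Tett tank (for alt avløpsvann)"]
spredt_teotil = spredt[~spredt["type"].isin(EXCLUDED_TYPES)].reset_index(drop=True)
```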

**SS from small sites is not considered in SSB's workflow and therefore cannot be included in TEOTIL at this time** (see the note in the proposal text above). However, the model structure is flexible enough to allow SS for "spredt" to be added easily, if Miljødirektoratet decide to fund additional processing with SSB.
 
### 1.2. Data for "large" sites

The basic dataset for large sites is often called the "miljøgifter" dataset and it includes all monitored discharges from large wastewater plants. For TOTN, TOTP, BOF5 and KOF, SSB use statistical interpolation to patch reporting gaps in this dataset. These data are delivered in two files: the "store anlegg" dataset, which contains interpolated values for TOTN and TOTP, and a file named `RID_Totalpopulasjon_{year}.csv`, which contains interpolated values for BOF5 and KOF, plus the treatment type for each plant. The following treatment categories are used:

 * Urenset
 * Mekanisk - slamavskiller
 * Mekanisk - sil, rist
 * Mekanisk
 * Kjemisk
 * Biologisk
 * Kjemisk-biologisk
 * Naturbasert
 * Annen rensing
 
SSB do not produce statistically interpolated values for SS at large sites. **TEOTIL3 therefore includes SS when it is reported directly (i.e. in the `miljøgifter` dataset), but the wastewater dataset for this variable is less complete than the datasets for N, P and TOC**.

## 2. Industry

In Miljødirektoratet's database, industrial sites are classified according to the **anleggsaktivitet**, which comprises 29 classes:

| Anleggsaktivitet                              | Description                                                                                                                                                      |
|-----------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Tekstil-, skinn- og tauprodukter              | Produces textiles, leather and rope                                                                                                                              |
| Avfallsforbrenning                            | Incinerates all types of waste, including hazardous waste (cf. ch. 10 of avfallsforskriften)                                                                     |
| Farlig avfall - mottak og mellomlagring       | Receives and temporarily stores hazardous waste                                                                                                                  |
| Farlig avfall - behandling                    | Physical/chemical/biological processes that alter the properties of the hazardous waste                                                                          |
| Asfaltverk                                    | Produces asphalt mix (cf. ch. 24)                                                                                                                                |
| Fiskeforedling                                | Slaughters or processes fish, bivalves, molluscs and crustaceans (cf. ch. 26)                                                                                    |
| Kjemisk/elektrolytisk overflatebehandling     | Coats a metal or plastic surface with metal, or treats such a surface chemically (cf. ch. 28)                                                                    |
| Mekanisk overflatebehandling og verft         | Builds and maintains metal structures. High-pressure washing, abrasive blasting, metallising and spray painting/lacquering. Includes boat slipways (cf. ch. 29)  |
| Pukkverk                                      | Crushing and screening plants producing crushed stone, gravel, sand and shingle from surface workings (cf. ch. 30)                                               |
| Skytebaner                                    | Rifle ranges, clay pigeon ranges, biathlon ranges, pistol ranges etc. One site may have several ranges                                                           |
| Motorsportbane                                | Motorsport tracks                                                                                                                                                |
| Fôrproduksjon                                 | Produces animal and vegetable feed for agriculture and aquaculture, as well as raw materials for feed production                                                 |
| Flyplass                                      | Landing site for aircraft and associated facilities                                                                                                              |
| Brannøvingsplass                              | Practises extinguishing ignited fuels and timber, smoke diving, burning buildings and cars                                                                       |
| Mineralsk industri, unntatt pukkverk          | Extracts and processes minerals and stone (mines, limestone quarries, glass, glass fibre, mineral fibre, ceramics, concrete)                                     |
| Forbrenningsanlegg for rene brensler          | Burns biomass, oil, gas and similar (not waste incineration) (including ch. 27 plants)                                                                           |
| Næringsmiddelindustri, unntatt fiskeforedling | Produces solid and liquid food for consumption, including slaughterhouses (excluding fish processing)                                                            |
| Kjemisk industri                              | Produces chemicals and chemical products (pharmaceuticals, detergents, cleaning and polishing agents, toiletries, plastics, rubber, explosives, paint etc.)      |
| Treforedling                                  | Produces pulp, paper, cardboard, furniture board etc. Impregnates wood                                                                                           |
| Tannlegekontor og grafisk virksomhet          | Dental clinics and dentists' offices. Chemical processes in photography, X-ray and the graphics industry that generate wastewater containing photochemicals      |
| Krematorium                                   | Cremation facilities (cf. ch. 10)                                                                                                                                |
| Metallurgisk industri                         | Produces steel, silicon metal, aluminium, ferroalloys and other metals                                                                                           |
| Plast- og glassfiberprodukter                 | Produces plastic boats, tanks, containers etc.                                                                                                                   |
| Notvaskerier                                  | Cleans, washes or impregnates aquaculture nets (cf. ch. 25)                                                                                                      |
| Tanklagring                                   | Stores environmentally hazardous substances, e.g. petrol, diesel, natural gas, propane or other chemicals                                                        |
| Olje- og gassvirksomhet på land               | Landing, refining or processing of oil or gas                                                                                                                    |
| Skyte- og øvingsfelt                          | Firing ranges and training areas                                                                                                                                 |
| Skipsgjenvinning                              | Recycles commercial vessels such as ships or mobile installations, and marine structures                                                                         |
| Tunneldrift                                   | Operation of tunnels, included because of discharges of wash water or stormwater                                                                                 |

SSB does not perform statistical interpolation/gap-filling of the industrial data, so **TEOTIL3 works directly with the raw data reported to Miljødirektoratet**. For this project, a complete export of industrial data for the period from 2010 to 2021 was obtained from Miljødirektoratet.

## 3. Literature review

Christian Vogelsang has undertaken a literature review to identify typical subfractions of N and P, and the relationships between BOF, KOF and TOC, in the discharges from different types of wastewater and industrial plant. These factors are used to subdivide TOTN and TOTP, and to estimate TOC from BOF and KOF. Measured SS fluxes are included where available.

It is important to emphasise that there are large variations in both the proportions of the N and P subfractions and the relationships between TOC and BOF/KOF, even within plants of the same type. For this reason, **estimates of DIN, TON, TDP, TPP and TOC for industrial and wastewater treatment sites are uncertain, and should be interpreted with caution**.

For TOTP from industry, Christian was not able to find sufficient data to make meaningful estimates for TDP and TPP: the literature suggests that P subfractions will be highly variable between sites, and in the data from Miljødirektoratet from 2010 to 2021, there are only three sites in Norway that report `P-ORT` (`~TDP`). As a result, **TEOTIL3 cannot subdivide TOTP from industry in a meaningful way**. Instead, the new model assumes TOTP comprises equal proportions of TDP and TPP, as this gives equal weight to each fraction in the retention calculations and ensures that retention of TOTP will be estimated correctly (see [notebook 2.5a](https://nbviewer.org/github/NIVANorge/teotil3/blob/main/notebooks/development/T2-5a_est_vollenweider_params_from_data.ipynb)). For this reason, **estimates of *industrial* TDP and TPP in TEOTIL3 are highly speculative**. 
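
The equal split described above amounts to a one-line calculation. The sketch below is only an illustration: the function name `split_industrial_totp` and the example value are hypothetical and do not come from the TEOTIL3 code.

```python
def split_industrial_totp(totp):
    """Split TOTP into equal parts TDP and TPP (the assumption used for industry)."""
    tdp = 0.5 * totp
    tpp = 0.5 * totp
    return tdp, tpp


# 12.0 is an arbitrary example value for a site's reported TOTP
tdp, tpp = split_industrial_totp(12.0)
```

Because the two fractions always sum to the reported total, the split leaves TOTP itself unchanged; only the partition between TDP and TPP is speculative.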

## 4. Improved workflow summary

The new version of TEOTIL considers the following inputs for **wastewater treatment** sites:

 * TOTN (subdivided into DIN and TON), TOTP (subdivided into TDP and TPP) and TOC, for both large and small sites
 * TOC is estimated from BOF5 or KOF. Where estimates for both BOF5 and KOF are available, KOF is used in preference to BOF5
 * SS from large sites, where reported
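
The KOF-before-BOF5 rule for wastewater sites can be sketched as a small function. This is an illustration under stated assumptions rather than the code in `teo.preprocessing`: the function name and the example loads are hypothetical, while the two factors are the `kof_fac` and `bof_fac` values listed for "Large wastewater" / "Kjemisk" in the parameter table in Section 5.

```python
import math


def estimate_toc(kof, bof5, kof_fac, bof_fac):
    """Estimate TOC from KOF if available, otherwise from BOF5.

    Missing values are passed as None or NaN. Returns NaN if neither
    KOF nor BOF5 is available.
    """

    def is_missing(x):
        return x is None or math.isnan(x)

    if not is_missing(kof):
        return kof_fac * kof
    if not is_missing(bof5):
        return bof_fac * bof5
    return math.nan


# Factors for "Large wastewater" / "Kjemisk" from the parameter table
KOF_FAC, BOF_FAC = 0.33, 0.9

# Hypothetical loads: both available, BOF5 only, and neither
toc_both = estimate_toc(100.0, 40.0, KOF_FAC, BOF_FAC)
toc_bof_only = estimate_toc(None, 40.0, KOF_FAC, BOF_FAC)
toc_none = estimate_toc(None, None, KOF_FAC, BOF_FAC)
```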

The new version of TEOTIL considers the following inputs for **industrial** sites:

 * TOTN, DIN & TON. Subfractions are estimated from reported values for TOTN (even though measured data for some subfractions are occasionally available)
 * TOTP, TDP & TPP. For all sites, it is assumed that 50% of TOTP is TDP and 50% is TPP. This ensures that overall retention of TOTP behaves as expected, but it means that estimates for industrial discharges of TDP and TPP are uncertain and likely to be inaccurate
 * TOC, either where it is reported directly or where it can be inferred from BOF5 or KOF. If BOF5 and KOF are both available, KOF is used in preference to BOF5
 * SS
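
For industry, the TOC rule has three levels: reported TOC first, then a KOF-based estimate, then a BOF5-based estimate. A minimal pandas sketch of that priority order follows. The site names, loads and column layout are hypothetical, and the two conversion factors are placeholders chosen for illustration, not values from the TEOTIL3 parameter file.

```python
import numpy as np
import pandas as pd

# Hypothetical reported discharges for three industrial sites.
# NaN means the variable was not reported.
ind = pd.DataFrame(
    {
        "site": ["A", "B", "C"],
        "TOC": [5.0, np.nan, np.nan],
        "KOF": [20.0, 30.0, np.nan],
        "BOF5": [8.0, 10.0, 4.0],
    }
)

# Placeholder conversion factors, NOT taken from the TEOTIL3 parameter file
kof_fac, bof_fac = 0.3, 1.0

# Priority: reported TOC, then KOF-based estimate, then BOF5-based estimate
ind["TOC_est"] = (
    ind["TOC"]
    .fillna(kof_fac * ind["KOF"])
    .fillna(bof_fac * ind["BOF5"])
)
```

Chaining `fillna` keeps the priority order explicit: each step only fills rows that are still missing after the previous one.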
 
The model **does not** consider:

 * SS from small/spredt wastewater sites. However, if this component is included in SSB's workflow in the future, it could be easily added to TEOTIL3.

## 5. New model parameters

For each class of site ("large wastewater", "small wastewater", "industry") and each treatment type ("mechanical", "biological", "chemical" etc. for wastewater treatment; "anleggsaktivitet" for industry), [this file](https://github.com/NIVANorge/teotil3/blob/main/data/point_source_treatment_types.csv) (also shown below) provides factors to subdivide N and P, and to estimate TOC from BOF5 or KOF. For the TOC calculations, equations take the form `TOC = kof_fac * KOF` and `TOC = bof_fac * BOF5`.
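
As a sketch of how the subdivision factors might be applied, the example below joins a small table of plants to their factors by treatment type and multiplies through. The three factor rows are copied from the "Large wastewater" rows of the table below; the plant identifiers, loads and the join itself are assumptions made for illustration and may differ from what `teo.preprocessing` does.

```python
import pandas as pd

# Three rows copied from the parameter table for "Large wastewater"
factors = pd.DataFrame(
    {
        "type": ["Urenset", "Kjemisk", "Biologisk"],
        "prop_din": [0.65, 0.72, 0.87],
        "prop_ton": [0.35, 0.28, 0.13],
        "prop_tpp": [0.67, 0.87, 0.56],
        "prop_tdp": [0.33, 0.13, 0.44],
    }
)

# Hypothetical plants with reported totals (identifiers and loads are made up)
plants = pd.DataFrame(
    {
        "plant_id": ["P1", "P2"],
        "type": ["Kjemisk", "Biologisk"],
        "TOTN": [100.0, 50.0],
        "TOTP": [10.0, 4.0],
    }
)

# Attach the factors for each plant's treatment type, then subdivide the totals
merged = plants.merge(factors, on="type", how="left", validate="many_to_one")
merged["DIN"] = merged["prop_din"] * merged["TOTN"]
merged["TON"] = merged["prop_ton"] * merged["TOTN"]
merged["TPP"] = merged["prop_tpp"] * merged["TOTP"]
merged["TDP"] = merged["prop_tdp"] * merged["TOTP"]
```

Since each pair of proportions sums to one, the subfractions add back up to the reported TOTN and TOTP.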

In [2]:
url = r"https://raw.githubusercontent.com/NIVANorge/teotil3/main/data/point_source_treatment_types.csv"
df = pd.read_csv(url)
df

Unnamed: 0,sector,type,prop_din,prop_ton,prop_tpp,prop_tdp,kof_fac,bof_fac
0,Large wastewater,Urenset,0.65,0.35,0.67,0.33,0.23,0.6
1,Large wastewater,Mekanisk - slamavskiller,0.68,0.32,0.35,0.65,0.23,0.6
2,Large wastewater,"Mekanisk - sil, rist",0.68,0.32,0.55,0.45,0.23,0.6
3,Large wastewater,Mekanisk,0.68,0.32,0.45,0.55,0.23,0.6
4,Large wastewater,Kjemisk,0.72,0.28,0.87,0.13,0.33,0.9
5,Large wastewater,Biologisk,0.87,0.13,0.56,0.44,0.38,1.8
6,Large wastewater,Kjemisk-biologisk,0.9,0.1,0.87,0.13,0.32,1.8
7,Large wastewater,Kjemisk-biologisk m/N-fjerning,0.76,0.24,0.55,0.45,0.32,1.8
8,Large wastewater,Naturbasert,0.91,0.09,0.66,0.34,0.32,1.8
9,Large wastewater,Annen rensing,0.76,0.24,0.61,0.39,0.29,1.2
