# SACO Minimal Examples

This notebook provides some minimal examples to help get started with the SACO package. 

See the package documentation - including the tutorial section - for more detailed explanations and reference information on:
- Data requirements
- Available functionality
- Methodology and approaches
- Key functions and classes (attributes and methods)
- Optional arguments and customisation


## Imports

We begin by importing the main SACO classes, along with some other useful packages. Other functions and classes can be accessed via more granular imports if they are needed later.

In [8]:
import os

import pandas as pd

from saco import Dataset, Calculator, Optimiser

## Tables and Datasets

In this section we look at loading and manipulating data that can then be used as input to the Calculator and/or Optimiser components. We begin with a usage example in which we load a WRGIS-like dataset stored in multiple files within a folder. (This dataset is synthetic, with no relationship to any real waterbodies or artificial influences.)

In [9]:
ds = Dataset(data_folder='./data')
ds.load_data()

Here we have created a `Dataset` object (as `ds`) and loaded data into memory. A `Dataset` is primarily used to group together the relevant WRGIS data tables. It is the main input to the Calculator and Optimiser.

### Changing Numbers

The documentation explains the structure of a `Dataset` and how we might go about changing numbers in its component tables. One way to do this is outside of SACO, saving changes to file(s) and loading them using the code example above. Alternatively, it is possible to manipulate the `data` attribute of each component table.

For example, `ds.swabs` provides access to the SWABS_NBB table of surface water abstractions. Similar attributes exist for the other tables and are listed in the documentation. To access the actual dataframe of surface water abstractions, we can use `ds.swabs.data`, as in the example code cell below.

In [10]:
ds.swabs.data.head()

| UNIQUEID | EA_WB_ID | HOFMLD | HOFWBID | PURPCODE | RESRVRFLAG | SWQ30FLWR | SWQ30FPWR | SWQ30RAWR | SWQ50FLWR | SWQ50FPWR | ... | SWQ70RAWR | SWQ95FLWR | SWQ95FPWR | SWQ95RAWR | SW_LAKE1 | SW_LAKE2 | SW_LAKE3 | SW_LAKE4 | SW_LAKE5 | SW_LDMU_NO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S0000 | GB0000 | 4.70523 | GB0002 | ABC | 0 | 0.153598 | 0.125243 | 0.111025 | 0.141783 | 0.115609 | ... | 0.093945 | 0.118153 | 0.096341 | 0.085404 | 0 | 0 | 0 | 0 | 0 | 0 |
| S0001 | GB0001 | 23.95339 | GB0003 | ABC | 1 | 0.001842 | 0.001498 | 0.001321 | 0.0017 | 0.001382 | ... | 0.001118 | 0.001417 | 0.001152 | 0.001016 | 0 | 0 | 0 | 0 | 0 | 0 |
| S0002 | GB0002 | 8.011331 | GB0000 | ABC | 0 | 0.089443 | 0.066862 | 0.059842 | 0.082563 | 0.061719 | ... | 0.050636 | 0.068803 | 0.051432 | 0.046033 | 0 | 0 | 0 | 0 | 0 | 0 |
| S0003 | GB0001 | 0.0 |  | ABC | 1 | 0.005585 | 0.004175 | 0.003691 | 0.005155 | 0.003854 | ... | 0.003123 | 0.004296 | 0.003212 | 0.002839 | 0 | 0 | 0 | 0 | 0 | 0 |
| S0004 | GB0000 | 0.0 |  | ABC | 0 | 0.031993 | 0.023487 | 0.020002 | 0.029532 | 0.021681 | ... | 0.016924 | 0.02461 | 0.018067 | 0.015386 | 0 | 0 | 0 | 0 | 0 | 0 |


We can then view and manipulate this attribute as we could any dataframe. In the arbitrary example below, we subset the dataframe on surface water abstractions impacting a given waterbody under the fully licensed (FL) scenario at Q95. We set these particular impacts to zero, just to illustrate how the dataframe can be modified.

In [11]:
# Set impacts in waterbody GB0002 to zero under the FL scenario at Q95
ds.swabs.data.loc[ds.swabs.data['EA_WB_ID'] == 'GB0002', 'SWQ95FLWR'] = 0.0

# Query dataframe to check the change
ds.swabs.data.loc[ds.swabs.data['EA_WB_ID'] == 'GB0002', ['EA_WB_ID', 'SWQ95FLWR']]

| UNIQUEID | EA_WB_ID | SWQ95FLWR |
|---|---|---|
| S0002 | GB0002 | 0.0 |
| S0011 | GB0002 | 0.0 |
| S0013 | GB0002 | 0.0 |
| S0016 | GB0002 | 0.0 |
| S0018 | GB0002 | 0.0 |
| S0023 | GB0002 | 0.0 |
| S0026 | GB0002 | 0.0 |
| S0027 | GB0002 | 0.0 |
| S0028 | GB0002 | 0.0 |
| S0030 | GB0002 | 0.0 |
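
The same boolean-mask pattern extends to other adjustments. The sketch below uses a small stand-in dataframe (hypothetical values, not the SACO dataset) in place of `ds.swabs.data`. It keeps a copy of the original for comparison and scales the selected impacts rather than zeroing them. The factor of 0.5 is an arbitrary assumption chosen for illustration.

```python
import pandas as pd

# Stand-in for ds.swabs.data: hypothetical values, same column names as above
swabs = pd.DataFrame(
    {
        "EA_WB_ID": ["GB0000", "GB0001", "GB0002", "GB0002"],
        "SWQ95FLWR": [0.10, 0.20, 0.30, 0.40],
    },
    index=pd.Index(["S0000", "S0001", "S0002", "S0011"], name="UNIQUEID"),
)

# Keep an untouched copy so the change can be inspected afterwards
original = swabs.copy()

# Boolean mask selecting abstractions in waterbody GB0002
mask = swabs["EA_WB_ID"] == "GB0002"

# Halve the selected impacts (arbitrary factor, for illustration only)
swabs.loc[mask, "SWQ95FLWR"] *= 0.5

# Difference between modified and original values, per abstraction
change = swabs["SWQ95FLWR"] - original["SWQ95FLWR"]
print(change)
```

Keeping a copy is optional, but it makes it easy to confirm that only the intended rows were altered.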


Other dataframe operations, such as joins and merges, can be applied in the same way. Details of the attributes and methods of a `Dataset` and each table class are given in the reference documentation (see also the tutorial).
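
As a hedged sketch of such a join, the example below merges a stand-in abstraction table with a hypothetical waterbody lookup. The `waterbodies` table and its `CATCHMENT` column are invented for illustration and are not part of SACO or WRGIS. With a real `Dataset`, `ds.swabs.data` would take the place of `swabs`.

```python
import pandas as pd

# Stand-in for ds.swabs.data, using a few illustrative rows
swabs = pd.DataFrame(
    {
        "EA_WB_ID": ["GB0000", "GB0001", "GB0002"],
        "SWQ95FLWR": [0.118153, 0.001417, 0.068803],
    },
    index=pd.Index(["S0000", "S0001", "S0002"], name="UNIQUEID"),
)

# Hypothetical lookup table (not a SACO/WRGIS table)
waterbodies = pd.DataFrame(
    {
        "EA_WB_ID": ["GB0000", "GB0001", "GB0002"],
        "CATCHMENT": ["North", "North", "South"],
    }
)

# Merging on a column discards the left index, so move UNIQUEID to a
# column first and restore it as the index afterwards
merged = (
    swabs.reset_index()
    .merge(waterbodies, on="EA_WB_ID", how="left", validate="many_to_one")
    .set_index("UNIQUEID")
)

# Total Q95 fully licensed impact per (hypothetical) catchment
total_by_catchment = merged.groupby("CATCHMENT")["SWQ95FLWR"].sum()
print(total_by_catchment)
```

The `validate="many_to_one"` argument makes pandas raise an error if the lookup contains duplicate waterbody IDs, which guards against rows being silently duplicated by the merge.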

## Calculator

Once a `Dataset` has been loaded or constructed it can be supplied as input to the `Calculator`. As demonstrated in the code cell below, the `run` method of the `Calculator` can then be executed to recalculate scenario flows, surpluses/deficits and compliance bands based on the input `Dataset`.

In [12]:
calculator = Calculator(ds)
output_dataset = calculator.run()

By default, the `Calculator` returns a new `Dataset` that is identical to the input `Dataset` except for its `Master` table. The `Master` table is a wide, waterbody-indexed table that provides all water balance terms, surpluses/deficits and compliance bands for all waterbodies. Supplying the optional `master_only=True` argument to the `run` method returns only the `Master` table, as a dataframe.

We can examine the new `Master` table. For example, we can see that the FL artificial influences scenario at the Q30 flow percentile shows issues with surplus/deficit (SD) and compliance (COMP). Deficits are present for waterbodies GB0001 and GB0002, which result in band 2 and band 1 non-compliance, respectively.

In [13]:
output_dataset.mt.data[['SDFLQ30', 'COMPFLQ30']].head()

| EA_WB_ID | SDFLQ30 | COMPFLQ30 |
|---|---|---|
| GB0000 | 1.664156 | 0 |
| GB0001 | -1.919755 | 2 |
| GB0002 | -0.851963 | 1 |
| GB0003 | 10.057077 | 0 |
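
Rather than reading the table by eye, the non-compliant waterbodies can be picked out with ordinary dataframe filtering. The sketch below rebuilds the four displayed rows as a stand-in for `output_dataset.mt.data`. It assumes, based on the output above, that a compliance band of 0 means compliant and any higher band means non-compliant.

```python
import pandas as pd

# Stand-in for output_dataset.mt.data, using the four rows displayed above
master = pd.DataFrame(
    {
        "SDFLQ30": [1.664156, -1.919755, -0.851963, 10.057077],
        "COMPFLQ30": [0, 2, 1, 0],
    },
    index=pd.Index(["GB0000", "GB0001", "GB0002", "GB0003"], name="EA_WB_ID"),
)

# Assumption: band 0 is compliant, bands above 0 are non-compliant
non_compliant = master[master["COMPFLQ30"] > 0]

# Sort ascending so the largest deficit (most negative value) comes first
non_compliant = non_compliant.sort_values("SDFLQ30")
print(non_compliant)
```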


Note that we could save all the tables at this point via:

```
output_dataset.write_tables(output_folder='/path/to/output/folder')
```

## Optimiser

As noted in the documentation, the role of the `Optimiser` is to suggest how impacts could best be adjusted to meet flow targets, given some objective(s) and constraints. The solution to this problem is obtained via mixed integer (binary) linear programming.
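
To give a feel for what a binary linear programme looks like, the toy sketch below chooses which of three hypothetical abstractions to switch off so that a deficit is removed with the least total reduction. All names and numbers are invented. The problem is solved by exhaustive enumeration rather than by a MILP solver, and it is not the formulation used by the `Optimiser`, which is described in the documentation.

```python
from itertools import product

# Hypothetical impacts of three abstractions on one waterbody (same flow units)
impacts = {"A": 0.5, "B": 0.8, "C": 1.2}

# Hypothetical deficit that must be recovered to meet the flow target
deficit = 1.0

best_choice = None
best_reduction = float("inf")

# Each abstraction has a binary decision variable: 1 = switch off, 0 = keep
for decisions in product([0, 1], repeat=len(impacts)):
    # Linear expression: total impact removed by this combination
    reduction = sum(x * impact for x, impact in zip(decisions, impacts.values()))

    # Linear constraint: the reduction must cover the deficit
    if reduction < deficit:
        continue

    # Linear objective: minimise the total impact removed
    if reduction < best_reduction:
        best_reduction = reduction
        best_choice = dict(zip(impacts, decisions))

print(best_choice, best_reduction)
```

Enumeration only works for a handful of variables, since the number of combinations doubles with each one added. That growth is why problems of realistic size are handed to a dedicated solver.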

The starting point for the `Optimiser` is again a `Dataset`. However, as described in the documentation, we need to ensure that flow targets are available and that each row of the point abstraction tables is "flagged" for inclusion in, or exclusion from, the optimisation. These steps can be taken using the two methods below, both of which are customisable.

In [14]:
ds.set_flow_targets()
ds.set_optimise_flag(exclude_deregulated=False, exclude_below=None)

For `set_optimise_flag`, we deviate from the defaults for two options: the treatment of deregulated abstractions (`exclude_deregulated`) and the exclusion of impacts below a threshold (`exclude_below`). This illustrates how the optional arguments can be used to customise behaviour.

Once we are happy that a `Dataset` is ready for the `Optimiser`, we can invoke the `run` method of the `Optimiser` as below. It is possible to customise the scenarios and flow percentiles that are optimised, as well as the domain/catchment considered (amongst other things).

In [15]:
optimiser = Optimiser(ds, scenarios=['FL'], percentiles=[30])
output_dataset = optimiser.run()

Whereas in the "base" case we had non-compliant waterbodies under this scenario/percentile combination, the surplus/deficit (SD) and compliance (COMP) fields in the optimised output dataset show that the waterbodies are now compliant. The value of -4.440892e-16 for GB0001 is zero to within floating-point precision.

In [16]:
output_dataset.mt.data[['SDFLQ30', 'COMPFLQ30']].head()

| EA_WB_ID | SDFLQ30 | COMPFLQ30 |
|---|---|---|
| GB0000 | 1.955807 | 0 |
| GB0001 | -4.440892e-16 | 0 |
| GB0002 | 1.544491 | 0 |
| GB0003 | 12.6752 | 0 |
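
When checking results like these programmatically, a strict `< 0` test would still flag GB0001 because of floating-point round-off. The sketch below copies the surplus/deficit values from the output above and compares them against zero with a tolerance. The threshold of `1e-9` is an assumption and should be chosen to suit the flow units in use.

```python
import math

# Surplus/deficit values from the optimised output above
sd_flq30 = {
    "GB0000": 1.955807,
    "GB0001": -4.440892e-16,
    "GB0002": 1.544491,
    "GB0003": 12.6752,
}

# Tolerance below which a value is treated as zero (an assumed threshold)
TOLERANCE = 1e-9

# Exact comparison flags GB0001 because of floating-point round-off
naive_deficits = [wb for wb, sd in sd_flq30.items() if sd < 0]

# Tolerance-aware comparison treats round-off as zero
real_deficits = [
    wb
    for wb, sd in sd_flq30.items()
    if sd < 0 and not math.isclose(sd, 0.0, abs_tol=TOLERANCE)
]

print(naive_deficits)
print(real_deficits)
```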
