# `bmorph` Example Workflow Template
This notebook demonstrates how to set up data for **bmorph** and bias correct it. It contains the same information as ``bmorph_tutorial.rst``.

## Import Packages and Load Data
We will be using numpy, xarray, and pandas in this example notebook.
Note: numpy can be imported directly instead of using the ``%pylab inline`` magic if desired. More on built-in magic commands can be found [here](https://ipython.readthedocs.io/en/stable/interactive/magics.html).

In [None]:
%pylab inline
import xarray as xr
import pandas as pd
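If you would rather avoid the ``%pylab inline`` magic, a minimal sketch of the explicit alternative is below. It covers only the ``np`` alias, which later cells rely on (for example ``np.int32``); add a matplotlib import as well if you plan to plot.

```python
# Explicit alternative to `%pylab inline`: import numpy under its usual alias.
import numpy as np

# Later cells cast identifiers with np.int32, so confirm the alias works.
example_id = np.int32(42)
```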

We will mainly deal with ``bmorph.workflows``, our primary organizing script, but will also use ``bmorph.util.mizuroute_utils`` to pre-process your data for ``bmorph``.

In [None]:
import bmorph
from bmorph.util import mizuroute_utils as mizutil

Setting up a client for parallelism can speed up the process
of bias correction immensely, especially if you are working with large numbers of
watersheds. Calibrating which meteorological variable you want to condition on can take
some time, so parallelism is especially recommended for initial uses of ``bmorph``.

In [None]:
from dask.distributed import Client, progress

In case you are just copying this over, the client is only set up with
one thread and one worker to prevent accidentally overburdening any
machine this is running on. If you actually want to use parallelism, 
make sure to change this!

In [None]:
client = Client(threads_per_worker=1, n_workers=1) #Increase for parallel power!!!
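One way to pick larger values is to derive them from the number of CPU cores on your machine. The sketch below is only an illustration: ``suggest_client_kwargs`` is a hypothetical helper (not part of ``bmorph`` or dask), and leaving one core free is an assumption you may want to change.

```python
import os

def suggest_client_kwargs(reserve_cores=1, threads_per_worker=1):
    """Hypothetical helper: suggest keyword arguments for dask's Client."""
    total_cores = os.cpu_count() or 1  # os.cpu_count() may return None
    usable = max(1, total_cores - reserve_cores)
    n_workers = max(1, usable // threads_per_worker)
    return {"n_workers": n_workers, "threads_per_worker": threads_per_worker}

client_kwargs = suggest_client_kwargs()
# To use it, replace the cell above with: client = Client(**client_kwargs)
```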

Next you provide the gauge site names and their respective river segment identification
numbers, or ``site``s and ``seg``s. This mapping is used throughout to ensure the data does
not get mismatched.

In [None]:
site_to_seg = { site_0_name : site_0_seg, ...} # Input this mapping or read it from a text file before running!
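If you keep the mapping in a text file, a minimal sketch for reading it might look like the following. The two-column ``site,seg`` layout, the function name ``read_site_to_seg``, and the example site names and segment numbers are all assumptions made for illustration.

```python
import csv
import io

def read_site_to_seg(handle):
    """Hypothetical reader: parse rows of `site,seg` into a dict of site -> int seg."""
    mapping = {}
    for row in csv.reader(handle):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        site, seg = row[0].strip(), int(row[1])
        mapping[site] = seg
    return mapping

# Made-up contents standing in for a real file opened with open(path, newline="").
example_file = io.StringIO("# site,seg\nSITE_A,1001\nSITE_B,1002\n")
example_site_to_seg = read_site_to_seg(example_file)
```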

To make the mapping you just filled out easy to access, here we create
some other useful forms of it for later use.

In [None]:
seg_to_site = {seg: site for site, seg in site_to_seg.items()}
ref_sites = list(site_to_seg.keys())
ref_segs = list(site_to_seg.values())    
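Inverting a dictionary silently drops entries when two sites share a segment, so it can be worth checking that the mapping is one-to-one. The sketch below uses made-up site names and segment numbers, and ``check_one_to_one`` is a hypothetical helper rather than part of ``bmorph``.

```python
def check_one_to_one(site_to_seg):
    """Hypothetical check: invert the mapping, raising if two sites share a segment."""
    seg_to_site = {}
    for site, seg in site_to_seg.items():
        if seg in seg_to_site:
            raise ValueError(
                f"segment {seg} is shared by {seg_to_site[seg]} and {site}"
            )
        seg_to_site[seg] = site
    return seg_to_site

# Made-up example values, not real gauge sites.
example_mapping = {"SITE_A": 1001, "SITE_B": 1002}
example_inverse = check_one_to_one(example_mapping)
```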

Next we load in topological data (topo), meteorological data (met),
uncorrected flows (raw), and reference flows (ref). Note that some
fields have placeholder names that you should update before running.
If a dataset is spread across several files and cannot be read in a single function call,
combine it into a single file before loading it. The file paths assume
this code is in a folder separate from the data, and that this code's containing
folder is at the same level of the hierarchy as the folders containing the data. A description
of how your project directory is expected to be set up can be found in ``data.rst``.

In [None]:
basin_topo = xr.open_dataset('../topologies/basin_topology_file_name.nc').load()
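The relative paths used in this notebook imply sibling ``topologies`` and ``input`` folders next to the notebook's folder. A small sketch for checking that layout before loading anything is below. ``missing_data_dirs`` is a hypothetical helper, and the authoritative layout remains the one described in ``data.rst``.

```python
from pathlib import Path

def missing_data_dirs(notebook_dir, required=("topologies", "input")):
    """Hypothetical helper: list required sibling folders that do not exist."""
    project_root = Path(notebook_dir).resolve().parent
    return [name for name in required if not (project_root / name).is_dir()]

# An empty list means both '../topologies' and '../input' were found.
missing = missing_data_dirs(Path.cwd())
```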

Sometimes meteorological data may only be available for a larger region
or watershed than the basin you are analyzing, so the following data will be described under such
an assumption.
    
Here we load in some example meteorological data: daily minimum temperature (tmin), seasonal precipitation (prec),
and daily maximum temperature (tmax). You can use similar or completely different data; just make sure names are updated everywhere they appear and unused names are deleted or commented out.

In [None]:
watershed_met = xr.open_dataset('../input/tmin.nc').load()
watershed_met['seasonal_precip'] = xr.open_dataset('../input/prec.nc')['prec'].load().rolling(time=30, min_periods=1).sum()
watershed_met['tmax'] = xr.open_dataset('../input/tmax.nc')['tmax'].load()
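To see what the 30-day rolling sum does, here is a small sketch on synthetic data, using a 3-step window so the result is easy to read. With ``min_periods=1``, the first values are partial sums rather than NaN.

```python
import numpy as np
import xarray as xr

# Synthetic precipitation of 1.0 per time step (not real data).
toy_prec = xr.DataArray(np.ones(5), dims="time", name="prec")

# Same pattern as the cell above, with a shorter window.
toy_seasonal = toy_prec.rolling(time=3, min_periods=1).sum()
# toy_seasonal.values -> [1., 2., 3., 3., 3.]
```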

Hydrologic response units (``hru``s) are the typical coordinate for meteorological data. Later, ``mizuroute_utils``
will take care of mapping these ``hru``s to ``seg``s.

In [None]:
watershed_met['hru'] = (watershed_met['hru'] - 1.7e7).astype(np.int32)
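The cell above subtracts a constant offset from the HRU identifiers and casts the result to 32-bit integers. The offset of ``1.7e7`` is specific to the example dataset's numbering, so adjust or remove it for your own data. A sketch with made-up identifiers:

```python
import numpy as np

# Made-up HRU identifiers that carry a 1.7e7 offset.
toy_hru = np.array([17000001, 17000002, 17000350])

# Subtracting a float promotes to float64, so cast back to integers afterwards.
toy_hru_local = (toy_hru - 1.7e7).astype(np.int32)
# toy_hru_local -> [1, 2, 350]
```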

And last but certainly not least, we need the flows themselves! ``bmorph`` is designed to bias
correct simulated streamflow as modeled by [mizuroute](https://mizuroute.readthedocs.io/en/latest/). As a result, loading
up the raw flows involves combining a number of flow NetCDF files, hence the ``open_mfdataset``.

In [None]:
watershed_raw = xr.open_mfdataset('../input/first_route*.nc')[['IRFroutedRunoff', 'dlayRunoff', 'reachID']].load()
watershed_raw['seg'] = watershed_raw.isel(time=0)['reachID'].astype(np.int32)
watershed_ref = xr.open_dataset('../input/nrni_reference_flows.nc').load().rename({'outlet':'site'})[['seg', 'seg_id', 'reference_flow']]
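The second line of the cell above takes ``reachID`` at the first time step and uses it as the ``seg`` coordinate, which lets you select flows by segment number. A sketch on a synthetic dataset follows. The variable names mirror the cell above, the values are made up, and it assumes ``reachID`` is identical at every time step.

```python
import numpy as np
import xarray as xr

# Synthetic routed flows: 2 time steps by 3 river segments.
toy_raw = xr.Dataset({
    "IRFroutedRunoff": (("time", "seg"),
                        np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])),
    "reachID": (("time", "seg"),
                np.array([[101, 102, 103], [101, 102, 103]])),
})

# Assumed: reachID repeats at every time step, so the first step labels the segments.
toy_raw["seg"] = toy_raw.isel(time=0)["reachID"].astype(np.int32)

# Flows can now be selected by segment identifier.
flow_at_102 = toy_raw["IRFroutedRunoff"].sel(seg=102)
```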

In order to select data for the basin of analysis from the larger watershed, we 
need the topology of the larger watershed as well.

In [None]:
watershed_topo = xr.open_dataset('../topologies/watershed_topology_file_name.nc').load()
watershed_topo = watershed_topo.where(watershed_topo['hru'] < 1.79e7, drop=True)
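The ``where(..., drop=True)`` call keeps only the entries that satisfy the condition and removes the rest, instead of masking them with NaN. A sketch on a synthetic topology is below. The ``area`` variable and the identifiers are made up, and the ``1.79e7`` cutoff is specific to the example dataset.

```python
import xarray as xr

# Synthetic topology with three HRUs; the last lies above the 1.79e7 cutoff.
toy_topo = xr.Dataset(
    {"area": ("hru", [10.0, 20.0, 30.0])},
    coords={"hru": [17000001, 17000002, 17900005]},
)

# Keep only HRUs below the cutoff and drop the others entirely.
toy_subset = toy_topo.where(toy_topo["hru"] < 1.79e7, drop=True)
```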

Here we clean up a few naming conventions to get everything on the same page in accordance with ``data.rst``.

In [None]:
if 'hru_id2' in basin_topo:
    basin_topo['hru'] = basin_topo['hru_id2']
if 'seg_id' in basin_topo:
    basin_topo['seg'] = basin_topo['seg_id']

## Convert ``mizuroute`` formatting to ``bmorph`` formatting

``mizuroute_utils`` is our utility script that will handle converting
mizuroute outputs to what we need for ``bmorph``. For more information
on what ``mizuroute_utils`` does specifically and how to change its 
parameters, check out ``data.rst``.

Here we pull out coordinate data from the overarching watershed
for the specific basin we want to analyze.