# Interactive Moving Window Kriging Pipeline
---
The preprocessing pipeline is executed in the following sequence. It assumes that masks have been generated. If a custom mask is required, work through the `mask.ipynb` file to see how the default masks were generated.

1. Place netCDF models into `climpyrical/data/model_inputs`. Ensemble models must have:
    * lat, lon, rlat, rlon and a 2D data field variable
2. Place station files into `climpyrical/data/station_inputs`. Input stations must have:
    * A data column with the design value of interest, in the same units as the ensemble model. The units must appear in parentheses after the variable name, e.g. "RL50 (kPa)" or "HDD (degC-day)"
    * latitude and longitude columns
    * Additional columns, like province name, elevation, and station name are optional
3. The data produced in the pipeline will go in various subdirectories of `climpyrical/data/results/` using the PCIC design value naming standards (outlined below)
    * figures will be in `climpyrical/data/results/figures/`
    * tables will be in `climpyrical/data/results/TableC2/`
    * netCDF files in `climpyrical/data/results/netcdf/`
    * intermediate notebooks for troubleshooting will be in `climpyrical/data/results/intermediate/notebooks/`
    * preprocessed stations and models will be in the `preprocessed_stations/` and `preprocessed_netcdf/` subdirectories of `climpyrical/data/results/intermediate/`
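
The station-file requirements in step 2 can be checked before a run. The following is a minimal sketch and not part of climpyrical: the function name `check_station_header`, the coordinate column names `lat` and `lon`, and the sample row are assumptions made for illustration.

```python
import csv
import io
import re

# Matches headers such as "RL50 (kPa)" or "HDD (degC-day)":
# a variable name, optional whitespace, then units in parentheses.
DV_HEADER = re.compile(r"^(?P<name>.+?)\s*\((?P<units>[^()]+)\)$")


def check_station_header(header, station_dv):
    """Return (name, units) for station_dv, raising ValueError on problems."""
    if station_dv not in header:
        raise ValueError(f"missing data column {station_dv!r}")
    match = DV_HEADER.match(station_dv)
    if match is None:
        raise ValueError(f"{station_dv!r} has no units in parentheses")
    lowered = {col.strip().lower() for col in header}
    # Assumed coordinate column names; adjust to the actual station files.
    for coord in ("lat", "lon"):
        if coord not in lowered:
            raise ValueError(f"missing coordinate column {coord!r}")
    return match.group("name"), match.group("units")


# Made-up sample content, for illustration only.
sample = io.StringIO("station_name,lat,lon,RL50 (kPa)\nExample A,49.0,-123.0,0.5\n")
header = next(csv.reader(sample))
name, units = check_station_header(header, "RL50 (kPa)")
print(name, units)
```

Running a check like this on the header row alone is cheap, so it can fail fast before any notebook in the pipeline is executed.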

```
climpyrical/data/results
├── netcdf
│   └── 
├── figures
│   ├── 
├── intermediate
│   ├── notebooks
│   │   ├── preprocessing_model_log_{design value}.ipynb
│   │   ├── stations_log_{design value}.ipynb
│   │   ├── MWOrK_log_{design value}.ipynb
│   │   └── plots_log_{design value}.ipynb
│   ├── preprocessed_netcdf
│   │   ├── {design value}_preprocessed.nc
│   └── preprocessed_stations
│       └── {design value}.csv
└── TableC2
    └── {design value}_TableC2.csv
```
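
To make the layout concrete, the sketch below builds the per-design-value output paths implied by the tree above. It is an illustration only: the helper `expected_outputs` is hypothetical, and the pipeline itself takes its directories from the configuration file rather than from a function like this.

```python
from pathlib import Path


def expected_outputs(results_dir, dv):
    """Map each output kind to the path suggested by the directory tree."""
    root = Path(results_dir)
    intermediate = root / "intermediate"
    return {
        "preprocessed_model": intermediate / "preprocessed_netcdf" / f"{dv}_preprocessed.nc",
        "preprocessed_stations": intermediate / "preprocessed_stations" / f"{dv}.csv",
        "table": root / "TableC2" / f"{dv}_TableC2.csv",
    }


outputs = expected_outputs("climpyrical/data/results", "RL50")
for kind, path in outputs.items():
    print(f"{kind}: {path.as_posix()}")
```

A mapping like this can be used after a run to confirm that each expected file exists, for example with `path.exists()`.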

In [None]:
import yaml
import papermill as pm
from simple_colors import red
from pkg_resources import resource_filename

## Configuration Example
---
```yaml
# Parameterize the pipeline
# The pipeline will iterate through each parent tree
# in dvs and provide the associated parameters

# Which notebooks to use in the pipeline
steps: [
    "preprocess_model.ipynb",
    "stations.ipynb",
    "MWOrK.ipynb",
    "plots.ipynb",
    "nbcc_stations.ipynb",
    "combine_tables.ipynb"
]

n_jobs: 2

# Paths are resolved relative to the climpyrical/ package directory
paths:
    output_notebook_path: /data/results/intermediate/notebooks/
    preprocessed_model_path: /data/results/intermediate/preprocessed_netcdf/
    preprocessed_stations_path: /data/results/intermediate/preprocessed_stations/
    output_reconstruction_path: /data/results/netcdf/
    output_tables_path: /data/results/TableC2/
    output_figure_path: /data/results/figures/
    mask_path: data/masks/canada_mask_rp.nc
    north_mask_path: data/masks/canada_mask_north_rp.nc
    nbcc_loc_path: data/station_inputs/NBCC_2020_new_coords.xlsm

nbcc_correction: True
dvs:
    RL50:
        station_dv: "RL50 (kPa)"
        station_path: data/station_inputs/sl50_rl50_for_maps.csv
        input_model_path: data/model_inputs/snw_rain_CanRCM4-LE_ens35_1951-2016_max_rl50_load_ensmean.nc
        medians: 
            value: 0.3
            action: multiply
        fill_glaciers: True
        
    RHann:
        station_dv: "mean RH (%)"
        station_path: data/station_inputs/rh_annual_mean_10yr_for_maps.csv
        input_model_path: data/model_inputs/hurs_CanRCM4-LE_ens15_1951-2016_ensmean.nc
        medians:
            value: None
            action: None
        fill_glaciers: True
```
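
Each notebook receives a flat set of parameters: the design value's name, its entry under `dvs`, and the shared `paths`. The sketch below reproduces that merge on a hand-written subset of the example configuration, using a plain dict so it runs without reading a file. The helper name `notebook_parameters` is hypothetical.

```python
# A hand-written subset of the example configuration, as a plain dict.
params = {
    "paths": {
        "output_tables_path": "/data/results/TableC2/",
        "mask_path": "data/masks/canada_mask_rp.nc",
    },
    "nbcc_correction": True,
    "dvs": {
        "RL50": {
            "station_dv": "RL50 (kPa)",
            "medians": {"value": 0.3, "action": "multiply"},
            "fill_glaciers": True,
        },
    },
}


def notebook_parameters(params, name):
    """Flatten one design value's settings and the shared paths into one dict."""
    return {"name": name, **params["dvs"][name], **params["paths"]}


for name in params["dvs"]:
    merged = notebook_parameters(params, name)
    print(name, sorted(merged))
```

Because the dicts are unpacked in order, a key that appears in both a `dvs` entry and `paths` takes its value from `paths`, so the two sections should not share key names.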

In [None]:
config_yml = "./config_example_means.yml"

with open(config_yml) as f:
    params = yaml.safe_load(f)
    
names = params["dvs"].keys()
output_notebook_dir = resource_filename("climpyrical", params["paths"]["output_notebook_path"])

# Run the pipeline
---
For each design value under `dvs` in the configuration file, run each notebook listed in `steps`. Up to `n_jobs` design values are processed in parallel.

### 1.) Preprocess Models
### 2.) Preprocess Stations
### 3.) MWOrK (Moving Window Ordinary ratio Kriging)
### 4.) Generate Figures of Results
### 5.) Generate TableC2
### 6.) Combine Tables

In [None]:
from multiprocessing import Pool
import nest_asyncio
nest_asyncio.apply()

def poolproc(name):
    if "preprocess_model.ipynb" in params["steps"]:
        print(red(f"Preprocessing Model for {name}", "bold"), "\n")
        pm.execute_notebook(
            "preprocess_model.ipynb",
            f"{output_notebook_dir}preprocessing_model_log_{name}.ipynb",
            parameters = {"name": name, **params["dvs"][name], **params["paths"]} 
        )
    if "stations.ipynb" in params["steps"]:
        print(red(f"Preprocessing stations for {name}", "bold"), "\n")
        pm.execute_notebook(
            "stations.ipynb",
            f"{output_notebook_dir}stations_log_{name}.ipynb",
            parameters = {"name": name, **params["dvs"][name], **params["paths"]} 
        )
    if "MWOrK.ipynb" in params["steps"]:
        print(red(f"Moving Window ratio reconstruction for {name}", "bold"), "\n")
        pm.execute_notebook(
            "MWOrK.ipynb",
            f"{output_notebook_dir}MWOrK_log_{name}.ipynb",
            parameters = {
                "name": name,
                **params["dvs"][name],
                **params["paths"],
                "nbcc_correction": params["nbcc_correction"]
            }
        )
    if "plots.ipynb" in params["steps"]:
        print(red(f"Generating figures for {name}", "bold"), "\n")
        pm.execute_notebook(
            "plots.ipynb",
            f"{output_notebook_dir}plots_log_{name}.ipynb",
            parameters = {
                "name": name,
                **params["dvs"][name],
                **params["paths"]
            }
        )
    # The step name must match the entry in the `steps` list of the configuration
    if "nbcc_stations.ipynb" in params["steps"]:
        print(red(f"Generating tables for {name}", "bold"), "\n")
        pm.execute_notebook(
            "nbcc_stations.ipynb",
            f"{output_notebook_dir}nbcc_locations_log_{name}.ipynb",
            parameters = {
                "name": name,
                **params["dvs"][name],
                **params["paths"]
            }
        )

p = Pool(params["n_jobs"])
p.map(poolproc, params["dvs"].keys())
p.close()
p.join()
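
The `if` blocks in `poolproc` all follow one pattern: skip the step unless it is listed in `steps`, then run the notebook and write a log named after the design value. A table-driven sketch of that pattern is shown below. It only plans the runs and does not call papermill. The function `plan_runs`, the `RUN_ONCE` set, and the log naming scheme are assumptions for illustration and differ slightly from the log names used above.

```python
from pathlib import PurePosixPath

# Steps that run once for all design values rather than once per design value.
RUN_ONCE = {"combine_tables.ipynb"}


def plan_runs(steps, dv_names, output_dir):
    """List (input notebook, output notebook) pairs for the per-design-value steps."""
    runs = []
    for name in dv_names:
        for step in steps:
            if step in RUN_ONCE:
                continue
            stem = PurePosixPath(step).stem
            # Hypothetical log naming: "<notebook stem>_log_<design value>.ipynb".
            runs.append((step, f"{output_dir}{stem}_log_{name}.ipynb"))
    return runs


runs = plan_runs(
    ["preprocess_model.ipynb", "MWOrK.ipynb", "combine_tables.ipynb"],
    ["RL50", "RHann"],
    "notebooks/",
)
for notebook, log in runs:
    print(notebook, "->", log)
```

Each pair could then be passed to `pm.execute_notebook` together with the merged parameters for that design value. Keeping the plan separate from execution also makes it easy to print what would run before starting a long job.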

### Generate Full Tables

In [None]:
if "combine_tables.ipynb" in params["steps"]:
    print(red("Combining tables for all reconstructions", "bold"))
    pm.execute_notebook(
        "combine_tables.ipynb",
        f"{output_notebook_dir}combined_stations_log.ipynb",
        parameters = {
            **params,
            **params["paths"]
        }
    )