# Interactive Moving Window Kriging Pipeline
---
The preprocessing pipeline is executed in the following sequence. It assumes that masks have been generated. If a custom mask is required, work through the `mask.ipynb` file to see how the default masks were generated.

1. Place netCDF models into `climpyrical/data/model_inputs`. Ensemble models must have:
    * lat, lon, rlat, rlon and a 2D data field variable
2. Place station files into `climpyrical/data/station_inputs`. Input stations must have:
    * A data column with the design value of interest in the same units as the ensemble model. The units must be placed in parentheses next to the data variable name; for example, "RL50 (kPa)" and "HDD (degC-day)" are valid names
    * latitude and longitude columns
    * Additional columns, like province name, elevation, and station name are optional, but recommended
3. The data produced in the pipeline will go in various subdirectories of `climpyrical/data/results/` using the PCIC design value naming standards (outlined below)
    * figures will be in `climpyrical/data/results/figures/`
    * tables will be in `climpyrical/data/results/TableC2/`
    * netCDF files in `climpyrical/data/results/netcdf/`
    * intermediate notebooks for troubleshooting will be in `climpyrical/data/results/intermediate/`

```
climpyrical/data/results
├── netcdf
│   └── 
├── figures
│   ├── 
├── intermediate
│   ├── notebooks
│   │   ├── model_log_{design value}.ipynb
│   │   ├── plotting_log_{design value}.ipynb
│   │   ├── MWOrK_log_{design value}.ipynb
│   │   ├── station_log_{design value}.ipynb
│   ├── preprocessed_netcdf
│   │   ├── {design value}_preprocessed.nc
│   └── preprocessed_stations
│       └── {design value}.csv
└── TableC2
    └── {design value}_TableC2.csv
```
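
The station-file requirements above can be checked before the pipeline is run. The following is a minimal sketch and not part of climpyrical: the helper name `check_station_header`, the accepted coordinate column names (`lat`/`latitude` and `lon`/`longitude`, compared case-insensitively), and the sample file contents are all assumptions made for illustration.

```python
import csv
import io
import re

# Matches names such as "RL50 (kPa)": a name followed by units in parentheses.
UNIT_PATTERN = re.compile(r"^(?P<name>.+?)\s*\((?P<units>[^()]+)\)$")


def check_station_header(header, station_dv):
    """Return a list of problems found in a station file header.

    header: list of column names from the station CSV.
    station_dv: data column name including units, e.g. "RL50 (kPa)".
    """
    problems = []
    if station_dv not in header:
        problems.append(f"missing data column {station_dv!r}")
    if UNIT_PATTERN.match(station_dv) is None:
        problems.append(f"{station_dv!r} has no units in parentheses")
    lowered = {col.strip().lower() for col in header}
    # Assumed coordinate column names; adjust to match your files.
    if not lowered & {"lat", "latitude"}:
        problems.append("missing latitude column")
    if not lowered & {"lon", "longitude"}:
        problems.append("missing longitude column")
    return problems


# Hypothetical station file contents, for illustration only.
sample = io.StringIO("station_name,lat,lon,RL50 (kPa)\nExample,49.0,-123.0,1.2\n")
header = next(csv.reader(sample))
print(check_station_header(header, "RL50 (kPa)"))
```

An empty list means the header passed every check; each string in a non-empty list names one problem to fix in the station file.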

In [1]:
import papermill as pm
import config
from simple_colors import red
from pkg_resources import resource_filename

## Configuration
---

Configure the notebook pipeline. This notebook calls subsequent notebooks in the correct order.

`station_dv` is the name of the design value as it appears in the header of the station CSV file provided for the station processing step. The naming standards between the station files and the output files need to be manually configured.


`filenames` is a dictionary that relates the station design value name to the PCIC standard name for the given design value. Filenames and plot titles are produced according to this relationship.
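
The cells in this notebook read the top-level keys `dvs`, `paths` (including `output_notebook_path`), `steps`, `n_jobs`, and `nbcc_median_correction` from the loaded YAML file. A minimal sketch for checking that a loaded configuration has them is shown here. The helper `missing_config_keys` is hypothetical, and the example dictionary holds placeholder values (including the assumption that `station_dv` is stored per design value), not a real configuration.

```python
# Top-level keys that this notebook looks up in the loaded config.
REQUIRED_KEYS = ("dvs", "paths", "steps", "n_jobs", "nbcc_median_correction")


def missing_config_keys(params):
    """Return the config keys this notebook reads that are absent from params."""
    missing = [key for key in REQUIRED_KEYS if key not in params]
    # The output notebook directory is looked up inside the "paths" mapping.
    if "output_notebook_path" not in params.get("paths", {}):
        missing.append("paths.output_notebook_path")
    return missing


# Placeholder values for illustration; real entries come from the YAML file.
example = {
    "dvs": {"RL50": {"station_dv": "RL50 (kPa)"}},
    "paths": {"output_notebook_path": "data/results/intermediate/notebooks/"},
    "steps": ["preprocess_model.ipynb", "stations.ipynb"],
    "n_jobs": 2,
}
print(missing_config_keys(example))
```

Running a check like this right after `yaml.safe_load` surfaces a missing key before any notebook is executed, rather than partway through a worker process.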

In [2]:
notebooks = ["preprocess_model.ipynb", "stations.ipynb", "ratio_kriging.ipynb"]


import yaml

config_yml = "./config_example_means.yml"
with open(config_yml) as f:
    params = yaml.safe_load(f)

names = params["dvs"].keys()

output_notebook_dir = resource_filename("climpyrical", params["paths"]["output_notebook_path"])

Set up execution

In [3]:
params["steps"]

['preprocess_model.ipynb',
 'stations.ipynb',
 'MWOrK.ipynb',
 'plots.ipynb',
 'nbcc_stations.ipynb',
 'combine_tables.ipynb']

In [4]:
params["n_jobs"]

6

# Run the pipeline
---
For each design value in config.yml, run each file in the pipeline.

### 1.) Preprocess Models
### 2.) Preprocess Stations
### 3.) MWOrK (Moving Window Ordinary ratio Kriging)
### 4.) Generate Figures of Results
### 5.) Generate TableC2
### 6.) Combine Tables
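
The per-design-value steps can also be described as data instead of repeated `if` blocks. The sketch below is an illustration under assumptions, not the notebook's own code: `plan_runs` is a hypothetical helper that only builds the arguments that would be handed to `pm.execute_notebook` and does not execute anything. The log-name prefixes are copied from the cell that follows, and the example configuration holds placeholder values.

```python
# Log-file prefix for each per-design-value notebook in this pipeline.
LOG_PREFIX = {
    "preprocess_model.ipynb": "preprocessing_model_log",
    "stations.ipynb": "stations_log",
    "MWOrK.ipynb": "MWOrK_log",
    "plots.ipynb": "plots_log",
    "nbcc_stations.ipynb": "nbcc_stations_log",
}


def plan_runs(params, name, output_dir):
    """List (input notebook, output notebook, parameters) for one design value."""
    runs = []
    for notebook, prefix in LOG_PREFIX.items():
        if notebook not in params["steps"]:
            continue  # step is not enabled in the config
        parameters = {"name": name, **params["dvs"][name], **params["paths"]}
        if notebook == "MWOrK.ipynb":
            # Only the kriging step receives the median-correction setting.
            parameters["nbcc_median_correction"] = params["nbcc_median_correction"]
        runs.append((notebook, f"{output_dir}{prefix}_{name}.ipynb", parameters))
    return runs


# Placeholder configuration, for illustration only.
params = {
    "steps": ["stations.ipynb", "MWOrK.ipynb"],
    "dvs": {"HDD": {"station_dv": "HDD (degC-day)"}},
    "paths": {"output_notebook_path": "logs/"},
    "nbcc_median_correction": True,
}
for notebook, output, _ in plan_runs(params, "HDD", "logs/"):
    print(notebook, "->", output)
```

Because the plan is a plain list, it can be inspected or logged before any notebook runs, and each tuple maps directly onto the positional and `parameters` arguments of `pm.execute_notebook`.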

In [None]:
from multiprocessing import Pool
import nest_asyncio
nest_asyncio.apply()

# for name in params["dvs"].keys():
def poolproc(name):
    if "preprocess_model.ipynb" in params["steps"]:
        print(red(f"Preprocessing Model for {name}", "bold"), "\n")
        pm.execute_notebook(
            "preprocess_model.ipynb",
            f"{output_notebook_dir}preprocessing_model_log_{name}.ipynb",
            parameters = {"name": name, **params["dvs"][name], **params["paths"]} 
        )
    if "stations.ipynb" in params["steps"]:
        print(red(f"Preprocessing stations for {name}", "bold"), "\n")
        pm.execute_notebook(
            "stations.ipynb",
            f"{output_notebook_dir}stations_log_{name}.ipynb",
            parameters = {"name": name, **params["dvs"][name], **params["paths"]} 
        )
    if "MWOrK.ipynb" in params["steps"]:
        print(red(f"Moving Window ratio reconstruction for {name}", "bold"), "\n")
        pm.execute_notebook(
            "MWOrK.ipynb",
            f"{output_notebook_dir}MWOrK_log_{name}.ipynb",
            parameters = {
                "name": name,
                **params["dvs"][name],
                **params["paths"],
                "nbcc_median_correction": params["nbcc_median_correction"]
            }
        )
    if "plots.ipynb" in params["steps"]:
        print(red(f"Generating figures for {name}", "bold"), "\n")
        pm.execute_notebook(
            "plots.ipynb",
            f"{output_notebook_dir}plots_log_{name}.ipynb",
            parameters = {
                "name": name,
                **params["dvs"][name],
                **params["paths"]
            }
        )
    if "nbcc_stations.ipynb" in params["steps"]:
        print(red(f"Generating tables for {name}", "bold"), "\n")
        pm.execute_notebook(
            "nbcc_stations.ipynb",
            f"{output_notebook_dir}nbcc_stations_log_{name}.ipynb",
            parameters = {
                "name": name,
                **params["dvs"][name],
                **params["paths"]
            }
        )

p = Pool(params["n_jobs"])
p.map(poolproc, params["dvs"].keys())
p.close()
p.join()

Preprocessing Model for SL50
Preprocessing Model for RL50
Preprocessing Model for WP50
Preprocessing Model for WP10
Preprocessing Model for HDD
Preprocessing Model for RHann
Preprocessing stations for SL50
Preprocessing stations for RL50
Preprocessing stations for HDD
Preprocessing stations for RHann
Moving Window ratio reconstruction for SL50
Preprocessing stations for WP50
Preprocessing stations for WP10
Moving Window ratio reconstruction for RL50
Moving Window ratio reconstruction for HDD
Moving Window ratio reconstruction for RHann
Moving Window ratio reconstruction for WP50
Moving Window ratio reconstruction for WP10
Generating figures for RHann

### Generate Full Tables

In [5]:
if "combine_tables.ipynb" in params["steps"]:
    print(red("Combining tables for all reconstructions", "bold"))
    pm.execute_notebook(
        "combine_tables.ipynb",
        f"{output_notebook_dir}combined_stations_log.ipynb",
        parameters = {
#             "name": name,
            **params,
            **params["paths"]
        }
    )

Combining tables for all reconstructions


