# Interactive Moving Window Kriging Pipeline
---
The preprocessing pipeline is executed in the following sequence. It assumes that masks have been generated. If a custom mask is required, work through the `mask.ipynb` file to see how the default masks were generated.

1. Place netCDF models into `climpyrical/data/model_inputs`. Ensemble models must have:
    * lat, lon, rlat, rlon and a 2D data field variable
2. Place station files into `climpyrical/data/station_inputs`. Input stations must have:
    * A data column with the design value of interest, in the same units as the ensemble model. The units must appear in parentheses after the variable name in the column header. For example, "RL50 (kPa)" and "HDD (degC-day)" are valid names
    * latitude and longitude columns
    * Additional columns, such as province name, elevation, and station name, are optional but recommended
3. The data produced in the pipeline will go in various subdirectories of `climpyrical/data/results/` using the PCIC design value naming standards (outlined below)
    * figures will be in `climpyrical/data/results/figures/`
    * tables will be in `climpyrical/data/results/tables/`
    * netCDF files will be in `climpyrical/data/results/datasets/`
    * intermediate notebooks for troubleshooting will be in `climpyrical/data/results/intermediate/`

```
climpyrical/data/results
├── datasets
│   └── {design value}_reconstructed.nc
├── figures
│   ├── 
├── intermediate
│   ├── notebooks
│   │   ├── model_log_{design value}.ipynb
│   │   ├── plotting_log_{design value}.ipynb
│   │   ├── RR_log_{design value}.ipynb
│   │   ├── station_log_{design value}.ipynb
│   ├── preprocessed_models
│   │   ├── {design value}_preprocessed.nc
│   └── preprocessed_stations
│       └── {design value}_processed_stations.csv
└── tables
    └── {design value}_tablec2.csv
```
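
The station-file requirements in step 2 can be checked before the pipeline is run. The following is a minimal sketch, not part of climpyrical: `split_units` and `check_station_header` are hypothetical helpers, and the coordinate column names `lat` and `lon` are an assumption, since the exact names expected by the station notebook are not stated here.

```python
import csv
import io
import re

# A header such as "RL50 (kPa)": a name followed by units in parentheses.
UNITS_IN_PARENTHESES = re.compile(r"^(?P<name>.+?)\s*\((?P<units>[^()]+)\)$")


def split_units(column_name):
    """Split "RL50 (kPa)" into ("RL50", "kPa"); raise if no units are given."""
    match = UNITS_IN_PARENTHESES.match(column_name.strip())
    if match is None:
        raise ValueError(f"no units in parentheses: {column_name!r}")
    return match.group("name"), match.group("units")


def check_station_header(csv_text, station_dv, coord_columns=("lat", "lon")):
    """Return a list of problems found in the header row of a station CSV.

    coord_columns is an assumed default; pass the names your files use.
    """
    header = [c.strip() for c in next(csv.reader(io.StringIO(csv_text)))]
    problems = []
    if station_dv not in header:
        problems.append(f"missing data column {station_dv!r}")
    for column in coord_columns:
        if column not in header:
            problems.append(f"missing coordinate column {column!r}")
    return problems


# Made-up header row, for illustration only.
sample = "station_name,lat,lon,RL50 (kPa)\n"
print(split_units("RL50 (kPa)"))
print(check_station_header(sample, "RL50 (kPa)"))
```

An empty list from `check_station_header` means the header satisfies the checks in this sketch; it does not validate the data values themselves.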

In [1]:
import papermill as pm
import config
from simple_colors import red
from pkg_resources import resource_filename

## Configuration
---

Configure the notebook pipeline. This notebook calls subsequent notebooks in the correct order.

`station_dvs` lists the design value names as they appear in the header row of the station CSV file supplied to the station processing step. The mapping between the names used in the station files and the names used for the output files must be configured manually.

`filenames` is a dictionary that maps each station design value name to the PCIC standard name for that design value. Output filenames and plot titles are generated from this mapping.

In [2]:
notebooks = ["preprocess_model.ipynb", "stations.ipynb", "ratio_kriging.ipynb"]

station_dvs = config.station_dvs
filenames = config.filenames
model_paths = config.model_paths
station_paths = config.station_paths
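
For orientation, here is a sketch of what the four `config` attributes read above might contain. Only the attribute names and the `"RL50 (kPa)"` design value come from this notebook. The standard name `"RL50"` and both file names are placeholders, and `missing_entries` is a hypothetical helper for checking that every design value has an entry in each dictionary.

```python
# Hypothetical contents of config.py. Paths are relative to the climpyrical
# package root, because the pipeline resolves them with resource_filename.
station_dvs = ["RL50 (kPa)"]
filenames = {"RL50 (kPa)": "RL50"}  # placeholder standard name
model_paths = {"RL50 (kPa)": "data/model_inputs/example_model.nc"}  # placeholder
station_paths = {"RL50 (kPa)": "data/station_inputs/example_stations.csv"}  # placeholder


def missing_entries(design_values, **mappings):
    """Map each dictionary name to the design values it has no entry for."""
    gaps = {}
    for mapping_name, mapping in mappings.items():
        absent = [dv for dv in design_values if dv not in mapping]
        if absent:
            gaps[mapping_name] = absent
    return gaps


gaps = missing_entries(
    station_dvs,
    filenames=filenames,
    model_paths=model_paths,
    station_paths=station_paths,
)
print(gaps)  # an empty dict means every design value is fully configured
```

Running a check like this before the loop catches a missing key before any notebook has been executed, rather than as a `KeyError` partway through the run.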

## Run the pipeline
---
For each design value in the `station_dvs` list, run each notebook in the pipeline in order.

In [3]:
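# Resolve the output directories inside the installed climpyrical package.
# Resource names are relative to the package root, so they take no leading "/".
# The trailing "/" is kept because file names are appended by concatenation.
output_notebook_dir = resource_filename(
    "climpyrical",
    "data/results/intermediate/notebooks/"
)

preprocessed_model_dir = resource_filename(
    "climpyrical",
    "data/results/intermediate/preprocessed_models/"
)

output_stations_dir = resource_filename(
    "climpyrical",
    "data/results/intermediate/preprocessed_stations/"
)

output_reconstruction_dir = resource_filename(
    "climpyrical",
    "data/results/datasets/"
)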

for station in station_dvs:
    print(red(f"Preprocessing Model for {station}", "bold"))
    pm.execute_notebook(
        "preprocess_model.ipynb",
        output_notebook_dir+f"model_log_{filenames[station]}.ipynb",
        parameters = dict(
            station_dv = station,
            model_input_path = resource_filename("climpyrical", model_paths[station]),
            name = filenames[station],
            fill_glaciers = True,
            processed_model_output_path = preprocessed_model_dir+f"{filenames[station]}_preprocessed.nc"
        )
    )

    print(red(f"Preprocessing stations for {station}", "bold"))
    pm.execute_notebook(
        "stations.ipynb",
        output_notebook_dir+f"station_log_{filenames[station]}.ipynb",
        parameters = dict(
            station_dv = station,
            station_input_path = resource_filename(
                "climpyrical",
                station_paths[station]
            ),
            name = filenames[station],
            processed_model_output_path = preprocessed_model_dir+f"{filenames[station]}_preprocessed.nc",
            df_path_write = output_stations_dir+f"{filenames[station]}_processed_stations.csv"
            
        )
    )
    print(red(f"Moving Window ratio reconstruction for {station}", "bold"))
    pm.execute_notebook(
        "ratio_kriging.ipynb",
        output_notebook_dir+f"RR_log_{filenames[station]}.ipynb",
        parameters = dict(
            station_dv = station,
            station_input_path = resource_filename(
                "climpyrical",
                station_paths[station]
            ),
            name = filenames[station],
            processed_model_output_path = preprocessed_model_dir+f"{filenames[station]}_preprocessed.nc",
            output_reconstruction_path = output_reconstruction_dir+f"{filenames[station]}_reconstructed.nc",
            df_path_write = output_stations_dir+f"{filenames[station]}_processed_stations.csv"
        )
    )
    print(red(f"Generating figures for {station}", "bold"))
    pm.execute_notebook(
        "plots.ipynb",
        output_notebook_dir+f"plotting_log_{filenames[station]}.ipynb",
        parameters = dict(
            station_dv = station,
            name = filenames[station],
            preprocessed_model_path = preprocessed_model_dir+f"{filenames[station]}_preprocessed.nc",
            reconstruction_path = output_reconstruction_dir+f"{filenames[station]}_reconstructed.nc",
            processed_stations_path = output_stations_dir+f"{filenames[station]}_processed_stations.csv",
            output_figure_dir = resource_filename("climpyrical", "data/results/figures/")
        )
    )
#     print("Preprocessing stations for", station)
#     pm.execute_notebook(notebooks[1], output_notebook_dir+f"intermediate_{filenames[station]}_"+notebooks[1])
#     print("Ratio kriging for", station)
#     pm.execute_notebook(notebooks[2], output_notebook_dir+f"intermediate_{filenames[station]}_"+notebooks[2])
#     print("Completed!")

Preprocessing Model for RL50 (kPa)

Preprocessing stations for RL50 (kPa)

Moving Window ratio reconstruction for RL50 (kPa)

Generating figures for RL50 (kPa)

PapermillExecutionError: 
---------------------------------------------------------------------------
Exception encountered at "In [7]":
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-7-30969b2469d2> in <module>
      2 # whether to plot on a log colorscale. This dictionary configures
      3 # these options
----> 4 plot_dict = config.plot_dict
      5 
      6 colorscale, log, decimal = plot_dict[station_dv]

NameError: name 'config' is not defined
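
The traceback above suggests that `plots.ipynb` uses `config` in its cell `In [7]` without importing it, so adding `import config` to that notebook is the likely fix. Separately, the loop in this notebook stops at the first failure, so later design values are never attempted. The sketch below shows one way to record failures and carry on. It is an illustration under stated assumptions: `run_pipeline` and `fake_run_stage` are hypothetical, and `run_stage` stands in for the `pm.execute_notebook` calls so that the example runs without papermill.

```python
def run_pipeline(design_values, stages, run_stage):
    """Run every stage for every design value and collect failures.

    run_stage(stage, design_value) is a stand-in for pm.execute_notebook
    and is expected to raise when a stage fails.
    """
    failures = {}
    for design_value in design_values:
        for stage in stages:
            try:
                run_stage(stage, design_value)
            except Exception as error:  # with papermill: PapermillExecutionError
                failures[design_value] = (stage, error)
                break  # later stages depend on this stage's output
    return failures


def fake_run_stage(stage, design_value):
    """Hypothetical runner that fails at the plotting stage, as in the log."""
    if stage == "plots.ipynb":
        raise NameError("name 'config' is not defined")


stages = ["preprocess_model.ipynb", "stations.ipynb", "ratio_kriging.ipynb", "plots.ipynb"]
failures = run_pipeline(["RL50 (kPa)"], stages, fake_run_stage)
for design_value, (stage, error) in failures.items():
    print(f"{design_value}: {stage} failed with {error!r}")
```

Stopping at the first failed stage for a given design value is deliberate here, since each notebook reads files written by the one before it.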
