# Interactive Moving Window Kriging Pipeline
---
The preprocessing pipeline is executed in the following sequence. It assumes that masks have been generated. If a custom mask is required, work through the `mask.ipynb` file to see how the default masks were generated.

1. Place netCDF models into `climpyrical/data/model_inputs`. Ensemble models must have:
    * lat, lon, rlat, rlon and a 2D data field variable
2. Place station files into `climpyrical/data/station_inputs`. Input stations must have:
    * A data column with the design value of interest, in the same units as the ensemble model. The units need to be placed in parentheses after the data variable name; for example, "RL50 (kPa)" and "HDD (degC-day)" are valid names
    * latitude and longitude columns
    * Additional columns, like province name, elevation, and station name are optional, but recommended
3. The data produced in the pipeline will go in various subdirectories of `climpyrical/data/results/` using the PCIC design value naming standards (outlined below)
    * figures will be in `climpyrical/data/results/figures/`
    * tables will be in `climpyrical/data/results/tables/`
    * netCDF files in `climpyrical/data/results/netcdf/`
    * intermediate notebooks for troubleshooting will be in `climpyrical/data/results/intermediate/`
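
The station-file requirements in step 2 can be checked before the pipeline is run. The following is a minimal sketch, not part of climpyrical: the helper names `parse_units` and `check_station_header` are hypothetical, the sample file contents are made up for illustration, and the check assumes that coordinate columns start with "lat" and "lon".

```python
import csv
import io
import re

# Matches a header such as "RL50 (kPa)" and captures the name and the units.
UNITS_PATTERN = re.compile(r"^(?P<name>.+?)\s*\((?P<units>[^()]+)\)$")


def parse_units(column_name):
    """Return (name, units) for 'name (units)', or (name, None) if no units are given."""
    cleaned = column_name.strip()
    match = UNITS_PATTERN.match(cleaned)
    if match is None:
        return cleaned, None
    return match.group("name"), match.group("units")


def check_station_header(header, station_dv):
    """Return a list of problems found in a station file header (empty if none)."""
    problems = []
    lowered = [col.strip().lower() for col in header]
    if station_dv not in header:
        problems.append(f"missing data column {station_dv!r}")
    for coord in ("lat", "lon"):
        # Assumption: coordinate columns start with "lat" / "lon" (e.g. "latitude").
        if not any(col.startswith(coord) for col in lowered):
            problems.append(f"missing {coord} column")
    return problems


# Hypothetical station file contents, for illustration only.
sample = io.StringIO("station_name,lat,lon,RL50 (kPa)\nExample,49.0,-123.0,0.5\n")
header = next(csv.reader(sample))

print(parse_units("RL50 (kPa)"))                   # ('RL50', 'kPa')
print(check_station_header(header, "RL50 (kPa)"))  # []
```

A header without parenthesized units, such as "WP10", yields `None` for the units, which makes it easy to flag columns whose units need to be confirmed by hand.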

```
climpyrical/data/results
├── netcdf
│   └── 
├── figures
│   ├── 
├── intermediate
│   ├── notebooks
│   │   ├── model_log_{design value}.ipynb
│   │   ├── plotting_log_{design value}.ipynb
│   │   ├── RR_log_{design value}.ipynb
│   │   ├── station_log_{design value}.ipynb
│   ├── preprocessed_netcdf
│   │   ├── {design value}_preprocessed.nc
│   └── preprocessed_stations
│       └── {design value}_processed_stations.csv
└── tables
    └── {design value}_tablec2.csv
```
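
As a rough illustration of the naming scheme, the sketch below builds the intermediate file paths expected for one design value and reports which of them are missing. The function names and the relative `RESULTS_ROOT` are assumptions made for this example; `R15m10` is used only because it appears in this notebook's output paths.

```python
from pathlib import Path

# Assumption: results live under this root, relative to the repository.
RESULTS_ROOT = Path("climpyrical/data/results")


def expected_outputs(name, root=RESULTS_ROOT):
    """Map a short label to the intermediate file expected for design value `name`."""
    intermediate = root / "intermediate"
    return {
        "preprocessed_model": intermediate / "preprocessed_netcdf" / f"{name}_preprocessed.nc",
        "processed_stations": intermediate / "preprocessed_stations" / f"{name}_processed_stations.csv",
    }


def missing_outputs(name, root=RESULTS_ROOT):
    """Return the labels whose expected file does not exist yet."""
    return [
        label
        for label, path in expected_outputs(name, root).items()
        if not path.exists()
    ]


for label, path in expected_outputs("R15m10").items():
    print(label, path)
```

Running `missing_outputs` before the plotting or table steps gives an early hint that a preprocessing step was skipped for that design value.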

In [1]:
import papermill as pm
import config
from simple_colors import red
from pkg_resources import resource_filename

## Configuration
---

Configure the notebook pipeline. This notebook calls subsequent notebooks in the correct order.

`station_dvs` lists the design value names as they appear in the header row of the station CSV files supplied to the station processing step. The naming standards of the station files and of the output files differ, so the mapping between them needs to be configured manually.


`filenames` is a dictionary that maps each station design value name to the PCIC standard name for that design value. Output file names and plot titles are built from this mapping.
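
The `config` module itself is not shown in this notebook. The sketch below is a hypothetical example of what it might contain, given the four names imported from it in the next cell. Only the mapping of "Gum-LM RL10 (mm)" to "R15m10" is taken from this notebook's output; both input paths are placeholders.

```python
# Hypothetical sketch of config.py; the real module is not shown in this notebook.

# Design value names exactly as they appear in the station CSV headers.
station_dvs = ["Gum-LM RL10 (mm)"]

# Station design value name -> PCIC standard name used in file names and plot titles.
# "R15m10" is the name that appears in this notebook's output paths.
filenames = {"Gum-LM RL10 (mm)": "R15m10"}

# Paths relative to the climpyrical package; both values are placeholders.
model_paths = {"Gum-LM RL10 (mm)": "data/model_inputs/example_model.nc"}
station_paths = {"Gum-LM RL10 (mm)": "data/station_inputs/example_stations.csv"}

# Every design value needs an entry in each dictionary, so check that up front.
mappings = {
    "filenames": filenames,
    "model_paths": model_paths,
    "station_paths": station_paths,
}
for dv in station_dvs:
    for mapping_name, mapping in mappings.items():
        if dv not in mapping:
            raise KeyError(f"{dv!r} is missing from {mapping_name}")
```

Checking the dictionaries when the configuration is loaded surfaces a missing entry immediately, rather than partway through a long pipeline run.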

In [2]:
notebooks = ["preprocess_model.ipynb", "stations.ipynb", "ratio_kriging.ipynb"]

station_dvs = config.station_dvs
filenames = config.filenames
model_paths = config.model_paths
station_paths = config.station_paths

## Run the pipeline
---
For each design value in the `station_dvs` list, run each notebook in the pipeline.
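
Papermill passes the `parameters` dictionary to a notebook by adding a new cell tagged `injected-parameters` after the cell tagged `parameters`. If the input notebook has no such cell, papermill logs a warning and injects the parameters at the top of the notebook. The sketch below shows one way to check a notebook for the tag using only the standard library; the function names are hypothetical and the example notebook is hand-written for illustration.

```python
import json


def has_parameters_cell(notebook):
    """True if any cell in a notebook dict (nbformat v4 layout) is tagged 'parameters'."""
    return any(
        "parameters" in cell.get("metadata", {}).get("tags", [])
        for cell in notebook.get("cells", [])
    )


def notebook_file_has_parameters(path):
    """Load an .ipynb file (plain JSON) and check it for a 'parameters' cell."""
    with open(path, encoding="utf-8") as handle:
        return has_parameters_cell(json.load(handle))


# Minimal hand-written notebook, for illustration only.
example = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "metadata": {"tags": ["parameters"]},
            "source": "station_dv = None\nname = None",
            "outputs": [],
            "execution_count": None,
        }
    ],
}

print(has_parameters_cell(example))  # True
```

Checking each notebook in the pipeline this way before a run makes it clear which notebooks will receive their parameters at the top instead of after a dedicated defaults cell.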

In [3]:
output_notebook_dir = resource_filename(
    "climpyrical",
    "/data/results/intermediate/notebooks/"
)

preprocessed_model_dir = resource_filename(
    "climpyrical",
    "/data/results/intermediate/preprocessed_netcdf/"
)

output_stations_dir = resource_filename(
    "climpyrical",
    "/data/results/intermediate/preprocessed_stations/"
)

output_reconstruction_dir = resource_filename(
    "climpyrical",
    "/data/results/netcdf/"
)

output_tables_dir = resource_filename(
    "climpyrical",
    "/data/results/TableC2/"
)

for station in station_dvs:
#     print(red(f"Preprocessing Model for {station}", "bold"))
#     pm.execute_notebook(
#         "preprocess_model.ipynb",
#         output_notebook_dir+f"preprocessing_model_log_{filenames[station]}.ipynb",
#         parameters = dict(
#             station_dv = station,
#             model_input_path = resource_filename("climpyrical", model_paths[station]),
#             name = filenames[station],
#             fill_glaciers = True,
#             processed_model_output_path = preprocessed_model_dir+f"{filenames[station]}_preprocessed.nc"
#         )
#     )
#     print(red(f"Preprocessing stations for {station}", "bold"))
#     pm.execute_notebook(
#         "stations.ipynb",
#         output_notebook_dir+f"stations_log_{filenames[station]}.ipynb",
#         parameters = dict(
#             station_dv = station,
#             station_input_path = resource_filename(
#                 "climpyrical",
#                 station_paths[station]
#             ),
#             name = filenames[station],
#             processed_model_output_path = preprocessed_model_dir+f"{filenames[station]}_preprocessed.nc",
#             df_path_write = output_stations_dir+f"{filenames[station]}_processed_stations.csv"
            
#         )
#     )
# #     print(red(f"Moving Window ratio reconstruction for {station}", "bold"))
# #     pm.execute_notebook(
# #         "ratio_kriging.ipynb",
# #         output_notebook_dir+f"ratio_kriging_log_{filenames[station]}.ipynb",
# #         parameters = dict(
# #             station_dv = station,
# #             station_input_path = resource_filename(
# #                 "climpyrical",
# #                 station_paths[station]
# #             ),
# #             name = filenames[station],
# #             processed_model_output_path = preprocessed_model_dir+f"{filenames[station]}_preprocessed.nc",
# #             output_reconstruction_path = output_reconstruction_dir+f"{filenames[station]}_reconstructed.nc",
# #             df_path_write = output_stations_dir+f"{filenames[station]}_processed_stations.csv"
# #         )
# #     )
#     print(red(f"Moving Window ratio reconstruction for {station}", "bold"))
#     pm.execute_notebook(
#         "ratio_kriging_med_corrected.ipynb",
#         output_notebook_dir+f"ratio_kriging_log_{filenames[station]}.ipynb",
#         parameters = dict(
#             station_dv = station,
#             station_input_path = resource_filename(
#                 "climpyrical",
#                 station_paths[station]
#             ),
#             name = filenames[station],
#             processed_model_output_path = preprocessed_model_dir+f"{filenames[station]}_preprocessed.nc",
#             output_reconstruction_path = output_reconstruction_dir+f"{filenames[station]}_reconstructed.nc",
#             df_path_write = output_stations_dir+f"{filenames[station]}_processed_stations.csv"
#         )
#     )
    print(red(f"Generating figures for {station}", "bold"))
    pm.execute_notebook(
        "plots.ipynb",
        output_notebook_dir+f"plots_log_{filenames[station]}.ipynb",
        parameters = dict(
            station_dv = station,
            name = filenames[station],
            original_model_path = resource_filename("climpyrical", model_paths[station]),
            preprocessed_model_path = preprocessed_model_dir+f"{filenames[station]}_preprocessed.nc",
            reconstruction_path = output_reconstruction_dir+f"{filenames[station]}_reconstructed.nc",
            processed_station_path = output_stations_dir+f"{filenames[station]}_processed_stations.csv",
            output_figure_dir = resource_filename("climpyrical", "data/results/figures/")
        )
    )
    print(red(f"Generating tables for {station}", "bold"))
    pm.execute_notebook(
        "nbcc_stations.ipynb",
        output_notebook_dir+f"nbcc_stations_log_{filenames[station]}.ipynb",
        parameters = dict(
            station_dv = station,
            output_reconstruction_path = output_reconstruction_dir+f"{filenames[station]}_reconstructed.nc",
            processed_station_path = output_stations_dir+f"{filenames[station]}_processed_stations.csv",
            output_nrc_path = output_tables_dir+f"{filenames[station]}_TableC2.csv"
        )
    )
    
    
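# Note: `station` still holds the last design value from the loop above,
# so the log name and parameters below refer to that design value only.
print(red("Combining tables for all reconstructions", "bold"))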
pm.execute_notebook(
    "combine_tables.ipynb",
    output_notebook_dir+f"combined_stations_log_{filenames[station]}.ipynb",
    parameters = dict(
        station_dv = station,
        output_reconstruction_path = output_reconstruction_dir+f"{filenames[station]}_reconstructed.nc",
        processed_station_path = output_stations_dir+f"{filenames[station]}_processed_stations.csv",
        output_nrc_path = output_tables_dir+f"{filenames[station]}_TableC2.csv"
    )
)

Generating figures for RL50 (kPa)
Generating tables for RL50 (kPa)
Generating figures for SL50 (kPa)
Generating tables for SL50 (kPa)
Generating figures for moisture_index
Generating tables for moisture_index
Generating figures for mean RH (%)
Generating tables for mean RH (%)
Generating figures for HDD (degC-day)
Generating tables for HDD (degC-day)
Generating figures for TJan2.5 (degC)
Generating tables for TJan2.5 (degC)
Generating figures for TJan1.0 (degC)
Generating tables for TJan1.0 (degC)
Generating figures for TJul2.5 (degC)
Generating tables for TJul2.5 (degC)
Generating figures for TwJul2.5 (degC)
Generating tables for TwJul2.5 (degC)
Generating figures for Tmin (degC)
Generating tables for Tmin (degC)
Generating figures for Tmax (degC)
Generating tables for Tmax (degC)
Generating figures for WP10
Generating tables for WP10
Generating figures for WP50
Generating tables for WP50
Generating figures for DRWP-RL5 (Pa)
Generating tables for DRWP-RL5 (Pa)
Generating figures for annual_pr (mm)
Generating tables for annual_pr (mm)
Generating figures for annual_rain (mm)
Generating tables for annual_rain (mm)
Generating figures for 1day rain RL50 (mm)
Generating tables for 1day rain RL50 (mm)
Generating figures for Gum-LM RL10 (mm)
Generating tables for Gum-LM RL10 (mm)

Input notebook does not contain a cell with tag 'parameters'

Combining tables for all reconstructions




{'cells': [{'cell_type': 'code',
   'metadata': {'tags': ['injected-parameters'],
    'papermill': {'exception': False,
     'start_time': '2020-12-16T20:00:14.609996',
     'end_time': '2020-12-16T20:00:14.627023',
     'duration': 0.017027,
     'status': 'completed'},
    'execution': {'iopub.status.busy': '2020-12-16T20:00:14.624585Z',
     'iopub.execute_input': '2020-12-16T20:00:14.624923Z',
     'shell.execute_reply': '2020-12-16T20:00:14.626560Z',
     'iopub.status.idle': '2020-12-16T20:00:14.626905Z'}},
   'execution_count': 1,
   'source': '# Parameters\nstation_dv = "Gum-LM RL10 (mm)"\noutput_reconstruction_path = "/home/nannau/Desktop/pipeline/climpyrical/climpyrical/data/results/netcdf/R15m10_reconstructed.nc"\nprocessed_station_path = "/home/nannau/Desktop/pipeline/climpyrical/climpyrical/data/results/intermediate/preprocessed_stations/R15m10_processed_stations.csv"\noutput_nrc_path = "/home/nannau/Desktop/pipeline/climpyrical/climpyrical/data/results/TableC2/R15m10_Tabl