# Example Notebook

### Using the 01_generate_single_synth_parameter_data.yaml experiment

This notebook explains how the objects in `dMC` work and how to configure them in a notebook setting.

Notebooks are a replacement for the `Experiment` class, since we handle our experiments in the notebook rather than in a `.py` file.

First, let's import everything we need:

In [40]:
# Python standard library
import os
from pathlib import Path
import sys

# PyPI packages
import matplotlib.pyplot as plt
import numpy as np
from omegaconf import DictConfig, OmegaConf
import pandas as pd
import torch
import torch.nn as nn

# Putting the dMC module in the Python Path
current_dir = Path.cwd()
dmc_dev_path = current_dir.parents[0]
sys.path.append(str(dmc_dev_path))

# Synthetic Parameter distributions and MLP Networks
from dMC.nn.power_distribution import Power
from dMC.nn.single_parameters import SingleParameters
from dMC.nn.inverse_linear import InverseLinear
from dMC.nn.parameter_list import ParameterList
from dMC.nn.mlp import MLP
from dMC.nn import Initialization

# Physics model
from dMC.physics.explicit_mc import ExplicitMC

# Experiment
from dMC.experiments.generate_synthetic import GenerateSynthetic
from dMC.experiments.writer import Writer

# Utils functions
from dMC.configuration import _set_device
from dMC.__main__ import _set_seed

# Required to generate data
from dMC.data.datasets.nhd_srb import NHDSRB
from dMC.data.observations.usgs import USGS
from dMC.data.dates import Dates
from dMC.data.normalize.min_max import MinMax
from dMC.data import DataLoader

# For evaluation
from dMC.experiments.metrics import Metrics
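
The path setup in the import cell assumes the notebook runs from a folder directly inside the repository root, so that `Path.cwd().parents[0]` is the directory containing `dMC`. Here is a minimal sketch of that logic. The directory layout is made up, and `add_to_sys_path` is a hypothetical helper (not part of dMC) that avoids duplicate entries when the cell is re-run:

```python
import sys
from pathlib import Path, PurePosixPath


def add_to_sys_path(directory: Path) -> bool:
    """Append `directory` to sys.path unless it is already there.

    Returns True if the entry was added, False if it was already present.
    """
    entry = str(directory)
    if entry in sys.path:
        return False
    sys.path.append(entry)
    return True


# parents[0] is the immediate parent directory, the same as .parent.
# The path below is a made-up example, not the real project location.
example_notebook_dir = PurePosixPath("/home/user/project/notebooks")
repo_root = example_notebook_dir.parents[0]
print(repo_root)  # /home/user/project

# Re-running the cell does not grow sys.path a second time.
first_call = add_to_sys_path(Path("/tmp/hypothetical_dmc_root"))
second_call = add_to_sys_path(Path("/tmp/hypothetical_dmc_root"))
print(first_call, second_call)  # True False
```

If the notebook is moved deeper into the tree, the `parents[...]` index has to change with it.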

## Setting up the Config

Let's import the config files from our `dMC` directory:

In [23]:
cfg = OmegaConf.load(dmc_dev_path / "dMC/conf/global_settings.yaml")
experiment_settings = OmegaConf.load(dmc_dev_path / "dMC/conf/config/01_generate_single_synth_parameter_data.yaml")
cfg.config = experiment_settings
cfg

{'defaults': [{'config': '03_train_usgs_period_1a'}, {'hydra_settings': 'settings'}, '_self_'], 'cwd': '/data/tkb5476/projects/dMC-Juniata-hydroDL2', 'data_dir': '/data/tkb5476/projects/dMC-Juniata-hydroDL2/flat_files/dMC-Juniata-hydroDL2/dx-2000dis1_non_merge', 'name': '03_train_usgs_period_1a', 'device': 'cpu', 'config': {'service_locator': {'experiment': 'generate_synthetic.GenerateSynthetic', 'data': 'nhd_srb.NHDSRB', 'observations': 'usgs.USGS', 'physics': 'explicit_mc.ExplicitMC', 'neural_network': 'single_parameters.SingleParameters'}, 'data': {'processed_dir': '${cwd}/flat_files', 'end_node': 4809, 'time': {'start': '02/01/2001 00:00:00', 'end': '09/18/2010 23:00:00', 'steps': 1344, 'tau': 9, 'batch_size': '${config.data.time.steps}'}, 'observations': {'loss_nodes': [1053, 1280, 2662, 2689, 2799, 4780, 4801, 4809], 'dir': '${data_dir}/inflow_interpolated/', 'file_name': '???'}, 'save_paths': {'edges': '${config.data.processed_dir}/${config.data.end_node}_edges.csv', 'nodes': '$

In [52]:
# Applying our global settings: the compute device and the random seed
_set_device(cfg)
_set_seed(cfg)
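
The body of `_set_seed` is not shown in this notebook. As a rough illustration of what a seed helper is for, the sketch below uses a hypothetical `set_seed_sketch` function and only the standard-library generator to show that re-seeding reproduces the same random draws:

```python
import random


def set_seed_sketch(seed: int) -> None:
    """Hypothetical stand-in for a seed helper; this is not dMC's `_set_seed`."""
    random.seed(seed)
    # A NumPy/PyTorch project would typically also call
    # np.random.seed(seed) and torch.manual_seed(seed) here.


set_seed_sketch(0)
first_draws = [random.random() for _ in range(3)]

set_seed_sketch(0)
second_draws = [random.random() for _ in range(3)]

# Same seed, same sequence.
print(first_draws == second_draws)  # True
```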

# Building Objects

Below we'll do the "behind the scenes" work of building our DataLoader, model, and experiment so that we can use those objects directly in the notebook.

## Dataloader:

In [41]:
cfg_data = cfg.config.data

dates = Dates(cfg_data)  # Dates Object
normalize = MinMax(cfg_data)  # Normalization Object
data = NHDSRB(cfg_data, dates=dates, normalize=normalize)  # Dataset Object
obs = USGS(cfg_data, dates, normalize)  # Observations Object

# Getting the data
hydrofabric = data.get_data()
observations = obs.get_data().transpose(0, 1)

dataloader = DataLoader(data, obs)(cfg_data)
dataloader

<torch.utils.data.dataloader.DataLoader at 0x7f6c8dfa6ad0>
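
The `MinMax` object's internals are not shown here. For orientation, this is what min-max normalization generally means, written as a plain-Python sketch with hypothetical function names; the actual class may handle tensors, per-attribute ranges, or edge cases differently:

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: avoid dividing by zero and map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def min_max_invert(scaled, lo, hi):
    """Map scaled values back to the original range [lo, hi]."""
    return [s * (hi - lo) + lo for s in scaled]


raw = [2.0, 4.0, 6.0]
scaled = min_max_scale(raw)
restored = min_max_invert(scaled, lo=min(raw), hi=max(raw))
print(scaled)  # [0.0, 0.5, 1.0]
print(restored)  # [2.0, 4.0, 6.0]
```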

## Model:

In [47]:
cfg_model = cfg.config.model

neural_network = SingleParameters(cfg=cfg_model).to(cfg_model.device)
physics_model = ExplicitMC(cfg=cfg_model, neural_network=neural_network)
physics_model

ExplicitMC(
  (neural_network): SingleParameters()
)

In [48]:
physics_model.neural_network.n

Parameter containing:
tensor(0.0300, requires_grad=True)
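
For readers unfamiliar with the routing scheme behind `ExplicitMC`, classic Muskingum routing advances the outflow of a reach with three weights that sum to one. The sketch below is the textbook formulation with fixed, made-up values for the storage constant `k` and weighting factor `x`. It is not dMC's implementation: Muskingum-Cunge variants derive these two quantities from channel properties and flow instead of fixing them.

```python
def muskingum_coefficients(k, x, dt):
    """Textbook Muskingum weights for storage constant k, weighting x, time step dt."""
    denom = 2.0 * k * (1.0 - x) + dt
    c1 = (dt - 2.0 * k * x) / denom  # weight on current inflow
    c2 = (dt + 2.0 * k * x) / denom  # weight on previous inflow
    c3 = (2.0 * k * (1.0 - x) - dt) / denom  # weight on previous outflow
    return c1, c2, c3


def route(inflow, k, x, dt):
    """Route an inflow series through one reach, starting at steady state."""
    c1, c2, c3 = muskingum_coefficients(k, x, dt)
    outflow = [inflow[0]]
    for t in range(1, len(inflow)):
        outflow.append(c1 * inflow[t] + c2 * inflow[t - 1] + c3 * outflow[-1])
    return outflow


# Made-up reach parameters: k = 2 h, x = 0.2, hourly time step (all in seconds).
K, X, DT = 7200.0, 0.2, 3600.0
coefficients = muskingum_coefficients(K, X, DT)

# A single inflow pulse is delayed and attenuated by the reach.
pulse_inflow = [1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0, 1.0]
pulse_outflow = route(pulse_inflow, K, X, DT)
print([round(q, 3) for q in pulse_outflow])
```

Because the weights sum to one, a constant inflow passes through unchanged, which makes a convenient sanity check for any routing code.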

## Experiment:

In [37]:
cfg_experiment = cfg.config.experiment
# writer = Writer(cfg_experiment)
experiment = GenerateSynthetic(cfg=cfg_experiment, writer=None)
experiment

<dMC.experiments.generate_synthetic.GenerateSynthetic at 0x7f6c8dc42a10>

# Running the experiment

As with the dependency injection framework in the code, you run the experiment by passing it the dataloader and the physics model:

In [49]:
experiment.run(dataloader, physics_model)

Epoch 0: Explicit Muskingum Cunge Routing:   0%|          | 0/1343 [00:00<?, ?it/s]
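
The call above follows a common dependency injection pattern: the experiment owns the loop, while the data source and the model are handed in from outside. A toy sketch of that pattern, with hypothetical classes that have nothing to do with dMC's real ones:

```python
class ScaleModel:
    """Toy model: multiplies every value in a batch by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, batch):
        return [self.factor * value for value in batch]


class CollectOutputs:
    """Toy experiment: knows how to loop, not which data or model it receives."""

    def __init__(self):
        self.outputs = []

    def run(self, dataloader, model):
        for batch in dataloader:
            self.outputs.append(model(batch))
        return self.outputs


# Any iterable of batches and any callable model can be injected.
toy_dataloader = [[1.0, 2.0], [3.0]]
toy_experiment = CollectOutputs()
result = toy_experiment.run(toy_dataloader, ScaleModel(2.0))
print(result)  # [[2.0, 4.0], [6.0]]
```

Swapping in a different model or dataset means changing only what is passed to `run`, which is what makes the same experiment object reusable across configs.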

In [51]:
experiment.save_path

PosixPath('/data/tkb5476/projects/dMC-Juniata-hydroDL2/runs/01_synthetic_data')

In [55]:
# Our synthetic discharge outputs. Rows are time steps; columns are the edges where discharge is recorded
df = pd.read_csv(experiment.save_path / "01_generate_single_synth_parameter_data.csv")
df.head()

| Unnamed: 0.1 | Unnamed: 0 | 1053 | 1280 | 2662 | 2689 | 2799 | 4780 | 4801 | 4809 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0.072551 | 0.303997 | 0.035397 | 0.017339 | 0.273754 | 0.097784 | 0.170211 | 0.272679 |
| 1 | 1 | 0.345283 | 1.01094 | 0.064402 | 0.048944 | 0.513326 | 0.258317 | 0.222125 | 0.407868 |
| 2 | 2 | 0.75554 | 1.308555 | 0.099312 | 0.127346 | 0.782226 | 0.438819 | 0.284994 | 0.669957 |
| 3 | 3 | 1.101563 | 1.415994 | 0.155389 | 0.183362 | 1.09874 | 0.659021 | 0.400763 | 0.938 |
| 4 | 4 | 1.485405 | 1.499759 | 0.235011 | 0.221742 | 1.618124 | 0.993112 | 0.632635 | 1.26881 |
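
The `Unnamed: 0.1` and `Unnamed: 0` columns in the output above look like row indices that were written to the CSV and then read back as ordinary data, probably more than once. That is a guess from the column names, not something the notebook confirms. The sketch below reproduces the effect on a tiny made-up frame and shows how `index_col=0` avoids it:

```python
import io

import pandas as pd

# Tiny made-up frame; the column names imitate node ids from the output above.
frame = pd.DataFrame({"1053": [0.07, 0.35], "4809": [0.27, 0.41]})

# to_csv writes the row index as a first column with an empty header.
buffer = io.StringIO()
frame.to_csv(buffer)

# Reading it back naively turns that index into a data column.
buffer.seek(0)
naive = pd.read_csv(buffer)
print(list(naive.columns))  # ['Unnamed: 0', '1053', '4809']

# Telling pandas that the first column is the index keeps the data clean.
buffer.seek(0)
clean = pd.read_csv(buffer, index_col=0)
print(list(clean.columns))  # ['1053', '4809']
```

Writing with `to_csv(path, index=False)` is the other common way to keep the extra column out of the file in the first place.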


# What now?

Feel free to check out the other experiments. All of the objects they use are listed in their `service_locator` config entry.
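
The `service_locator` values shown earlier, such as `'generate_synthetic.GenerateSynthetic'`, read like a module name followed by a class name. How dMC turns them into objects is not shown in this notebook. One plausible approach, sketched with a hypothetical `resolve` helper and standard-library classes as stand-ins so it runs without dMC installed:

```python
import importlib


def resolve(entry: str, package: str):
    """Turn a 'module.ClassName' string into the class found inside `package`."""
    module_name, class_name = entry.rsplit(".", 1)
    module = importlib.import_module(f"{package}.{module_name}")
    return getattr(module, class_name)


# Stand-in locator: the `json` package plays the role of a dMC subpackage.
locator = {
    "parser": "decoder.JSONDecoder",
    "writer": "encoder.JSONEncoder",
}
classes = {name: resolve(entry, package="json") for name, entry in locator.items()}

print(classes["parser"].__name__)  # JSONDecoder
print(classes["writer"]().encode([1]))  # [1]
```

With a lookup like this, switching an experiment to a different dataset or network is a one-line config change rather than a code change.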

### 01: Single Parameter Experiments
To run these, you should use the following configs:
- `01_generate_single_synth_parameter_data.yaml`
- `01_train_against_single_synthetic.yaml`

### 02: Synthetic Parameter Distribution Recovery

There are several synthetic parameter experiments. Run the following configs to recreate them:

#### Synthetic Constants
- `02_generate_mlp_param_list.yaml`
- `02_train_mlp_param_list.yaml`

#### Synthetic Power Law A
- `02_generate_mlp_power_a.yaml`
- `02_train_mlp_power_a.yaml`

#### Synthetic Power Law B
- `02_generate_mlp_power_b.yaml`
- `02_train_mlp_power_b.yaml`

### 03: Train against USGS data:
You can run the following configs to train models against USGS data:
- `03_train_usgs_period_1a.yaml`
- `03_train_usgs_period_1b.yaml`
- `03_train_usgs_period_2a.yaml`
- `03_train_usgs_period_2b.yaml`
- `03_train_usgs_period_3a.yaml`
- `03_train_usgs_period_3b.yaml`
- `03_train_usgs_period_4a.yaml`
- `03_train_usgs_period_4b.yaml`