# Model-Specific Input Data Preprocessing in CONFLUENCE

## Introduction

This notebook covers the model-specific preprocessing steps for input data in CONFLUENCE. With the model-agnostic preprocessing complete, we now tailor the data to the specific requirements of the chosen hydrological model (e.g., SUMMA, HYPE, or MESH).

Key aspects covered in this notebook include:

1. Formatting data according to the chosen model's input specifications
2. Generating model-specific configuration files
3. Preparing initial conditions and parameter files
4. Creating forcing data in the required format and resolution

In this notebook we ensure that our preprocessed data is compatible with the chosen hydrological model. By the end of this process, you will have a complete set of input files ready for model initialization and simulation.

## First we import the libraries and functions we need

In [1]:
import sys
from pathlib import Path
from typing import Dict, Any
import logging
import yaml # type: ignore

current_dir = Path.cwd()
parent_dir = current_dir.parent.parent
sys.path.append(str(parent_dir))

from utils.dataHandling_utils.specificPreProcessor_util import SummaPreProcessor_spatial, flashPreProcessor # type: ignore
from utils.models_utils.mizuroute_utils import MizuRoutePreProcessor

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

## Check configurations

Next, we print the key configuration settings to confirm that everything we need is defined.

In [2]:
config_path = Path('../../0_config_files/config_active.yaml')
with open(config_path, 'r') as config_file:
    config = yaml.safe_load(config_file)
    print(f"FORCING_DATASET: {config['FORCING_DATASET']}")
    print(f"EASYMORE_CLIENT: {config['EASYMORE_CLIENT']}")
    print(f"FORCING_VARIABLES: {config['FORCING_VARIABLES']}")
    print(f"EXPERIMENT_TIME_START: {config['EXPERIMENT_TIME_START']}")

FORCING_DATASET: ERA5
EASYMORE_CLIENT: easymore cli
FORCING_VARIABLES: longitude,latitude,time,LWRadAtm,SWRadAtm,pptrate,airpres,airtemp,spechum,windspd
EXPERIMENT_TIME_START: 2010-01-01 01:00

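Printing a few values confirms they exist, but it does not catch settings that are missing altogether. The snippet below is a minimal sketch of an explicit check. The helper `find_missing_settings`, the list `REQUIRED_SETTINGS`, and the `example_config` dictionary are hypothetical names introduced here for illustration, not part of CONFLUENCE; the keys listed are simply the ones this notebook reads.

```python
from typing import Any, Dict, Iterable, List


def find_missing_settings(config: Dict[str, Any], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent, None, or an empty string."""
    return [key for key in required if config.get(key) in (None, "")]


# Hypothetical list: the configuration keys this notebook reads.
REQUIRED_SETTINGS = [
    "CONFLUENCE_DATA_DIR",
    "DOMAIN_NAME",
    "HYDROLOGICAL_MODEL",
    "FORCING_DATASET",
    "FORCING_VARIABLES",
    "EXPERIMENT_TIME_START",
]

# Illustrative stand-in for the dictionary loaded from the YAML file.
example_config = {
    "DOMAIN_NAME": "Bow_at_Banff",
    "HYDROLOGICAL_MODEL": "SUMMA",
    "FORCING_DATASET": "ERA5",
}

missing = find_missing_settings(example_config, REQUIRED_SETTINGS)
if missing:
    print(f"Missing settings: {', '.join(missing)}")
```

In the notebook itself, the same helper could be called on the loaded `config` dictionary right after `yaml.safe_load`, so that a missing key is reported before any preprocessing starts.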

## Define default paths

Before running the preprocessing scripts, we define the paths to the data directories and create any directories that do not yet exist.

In [3]:
# Main project directory
data_dir = config['CONFLUENCE_DATA_DIR']
project_dir = Path(data_dir) / f"domain_{config['DOMAIN_NAME']}"

# Data directories
model_input_dir = project_dir / f"{config['HYDROLOGICAL_MODEL']}_input"

# Make sure the new directory exists
model_input_dir.mkdir(parents=True, exist_ok=True)


## Create model configuration files

In [4]:
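# Initialize and run the model-specific preprocessors

if config['HYDROLOGICAL_MODEL'] == 'SUMMA':
    # Prepare the SUMMA inputs first, then the mizuRoute routing inputs
    ssp = SummaPreProcessor_spatial(config, logger)
    ssp.run_preprocessing()

    mp = MizuRoutePreProcessor(config, logger)
    mp.run_preprocessing()

elif config['HYDROLOGICAL_MODEL'] == 'FLASH':
    # The FLASH preprocessor is only instantiated here; no preprocessing step is run
    ssp = flashPreProcessor(config, logger)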

2024-10-20 20:29:59,983 - INFO - Starting SUMMA spatial preprocessing
2024-10-20 20:29:59,984 - INFO - Starting forcing data processing
2024-10-20 20:29:59,984 - INFO - Starting to apply temperature lapse rate and add data step
2024-10-20 20:29:59,996 - INFO - Processing Bow_at_Banff_ERA5_remapped_2010-01-01-00-00-00.nc
2024-10-20 20:30:00,993 - INFO - Processing Bow_at_Banff_ERA5_remapped_2010-02-01-00-00-00.nc
2024-10-20 20:30:01,293 - INFO - Processing Bow_at_Banff_ERA5_remapped_2010-03-01-00-00-00.nc
2024-10-20 20:30:01,617 - INFO - Processing Bow_at_Banff_ERA5_remapped_2010-04-01-00-00-00.nc
2024-10-20 20:30:02,024 - INFO - Processing Bow_at_Banff_ERA5_remapped_2010-05-01-00-00-00.nc
2024-10-20 20:30:02,381 - INFO - Processing Bow_at_Banff_ERA5_remapped_2010-06-01-00-00-00.nc
2024-10-20 20:30:02,712 - INFO - Processing Bow_at_Banff_ERA5_remapped_2010-07-01-00-00-00.nc
2024-10-20 20:30:03,050 - INFO - Processing Bow_at_Banff_ERA5_remapped_2010-08-01-00-00-00.nc
2024-10-20 20:30:03,