## HRES Postprocessing

This script postprocesses HRES data.

The script:

* Takes preprocessed HRES data, which contains only the "pr" variable (precipitation), and writes it back into the original HRES file format. The original HRES files contain multiple variables, including the "tp" variable for precipitation.

* Restores the original format of the HRES files by re-accumulating the preprocessed precipitation, so that "tp" again holds cumulative precipitation.

* Converts precipitation from `mm` back to `m` (division by 1000).
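The accumulation and unit conversion described above can be sketched on a small array. This is a minimal illustration, not the notebook's own data: the values and the `(time, lat, lon)` shape of `pr_mm` are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical per-time-step precipitation increments in mm,
# with assumed shape (time, lat, lon) = (3, 1, 2).
pr_mm = np.array([
    [[0.0, 1.0]],
    [[2.0, 0.5]],
    [[1.0, 0.0]],
])

# Accumulate along the time axis, then convert mm -> m.
tp_m = np.cumsum(pr_mm, axis=0) / 1000.0

# Inverse check: differencing the cumulative field recovers the increments.
recovered_mm = np.diff(tp_m, axis=0, prepend=0.0) * 1000.0

print(tp_m[:, 0, 0])
print(np.allclose(recovered_mm, pr_mm))
```

The inverse check is a cheap way to confirm that no information is lost in the round trip between increments in `mm` and cumulative totals in `m`.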


You must specify which predicted file to use (e.g., `HRES_C_predict_data.mse.32.0.01.0.0001.0.25.2.2.8.0.1.64_mse.32.0.01.0.0001.0.25.2.2.8.0.1.64.nc`).

In [4]:
# import libraries and directories
from py_env_hpc import *

### 1. Original HRES

In [5]:
# Define the start and end dates for the loop
date_start = "2020-07-01"
date_end = "2023-04-26"

# Open the preprocessed data
preprocessed_file = os.path.join(ATMOS_DATA, "HRES_pr.nc")
pre_HRES = xr.open_dataset(preprocessed_file)

# Iterate through each day in the date range
current_date = pd.to_datetime(date_start)
prev_month = None

while current_date <= pd.to_datetime(date_end):

    # Convert the current date to the required format
    current_date_str = current_date.strftime("%Y%m%d")

    # Define the file names
    original_file = os.path.join(HRES_OR, f"ADAPTER_DE05_{current_date_str}.12.0-90-1.boundary_1.nc")
    output_file = os.path.join(PARFLOWCLM, "sim", "ADAPTER_DE05_ECMWF-HRES_detforecast__FZJ-IBG3-ParFlowCLM380D_v03bJuwelsGpuProdClimatologyTl_PRhourly", "forcing", "o.data.MARS_retrieval", current_date.strftime("%Y"), f"ADAPTER_DE05_{current_date_str}.12.0-90-1.boundary_1.nc")

    # Open the original file
    orig_HRES = xr.open_dataset(original_file)

    # Find the common dates between pre_HRES and orig_HRES
    common_dates = sorted(list(set(pre_HRES.time.values) & set(orig_HRES.time.values)))

    # Get the indices for the common dates in pre_HRES and orig_HRES
    preprocessed_indices = np.where(np.isin(pre_HRES.time.values, common_dates))[0]
    original_indices = np.where(np.isin(orig_HRES.time.values, common_dates))[0]

    # Convert the data from preprocessed HRES to cumulative
    cumulative_data = np.cumsum(pre_HRES.pr.values[preprocessed_indices], axis=0)

    # Replace "tp" in the original file with the cumulative preprocessed data (mm -> m)
    orig_HRES["tp"].values[original_indices] = cumulative_data/1000

    # Add a history attribute to the original file
    orig_HRES.attrs["history"] = "postprocessed by Kaveh - ATMOSCORRECT/HRES_POSTP.ipynb"

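    # Create the output directory if it doesn't exist
    os.makedirs(os.path.dirname(output_file), exist_ok=True)

    # Save the modified original HRES file with the same name in the output directory
    orig_HRES.to_netcdf(output_file)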

    # Print the month when it changes
    current_month = current_date.strftime("%m")
    if current_month != prev_month:
        print(f"{current_date.strftime('%Y')}/{current_month} postprocessing is completed.")
        prev_month = current_month

    # Move to the next day
    current_date += pd.DateOffset(days=1)

2020/07 postprocessing is completed.
2020/08 postprocessing is completed.
2020/09 postprocessing is completed.
2020/10 postprocessing is completed.
2020/11 postprocessing is completed.
2020/12 postprocessing is completed.
2021/01 postprocessing is completed.
2021/02 postprocessing is completed.
2021/03 postprocessing is completed.
2021/04 postprocessing is completed.
2021/05 postprocessing is completed.
2021/06 postprocessing is completed.
2021/07 postprocessing is completed.
2021/08 postprocessing is completed.
2021/09 postprocessing is completed.
2021/10 postprocessing is completed.
2021/11 postprocessing is completed.
2021/12 postprocessing is completed.
2022/01 postprocessing is completed.
2022/02 postprocessing is completed.
2022/03 postprocessing is completed.
2022/04 postprocessing is completed.
2022/05 postprocessing is completed.
2022/06 postprocessing is completed.
2022/07 postprocessing is completed.
2022/08 postprocessing is completed.
2022/09 postprocessing is completed.
2
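The daily iteration and file naming used in the loop above can be checked without touching any data. The sketch below uses only the standard library; the helper names `iter_days` and `hres_file_name` are hypothetical and are not part of the notebook.

```python
from datetime import date, timedelta


def iter_days(start, end):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def hres_file_name(day):
    """Build the daily file name pattern used in the loop above."""
    return f"ADAPTER_DE05_{day.strftime('%Y%m%d')}.12.0-90-1.boundary_1.nc"


days = list(iter_days(date(2020, 7, 1), date(2023, 4, 26)))
names = [hres_file_name(d) for d in days]

# Distinct (year, month) pairs: one progress message is expected per pair.
months = sorted({(d.year, d.month) for d in days})

print(len(days), names[0], names[-1], len(months))
```

Listing the expected file names up front makes it easy to spot missing input files before starting a long run.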

### 2. Corrected HRES

In [5]:
# Define which file to postprocess
PREDICT_FILE="HRES_C_predict_data.mse.32.0.01.0.0001.0.25.2.2.8.0.1.64_mse.32.0.01.0.0001.0.25.2.2.8.0.1.64.nc"

# Define the start and end dates for the loop
date_start = "2020-07-01"
date_end = "2023-04-26"

# Open HRES_C file
HRES_C_file = os.path.join(PREDICT_FILES, PREDICT_FILE)
HRES_C = xr.open_dataset(HRES_C_file)

# Iterate through each day in the date range
current_date = pd.to_datetime(date_start)
prev_month = None

while current_date <= pd.to_datetime(date_end):

    # Convert the current date to the required format
    current_date_str = current_date.strftime("%Y%m%d")

    # Define the file names
    original_file = os.path.join(HRES_OR, f"ADAPTER_DE05_{current_date_str}.12.0-90-1.boundary_1.nc")
    output_file = os.path.join(PARFLOWCLM, "sim", "ADAPTER_DE05_ECMWF-HRES_detforecast__FZJ-IBG3-ParFlowCLM380D_v03bJuwelsGpuProdClimatologyTl_PRhourly_HRES_CORRECTED", "forcing", "o.data.MARS_retrieval", current_date.strftime("%Y"), f"ADAPTER_DE05_{current_date_str}.12.0-90-1.boundary_1.nc")

    # Check if the output file already exists
    if not os.path.isfile(output_file):
    
        # Open the original file
        orig_HRES = xr.open_dataset(original_file)

        # Find the common dates between HRES_C and orig_HRES
        common_dates = sorted(list(set(HRES_C.time.values) & set(orig_HRES.time.values)))

        # Get the indices for the common dates in HRES_C and orig_HRES
        preprocessed_indices = np.where(np.isin(HRES_C.time.values, common_dates))[0]
        original_indices = np.where(np.isin(orig_HRES.time.values, common_dates))[0]

        # Convert the corrected HRES (HRES_C) data to cumulative
        cumulative_data = np.cumsum(HRES_C.pr.values[preprocessed_indices], axis=0)

        # Replace "tp" in the original file with the cumulative corrected data (mm -> m)
        orig_HRES["tp"].values[original_indices] = cumulative_data/1000

        # Add a history attribute to the original file
        orig_HRES.attrs["history"] = "postprocessed by Kaveh - ATMOSCORRECT/HRES_POSTP.ipynb"

        # Create directories if they don't exist
        os.makedirs(os.path.dirname(output_file), exist_ok=True)

        # Save the modified original HRES file with the same name in the output directory
        orig_HRES.to_netcdf(output_file)

        # Print the month when it changes
        current_month = current_date.strftime("%m")
        if current_month != prev_month:
            print(f"{current_date.strftime('%Y')}/{current_month} postprocessing is completed.")
            prev_month = current_month

    # Move to the next day
    current_date += pd.DateOffset(days=1)

2020/07 postprocessing is completed.
2020/08 postprocessing is completed.
2020/09 postprocessing is completed.
2020/10 postprocessing is completed.
2020/11 postprocessing is completed.
2020/12 postprocessing is completed.
2021/01 postprocessing is completed.
2021/02 postprocessing is completed.
2021/03 postprocessing is completed.
2021/04 postprocessing is completed.
2021/05 postprocessing is completed.
2021/06 postprocessing is completed.
2021/07 postprocessing is completed.
2021/08 postprocessing is completed.
2021/09 postprocessing is completed.
2021/10 postprocessing is completed.
2021/11 postprocessing is completed.
2021/12 postprocessing is completed.
2022/01 postprocessing is completed.
2022/02 postprocessing is completed.
2022/03 postprocessing is completed.
2022/04 postprocessing is completed.
2022/05 postprocessing is completed.
2022/06 postprocessing is completed.
2022/07 postprocessing is completed.
2022/08 postprocessing is completed.
2022/09 postprocessing is completed.
2
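The corrected-HRES loop differs from the first one mainly in that it skips output files that already exist and creates missing directories, which makes an interrupted run safe to restart. A minimal sketch of that pattern follows; `write_if_missing` is a hypothetical helper, and plain bytes stand in for the NetCDF write.

```python
import os
import tempfile


def write_if_missing(output_file, payload):
    """Write payload to output_file unless the file already exists.

    Returns True if the file was written, False if it was skipped.
    """
    if os.path.isfile(output_file):
        return False
    # Create the parent directories if they don't exist yet.
    os.makedirs(os.path.dirname(output_file), exist_ok=True)
    with open(output_file, "wb") as fh:
        fh.write(payload)
    return True


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "forcing", "2020", "example.nc")
    first = write_if_missing(target, b"placeholder")   # written
    second = write_if_missing(target, b"placeholder")  # skipped

print(first, second)
```

One consequence of this design is that a partially written file from a crashed run would also be skipped, so it may be worth deleting the most recent output file before restarting.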