### **Unit Conversion of CMIP6 NetCDF Files**
This step converts the units of key climate variables in the **clipped NetCDF files** and saves the results in a new folder that mirrors the original directory structure.

- **Defines input (`clipped_data`) and output (`converted_data`) directories** to keep clipped and converted files separate.
- **Loops through all models, scenarios, and parameters**, ensuring all climate variables are processed.
- **Applies necessary unit conversions**:
  - **Precipitation (`pr`)**: Converts from **kg/m²/s** to **mm/day**.
  - **Temperature (`tasmax`, `tasmin`)**: Converts from **Kelvin to Celsius**.
  - **Relative Humidity (`hurs`)**: Converts from **percentage to fraction**.
  - **Solar Radiation (`rsds`)**: Converts from **W/m² to MJ/m²/day**.
- **Saves the converted files** in a structured **converted_data** directory.
- **Handles errors gracefully**, ensuring that failed conversions do not stop the entire process.

At the end of this step, all **CMIP6 NetCDF files use consistent units** suitable for hydrological modeling.
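
The conversion factors follow from simple unit arithmetic. The sketch below is illustrative only: the function names and sample values are made up for this example, and it assumes `pr` and `rsds` are daily-mean fluxes.

```python
# Illustrative derivation of the conversion factors (sample values are made up).
SECONDS_PER_DAY = 24 * 60 * 60        # 86400 s in one day
JOULES_PER_MEGAJOULE = 1_000_000

def pr_to_mm_per_day(flux_kg_m2_s):
    # 1 kg of water spread over 1 m² forms a layer about 1 mm deep,
    # so kg/m²/s is numerically mm/s; scale by the seconds in a day.
    return flux_kg_m2_s * SECONDS_PER_DAY

def rsds_to_mj_per_m2_day(flux_w_m2):
    # 1 W = 1 J/s, so a daily-mean flux delivers flux * 86400 J/m² per day;
    # dividing by 1e6 gives MJ/m²/day (overall factor 0.0864).
    return flux_w_m2 * SECONDS_PER_DAY / JOULES_PER_MEGAJOULE

def kelvin_to_celsius(temp_k):
    return temp_k - 273.15

def percent_to_fraction(percent):
    return percent / 100.0

print(f"{pr_to_mm_per_day(5e-5):.2f} mm/day")          # 4.32 mm/day
print(f"{rsds_to_mj_per_m2_day(200.0):.2f} MJ/m²/day")  # 17.28 MJ/m²/day
print(f"{kelvin_to_celsius(300.0):.2f} °C")            # 26.85 °C
print(f"{percent_to_fraction(65.0):.2f}")              # 0.65
```

Note that the radiation result is a daily total (MJ/m²/day), not an instantaneous quantity.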


In [1]:
import os
import xarray as xr

# Define input and output directories
input_base_dir = r"D:/CMIP6-BiasCorrection-SWAT/workingfolder/clipped_data"
output_base_dir = r"D:/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data"
os.makedirs(output_base_dir, exist_ok=True)

# Conversion functions
def convert_temperature(temp_k):
    """Convert temperature from Kelvin to Celsius."""
    return temp_k - 273.15

def convert_humidity(humidity_percent):
    """Convert relative humidity from percentage to fraction."""
    return humidity_percent / 100.0

def convert_radiation(radiation_wm2):
    """Convert daily-mean solar radiation from W/m^2 to MJ/m^2/day."""
    # 86400 s/day divided by 1e6 J/MJ gives the factor 0.0864
    return radiation_wm2 * 0.0864

def convert_precipitation(precip_kgm2s):
    """Convert precipitation from kg/m²/s to mm/day."""
    return precip_kgm2s * 86400

# Loop through models, scenarios, and parameters
models = [model for model in os.listdir(input_base_dir) if os.path.isdir(os.path.join(input_base_dir, model))]

for model in models:
    model_path = os.path.join(input_base_dir, model)
    output_model_path = os.path.join(output_base_dir, model)
    os.makedirs(output_model_path, exist_ok=True)

    scenarios = [scenario for scenario in os.listdir(model_path) if os.path.isdir(os.path.join(model_path, scenario))]

    for scenario in scenarios:
        scenario_path = os.path.join(model_path, scenario)
        output_scenario_path = os.path.join(output_model_path, scenario)
        os.makedirs(output_scenario_path, exist_ok=True)

        parameters = [param for param in os.listdir(scenario_path) if os.path.isdir(os.path.join(scenario_path, param))]

        for param in parameters:
            param_path = os.path.join(scenario_path, param)
            output_param_path = os.path.join(output_scenario_path, param)
            os.makedirs(output_param_path, exist_ok=True)

            files = [f for f in os.listdir(param_path) if f.endswith(".nc")]

            for file in files:
                input_file = os.path.join(param_path, file)
                output_file = os.path.join(output_param_path, file)

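                try:
                    # Open with a context manager so the file handle is released
                    # even when the parameter is skipped or an error occurs
                    with xr.open_dataset(input_file) as ds:
                        # Convert only the target variable (assumed to share the
                        # folder name, e.g. "pr"); ds.map would apply the
                        # conversion to every data variable in the file
                        if param in ("tasmax", "tasmin"):
                            ds[param] = convert_temperature(ds[param])
                            ds[param].attrs["units"] = "degC"
                        elif param == "hurs":
                            ds[param] = convert_humidity(ds[param])
                            ds[param].attrs["units"] = "1"
                        elif param == "rsds":
                            ds[param] = convert_radiation(ds[param])
                            ds[param].attrs["units"] = "MJ m-2 day-1"
                        elif param == "pr":
                            ds[param] = convert_precipitation(ds[param])
                            ds[param].attrs["units"] = "mm/day"
                        else:
                            print(f"Skipping unhandled parameter: {param}")
                            continue

                        # Save the converted dataset in a new directory
                        ds.to_netcdf(output_file)
                    print(f"Converted and saved: {output_file}")

                except Exception as e:
                    # Report the failure and move on to the next file
                    print(f"Error processing {input_file}: {e}")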

print("All files processed and converted!")

Converted and saved: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2015_v1.1_clipped.nc
Converted and saved: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2016_v1.1_clipped.nc
Converted and saved: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2017_v1.1_clipped.nc
Converted and saved: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2018_v1.1_clipped.nc
Converted and saved: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2019_v1.1_clipped.nc
Converted and saved: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ss
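
The nested model/scenario/parameter loops can also be written as a single recursive traversal. The following is a minimal sketch, not the notebook's method: `mirrored_outputs` is a hypothetical helper, the demo tree is created in a temporary folder, and unlike the loops above it picks up `.nc` files at any depth.

```python
from pathlib import Path
import tempfile

def mirrored_outputs(input_root, output_root, pattern="*.nc"):
    """Yield (input_file, output_file) pairs that mirror the folder layout."""
    input_root, output_root = Path(input_root), Path(output_root)
    for src in sorted(input_root.rglob(pattern)):
        # Re-root the file's relative path under the output directory
        dst = output_root / src.relative_to(input_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        yield src, dst

# Demo on a throwaway tree shaped like model/scenario/parameter/file
with tempfile.TemporaryDirectory() as tmp:
    src_root = Path(tmp) / "clipped_data"
    dst_root = Path(tmp) / "converted_data"
    sample = src_root / "EC-Earth3" / "ssp245" / "pr" / "sample.nc"
    sample.parent.mkdir(parents=True)
    sample.touch()

    pairs = [
        (s.relative_to(src_root).as_posix(), d.relative_to(dst_root).as_posix())
        for s, d in mirrored_outputs(src_root, dst_root)
    ]
    print(pairs)
```

Each pair shares the same relative path, so the output tree reproduces the input tree without explicit per-level `os.makedirs` calls.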

### **Standardizing Time Format in CMIP6 NetCDF Files**
This step ensures that all NetCDF files have a **consistent time format** across models and scenarios.

- **Defines input (`converted_data`) and output (`standardized_data`) directories** to keep converted and standardized files separate.
- **Loops through all models, scenarios, and parameters**, ensuring all climate data is processed.
- **Standardizes the time format to `YYYY-MM-DD`**, storing the time coordinate as date strings, to maintain consistency for further analysis.
- **Handles different time formats**, including:
  - Converting **cftime objects** to standard datetime format.
  - Formatting existing datetime values to the required format.
- **Saves the standardized files** while preserving the directory structure.
- **Ensures error handling**, so that failed files do not interrupt the process.

At the end of this step, all CMIP6 NetCDF files have a **uniform time format**, making them ready for analysis and model integration.
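
Because the time values end up as `YYYY-MM-DD` strings, downstream code has to parse them back whenever date arithmetic is needed. The standard-library sketch below illustrates the round trip; `to_iso_strings` is a hypothetical helper and is not part of the xarray workflow used in this notebook.

```python
from datetime import date, timedelta

def to_iso_strings(start, n_days):
    """Format n_days consecutive dates starting at `start` as YYYY-MM-DD."""
    return [(start + timedelta(days=i)).strftime("%Y-%m-%d") for i in range(n_days)]

labels = to_iso_strings(date(2015, 1, 1), 5)
print(labels)
# ['2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04', '2015-01-05']

# Zero-padded ISO strings sort in chronological order
is_sorted = sorted(labels) == labels

# Parse the strings back into dates when arithmetic is required
parsed = [date.fromisoformat(label) for label in labels]
span_days = (parsed[-1] - parsed[0]).days
print(is_sorted, span_days)  # True 4
```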


In [2]:
import os
import xarray as xr
import cftime

# Define input and output directories
input_base_dir = r"D:/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data"
output_base_dir = r"D:/CMIP6-BiasCorrection-SWAT/workingfolder/standardized_data"
os.makedirs(output_base_dir, exist_ok=True)

# Function to standardize time format
def standardize_time(ds):
    """Ensure time is in 'YYYY-MM-DD' format."""
    try:
        if "time" not in ds.coords:
            print("No 'time' coordinate found in dataset.")
            return ds

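        # cftime calendars (e.g. noleap) are first mapped onto pandas datetimes;
        # xarray warns when the source calendar is non-standard
        if isinstance(ds.indexes["time"][0], cftime.datetime):
            ds["time"] = ds.indexes["time"].to_datetimeindex()
        # Note: this replaces the datetime coordinate with plain strings
        ds["time"] = ds["time"].dt.strftime("%Y-%m-%d")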
        return ds
    except Exception as e:
        print(f"Error standardizing time: {e}")
        return None

# Function to process a single file
def process_file(file_path, output_file):
    """Standardize the time coordinate of one file and write the result."""
    try:
        print(f"Processing file: {file_path}")
        # The context manager closes the input file once it has been written out
        with xr.open_dataset(file_path) as ds:
            ds_standardized = standardize_time(ds)
            if ds_standardized is None:
                print(f"Skipping file due to standardization error: {file_path}")
                return

            os.makedirs(os.path.dirname(output_file), exist_ok=True)
            ds_standardized.to_netcdf(output_file, mode="w")
        print(f"Saved standardized file: {output_file}")
    except Exception as e:
        print(f"Error processing file {file_path}: {e}")

# Loop through models, scenarios, and parameters
models = [model for model in os.listdir(input_base_dir) if os.path.isdir(os.path.join(input_base_dir, model))]

for model in models:
    model_path = os.path.join(input_base_dir, model)
    output_model_path = os.path.join(output_base_dir, model)
    os.makedirs(output_model_path, exist_ok=True)

    scenarios = [scenario for scenario in os.listdir(model_path) if os.path.isdir(os.path.join(model_path, scenario))]

    for scenario in scenarios:
        scenario_path = os.path.join(model_path, scenario)
        output_scenario_path = os.path.join(output_model_path, scenario)
        os.makedirs(output_scenario_path, exist_ok=True)

        parameters = [param for param in os.listdir(scenario_path) if os.path.isdir(os.path.join(scenario_path, param))]

        for param in parameters:
            param_path = os.path.join(scenario_path, param)
            output_param_path = os.path.join(output_scenario_path, param)
            os.makedirs(output_param_path, exist_ok=True)

            files = [f for f in os.listdir(param_path) if f.endswith(".nc")]

            for file in files:
                input_file = os.path.join(param_path, file)
                output_file = os.path.join(output_param_path, file)
                process_file(input_file, output_file)

print("Time standardization for all files complete!")

Processing file: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2015_v1.1_clipped.nc
Saved standardized file: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/standardized_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2015_v1.1_clipped.nc
Processing file: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2016_v1.1_clipped.nc
Saved standardized file: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/standardized_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2016_v1.1_clipped.nc
Processing file: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/converted_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2017_v1.1_clipped.nc
Saved standardized file: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/standardized_data\EC-Earth3\ssp245\pr\pr_day_EC-

### **Checking Time Format in Standardized NetCDF Files**  
This step verifies that the time format in the standardized NetCDF files is correctly set to **YYYY-MM-DD** by inspecting the first few time entries from sample files.
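
Visual inspection can be complemented by a programmatic check. The sketch below assumes the time values are available as plain strings; `is_yyyy_mm_dd` is a hypothetical helper introduced only for this illustration.

```python
from datetime import datetime

def is_yyyy_mm_dd(value):
    """Return True only for zero-padded, valid calendar dates like 2015-01-01."""
    text = str(value)
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    # strptime tolerates unpadded fields such as "2015-1-1",
    # so require an exact round trip as well
    return parsed.strftime("%Y-%m-%d") == text

good = ["2015-01-01", "2015-01-02", "2015-01-03"]
bad = ["2015-1-1", "2015-01-01 00:00:00", "2015-02-30", "01/01/2015"]

all_good = all(is_yyyy_mm_dd(v) for v in good)
any_bad = any(is_yyyy_mm_dd(v) for v in bad)
print(all_good, any_bad)  # True False
```

The same function could be applied to the first few entries of `ds["time"].values` in each sample file.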


In [3]:
import os
import xarray as xr

# Define the output directory containing standardized NetCDF files
output_base_dir = r"D:/CMIP6-BiasCorrection-SWAT/workingfolder/standardized_data"

# Get a list of NetCDF files to check
nc_files = []
for root, _, files in os.walk(output_base_dir):
    for file in files:
        if file.endswith(".nc"):
            nc_files.append(os.path.join(root, file))

# Check time format in a few sample files
for file in nc_files[:5]:  # Checking first 5 files
    # The context manager closes each file even if reading "time" fails
    with xr.open_dataset(file) as ds:
        print(f"File: {file}")
        print(ds["time"].values[:5])  # Display first 5 time entries


File: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/standardized_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2015_v1.1_clipped.nc
['2015-01-01' '2015-01-02' '2015-01-03' '2015-01-04' '2015-01-05']
File: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/standardized_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2016_v1.1_clipped.nc
['2016-01-01' '2016-01-02' '2016-01-03' '2016-01-04' '2016-01-05']
File: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/standardized_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2017_v1.1_clipped.nc
['2017-01-01' '2017-01-02' '2017-01-03' '2017-01-04' '2017-01-05']
File: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/standardized_data\EC-Earth3\ssp245\pr\pr_day_EC-Earth3_ssp245_r1i1p1f1_gr_2018_v1.1_clipped.nc
['2018-01-01' '2018-01-02' '2018-01-03' '2018-01-04' '2018-01-05']
File: D:/Hesham/WhiteNile/CMIP6-BiasCorrection-SWAT/workingfolder/standardized_data\EC-E