# Average forcing into HRUs
We have raw, gridded ERA5 and EM-Earth forcing. Here we average those gridded data into time series per hydrologic response unit (HRU), for both the distributed and lumped catchment shapes.

Workflow, per catchment:
- Create forcing grid shapefiles for ERA5 and EM-Earth
- Create remapping CSV files using the first forcing file of each data set
- Remap all other forcing files using the remap files
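
To make the remapping step concrete: each HRU value is an area-weighted average of the grid cells that overlap the HRU. The sketch below illustrates this with a small, made-up remap table. The column names (`hru_id`, `cell_id`, `weight`) and the helper `remap_to_hrus` are assumptions for illustration only; they do not reflect EASYMORE's actual remap file layout or implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical remap table: one row per (HRU, grid cell) intersection.
# Column names are illustrative, not EASYMORE's actual remap file layout.
remap = pd.DataFrame({
    "hru_id":  [1, 1, 2, 2, 2],
    "cell_id": [0, 1, 1, 2, 3],
    "weight":  [0.75, 0.25, 0.2, 0.5, 0.3],  # fraction of the HRU area covered by the cell
})

# Made-up gridded forcing: 3 time steps (rows) by 4 grid cells (columns).
grid_values = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 0.0, 0.0, 0.0],
    [2.0, 2.0, 2.0, 2.0],
])

def remap_to_hrus(grid_values, remap):
    """Return a (time x HRU) DataFrame of area-weighted averages."""
    out = {}
    for hru_id, group in remap.groupby("hru_id"):
        cells = group["cell_id"].to_numpy()
        weights = group["weight"].to_numpy()
        weights = weights / weights.sum()  # normalize in case weights do not sum to 1
        out[hru_id] = grid_values[:, cells] @ weights
    return pd.DataFrame(out)

hru_series = remap_to_hrus(grid_values, remap)
print(hru_series)
```

Because the weights depend only on the geometry of the grid and the catchment shapes, they can be computed once from a single forcing file and reused for every other file, which is why the workflow creates the remap files first.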

In [1]:
import glob
import shutil
import sys
import pandas as pd
from pathlib import Path
sys.path.append(str(Path().absolute().parent))
import python_cs_functions as cs

## Config handling

In [2]:
# Specify where the config file can be found
config_file = '../0_config/config.txt'

In [3]:
# Get the required info from the config file
data_path = cs.read_from_config(config_file,'data_path')

# CAMELS-spat metadata (the path is read from the same 'cs_basin_path' key as the basin folder below)
cs_meta_path = cs.read_from_config(config_file,'cs_basin_path')
cs_meta_name = cs.read_from_config(config_file,'cs_meta_name')
cs_unusable_name = cs.read_from_config(config_file,'cs_unusable_name')

# Basin folder
cs_basin_folder = cs.read_from_config(config_file, 'cs_basin_path')
basins_path = Path(data_path) / cs_basin_folder

## Data loading

In [4]:
# CAMELS-spat metadata file
cs_meta_path = Path(data_path) / cs_meta_path
cs_meta = pd.read_csv(cs_meta_path / cs_meta_name)

In [5]:
# Open list of unusable stations; Enforce reading IDs as string to keep leading 0's
cs_unusable = pd.read_csv(cs_meta_path / cs_unusable_name, dtype={'Station_id': object})
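
The `dtype` argument matters because pandas otherwise infers an integer type for all-numeric ID columns and silently drops leading zeros. A minimal sketch of the difference, using an in-memory CSV with illustrative station IDs rather than the actual CAMELS-spat file:

```python
import io
import pandas as pd

# Illustrative CSV content; these rows are not taken from the real unusable-stations file
csv_text = "Country,Station_id\nUSA,01013500\nUSA,01022500\n"

# Default type inference: the IDs become integers and lose their leading zero
default = pd.read_csv(io.StringIO(csv_text))

# Forcing the column to object keeps the IDs as strings
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"Station_id": object})

print(default["Station_id"].tolist())
print(as_text["Station_id"].tolist())
```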

## Processing

In [6]:
# Reminder that the loop below is currently restricted to a single basin
debug_message = '\n!!! CHECK DEBUGGING STATUS: \n- Testing 1 basin\n'
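
The processing loop below selects forcing files with shell-style glob patterns. The sketch here shows which file names those patterns would and would not match, using `fnmatch.fnmatchcase`, which applies the same pattern rules to plain strings. The list of file names is hypothetical; in particular, the invariants file name is only an assumed example of something that matches `ERA5_*_invariants.nc`.

```python
from fnmatch import fnmatchcase

# Hypothetical folder contents, for illustration only
names = [
    "ERA5_1950-02.nc",
    "ERA5_1950-01.nc",
    "ERA5_1950_invariants.nc",   # assumed name of an invariants file
    "EM_Earth_1950-01.nc",
    "ERA5_lumped_remapped_1950-01-01-00-00-00.nc",
]

monthly_pattern = "ERA5_[0-9][0-9][0-9][0-9]-[0-9][0-9].nc"  # ERA5_YYYY-MM.nc
invariant_pattern = "ERA5_*_invariants.nc"

# Sorting the zero-padded YYYY-MM names puts them in chronological order
monthly = sorted(n for n in names if fnmatchcase(n, monthly_pattern))
invariant = [n for n in names if fnmatchcase(n, invariant_pattern)]

print(monthly)
print(invariant)

# The loop indexes the first match with [0], so an empty list would raise an IndexError
if not monthly or not invariant:
    raise FileNotFoundError("Expected monthly and invariant ERA5 files")
```

The digit classes keep the monthly pattern from also picking up the invariants file or already-remapped output, which a looser pattern such as `ERA5_*.nc` would include.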

In [7]:
for ix,row in cs_meta.iterrows():

    # DEBUGGING: only process the first basin; remove this check to run all basins
    if ix != 0: continue
    
    # Check if we need to process this station at all
    missing = cs.flow_obs_unavailable(cs_unusable, row.Country, row.Station_id)
    if 'iv' in missing and 'dv' in missing: 
        continue # with next station, because we have no observations at all for this station
    
    # Get the lumped and distributed catchment shapefile paths, and the forcing folders
    basin_id, shp_lump_path, shp_dist_path, _, _ = cs.prepare_delineation_outputs(cs_meta, ix, Path(data_path)/cs_basin_folder)
    raw_fold, lump_fold, dist_fold = cs.prepare_forcing_outputs(cs_meta, ix, Path(data_path)/cs_basin_folder) # Returns folders only, not file names
    # Fill the placeholder in the distributed shapefile path with 'basin'
    shp_dist_path = Path( str(shp_dist_path).format('basin') )
    print('--- Now running basin {}. {}'.format(ix, basin_id))

    # Ensure we have the CRS set in these shapes, because EASYMORE needs this
    for shp in [shp_lump_path, shp_dist_path]:
        cs.add_crs_to_shapefile(shp)
    
    # Get the forcing files
    eme_merged_files = sorted(glob.glob(str(raw_fold/'EM_Earth_[0-9][0-9][0-9][0-9]-[0-9][0-9].nc'))) # list
    era_merged_files = sorted(glob.glob(str(raw_fold/'ERA5_[0-9][0-9][0-9][0-9]-[0-9][0-9].nc'))) # list
    era_invariant = glob.glob(str(raw_fold/'ERA5_*_invariants.nc'))

    # Make forcing shapefiles
    era_grid_shp, eme_grid_shp = cs.prepare_forcing_grid_shapefiles(row.Country, row.Station_id, Path(data_path)/cs_basin_folder)
    for infile, outfile in zip([era_merged_files[0],eme_merged_files[0]], [era_grid_shp,eme_grid_shp]):
        cs.make_forcing_grid_shapefile(infile,outfile)

    # Add geopotential to ERA5 forcing grid shapefile
    cs.add_geopotential_to_era5_grid(era_invariant[0], era_grid_shp)

    # Prepare for remapping
    esmr_temp = cs.prepare_easymore_temp_folder(row.Country, row.Station_id, Path(data_path)/cs_basin_folder)
    
    # Initiate EASYMORE objects
    era_lump_esmr, era_lump_remap = cs.get_easymore_settings('ERA5',     'lumped', era_grid_shp, shp_lump_path, esmr_temp, lump_fold)
    era_dist_esmr, era_dist_remap = cs.get_easymore_settings('ERA5',     'dist',   era_grid_shp, shp_dist_path, esmr_temp, dist_fold)
    eme_lump_esmr, eme_lump_remap = cs.get_easymore_settings('EM-Earth', 'lumped', eme_grid_shp, shp_lump_path, esmr_temp, lump_fold)
    eme_dist_esmr, eme_dist_remap = cs.get_easymore_settings('EM-Earth', 'dist',   eme_grid_shp, shp_dist_path, esmr_temp, dist_fold)
    
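    # Create the four remap files (ERA5 and EM-Earth, each for the lumped and distributed shapes)
    # This also remaps the first forcing file of each data set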
    cs.run_easymore_to_make_remap_file([era_merged_files[0], eme_merged_files[0], era_merged_files[0], eme_merged_files[0]],
                                    [era_lump_esmr,       eme_lump_esmr,       era_dist_esmr,       eme_dist_esmr])

    # Update the EASYMORE objects now we have remap files
    era_lump_esmr.remap_csv = str(esmr_temp / era_lump_remap)
    era_dist_esmr.remap_csv = str(esmr_temp / era_dist_remap)
    eme_lump_esmr.remap_csv = str(esmr_temp / eme_lump_remap)
    eme_dist_esmr.remap_csv = str(esmr_temp / eme_dist_remap)
    
    # Remap the remaining ERA5 files; the first file was remapped when the remap file was created
    for file in era_merged_files[1:]:
        era_lump_esmr.source_nc = file 
        era_dist_esmr.source_nc = file
        era_lump_esmr.nc_remapper()
        era_dist_esmr.nc_remapper()

    # Remap the remaining EM-Earth files; the first file was remapped when the remap file was created
    for file in eme_merged_files[1:]:
        eme_lump_esmr.source_nc = file 
        eme_dist_esmr.source_nc = file
        eme_lump_esmr.nc_remapper()
        eme_dist_esmr.nc_remapper()
        
    # Remove the EASYMORE temp folder
    #shutil.rmtree(esmr_temp)

--- Now running basin 0. CAN_01AD002
EASYMORE version 1.0.0 is initiated.
EASYMORE version 1.0.0 is initiated.
EASYMORE version 1.0.0 is initiated.
EASYMORE version 1.0.0 is initiated.
EASYMORE is given multiple variables for remapping but only on format and fill value. EASYMORE repeats the format and fill value for all the variables in output files
EASYMORE will remap variable  msdwlwrf  from source file to variable  msdwlwrf  in remapped netCDF file
EASYMORE will remap variable  msnlwrf  from source file to variable  msnlwrf  in remapped netCDF file
EASYMORE will remap variable  msdwswrf  from source file to variable  msdwswrf  in remapped netCDF file
EASYMORE will remap variable  msnswrf  from source file to variable  msnswrf  in remapped netCDF file
EASYMORE will remap variable  mtpr  from source file to variable  mtpr  in remapped netCDF file
EASYMORE will remap variable  sp  from source file to variable  sp  in remapped netCDF file
EASYMORE will remap variable  mper  from source 

  shp_int.to_file(self.temp_dir+self.case_name+'_intersected_shapefile.shp') # save the intersected files


Ended at date and time 2023-09-11 14:07:58.275267
It took 2.263524 seconds to finish creating of the remapping file
---------------------------
------REMAPPING------
netcdf output file will be compressed at level 4
Removing existing remapped .nc file.
Remapping C:\Globus endpoint\CAMELS_spat\camels-spat-data\basin_data\CAN_01AD002\forcing\raw\ERA5_1950-01.nc to C:\Globus endpoint\CAMELS_spat\camels-spat-data\basin_data\CAN_01AD002\forcing\lumped/ERA5_lumped_remapped_1950-01-01-00-00-00.nc 
Started at date and time 2023-09-11 14:07:58.309637 
Ended at date and time 2023-09-11 14:09:33.399833 
It took 95.090196 seconds to finish the remapping of variable(s) 
---------------------
---------------------
EASYMORE is given multiple variables for remapping but only on format and fill value. EASYMORE repeats the format and fill value for all the variables in output files
EASYMORE will remap variable  tmean  from source file to variable  tmean  in remapped netCDF file
EASYMORE will remap variab

  shp_int.to_file(self.temp_dir+self.case_name+'_intersected_shapefile.shp') # save the intersected files


Ended at date and time 2023-09-11 14:09:35.791828
It took 2.374034 seconds to finish creating of the remapping file
---------------------------
------REMAPPING------
netcdf output file will be compressed at level 4
Removing existing remapped .nc file.
Remapping C:\Globus endpoint\CAMELS_spat\camels-spat-data\basin_data\CAN_01AD002\forcing\raw\EM_Earth_1950-01.nc to C:\Globus endpoint\CAMELS_spat\camels-spat-data\basin_data\CAN_01AD002\forcing\lumped/EM-Earth_lumped_remapped_1950-01-01-00-00-00.nc 
Started at date and time 2023-09-11 14:09:35.832236 
Ended at date and time 2023-09-11 14:09:45.101823 
It took 9.269587 seconds to finish the remapping of variable(s) 
---------------------
---------------------
EASYMORE is given multiple variables for remapping but only on format and fill value. EASYMORE repeats the format and fill value for all the variables in output files
EASYMORE will remap variable  msdwlwrf  from source file to variable  msdwlwrf  in remapped netCDF file
EASYMORE will

  shp_int.to_file(self.temp_dir+self.case_name+'_intersected_shapefile.shp') # save the intersected files


Ended at date and time 2023-09-11 14:09:50.667931
It took 5.476086 seconds to finish creating of the remapping file
---------------------------
------REMAPPING------
netcdf output file will be compressed at level 4
Removing existing remapped .nc file.
Remapping C:\Globus endpoint\CAMELS_spat\camels-spat-data\basin_data\CAN_01AD002\forcing\raw\ERA5_1950-01.nc to C:\Globus endpoint\CAMELS_spat\camels-spat-data\basin_data\CAN_01AD002\forcing\distributed/ERA5_dist_remapped_1950-01-01-00-00-00.nc 
Started at date and time 2023-09-11 14:09:50.701796 
Ended at date and time 2023-09-11 14:11:24.542442 
It took 93.840646 seconds to finish the remapping of variable(s) 
---------------------
---------------------
EASYMORE is given multiple variables for remapping but only on format and fill value. EASYMORE repeats the format and fill value for all the variables in output files
EASYMORE will remap variable  tmean  from source file to variable  tmean  in remapped netCDF file
EASYMORE will remap var

  shp_int.to_file(self.temp_dir+self.case_name+'_intersected_shapefile.shp') # save the intersected files


Ended at date and time 2023-09-11 14:11:31.473173
It took 6.901234 seconds to finish creating of the remapping file
---------------------------
------REMAPPING------
netcdf output file will be compressed at level 4
Removing existing remapped .nc file.
Remapping C:\Globus endpoint\CAMELS_spat\camels-spat-data\basin_data\CAN_01AD002\forcing\raw\EM_Earth_1950-01.nc to C:\Globus endpoint\CAMELS_spat\camels-spat-data\basin_data\CAN_01AD002\forcing\distributed/EM-Earth_dist_remapped_1950-01-01-00-00-00.nc 
Started at date and time 2023-09-11 14:11:31.511908 
Ended at date and time 2023-09-11 14:11:40.852027 
It took 9.340119 seconds to finish the remapping of variable(s) 
---------------------
---------------------
remap file is provided; EASYMORE will use this file and skip creation of remapping file
EASYMORE will remap variable  msdwlwrf  from source file to variable  msdwlwrf  in remapped netCDF file
EASYMORE will remap variable  msnlwrf  from source file to variable  msnlwrf  in remapped