## Datasets Used

### 1. MERIT-Basins

**Description**  
A global vector hydrography of approximately **2.94 million river reaches** and their catchments, derived from MERIT Hydro and developed for the global reconstruction of naturalized river flows (Lin et al., 2019). The dataset provides a consistent global representation of river connectivity and reach attributes suitable for large-scale hydrological modeling.

> **Note:** A bug-fixed version of this dataset was used in the present work.

**Citation**  
Lin, P., Pan, M., Beck, H. E., Yang, Y., Yamazaki, D., Frasson, R., *et al.* (2019).  
*Global reconstruction of naturalized river flows at 2.94 million reaches.*  
**Water Resources Research**, 55(8), 6499–6516.  
https://doi.org/10.1029/2019WR025287

**Dataset Access**  
- MERIT-Basins Hydrography (based on MERIT-Hydro v0.7 / v1.0_bugfix1):  
  https://www.reachhydro.org/home/params/merit-basins  
  *(Includes minor bug fixes for coastline pixels)*

---

### 2. HydroLAKES (Version 1)

**Description**  
A global vector database of lakes and reservoirs, providing detailed information on lake shorelines, surface area, volume, depth estimates, and hydrological connectivity. HydroLAKES is widely used in global hydrology and water resources studies.

**Citation**  
Messager, M. L., Lehner, B., Grill, G., Nedeva, I., & Schmitt, O. (2016).  
*Estimating the volume and age of water stored in global lakes using a geostatistical approach.*  
**Nature Communications**, 7, 13603.  
https://doi.org/10.1038/ncomms13603

**Dataset Access**  
- HydroLAKES product page:  
  https://www.hydrosheds.org/products/hydrolakes

### Assigning parameters and folders

In [None]:
# output folder where the generated files are written
OutFolder = '/Users/shg096/Desktop/LakeRiverOut/MERITBasins/'

# location of the bug-fixed MERIT-Basins files (river, catchment, and hillslope shapefiles)
riv_file_template = '/Users/shg096/Desktop/MERIT_Hydro/riv/riv_pfaf_*_MERIT_Hydro_v07_Basins_v01_bugfix1.shp'
cat_file_template = '/Users/shg096/Desktop/MERIT_Hydro/cat/cat_pfaf_*_MERIT_Hydro_v07_Basins_v01_bugfix1.shp'
cst_file_template = '/Users/shg096/Desktop/MERIT_Hydro/hill/hillslope_*_clean.shp'

# location of HydroLAKES
lake_file = '/Volumes/F:/hydrography/hydrolakes/HydroLAKES_polys_v10_shp/HydroLAKES_polys_v10_shp/HydroLAKES_polys_v10.shp'

# Pfafstetter level-2 basin codes (pfaf) to process
pfafs = ["71","72","73","74"]
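
Before starting a long run, it can help to confirm that each template resolves to an existing file for every requested basin. The sketch below assumes that the `*` in each template stands for the Pfafstetter code, which matches how the file names above are laid out; how `Utility.merit_read_file` expands the templates internally is not shown in this notebook. The helper names `expand_template` and `missing_inputs` are hypothetical and are not part of `riverlakenetwork`.

```python
import os


def expand_template(template, pfaf):
    """Substitute the Pfafstetter code for the single `*` placeholder (assumed convention)."""
    if template.count("*") != 1:
        raise ValueError(f"expected exactly one '*' in template: {template!r}")
    return template.replace("*", pfaf)


def missing_inputs(templates, pfafs):
    """Return the expanded paths that do not exist on disk."""
    missing = []
    for pfaf in pfafs:
        for template in templates:
            path = expand_template(template, pfaf)
            if not os.path.isfile(path):
                missing.append(path)
    return missing
```

Calling `missing_inputs([riv_file_template, cat_file_template, cst_file_template], pfafs)` should then return an empty list when all inputs are in place.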


In [None]:
# load the needed packages
import os
import shutil
import geopandas as gpd
from riverlakenetwork import Utility, BurnLakes
import warnings

# silence all warnings for cleaner notebook output
warnings.filterwarnings("ignore")
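
The processing loop further down clears and recreates two output folders per basin using the same steps. One way to avoid repeating that logic is a small helper. `reset_folder` is a hypothetical name introduced here for illustration; it is not part of `riverlakenetwork`.

```python
import os
import shutil


def reset_folder(path):
    """Delete `path` if it exists, recreate it empty, and return it."""
    if os.path.isdir(path):
        try:
            shutil.rmtree(path)
        except OSError as e:
            # keep the original error attached for easier debugging
            raise RuntimeError(f"Failed to remove {path}: {e}") from e
    os.makedirs(path, exist_ok=True)
    return path
```

Inside the loop this could be used as `org_folder = reset_folder(os.path.join(OutFolder, f"{pfaf_base}_org"))`.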

In [None]:
# load the HydroLAKES dataset
lake = gpd.read_file(lake_file)
# merge Lakes Michigan and Huron, as they are hydraulically connected
lake = Utility.FixHydroLAKESv1(lake, merge_lakes={"Michigan+Huron": [6, 8]})

for pfaf in pfafs:

    pfaf_base = f"pfaf{pfaf}"

    # recreate the folder for the original files, removing any previous output
    org_folder = os.path.join(OutFolder, f"{pfaf_base}_org")
    if os.path.isdir(org_folder):
        try:
            shutil.rmtree(org_folder)
        except OSError as e:
            raise RuntimeError(f"Failed to remove {org_folder}: {e}") from e
    os.makedirs(org_folder, exist_ok=True)
    
    # read the MERIT-Basins river and catchment files for this pfaf
    riv, cat = Utility.merit_read_file(pfaf,
                                       riv_file_template=riv_file_template,
                                       cat_file_template=cat_file_template,
                                       cst_file_template=cst_file_template)
    # save riv and cat
    riv.to_file(os.path.join(org_folder, "riv.gpkg"))
    cat.to_file(os.path.join(org_folder, "cat.gpkg"))

    # recreate the folder for the corrected files, removing any previous output
    corrected_folder = os.path.join(OutFolder, f"{pfaf_base}_corrected")
    if os.path.isdir(corrected_folder):
        try:
            shutil.rmtree(corrected_folder)
        except OSError as e:
            raise RuntimeError(f"Failed to remove {corrected_folder}: {e}") from e
    os.makedirs(corrected_folder, exist_ok=True)
    
    # build the config that maps dataset columns to the names BurnLakes expects
    config = {
        "riv": riv,
        "riv_dict": {
            "COMID": {"col":"COMID"},
            "NextDownCOMID": {"col":"NextDownID"},
            "length": {"col":"lengthkm"},
            "uparea": {"col":"uparea","unit":"km2"}
        },
        "cat": cat,
        "cat_dict": {
            "COMID": {"col":"COMID"},
            "unitarea": {"col":"unitarea","unit":"km2"},
        },
        "lake": lake,
        "lake_dict": {
            "LakeCOMID": {"col":"Hylak_id"},
            "unitarea": {"col":"Lake_area","unit":"km2"}
        },
    }
    
    # burn the lakes into the river network
    bl = BurnLakes(config)
    # save the corrected river, catchment, and lake layers
    bl.riv.to_file(os.path.join(corrected_folder, "riv.gpkg"))
    bl.cat.to_file(os.path.join(corrected_folder, "cat.gpkg"))
    bl.lake.to_file(os.path.join(corrected_folder, "lake.gpkg"))
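
After the lakes are burned in, a quick consistency check on the saved network can catch broken links early. The sketch below flags segments whose downstream ID matches no existing segment. It assumes that outlets carry a non-positive downstream ID, as with `NextDownID = 0` in MERIT-Basins. The column names in the `BurnLakes` output are not documented in this notebook, so pass whichever ID columns your output actually contains. The function name `dangling_links` is hypothetical.

```python
def dangling_links(comids, next_down_ids):
    """Return (COMID, downstream ID) pairs whose downstream ID matches no segment.

    Non-positive downstream IDs are treated as outlets (an assumption).
    """
    comids = list(comids)
    known = set(comids)
    return [
        (comid, down)
        for comid, down in zip(comids, next_down_ids)
        if down > 0 and down not in known
    ]


# toy network: 1 -> 2 -> 3 (outlet), while 4 points to a missing segment 99
comids = [1, 2, 3, 4]
next_down = [2, 3, 0, 99]
print(dangling_links(comids, next_down))  # [(4, 99)]
```

The function accepts any pair of equal-length sequences, so two columns of a GeoDataFrame can be passed directly. An empty list means every downstream link resolves to a known segment.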