# ED2 Santarem Job


## Background

The Ecosystem Demography Biosphere Model (ED2) is an integrated terrestrial biosphere model incorporating hydrology, land-surface biophysics, vegetation dynamics, and soil carbon and nitrogen biogeochemistry (Longo et al., 2019; Medvigy et al., 2009). Like its predecessor, ED (Moorcroft et al., 2001), ED2 uses a set of size- and age-structured partial differential equations that track the changing structure and composition of the plant canopy. Conventional biosphere models represent the ecosystems within each climatological grid cell in a highly aggregated manner. ED2 instead describes the state of the aboveground ecosystem by the density of trees of different sizes, and how that density varies across horizontal space, for a series of plant functional types. For more details, see the [ED2 repository](https://github.com/EDmodel/ED2).

## Run ED2 jobs on HPC cluster

This notebook runs ED2 jobs on HPC clusters. It currently works with servers that accept single-factor authentication. This notebook submits the Santarem job; there is a separate notebook for [other jobs](https://colab.research.google.com/drive/1_b5t5P5UUpVDhXrhFjlT9aGces44nLSi#scrollTo=HjCnosfaFWET&uniqifier=2).

## Prerequisites
The following module is required to run this notebook. Install it with `pip install`:

1. paramiko

In [None]:
!pip install paramiko

In [None]:
import getpass
import os
import stat
import time as timer  # aliased because the name `time` is reused below for the job run time

import paramiko
#from parsl.app.python import PythonApp

## Server Details


In [None]:
# Cluster details
hostname = 'cc-login.campuscluster.illinois.edu' # hostname of the cluster on which to run the ED2 model
password = None
username = "ABC" # username on the cluster

## Job output details

In [None]:
show_status = False # Set to True to poll the current status of the submitted job
show_output = False # Set to True to print the job's output and error files
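
When `show_status` is enabled, the notebook repeatedly runs `squeue` and stops once the job ID no longer appears anywhere in its output. That substring test can be made stricter by comparing the job ID with the first column of each row. The following is a minimal sketch, assuming the default `squeue` layout with a header row and the job ID in the first column; `job_is_listed` and the sample text are hypothetical and not part of the notebook.

```python
def job_is_listed(job_id, squeue_output):
    """Return True if job_id equals the first column of any squeue row."""
    for line in squeue_output.splitlines():
        fields = line.split()
        # Assumption: the job ID is the first whitespace-separated field.
        if fields and fields[0] == str(job_id):
            return True
    return False


# Illustrative text in the shape of default squeue output (values are made up).
sample = (
    "  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)\n"
    "9863129 secondary ED2IN-sa      ABC  R       0:12      1 node001\n"
)
print(job_is_listed("9863129", sample))
print(job_is_listed("986312", sample))
```

An exact column match avoids treating a shorter ID as present just because it is a prefix of another job's ID.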

## Batch job details
Leave any parameter unchanged to keep its default value.

In [None]:
# Batch job details
time = "0:04:00"                        # Job run time (hh:mm:ss)
nodes = 1                               # Number of nodes
ntasks_per_node = 16                    # Number of tasks (cores/ppn) per node
job_name = "ED2IN-santarem"             # Name of batch job
partition = "secondary"                 # Partition (queue)
output = "openmp_" + job_name + ".o%j"  # Name of batch job output file
error = "openmp_" + job_name + ".e%j"   # Name of batch job error file
mail_user = "ABC@illinois.edu"        # Send email notifications
mail_type = "BEGIN,END"                 # Type of email notifications to send
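
Each of these settings becomes one `#SBATCH` directive at the top of the generated batch script. The sketch below shows that mapping in isolation, assuming the standard Slurm long option names; `render_sbatch_header` is a hypothetical helper for illustration, and the notebook itself writes the lines one by one.

```python
def render_sbatch_header(options):
    """Return the header text of a Slurm batch script for the given options."""
    lines = ["#!/bin/bash"]
    for flag, value in options.items():
        # One directive per option, e.g. "#SBATCH --partition=secondary".
        lines.append(f"#SBATCH --{flag}={value}")
    return "\n".join(lines) + "\n"


job_name = "ED2IN-santarem"
header = render_sbatch_header({
    "time": "0:04:00",
    "nodes": 1,
    "ntasks-per-node": 16,
    "job-name": job_name,
    "partition": "secondary",
    # %j is replaced by Slurm with the job ID when the file is created.
    "output": "openmp_" + job_name + ".o%j",
    "error": "openmp_" + job_name + ".e%j",
})
print(header)
```

Keeping the options in a dictionary makes it easy to see which settings reach the scheduler and to add or drop a directive in one place.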

## Path to input data for job

In [None]:
# Path to various inputs
path_to_data = "${HOME}/ED-2.2_StartKit" #path to data for ED2 on cluster
path_to_singularity_image = "${HOME}/ed2-intel.sif" # path to singularity image of ED2 model on cluster
path_to_ED2IN = "Simulations/S0001_SantaremKm83_Test/ED2IN" # path to ED2IN file on cluster
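
These three paths are combined into a single `singularity exec` call in the batch script. `path_to_data` is bind-mounted at `/data` inside the container, which is why `path_to_ED2IN` is given relative to that directory and why the ED2IN entries set later in this notebook start with `/data`. A minimal sketch of how the command is assembled; `build_ed2_command` is a hypothetical name used only for illustration.

```python
def build_ed2_command(data_dir, image, ed2in):
    """Compose the container command that runs ED2 on one ED2IN file."""
    return (
        # --bind maps the host data directory to /data in the container;
        # --pwd makes /data the working directory, so ed2in is relative to it.
        f"singularity exec --bind {data_dir}:/data --no-home --pwd /data "
        f"{image} ed2 -f {ed2in}"
    )


command = build_ed2_command(
    "${HOME}/ED-2.2_StartKit",
    "${HOME}/ed2-intel.sif",
    "Simulations/S0001_SantaremKm83_Test/ED2IN",
)
print(command)
```

The `${HOME}` references are left unexpanded on purpose: they are resolved by the shell on the cluster when the batch script runs, not by Python on the local machine.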

In [None]:
# Specific path changes required in Header and ED2IN file

file_path_to_header_file ="$HOME/ED-2.2_StartKit/ED2_InputData/SiteData/Santarem_Km83/MeteoDriver/Santarem_Km83_HEADER" # path to header file
pattern1 = "path_to" #pattern to replace in header file
new_line1 = "ED2_InputData/SiteData/Santarem_Km83/MeteoDriver/Santarem_Km83_" # new line to be put in header file

file_path_to_ED2IN = "$HOME/ED-2.2_StartKit/Simulations/S0001_SantaremKm83_Test/ED2IN"
# FFILOUT -- Path and prefix for analysis files (all but history/restart).
NLFFILOUT = "/data/Simulations/S0001_SantaremKm83_Test/Analy/S0001_SantaremKm83_Test"
# SFILOUT -- Path and prefix for history files.
NLSFILOUT = "/data/Simulations/S0001_SantaremKm83_Test/Analy/S0001_SantaremKm83_Test"
# GFILOUT  -- Prefix for the output patch table/gap files
NLGFILOUT = "/data/Simulations/S0001_SantaremKm83_Test/Shade/S0001_SantaremKm83_Test"
# SFILIN -- The meaning and size of this variable depend on the type of run.
NLSFILIN = "/data/ED2_InputData/SiteData/Santarem_Km83/SiteBioData/s83_nounder."
NLVEG_DATABASE = "/data/ED2_InputData/GriddedData/VegetData/OGE2/OGE2_"
NLSOIL_DATABASE = "/data/ED2_InputData/GriddedData/SoilData/Texture/SoilGrids20/SoilGrids20_"
NLLU_DATABASE = "/data/ED2_InputData/GriddedData/LandUse/glu-3.3.1/glu-3.3.1-"
NLTHSUMS_DATABASE = "/data/ED2_InputData/GriddedData/ThermalSums/"
NLED_MET_DRIVER_DB = "/data/ED2_InputData/SiteData/Santarem_Km83/MeteoDriver/Santarem_Km83_HEADER"
NLSOILSTATE_DB = "/data/ED2_InputData/GriddedData/SoilData/TempMoist/STW1996OCT.dat"
NLSOILDEPTH_DB = "c"

# Patterns to find in ED2IN and the lines that replace them
pattern2 = "NL%FFILOUT"
new_line2 = "NL%FFILOUT=\\'" + NLFFILOUT + "\\'"
pattern3 = "NL%SFILOUT"
new_line3 = "NL%SFILOUT=\\'" + NLSFILOUT + "\\'"
pattern4 = "NL%GFILOUT"
new_line4 = "NL%GFILOUT=\\'" + NLGFILOUT + "\\'"
pattern5 = "NL%SFILIN"
new_line5 = "NL%SFILIN=\\'" + NLSFILIN + "\\'"
pattern6 = "NL%SFILIN"
new_line6 = "NL%SFILIN=\\'" + NLSFILIN + "\\'"
pattern7="NL%VEG_DATABASE"
new_line7="NL%VEG_DATABASE=\\'" + NLVEG_DATABASE + "\\'"
pattern8="NL%SOIL_DATABASE"
new_line8="NL%SOIL_DATABASE=\\'" + NLSOIL_DATABASE + "\\'"
pattern9="NL%LU_DATABASE"
new_line9="NL%LU_DATABASE=\\'" + NLLU_DATABASE + "\\'"
pattern10="NL%THSUMS_DATABASE"
new_line10="NL%THSUMS_DATABASE=\\'" + NLTHSUMS_DATABASE + "\\'"
pattern11 ="NL%ED_MET_DRIVER_DB"
new_line11 = "NL%ED_MET_DRIVER_DB=\\'" + NLED_MET_DRIVER_DB + "\\'"
pattern12 = "NL%ED_MET_DRIVER_DB"
new_line12 = "NL%ED_MET_DRIVER_DB=\\'" + NLED_MET_DRIVER_DB + "\\'"
pattern13 = "NL%SOILSTATE_DB"
new_line13 = "NL%SOILSTATE_DB=\\'" + NLSOILSTATE_DB + "\\'"
pattern14 = "NL%SOILDEPTH_DB"
new_line14 = "NL%SOILDEPTH_DB=\\'" + NLSOILDEPTH_DB + "\\'"
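
Each pattern and replacement pair is later turned into a `sed -i /pattern/cnew_line file` command, which replaces every line matching the pattern with the new line. The escaped quotes (`\\'`) are there so that the shell passes literal single quotes through to `sed`. An alternative sketch is shown below, assuming GNU `sed` (for the one-line form of the `c` command) and using `shlex.quote` from the standard library to do the quoting; `sed_replace_line` is a hypothetical helper, not the notebook's own code.

```python
import shlex


def sed_replace_line(pattern, new_line, target):
    """Build a sed command that replaces lines matching pattern with new_line."""
    script = f"/{pattern}/c{new_line}"
    # The sed script is quoted for the shell; the target is left unquoted
    # so that $HOME in it is still expanded on the cluster.
    return f"sed -i {shlex.quote(script)} {target}"


command = sed_replace_line(
    "NL%FFILOUT",
    "NL%FFILOUT='/data/Simulations/S0001_SantaremKm83_Test/Analy/S0001_SantaremKm83_Test'",
    "$HOME/ED-2.2_StartKit/Simulations/S0001_SantaremKm83_Test/ED2IN",
)
print(command)

# Parsing the command the way a POSIX shell would recovers the sed script intact.
print(shlex.split(command)[2])
```

With this approach the replacement text can be written with plain single quotes, and the quoting no longer depends on counting backslashes.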

In [None]:
def run_singularity(executable, singularity_image, args, stdout=None, stderr=None):
    return f"{executable} exec {singularity_image} {args}"

def submit_job(username):
    ssh_client = paramiko.SSHClient()
    ssh_client.load_system_host_keys()
    ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    try:
        ssh_client.connect(hostname, username=username, password=password, allow_agent=True)
        print("successfully connected")
    except paramiko.SSHException:
        # Key/agent authentication failed; fall back to the password prompt below
        pass
    transport = ssh_client.get_transport()
    if not transport.is_authenticated():
        transport.auth_password(username, getpass.getpass('Enter {0} Logon password :'.format(hostname)))
    sftp_client = paramiko.SFTPClient.from_transport(transport)


    # Create the batch script
    with open(job_name + ".sbatch", 'w') as f:
        f.write("#!/bin/bash\n")
        f.write("#SBATCH --time=" + str(time) + "\n")
        f.write("#SBATCH --nodes=" + str(nodes) + "\n")
        f.write("#SBATCH --ntasks-per-node=" + str(ntasks_per_node) + "\n")
        f.write("#SBATCH --job-name=" + job_name + "\n")
        f.write("#SBATCH --partition=" + partition + "\n")
        f.write("#SBATCH --output=" + output + "\n")
        f.write("#SBATCH --error=" + error + "\n")
        f.write("#SBATCH --mail-user=" + mail_user + "\n")
        f.write("#SBATCH --mail-type=" + mail_type + "\n")
        f.write("\n")
        f.write("module load singularity\n")
        # Rewrite the path entries in the header and ED2IN files before the run
        f.write(f"sed -i /{pattern1}/c{new_line1} {file_path_to_header_file}\n")
        ed2in_edits = [
            (pattern2, new_line2), (pattern3, new_line3), (pattern4, new_line4),
            (pattern5, new_line5), (pattern6, new_line6), (pattern7, new_line7),
            (pattern8, new_line8), (pattern9, new_line9), (pattern10, new_line10),
            (pattern11, new_line11), (pattern12, new_line12), (pattern13, new_line13),
            (pattern14, new_line14),
        ]
        for pattern, new_line in ed2in_edits:
            f.write(f"sed -i /{pattern}/c{new_line} {file_path_to_ED2IN}\n")
        # Run ED2 inside the container with the data directory mounted at /data
        f.write("singularity exec --bind " + path_to_data + ":/data --no-home --pwd /data " + path_to_singularity_image + " ed2 -f " + path_to_ED2IN + "\n")

    # Transfer the batch script to the cluster and submit it
    sftp_client.put(job_name + ".sbatch", f"/home/{username}/" + job_name + ".sbatch")
    sftp_client.chmod(f"/home/{username}/" + job_name + ".sbatch", stat.S_IRWXU)
    _, stdo, stde = ssh_client.exec_command("sbatch " + job_name + ".sbatch")
    print(stde.read().decode())

    # Extract the job ID from the sbatch output
    result = stdo.read().decode()
    print(result)
    job_id = result.split()[3]

    # Show job status
    if show_status:
        # Check the job status periodically
        while True:
            #job_status = subprocess.run(f"squeue -u {username} -j {job_id}", shell=True, capture_output=True, text=True)
            _, stdo, stde = ssh_client.exec_command(f"squeue -u {username} -j {job_id}")
            job_status = stdo.read().decode()
            print(job_status)

            # Break the loop if the job is completed or failed
            if job_id not in job_status:
                break

            # Wait for a few seconds before checking again
            timer.sleep(10)
    if show_output:
        print("Output")
        # View output
        _, stdo, stde = ssh_client.exec_command(f"cat /home/{username}/openmp_{job_name}.o{job_id}")
        print(stdo.read().decode())

        print("Error")
        # View error
        _, stdo, stde = ssh_client.exec_command(f"cat /home/{username}/openmp_{job_name}.e{job_id}")
        print(stdo.read().decode())

    sftp_client.close()
    ssh_client.close()
    transport.close()

submit_job(username)

Enter cc-login.campuscluster.illinois.edu Logon password :··········
Submitted batch job 9863129
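
The function reads the job ID as the fourth word of the `sbatch` reply shown above, which raises an `IndexError` if submission fails and the reply is empty. A slightly more defensive sketch, assuming the reply keeps the `Submitted batch job <id>` form seen in this output; `parse_job_id` is a hypothetical helper.

```python
import re


def parse_job_id(sbatch_output):
    """Return the job ID from sbatch's reply, or None if it is not found."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    return match.group(1) if match else None


print(parse_job_id("Submitted batch job 9863129\n"))
print(parse_job_id(""))
```

Returning `None` lets the caller print the error stream and stop cleanly instead of failing while indexing an empty list.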


