# **Data Transfer Nodes for Large Data Transfer at Utah CHPC**
For science workflows that transfer very large datasets between institutions, we need ***advanced parallel transfer tools*** running on tuned devices such as ***Data Transfer Nodes (DTNs)***. The University of Utah CHPC provides several parallel transfer tools for these heavyweight tasks.

Network traffic from most on-campus CHPC systems passes through the campus firewall when communicating with off-campus resources.
* Large research computing workflows need more bandwidth and more concurrent connections/sessions than the campus firewall can handle; they can overwhelm the firewall's capacity and degrade service for the rest of campus.
    * To address these needs, the Utah campus created a Science DMZ (a network segment with a different security approach) that allows specific data transfers requiring high performance and low latency.
## **General DTN environments**
All CHPC users can use the following DTNs:
* intdtn01.chpc.utah.edu (connected at 10 Gbps, no DMZ, use for internal campus transfers)
* intdtn02.chpc.utah.edu (connected at 10 Gbps, no DMZ, use for internal campus transfers)
* intdtn03.chpc.utah.edu (connected at 10 Gbps, no DMZ, use for internal campus transfers)
* intdtn04.chpc.utah.edu (connected at 10 Gbps, no DMZ, use for internal campus transfers)
* dtn05.chpc.utah.edu (connected via DMZ at 100 Gbps)
* dtn06.chpc.utah.edu (connected via DMZ at 100 Gbps)
* dtn07.chpc.utah.edu (connected via DMZ at 100 Gbps)
* dtn08.chpc.utah.edu (connected via DMZ at 100 Gbps)

For moving large datasets:
* dtn05-08 can operate individually or together.
* intdtn01-03 can operate individually or together.

Furthermore:
* CHPC supports specialized tools for moving data to/from cloud storage.
    * `s3cmd` for Amazon cloud services
    * `rclone` for different cloud storage providers.
* Access to the **dtns** through Slurm is enabled on the `notchpeak` cluster.
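
As a rough illustration of the cloud tools listed above, the sketch below wraps one typical `rclone` and one typical `s3cmd` upload. The remote name, bucket, and paths in the usage lines are hypothetical placeholders, and the `run` helper with its `DRY_RUN` switch is an addition made here so the commands can be previewed before anything is transferred. It assumes an `rclone` remote has already been set up with `rclone config` and that `s3cmd` has valid credentials.

```bash
# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy a local directory to cloud storage with rclone.
#   $1 = local directory
#   $2 = rclone destination, e.g. remote:bucket/path (hypothetical names)
upload_with_rclone() {
    # --transfers sets how many files rclone copies in parallel
    run rclone copy "$1" "$2" --transfers 8
}

# Upload a single file to Amazon S3 with s3cmd.
#   $1 = local file
#   $2 = S3 URI, e.g. s3://bucket/prefix/ (hypothetical names)
upload_with_s3cmd() {
    run s3cmd put "$1" "$2"
}
```

Setting `DRY_RUN=1` and calling `upload_with_rclone /path/to/data myremote:my-bucket/data` prints the `rclone` command instead of running it, which is a cheap way to check the source and destination before starting a long transfer.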
# **Data Transfer Node Access via Slurm**
Each **dtn** node has:
* 24 cores and 128 GB RAM
    * Only 12 cores and 96 GB RAM are available to Slurm jobs.
        * For the `notchpeak` cluster:
            * Slurm partition: `notchpeak-dtn`.
            * Slurm account: `dtn`.
            * Nodes: `dtn05`,`dtn06`,`dtn07`,`dtn08`.
            * `notchpeak-dtn` has 100 Gbps connections to **Utah's Science DMZ** (a segment of the Utah network with streamlined data flow across the campus firewall, to and from off-campus resources).
    * All CHPC users have been set up to use the dtns.

The `notchpeak-dtn` Slurm partition works like other shared Slurm partitions at CHPC: multiple transfer jobs share a node.
* Each Slurm job running on a **dtn** is allocated 1 core and 2 GB RAM.
* Jobs on `notchpeak-dtn` have a maximum run time of 72 hours.
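
Because a transfer job is limited to 72 hours, it can help to check a requested `--time` value before submitting. The helper below is a minimal sketch, not a CHPC tool: the function names are made up for this example, and it only understands the `HH:MM:SS` and `D-HH:MM:SS` forms of the Slurm time string.

```bash
# Convert a Slurm time string to seconds.
# Handles only HH:MM:SS and D-HH:MM:SS (an assumption of this sketch).
slurm_time_to_seconds() {
    local t="$1"
    local days=0
    case "$t" in
        *-*)
            days="${t%%-*}"   # part before the dash is the day count
            t="${t#*-}"       # remainder is HH:MM:SS
            ;;
    esac
    # awk treats fields such as "08" as decimal, avoiding octal pitfalls
    echo "$t" | awk -F: -v d="$days" '{ print d*86400 + $1*3600 + $2*60 + $3 }'
}

# Succeed if the requested time fits the 72-hour notchpeak-dtn limit.
fits_dtn_limit() {
    [ "$(slurm_time_to_seconds "$1")" -le $((72 * 3600)) ]
}
```

For example, `fits_dtn_limit 1:00:00 && sbatch myjob.slurm` would submit only when the requested time is within the limit; `myjob.slurm` is a placeholder file name.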

## **Download a dataset using a dtn Slurm script:**
```tcsh
#!/bin/tcsh
#SBATCH --partition=notchpeak-dtn
#SBATCH --account=dtn
#SBATCH --time=1:00:00
#SBATCH -o slurm-%j.out-%N
#SBATCH -e slurm-%j.err-%N

# Per-user, per-job working directory on the scratch file system
setenv SCR /scratch/general/lustre/$USER/$SLURM_JOB_ID
mkdir -p $SCR
cd $SCR

# Download the dataset into $SCR
wget https://www1.ncdc.noaa.gov/pub/data/uscrn/products/daily01/2020/CRND0103-2020-AK_Aleknagik_1_NNE.txt
```
Note that:
* The appropriate partition and account were used (`notchpeak-dtn` and `dtn`).
* `setenv SCR /scratch/general/lustre/$USER/$SLURM_JOB_ID` sets the environment variable `SCR` to a path in the scratch file system that is specific to the user and job ID. `setenv` is a `tcsh` command, equivalent to `export` in bash.
    * `SCR` is the name of the environment variable being set.
    * `/scratch/general/lustre` is a directory path on the file system intended for temporary or intermediate data storage. The **scratch** space is a high-performance temporary storage area.
    * `$USER` is an environment variable holding the user's login name; using it keeps each user's data separate.
    * `$SLURM_JOB_ID` is a Slurm environment variable containing the unique job ID assigned to the current job. It ensures that data from different jobs run by the same user do not collide and are stored in separate directories.
    * `/scratch/general/lustre/$USER/$SLURM_JOB_ID` is the value assigned to the `SCR` environment variable. It constructs a path where temporary files can be stored for the job.
* `mkdir -p $SCR` creates the directory if it doesn't already exist, ensuring that the path `/scratch/general/lustre/$USER/$SLURM_JOB_ID` exists.
* `cd $SCR` changes the current directory to the one just created.
* `wget [...]` downloads a specific file into the directory defined by `SCR`.
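
Since `setenv` corresponds to `export` in bash, the same job can be sketched as a bash script. This is an illustrative translation of the tcsh script above, not an official CHPC template: the `job_scratch_dir` and `main` function names and the guard on `SLURM_JOB_ID` are additions made here so the download runs only inside a Slurm job.

```bash
#!/bin/bash
#SBATCH --partition=notchpeak-dtn
#SBATCH --account=dtn
#SBATCH --time=1:00:00
#SBATCH -o slurm-%j.out-%N
#SBATCH -e slurm-%j.err-%N

# Build the per-user, per-job scratch path (same layout as the tcsh script).
job_scratch_dir() {
    echo "/scratch/general/lustre/$USER/$SLURM_JOB_ID"
}

main() {
    # bash equivalent of: setenv SCR /scratch/general/lustre/$USER/$SLURM_JOB_ID
    SCR="$(job_scratch_dir)"
    export SCR

    mkdir -p "$SCR"
    cd "$SCR" || exit 1   # stop if the scratch directory is unavailable

    wget https://www1.ncdc.noaa.gov/pub/data/uscrn/products/daily01/2020/CRND0103-2020-AK_Aleknagik_1_NNE.txt
}

# Run only when Slurm has set a job ID (an assumption of this sketch).
if [ -n "${SLURM_JOB_ID:-}" ]; then
    main
fi
```

Quoting `"$SCR"` and checking the result of `cd` are small defensive choices: if the scratch directory cannot be created, the job exits instead of downloading into whatever directory it started in.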
