# Utah HPC environment notes
# How to connect to a cluster at Utah HPC?
Run the following in a terminal, replacing `[uNID]` with your university ID and `[cluster_name]` with the name of the cluster:

```bash
ssh [uNID]@[cluster_name].chpc.utah.edu
```

The `notchpeak`, `kingspeak` and `lonepeak` clusters can be reached this way because they are not restricted environments. For example:

```bash 
ssh u6059911@kingspeak.chpc.utah.edu
```
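
The host name follows a fixed pattern, so it can be assembled by a small helper. The sketch below is only an illustration: the function name `chpc_host` and the placeholder uNID `u0000000` are made up for this example, and the accepted names are simply the three clusters mentioned above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the SSH target for a CHPC cluster.
# Only the three clusters named in these notes are accepted.
chpc_host() {
    local unid="$1" cluster="$2"
    case "$cluster" in
        notchpeak|kingspeak|lonepeak)
            printf '%s@%s.chpc.utah.edu\n' "$unid" "$cluster"
            ;;
        *)
            printf 'unknown cluster: %s\n' "$cluster" >&2
            return 1
            ;;
    esac
}

# Print the target instead of connecting, so the sketch runs anywhere.
chpc_host u0000000 kingspeak
```

On a real machine it could be used as `ssh "$(chpc_host u0000000 notchpeak)"`.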

Once logged in, we can navigate, create and manipulate directories, starting from the home directory:

```bash 
[u6059911@kingspeak1:~]$ pwd
```

This prints:

```bash 
/uufs/chpc.utah.edu/common/home/u6059911
```

# How to install Miniconda and create programming environments

Start by running:

```bash 
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
```

This downloads the Miniconda installer. Next, run it in batch mode:

```bash 
bash ./Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/software/pkg/miniconda3 -s
```
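
Since batch mode runs without prompts, it can help to check whether the target prefix already holds an installation before running the installer again. A minimal sketch, assuming the installer sits in the current directory; the function name `plan_miniconda_install` is hypothetical, and it only prints what it would do instead of installing anything.

```bash
#!/usr/bin/env bash
# Hypothetical dry-run helper: decide whether the installer needs to run.
# It prints the command rather than executing it.
plan_miniconda_install() {
    local prefix="${1:-$HOME/software/pkg/miniconda3}"
    if [ -x "$prefix/bin/conda" ]; then
        echo "skip: conda already present in $prefix"
    else
        # -b: batch mode (no prompts), -p: install prefix,
        # -s: skip pre/post-link scripts
        echo "bash ./Miniconda3-latest-Linux-x86_64.sh -b -p $prefix -s"
    fi
}

plan_miniconda_install "$HOME/software/pkg/miniconda3"
```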

Here `$HOME/software/pkg/miniconda3` is the directory where Miniconda gets installed. Its contents look like this:

```bash
[u6059911@kingspeak1:~]$ ls -p software/pkg/miniconda3/
bin/                         man/
cmake/                       pkgs/
compiler_compat/             sbin/
condabin/                    share/
conda-meta/                  shell/
envs/                        ssl/
etc/                         x86_64-conda_cos7-linux-gnu/
include/                     x86_64-conda-linux-gnu/
lib/
```


All environments are stored under `envs/`. To make the Miniconda installation easier to manage, Utah CHPC suggests creating a custom environment module. First, create a **user modules directory**:
```bash
mkdir -p $HOME/MyModules/miniconda3
```
Then copy the module file:
```bash
cp /uufs/chpc.utah.edu/sys/installdir/python/modules/miniconda3/latest.lua $HOME/MyModules/miniconda3
```
Here `/uufs/chpc.utah.edu/sys/installdir/python/modules/miniconda3/` is a directory provided by Utah CHPC, and `latest.lua` is a module file for the environment module system; it sets the environment variables the installation needs. To avoid errors when running `conda` commands, we must also edit `.bashrc`. From the home directory (`~/`), open it with:
```bash
nano .bashrc
```
Then add the following block to the file:
```bash
unset -f conda
# Load user modules
if [ -f ~/.custom.sh ]; then
    source ~/.custom.sh
fi
```
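
Editing by hand works, but pasting the block twice would source `~/.custom.sh` twice. As a sketch of an alternative, the snippet below appends the block only when a marker comment is not already present. The marker text and the function name `add_conda_block` are assumptions made for this example, and it is demonstrated on a temporary file rather than the real `~/.bashrc`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: append the block to a bashrc-style file exactly once.
# MARKER is an arbitrary comment invented for this sketch.
MARKER='# chpc-notes: user modules'

add_conda_block() {
    local rc_file="$1"
    # -F: fixed string, -q: quiet; succeeds if the marker is already there.
    if grep -qF "$MARKER" "$rc_file" 2>/dev/null; then
        return 0
    fi
    cat >> "$rc_file" <<EOF

$MARKER
unset -f conda
# Load user modules
if [ -f ~/.custom.sh ]; then
    source ~/.custom.sh
fi
EOF
}

# Demonstrate on a scratch file; point it at ~/.bashrc only after review.
demo_rc="$(mktemp)"
add_conda_block "$demo_rc"
add_conda_block "$demo_rc"   # second call is a no-op
grep -c 'unset -f conda' "$demo_rc"   # prints 1
```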
The complete `.bashrc` then looks as follows (the `conda initialize` block near the end is managed by `conda init`):
```bash
(base) [u6059911@kingspeak1:~]$ cat .bashrc
# please send comments/suggestions to issues@chpc.utah.edu
# currently works on all CHPC based clusters and on Linux desktops

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

#MC if UUFSCELL is not defined, try to source it
if [ -z "$UUFSCELL" ] ; then       # Test whether the the $UUFSCELL has zero length
  if [ -e /etc/profile.d/uufs.sh ] ; then
     . /etc/profile.d/uufs.sh
  fi
fi

# Init. modules before setting UUFSCELL
# Unset MODULEPATH + set LUA + LMOD
source /uufs/chpc.utah.edu/sys/modulefiles/scripts/module_init/module_init.sh

# stacksize by default :very small => programs with large static data to segfault
ulimit -s unlimited

# create .pbs_spool if not created already
if [ ! -d $HOME/.pbs_spool ] ; then
     mkdir $HOME/.pbs_spool
fi

# NEVER!!! ADD env. variables into .aliases -> interferes with modules
if [ -f "$HOME/.aliases" ] ; then
    source ~/.aliases
fi

if [ -n "$UUFSCELL" ] ; then       # If the String length is NON-zero
    GRP=`echo $UUFSCELL | cut -d . -f 2`
    
    if [[ $BASHRC_LOADED == 1 ]] ; then
      if [ ! -n "$TERM" ] ; then return; fi
      if [[ $TERM != "screen" ]] ; then return; fi
    fi
    export BASHRC_LOADED=0

    # Template custom.sh to be found at :
    #   /uufs/chpc.utah.edu/sys/modulefiles/templates/custom.sh
    if [ -f "$HOME/.custom.sh" ] ; then
        source $HOME/.custom.sh
    fi
fi # End of the UUFSCELL loop

# scp defines TERM='dumb', so, this condition is not valid
#if [ ! -n "$TERM" ] ; then
# need to use PS1 instead
if [ -z "$PS1" ] ; then
    return
fi

if [[ "$UUFSCELL" = "redwood.bridges" ]] ; then
# set up custom prompt with PE string
  PS1="[\u@\h *PE* \W]\$"
else
#  set prompt = "[$USER@%m:%b%c2]%b "
  PS1="[\u@\h:\W]\$ "
fi

# list completions when the tab key is hit
# menu-complete cycles through options, complete stops at first ambiguous character
#bind "TAB:menu-complete"
bind "TAB:complete"
bind "set show-all-if-ambiguous on"
# expand variable names in tab auto-completion
shopt -s direxpand
export TMOUT=0

unset -f conda

# Load user modules
if [ -f ~/.custom.sh ]; then
    source ~/.custom.sh
fi

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

#SLURM Aliases that provide information in a useful manner for our clusters
alias si="sinfo -o \"%20P %5D %14F %8z %10m %10d %11l %32f %N\""
alias si2="sinfo -o \"%20P %5D %6t %8z %10m %10d %11l %32f %N\""
alias sq="squeue -o \"%8i %12j %4t %10u %20q %20a %10g %20P %10Q %5D %11l %11L %R\""
```
Note that the aliases declared at the end of `.bashrc` (`si`, `si2` and `sq`) wrap `sinfo` and `squeue`; they can be used later to check the status of the nodes, including GPU nodes, on a given partition. We also need to modify `.custom.sh` (for example with `nano .custom.sh` in the home directory `~/`), as suggested by the Utah CHPC documentation, by adding these lines at the end of the file:
```bash
module use $HOME/MyModules
module load miniconda3/latest
```
The resulting `.custom.sh` is:
```bash
(base) [u6059911@kingspeak1:~]$ cat .custom.sh
#!/bin/bash

# Here add custom module loads for all CHPC Linux systems

# ----------------------------------------------------------------------
if [[ "$UUFSCELL" = "kingspeak.peaks" ]] ; then
# add custom module loads after this
     :

# ----------------------------------------------------------------------
# Do Notchpeak specific initializations
elif [[ "$UUFSCELL" = "notchpeak.peaks" ]] ; then
# add custom module loads after this
     :

# ----------------------------------------------------------------------
# Do Lonepeak specific initializations
elif [[ "$UUFSCELL" = "lonepeak.peaks" ]] ; then
# add custom module loads after this
     :

# ----------------------------------------------------------------------
# Do Ash specific initializations
elif [[ "$UUFSCELL" = "ash.peaks" ]] ; then
# add custom module loads after this
     :

# ----------------------------------------------------------------------
# Do Tangent specific initializations
elif [[ "$UUFSCELL" = "tangent.peaks" ]] ; then
# add custom module loads after this
     :

# ----------------------------------------------------------------------
elif [[ "$UUFSCELL" = "redwood.bridges" ]] ; then
# add custom module loads after this
     :

# ----------------------------------------------------------------------
# Do astro.utah.edu specific initializations
elif [[ "$UUFSCELL" = "astro.utah.edu" ]] ; then
# add custom module loads after this
	:

# ----------------------------------------------------------------------
# Do cemi specific initializations
elif [[ "$UUFSCELL" = "cemi" ]] ; then
# add custom module loads after this
	:

fi

# Uncomment to set TMPDIR from default (and small) /tmp to /scratch/local
#if [ ! -d /scratch/local/$USER ] ; then
#     mkdir /scratch/local/$USER 
#fi
#export TMPDIR=/scratch/local/$USER 
module use $HOME/MyModules
module load miniconda3/latest
```
To run jobs on the CHPC clusters, we need to write Slurm scripts, for example with `nano [name].slurm`. The following **Slurm batch script** defines a job that checks that a GPU is available and then activates a conda environment to run a Python script. The environment is activated on the `kingspeak` compute node for the duration of the job only; it is gone once the job finishes.
```bash
(base) [u6059911@kingspeak2:test_env]$ cat run_job.slurm 
#!/bin/bash
#SBATCH --partition=kingspeak-gpu
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=1:00:00
#SBATCH --job-name=my_job
#SBATCH --gres=gpu:1
#SBATCH --account=kingspeak-gpu
#SBATCH --output=my_job_%j.out
#SBATCH --error=my_job_%j.err

source ~/software/pkg/miniconda3/etc/profile.d/conda.sh
conda activate ~/software/pkg/miniconda3/envs/my_conda_env

echo "Conda environment:"
conda info --envs

echo "Checking GPU:"
nvidia-smi

echo "Python script results:"
python ~/scripts/test_env/sum.py
```

Note these two lines:
```bash
source ~/software/pkg/miniconda3/etc/profile.d/conda.sh
conda activate ~/software/pkg/miniconda3/envs/my_conda_env
```
The first command makes the `conda` shell functions available in the batch shell, since `conda.sh` contains the settings and functions conda needs to work in the current session. The second command activates the specific conda environment that we created (`my_conda_env`).

```bash
(base) [u6059911@kingspeak2:test_env]$ sbatch run_job.slurm 
Submitted batch job 13328606
(base) [u6059911@kingspeak2:test_env]$ squeue -u $USER
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          13328606 kingspeak   my_job u6059911  R       0:10      1 kp298
(base) [u6059911@kingspeak2:test_env]$ squeue -u $USER
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
(base) [u6059911@kingspeak2:test_env]$ ls
my_job_13328606.err  my_job_13328606.out  run_job.slurm  sum.py
(base) [u6059911@kingspeak2:test_env]$ cat my_job_13328606.out 
Conda environment:
# conda environments:
#
base                     /uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3
my_conda_env          *  /uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3/envs/my_conda_env

Checking GPU:
Fri Jul 26 04:34:25 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX TITAN X     On  |   00000000:04:00.0 Off |                  N/A |
| 22%   32C    P8             16W /  250W |       1MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
Python script results:
Python executable: /uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3/envs/my_conda_env/bin/python
Python version: 3.8.19 (default, Mar 20 2024, 19:58:24) 
[GCC 11.2.0]
NumPy version: 1.24.3
The sum of 2 and 3 is: 5
```
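
`sbatch` prints the job ID, and `%j` in the `#SBATCH --output` pattern expands to that same ID, so the log file name can be derived in a script. A minimal sketch, using the submission message from the session above as sample input instead of calling `sbatch`; the variable names are illustrative only.

```bash
#!/usr/bin/env bash
# Sample text copied from the session above; on a cluster this would be
#   submit_msg="$(sbatch run_job.slurm)"
submit_msg="Submitted batch job 13328606"

# The job ID is the last space-separated field of the message.
job_id="${submit_msg##* }"

# Reject anything that is not purely numeric before using it in a file name.
case "$job_id" in
    ''|*[!0-9]*) echo "could not parse a job ID from: $submit_msg" >&2 ;;
    *)           echo "stdout log: my_job_${job_id}.out" ;;
esac
```

Slurm also provides `sbatch --parsable`, which prints just the ID and makes the string handling unnecessary.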
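
The environment definition is kept in `~/environments/my_conda_env/`, and the environment is activated by name: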
```bash
(base) [u6059911@kingspeak2:~]$ cd environments/my_conda_env/
(base) [u6059911@kingspeak2:my_conda_env]$ ls
environment.yml
(base) [u6059911@kingspeak2:my_conda_env]$ cd
(base) [u6059911@kingspeak2:~]$ conda activate my_conda_env
```

With the environment active, `conda list` shows the packages installed in it:
```bash
(my_conda_env) [u6059911@kingspeak2:~]$ conda list
# packages in environment at /uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3/envs/my_conda_env:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
blas                      1.0                         mkl  
bottleneck                1.3.7            py38ha9d4c09_0  
brotli                    1.0.9                h5eee18b_8  
brotli-bin                1.0.9                h5eee18b_8  
bzip2                     1.0.8                h5eee18b_6  
ca-certificates           2024.7.2             h06a4308_0  
contourpy                 1.0.5            py38hdb19cb5_0  
cycler                    0.11.0             pyhd3eb1b0_0  
cyrus-sasl                2.1.28               h52b45da_1  
dbus                      1.13.18              hb2f20db_0  
expat                     2.6.2                h6a678d5_0  
fontconfig                2.14.1               h4c34cd2_2  
fonttools                 4.51.0           py38h5eee18b_0  
freetype                  2.12.1               h4a9f257_0  
glib                      2.78.4               h6a678d5_0  
glib-tools                2.78.4               h6a678d5_0  
gst-plugins-base          1.14.1               h6a678d5_1  
gstreamer                 1.14.1               h5eee18b_1  
icu                       73.1                 h6a678d5_0  
importlib_resources       6.4.0            py38h06a4308_0  
intel-openmp              2023.1.0         hdb19cb5_46306  
jpeg                      9e                   h5eee18b_1  
kiwisolver                1.4.4            py38h6a678d5_0  
krb5                      1.20.1               h143b758_1  
lcms2                     2.12                 h3be6417_0  
ld_impl_linux-64          2.38                 h1181459_1  
lerc                      3.0                  h295c915_0  
libbrotlicommon           1.0.9                h5eee18b_8  
libbrotlidec              1.0.9                h5eee18b_8  
libbrotlienc              1.0.9                h5eee18b_8  
libclang                  14.0.6          default_hc6dbbc7_1  
libclang13                14.0.6          default_he11475f_1  
libcups                   2.4.2                h2d74bed_1  
libdeflate                1.17                 h5eee18b_1  
libedit                   3.1.20230828         h5eee18b_0  
libffi                    3.4.4                h6a678d5_1  
libgcc-ng                 11.2.0               h1234567_1  
libglib                   2.78.4               hdc74915_0  
libgomp                   11.2.0               h1234567_1  
libiconv                  1.16                 h5eee18b_3  
libllvm14                 14.0.6               hdb19cb5_3  
libpng                    1.6.39               h5eee18b_0  
libpq                     12.17                hdbd6064_0  
libstdcxx-ng              11.2.0               h1234567_1  
libtiff                   4.5.1                h6a678d5_0  
libuuid                   1.41.5               h5eee18b_0  
libwebp-base              1.3.2                h5eee18b_0  
libxcb                    1.15                 h7f8727e_0  
libxkbcommon              1.0.1                h5eee18b_1  
libxml2                   2.10.4               hfdd30dd_2  
lz4-c                     1.9.4                h6a678d5_1  
matplotlib                3.7.2            py38h06a4308_0  
matplotlib-base           3.7.2            py38h1128e8f_0  
mkl                       2023.1.0         h213fc3f_46344  
mkl-service               2.4.0            py38h5eee18b_1  
mkl_fft                   1.3.8            py38h5eee18b_0  
mkl_random                1.2.4            py38hdb19cb5_0  
mysql                     5.7.24               h721c034_2  
ncurses                   6.4                  h6a678d5_0  
numexpr                   2.8.4            py38hc78ab66_1  
numpy                     1.24.3           py38hf6e8229_1  
numpy-base                1.24.3           py38h060ed82_1  
openjpeg                  2.4.0                h9ca470c_2  
openssl                   3.0.14               h5eee18b_0  
packaging                 24.1             py38h06a4308_0  
pandas                    2.0.3            py38h1128e8f_0  
pcre2                     10.42                hebb0a14_1  
pillow                    10.4.0           py38h5eee18b_0  
pip                       24.0             py38h06a4308_0  
ply                       3.11                     py38_0  
pyparsing                 3.0.9            py38h06a4308_0  
pyqt                      5.15.10          py38h6a678d5_0  
pyqt5-sip                 12.13.0          py38h5eee18b_0  
python                    3.8.19               h955ad1f_0  
python-dateutil           2.9.0post0       py38h06a4308_2  
python-tzdata             2023.3             pyhd3eb1b0_0  
pytz                      2024.1           py38h06a4308_0  
qt-main                   5.15.2              h53bd1ea_10  
readline                  8.2                  h5eee18b_0  
setuptools                69.5.1           py38h06a4308_0  
sip                       6.7.12           py38h6a678d5_0  
six                       1.16.0             pyhd3eb1b0_1  
sqlite                    3.45.3               h5eee18b_0  
tbb                       2021.8.0             hdb19cb5_0  
tk                        8.6.14               h39e8969_0  
tomli                     2.0.1            py38h06a4308_0  
tornado                   6.4.1            py38h5eee18b_0  
unicodedata2              15.1.0           py38h5eee18b_0  
wheel                     0.43.0           py38h06a4308_0  
xz                        5.4.6                h5eee18b_1  
zipp                      3.17.0           py38h06a4308_0  
zlib                      1.2.13               h5eee18b_1  
zstd                      1.5.5                hc292b87_2  
(my_conda_env) [u6059911@kingspeak2:~]$ 
```

The `environment.yml` file declares the top-level dependencies:
```bash
(my_conda_env) [u6059911@kingspeak2:~]$ cd environments/my_conda_env/ && cat environment.yml
name: my_conda_env
channels:
  - defaults
dependencies:
  - python=3.8
  - numpy
  - pandas
  - matplotlib
```
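
An environment file like this one can be written from the shell and then passed to `conda env create`. The sketch below writes the same definition to a scratch directory (the scratch location is an assumption of the example) and prints the `conda` command instead of running it, so it works on machines without conda.

```bash
#!/usr/bin/env bash
# Write the environment definition shown above to a scratch directory.
env_dir="$(mktemp -d)"
cat > "$env_dir/environment.yml" <<'EOF'
name: my_conda_env
channels:
  - defaults
dependencies:
  - python=3.8
  - numpy
  - pandas
  - matplotlib
EOF

# Print, rather than run, the command that would build the environment.
echo "conda env create -f $env_dir/environment.yml"
```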

To check that `conda` is on the `PATH` and see which version is installed:
```bash
(base) [u6059911@kingspeak1:~]$ echo $PATH
/uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3/bin:/uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3/condabin:/uufs/chpc.utah.edu/common/home/u6059911/software/pkg/miniconda3/bin:/uufs/chpc.utah.edu/sys/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/uufs/chpc.utah.edu/common/home/u6059911/.dotnet/tools:/uufs/kingspeak.peaks/sys/pkg/slurm/std/bin:/uufs/chpc.utah.edu/common/home/u6059911/bin
(base) [u6059911@kingspeak1:~]$ conda --version
conda 24.5.0
```
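
The `PATH` above lists the Miniconda `bin` directory twice. That is harmless, but it usually means more than one startup step adds it. One way to spot such repeats is to print one entry per line and keep those that have been seen before. A small sketch on a shortened, made-up sample value; the names `sample_path` and `path_duplicates` exist only in this example.

```bash
#!/usr/bin/env bash
# Shortened sample modelled on the PATH shown above (paths are made up).
sample_path="/home/u/miniconda3/bin:/home/u/miniconda3/condabin:/home/u/miniconda3/bin:/usr/bin"

# Split on ':' and print each entry the second time it appears.
path_duplicates() {
    printf '%s\n' "$1" | tr ':' '\n' | awk 'seen[$0]++ == 1'
}

path_duplicates "$sample_path"
```

Calling `path_duplicates "$PATH"` applies the same check to the live value.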

Finally, `freegpus` reports the GPUs that are currently free on a partition:
```bash
(base) [u6059911@kingspeak2:~]$ freegpus -p kingspeak-gpu
GPUS_FREE: kingspeak-gpu
titanx (x14)
(base) [u6059911@kingspeak2:~]$ freegpus -p notchpeak-gpu
GPUS_FREE: notchpeak-gpu
2080ti (x1)
3090 (x3)
(base) [u6059911@kingspeak2:~]$ freegpus -p lonepeak-gpu
GPUS_FREE: lonepeak-gpu
1080ti (x24)
```
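
The `freegpus` output is easy to post-process, for example to get one total per partition. A rough sketch with `awk`, fed with text copied from the session above rather than a live `freegpus` call; the function name `count_free_gpus` is hypothetical and the parsing assumes the `name (xN)` layout shown here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: sum the "(xN)" counts in freegpus-style output.
count_free_gpus() {
    awk '
        # Lines look like "3090 (x3)": keep only the digits of the last field.
        /\(x[0-9]+\)/ {
            n = $NF
            gsub(/[^0-9]/, "", n)
            total += n
        }
        END { print total + 0 }
    '
}

# Sample copied from the notchpeak-gpu query above.
count_free_gpus <<'EOF'
GPUS_FREE: notchpeak-gpu
2080ti (x1)
3090 (x3)
EOF
```

For the sample input this prints `4`.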