# Setting up the computer environment

Notes and tips on software configuration and management on the Yale HPC cluster.

# 1. Accessing Yale HPC cluster

Use AnyConnect to establish the VPN connection:

* VPN: access.yale.edu
* username: dc2325
* MFA: push (then accept the request in the Duo Mobile app)

Or, from a Linux terminal:
```
sudo openconnect -u dc2325 access.yale.edu
```
Type your password at the first password prompt and `push` at the second. Then accept the request in the Duo Mobile app.

Before you can log in you must provide your computer's public key to the server. To do so, log in at https://secure.its.yale.edu/cas/login, then provide the key at http://gold.hpc.yale.internal/cgi-bin/sshkeys.py

To log in from the terminal:

```
ssh dc2325@farnam.hpc.yale.edu
```

Do not run any analysis on the login node. If you do, your job may run into a memory error or be terminated without notice. Instead, submit jobs with the `sbatch` command. For example, instead of running:

```
# `sos` is our pipeline software
sos run analysis.ipynb
```

you run

```
sbatch sos run analysis.ipynb
```

to submit the job. This uses the default resource allocation on the cluster. The rest of this document provides tips on more advanced job templates, job management, and logging in to an interactive compute node.
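When the defaults are not enough, resources are usually requested through `#SBATCH` directives at the top of a job script. The following is a minimal sketch, not a recommended configuration: the helper name `write_job_script` and every resource value are illustrative assumptions, and `general` is simply the partition that appears in the `squeue` example in section 3.

```shell
# write_job_script NAME COMMAND
# Hypothetical helper: prints a SLURM job script to stdout.
# All resource values below are placeholders; adjust them to your job.
write_job_script() {
    job_name=$1
    job_cmd=$2
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=${job_name}
#SBATCH --partition=general
#SBATCH --time=01:00:00
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=4G
#SBATCH --output=${job_name}-%j.out

${job_cmd}
EOF
}

# Print a script for the pipeline command used above.
write_job_script analysis "sos run analysis.ipynb"
```

Redirect the output to a file (for example `analysis.sh`, a name chosen here for illustration) and submit it with `sbatch analysis.sh`. SLURM replaces `%j` in the output file name with the job ID.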

### Loading and listing modules in your environment on the cluster

```
$ module avail # For a list of modules available to use
$ module list # Displays all of the module files that are currently loaded in your environment
$ module avail python # To look for specific modules
$ module spider # Displays a description of all available modules
$ module load <name> # to load pre-installed software
$ module unload <name> # to unload
```

### Copying files/directories from and to the cluster

To copy from the cluster to your local machine, run the following in your local terminal (the trailing `.` means the current directory):

```
scp dc2325@farnam.hpc.yale.edu:/home/dc2325/results/pleiotropy/2020-04_bolt/BMI/*.snp_stats.bgen.gz . 
scp dc2325@farnam.hpc.yale.edu:/home/dc2325/project/results/pleiotropy/2020-04_bolt/INT-WHR/snp_stats.all_chr.gz . 
```

From your local machine to the cluster:

```
scp INT_BMI.sumstats.gz dc2325@farnam.hpc.yale.edu:/home/dc2325/scratch60/plink-clumping
```


# 2. Installing software in your $HOME directory

## a. Conda installation

It is better to install your own Python, R, R packages, and other software with miniconda than to rely on the cluster.

### 1. Download miniconda3 to your local directory, then install it with `sh`
```
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
```
### 2. Add miniconda to the PATH

By default, the installer adds a block to your `.bashrc` that puts miniconda on the `$PATH`; you can modify it as needed.

If running `conda` gives an error after the installation, check your `.bash_profile` with `cat ~/.bash_profile` to see whether conda was added to the path.

If not, copy the code from the `.bashrc` file and add it manually:

1. Open the `.bashrc` file
2. Copy the code between `#make miniconda part of the $PATH` and `# <<< conda initialize <<<`
3. Add it to `.bash_profile`
4. Log out of the cluster and log back in, or run `source ~/.bash_profile`
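
Before copying anything by hand, it can help to confirm which startup file actually contains the conda block. The sketch below looks for the `# <<< conda initialize <<<` end marker mentioned in step 2; the function name `has_conda_init` is hypothetical.

```shell
# has_conda_init FILE
# Hypothetical helper: succeeds if FILE exists and contains the end marker
# of the conda initialization block.
has_conda_init() {
    [ -f "$1" ] && grep -qF '# <<< conda initialize <<<' "$1"
}

# Report the state of both startup files.
for f in "$HOME/.bashrc" "$HOME/.bash_profile"; do
    if has_conda_init "$f"; then
        echo "$f: conda block found"
    else
        echo "$f: conda block missing"
    fi
done
```

If only `.bashrc` reports the block, the manual copy described above is the step that is missing.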

### 3. Creating and switching environments with conda

For example, to create a conda environment with Python 2.7, activate it, and deactivate it again:

```
conda create --name py2 python=2.7
conda activate py2
conda deactivate
```

## b. BOLT-LMM installation

For local installs, add these lines to your `~/.bash_profile`:
```
# local installs
export MY_PREFIX=~/software
export PATH=$MY_PREFIX/bin:$PATH
export LD_LIBRARY_PATH=$MY_PREFIX/lib:$LD_LIBRARY_PATH
```

Then install the package:

```
mkdir -p ~/software/bin ~/software/lib && cd ~/software && \
wget https://data.broadinstitute.org/alkesgroup/BOLT-LMM/downloads/BOLT-LMM_v2.3.4.tar.gz && \
tar -zxvf BOLT-LMM_v2.3.4.tar.gz && \
rm -rf BOLT-LMM_v2.3.4.tar.gz && \
cp BOLT-LMM_v2.3.4/bolt ~/software/bin/ && \
cp BOLT-LMM_v2.3.4/lib/* ~/software/lib/
```
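
Because `.bash_profile` may be sourced more than once in a session, the `export PATH=...` line above can add the same directory repeatedly. A small guard avoids that. This is an optional sketch; `path_prepend` is a hypothetical helper name, not part of BOLT-LMM or the cluster setup.

```shell
# path_prepend DIR
# Hypothetical helper: puts DIR at the front of PATH unless it is already listed.
path_prepend() {
    case ":${PATH}:" in
        *":$1:"*) ;;                      # already present: leave PATH alone
        *) PATH="$1${PATH:+:${PATH}}" ;;  # prepend, handling an empty PATH
    esac
    export PATH
}

path_prepend "$HOME/software/bin"
```

The same pattern could be applied to `LD_LIBRARY_PATH` by writing an analogous function for that variable.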

## c. SAIGE installation

#### Creating a conda environment

As per SAIGE tutorial

```
conda create -n RSAIGE r-essentials r-base=3.6.1 python=2.7
conda activate RSAIGE
conda install -c anaconda cmake
conda install -c conda-forge gettext lapack r-matrix
conda install -c r r-rcpp  r-rcpparmadillo r-data.table r-bh
conda install -c conda-forge r-spatest r-rcppeigen r-devtools  r-skat r-rcppparallel r-optparse boost openblas
pip3 install cget click
conda env export > environment-RSAIGE.yml
```

If the installation above runs into errors, try the following variant (see https://github.com/weizhouUMICH/SAIGE/issues/118):

```
conda create -n RSAIGE r-essentials r-base=3.6.1 python=2.7
conda activate RSAIGE
conda install -c anaconda cmake boost zlib
conda install -c conda-forge gettext lapack r-matrix 
conda install -c conda-forge r-spatest r-rcppeigen r-devtools r-skat
conda install -c conda-forge r-rcpp  r-rcpparmadillo r-data.table r-bh
conda install -c conda-forge r-rcppparallel r-optparse
pip install cget click
```


#### Activate conda environment

```
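conda activate RSAIGE
# Derive the environment prefix from the active python executable
FLAGPATH=`which python | sed 's|/bin/python$||'`
export LDFLAGS="-L${FLAGPATH}/lib"
export CPPFLAGS="-I${FLAGPATH}/include"
# Or set the paths explicitly (user-specific; these override the two exports above)
export LDFLAGS="-L/gpfs/ysm/project/dewan/dc2325/conda_envs/RSAIGE/lib"
export CPPFLAGS='-I/gpfs/ysm/project/dewan/dc2325/conda_envs/RSAIGE/include'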
```

#### Install required R libraries


For this part I had to install MetaSKAT using the `remotes` library; otherwise I got an error.
```
install.packages("remotes")
remotes::install_github("lin-lab/MetaSKAT")
```
#### Install SAIGE

Method 2: this method did not work for me, so I proceeded to the next one.

```
devtools::install_github("weizhouUMICH/SAIGE")
```

Method 3

```
src_branch=master
repo_src_url=https://github.com/weizhouUMICH/SAIGE
git clone --depth 1 -b $src_branch $repo_src_url
R CMD INSTALL SAIGE
```

#### SAIGE on Yale cluster

To install SAIGE on the HPC cluster, first load the necessary modules to create a specific environment.

Search and load modules

* `module avail` for a list of all available modules
 
* `module avail R` to see a list of all available R modules on Yale's HPC cluster

	Select R-3.6.1 version if available by typing `module load R-3.6.1`

* `module avail gcc`

	Select gcc >= 5.4.0: `module load gcc-5.4.1`

* `module avail cmake`

	Select cmake 3.14.1: `module load cmake-3.14.1`

* `module avail cget`

	Select the latest version of cget: `module load cget`

* Install R packages using the `install_packages.R` script


Install SAIGE R package
	
```
R 
devtools::install_github("weizhouUMICH/SAIGE")
```


Fixing a problem with the conda template: `conda activate` cannot be executed from a bash script (see https://github.com/conda/conda/issues/7980).

Adding these lines to `.bash_profile` appeared to fix the issue:

```
export -f conda
export -f __conda_activate
export -f __conda_reactivate
export -f __conda_hashr
```
Then run `source ~/.bash_profile`.

## d. SoS installation and configuration

Install sos, sos-pbs and sos-notebook as the minimum requirements

```
conda install sos sos-pbs sos-notebook jupyterlab-sos sos-papermill -c conda-forge
```

Then check that all the Jupyter kernels are installed with `jupyter kernelspec list`. We need R, Bash and Python. If any of these kernels is missing, install them with `conda install sos-r sos-python sos-bash -c conda-forge`.

### SoS update to get the latest improvements

For development versions

```
pip install git+https://github.com/vatlab/sos -U
```

For released versions (once the improvements have been released)
```
pip install sos -U
```

If the update does not give you the full set of new features, reinstall:
```
pip uninstall sos
pip install sos -U
```

### Test your code before running it

To check whether the code in your notebook runs, do a dry run:

```
sos dryrun notebook.ipynb -q localhost
```


### SoS commands to create scripts from available notebooks

Make sure you have the latest sos-papermill version:

```
pip install sos-papermill -U
```

Then convert the notebook to HTML, executing it in the process:

```
sos convert ~/project/pleiotropy_UKB/analysis/minimal_working_example.ipynb ~/project/pleiotropy_UKB/docs/analysis/minimal_working_example.html --execute
```

## e. R installation using conda

Installing R with conda allows you to manage your own packages. Refer to https://docs.ycrc.yale.edu/clusters-at-yale/guides/r/ for more information.

```
conda install -c conda-forge r-base
```
To install R packages using conda:

```
conda install -c r package_name

# e.g. packages required for the LMM pipeline:
conda install -c r qqman
conda install -c r dplyr
conda install -c r ggrepel
conda install -c r ggplot2

# If this does not work, start R and install them from there:
R
install.packages("qqman")
install.packages("dplyr")
install.packages("ggrepel")
install.packages("ggplot2")
```

## f. QCTOOL version 2 usage

Installing from source requires zlib, and the compilation must be done with Python 2:

```
cd ~/software && \
hg clone -r beta https://gavinband@bitbucket.org/gavinband/qctool && cd qctool && \
./waf-1.5.18 configure --prefix=$MY_PREFIX && ./waf-1.5.18
```

If QCTOOL is available as a module on the HPC cluster, just load it:

```
module load qctool
```


## h. REGENIE installation

Requirements:
* regenie requires compilation with GCC version >= 5.1 (on Linux) or Clang version >=3.3 (on Mac OSX). In the cluster you can load gcc by typing:
```
module load GCCcore/7.3.0
```
* Download and install [BGEN library](https://enkre.net/cgi-bin/code/bgen/dir?ci=trunk) to ~/software

Steps:
1. Clone the repo (make sure you compile the latest version)
```
git clone https://github.com/rgcgithub/regenie.git
```
2. In the source code, edit the `BGEN_PATH` variable in the Makefile to point to the BGEN library path (`/home/dc2325/software/bgen/`), and remove any trailing space at the end of the line.
3. On the command line, run `make` from the main source code directory.
4. This should produce an executable called `regenie`.

Run the program:
```
./regenie --help
```


If you run into errors like these:

```
regenie: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by regenie)
regenie: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by regenie)
regenie: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by regenie)
```

try installing a newer GCC with conda:

```
conda install -c omgarcia gcc-6 # install GCC version 6
conda install libgcc            # install conda gcc tools
```
Make sure that the GLIBCXX_3.4.xx versions that could not be found before now appear in the list:

```
strings miniconda3/lib/libstdc++.so.6 | grep GLIBCXX
```

Then add it to the library path:

```
export LD_LIBRARY_PATH=<conda-env-path>/lib:$LD_LIBRARY_PATH
```
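
The check above can be wrapped in a small function so it is easy to repeat for each version named in the error messages. This is a sketch: `has_glibcxx` is a hypothetical helper, and it uses `grep` directly on the library file instead of `strings`, which is enough to see whether the version string is present.

```shell
# has_glibcxx LIBFILE VERSION
# Hypothetical helper: succeeds if LIBFILE exists and contains the string
# GLIBCXX_VERSION (a plain substring match).
has_glibcxx() {
    [ -f "$1" ] && grep -qF "GLIBCXX_$2" "$1"
}

# The path below is the one used in the strings command above.
lib="miniconda3/lib/libstdc++.so.6"
for v in 3.4.20 3.4.21; do
    if has_glibcxx "$lib" "$v"; then
        echo "GLIBCXX_$v: found"
    else
        echo "GLIBCXX_$v: not found in $lib"
    fi
done
```

Run it from the directory that contains `miniconda3`, or change `lib` to the full path of the library inside your conda environment.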

## Using Singularity on Yale's cluster

Set `SINGULARITY_CACHEDIR` in your `.bash_profile` or `.bashrc` if you want pulled files (which can get big) to be stored somewhere other than `$HOME/.singularity`:

```
export SINGULARITY_CACHEDIR=~/scratch60/.singularity
```

# 3. SLURM commands on Yale's cluster

Submit a job script
```
sbatch <script>
```

List queued and running jobs
```
squeue -u$USER
```
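
With many jobs in the queue, a per-state summary can be easier to read than the full listing. The sketch below assumes the standard `squeue` options `-h` (no header) and `-o %T` (print only the job state); `count_states` is a hypothetical helper that tallies whatever state names it reads on standard input.

```shell
# count_states
# Hypothetical helper: reads one job state per line on stdin and prints
# "STATE COUNT" pairs, sorted by state name.
count_states() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# On the cluster you would feed it from squeue, for example:
#   squeue -h -u "$USER" -o %T | count_states
# Here the input is simulated so the snippet runs anywhere.
printf '%s\n' RUNNING PENDING RUNNING | count_states
```

With the simulated input this prints `PENDING 1` and `RUNNING 2` on separate lines.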

Cancel a queued job or kill a running job
```
scancel <job_id>
```

Cancel all your jobs (running and pending)
```
scancel -u$USER
```

Check status of individual job (including failed or completed)
```
sacct -j <job_id>
```

To see all pending jobs sorted by priority (jobs with higher priority at the top)
```
squeue --sort=-p -t PD -p general
```

To cancel all pending jobs
```
scancel -t PENDING -u$USER
```

To see files that will be deleted from scratch60 (they purge every 30 days)
```
cat /gpfs/ysm/scratch60/todelete/${UID}
```

To see your recent jobs (by default `sacct` lists jobs since midnight; use `-S` to set an earlier start date)
```
sacct
sacct -S <start-date> -u <user-name>
```

# 4. Starting a jupyter notebook server on Yale's cluster

### 1. Submit/Start a jupyter-notebook server as a batch job. 

1.1. Clone the GitHub repo that contains the notebook to run, go to its directory, and run `sbatch jupyter-tunnel.sh` to start the server, i.e. submit it as a job.

1.2. Check that your job was submitted and is running with `squeue -u$USER`:

```
Find the ST column:
   R: indicates the job is running 
   PD: the job is pending (wait for it to start running; until then the log file will not contain the connection information described in step 2.1)
```

### 2. Start a ssh tunnel

2.1. Once the job is running, run `ls` in the directory you submitted the job script from to find the log file, then open it. Its name looks like `jupyter-notebook-[jobid].log`, and it contains all the information on how to connect.


Log file example: 

The content of your log file should look like the following. Find the highlighted information in your own log file; it is used in the next steps:

[hs863@farnam2 UKBB_GWAS_dev]$ cat jupyter-notebook-22823043.log

For more info and how to connect from windows,
   see https://docs.ycrc.yale.edu/clusters-at-yale/guides/jupyter/

MacOS or linux terminal command to create your ssh tunnel
`ssh -N -L 9240:c14n12:9240 hs863@farnam.hpc.yale.edu` --used in step 2.2

Windows MobaXterm info
Forwarded port:same as remote port
Remote server: c14n12
Remote port: 9240
SSH server: farnam.hpc.yale.edu
SSH login: hs863
SSH port: 22

Use a Browser on your local machine to go to:
`localhost:9240  (prefix w/ https:// if using password)` --e.g. https://localhost:9240 -- used step 3.1

[I 16:36:55.522 LabApp] Writing notebook server cookie secret to /gpfs/ysm/home/hs863/.local/share/jupyter/runtime/notebook_cookie_secret

[I 16:37:02.326 LabApp] JupyterLab extension loaded from /gpfs/ysm/project/dewan/hs863/conda_envs/notebook_env/lib/python3.8/site-packages/jupyterlab

[I 16:37:02.328 LabApp] JupyterLab application directory is /gpfs/ysm/project/dewan/hs863/conda_envs/notebook_env/share/jupyter/lab

[I 16:37:02.331 LabApp] Serving notebooks from local directory: /gpfs/ysm/home/hs863

[I 16:37:02.331 LabApp] The Jupyter Notebook is running at:

[I 16:37:02.331 LabApp] http://c14n12:9240/?token=0454c9e908381e176a57ee92615b9d7ab07f00c20370b967 `(get the string after token=)` --used in step 3.2 / alternative url 1

[I 16:37:02.331 LabApp]  or http://127.0.0.1:9240/?token=0454c9e908381e176a57ee92615b9d7ab07f00c20370b967 -- alternative url 2 of step3.1

[I 16:37:02.331 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).



2.2 On a Mac or Linux machine, open a new terminal window and start the tunnel with the `ssh` command given in the log file.
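
Rather than copying the values by eye, the tunnel command and the token can be pulled out of the log with `sed`. This is a sketch that assumes the log follows the layout of the example above (without the added annotations); `tunnel_cmd` and `notebook_token` are hypothetical helper names.

```shell
# tunnel_cmd LOGFILE
# Hypothetical helper: prints the first "ssh -N -L port:node:port user@host"
# command found in LOGFILE.
tunnel_cmd() {
    sed -n 's/.*\(ssh -N -L [0-9][0-9]*:[^ :]*:[0-9][0-9]* [^ ]*\).*/\1/p' "$1" | head -n 1
}

# notebook_token LOGFILE
# Hypothetical helper: prints the first hexadecimal token found after "token=".
notebook_token() {
    sed -n 's/.*token=\([0-9a-f][0-9a-f]*\).*/\1/p' "$1" | head -n 1
}

# Usage on the cluster (the file name is the one from the example above):
#   tunnel_cmd jupyter-notebook-22823043.log
#   notebook_token jupyter-notebook-22823043.log
```

Run the printed `ssh` command on your local machine, not on the cluster.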

### 3. Use local browser to connect and run the notebook

3.1 Browse the notebook: open a browser on your local machine and enter the address `http://localhost:port` (with the port from the log file), or use one of the alternative URLs in the log file.

3.2 Copy the token from the log file, paste it into the token field, and press Enter.

**Tips and Notes:**
* The address Jupyter creates by default (the one with the name of a compute node) will not work outside the cluster's network.
* The server runs for 24h. After that you will need to start a new one, but your changes are saved to your folders. It is also better to shut the server down when you are not using it, to free up compute resources.

## Accessing an interactive node

Interactive jobs can be used for testing and troubleshooting code. By requesting an interactive job, you will be allocated resources and logged onto the node in a shell.

```
srun --pty -p interactive --mem-per-cpu=60G --cpus-per-task=1 --time=1-00:00:00 bash
```

## Copying results to shared folder

After completing the runs of the association analysis you should copy the results to the project shared folder

#### Path

```
/SAY/dbgapstg/scratch/UKBiobank/results/BOLTLMM_results/results_imputed_data/
/SAY/dbgapstg/scratch/UKBiobank/results/FastGWA_results/results_imputed_data/
```
* The combined association analyses for the imputed data in `.snp_stats.gz`
* The combined association analyses for the hard called genotypes in `.stats.gz`
* The gzipped `.stderr` files in `.stats.stderr.gz`
* The gzipped `.stdout` files in `.stdout.gz`

# 5. Checking job failure on cluster

When a job fails on the cluster, go through the following four checks to find the origin of the error.

### Step 1
Check the log file generated with `&> file.log`, located in the folder where you ran the script.

### Step 2
Check the stderr files located in the output folder you chose.

### Step 3
Check the `.err` files generated in the folder where you ran the script.

### Step 4
Check the output of `sos status task -v4`.

Note: it is suggested that you remove all `.err` and `.out` files (`rm *.err *.out`) before you start another round.
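
For Step 3, when a run leaves many `.err` files, it helps to look only at those that actually contain output. The sketch below lists the non-empty ones; `nonempty_err_files` is a hypothetical helper, and the `.err` suffix is the one used above.

```shell
# nonempty_err_files [DIR]
# Hypothetical helper: prints the .err files in DIR (default: current
# directory) that are not empty, one per line.
nonempty_err_files() {
    dir=${1:-.}
    for f in "$dir"/*.err; do
        # -s is true only for files that exist and have a size above zero;
        # it also skips the unexpanded pattern when no .err file exists.
        if [ -s "$f" ]; then
            echo "$f"
        fi
    done
}

nonempty_err_files .
```

Pass the folder you submitted the script from as the argument if you are not already in it.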

# 6. Purging traces after runs

First, make sure there are no running pipelines.
Then, from the working directory (where you ran the `sbatch` scripts), run the following commands to remove any traces of previous runs:

```
rm -rf ~/.sos
rm -rf .sos
```

# 7. Accessing Columbia cluster

First, set a password at this website:

https://portal.neuro.columbia.edu/vpn/index.html

```
uni: dmc2245
Password: Neurology99 (assigned by default)
```

You will need to change it the first time you log in.

Then, in AnyConnect, enter the VPN details:

```
ssl.cpmc.columbia.edu
select CUMC-VPN
uni: dmc2245
password: it should be the same as myColumbia (cas.columbia)
push (to verify with duo mobile)
```
After this, go to the terminal and ssh into the cluster:

```
ssh dmc2245@hgrcgrid.cpmc.columbia.edu
password (it is different from the VPN password)
```

# 8. Syncing the results between clusters
```
rsync -auzP /from/this/yale/path/* /to/this/columbia/path
```

# 9. Building an image using singularity

Yale's cluster

```
   singularity remote login
   singularity build --remote  marp.sif Singularity
   singularity shell -B /tmp:/scratch marp.sif
```
```
   singularity build lmm.sif docker://statisticalgenetics/lmm:1.9
   singularity build annovar.sif docker://gaow/gatk4-annovar
```

Columbia's cluster

```
module load Singularity/3.5.3
singularity remote login
> enter your token
singularity build --remote lmm.sif docker://statisticalgenetics/lmm:2.2
```
## Using singularity on the cluster

```
singularity shell --shell /bin/bash /mnt/mfs/statgen/containers/lmm.sif
```
