Additions to deployment notes and scripts

MRIresearch · Jan 10, 2024 · 22d9f71
1 parent f0a1e48
commit 22d9f71
Showing 10 changed files with 199 additions and 35 deletions.
diff --git a/deployment/README.md b/deployment/README.md
@@ -1,14 +1,102 @@
-# PANpipelines deployment
+# PANpipelines Deployment
+These notes provide guidelines for reproducing the PANpipelines in a SLURM-based HPC environment. Notes specific to the University of Arizona's Puma HPC environment are provided in the section **University of Arizona HPC Deployment**. General notes for deployment are provided in **General HPC Deployment**.
 
+# University of Arizona HPC Deployment
+## Load Python Environment and Prepare Virtual Environment
+The most convenient approach to deploying these pipelines on the U of A's HPC cluster is to use the preinstalled Python module, with `virtualenv` managing the Python dependencies.
+
+
+```
+module load python/3.8/3.8.2
+pip install --user virtualenv
+```
+
+Create a virtual environment called `panvenv`
+
+```
+mkdir /xdisk/ryant/[USERNAME]/venvs
+cd /xdisk/ryant/[USERNAME]/venvs
+virtualenv -p python3 ./panvenv
+```
+
+Activate the environment and install the latest version of PANpipelines using `pip`:
+```
+module load python/3.8/3.8.2
+source /xdisk/ryant/[USERNAME]/venvs/panvenv/bin/activate
+pip install -U panpipelines
+```
+
+## Create personal deployment directory
+Clone the PANpipelines repository and copy the deployment folder to your own workspace as follows:
+```
+cd /xdisk/ryant/[USERNAME]
+git clone https://github.com/MRIresearch/PANpipelines.git
+cp -R /xdisk/ryant/[USERNAME]/PANpipelines/deployment /xdisk/ryant/[USERNAME]/newdeployment
+```
+
+
+## Edit panpipeconfig_slurm.config
+
+In the configuration file `./config/panpipeconfig_slurm.config` make the following changes:
+
+Change `XNAT_HOST` to the XNAT URL i.e. `https://....`
+
+Currently a central data location as defined by `DATA_DIR` is being used to avoid duplication of data across individual users. If you prefer to isolate data for your pipelines from other users then you can change this location to another one.
+
+## Edit credentials and change access permissions
+Change the username and password in `./config/credentials/credentials.json` to your XNAT credentials.
+
+Change access permissions on `credentials.json` to prevent unauthorized access. Running `chmod 400` on `credentials.json` (owner read-only) is the most conservative way to achieve this. If you also want to restrict access to the folder `./config/credentials/`, the most conservative setting is `500` (owner read and execute). Anything stricter may prevent the program from accessing the file.
+
+## Freesurfer license
+Update the provided Freesurfer license in `./config/license.txt` with your personal license if you have one. The pipeline will run with the existing license, but it is good practice to use your own. You can get a license from the Freesurfer main website.
+
+## Edit run_pan_slurm.sh
+Add the commands required to instantiate your python environment, e.g. `module load env`, `conda activate env`, etc.
+
+Change the call to PYTHON depending on how it is invoked in your environment. Some environments require Python version 3 to be invoked as `python3`. In that case set `PYTHON=python3`
+
+Change `PKG_DIR` to point to the directory that contains the pipelines package:
+`PKG_DIR=[Path to]/PANpipelines/src`. If you installed PANpipelines using `pip`, this will be inside your conda or virtualenv environment. If you are using the code directly from the repository, this will be the path to the cloned repository.
+
+If you have decided not to install the PANpipelines package but have downloaded the repository then add the path to the `src` folder in the `PYTHONPATH` environmental variable as follows:
+```
+export PYTHONPATH=${PKG_DIR}:$PYTHONPATH
+```
+
+## Edit slurm templates in ./batch_scripts
+Edit `group_template.pbs` and `participant_template.pbs` and add the commands required to instantiate your python environment, e.g. `module load env`, `conda activate env`, etc.
+
+Also change the call to PYTHON depending on how it is invoked in your environment. Some environments require Python version 3 to be invoked as `python3`. In that case set `PYTHON=python3`
+
+If you have decided not to install the PANpipelines package but have downloaded the repository then you may need to add the path to the `src` folder in the `PYTHONPATH` environmental variable as follows:
+```
+export PYTHONPATH=${PKG_DIR}:$PYTHONPATH
+```
+
+## Edit batch_scripts/headers
+Go through each of the different slurm headers to adjust times and credentials as necessary. These are referenced in the config entries as `SLURM_CPU_HEADER` and `SLURM_GPU_HEADER` as required.
+
+## Deploy
+Run as `./run_pan_slurm.sh`.
+
+On your first run please allow a few minutes for the PAN participation information to be obtained for all the projects from the server.
+
+
+## Troubleshooting
+Most problems can be avoided by creating a clean new python environment using `conda` or `virtualenv`. If an existing python environment is used then package interactions and conflicts will unfortunately have to be handled manually and steps to resolve these will be unique to each environment in question.
+
+--- 
+# General HPC Deployment
 ## Prepare Virtual Environment
-It is advisable to create a virtual python environment to run the PAN pipelines An example using a `conda` virtual environment is shown as an example. It is recommended to use a python version of `3.8.2` or greater.
+It is advisable to create a virtual python environment to run the PAN pipelines. An example using a `conda` virtual environment is shown below; however, `virtualenv` as demonstrated above could also be used. It is recommended to use a python version of `3.8.2` or greater.
 ```
-conda create -n pandev python=3.10.13
+conda create -n pandev python=3.8.2
 ```
 
 if you would like to create your virtual environment in a specific folder location then use the `-p` prefix parameter instead of the `-n` name parameter as follows:
 ```
-conda create -p /path/to/pandev python=3.10.13
+conda create -p /path/to/pandev python=3.8.2
 ```
 
 With `conda` the python environment can be instantiated as follows, depending on whether the environment was created with the `-n` or the `-p` parameter:
@@ -24,10 +112,10 @@ There are three options here:
 ### Use the PyPI package using:
 
 ```
-pip install -U --user panpipelines
+pip install -U panpipelines
 ```
 
-### Use this repository and perform:
+### Clone this repository and perform:
 ```
 cd PANpipelines
 pip install -e ./
@@ -47,6 +135,7 @@ pip install pydicom==2.4.3
 pip install templateflow==23.1.0
 pip install nitransforms==23.0.1
 pip install pybids==0.16.3
+pip install scipy==1.10.1
 ```
 
 
@@ -113,4 +202,6 @@ On your first run please allow a few minutes for the PAN participation informati
 
 
 ## Troubleshooting
-Most problems can be avoided by creating a clean new python environment using `conda` or `virtualenv`. If an existing python environment is used then package interactions and conflicts will unfortunately have to be handled manually and steps to resolve these will be unique to each environment in question.
+Most problems can be avoided by creating a clean new python environment using `conda` or `virtualenv`. If an existing python environment is used then package interactions and conflicts will unfortunately have to be handled manually and steps to resolve these will be unique to each environment in question.
+
+If you see the error `ImportError: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'OpenSSL 1.0.2k-fips  26 Jan 2017'. See: https://github.com/urllib3/urllib3/issues/2168` and you are using Python 3.7 - 3.9 on the HPC, then downgrade `urllib3` as follows: `pip install urllib3==1.25.9`
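
Whether the `urllib3` downgrade from the troubleshooting note is needed depends on the OpenSSL build that Python's `ssl` module was compiled against. A minimal sketch for checking this before downgrading, assuming the interpreter is reachable as `python3` (the `PYTHON` and `NEEDS_PIN` variable names are illustrative and not part of the deployment scripts):

```shell
#!/usr/bin/env bash
# Interpreter to inspect; override with PYTHON=python if that is how it is invoked.
PYTHON="${PYTHON:-python3}"

# ssl.OPENSSL_VERSION_INFO is a tuple of integers. Per the quoted error message,
# urllib3 v2 needs OpenSSL 1.1.1 or newer.
# Note: the else branch also runs if the interpreter cannot be started at all.
if "$PYTHON" -c 'import ssl, sys; sys.exit(0 if ssl.OPENSSL_VERSION_INFO >= (1, 1, 1) else 1)'; then
  NEEDS_PIN=no
else
  NEEDS_PIN=yes
fi

echo "urllib3 downgrade needed: ${NEEDS_PIN}"
if [ "${NEEDS_PIN}" = "yes" ]; then
  # The package on PyPI is named urllib3; the version is the one given in the README.
  echo "Suggested: ${PYTHON} -m pip install urllib3==1.25.9"
fi
```

Run this inside the activated environment so that the interpreter being checked is the one the pipelines will use.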
diff --git a/deployment/batch_scripts/group_template.pbs b/deployment/batch_scripts/group_template.pbs
@@ -1,15 +1,16 @@
 
-# Add commands here to start your python environment if required
-# e.g. module load python/3.8/3.8.2 or conda activate [ENVNAME]
+# Add commands below to start your python environment if required
+# e.g. module load python/3.8/3.8.2 and source /path/to/activate, or conda activate [ENVNAME]. 
 # module load python/3.8/3.8.2
+# source /xdisk/ryant/chidiugonna/venvs/panvenv/bin/activate
 
 # Change PYTHON=python3 if this is required to access python version.
 PYTHON=python
 
 # Export python path in case 'panpipelines' python package not accessible. This should not be necessary if
 # panpipelines has been installed in your python environment. Just uncomment 2 lines below.
-#PKG_DIR=<PKG_DIR>
-#export PYTHONPATH=${PKG_DIR}:$PYTHONPATH
+# PKG_DIR=<PKG_DIR>
+# export PYTHONPATH=${PKG_DIR}:$PYTHONPATH
 
 ##### ----- Do not edit below this line ------- #######
 

diff --git a/deployment/batch_scripts/headers/slurm_cpu.pbs b/deployment/batch_scripts/headers/slurm_cpu.pbs
@@ -5,5 +5,5 @@
 #SBATCH --mem-per-cpu=5GB 
 #SBATCH --time=8:00:00
 #SBATCH --job-name=qsiprep
-#SBATCH --account=nkchen
+#SBATCH --account=ryant
 #SBATCH --partition=standard
diff --git a/deployment/batch_scripts/headers/slurm_cpu_highpri.pbs b/deployment/batch_scripts/headers/slurm_cpu_highpri.pbs
@@ -5,6 +5,6 @@
 #SBATCH --mem-per-cpu=5GB 
 #SBATCH --time=8:00:00
 #SBATCH --job-name=qsiprep
-#SBATCH --account=nkchen
+#SBATCH --account=ryant
 #SBATCH --partition=high_priority
 #SBATCH --qos=user_qos_nkchen
diff --git a/deployment/batch_scripts/headers/slurm_cpu_highpri_fmriprep.pbs b/deployment/batch_scripts/headers/slurm_cpu_highpri_fmriprep.pbs
@@ -5,6 +5,6 @@
 #SBATCH --mem-per-cpu=5GB 
 #SBATCH --time=8:00:00
 #SBATCH --job-name=qsiprep
-#SBATCH --account=nkchen
+#SBATCH --account=ryant
 #SBATCH --partition=high_priority
 #SBATCH --qos=user_qos_nkchen
diff --git a/deployment/batch_scripts/headers/slurm_gpu.pbs b/deployment/batch_scripts/headers/slurm_gpu.pbs
@@ -6,5 +6,5 @@
 #SBATCH --mem-per-cpu=5GB 
 #SBATCH --time=8:00:00
 #SBATCH --job-name=qsiprep
-#SBATCH --account=nkchen
+#SBATCH --account=ryant
 #SBATCH --partition=standard
diff --git a/deployment/batch_scripts/headers/slurm_gpu_highpri.pbs b/deployment/batch_scripts/headers/slurm_gpu_highpri.pbs
@@ -6,6 +6,6 @@
 #SBATCH --mem-per-cpu=5GB 
 #SBATCH --time=8:00:00
 #SBATCH --job-name=qsiprep
-#SBATCH --account=nkchen
+#SBATCH --account=ryant
 #SBATCH --partition=high_priority
 #SBATCH --qos=user_qos_nkchen 
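
The README asks deployers to go through each slurm header and adjust times and credentials. A minimal sketch of one way to list those settings across all header files at once, assuming the headers live in a directory named by a hypothetical `HEADER_DIR` variable; when it is unset, a throwaway demo file is created that mirrors the high-priority headers in this commit, where `--account` is now `ryant` while `--qos` still reads `user_qos_nkchen` (whether that pairing is intended is not stated here):

```shell
#!/usr/bin/env bash
set -euo pipefail

# HEADER_DIR is an illustrative variable, e.g. ./batch_scripts/headers.
# If it is unset, build a small demo directory so the sketch runs anywhere.
if [ -z "${HEADER_DIR:-}" ]; then
  HEADER_DIR="$(mktemp -d)"
  printf '%s\n' \
    '#SBATCH --time=8:00:00' \
    '#SBATCH --account=ryant' \
    '#SBATCH --partition=high_priority' \
    '#SBATCH --qos=user_qos_nkchen' \
    > "${HEADER_DIR}/slurm_cpu_highpri.pbs"
fi

# Collect the time, account and qos directives from every header, prefixed by file name.
SUMMARY="$(grep -H -E '^#SBATCH --(account|qos|time)=' "${HEADER_DIR}"/*.pbs)"
echo "${SUMMARY}"
```

Reading the output side by side makes it easier to spot a header whose account and QOS belong to different groups.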
diff --git a/deployment/batch_scripts/participant_template.pbs b/deployment/batch_scripts/participant_template.pbs
@@ -1,15 +1,16 @@
 
-# Add commands here to start your python environment if required
-# e.g. module load python/3.8/3.8.2 or conda activate [ENVNAME]
+# Add commands below to start your python environment if required
+# e.g. module load python/3.8/3.8.2 and source /path/to/activate, or conda activate [ENVNAME]. 
 # module load python/3.8/3.8.2
+# source /xdisk/ryant/chidiugonna/venvs/panvenv/bin/activate
 
 # Change PYTHON=python3 if this is required to access python version.
 PYTHON=python
 
 # Export python path in case 'panpipelines' python package not accessible. This should not be necessary if
 # panpipelines has been installed in your python environment. Just uncomment 2 lines below.
-#PKG_DIR=<PKG_DIR>
-#export PYTHONPATH=${PKG_DIR}:$PYTHONPATH
+# PKG_DIR=<PKG_DIR>
+# export PYTHONPATH=${PKG_DIR}:$PYTHONPATH
 
 ##### ----- Do not edit below this line ----- #######
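
Before submitting jobs, the `PYTHON` and `PYTHONPATH` settings at the top of either template can be sanity-checked from a login shell. A minimal sketch, assuming the package imports under the name `panpipelines` (the name used in the template comments); `STATUS` is an illustrative variable and `PKG_DIR` is only applied when it is set:

```shell
#!/usr/bin/env bash
# Same default as the templates; set PYTHON=python3 if your environment requires it.
PYTHON="${PYTHON:-python}"

# Mirror the optional PYTHONPATH export from the templates when PKG_DIR is provided.
if [ -n "${PKG_DIR:-}" ]; then
  export PYTHONPATH="${PKG_DIR}:${PYTHONPATH:-}"
fi

# Try the import quietly. A missing interpreter is reported the same way as a missing package.
if "$PYTHON" -c 'import panpipelines' 2>/dev/null; then
  STATUS=importable
else
  STATUS=missing
fi

echo "panpipelines is ${STATUS} for ${PYTHON}"
```

If the result is `missing`, either activate the environment where the package was installed or set `PKG_DIR` to the `src` folder of the cloned repository and rerun.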