4. Running in BSC clusters (internal use only)

MikiSchikora edited this page Jul 4, 2022 · 18 revisions

Running perSVade

If you are working on any cluster that has access to the BSC /gpfs filesystem, you don't need to install anything: we have already done so. perSVade is installed in /gpfs/projects/bsc40/project/pipelines/perSVade/perSVade-<version> and the perSVade conda environments are in /gpfs/projects/bsc40/project/pipelines/anaconda3. You can run it in either of the following ways:

Running with the traditional installation

Activate the group's conda environment with:

source /gpfs/projects/bsc40/project/pipelines/anaconda3/etc/profile.d/conda.sh

Activate the perSVade environment (check the available environments with conda env list | grep perSVade):

conda activate perSVade_<version>

You can then run perSVade. For example, you can run the 'align_reads' module with:

python /gpfs/projects/bsc40/project/pipelines/perSVade/perSVade-<version>/scripts/perSVade align_reads -r <path to the reference genome (fasta)> -o <output_directory> -f1 <forward_reads.fastq.gz> -f2 <reverse_reads.fastq.gz>

If running these commands one by one gives you problems, you can also chain them into a single command:

source /gpfs/projects/bsc40/project/pipelines/anaconda3/etc/profile.d/conda.sh && conda activate perSVade_<version> && python /gpfs/projects/bsc40/project/pipelines/perSVade/perSVade-<version>/scripts/perSVade <module> <args>

Note that, from version 1.02.3 on, the version number indicates which environment should be used. For example, all versions of the form 1.02.<n> (1.02.3, 1.02.4, 1.02.5 and 1.02.6) should be run with the environment perSVade_1.02. The third number thus marks minor updates, in which only the code (in the scripts folder) changed.
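The naming rule above can be sketched as a small shell helper. This is only an illustration: persvade_env_for_version is a hypothetical function (not part of perSVade), and it assumes a three-part version from 1.02.3 on.

```shell
# Hypothetical helper: derive the conda environment name from a perSVade version.
# Assumes a three-part version >= 1.02.3 (e.g. 1.02.5), as described above.
persvade_env_for_version() {
    local version="$1"
    # Drop the last dot-separated field: 1.02.5 -> 1.02
    printf 'perSVade_%s\n' "${version%.*}"
}
```

For example, conda activate "$(persvade_env_for_version 1.02.5)" would activate perSVade_1.02, provided that environment is listed by conda env list.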

Running the singularity image

For some versions (from v1.02.4 on) we stored the singularity image in /gpfs/projects/bsc40/project/pipelines/perSVade/perSVade-<version>/mikischikora_persvade_<version>.sif. For example, you can run the 'align_reads' module with:

singularity exec -e /gpfs/projects/bsc40/project/pipelines/perSVade/perSVade-<version>/mikischikora_persvade_<version>.sif bash -c 'source /opt/conda/etc/profile.d/conda.sh && conda activate perSVade_env && python /perSVade/scripts/perSVade align_reads -r <path to the reference genome (fasta)> -o <output_directory> -f1 <forward_reads.fastq.gz> -f2 <reverse_reads.fastq.gz>'

Important notes

  • All perSVade modules work in MareNostrum, and most of them in Nord3.

  • You may have trouble executing perSVade after running source /gpfs/projects/bsc40/project/pipelines/anaconda3/etc/profile.d/conda.sh if you have other conda installations in the cluster (for example the mn0 conda). You can manually fix this by changing the PATH variable with export PATH=$PATH:/gpfs/projects/bsc40/project/pipelines/anaconda3/envs/perSVade_1.02/bin.

  • If you get environment errors, always check which python interpreter is in use with which python. If the environment is correctly activated, it should be /gpfs/projects/bsc40/project/pipelines/anaconda3/envs/perSVade_<version>_env/bin/python.

  • The activation of the perSVade conda environment works well from version 0.7 on. This means that you can activate it from the MN login nodes or from interactive nodes. However, the activation of older versions (v0.4 and below) is costly, and it overloads the login nodes. If you want to use an old version of perSVade, always activate it on an interactive node (e.g. one obtained with salloc).

  • You can't run the perSVade pipeline from a login node, because it takes too many resources. You can submit perSVade as a job or run it from an interactive session with salloc -n 1 --time=02:00:00 -c 48 --qos debug.
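Submitting perSVade as a job could look like the sketch below, which prints a SLURM batch script for one module. make_persvade_job is a hypothetical helper (not part of perSVade), and the job name is arbitrary. The resource values simply mirror the salloc example above and may need adjusting to your cluster's limits. The generated script also prints which python, so the job log shows whether the environment was activated.

```shell
# Hypothetical helper: print a SLURM batch script that runs one perSVade module.
# Usage: make_persvade_job <version> <env_name> <module> [module args...]
# Note: arguments are joined with spaces, so paths containing spaces are not supported.
make_persvade_job() {
    local version="$1" env_name="$2"
    shift 2
    local pipelines=/gpfs/projects/bsc40/project/pipelines
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=perSVade
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=48
#SBATCH --time=02:00:00
#SBATCH --qos=debug

source ${pipelines}/anaconda3/etc/profile.d/conda.sh
conda activate ${env_name}
which python  # should point inside the activated perSVade environment
python ${pipelines}/perSVade/perSVade-${version}/scripts/perSVade $*
EOF
}
```

A possible use, run from a login node, is make_persvade_job <version> perSVade_<version> align_reads <args> > job.sh followed by sbatch job.sh.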