tar -p option causes quota issues in case quota are enforced on group level #349

Closed
vdejager opened this issue Mar 20, 2020 · 3 comments

@vdejager

I'm encountering issues installing the bcbio pipeline on an HPC infrastructure where quotas are strictly enforced at the group level AND the file system location level.

All files in (for example) /projects/mygroup must have user:mygroup ownership.
Files with any other ownership cause a 'quota reached' error.
While this is not an issue for most parts of the bcbio installation script, I found that the genome installation throws this error, even though only 190 GB of my 50 TB quota is used.
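To see which files trigger the quota error, it can help to list every path whose group differs from the project group. The following is a minimal diagnostic sketch, not part of bcbio or cloudbiolinux; the function name `find_group_mismatches` is hypothetical, and the example path and group name in the comments are the placeholders used above.

```python
import os


def find_group_mismatches(root, expected_gid):
    """Return paths under root whose group ID differs from expected_gid."""
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat reports a symlink's own ownership rather than its target's
            if os.lstat(path).st_gid != expected_gid:
                mismatches.append(path)
    return mismatches


# Example call (Unix only, placeholder path and group name):
#   import grp
#   gid = grp.getgrnam("mygroup").gr_gid
#   for path in find_group_mismatches("/projects/mygroup", gid):
#       print(path)
```

Running this before and after the genome step would show whether the extracted files are the ones landing in the wrong group.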

My installation script:

#!/bin/bash

BASE=/<obfuscated path>
VERSION="1.2.0"

# fetch installer
wget https://raw.githubusercontent.com/bcbio/bcbio-nextgen/master/scripts/bcbio_nextgen_install.py

# make directories
mkdir -p $BASE/$VERSION/tools

# run installer
python bcbio_nextgen_install.py ${BASE}/${VERSION}/bcbio \
      --tooldir=${BASE}/${VERSION}/tools \
      --genomes GRCh37 \
      --aligners bwa \
      --aligners star \
      --isolate \
      --cores 4

I could trace this back to the tar -p option used in:

subprocess.check_call("tar -xzpf %s" % zipped_file, shell=True)

and related places.

Manually removing all the "-p" options from the cloudbiolinux code in the tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py file resolved the issue. After restarting, the installation finished properly.

Any suggestions to streamline the process are welcome.
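The manual edit could be scripted along these lines. This is only a sketch under assumptions: the helper name `strip_preserve_flag` is hypothetical, and it handles only short-option clusters such as the `tar -xzpf` call quoted above. It leaves a standalone `-p`, the long form `--preserve-permissions`, and the unrelated uppercase `-P` untouched.

```python
import re

# Matches "tar" followed by one short-option cluster, e.g. "tar -xzpf"
_TAR_FLAGS = re.compile(r"\btar\s+-([A-Za-z]+)")


def strip_preserve_flag(source):
    """Return source with lowercase 'p' removed from tar short-option clusters."""

    def _drop_p(match):
        flags = match.group(1)
        # Leave the call alone if there is no 'p', or if 'p' is the only flag
        if "p" not in flags or flags == "p":
            return match.group(0)
        prefix = match.group(0)[: -len(flags)]
        return prefix + flags.replace("p", "")

    return _TAR_FLAGS.sub(_drop_p, source)


# Example use on the file named above (path relative to the install directory):
#   path = "tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py"
#   with open(path) as handle:
#       patched = strip_preserve_flag(handle.read())
#   with open(path, "w") as handle:
#       handle.write(patched)
```

Because the installer may fetch a fresh copy of cloudbiolinux, a patch like this would need to be reapplied before restarting the installation.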

@roryk
Collaborator

roryk commented Mar 20, 2020

Thanks. The intention behind the -p option is to make sure the genomes are set up with proper everyone-can-look permissions; otherwise we'd end up debugging a bunch of issues related to incorrect permissions being set for the shared genomes. I think removing it would lead to more problems for new users getting started, so your workaround is the way I'd go for now on your particular system. It's the first time we've seen a problem with it. It is definitely not ideal for your setup though, sorry about that.
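On a system where -p has been removed, the everyone-can-look permissions could in principle be applied after extraction instead. The sketch below is an assumption about how that might be done, not something bcbio or cloudbiolinux does; `make_world_readable` is a hypothetical helper. It only adds read (and, where appropriate, execute) bits, so existing bits such as setgid on directories are kept.

```python
import os
import stat


def make_world_readable(root):
    """Add read bits for everyone under root, plus execute bits on directories."""
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, n) for n in dirnames + filenames)

    for path in paths:
        st = os.lstat(path)
        if stat.S_ISLNK(st.st_mode):
            continue  # do not touch symlinks or their targets
        mode = stat.S_IMODE(st.st_mode)
        extra = 0o444
        # Directories need execute to be traversed; keep executables executable
        if stat.S_ISDIR(st.st_mode) or mode & stat.S_IXUSR:
            extra |= 0o111
        os.chmod(path, mode | extra)
```

This changes only mode bits, not ownership, so files keep whatever user and group they received when they were extracted.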

@vdejager
Author

vdejager commented Mar 20, 2020 via email

@roryk
Collaborator

roryk commented Mar 20, 2020

Thanks for being understanding, Vic. Hopefully the sysadmins can relax your quota. On shared systems where we manage bcbio for other users, what we usually do is have a bcbio user that has permission to do the installation, write to the genome directories, and so on. Maybe they could set you up with something like that, with a relaxed quota.

@roryk closed this as completed Mar 20, 2020