HPC Sun Grid Engine Bcbio installation #1378

Closed
mortunco opened this issue May 2, 2016 · 3 comments

@mortunco

mortunco commented May 2, 2016

Hi,

I tried to install bcbio_nextgen.py on our HPC. I installed bcbio into a folder called bcbio in my scratch directory (not in /usr/local/share, as suggested in the documentation). After installation, we could not get the system to run after we changed some of the Python paths from /anaconda1anaconda2anaconda3/ to our bcbio annotation/python. We also ran into the following problems:

  1. When we enter bcbio_nextgen.py on the command line, the response comes after 2-3 minutes, whereas on Amazon this finishes in 5-10 seconds. How can we debug this?
  2. Is there a way to set the Python path automatically for the install, or do we have to edit the Python paths in the files manually? If so, could you point out which files I should change?
  3. We decided to use your chr 6 teaching example as a control for the system, so we need hg38 in addition to the canonical installation. In the --genome part, how should I state that I need both GRCh37 and hg38 installed?

Thank you very much,

Best,

Tunc.

@chapmanb
Member

chapmanb commented May 3, 2016

Tunc;
Sorry about the issues. Trying to tackle the points one at a time:

  1. It sounds like the shared filesystem where you installed bcbio is running slowly. I've seen this slow startup time on heavily utilized systems where it took a long time to load all the python files and libraries used by bcbio. Moving to a more responsive shared filesystem on your HPC will hopefully improve this.
  2. Why did you manually edit python paths? This will, as you experienced, break the ability of the installation to find the installed python. bcbio installs its own isolated python and doesn't need a system python. We don't recommend manual editing of files.
  3. You pass multiple flags to the installation: bcbio_nextgen.py upgrade --genomes GRCh37 --genomes hg38
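One rough way to check whether the shared filesystem is behind the slow startup in point 1 is to time how long it takes to read the Python files under the install. The sketch below is not part of bcbio: the function name `time_python_file_reads` is hypothetical, it is written for Python 3, and it only approximates what an import does by reading every .py file under a directory.

```python
import os
import sys
import time


def time_python_file_reads(root):
    """Read every .py file under root; return (file_count, total_bytes, seconds)."""
    start = time.perf_counter()
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                # Reading the whole file forces the filesystem to serve its contents.
                with open(os.path.join(dirpath, name), "rb") as handle:
                    total_bytes += len(handle.read())
                file_count += 1
    elapsed = time.perf_counter() - start
    return file_count, total_bytes, elapsed


if __name__ == "__main__":
    # Pass one or more directories to compare, e.g. the install and a local copy.
    for root in sys.argv[1:]:
        count, size, seconds = time_python_file_reads(root)
        print("%s: %d files, %d bytes, %.2f s" % (root, count, size, seconds))
```

Running it once against the bcbio install directory on the shared filesystem and once against a copy on local disk gives a crude comparison. A second run over the same directory may be much faster because of caching, so the first run is the more telling one.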

Hope this helps.

@mortunco
Author

mortunco commented May 3, 2016

Dear @chapmanb;

I ran into a file problem on the FTP server during installation. I think there is a problem with the file location on the FTP server.

I am also sharing the environment variables from my .bashrc. Are these enough, or do I have to add anything else to my .bashrc?

My next challenge is to bring bcbio to my school's server. I believe I can introduce bcbio to my school.

Best regards,
Tunc.

export PATH=/share/apps/python-2.7.2/bin:$PATH
export LD_LIBRARY_PATH=/share/apps/python-2.7.2/lib:$LD_LIBRARY_PATH

export PATH=/share/apps/gcc/gcc-4.6.2/bin:$PATH
export LD_LIBRARY_PATH=/share/apps/gcc/gcc-4.6.2/lib64:/share/apps/gcc/gcc-4.6.2/lib:/usr/lib:/usr/lib64:$LD_LIBRARY_PATH
export LC_ALL=en_US.UTF-8
export PYTHONPATH=/mnt/scratch/tmorova15/bcbio/anaconda/bin

#CloudBioLinux PATH updates
export PATH=/mnt/scratch/tmorova15/bcbio/bin:$PATH

# CloudBioLinux PATH updates

Running GGD recipe: dbsnp
--2016-05-03 10:03:03--  ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b144_GRCh38p2/VCF/All_20150603.vcf.gz
           => `variation/dbsnp-144-orig.vcf.gz'
Resolving ftp.ncbi.nih.gov... 130.14.250.13, 2607:f220:41e:250::10
Connecting to ftp.ncbi.nih.gov|130.14.250.13|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /snp/organisms/human_9606_b144_GRCh38p2/VCF ...
No such directory `snp/organisms/human_9606_b144_GRCh38p2/VCF'.

Traceback (most recent call last):
  File "/mnt/kufs/scratch/tmorova15/bcbio/bin/bcbio_nextgen.py", line 4, in <module>
    __import__('pkg_resources').run_script('bcbio-nextgen==0.9.7', 'bcbio_nextgen.py')
  File "/mnt/kufs/scratch/tmorova15/bcbio/anaconda/lib/python2.7/site-packages/setuptools-20.7.0-py2.7.egg/pkg_resources/__init__.py", line 719, in run_script

  File "/mnt/kufs/scratch/tmorova15/bcbio/anaconda/lib/python2.7/site-packages/setuptools-20.7.0-py2.7.egg/pkg_resources/__init__.py", line 1504, in run_script

  File "/mnt/kufs/scratch/tmorova15/bcbio/anaconda/lib/python2.7/site-packages/bcbio_nextgen-0.9.7-py2.7.egg-info/scripts/bcbio_nextgen.py", line 207, in <module>
    install.upgrade_bcbio(kwargs["args"])
  File "/mnt/kufs/scratch/tmorova15/bcbio/anaconda/lib/python2.7/site-packages/bcbio/install.py", line 91, in upgrade_bcbio
    upgrade_bcbio_data(args, REMOTES)
  File "/mnt/kufs/scratch/tmorova15/bcbio/anaconda/lib/python2.7/site-packages/bcbio/install.py", line 267, in upgrade_bcbio_data
    cbl_deploy.deploy(s)
  File "/mnt/kufs/scratch/tmorova15/bcbio/tmpbcbio-install/cloudbiolinux/cloudbio/deploy/__init__.py", line 65, in deploy
    _setup_vm(options, vm_launcher, actions)
  File "/mnt/kufs/scratch/tmorova15/bcbio/tmpbcbio-install/cloudbiolinux/cloudbio/deploy/__init__.py", line 110, in _setup_vm
    configure_instance(options, actions)
  File "/mnt/kufs/scratch/tmorova15/bcbio/tmpbcbio-install/cloudbiolinux/cloudbio/deploy/__init__.py", line 268, in configure_instance
    setup_biodata(options)
  File "/mnt/kufs/scratch/tmorova15/bcbio/tmpbcbio-install/cloudbiolinux/cloudbio/deploy/__init__.py", line 250, in setup_biodata
    install_proc(options["genomes"], ["ggd", "s3", "raw"])
  File "/mnt/kufs/scratch/tmorova15/bcbio/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 345, in install_data
    _prep_genomes(env, genomes, genome_indexes, ready_approaches)
  File "/mnt/kufs/scratch/tmorova15/bcbio/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 474, in _prep_genomes
    retrieve_fn(env, manager, gid, idx)
  File "/mnt/kufs/scratch/tmorova15/bcbio/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/genomes.py", line 796, in _install_with_ggd
    ggd.install_recipe(env.cwd, recipe_file)
  File "/mnt/kufs/scratch/tmorova15/bcbio/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/ggd.py", line 30, in install_recipe
    recipe["recipe"]["full"]["recipe_type"])
  File "/mnt/kufs/scratch/tmorova15/bcbio/tmpbcbio-install/cloudbiolinux/cloudbio/biodata/ggd.py", line 62, in _run_recipe
    subprocess.check_output(["bash", run_file])
  File "/mnt/kufs/scratch/tmorova15/bcbio/anaconda/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['bash', '/mnt/kufs/scratch/tmorova15/bcbio/genomes/Hsapiens/hg38/txtmp/ggd-run.sh']' returned non-zero exit status 1

@chapmanb
Member

chapmanb commented May 4, 2016

Tunc;
Sorry about the download issues. It looks like NCBI removed the references to dbSNP 144. I updated to the latest dbSNP 147 and things should now run cleanly if you remove the temporary CloudBioLinux directory:

rm -rf tmpbcbio-install

and re-run the install/update procedure. Thanks much for the report.
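For anyone scripting this cleanup, here is a minimal sketch of the two steps. It is not bcbio code: the helper names `remove_install_tmp` and `build_upgrade_command` are hypothetical, and it assumes the re-run uses the same `upgrade --genomes ...` form given earlier in this thread, which may differ from the exact command you originally ran.

```python
import os
import shutil


def remove_install_tmp(install_dir, tmp_name="tmpbcbio-install"):
    """Delete the temporary CloudBioLinux checkout; return True if it was removed."""
    tmp_path = os.path.join(install_dir, tmp_name)
    if os.path.isdir(tmp_path):
        shutil.rmtree(tmp_path)
        return True
    return False


def build_upgrade_command(genomes):
    """Build the upgrade command, repeating --genomes once per genome build."""
    cmd = ["bcbio_nextgen.py", "upgrade"]
    for genome in genomes:
        cmd.extend(["--genomes", genome])
    return cmd
```

Based on the traceback above, the temporary directory sits directly under the install directory, so `remove_install_tmp("/mnt/kufs/scratch/tmorova15/bcbio")` would be the call in this case. The list returned by `build_upgrade_command(["GRCh37", "hg38"])` can then be passed to `subprocess.check_call` to re-run the update.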

joemphilips pushed a commit to joemphilips/cloudbiolinux that referenced this issue May 17, 2016
- Update hg38 dbsnp to version 147. Fixes bcbio/bcbio-nextgen#1378
- Add bamutil and gemini supporting tools to bcbio installation.
  bcbio/bcbio-nextgen#1372