Installing with support for google storage #592

Open
egafni opened this issue Dec 21, 2017 · 11 comments

@egafni

egafni commented Dec 21, 2017

I'm unable to get pysam to work with google storage urls, here's what I'm doing:

export HTSLIB_CONFIGURE_OPTIONS="--enable-plugins --enable-libcurl --enable-gcs"
pip3 install -v --no-use-wheels pysam

It seems like the compiler is picking up these flags:

   # pysam: htslib configure options: --enable-plugins --enable-libcurl --enable-gcs
    make: ./version.sh: Command not found
    make: ./version.sh: Command not found
    # pysam: htslib_config LDFLAGS=-rdynamic
    # pysam: htslib_config LIBHTS_OBJS=kfunc.o knetfile.o kstring.o bcf_sr_sort.o bgzf.o errmod.o faidx.o hfile.o hfile_net.o hts.o hts_os.o md5.o multipart.o probaln.o realn.o regidx.o sam.o synced_bcf_reader.o vcf_sweep.o tbx.o textutils.o thread_pool.o vcf.o vcfutils.o cram/cram_codecs.o cram/cram_decode.o cram/cram_encode.o cram/cram_external.o cram/cram_index.o cram/cram_io.o cram/cram_samtools.o cram/cram_stats.o cram/files.o cram/mFILE.o cram/open_trace_file.o cram/pooled_alloc.o cram/rANS_static.o cram/sam_header.o cram/string_alloc.o plugin.o
    # pysam: htslib_config LIBS=-llzma -lbz2 -lz -lm -ldl
    # pysam: htslib_config PLATFORM=default
    # pysam: config_option: ENABLE_PLUGINS=1
    # pysam: config_option: HAVE_COMMONCRYPTO=0
    # pysam: config_option: HAVE_GMTIME_R=1
    # pysam: config_option: HAVE_HMAC=1
    # pysam: config_option: HAVE_IRODS=0
    # pysam: config_option: HAVE_LIBCURL=1
    # pysam: config_option: HAVE_MMAP=1

But I still get the following error:
import pysam
pysam.AlignmentFile('gs://align.bam')
OSError: [Errno 93] could not open alignment file gs://align.bam: Protocol not supported

Any help would be greatly appreciated!!
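For readers hitting the same error, here is a minimal sketch of how the failure could be recognised in code. It assumes the exception carries the errno shown in the traceback above (93, which is EPROTONOSUPPORT on Linux). The helper names `is_unsupported_protocol` and `open_or_explain` are hypothetical, and `opener` stands in for a callable such as `pysam.AlignmentFile`.

```python
import errno


def is_unsupported_protocol(exc: OSError) -> bool:
    """Return True if exc looks like the 'Protocol not supported' failure."""
    # EPROTONOSUPPORT is 93 on Linux; comparing against the errno constant
    # avoids hard-coding a number that differs on other platforms.
    return exc.errno == errno.EPROTONOSUPPORT


def open_or_explain(opener, url):
    """Call opener(url); on a protocol error, re-raise with a clearer hint.

    `opener` is any callable taking a URL, e.g. pysam.AlignmentFile
    (assumed to be installed; it is not imported here).
    """
    try:
        return opener(url)
    except OSError as exc:
        if is_unsupported_protocol(exc):
            scheme = url.split("://", 1)[0]
            raise RuntimeError(
                f"no handler available for '{scheme}://' URLs; "
                "htslib may have been built without that backend"
            ) from exc
        raise
```

Any other OSError (for example a missing file) is re-raised unchanged, so only the protocol case gets the extra hint.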

@AndreasHeger
Contributor

Hi, thanks for reporting.

The issue is that only the main htslib library is built when doing a pip/setup.py install, not any of the plugins.

There are two immediate solutions:

  1. build an external htslib with all the plugins you need and then use this via the HTSLIB_LIBRARY_DIR option before calling pip/setup.py. This will bypass the htslib that comes with pysam.

  2. use conda, which I think comes with a plugin-enabled htslib

Longer term, I will look into what is involved in building htslib plugins via setup.py. I left them out originally as I was worried that building plugins might require more exotic system libraries to be present.
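The first option above could be sketched roughly as follows. This assumes an htslib has already been built and installed under some prefix with the plugins you need. The prefix path and the helper name `external_htslib_install` are hypothetical. `HTSLIB_INCLUDE_DIR` does not appear in this thread; it is the companion variable described in the pysam installation notes. The sketch only builds the command and environment; it does not run anything.

```python
import os
import sys


def external_htslib_install(prefix, base_env=None):
    """Return (command, env) for building pysam against an htslib under `prefix`.

    `prefix` is a hypothetical install location such as /opt/htslib.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Point the pysam build at the external library and its headers.
    env["HTSLIB_LIBRARY_DIR"] = os.path.join(prefix, "lib")
    env["HTSLIB_INCLUDE_DIR"] = os.path.join(prefix, "include")
    # --no-binary pysam forces a source build so the variables take effect.
    command = [
        sys.executable, "-m", "pip", "install", "-v",
        "--no-binary", "pysam", "pysam",
    ]
    return command, env
```

A caller would then pass both values to something like `subprocess.run(command, env=env, check=True)`. Depending on the system, the external libhts may also need to be findable by the dynamic loader at run time.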

@egafni
Author

egafni commented Dec 23, 2017

I checked out setup.py and it first tries whatever is in HTSLIB_CONFIGURE_OPTIONS, but if that fails it will try a few other options.

Turns out for whatever reason HTSLIB_CONFIGURE_OPTIONS="--enable-plugins --enable-libcurl --enable-gcs" does not work, but setting just HTSLIB_CONFIGURE_OPTIONS="--enable-gcs" did!
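Since setup.py may fall back to other options, it can help to check which ones ended up in the verbose build output. Below is a small sketch that parses the `# pysam:` lines shown in the log earlier in this thread. The function name `summarize_build_log` is hypothetical, and the line format is assumed to match that log exactly.

```python
import re

# Patterns modelled on the "# pysam: ..." lines from the pip -v output above.
CONFIGURE_LINE = re.compile(r"#\s*pysam:\s*htslib configure options:\s*(.*)")
CONFIG_OPTION = re.compile(r"#\s*pysam:\s*config_option:\s*(\w+)=(\S*)")


def summarize_build_log(text):
    """Extract the reported configure options and config_option flags."""
    configure_options = None
    flags = {}
    for line in text.splitlines():
        match = CONFIGURE_LINE.search(line)
        if match:
            # If the line appears more than once, the last one wins.
            configure_options = match.group(1).split()
            continue
        match = CONFIG_OPTION.search(line)
        if match:
            flags[match.group(1)] = match.group(2)
    return configure_options, flags
```

Feeding it the captured output of `pip3 install -v ...` gives a quick view of the options and flags the build reported.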

@egafni
Author

egafni commented Dec 23, 2017

I'll also note here for any other users that I have to set this env var for bucket authentication to work:
os.environ['GCS_OAUTH_TOKEN'] = subprocess.check_output('gcloud auth application-default print-access-token', shell=True).decode().strip()
The token seems to have a limited lifetime (maybe 10 mins?), after which it has to be set again.
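Because the token expires, one way to handle this is to re-export it before opening files whenever it is older than some threshold. The sketch below uses the same gcloud command as the comment above. The class name `TokenRefresher` and the 300-second default are assumptions chosen for illustration, not values taken from htslib or gcloud. It also assumes htslib reads `GCS_OAUTH_TOKEN` at the time a file is opened.

```python
import os
import subprocess
import time


def fetch_token():
    """Ask gcloud for an application-default access token."""
    out = subprocess.check_output(
        ["gcloud", "auth", "application-default", "print-access-token"]
    )
    # Strip the trailing newline so it does not end up inside the token.
    return out.decode().strip()


class TokenRefresher:
    """Re-export GCS_OAUTH_TOKEN once it is older than max_age seconds."""

    def __init__(self, fetch=fetch_token, max_age=300.0, clock=time.monotonic):
        self._fetch = fetch
        self._max_age = max_age  # assumed threshold, tune to your token lifetime
        self._clock = clock
        self._fetched_at = None

    def ensure_fresh(self):
        """Refresh the environment variable if needed and return the token."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._max_age:
            os.environ["GCS_OAUTH_TOKEN"] = self._fetch()
            self._fetched_at = now
        return os.environ["GCS_OAUTH_TOKEN"]
```

Calling `ensure_fresh()` right before each `pysam.AlignmentFile('gs://...')` call keeps the gcloud invocation off the hot path while still replacing stale tokens.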

@Gibbsdavidl

Gibbsdavidl commented Jan 10, 2019

In a google colab notebook:

import os
os.environ['HTSLIB_CONFIGURE_OPTIONS'] = "--enable-gcs"

!apt-get install libbz2-dev libcurl4-openssl-dev
!pip3 install pysam -v --force-reinstall --no-binary :all:

Worked for me.

@AndreasHeger
Contributor

AndreasHeger commented Jan 10, 2019 via email

@Gibbsdavidl

Yes that works!

!export HTSLIB_CONFIGURE_OPTIONS="--enable-gcs"
!apt-get install libbz2-dev libcurl4-openssl-dev
!pip3 install pysam -v --force-reinstall --no-binary :all:

@Gibbsdavidl

Small problem: I can only read public bam files.
Haven't figured out how to read from my private bucket.

Have installed via conda...

@gokceneraslan

gokceneraslan commented Apr 22, 2020

Apparently HTSLIB_CONFIGURE_OPTIONS="--enable-plugins --enable-gcs" produces a dynamic hfile_gcs.so plugin file, whereas HTSLIB_CONFIGURE_OPTIONS="--enable-gcs" first produces an hfile_gcs.o object file and then statically adds GCS support to the libhts.so library.

@AndreasHeger do you know if pysam can load plugin .so files like hfile_gcs.so properly at run time? Or is that something libhts.so is supposed to do? Maybe @daviesrob can also help us.

@jmarshall
Member

@gokceneraslan: The best way for this to work would be to compile pysam with HTSLIB_CONFIGURE_OPTIONS="--enable-plugins" and not compile the hfile_gcs.c etc code at all when you are building pysam.

At runtime, pysam would then be in a position to pick up the previously-built hfile_gcs.so/etc plugins that were installed along with a --enable-plugins-htslib back when that was built and installed. Pysam's htslib might benefit from being pointed at the right place via the HTS_PATH environment variable.
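A rough sketch of pointing pysam's htslib at previously built plugins via HTS_PATH is shown below. The helper names and any directories passed in are hypothetical; the actual plugin directory depends on how and where the external htslib was installed. Setting the variable before importing pysam is the cautious choice.

```python
import os


def find_plugin_dir(candidates, plugin="hfile_gcs.so"):
    """Return the first directory in `candidates` containing `plugin`, else None."""
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, plugin)):
            return directory
    return None


def export_hts_path(candidates, environ=os.environ):
    """Set HTS_PATH to the directory holding the GCS plugin, if one is found."""
    directory = find_plugin_dir(candidates)
    if directory is not None:
        environ["HTS_PATH"] = directory
    return directory
```

For example, `export_hts_path(["/usr/local/libexec/htslib"])` followed by `import pysam` would be one way to try this. That path is only a guess at a typical install location.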

@aalexander

aalexander commented Oct 23, 2020

I had this same issue with version 0.16.0.1 and I found that the problem was with the pysam binary packages. Running the pip install with the "--no-binary=pysam" option fixed it without any changes to HTSLIB_CONFIGURE_OPTIONS needed.

@jowodo

jowodo commented Sep 16, 2022

I had the same errors as OP with python3.10.4 and pysam 0.16.0.1.
I can install with python3.9.0 and pysam 0.16.0.1, though.
