Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bcftools annotate method missing #958

Closed
tirohia opened this issue Sep 30, 2020 · 11 comments
Closed

bcftools annotate method missing #958

tirohia opened this issue Sep 30, 2020 · 11 comments

Comments

@tirohia
Copy link

tirohia commented Sep 30, 2020

Currently assuming this is the only method missing. Most methods for bcftools are present, I can't for the life of me, figure out how to call bcftools annotate though. It results in:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/bcurran/.pycharm_helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/home/bcurran/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/red/project/prosper/src/vcftojson.py", line 16, in <module>
    pysam.annotate("-a", "geneNames.bed.gz", "-c", "CHROM, FROM, TO, GENE", "-h", '##INFO=<ID=GENE,Number=1,Type=String,Description="Gene name">', "-o", outfile)
AttributeError: module 'pysam' has no attribute 'annotate'

Have also looked using dir, i.e.

$ python
Python 3.6.12 |Anaconda, Inc.| (default, Sep  8 2020, 23:10:56) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pysam
>>> dir(pysam)
['AlignedRead', 'AlignedSegment', 'AlignmentFile', 'AlignmentHeader', 'BGZFile', 'CBACK', 'CDEL', 'CDIFF', 'CEQUAL', 'CHARD_CLIP', 'CINS', 'CMATCH', 'CPAD', 'CREF_SKIP', 'CSOFT_CLIP', 'FDUP', 'FMREVERSE', 'FMUNMAP', 'FPAIRED', 'FPROPER_PAIR', 'FQCFAIL', 'FREAD1', 'FREAD2', 'FREVERSE', 'FSECONDARY', 'FSUPPLEMENTARY', 'FUNMAP', 'FastaFile', 'Fastafile', 'FastqFile', 'FastqProxy', 'FastxFile', 'FastxRecord', 'GZIterator', 'GZIteratorHead', 'HFile', 'HTSFile', 'IndexedReads', 'IteratorColumn', 'IteratorRow', 'KEY_NAMES', 'Pileup', 'PileupColumn', 'PileupRead', 'Samfile', 'SamtoolsError', 'TabixFile', 'Tabixfile', 'VCF', 'VCFRecord', 'VariantFile', 'VariantHeader', 'VariantHeaderRecord', 'VariantRecord', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__samtools_version__', '__spec__', '__version__', 'addreplacerg', 'array_to_qualitystring', 'asBed', 'asGFF3', 'asGTF', 'asTuple', 'asVCF', 'bam2fq', 'bamshuf', 'bedcov', 'calmd', 'cat', 'collate', 'config', 'coverage', 'depad', 'depth', 'dict', 'faidx', 'fasta', 'fastq', 'fixmate', 'flags', 'flagstat', 'fqidx', 'get_defines', 'get_include', 'get_libraries', 'get_verbosity', 'idxstats', 'index', 'libcalignedsegment', 'libcalignmentfile', 'libcbcf', 'libcbcftools', 'libcbgzf', 'libcfaidx', 'libchtslib', 'libcsamfile', 'libcsamtools', 'libctabix', 'libctabixproxies', 'libcutils', 'libcvcf', 'markdup', 'merge', 'mpileup', 'os', 'pad2unpad', 'phase', 'py_bcftools', 'py_samtools', 'pysam', 'qualities_to_qualitystring', 'qualitystring_to_array', 'quickcheck', 'reheader', 'rmdup', 'samtools', 'set_verbosity', 'sort', 'split', 'stats', 'sys', 'sysconfig', 'tabix_compress', 'tabix_file_iterator', 'tabix_generic_iterator', 'tabix_index', 'tabix_iterator', 'targetcut', 'tview', 'utils', 'version', 'view']

I can't see anything in there that would suggest an annotate method. That said, there is a file in this repo that suggests that it has been at least partially implemented?

@jmarshall
Copy link
Member

Samtools and bcftools commands are added in pysam/samtools.py and pysam/bcftools.py, however they differ in small ways and in particular pysam/__init__.py imports from samtools but not bcftools.

Hence:

import pysam.bcftools
pysam.bcftools.annotate(…)

However this then proceeds to not work. So there is something to fix here for all bcftools commands.

@tirohia
Copy link
Author

tirohia commented Oct 1, 2020

Is this the same for tools like tabix and bgzip? I'm managing to call them without specifically importing pysam.samtools. i.e. they are available as long as I import pysam.

@jmarshall
Copy link
Member

What python invocations are you using for tabix and bgzip?

@jmarshall
Copy link
Member

However this then proceeds to not work. So there is something to fix here for all bcftools commands.

The problem is that bcftools calls exit() on pretty much any error. (The samtools code sometimes calls exit() but often instead returns an error indication from main().) This in turn exits from the Python interpreter (which returning from main() does not cause).

This isn't really practical to fix. When invoking bcftools commands, make very sure to get your arguments correct and avoid causing error messages 😄

@jkbonfield
Copy link

Isn't the whole problem here that pysam is calling a program as if it was a function, rather than running it in a sub-process? I think it would come under the "courageous decision" moniker.

Certainly we shouldn't be exiting from a library, but I think it's fair enough for application developers to consider exiting on error and to not be assuming they're being turned into a library by some magic of linking.

@jmarshall
Copy link
Member

it comes under the “one of the older parts of pysam, there because people have found it useful” moniker.

@jkbonfield
Copy link

I'm not doubting the functionality, just the implementation. :) Even if it's still calling it via a function, it could fork first so exit doesn't kill the main thread.

@jmarshall
Copy link
Member

Indeed. I'm sure patches are welcome!

@gregmcinnes
Copy link

It looks like some comments are missing in this thread but I think I am experience a related problem. BCFtools commands that I guess throw errors exit without showing the error. To me the commands seem correct and work on the command line, but do not work using pysam.

For example, this command does nothing. But the equivalent command on the command line works fine.

import pysam.bcftools as bcftools
bcftools.norm("-f", "data/hg38.fa.gz", "-c", "s", "-o", "test_norm.vcf", "test.vcf.gz")

I'm also not able to write files with vcftools even when the commands do seem to work.

This command does not write test_view.vcf

bcftools.view("-o", "test_view.vcf", "test.vcf.gz")

Am I misunderstanding how to run bcftools commands with pysam or is something wrong?

Thanks

@gregmcinnes
Copy link

I figured out that the output is being returned as a variable. Is it possible to have bcftools redirect this to a file?

@jmarshall
Copy link
Member

@gregmcinnes: You want bcftools.view(…, catch_stdout=False).

indraniel added a commit to indraniel/pysam that referenced this issue Mar 31, 2024
  - this wasn't explicitly stated in the documentation.  I was
    using the notes described in [github issue 958][0] as a guide.

[0]: pysam-developers#958
indraniel added a commit to indraniel/pysam that referenced this issue Mar 31, 2024
  - this wasn't explicitly stated in the documentation.  I was
    using the notes described in [github issue 958][0] as a guide.
  - in the changes also put updated links to the bcftools and htslib.org pages

[0]: pysam-developers#958
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants