Trio pipeline #2961

Closed
kokyriakidis opened this issue Sep 30, 2019 · 56 comments


@kokyriakidis

kokyriakidis commented Sep 30, 2019

@chapmanb

  1. I would like to run a trio analysis on whole-exome samples. Can I use all callers (strelka2, deepvariant, vardict, gatk, etc.) for a trio analysis when the samples share the same batch name? Can I use the ensemble method?

  2. I am also trying to do CNV analysis on this trio. Can I add all svcallers? Do they all work with a single germline sample?

It would also be nice to specify in the documentation:

Which callers can be used for Germline Variant Calling
Which callers can only be used for Somatic (Tumor-Normal) Variant Calling
Which callers can be used for Germline SV Calling
Which callers can only be used for Somatic (Tumor-Normal) SV Calling
Which callers can be used for Trio analysis

@naumenko-sa
Contributor

Hi Konstantinos @kokyriakidis!

I used gatk4, gatk3.8, samtools, freebayes, platypus and ensemble for Trio Exome analysis.
Yes, just use batch: family_name to call all samples together.
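
As a rough illustration of what batching means here (a sketch, not bcbio's actual implementation), samples that share the same `metadata: batch` value end up in one joint-calling group. The sample dictionaries and the `group_by_batch` helper are hypothetical and only mirror the shape of a bcbio sample YAML. Treating an unbatched sample as its own group is a choice made for this sketch.

```python
from collections import defaultdict


def group_by_batch(samples):
    """Group sample descriptions by their metadata batch name."""
    groups = defaultdict(list)
    for sample in samples:
        # Sketch assumption: a sample without a batch forms its own group.
        batch = sample.get("metadata", {}).get("batch", sample["description"])
        groups[batch].append(sample["description"])
    return dict(groups)


# Hypothetical trio plus one unrelated sample
samples = [
    {"description": "child", "metadata": {"batch": "family1"}},
    {"description": "father", "metadata": {"batch": "family1"}},
    {"description": "mother", "metadata": {"batch": "family1"}},
    {"description": "singleton"},
]
batches = group_by_batch(samples)
print(batches)
```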

SV/CNV analysis is different. While for small variants you might want to call them in a batch,
SV/CNV calling in manta, lumpy, delly, wham is based on split-reads, and you call samples individually.

You may also try to call CNVs using XHMM across many samples with similar coverage
(https://atgu.mgh.harvard.edu/xhmm/tutorial.shtml) outside of bcbio.

Sergey

@kokyriakidis
Author

kokyriakidis commented Sep 30, 2019

Hi @naumenko-sa !

Do you recommend the same tools for Trio analysis? Or should I opt for GATK, deepvariant, strelka2, vardict with the ensemble method?

@naumenko-sa
Contributor

naumenko-sa commented Oct 1, 2019

Hi Konstantinos @kokyriakidis!

More callers + ensemble does not necessarily mean better calling.
For SNV calling, I think any of these tools can give good precision and sensitivity.

Below is my simple validation from 2018. As you can see, samtools is enough for SNVs, while gatk gives better results for indels.

[Image: 2018-05_WES validation no_clinical]

You can find more extensive validation methods and results among the publications by Justin Zook:
https://scholar.google.com/citations?hl=en&user=3Xjafy0AAAAJ&view_op=list_works&sortby=pubdate

I'd suggest running validations on NA12878 and on trios (Ashkenazim trio, Chinese trio) using GIAB and their article to see how the tools work in your environment.

Bcbio would benefit from any updated validation like those in:
https://github.com/bcbio/bcbio_validations
Especially if you compared the new DeepVariant. That would be a great contribution.

Using many tools and combining them with ensemble would allow you to pick up variants
which would not be called by one of the individual tools.

However, this approach has many downsides:

  • it is slow; plain gatk4 is much faster;
  • consolidating calls from different tools into a report is painful;
  • the additional variants you gain with many tools are mostly 'trash' variants; you could get them from gatk4 by relaxing filters;
  • more 'trash' variants make interpretation harder;
  • https://github.com/bcbio/bcbio.variation.recall, which does the ensemble calling and merging, works well, but its support and further development are not a top priority because it is not clear why we need it; it would be more logical to drop support for some callers and focus on the best ones instead.
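
To make the ensemble trade-off concrete, here is a simplified sketch of caller voting. It is a stand-in for illustration only and not how bcbio.variation.recall works internally: it ignores variant normalization and genotypes, and `numpass` is assumed here to mean the minimum number of callers that must report a variant. The variant tuples and the `ensemble_calls` helper are hypothetical.

```python
from collections import Counter


def ensemble_calls(calls_by_caller, numpass=2):
    """Keep variants reported by at least `numpass` callers."""
    counts = Counter()
    for variants in calls_by_caller.values():
        counts.update(set(variants))  # each caller votes once per variant
    return sorted(v for v, n in counts.items() if n >= numpass)


# Hypothetical calls keyed as (chrom, pos, ref, alt)
calls = {
    "gatk-haplotype": [("1", 100, "A", "G"), ("1", 200, "C", "T")],
    "freebayes": [("1", 100, "A", "G"), ("2", 50, "G", "A")],
    "samtools": [("1", 100, "A", "G"), ("1", 200, "C", "T")],
}
kept = ensemble_calls(calls, numpass=2)   # variants seen by two or more callers
union = ensemble_calls(calls, numpass=1)  # every variant from any caller
print(len(kept), len(union))
```

Lowering the threshold adds the variant that only one caller reported, which is the kind of extra call the list above describes as hard to interpret.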

Briefly, for germline WES variant analysis I'd suggest using gatk4 (or gatk3.8, whichever works better for you) and focusing more on validation, variant filtration, annotation, and interpretation.

Small variant calling in germline as a bioinformatics problem seems to be (almost) solved at 99% precision and sensitivity (worse for indels, but again 99% if you have WGS data).

Sergey

@kokyriakidis
Author

kokyriakidis commented Oct 1, 2019

@naumenko-sa Thank you for your wonderful explanation!

I want to ask something that bothers me:
How does it know which sample is the father, which is the mother, etc.? Doesn't it have to know? I cannot understand how this works.

@roryk
Collaborator

roryk commented Oct 1, 2019

@kokyriakidis
Author

kokyriakidis commented Oct 1, 2019

@roryk Let's say I have a TRIO:

178F	1	0	0	1	1  #FATHER
178F	2	0	0	2	1  #MOTHER
178F	3	1	2	2	2  #CHILD

The columns are:

Family ID
Individual ID
Paternal ID
Maternal ID
Sex (1=male; 2=female; other=unknown)
Phenotype
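
For reference, the six columns listed above can be read into named fields with a few lines of Python. This is a minimal sketch: `PedRecord` and `parse_ped_line` are hypothetical names, and treating everything after `#` as a comment is an assumption made only so that the annotated rows in this post parse; real PED consumers may not accept inline comments.

```python
from typing import NamedTuple


class PedRecord(NamedTuple):
    family_id: str
    individual_id: str
    paternal_id: str
    maternal_id: str
    sex: str
    phenotype: str


SEX_CODES = {"1": "male", "2": "female"}  # any other code maps to "unknown"


def parse_ped_line(line):
    """Parse one PED line; text after '#' is dropped (sketch assumption)."""
    fields = line.split("#", 1)[0].split()
    if len(fields) < 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
    fam, ind, pat, mat, sex, pheno = fields[:6]
    return PedRecord(fam, ind, pat, mat, SEX_CODES.get(sex, "unknown"), pheno)


child = parse_ped_line("178F\t3\t1\t2\t2\t2  #CHILD")
print(child)
```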

How are these rows matched to the samples? Do I have to change the sample names?

My template is:

details:
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - samtools
    - platypus
    - freebayes
  analysis: variant2
  description: 178F_CHILD
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_CHILD_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_CHILD_2.fq.gz
  genome_build: GRCh37
  metadata:
    batch: 178F
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - samtools
    - platypus
    - freebayes
  analysis: variant2
  description: 178F_FATHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_FATHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_FATHER_2.fq.gz
  genome_build: GRCh37
  metadata:
    batch: 178F
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - samtools
    - platypus
    - freebayes
  analysis: variant2
  description: 178F_MOTHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_MOTHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIAN/178F/input/178F_MOTHER_2.fq.gz
  genome_build: GRCh37
  metadata:
    batch: 178F
fc_name: 178F
upload:
  dir: ../final

@roryk
Collaborator

roryk commented Oct 1, 2019

The 'description' metadata should match the individual ID in the PED file. bcbio will call all of the samples in the same batch together, and they will get annotated in the GEMINI database according to the family structure in the PED file. It will also check that the family structure is correct; people are often unaware of their actual family structure, so it is good to check.
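
A quick way to check this before launching a run is to compare the second PED column against the sample descriptions. The sketch below is not part of bcbio; `check_ped_matches` is a hypothetical helper, and the PED text is a rewritten version of the example above in which the individual and parent IDs use the sample descriptions, shown as an assumption of what a matching file could look like.

```python
def check_ped_matches(ped_text, descriptions):
    """Compare PED individual IDs (column 2) with sample descriptions."""
    ped_ids = set()
    for line in ped_text.splitlines():
        fields = line.split()
        # Skip blank lines and header/comment lines starting with '#'.
        if len(fields) >= 2 and not line.startswith("#"):
            ped_ids.add(fields[1])
    wanted = set(descriptions)
    return {
        "missing_from_ped": sorted(wanted - ped_ids),
        "unused_in_ped": sorted(ped_ids - wanted),
    }


descriptions = ["178F_CHILD", "178F_FATHER", "178F_MOTHER"]

matching_ped = """178F 178F_FATHER 0 0 1 1
178F 178F_MOTHER 0 0 2 1
178F 178F_CHILD 178F_FATHER 178F_MOTHER 2 2
"""
numeric_ped = """178F 1 0 0 1 1
178F 2 0 0 2 1
178F 3 1 2 2 2
"""
ok_report = check_ped_matches(matching_ped, descriptions)
bad_report = check_ped_matches(numeric_ped, descriptions)
print(ok_report)
print(bad_report)
```

With the numeric IDs from the original PED, every description is reported as missing, which is the mismatch the question was about.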

@kokyriakidis
Author

Thank you so much for the clarification!

@kokyriakidis
Author

kokyriakidis commented Oct 2, 2019

Any thoughts on why octopus stalls in trio analysis?

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 315, in variantcall_sample
    return genotype.variantcall_sample(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 377, in variantcall_sample
    out_file = caller_fn(align_bams, items, ref_file, assoc_files, region, out_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/octopus.py", line 28, in run
    return _run_germline(align_bams, items, ref_file, target, out_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/octopus.py", line 99, in _run_germline
    _produce_compatible_vcf(tx_out_file, items[0])
TypeError: _produce_compatible_vcf() missing 1 required positional argument: 'is_somatic'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 154, in variant2pipeline
    samples = genotype.parallel_variantcall_region(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 208, in parallel_variantcall_region
    "vrn_file", ["region", "sam_ref", "config"]))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/split.py", line 35, in grouped_parallel_split_combine
    final_output = parallel_fn(parallel_name, split_args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 934, in __call__
    self.retrieve()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/multiprocessing/pool.py", line 670, in get
    raise self._value
TypeError: _produce_compatible_vcf() missing 1 required positional argument: 'is_somatic'


Certain chromosomes finished correctly (19/25). After that, this error pops up. I tried rerunning but got the same result.

@chapmanb
Member

chapmanb commented Oct 2, 2019

Thanks much for the report and apologies about the problem. This was a bug in finalizing the octopus VCF files, which is now fixed in the latest development version. If you update with (bcbio_nextgen.py upgrade -u development) and re-run in place, it should post-process these correctly and hopefully finish cleanly.
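
For anyone reading the traceback above: it shows `_produce_compatible_vcf` being called with two arguments while its signature requires a third, `is_somatic`. The snippet below reproduces that class of error with hypothetical stand-in functions. It is not bcbio's code, and the actual fix in the development version may look different.

```python
def produce_compatible_vcf(out_file, data, is_somatic):
    """Hypothetical stand-in: post-process a VCF, branching on is_somatic."""
    mode = "somatic" if is_somatic else "germline"
    return f"{out_file} finalized in {mode} mode"


def run_germline_old(out_file, data):
    # Call site that was not updated after `is_somatic` joined the signature.
    return produce_compatible_vcf(out_file, data)


def run_germline_fixed(out_file, data):
    # Pass the new argument explicitly.
    return produce_compatible_vcf(out_file, data, is_somatic=False)


try:
    run_germline_old("calls.vcf", {})
except TypeError as exc:
    error = str(exc)
    print(error)

fixed = run_germline_fixed("calls.vcf", {})
print(fixed)
```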

@kokyriakidis
Author

kokyriakidis commented Oct 2, 2019

@chapmanb @naumenko-sa When I ran the trio pipeline, I got this error. Any thoughts?

[2019-10-02T04:56Z] =============================================
[2019-10-02T04:56Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-02T04:56Z] see: https://github.com/brentp/vcfanno
[2019-10-02T04:56Z] =============================================
[2019-10-02T04:56Z] vcfanno.go:115: found 77 sources from 10 files
[2019-10-02T04:56Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_genome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-02T04:56Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-02T04:56Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-02T04:56Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-02T04:56Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-02T04:57Z] vcfanno.go:194: Info Error: AF_popmax not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-02T04:57Z] vcfanno.go:194: Info Error: CLNSIG not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-02T05:17Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-02T05:17Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-02T05:17Z] vcfanno.go:248: annotated 162216 variants in 1277.45 seconds (127.0 / second)
[2019-10-02T05:17Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble-nomultiallelic-annotated-cre.vcfanno.vcf.gz
[2019-10-02T05:17Z] GEMINI: create database with vcf2db
[2019-10-02T05:17Z] skipping 'DP4' because it has Number=4
[2019-10-02T05:17Z] skipping 'MMQ' because it has Number=R
[2019-10-02T05:17Z] setting vcfanno_gnomad_ac to Type String because it has Number=.
[2019-10-02T05:17Z] setting vcfanno_gnomad_an to Type String because it has Number=.
[2019-10-02T05:17Z] setting vcfanno_gnomad_hom to Type String because it has Number=.
[2019-10-02T05:17Z] Ethnicity None
[2019-10-02T05:17Z] Traceback (most recent call last):
[2019-10-02T05:17Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-02T05:17Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-02T05:17Z]   File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
[2019-10-02T05:17Z]     self.samples = self.create_samples()
[2019-10-02T05:17Z]   File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
[2019-10-02T05:17Z]     vals = [r[i] for r in rows]
[2019-10-02T05:17Z] IndexError: list index out of range
[2019-10-02T05:17Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpygmu3evp/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble.db
skipping 'DP4' because it has Number=4
skipping 'MMQ' because it has Number=R
setting vcfanno_gnomad_ac to Type String because it has Number=.
setting vcfanno_gnomad_an to Type String because it has Number=.
setting vcfanno_gnomad_hom to Type String because it has Number=.
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 401, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 48, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 140, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpygmu3evp/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-ensemble.db
skipping 'DP4' because it has Number=4
skipping 'MMQ' because it has Number=R
setting vcfanno_gnomad_ac to Type String because it has Number=.
setting vcfanno_gnomad_an to Type String because it has Number=.
setting vcfanno_gnomad_hom to Type String because it has Number=.
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
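
The `IndexError` is raised by `vals = [r[i] for r in rows]`, which fails when at least one row has fewer fields than the column index being read. One possible cause, not confirmed in this thread, is a PED file whose rows do not all have the same number of columns. The sketch below is a quick consistency check under that assumption; the function name `find_ragged_rows` and the example PED contents (including the sex and phenotype codes) are illustrative, not taken from the actual files.

```python
def find_ragged_rows(ped_text):
    """Return (line_number, n_fields, expected) for every row whose
    field count differs from the first non-blank row."""
    problems = []
    expected = None
    for lineno, line in enumerate(ped_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        # A leading '#' marks a header row; count its fields as well.
        fields = line.lstrip("#").split()
        if expected is None:
            expected = len(fields)
        elif len(fields) != expected:
            problems.append((lineno, len(fields), expected))
    return problems


# Hypothetical PED content: the third line is missing the phenotype column.
example = (
    "#family_id\tname\tpaternal_id\tmaternal_id\tsex\tphenotype\n"
    "178F\t178F_CHILD\t178F_FATHER\t178F_MOTHER\t2\t2\n"
    "178F\t178F_FATHER\t0\t0\t1\n"
)
print(find_ragged_rows(example))
```

To check a real file, pass its contents, for example `find_ragged_rows(open(ped_path).read())`, where `ped_path` is the `.ped` file named in the failing `vcf2db.py` command.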

My template is:

details:
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - strelka2
    - samtools
    - freebayes
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_CHILD
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: female
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - strelka2
    - samtools
    - freebayes
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_FATHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: male
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    ensemble:
      numpass: 2
      use_filtered: false
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    - strelka2
    - samtools
    - freebayes
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_MOTHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: female
fc_name: 178F
upload:
  dir: ../final
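
The three sample entries above differ only in `description`, `files` and `sex`. A minimal sketch of building them from one shared block and checking that they land in a single batch follows. It uses plain Python dictionaries rather than bcbio's own template tooling; `make_sample`, `shared_batch` and the `/path/to/input` directory are hypothetical names, and the algorithm block is abridged. The sketch uses the short batch name `178F`, following the earlier `batch: family_name` advice in this thread. The template above uses a full path as the batch, which appears to be why the work file names begin with `_mnt_36642bae-...`; whether that is related to the failure is not established here.

```python
import copy

# Settings shared by every sample, copied from the template (abridged).
SHARED_ALGORITHM = {
    "aligner": "bwa",
    "effects": "vep",
    "ensemble": {"numpass": 2, "use_filtered": False},
    "tools_on": ["gemini", "svplots", "qualimap"],
    "variantcaller": ["gatk-haplotype", "strelka2", "samtools", "freebayes"],
}


def make_sample(description, sex, batch, input_dir):
    """Build one entry of the `details` list for a paired-end sample."""
    return {
        "algorithm": copy.deepcopy(SHARED_ALGORITHM),
        "analysis": "variant2",
        "description": description,
        "files": [
            f"{input_dir}/{description}_1.fq.gz",
            f"{input_dir}/{description}_2.fq.gz",
        ],
        "genome_build": "hg38",
        "metadata": {"batch": batch, "sex": sex},
    }


def shared_batch(samples):
    """Return the batch name if all samples share it, else raise."""
    batches = {s["metadata"]["batch"] for s in samples}
    if len(batches) != 1:
        raise ValueError(f"samples are spread over several batches: {sorted(batches)}")
    return batches.pop()


members = [("178F_CHILD", "female"), ("178F_FATHER", "male"), ("178F_MOTHER", "female")]
trio = [make_sample(name, sex, batch="178F", input_dir="/path/to/input")
        for name, sex in members]
print(shared_batch(trio))
```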

The vcfanno configuration I used is the following, from @naumenko-sa's cre project:

# using prefix vcfanno to discriminate data from vcfanno and vep
[[annotation]]
file="variation/gnomad_exome.vcf.gz"
fields=["AC","nhomalt","AF_popmax","AN"]
names=["vcfanno_gnomad_ac_es","vcfanno_gnomad_hom_es","vcfanno_gnomad_af_es","vcfanno_gnomad_an_es"]
ops=["first","first","first","first"]

[[annotation]]
file="variation/gnomad_genome.vcf.gz"
fields=["AC", "nhomalt","AF_popmax","AN"]
names=["vcfanno_gnomad_ac_gs", "vcfanno_gnomad_hom_gs","vcfanno_gnomad_af_gs","vcfanno_gnomad_an_gs"]
ops=["self","self","self","self"]

[[postannotation]]
fields=["vcfanno_gnomad_ac_es","vcfanno_gnomad_ac_gs"]
op="sum"
name="vcfanno_gnomad_ac"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_hom_es","vcfanno_gnomad_hom_gs"]
op="sum"
name="vcfanno_gnomad_hom"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_af_es","vcfanno_gnomad_af_gs"]
op="max"
name="vcfanno_gnomad_af_popmax"
type="Float"

[[postannotation]]
fields=["vcfanno_gnomad_an_es","vcfanno_gnomad_an_gs"]
op="sum"
name="vcfanno_gnomad_an"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_ac","vcfanno_gnomad_an"]
op="div2"
name="vcfanno_gnomad_af"
type="Float"

[[annotation]]
file="variation/dbsnp-151.vcf.gz"
fields=["ID"]
names=["rs_ids"]
ops=["concat"]

[[annotation]]
file="variation/clinvar.vcf.gz"
fields=["CLNSIG"]
names=["clinvar_pathogenic"]
ops=["concat"]

#dbNSFP v3.4
[[annotation]]
file = "variation/dbNSFP.txt.gz"
names = ["CADD_phred","phyloP20way_mammalian","phastCons20way_mammalian","Vest3_score","Revel_score","Gerp_score"]
columns = [79,111,115,58,70,107]
ops = ["first","first","first","first","first","first"]
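
To make the `[[postannotation]]` chain above easier to follow, here is a sketch of the arithmetic it describes: exome and genome counts are summed, the popmax frequency is the larger of the two, and the combined frequency is the summed allele count divided by the summed allele number. This is plain Python rather than vcfanno itself. The function name `postannotate` and the input numbers are made up, and returning `None` when the allele number is zero is an assumption of this sketch, not documented vcfanno behaviour.

```python
def postannotate(es, gs):
    """Combine exome (es) and genome (gs) gnomAD values.

    Each argument is a dict with keys 'ac', 'an', 'hom', 'af_popmax'.
    """
    ac = es["ac"] + gs["ac"]      # op="sum"  -> vcfanno_gnomad_ac
    hom = es["hom"] + gs["hom"]   # op="sum"  -> vcfanno_gnomad_hom
    an = es["an"] + gs["an"]      # op="sum"  -> vcfanno_gnomad_an
    af_popmax = max(es["af_popmax"], gs["af_popmax"])  # op="max"
    af = ac / an if an else None  # op="div2" -> vcfanno_gnomad_af
    return {
        "vcfanno_gnomad_ac": ac,
        "vcfanno_gnomad_hom": hom,
        "vcfanno_gnomad_an": an,
        "vcfanno_gnomad_af_popmax": af_popmax,
        "vcfanno_gnomad_af": af,
    }


# Invented example values for a single variant.
exome = {"ac": 3, "an": 200000, "hom": 0, "af_popmax": 2.0e-5}
genome = {"ac": 1, "an": 30000, "hom": 0, "af_popmax": 6.0e-5}
result = postannotate(exome, genome)
print(result["vcfanno_gnomad_ac"], result["vcfanno_gnomad_an"])
```

Note that the combined frequency is computed from the pooled counts (4 / 230000 in this example), not as an average of the two source frequencies.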

@kokyriakidis
Author

kokyriakidis commented Oct 2, 2019

@chapmanb I get another error from octopus during its run:

[2019-10-02T13:17Z] /bin/bash: -c: line 0: syntax error near unexpected token `('
[2019-10-02T13:17Z] /bin/bash: -c: line 0: `set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy'
[2019-10-02T13:17Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command 'set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy
/bin/bash: -c: line 0: syntax error near unexpected token `('
/bin/bash: -c: line 0: `set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy'
' returned non-zero exit status 1.

Octopus then continues processing the other regions and, after roughly an hour, the run ends with this:

[2019-10-02T14:14Z] [2019-10-02 17:14:05] <INFO> Finished calling 958,606bp, total runtime 1h 6m
[2019-10-02T14:14Z] [2019-10-02 17:14:05] <INFO> Calls have been written to "/mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpyt5thvtf/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr20_0_32180052.vcf.gz"
[2019-10-02T14:14Z] [2019-10-02 17:14:05] <INFO> Legacy VCF file written to "/mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpyt5thvtf/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr20_0_32180052.legacy.vcf.gz"
[2019-10-02T14:14Z] [2019-10-02 17:14:05] <INFO> Removed 2 temporary files
[2019-10-02T14:14Z] [2019-10-02 17:14:05] <INFO> ------------------------------------------------------------------------
[2019-10-02T14:14Z] Produce compatible VCF output file from octopus
[2019-10-02T14:14Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr20_0_32180052.vcf.gz
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 315, in variantcall_sample
    return genotype.variantcall_sample(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 377, in variantcall_sample
    out_file = caller_fn(align_bams, items, ref_file, assoc_files, region, out_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/octopus.py", line 28, in run
    return _run_germline(align_bams, items, ref_file, target, out_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/octopus.py", line 98, in _run_germline
    do.run(cmd.format(**locals()), "Octopus germline calling")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command 'set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy
/bin/bash: -c: line 0: syntax error near unexpected token `('
/bin/bash: -c: line 0: `set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy'
' returned non-zero exit status 1.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 154, in variant2pipeline
    samples = genotype.parallel_variantcall_region(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/genotype.py", line 208, in parallel_variantcall_region
    "vrn_file", ["region", "sam_ref", "config"]))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/split.py", line 35, in grouped_parallel_split_combine
    final_output = parallel_fn(parallel_name, split_args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 934, in __call__
    self.retrieve()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/multiprocessing/pool.py", line 670, in get
    raise self._value
subprocess.CalledProcessError: Command 'set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy
/bin/bash: -c: line 0: syntax error near unexpected token `('
/bin/bash: -c: line 0: `set -o pipefail; octopus --threads 1 --reference /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa --reads /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_CHILD/178F_CHILD-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_FATHER/178F_FATHER-sort.bam /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/align/178F_MOTHER/178F_MOTHER-sort.bam --regions-file ('chr15', 0, 17010044) --working-directory /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0 -o /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmplucwgwj0/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-chr15_0_17010044.vcf.gz --legacy'
' returned non-zero exit status 1.
kokyriakidis@Konstantinos:/mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work$ 

@chapmanb
Member

chapmanb commented Oct 3, 2019

Konstantinos;
Thanks for the additional testing and the detailed bug reports. Sorry you're hitting so many issues here. We'd mostly used octopus for somatic calling tests so far, so we hadn't done extensive germline testing, and we've primarily moved to a single variant caller for germline calling rather than ensemble methods. We appreciate you helping work through these issues.

The latest development has two fixes for the two issues:

  • It skips gemini database creation for the ensemble VCFs. Those are going to have a mix of tags from multiple callers and are likely to fail in a number of situations. It should still create gemini databases for your initial callers.
  • Octopus command line generation should now handle both BED file input and region-based inputs, the latter of which caused the failure here.
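
For context, the region failure comes from the tuple `('chr15', 0, 17010044)` being pasted into the shell command after `--regions-file`, so bash hits an unquoted `(`. Below is a minimal sketch of the kind of branching described in the second point, not the actual bcbio change. The helper name `octopus_region_args` is hypothetical, and it assumes octopus's `--regions` option accepts `chrom:start-end` strings; the coordinate convention (0- or 1-based) should be checked against the octopus documentation before relying on it.

```python
import shlex


def octopus_region_args(target):
    """Return octopus arguments for a BED file path or a (chrom, start, end) tuple."""
    if isinstance(target, (list, tuple)):
        chrom, start, end = target
        # Assumption: coordinates are passed through unchanged.
        return ["--regions", f"{chrom}:{start}-{end}"]
    # Anything else is treated as a path to a BED file.
    return ["--regions-file", str(target)]


def to_shell(args):
    """Quote each argument so the resulting string is safe to hand to bash."""
    return " ".join(shlex.quote(a) for a in args)


print(to_shell(octopus_region_args(("chr15", 0, 17010044))))
print(to_shell(octopus_region_args("/path/to/targets.bed")))
```

Building the command as a list and quoting each element also guards against other characters that bash would otherwise interpret.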

Please let us know if you hit any other problems and thanks again for all the patience debugging.

@kokyriakidis
Author

kokyriakidis commented Oct 4, 2019

@chapmanb Using just gatk-haplotype, I now get these error messages:

[2019-10-03T17:42Z] =============================================
[2019-10-03T17:42Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-03T17:42Z] see: https://github.com/brentp/vcfanno
[2019-10-03T17:42Z] =============================================
[2019-10-03T17:42Z] vcfanno.go:115: found 77 sources from 10 files
[2019-10-03T17:42Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_genome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-03T17:42Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-03T17:42Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-03T17:42Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-03T17:42Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-03T17:42Z] vcfanno.go:194: Info Error: AF_popmax not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-03T17:43Z] vcfanno.go:194: Info Error: CLNSIG not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-03T18:02Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-03T18:02Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-03T18:02Z] vcfanno.go:248: annotated 148115 variants in 1244.76 seconds (119.0 / second)
[2019-10-03T18:02Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz
[2019-10-03T18:02Z] GEMINI: create database with vcf2db
[2019-10-03T18:02Z] skipping 'MMQ' because it has Number=R
[2019-10-03T18:02Z] setting vcfanno_gnomad_ac to Type String because it has Number=.
[2019-10-03T18:02Z] setting vcfanno_gnomad_an to Type String because it has Number=.
[2019-10-03T18:02Z] setting vcfanno_gnomad_hom to Type String because it has Number=.
[2019-10-03T18:02Z] Ethnicity None
[2019-10-03T18:02Z] Traceback (most recent call last):
[2019-10-03T18:02Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-03T18:02Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-03T18:02Z]   File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
[2019-10-03T18:02Z]     self.samples = self.create_samples()
[2019-10-03T18:02Z]   File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
[2019-10-03T18:02Z]     vals = [r[i] for r in rows]
[2019-10-03T18:02Z] IndexError: list index out of range
[2019-10-03T18:02Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpwtu0e31l/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
setting vcfanno_gnomad_ac to Type String because it has Number=.
setting vcfanno_gnomad_an to Type String because it has Number=.
setting vcfanno_gnomad_hom to Type String because it has Number=.
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpwtu0e31l/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
setting vcfanno_gnomad_ac to Type String because it has Number=.
setting vcfanno_gnomad_an to Type String because it has Number=.
setting vcfanno_gnomad_hom to Type String because it has Number=.
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.

My template is:

details:
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_CHILD
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: female
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_FATHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: male
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
    vcfanno:
    - /home/kokyriakidis/cre/cre.vcfanno.conf
  analysis: variant2
  description: 178F_MOTHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    ped: ../178F_PED
    sex: female
fc_name: 178F
upload:
  dir: ../final

My vcfanno annotation file is:

# using prefix vcfanno to discriminate data from vcfanno and vep
[[annotation]]
file="variation/gnomad_exome.vcf.gz"
fields=["AC","nhomalt","AF_popmax","AN"]
names=["vcfanno_gnomad_ac_es","vcfanno_gnomad_hom_es","vcfanno_gnomad_af_es","vcfanno_gnomad_an_es"]
ops=["first","first","first","first"]

[[annotation]]
file="variation/gnomad_genome.vcf.gz"
fields=["AC", "nhomalt","AF_popmax","AN"]
names=["vcfanno_gnomad_ac_gs", "vcfanno_gnomad_hom_gs","vcfanno_gnomad_af_gs","vcfanno_gnomad_an_gs"]
ops=["self","self","self","self"]

[[postannotation]]
fields=["vcfanno_gnomad_ac_es","vcfanno_gnomad_ac_gs"]
op="sum"
name="vcfanno_gnomad_ac"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_hom_es","vcfanno_gnomad_hom_gs"]
op="sum"
name="vcfanno_gnomad_hom"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_af_es","vcfanno_gnomad_af_gs"]
op="max"
name="vcfanno_gnomad_af_popmax"
type="Float"

[[postannotation]]
fields=["vcfanno_gnomad_an_es","vcfanno_gnomad_an_gs"]
op="sum"
name="vcfanno_gnomad_an"
type="Integer"

[[postannotation]]
fields=["vcfanno_gnomad_ac","vcfanno_gnomad_an"]
op="div2"
name="vcfanno_gnomad_af"
type="Float"

[[annotation]]
file="variation/dbsnp-151.vcf.gz"
fields=["ID"]
names=["rs_ids"]
ops=["concat"]

[[annotation]]
file="variation/clinvar.vcf.gz"
fields=["CLNSIG"]
names=["clinvar_pathogenic"]
ops=["concat"]

#dbNSFP v3.4
[[annotation]]
file = "variation/dbNSFP.txt.gz"
names = ["CADD_phred","phyloP20way_mammalian","phastCons20way_mammalian","Vest3_score","Revel_score","Gerp_score"]
columns = [79,111,115,58,70,107]
ops = ["first","first","first","first","first","first"]

@roryk
Collaborator

roryk commented Oct 4, 2019

I'm guessing your PED file is malformed somehow. Can you pass on the PED file you are using?

@kokyriakidis
Author

kokyriakidis commented Oct 4, 2019

@roryk

My PED file:

178F	178F_FATHER	0	0	1	1
178F	178F_MOTHER	0	0	2	1
178F	178F_CHILD	178F_FATHER	178F_MOTHER	2	2
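One way to rule out a malformed PED file is to confirm that every row has at least six non-empty, tab-separated columns. The traceback stops at `vals = [r[i] for r in rows]`, which is the kind of line that would raise `IndexError` if one row were shorter than the others, though that reading of vcf2db is an assumption. The helper name `find_malformed_ped_rows` and the tab-only splitting are illustrative choices, not part of vcf2db or bcbio, and the sample rows assume the file above is tab-delimited:

```python
def find_malformed_ped_rows(lines, expected_cols=6):
    """Return (line_number, field_count) for rows that look malformed.

    A row is flagged when it has fewer than `expected_cols` tab-separated
    fields or when any field is empty.
    """
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        row = raw.rstrip("\r\n")
        if not row.strip() or row.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = row.split("\t")
        if len(fields) < expected_cols or "" in fields:
            problems.append((lineno, len(fields)))
    return problems


# Rows mirroring the PED file above, assuming tab delimiters.
ped_rows = [
    "178F\t178F_FATHER\t0\t0\t1\t1",
    "178F\t178F_MOTHER\t0\t0\t2\t1",
    "178F\t178F_CHILD\t178F_FATHER\t178F_MOTHER\t2\t2",
]
print(find_malformed_ped_rows(ped_rows))
```

To check a file on disk, pass an open file handle instead of the list. Note that the command in the log hands vcf2db the `.ped` written under `work/gemini`, so that generated file is the one worth checking, not only the input PED.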

@kokyriakidis
Author

I got the same error when I did not specify a vcfanno annotation file, so that file is not the cause of the problem.

@kokyriakidis
Author

I tried running it WITHOUT a PED file and I got these errors:

[2019-10-05T12:04Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-decompose.vcf.gz
[2019-10-05T12:04Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno-combine.conf
[2019-10-05T12:04Z] =============================================
[2019-10-05T12:04Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-05T12:04Z] see: https://github.com/brentp/vcfanno
[2019-10-05T12:04Z] =============================================
[2019-10-05T12:04Z] vcfanno.go:115: found 77 sources from 10 files
[2019-10-05T12:04Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_genome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T12:04Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T12:04Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T12:04Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-05T12:04Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T12:04Z] vcfanno.go:194: Info Error: AF_popmax not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T12:05Z] vcfanno.go:194: Info Error: CLNSIG not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T12:16Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-05T12:16Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-05T12:16Z] vcfanno.go:248: annotated 148115 variants in 738.52 seconds (200.6 / second)
[2019-10-05T12:16Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz
[2019-10-05T12:16Z] GEMINI: create database with vcf2db
[2019-10-05T12:16Z] skipping 'MMQ' because it has Number=R
[2019-10-05T12:16Z] setting vcfanno_gnomad_ac to Type String because it has Number=.
[2019-10-05T12:16Z] setting vcfanno_gnomad_an to Type String because it has Number=.
[2019-10-05T12:16Z] setting vcfanno_gnomad_hom to Type String because it has Number=.
[2019-10-05T12:16Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '_mnt_36642bae-9ec9-4100-8...'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T12:16Z]   (util.ellipses_string(value),),
[2019-10-05T12:16Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '-9'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T12:16Z]   (util.ellipses_string(value),),
[2019-10-05T12:16Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '2'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T12:16Z]   (util.ellipses_string(value),),
[2019-10-05T12:16Z] bad record:
[2019-10-05T12:16Z] AC 3
[2019-10-05T12:16Z] AF 0.5
[2019-10-05T12:16Z] AN 6
[2019-10-05T12:16Z] BaseQRankSum 0.216999992728
[2019-10-05T12:16Z] CADD_phred 0.015
[2019-10-05T12:16Z] ClippingRankSum 0.799000024796
[2019-10-05T12:16Z] DB None
[2019-10-05T12:16Z] DECOMPOSED None
[2019-10-05T12:16Z] DP 47
[2019-10-05T12:16Z] ExcessHet 6.98969984055
[2019-10-05T12:16Z] FS 5.49700021744
[2019-10-05T12:16Z] Gerp_score -1.86
[2019-10-05T12:16Z] LEN None
[2019-10-05T12:16Z] MLEAC 3
[2019-10-05T12:16Z] MLEAF 0.5
[2019-10-05T12:16Z] MMQ (27, 27)
[2019-10-05T12:16Z] MQ 31.3999996185
[2019-10-05T12:16Z] MQ0 0
[2019-10-05T12:16Z] MQRankSum -1.1779999733
[2019-10-05T12:16Z] OLD_MULTIALLELIC None
[2019-10-05T12:16Z] OLD_VARIANT None
[2019-10-05T12:16Z] QD 7.78000020981
[2019-10-05T12:16Z] ReadPosRankSum -0.070000000298
[2019-10-05T12:16Z] Revel_score 0.012
[2019-10-05T12:16Z] SOR 2.3900001049
[2019-10-05T12:16Z] TYPE None
[2019-10-05T12:16Z] Vest3_score 0.125,0.052,0.121,0.06
[2019-10-05T12:16Z] aa_change None
[2019-10-05T12:16Z] aa_length None
[2019-10-05T12:16Z] aaf 0.5
[2019-10-05T12:16Z] ac 3
[2019-10-05T12:16Z] ac_adj_exac_afr None
[2019-10-05T12:16Z] ac_adj_exac_amr None
[2019-10-05T12:16Z] ac_adj_exac_eas None
[2019-10-05T12:16Z] ac_adj_exac_fin None
[2019-10-05T12:16Z] ac_adj_exac_nfe None
[2019-10-05T12:16Z] ac_adj_exac_oth None
[2019-10-05T12:16Z] ac_adj_exac_sas None
[2019-10-05T12:16Z] ac_exac_all None
[2019-10-05T12:16Z] af 0.5
[2019-10-05T12:16Z] af_adj_exac_afr -1.0
[2019-10-05T12:16Z] af_adj_exac_amr -1.0
[2019-10-05T12:16Z] af_adj_exac_eas -1.0
[2019-10-05T12:16Z] af_adj_exac_fin -1.0
[2019-10-05T12:16Z] af_adj_exac_nfe -1.0
[2019-10-05T12:16Z] af_adj_exac_oth -1.0
[2019-10-05T12:16Z] af_adj_exac_sas -1.0
[2019-10-05T12:16Z] af_esp_aa -1.0
[2019-10-05T12:16Z] af_esp_all -1.0
[2019-10-05T12:16Z] af_esp_ea -1.0
[2019-10-05T12:16Z] af_exac_all -1.0
[2019-10-05T12:16Z] alt A
[2019-10-05T12:16Z] an 6
[2019-10-05T12:16Z] an_adj_exac_afr -1.0
[2019-10-05T12:16Z] an_adj_exac_amr -1.0
[2019-10-05T12:16Z] an_adj_exac_eas -1.0
[2019-10-05T12:16Z] an_adj_exac_fin -1.0
[2019-10-05T12:16Z] an_adj_exac_nfe -1.0
[2019-10-05T12:16Z] an_adj_exac_oth -1.0
[2019-10-05T12:16Z] an_adj_exac_sas -1.0
[2019-10-05T12:16Z] an_exac_all -1.0
[2019-10-05T12:16Z] baseqranksum 0.216999992728
[2019-10-05T12:16Z] biotype None
[2019-10-05T12:16Z] cadd_phred 0.015
[2019-10-05T12:16Z] call_rate 1.0
[2019-10-05T12:16Z] chrom chr1
[2019-10-05T12:16Z] clinvar_pathogenic None
[2019-10-05T12:16Z] clippingranksum 0.799000024796
[2019-10-05T12:16Z] codon_change None
[2019-10-05T12:16Z] common_pathogenic False
[2019-10-05T12:16Z] db False
[2019-10-05T12:16Z] decomposed False
[2019-10-05T12:16Z] dp 47
[2019-10-05T12:16Z] ds False
[2019-10-05T12:16Z] effect_severity None
[2019-10-05T12:16Z] end 13263050
[2019-10-05T12:16Z] ensembl_gene_id None
[2019-10-05T12:16Z] excesshet 6.98969984055
[2019-10-05T12:16Z] exon None
[2019-10-05T12:16Z] filter None
[2019-10-05T12:16Z] fs 5.49700021744
[2019-10-05T12:16Z] gene None
[2019-10-05T12:16Z] gerp_score -1.86
[2019-10-05T12:16Z] gnomad_af -1.0
[2019-10-05T12:16Z] gnomad_af_afr -1.0
[2019-10-05T12:16Z] gnomad_af_amr -1.0
[2019-10-05T12:16Z] gnomad_af_asj -1.0
[2019-10-05T12:16Z] gnomad_af_eas -1.0
[2019-10-05T12:16Z] gnomad_af_fin -1.0
[2019-10-05T12:16Z] gnomad_af_nfe -1.0
[2019-10-05T12:16Z] gnomad_af_oth -1.0
[2019-10-05T12:16Z] gnomad_af_popmax -1.0
[2019-10-05T12:16Z] gnomad_af_sas -1.0
[2019-10-05T12:16Z] gt_alt_depths [binary data]
[2019-10-05T12:16Z] gt_alt_freqs [binary data]
[2019-10-05T12:16Z] gt_depths [binary data]
[2019-10-05T12:16Z] gt_phases [binary data]
[2019-10-05T12:16Z] gt_quals [binary data]
[2019-10-05T12:16Z] gt_ref_depths [binary data]
[2019-10-05T12:16Z] gt_types [binary data]
[2019-10-05T12:16Z] gts [binary data; genotypes G/A, G/A, G/A]
[2019-10-05T12:16Z] impact None
[2019-10-05T12:16Z] impact_severity None
[2019-10-05T12:16Z] impact_so None
[2019-10-05T12:16Z] is_canonical False
[2019-10-05T12:16Z] is_coding False
[2019-10-05T12:16Z] is_exonic False
[2019-10-05T12:16Z] is_lof False
[2019-10-05T12:16Z] is_splicing False
[2019-10-05T12:16Z] len None
[2019-10-05T12:16Z] max_aaf_all -1.0
[2019-10-05T12:16Z] mleac 3
[2019-10-05T12:16Z] mleaf 0.5
[2019-10-05T12:16Z] mmq (27, 27)
[2019-10-05T12:16Z] mq 31.3999996185
[2019-10-05T12:16Z] mq0 0
[2019-10-05T12:16Z] mqranksum -1.1779999733
[2019-10-05T12:16Z] num_exac_Het None
[2019-10-05T12:16Z] num_exac_Hom None
[2019-10-05T12:16Z] num_exac_het None
[2019-10-05T12:16Z] num_exac_hom None
[2019-10-05T12:16Z] num_het 3
[2019-10-05T12:16Z] num_hom_alt 0
[2019-10-05T12:16Z] num_hom_ref 0
[2019-10-05T12:16Z] num_unknown 0
[2019-10-05T12:16Z] old_multiallelic None
[2019-10-05T12:16Z] old_variant None
[2019-10-05T12:16Z] phastCons20way_mammalian 0.000000
[2019-10-05T12:16Z] phastcons20way_mammalian 0.000000
[2019-10-05T12:16Z] phyloP20way_mammalian -2.018000
[2019-10-05T12:16Z] phylop20way_mammalian -2.018000
[2019-10-05T12:16Z] polyphen_pred None
[2019-10-05T12:16Z] polyphen_score None
[2019-10-05T12:16Z] qd 7.78000020981
[2019-10-05T12:16Z] qual 357.899993896
[2019-10-05T12:16Z] readposranksum -0.070000000298
[2019-10-05T12:16Z] ref G
[2019-10-05T12:16Z] revel_score 0.012
[2019-10-05T12:16Z] rs_ids rs2359265
[2019-10-05T12:16Z] sift_pred None
[2019-10-05T12:16Z] sift_score None
[2019-10-05T12:16Z] so None
[2019-10-05T12:16Z] sor 2.3900001049
[2019-10-05T12:16Z] start 13263049
[2019-10-05T12:16Z] sub_type ts
[2019-10-05T12:16Z] top_consequence None
[2019-10-05T12:16Z] transcript None
[2019-10-05T12:16Z] type snp
[2019-10-05T12:16Z] variant_id 1558
[2019-10-05T12:16Z] vcf_id rs2359265
[2019-10-05T12:16Z] vcfanno_gnomad_ac 550
[2019-10-05T12:16Z] vcfanno_gnomad_ac_es 55
[2019-10-05T12:16Z] vcfanno_gnomad_ac_gs 495
[2019-10-05T12:16Z] vcfanno_gnomad_af 0.0150499995798
[2019-10-05T12:16Z] vcfanno_gnomad_af_es 0.00169539998751
[2019-10-05T12:16Z] vcfanno_gnomad_af_gs 0.188199996948
[2019-10-05T12:16Z] vcfanno_gnomad_af_popmax 0.188199996948
[2019-10-05T12:16Z] vcfanno_gnomad_an 36544
[2019-10-05T12:16Z] vcfanno_gnomad_an_es 36544
[2019-10-05T12:16Z] vcfanno_gnomad_an_gs (4698, 3254)
[2019-10-05T12:16Z] vcfanno_gnomad_hom 29
[2019-10-05T12:16Z] vcfanno_gnomad_hom_es 23
[2019-10-05T12:16Z] vcfanno_gnomad_hom_gs 6
[2019-10-05T12:16Z] vest3_score 0.125,0.052,0.121,0.06
[2019-10-05T12:16Z] Traceback (most recent call last):
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-05T12:16Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
[2019-10-05T12:16Z]     self.load()
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
[2019-10-05T12:16Z]     i = self._load(self.cache, create=True, start=1)
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
[2019-10-05T12:16Z]     self.insert(variants, expanded, keys, i, create=create)
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
[2019-10-05T12:16Z]     vilengths, variant_impacts)
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
[2019-10-05T12:16Z]     self.__insert(v_objs, self.metadata.tables['variants'].insert())
[2019-10-05T12:16Z]   File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
[2019-10-05T12:16Z]     raise e
[2019-10-05T12:16Z] sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 117 - probably unsupported type.
[2019-10-05T12:16Z] [SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, cadd_phred, clippingranksum, db, decomposed, dp, ds, excesshet, fs, gerp_score, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, revel_score, sor, vest3_score, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_pathogenic, common_pathogenic, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, max_aaf_all, num_exac_het, num_exac_hom, phastcons20way_mammalian, phylop20way_mammalian, rs_ids, vcfanno_gnomad_ac, vcfanno_gnomad_ac_es, vcfanno_gnomad_ac_gs, vcfanno_gnomad_af, vcfanno_gnomad_af_es, vcfanno_gnomad_af_gs, vcfanno_gnomad_af_popmax, vcfanno_gnomad_an, vcfanno_gnomad_an_es, vcfanno_gnomad_an_gs, vcfanno_gnomad_hom, vcfanno_gnomad_hom_es, vcfanno_gnomad_hom_gs, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[2019-10-05T12:16Z] [parameters: (1558, u'chr1', 13263049, 13263050, u'rs2359265', u'G', u'A', 357.8999938964844, None, 'snp', 'ts', 1.0, 0, 3, 0, 0, 0.5, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 3, 0.5, 6, 0.21699999272823334, u'0.015', 0.7990000247955322, 0, 0, 47, 0, 6.989699840545654, 5.497000217437744, u'-1.86', None, 3, 0.5, 31.399999618530273, 0, -1.1779999732971191, None, 'None', 7.78000020980835, -0.07000000029802322, u'0.012', 2.390000104904175, u'0.125,0.052,0.121,0.06', None, None, None, None, None, None, None, None, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, None, 0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, None, None, u'0.000000', u'-2.018000', u'rs2359265', 550, 55, 495, 0.015049999579787254, 0.0016953999875113368, 0.1881999969482422, 0.1881999969482422, 36544, 36544, (4698, 3254), 29, 23, 6, <read-only buffer for 0x7f1e05aa7110, size -1, offset 0 at 0x7f1e035287f0>, <read-only buffer for 0x7f1e05aa70d8, size -1, offset 0 at 0x7f1e035287b0>, <read-only buffer for 0x7f1e05b25c30, size -1, offset 0 at 0x7f1e03528770>, <read-only buffer for 0x7f1e05aa7148, size -1, offset 0 at 0x7f1e03528670>, <read-only buffer for 0x7f1e05aa7180, size -1, offset 0 at 0x7f1e03528730>, <read-only buffer for 0x7f1e05aa71b8, size -1, offset 0 at 0x7f1e035286f0>, <read-only buffer for 0x7f1e05aa71f0, size -1, offset 0 at 0x7f1e035286b0>, <read-only buffer for 0x7f1e05b249b0, size -1, offset 0 at 0x7f1e03528630>)]
[2019-10-05T12:16Z] (Background on this error at: http://sqlalche.me/e/rvf5)
[2019-10-05T12:16Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpty6jb_71/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
gnomad_af -1.0
gnomad_af_afr -1.0
gnomad_af_amr -1.0
gnomad_af_asj -1.0
gnomad_af_eas -1.0
gnomad_af_fin -1.0
gnomad_af_nfe -1.0
gnomad_af_oth -1.0
gnomad_af_popmax -1.0
gnomad_af_sas -1.0
gt_alt_depths <binary data>
gt_alt_freqs <binary data>
gt_depths <binary data>
gt_phases <binary data>
gt_quals <binary data>
gt_ref_depths <binary data>
gt_types <binary data>
gts <binary data: G/A G/A G/A>
impact None
impact_severity None
impact_so None
is_canonical False
is_coding False
is_exonic False
is_lof False
is_splicing False
len None
max_aaf_all -1.0
mleac 3
mleaf 0.5
mmq (27, 27)
mq 31.3999996185
mq0 0
mqranksum -1.1779999733
num_exac_Het None
num_exac_Hom None
num_exac_het None
num_exac_hom None
num_het 3
num_hom_alt 0
num_hom_ref 0
num_unknown 0
old_multiallelic None
old_variant None
phastCons20way_mammalian 0.000000
phastcons20way_mammalian 0.000000
phyloP20way_mammalian -2.018000
phylop20way_mammalian -2.018000
polyphen_pred None
polyphen_score None
qd 7.78000020981
qual 357.899993896
readposranksum -0.070000000298
ref G
revel_score 0.012
rs_ids rs2359265
sift_pred None
sift_score None
so None
sor 2.3900001049
start 13263049
sub_type ts
top_consequence None
transcript None
type snp
variant_id 1558
vcf_id rs2359265
vcfanno_gnomad_ac 550
vcfanno_gnomad_ac_es 55
vcfanno_gnomad_ac_gs 495
vcfanno_gnomad_af 0.0150499995798
vcfanno_gnomad_af_es 0.00169539998751
vcfanno_gnomad_af_gs 0.188199996948
vcfanno_gnomad_af_popmax 0.188199996948
vcfanno_gnomad_an 36544
vcfanno_gnomad_an_es 36544
vcfanno_gnomad_an_gs (4698, 3254)
vcfanno_gnomad_hom 29
vcfanno_gnomad_hom_es 23
vcfanno_gnomad_hom_gs 6
vest3_score 0.125,0.052,0.121,0.06
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
    self.load()
  File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
    i = self._load(self.cache, create=True, start=1)
  File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
    self.insert(variants, expanded, keys, i, create=create)
  File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
    vilengths, variant_impacts)
  File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
    self.__insert(v_objs, self.metadata.tables['variants'].insert())
  File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
    raise e
sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 117 - probably unsupported type.
[SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, cadd_phred, clippingranksum, db, decomposed, dp, ds, excesshet, fs, gerp_score, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, revel_score, sor, vest3_score, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_pathogenic, common_pathogenic, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, max_aaf_all, num_exac_het, num_exac_hom, phastcons20way_mammalian, phylop20way_mammalian, rs_ids, vcfanno_gnomad_ac, vcfanno_gnomad_ac_es, vcfanno_gnomad_ac_gs, vcfanno_gnomad_af, vcfanno_gnomad_af_es, vcfanno_gnomad_af_gs, vcfanno_gnomad_af_popmax, vcfanno_gnomad_an, vcfanno_gnomad_an_es, vcfanno_gnomad_an_gs, vcfanno_gnomad_hom, vcfanno_gnomad_hom_es, vcfanno_gnomad_hom_gs, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (1558, u'chr1', 13263049, 13263050, u'rs2359265', u'G', u'A', 357.8999938964844, None, 'snp', 'ts', 1.0, 0, 3, 0, 0, 0.5, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 3, 0.5, 6, 0.21699999272823334, u'0.015', 0.7990000247955322, 0, 0, 47, 0, 6.989699840545654, 5.497000217437744, u'-1.86', None, 3, 0.5, 31.399999618530273, 0, -1.1779999732971191, None, 'None', 7.78000020980835, -0.07000000029802322, u'0.012', 2.390000104904175, u'0.125,0.052,0.121,0.06', None, None, None, None, None, None, None, None, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, None, 0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, None, None, u'0.000000', u'-2.018000', u'rs2359265', 550, 55, 495, 0.015049999579787254, 0.0016953999875113368, 0.1881999969482422, 0.1881999969482422, 36544, 36544, (4698, 3254), 29, 23, 6, <read-only buffer for 0x7f1e05aa7110, size -1, offset 0 at 0x7f1e035287f0>, <read-only buffer for 0x7f1e05aa70d8, size -1, offset 0 at 0x7f1e035287b0>, <read-only buffer for 0x7f1e05b25c30, size -1, offset 0 at 0x7f1e03528770>, <read-only buffer for 0x7f1e05aa7148, size -1, offset 0 at 0x7f1e03528670>, <read-only buffer for 0x7f1e05aa7180, size -1, offset 0 at 0x7f1e03528730>, <read-only buffer for 0x7f1e05aa71b8, size -1, offset 0 at 0x7f1e035286f0>, <read-only buffer for 0x7f1e05aa71f0, size -1, offset 0 at 0x7f1e035286b0>, <read-only buffer for 0x7f1e05b249b0, size -1, offset 0 at 0x7f1e03528630>)]
(Background on this error at: http://sqlalche.me/e/rvf5)
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-cre.vcfanno.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpty6jb_71/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
' returned non-zero exit status 1.

@roryk
Collaborator

roryk commented Oct 5, 2019

Thanks for investigating, sorry for being slow about getting back to you.

I think the original problem is with the format of your PED file. I'm guessing that the sample names don't match what is in the PED file. Do your sample names get an X added in front of them during the bcbio run? We try to make everything compatible downstream with R, since lots of folks use it for analysis, and R won't allow you to have column names that start with a number. I'm guessing that is the problem with the PED file.
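
To make the naming concern concrete, here is a minimal sketch of that check. The helper names (`r_safe_name`, `ped_mismatches`) and the PED rows are hypothetical, and the "prefix with X" rule only approximates the renaming described above; it is not bcbio's actual code.

```python
def r_safe_name(name):
    """Approximate R-style renaming: prefix names starting with a digit with 'X'.

    Assumption: this mimics the behaviour described in the comment above;
    it is not taken from bcbio.
    """
    return "X" + name if name[:1].isdigit() else name


def ped_mismatches(ped_lines, vcf_samples):
    """Return PED individual IDs (second column) that are missing from the VCF samples."""
    ped_ids = []
    for line in ped_lines:
        # Skip blank lines and header/comment lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        ped_ids.append(fields[1])
    known = set(vcf_samples)
    return [sample_id for sample_id in ped_ids if sample_id not in known]


# Hypothetical PED content for a trio whose sample names start with a digit
ped = [
    "#family_id\tname\tpaternal_id\tmaternal_id\tsex\tphenotype",
    "178F\t178FCHILD\t178FFATHER\t178FMOTHER\t2\t2",
    "178F\t178FFATHER\t0\t0\t1\t1",
    "178F\t178FMOTHER\t0\t0\t2\t1",
]

# Sample names as they would look if the renaming had been applied
vcf_samples = [r_safe_name(s) for s in ["178FCHILD", "178FFATHER", "178FMOTHER"]]

print(vcf_samples)
print(ped_mismatches(ped, vcf_samples))
```

If the second list is non-empty, the PED IDs and the sample names in the VCF header have drifted apart, and one of the two would need to be renamed so they agree.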

@roryk
Collaborator

roryk commented Oct 5, 2019

Was the last run done with your custom vcfanno file but without a PED file? If so, I think your vcfanno file is probably not doing the right thing. Do you need to use a custom file, or can you just use the annotations bcbio provides?
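
For what it's worth, the traceback seems consistent with that guess. Counting from zero, parameter 117 of the INSERT lines up with `vcfanno_gnomad_an_gs`, whose value in the parameter dump is the tuple `(4698, 3254)`. SQLite can only bind scalar values, so a multi-valued annotation ends up as an unsupported type. Below is a minimal sketch of the same kind of failure, using an in-memory table with made-up column names (not the vcf2db schema). The exact exception class and message depend on the Python version, so the sketch catches the common base class `sqlite3.Error`.

```python
import sqlite3

# Illustrative table only; the real vcf2db schema has many more columns
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variants (variant_id INTEGER, an_gs INTEGER)")


def try_insert(value):
    """Try to insert one row; return True on success, False if binding fails."""
    try:
        conn.execute("INSERT INTO variants VALUES (?, ?)", (1558, value))
        return True
    except sqlite3.Error as exc:
        print("insert failed:", type(exc).__name__, exc)
        return False


ok_scalar = try_insert(36544)        # a single integer binds fine
ok_tuple = try_insert((4698, 3254))  # a tuple is not a type SQLite can bind
print(ok_scalar, ok_tuple)
```

If this is what is happening, the annotation feeding that field would need to reduce to a single value per variant before the database is built.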

@roryk roryk reopened this Oct 5, 2019
@kokyriakidis
Author

kokyriakidis commented Oct 5, 2019

The previous run was with a vcfanno file and without a PED file.

I ran it again without a PED file or a vcfanno file, using the following template:

details:
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
  analysis: variant2
  description: 178FCHILD
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_CHILD_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    sex: female
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
  analysis: variant2
  description: 178FFATHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_FATHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    sex: male
- algorithm:
    aligner: bwa
    effects: vep
    effects_transcripts: all
    mark_duplicates: true
    realign: false
    recalibrate: false
    save_diskspace: true
    tools_on:
    - gemini
    - svplots
    - qualimap
    - vep_splicesite_annotations
    - noalt_calling
    variantcaller:
    - gatk-haplotype
  analysis: variant2
  description: 178FMOTHER
  files:
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_1.fq.gz
  - /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/input/178F_MOTHER_2.fq.gz
  genome_build: hg38
  metadata:
    batch: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F
    sex: female
fc_name: 178F
upload:
  dir: ../final

I get these errors:

[2019-10-05T15:00Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-decompose.vcf.gz
[2019-10-05T15:00Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini-combine.conf
[2019-10-05T15:00Z] =============================================
[2019-10-05T15:00Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-05T15:00Z] see: https://github.com/brentp/vcfanno
[2019-10-05T15:00Z] =============================================
[2019-10-05T15:00Z] vcfanno.go:115: found 61 sources from 5 files
[2019-10-05T15:00Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T15:00Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: af_adj_exac_sas not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: af_esp_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:00Z] vcfanno.go:194: Info Error: CLNDN not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T15:05Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-05T15:05Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-05T15:05Z] vcfanno.go:248: annotated 148115 variants in 274.66 seconds (539.3 / second)
[2019-10-05T15:05Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz
[2019-10-05T15:05Z] GEMINI: create database with vcf2db
[2019-10-05T15:05Z] skipping 'MMQ' because it has Number=R
[2019-10-05T15:05Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '_mnt_36642bae-9ec9-4100-8...'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T15:05Z]   (util.ellipses_string(value),),
[2019-10-05T15:05Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '-9'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T15:05Z]   (util.ellipses_string(value),),
[2019-10-05T15:05Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '2'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T15:05Z]   (util.ellipses_string(value),),
[2019-10-05T15:05Z] bad record:
[2019-10-05T15:05Z] AC 2
[2019-10-05T15:05Z] AF 0.333000004292
[2019-10-05T15:05Z] AN 6
[2019-10-05T15:05Z] BaseQRankSum -1.42299997807
[2019-10-05T15:05Z] ClippingRankSum 0.736999988556
[2019-10-05T15:05Z] DB None
[2019-10-05T15:05Z] DECOMPOSED None
[2019-10-05T15:05Z] DP 49
[2019-10-05T15:05Z] ExcessHet 3.97939991951
[2019-10-05T15:05Z] FS 0.0
[2019-10-05T15:05Z] LEN None
[2019-10-05T15:05Z] MLEAC 2
[2019-10-05T15:05Z] MLEAF 0.333000004292
[2019-10-05T15:05Z] MMQ (40, 40)
[2019-10-05T15:05Z] MQ 37.5800018311
[2019-10-05T15:05Z] MQ0 0
[2019-10-05T15:05Z] MQRankSum -2.37800002098
[2019-10-05T15:05Z] OLD_MULTIALLELIC None
[2019-10-05T15:05Z] OLD_VARIANT None
[2019-10-05T15:05Z] QD 7.92000007629
[2019-10-05T15:05Z] ReadPosRankSum -1.04200005531
[2019-10-05T15:05Z] SOR 0.675000011921
[2019-10-05T15:05Z] TYPE None
[2019-10-05T15:05Z] aa_change None
[2019-10-05T15:05Z] aa_length None
[2019-10-05T15:05Z] aaf 0.333333333333
[2019-10-05T15:05Z] ac 2
[2019-10-05T15:05Z] ac_adj_exac_afr 0.0
[2019-10-05T15:05Z] ac_adj_exac_amr 2.0
[2019-10-05T15:05Z] ac_adj_exac_eas 0.0
[2019-10-05T15:05Z] ac_adj_exac_fin 0.0
[2019-10-05T15:05Z] ac_adj_exac_nfe 0.0
[2019-10-05T15:05Z] ac_adj_exac_oth 0.0
[2019-10-05T15:05Z] ac_adj_exac_sas 0.0
[2019-10-05T15:05Z] ac_exac_all 2.0
[2019-10-05T15:05Z] af 0.333000004292
[2019-10-05T15:05Z] af_adj_exac_afr 0.0
[2019-10-05T15:05Z] af_adj_exac_amr 0.000295420002658
[2019-10-05T15:05Z] af_adj_exac_eas 0.0
[2019-10-05T15:05Z] af_adj_exac_fin 0.0
[2019-10-05T15:05Z] af_adj_exac_nfe 0.0
[2019-10-05T15:05Z] af_adj_exac_oth 0.0
[2019-10-05T15:05Z] af_adj_exac_sas 0.0
[2019-10-05T15:05Z] af_esp_aa -1.0
[2019-10-05T15:05Z] af_esp_all -1.0
[2019-10-05T15:05Z] af_esp_ea -1.0
[2019-10-05T15:05Z] af_exac_all 5.88199982303e-05
[2019-10-05T15:05Z] alt A
[2019-10-05T15:05Z] an 6
[2019-10-05T15:05Z] an_adj_exac_afr 1262.0
[2019-10-05T15:05Z] an_adj_exac_amr 6770.0
[2019-10-05T15:05Z] an_adj_exac_eas 2486.0
[2019-10-05T15:05Z] an_adj_exac_fin 1828.0
[2019-10-05T15:05Z] an_adj_exac_nfe 16896.0
[2019-10-05T15:05Z] an_adj_exac_oth 240.0
[2019-10-05T15:05Z] an_adj_exac_sas 4520.0
[2019-10-05T15:05Z] an_exac_all 34002.0
[2019-10-05T15:05Z] baseqranksum -1.42299997807
[2019-10-05T15:05Z] biotype None
[2019-10-05T15:05Z] call_rate 1.0
[2019-10-05T15:05Z] chrom chr1
[2019-10-05T15:05Z] clinvar_disease_name None
[2019-10-05T15:05Z] clinvar_sig None
[2019-10-05T15:05Z] clippingranksum 0.736999988556
[2019-10-05T15:05Z] codon_change None
[2019-10-05T15:05Z] common_pathogenic False
[2019-10-05T15:05Z] db False
[2019-10-05T15:05Z] decomposed False
[2019-10-05T15:05Z] dp 49
[2019-10-05T15:05Z] ds False
[2019-10-05T15:05Z] effect_severity None
[2019-10-05T15:05Z] end 146984765
[2019-10-05T15:05Z] ensembl_gene_id None
[2019-10-05T15:05Z] excesshet 3.97939991951
[2019-10-05T15:05Z] exon None
[2019-10-05T15:05Z] filter None
[2019-10-05T15:05Z] fs 0.0
[2019-10-05T15:05Z] gene None
[2019-10-05T15:05Z] gnomAD_AC 8
[2019-10-05T15:05Z] gnomAD_AF 0.000136870003189
[2019-10-05T15:05Z] gnomAD_AN (79506, 58450)
[2019-10-05T15:05Z] gnomad_ac 8
[2019-10-05T15:05Z] gnomad_af 0.000136870003189
[2019-10-05T15:05Z] gnomad_af_afr -1.0
[2019-10-05T15:05Z] gnomad_af_amr -1.0
[2019-10-05T15:05Z] gnomad_af_asj -1.0
[2019-10-05T15:05Z] gnomad_af_eas -1.0
[2019-10-05T15:05Z] gnomad_af_fin -1.0
[2019-10-05T15:05Z] gnomad_af_nfe -1.0
[2019-10-05T15:05Z] gnomad_af_oth -1.0
[2019-10-05T15:05Z] gnomad_af_popmax -1.0
[2019-10-05T15:05Z] gnomad_af_sas -1.0
[2019-10-05T15:05Z] gnomad_an (79506, 58450)
[2019-10-05T15:05Z] gt_alt_depths i
                                   ,
                                    �
[2019-10-05T15:05Z] gt_alt_freqs d�U�D�?��q��q�?
[2019-10-05T15:05Z] gt_depths i
                               ,$�	
[2019-10-05T15:05Z] gt_phases ?�
[2019-10-05T15:05Z] gt_quals f
                              ,�B@A0A
[2019-10-05T15:05Z] gt_ref_depths i
                                   ,��
[2019-10-05T15:05Z] gt_types i
                              ,��
[2019-10-05T15:05Z] gts S
                         (G/AG/GG/A
[2019-10-05T15:05Z] impact None
[2019-10-05T15:05Z] impact_severity None
[2019-10-05T15:05Z] impact_so None
[2019-10-05T15:05Z] is_canonical False
[2019-10-05T15:05Z] is_coding False
[2019-10-05T15:05Z] is_exonic False
[2019-10-05T15:05Z] is_lof False
[2019-10-05T15:05Z] is_splicing False
[2019-10-05T15:05Z] len None
[2019-10-05T15:05Z] max_aaf_all 0.000295420002658
[2019-10-05T15:05Z] mleac 2
[2019-10-05T15:05Z] mleaf 0.333000004292
[2019-10-05T15:05Z] mmq (40, 40)
[2019-10-05T15:05Z] mq 37.5800018311
[2019-10-05T15:05Z] mq0 0
[2019-10-05T15:05Z] mqranksum -2.37800002098
[2019-10-05T15:05Z] num_exac_Het 2.0
[2019-10-05T15:05Z] num_exac_Hom 0.0
[2019-10-05T15:05Z] num_exac_het 2.0
[2019-10-05T15:05Z] num_exac_hom 0.0
[2019-10-05T15:05Z] num_het 2
[2019-10-05T15:05Z] num_hom_alt 0
[2019-10-05T15:05Z] num_hom_ref 1
[2019-10-05T15:05Z] num_unknown 0
[2019-10-05T15:05Z] old_multiallelic None
[2019-10-05T15:05Z] old_variant None
[2019-10-05T15:05Z] polyphen_pred None
[2019-10-05T15:05Z] polyphen_score None
[2019-10-05T15:05Z] qd 7.92000007629
[2019-10-05T15:05Z] qual 285.200012207
[2019-10-05T15:05Z] readposranksum -1.04200005531
[2019-10-05T15:05Z] ref G
[2019-10-05T15:05Z] rs_ids rs1437329596
[2019-10-05T15:05Z] sift_pred None
[2019-10-05T15:05Z] sift_score None
[2019-10-05T15:05Z] so None
[2019-10-05T15:05Z] sor 0.675000011921
[2019-10-05T15:05Z] start 146984764
[2019-10-05T15:05Z] sub_type ts
[2019-10-05T15:05Z] top_consequence None
[2019-10-05T15:05Z] transcript None
[2019-10-05T15:05Z] type snp
[2019-10-05T15:05Z] variant_id 7671
[2019-10-05T15:05Z] vcf_id rs1437329596
[2019-10-05T15:05Z] Traceback (most recent call last):
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-05T15:05Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
[2019-10-05T15:05Z]     self.load()
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
[2019-10-05T15:05Z]     i = self._load(self.cache, create=True, start=1)
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
[2019-10-05T15:05Z]     self.insert(variants, expanded, keys, i, create=create)
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
[2019-10-05T15:05Z]     vilengths, variant_impacts)
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
[2019-10-05T15:05Z]     self.__insert(v_objs, self.metadata.tables['variants'].insert())
[2019-10-05T15:05Z]   File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
[2019-10-05T15:05Z]     raise e
[2019-10-05T15:05Z] sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 100 - probably unsupported type.
[2019-10-05T15:05Z] [SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, clippingranksum, db, decomposed, dp, ds, excesshet, fs, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, sor, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_disease_name, clinvar_sig, common_pathogenic, gnomad_ac, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, gnomad_an, max_aaf_all, num_exac_het, num_exac_hom, rs_ids, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[2019-10-05T15:05Z] [parameters: (7671, u'chr1', 146984764, 146984765, u'rs1437329596', u'G', u'A', 285.20001220703125, None, 'snp', 'ts', 1.0, 1, 2, 0, 0, 0.3333333333333333, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 2, 0.3330000042915344, 6, -1.4229999780654907, 0.7369999885559082, 0, 0, 49, 0, 3.9793999195098877, 0.0, None, 2, 0.3330000042915344, 37.58000183105469, 0, -2.378000020980835, None, 'None', 7.920000076293945, -1.0420000553131104, 0.675000011920929, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0002954200026579201, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, -1.0, -1.0, 5.881999823031947e-05, 1262.0, 6770.0, 2486.0, 1828.0, 16896.0, 240.0, 4520.0, 34002.0, None, None, 0, 8, 0.00013687000318896025, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, (79506, 58450), 0.0002954200026579201, 2.0, 0.0, u'rs1437329596', <read-only buffer for 0x7f95d163f148, size -1, offset 0 at 0x7f95d0b01530>, <read-only buffer for 0x7f95d163f110, size -1, offset 0 at 0x7f95d0b014f0>, <read-only buffer for 0x7f95d163de70, size -1, offset 0 at 0x7f95d0b01130>, <read-only buffer for 0x7f95d163f180, size -1, offset 0 at 0x7f95d0b010f0>, <read-only buffer for 0x7f95d163f1b8, size -1, offset 0 at 0x7f95d0b010b0>, <read-only buffer for 0x7f95d163f1f0, size -1, offset 0 at 0x7f95d0b01070>, <read-only buffer for 0x7f95d163f228, size -1, offset 0 at 0x7f95d0b01030>, <read-only buffer for 0x7f95d163e4b0, size -1, offset 0 at 0x7f95d0b01770>)]
[2019-10-05T15:05Z] (Background on this error at: http://sqlalche.me/e/rvf5)
[2019-10-05T15:05Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpv18n_zur/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
codon_change None
common_pathogenic False
db False
decomposed False
dp 49
ds False
effect_severity None
end 146984765
ensembl_gene_id None
excesshet 3.97939991951
exon None
filter None
fs 0.0
gene None
gnomAD_AC 8
gnomAD_AF 0.000136870003189
gnomAD_AN (79506, 58450)
gnomad_ac 8
gnomad_af 0.000136870003189
gnomad_af_afr -1.0
gnomad_af_amr -1.0
gnomad_af_asj -1.0
gnomad_af_eas -1.0
gnomad_af_fin -1.0
gnomad_af_nfe -1.0
gnomad_af_oth -1.0
gnomad_af_popmax -1.0
gnomad_af_sas -1.0
gnomad_an (79506, 58450)
gt_alt_depths i
               ,
                �
gt_alt_freqs d�U�D�?��q��q�?
gt_depths i
           ,$�	
gt_phases ?�
gt_quals f
          ,�B@A0A
gt_ref_depths i
               ,��
gt_types i
          ,��
gts S
     (G/AG/GG/A
impact None
impact_severity None
impact_so None
is_canonical False
is_coding False
is_exonic False
is_lof False
is_splicing False
len None
max_aaf_all 0.000295420002658
mleac 2
mleaf 0.333000004292
mmq (40, 40)
mq 37.5800018311
mq0 0
mqranksum -2.37800002098
num_exac_Het 2.0
num_exac_Hom 0.0
num_exac_het 2.0
num_exac_hom 0.0
num_het 2
num_hom_alt 0
num_hom_ref 1
num_unknown 0
old_multiallelic None
old_variant None
polyphen_pred None
polyphen_score None
qd 7.92000007629
qual 285.200012207
readposranksum -1.04200005531
ref G
rs_ids rs1437329596
sift_pred None
sift_score None
so None
sor 0.675000011921
start 146984764
sub_type ts
top_consequence None
transcript None
type snp
variant_id 7671
vcf_id rs1437329596
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
    self.load()
  File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
    i = self._load(self.cache, create=True, start=1)
  File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
    self.insert(variants, expanded, keys, i, create=create)
  File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
    vilengths, variant_impacts)
  File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
    self.__insert(v_objs, self.metadata.tables['variants'].insert())
  File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
    raise e
sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 100 - probably unsupported type.
[SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, clippingranksum, db, decomposed, dp, ds, excesshet, fs, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, sor, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_disease_name, clinvar_sig, common_pathogenic, gnomad_ac, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, gnomad_an, max_aaf_all, num_exac_het, num_exac_hom, rs_ids, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (7671, u'chr1', 146984764, 146984765, u'rs1437329596', u'G', u'A', 285.20001220703125, None, 'snp', 'ts', 1.0, 1, 2, 0, 0, 0.3333333333333333, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 2, 0.3330000042915344, 6, -1.4229999780654907, 0.7369999885559082, 0, 0, 49, 0, 3.9793999195098877, 0.0, None, 2, 0.3330000042915344, 37.58000183105469, 0, -2.378000020980835, None, 'None', 7.920000076293945, -1.0420000553131104, 0.675000011920929, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0002954200026579201, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, -1.0, -1.0, 5.881999823031947e-05, 1262.0, 6770.0, 2486.0, 1828.0, 16896.0, 240.0, 4520.0, 34002.0, None, None, 0, 8, 0.00013687000318896025, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, (79506, 58450), 0.0002954200026579201, 2.0, 0.0, u'rs1437329596', <read-only buffer for 0x7f95d163f148, size -1, offset 0 at 0x7f95d0b01530>, <read-only buffer for 0x7f95d163f110, size -1, offset 0 at 0x7f95d0b014f0>, <read-only buffer for 0x7f95d163de70, size -1, offset 0 at 0x7f95d0b01130>, <read-only buffer for 0x7f95d163f180, size -1, offset 0 at 0x7f95d0b010f0>, <read-only buffer for 0x7f95d163f1b8, size -1, offset 0 at 0x7f95d0b010b0>, <read-only buffer for 0x7f95d163f1f0, size -1, offset 0 at 0x7f95d0b01070>, <read-only buffer for 0x7f95d163f228, size -1, offset 0 at 0x7f95d0b01030>, <read-only buffer for 0x7f95d163e4b0, size -1, offset 0 at 0x7f95d0b01770>)]
(Background on this error at: http://sqlalche.me/e/rvf5)
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/bcbiotx/tmpv18n_zur/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype.db
codon_change None
common_pathogenic False
db False
decomposed False
dp 49
ds False
effect_severity None
end 146984765
ensembl_gene_id None
excesshet 3.97939991951
exon None
filter None
fs 0.0
gene None
gnomAD_AC 8
gnomAD_AF 0.000136870003189
gnomAD_AN (79506, 58450)
gnomad_ac 8
gnomad_af 0.000136870003189
gnomad_af_afr -1.0
gnomad_af_amr -1.0
gnomad_af_asj -1.0
gnomad_af_eas -1.0
gnomad_af_fin -1.0
gnomad_af_nfe -1.0
gnomad_af_oth -1.0
gnomad_af_popmax -1.0
gnomad_af_sas -1.0
gnomad_an (79506, 58450)
gt_alt_depths i
               ,
                �
gt_alt_freqs d�U�D�?��q��q�?
gt_depths i
           ,$�	
gt_phases ?�
gt_quals f
          ,�B@A0A
gt_ref_depths i
               ,��
gt_types i
          ,��
gts S
     (G/AG/GG/A
impact None
impact_severity None
impact_so None
is_canonical False
is_coding False
is_exonic False
is_lof False
is_splicing False
len None
max_aaf_all 0.000295420002658
mleac 2
mleaf 0.333000004292
mmq (40, 40)
mq 37.5800018311
mq0 0
mqranksum -2.37800002098
num_exac_Het 2.0
num_exac_Hom 0.0
num_exac_het 2.0
num_exac_hom 0.0
num_het 2
num_hom_alt 0
num_hom_ref 1
num_unknown 0
old_multiallelic None
old_variant None
polyphen_pred None
polyphen_score None
qd 7.92000007629
qual 285.200012207
readposranksum -1.04200005531
ref G
rs_ids rs1437329596
sift_pred None
sift_score None
so None
sor 0.675000011921
start 146984764
sub_type ts
top_consequence None
transcript None
type snp
variant_id 7671
vcf_id rs1437329596
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
    self.load()
  File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
    i = self._load(self.cache, create=True, start=1)
  File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
    self.insert(variants, expanded, keys, i, create=create)
  File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
    vilengths, variant_impacts)
  File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
    self.__insert(v_objs, self.metadata.tables['variants'].insert())
  File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
    raise e
sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 100 - probably unsupported type.
[SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, clippingranksum, db, decomposed, dp, ds, excesshet, fs, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, sor, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_disease_name, clinvar_sig, common_pathogenic, gnomad_ac, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, gnomad_an, max_aaf_all, num_exac_het, num_exac_hom, rs_ids, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (7671, u'chr1', 146984764, 146984765, u'rs1437329596', u'G', u'A', 285.20001220703125, None, 'snp', 'ts', 1.0, 1, 2, 0, 0, 0.3333333333333333, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 2, 0.3330000042915344, 6, -1.4229999780654907, 0.7369999885559082, 0, 0, 49, 0, 3.9793999195098877, 0.0, None, 2, 0.3330000042915344, 37.58000183105469, 0, -2.378000020980835, None, 'None', 7.920000076293945, -1.0420000553131104, 0.675000011920929, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0002954200026579201, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, -1.0, -1.0, 5.881999823031947e-05, 1262.0, 6770.0, 2486.0, 1828.0, 16896.0, 240.0, 4520.0, 34002.0, None, None, 0, 8, 0.00013687000318896025, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, (79506, 58450), 0.0002954200026579201, 2.0, 0.0, u'rs1437329596', <read-only buffer for 0x7f95d163f148, size -1, offset 0 at 0x7f95d0b01530>, <read-only buffer for 0x7f95d163f110, size -1, offset 0 at 0x7f95d0b014f0>, <read-only buffer for 0x7f95d163de70, size -1, offset 0 at 0x7f95d0b01130>, <read-only buffer for 0x7f95d163f180, size -1, offset 0 at 0x7f95d0b010f0>, <read-only buffer for 0x7f95d163f1b8, size -1, offset 0 at 0x7f95d0b010b0>, <read-only buffer for 0x7f95d163f1f0, size -1, offset 0 at 0x7f95d0b01070>, <read-only buffer for 0x7f95d163f228, size -1, offset 0 at 0x7f95d0b01030>, <read-only buffer for 0x7f95d163e4b0, size -1, offset 0 at 0x7f95d0b01770>)]
(Background on this error at: http://sqlalche.me/e/rvf5)
' returned non-zero exit status 1.
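
A note for anyone reading this log: in the parameter dump above, the value at zero-based position 100 lines up with the `gnomad_an` column and is a tuple, `(79506, 58450)`, rather than a single number. The standard-library `sqlite3` module cannot bind a Python tuple, which would be consistent with an "Error binding parameter ..." failure. Below is a minimal sketch, not part of vcf2db, of how one could check which values in a row `sqlite3` refuses to bind. The helper name `find_unbindable` and the three-column excerpt are made up for illustration; the values are copied from the log.

```python
import sqlite3


def find_unbindable(columns, values):
    """Return (index, column, value) for every value sqlite3 cannot bind.

    Hypothetical helper for illustration only; it is not part of vcf2db.
    """
    conn = sqlite3.connect(":memory:")
    bad = []
    for i, (col, val) in enumerate(zip(columns, values)):
        try:
            # Binding happens here; unsupported types raise a sqlite3.Error
            # subclass (the exact subclass differs between Python versions).
            conn.execute("SELECT ?", (val,))
        except sqlite3.Error:
            bad.append((i, col, val))
    conn.close()
    return bad


# Three-column excerpt of the failing row, values copied from the log above.
columns = ["gnomad_af_sas", "gnomad_an", "max_aaf_all"]
values = [-1.0, (79506, 58450), 0.0002954200026579201]

for index, column, value in find_unbindable(columns, values):
    print(index, column, repr(value))
```

Running the same check over the full column and parameter lists would show whether `gnomad_an` is the only offending field in this record.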

@roryk
Collaborator

roryk commented Oct 5, 2019

Thanks, sorry for all the back and forth. Could you send me a snippet of the VCF file that triggers this error so I can take a look? It looks like you might be able to subset it down to just rs1437329596 and still reproduce the error. If that wasn't the first variant in the file, it would also help to include a few variants that don't fail.

/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/test.db

is the command that is failing. If you replace

 /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/178F/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_178F-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz

with your snippet, we should be able to see if the snippet fails, which would work as a test case.
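
One possible way to cut such a snippet, sketched with the Python standard library only: keep the VCF header, the record whose ID column contains the failing rsID, and a few records on either side of it. The function name `subset_vcf`, the `context` parameter, and the file names in the usage note are assumptions for illustration, not anything bcbio or vcf2db provides.

```python
import gzip
from collections import deque


def _open_text(path):
    """Open a plain or gzip/bgzip-compressed VCF for reading as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def subset_vcf(in_path, out_path, target_id, context=3):
    """Write the header, the target record, and up to `context` records
    before and after it. Returns the number of data records written.

    Hypothetical helper for building a small test case.
    """
    before = deque(maxlen=context)  # rolling window of preceding records
    after_left = 0                  # records still to copy after the target
    found = False
    written = 0
    with _open_text(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)     # keep all header lines
                continue
            fields = line.split("\t", 3)
            ids = fields[2].split(";") if len(fields) > 2 else []
            if target_id in ids:
                for prev in before:
                    dst.write(prev)
                    written += 1
                before.clear()
                dst.write(line)
                written += 1
                found = True
                after_left = context
            elif found and after_left > 0:
                dst.write(line)
                written += 1
                after_left -= 1
            elif found:
                break               # enough trailing context collected
            else:
                before.append(line)
    return written
```

For example, `subset_vcf("annotated-gemini.vcf.gz", "snippet.vcf", "rs1437329596")` (file names are placeholders) would write an uncompressed snippet. It may need to be compressed with bgzip and indexed with tabix before being passed to vcf2db.py in place of the original VCF.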

@kokyriakidis
Author

First I will try running it again with no vcfanno file or ped file, but with sample names and descriptions that do not start with a number, and I will update you.

@roryk
Collaborator

roryk commented Oct 5, 2019

Ok!

@kokyriakidis
Author

I got the same errors again when running with NO vcfanno file, NO ped file, and sample and description names NOT starting with numbers:

[2019-10-05T20:07Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-decompose.vcf.gz
[2019-10-05T20:07Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini-combine.conf
[2019-10-05T20:07Z] =============================================
[2019-10-05T20:07Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-05T20:07Z] see: https://github.com/brentp/vcfanno
[2019-10-05T20:07Z] =============================================
[2019-10-05T20:07Z] vcfanno.go:115: found 61 sources from 5 files
[2019-10-05T20:07Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-05T20:07Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: af_adj_exac_sas not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: af_esp_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:07Z] vcfanno.go:194: Info Error: CLNDN not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-05T20:11Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-05T20:11Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-05T20:11Z] vcfanno.go:248: annotated 148115 variants in 275.74 seconds (537.2 / second)
[2019-10-05T20:11Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz
[2019-10-05T20:11Z] GEMINI: create database with vcf2db
[2019-10-05T20:11Z] skipping 'MMQ' because it has Number=R
[2019-10-05T20:11Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '_mnt_36642bae-9ec9-4100-8...'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T20:11Z]   (util.ellipses_string(value),),
[2019-10-05T20:11Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '-9'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T20:11Z]   (util.ellipses_string(value),),
[2019-10-05T20:11Z] /RED/RESOURCES/bcbio/anaconda/envs/python2/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:269: SAWarning: Unicode type received non-unicode bind param value '2'. (this warning may be suppressed after 10 occurrences)
[2019-10-05T20:11Z]   (util.ellipses_string(value),),
[2019-10-05T20:11Z] bad record:
[2019-10-05T20:11Z] AC 2
[2019-10-05T20:11Z] AF 0.333000004292
[2019-10-05T20:11Z] AN 6
[2019-10-05T20:11Z] BaseQRankSum -1.42299997807
[2019-10-05T20:11Z] ClippingRankSum 0.736999988556
[2019-10-05T20:11Z] DB None
[2019-10-05T20:11Z] DECOMPOSED None
[2019-10-05T20:11Z] DP 49
[2019-10-05T20:11Z] ExcessHet 3.97939991951
[2019-10-05T20:11Z] FS 0.0
[2019-10-05T20:11Z] LEN None
[2019-10-05T20:11Z] MLEAC 2
[2019-10-05T20:11Z] MLEAF 0.333000004292
[2019-10-05T20:11Z] MMQ (40, 40)
[2019-10-05T20:11Z] MQ 37.5800018311
[2019-10-05T20:11Z] MQ0 0
[2019-10-05T20:11Z] MQRankSum -2.37800002098
[2019-10-05T20:11Z] OLD_MULTIALLELIC None
[2019-10-05T20:11Z] OLD_VARIANT None
[2019-10-05T20:11Z] QD 7.92000007629
[2019-10-05T20:11Z] ReadPosRankSum -1.04200005531
[2019-10-05T20:11Z] SOR 0.675000011921
[2019-10-05T20:11Z] TYPE None
[2019-10-05T20:11Z] aa_change None
[2019-10-05T20:11Z] aa_length None
[2019-10-05T20:11Z] aaf 0.333333333333
[2019-10-05T20:11Z] ac 2
[2019-10-05T20:11Z] ac_adj_exac_afr 0.0
[2019-10-05T20:11Z] ac_adj_exac_amr 2.0
[2019-10-05T20:11Z] ac_adj_exac_eas 0.0
[2019-10-05T20:11Z] ac_adj_exac_fin 0.0
[2019-10-05T20:11Z] ac_adj_exac_nfe 0.0
[2019-10-05T20:11Z] ac_adj_exac_oth 0.0
[2019-10-05T20:11Z] ac_adj_exac_sas 0.0
[2019-10-05T20:11Z] ac_exac_all 2.0
[2019-10-05T20:11Z] af 0.333000004292
[2019-10-05T20:11Z] af_adj_exac_afr 0.0
[2019-10-05T20:11Z] af_adj_exac_amr 0.000295420002658
[2019-10-05T20:11Z] af_adj_exac_eas 0.0
[2019-10-05T20:11Z] af_adj_exac_fin 0.0
[2019-10-05T20:11Z] af_adj_exac_nfe 0.0
[2019-10-05T20:11Z] af_adj_exac_oth 0.0
[2019-10-05T20:11Z] af_adj_exac_sas 0.0
[2019-10-05T20:11Z] af_esp_aa -1.0
[2019-10-05T20:11Z] af_esp_all -1.0
[2019-10-05T20:11Z] af_esp_ea -1.0
[2019-10-05T20:11Z] af_exac_all 5.88199982303e-05
[2019-10-05T20:11Z] alt A
[2019-10-05T20:11Z] an 6
[2019-10-05T20:11Z] an_adj_exac_afr 1262.0
[2019-10-05T20:11Z] an_adj_exac_amr 6770.0
[2019-10-05T20:11Z] an_adj_exac_eas 2486.0
[2019-10-05T20:11Z] an_adj_exac_fin 1828.0
[2019-10-05T20:11Z] an_adj_exac_nfe 16896.0
[2019-10-05T20:11Z] an_adj_exac_oth 240.0
[2019-10-05T20:11Z] an_adj_exac_sas 4520.0
[2019-10-05T20:11Z] an_exac_all 34002.0
[2019-10-05T20:11Z] baseqranksum -1.42299997807
[2019-10-05T20:11Z] biotype None
[2019-10-05T20:11Z] call_rate 1.0
[2019-10-05T20:11Z] chrom chr1
[2019-10-05T20:11Z] clinvar_disease_name None
[2019-10-05T20:11Z] clinvar_sig None
[2019-10-05T20:11Z] clippingranksum 0.736999988556
[2019-10-05T20:11Z] codon_change None
[2019-10-05T20:11Z] common_pathogenic False
[2019-10-05T20:11Z] db False
[2019-10-05T20:11Z] decomposed False
[2019-10-05T20:11Z] dp 49
[2019-10-05T20:11Z] ds False
[2019-10-05T20:11Z] effect_severity None
[2019-10-05T20:11Z] end 146984765
[2019-10-05T20:11Z] ensembl_gene_id None
[2019-10-05T20:11Z] excesshet 3.97939991951
[2019-10-05T20:11Z] exon None
[2019-10-05T20:11Z] filter None
[2019-10-05T20:11Z] fs 0.0
[2019-10-05T20:11Z] gene None
[2019-10-05T20:11Z] gnomAD_AC 8
[2019-10-05T20:11Z] gnomAD_AF 0.000136870003189
[2019-10-05T20:11Z] gnomAD_AN (79506, 58450)
[2019-10-05T20:11Z] gnomad_ac 8
[2019-10-05T20:11Z] gnomad_af 0.000136870003189
[2019-10-05T20:11Z] gnomad_af_afr -1.0
[2019-10-05T20:11Z] gnomad_af_amr -1.0
[2019-10-05T20:11Z] gnomad_af_asj -1.0
[2019-10-05T20:11Z] gnomad_af_eas -1.0
[2019-10-05T20:11Z] gnomad_af_fin -1.0
[2019-10-05T20:11Z] gnomad_af_nfe -1.0
[2019-10-05T20:11Z] gnomad_af_oth -1.0
[2019-10-05T20:11Z] gnomad_af_popmax -1.0
[2019-10-05T20:11Z] gnomad_af_sas -1.0
[2019-10-05T20:11Z] gnomad_an (79506, 58450)
[2019-10-05T20:11Z] gt_alt_depths i
                                   ,
                                    �
[2019-10-05T20:11Z] gt_alt_freqs d�U�D�?��q��q�?
[2019-10-05T20:11Z] gt_depths i
                               ,$�	
[2019-10-05T20:11Z] gt_phases ?�
[2019-10-05T20:11Z] gt_quals f
                              ,�B@A0A
[2019-10-05T20:11Z] gt_ref_depths i
                                   ,��
[2019-10-05T20:11Z] gt_types i
                              ,��
[2019-10-05T20:11Z] gts S
                         (G/AG/GG/A
[2019-10-05T20:11Z] impact None
[2019-10-05T20:11Z] impact_severity None
[2019-10-05T20:11Z] impact_so None
[2019-10-05T20:11Z] is_canonical False
[2019-10-05T20:11Z] is_coding False
[2019-10-05T20:11Z] is_exonic False
[2019-10-05T20:11Z] is_lof False
[2019-10-05T20:11Z] is_splicing False
[2019-10-05T20:11Z] len None
[2019-10-05T20:11Z] max_aaf_all 0.000295420002658
[2019-10-05T20:11Z] mleac 2
[2019-10-05T20:11Z] mleaf 0.333000004292
[2019-10-05T20:11Z] mmq (40, 40)
[2019-10-05T20:11Z] mq 37.5800018311
[2019-10-05T20:11Z] mq0 0
[2019-10-05T20:11Z] mqranksum -2.37800002098
[2019-10-05T20:11Z] num_exac_Het 2.0
[2019-10-05T20:11Z] num_exac_Hom 0.0
[2019-10-05T20:11Z] num_exac_het 2.0
[2019-10-05T20:11Z] num_exac_hom 0.0
[2019-10-05T20:11Z] num_het 2
[2019-10-05T20:11Z] num_hom_alt 0
[2019-10-05T20:11Z] num_hom_ref 1
[2019-10-05T20:11Z] num_unknown 0
[2019-10-05T20:11Z] old_multiallelic None
[2019-10-05T20:11Z] old_variant None
[2019-10-05T20:11Z] polyphen_pred None
[2019-10-05T20:11Z] polyphen_score None
[2019-10-05T20:11Z] qd 7.92000007629
[2019-10-05T20:11Z] qual 285.200012207
[2019-10-05T20:11Z] readposranksum -1.04200005531
[2019-10-05T20:11Z] ref G
[2019-10-05T20:11Z] rs_ids rs1437329596
[2019-10-05T20:11Z] sift_pred None
[2019-10-05T20:11Z] sift_score None
[2019-10-05T20:11Z] so None
[2019-10-05T20:11Z] sor 0.675000011921
[2019-10-05T20:11Z] start 146984764
[2019-10-05T20:11Z] sub_type ts
[2019-10-05T20:11Z] top_consequence None
[2019-10-05T20:11Z] transcript None
[2019-10-05T20:11Z] type snp
[2019-10-05T20:11Z] variant_id 7671
[2019-10-05T20:11Z] vcf_id rs1437329596
[2019-10-05T20:11Z] Traceback (most recent call last):
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-05T20:11Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
[2019-10-05T20:11Z]     self.load()
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
[2019-10-05T20:11Z]     i = self._load(self.cache, create=True, start=1)
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
[2019-10-05T20:11Z]     self.insert(variants, expanded, keys, i, create=create)
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
[2019-10-05T20:11Z]     vilengths, variant_impacts)
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
[2019-10-05T20:11Z]     self.__insert(v_objs, self.metadata.tables['variants'].insert())
[2019-10-05T20:11Z]   File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
[2019-10-05T20:11Z]     raise e
[2019-10-05T20:11Z] sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 100 - probably unsupported type.
[2019-10-05T20:11Z] [SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, clippingranksum, db, decomposed, dp, ds, excesshet, fs, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, sor, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_disease_name, clinvar_sig, common_pathogenic, gnomad_ac, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, gnomad_an, max_aaf_all, num_exac_het, num_exac_hom, rs_ids, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[2019-10-05T20:11Z] [parameters: (7671, u'chr1', 146984764, 146984765, u'rs1437329596', u'G', u'A', 285.20001220703125, None, 'snp', 'ts', 1.0, 1, 2, 0, 0, 0.3333333333333333, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 2, 0.3330000042915344, 6, -1.4229999780654907, 0.7369999885559082, 0, 0, 49, 0, 3.9793999195098877, 0.0, None, 2, 0.3330000042915344, 37.58000183105469, 0, -2.378000020980835, None, 'None', 7.920000076293945, -1.0420000553131104, 0.675000011920929, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0002954200026579201, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, -1.0, -1.0, 5.881999823031947e-05, 1262.0, 6770.0, 2486.0, 1828.0, 16896.0, 240.0, 4520.0, 34002.0, None, None, 0, 8, 0.00013687000318896025, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, (79506, 58450), 0.0002954200026579201, 2.0, 0.0, u'rs1437329596', <read-only buffer for 0x7f452b0f9148, size -1, offset 0 at 0x7f452a506270>, <read-only buffer for 0x7f452b0f9110, size -1, offset 0 at 0x7f452a506230>, <read-only buffer for 0x7f452b0f7ea0, size -1, offset 0 at 0x7f452a5061f0>, <read-only buffer for 0x7f452b0f9180, size -1, offset 0 at 0x7f452a5061b0>, <read-only buffer for 0x7f452b0f91b8, size -1, offset 0 at 0x7f452a506170>, <read-only buffer for 0x7f452b0f91f0, size -1, offset 0 at 0x7f452a506130>, <read-only buffer for 0x7f452b0f9228, size -1, offset 0 at 0x7f452a5060f0>, <read-only buffer for 0x7f452b0f84b0, size -1, offset 0 at 0x7f452a5060b0>)]
[2019-10-05T20:11Z] (Background on this error at: http://sqlalche.me/e/rvf5)
[2019-10-05T20:11Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmpna8xq85x/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmpna8xq85x/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
codon_change None
common_pathogenic False
db False
decomposed False
dp 49
ds False
effect_severity None
end 146984765
ensembl_gene_id None
excesshet 3.97939991951
exon None
filter None
fs 0.0
gene None
gnomAD_AC 8
gnomAD_AF 0.000136870003189
gnomAD_AN (79506, 58450)
gnomad_ac 8
gnomad_af 0.000136870003189
gnomad_af_afr -1.0
gnomad_af_amr -1.0
gnomad_af_asj -1.0
gnomad_af_eas -1.0
gnomad_af_fin -1.0
gnomad_af_nfe -1.0
gnomad_af_oth -1.0
gnomad_af_popmax -1.0
gnomad_af_sas -1.0
gnomad_an (79506, 58450)
gt_alt_depths [binary blob]
gt_alt_freqs [binary blob]
gt_depths [binary blob]
gt_phases [binary blob]
gt_quals [binary blob]
gt_ref_depths [binary blob]
gt_types [binary blob]
gts [binary blob; readable fragment: G/A G/G G/A]
impact None
impact_severity None
impact_so None
is_canonical False
is_coding False
is_exonic False
is_lof False
is_splicing False
len None
max_aaf_all 0.000295420002658
mleac 2
mleaf 0.333000004292
mmq (40, 40)
mq 37.5800018311
mq0 0
mqranksum -2.37800002098
num_exac_Het 2.0
num_exac_Hom 0.0
num_exac_het 2.0
num_exac_hom 0.0
num_het 2
num_hom_alt 0
num_hom_ref 1
num_unknown 0
old_multiallelic None
old_variant None
polyphen_pred None
polyphen_score None
qd 7.92000007629
qual 285.200012207
readposranksum -1.04200005531
ref G
rs_ids rs1437329596
sift_pred None
sift_score None
so None
sor 0.675000011921
start 146984764
sub_type ts
top_consequence None
transcript None
type snp
variant_id 7671
vcf_id rs1437329596
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 234, in __init__
    self.load()
  File "/RED/TOOLS/bin/vcf2db.py", line 319, in load
    i = self._load(self.cache, create=True, start=1)
  File "/RED/TOOLS/bin/vcf2db.py", line 312, in _load
    self.insert(variants, expanded, keys, i, create=create)
  File "/RED/TOOLS/bin/vcf2db.py", line 374, in insert
    vilengths, variant_impacts)
  File "/RED/TOOLS/bin/vcf2db.py", line 402, in _insert
    self.__insert(v_objs, self.metadata.tables['variants'].insert())
  File "/RED/TOOLS/bin/vcf2db.py", line 436, in __insert
    raise e
sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 100 - probably unsupported type.
[SQL: INSERT INTO variants (variant_id, chrom, start, "end", vcf_id, ref, alt, qual, filter, type, sub_type, call_rate, num_hom_ref, num_het, num_hom_alt, num_unknown, aaf, gene, ensembl_gene_id, transcript, is_exonic, is_coding, is_lof, is_splicing, is_canonical, exon, codon_change, aa_change, aa_length, biotype, impact, impact_so, impact_severity, polyphen_pred, polyphen_score, sift_pred, sift_score, ac, af, an, baseqranksum, clippingranksum, db, decomposed, dp, ds, excesshet, fs, len, mleac, mleaf, mq, mq0, mqranksum, old_multiallelic, old_variant, qd, readposranksum, sor, ac_adj_exac_afr, ac_adj_exac_amr, ac_adj_exac_eas, ac_adj_exac_fin, ac_adj_exac_nfe, ac_adj_exac_oth, ac_adj_exac_sas, ac_exac_all, af_adj_exac_afr, af_adj_exac_amr, af_adj_exac_eas, af_adj_exac_fin, af_adj_exac_nfe, af_adj_exac_oth, af_adj_exac_sas, af_esp_aa, af_esp_all, af_esp_ea, af_exac_all, an_adj_exac_afr, an_adj_exac_amr, an_adj_exac_eas, an_adj_exac_fin, an_adj_exac_nfe, an_adj_exac_oth, an_adj_exac_sas, an_exac_all, clinvar_disease_name, clinvar_sig, common_pathogenic, gnomad_ac, gnomad_af, gnomad_af_afr, gnomad_af_amr, gnomad_af_asj, gnomad_af_eas, gnomad_af_fin, gnomad_af_nfe, gnomad_af_oth, gnomad_af_popmax, gnomad_af_sas, gnomad_an, max_aaf_all, num_exac_het, num_exac_hom, rs_ids, gts, gt_types, gt_phases, gt_depths, gt_ref_depths, gt_alt_depths, gt_quals, gt_alt_freqs) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: (7671, u'chr1', 146984764, 146984765, u'rs1437329596', u'G', u'A', 285.20001220703125, None, 'snp', 'ts', 1.0, 1, 2, 0, 0, 0.3333333333333333, None, None, None, 0, 0, 0, 0, 0, None, None, None, None, None, None, None, None, None, None, None, None, 2, 0.3330000042915344, 6, -1.4229999780654907, 0.7369999885559082, 0, 0, 49, 0, 3.9793999195098877, 0.0, None, 2, 0.3330000042915344, 37.58000183105469, 0, -2.378000020980835, None, 'None', 7.920000076293945, -1.0420000553131104, 0.675000011920929, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0002954200026579201, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, -1.0, -1.0, 5.881999823031947e-05, 1262.0, 6770.0, 2486.0, 1828.0, 16896.0, 240.0, 4520.0, 34002.0, None, None, 0, 8, 0.00013687000318896025, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, (79506, 58450), 0.0002954200026579201, 2.0, 0.0, u'rs1437329596', <read-only buffer for 0x7f452b0f9148, size -1, offset 0 at 0x7f452a506270>, <read-only buffer for 0x7f452b0f9110, size -1, offset 0 at 0x7f452a506230>, <read-only buffer for 0x7f452b0f7ea0, size -1, offset 0 at 0x7f452a5061f0>, <read-only buffer for 0x7f452b0f9180, size -1, offset 0 at 0x7f452a5061b0>, <read-only buffer for 0x7f452b0f91b8, size -1, offset 0 at 0x7f452a506170>, <read-only buffer for 0x7f452b0f91f0, size -1, offset 0 at 0x7f452a506130>, <read-only buffer for 0x7f452b0f9228, size -1, offset 0 at 0x7f452a5060f0>, <read-only buffer for 0x7f452b0f84b0, size -1, offset 0 at 0x7f452a5060b0>)]
(Background on this error at: http://sqlalche.me/e/rvf5)
' returned non-zero exit status 1.

I am very frustrated. Neither the PED file nor the vcfanno file causes the problem. Do you want me to upload the FASTQ files so you can have a look?
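
A note for anyone reading the traceback: sqlite3 numbers bound parameters from 0, and counting the columns of the INSERT statement from 0 puts parameter 100 at gnomad_an, whose value in the parameter list is the tuple (79506, 58450) rather than a single number. The sketch below is only an illustration of that failure mode, not vcf2db's code: it uses a hypothetical three-column table and a hypothetical helper, unbindable_fields, with values copied from the log. Why the site carries two gnomAD values is not established here; one possibility (an assumption) is that it matches more than one gnomAD record.

```python
import sqlite3

# Types sqlite3 can bind without a registered adapter.
BINDABLE = (type(None), int, float, str, bytes, bytearray, memoryview)

def unbindable_fields(row):
    """Return the names of fields whose values sqlite3 cannot bind directly."""
    return [name for name, value in row.items() if not isinstance(value, BINDABLE)]

# Values copied from the failing record in the log; the table is a
# hypothetical three-column stand-in for the real variants table.
row = {"variant_id": 7671, "gnomad_ac": 8, "gnomad_an": (79506, 58450)}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE variants (variant_id INTEGER, gnomad_ac INTEGER, gnomad_an INTEGER)"
)
try:
    conn.execute("INSERT INTO variants VALUES (?, ?, ?)", tuple(row.values()))
    insert_failed = False
except sqlite3.Error as exc:
    # Raised as InterfaceError on older Pythons and ProgrammingError on newer ones.
    insert_failed = True
    print("insert failed:", exc)

print(unbindable_fields(row))
```

Running a check like unbindable_fields over each record before the insert would name the offending column directly instead of leaving it to be counted out of the SQL.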

@kokyriakidis
Author

kokyriakidis commented Oct 5, 2019

I have uploaded the whole project folder after the failed run so you can have a look!

https://drive.google.com/drive/folders/1BStwCZFMcFL7-6AruM723mbpSWH0XozS?usp=sharing

@kokyriakidis
Author

kokyriakidis commented Oct 6, 2019

@naumenko-sa

  1. deleted

  2. I reran without the vcfanno file and still get these errors:

[2019-10-06T10:22Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-decompose.vcf.gz
[2019-10-06T10:22Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini-combine.conf
[2019-10-06T10:22Z] =============================================
[2019-10-06T10:22Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-06T10:22Z] see: https://github.com/brentp/vcfanno
[2019-10-06T10:22Z] =============================================
[2019-10-06T10:22Z] vcfanno.go:115: found 61 sources from 5 files
[2019-10-06T10:22Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T10:22Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: af_adj_exac_sas not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: af_esp_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:22Z] vcfanno.go:194: Info Error: CLNDN not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T10:26Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-06T10:26Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-06T10:26Z] vcfanno.go:248: annotated 148115 variants in 275.46 seconds (537.7 / second)
[2019-10-06T10:27Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz
[2019-10-06T10:27Z] GEMINI: create database with vcf2db
[2019-10-06T10:27Z] skipping 'MMQ' because it has Number=R
[2019-10-06T10:27Z] Ethnicity None
[2019-10-06T10:27Z] Traceback (most recent call last):
[2019-10-06T10:27Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-06T10:27Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-06T10:27Z]   File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
[2019-10-06T10:27Z]     self.samples = self.create_samples()
[2019-10-06T10:27Z]   File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
[2019-10-06T10:27Z]     vals = [r[i] for r in rows]
[2019-10-06T10:27Z] IndexError: list index out of range
[2019-10-06T10:27Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmp71dgss81/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmp71dgss81/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
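
The IndexError is raised at `vals = [r[i] for r in rows]` in create_samples, right after the "Ethnicity None" line, which would be consistent with at least one PED row having fewer columns than another. That reading is an assumption, not something the log confirms. A minimal sketch of a width check, assuming a whitespace-delimited PED file; the function name and the sample rows are hypothetical:

```python
def ped_rows_with_wrong_width(ped_text):
    """Return (line_number, n_columns) for rows whose width differs from the first row."""
    rows = [
        (lineno, line.split())
        for lineno, line in enumerate(ped_text.splitlines(), start=1)
        if line.strip()
    ]
    if not rows:
        return []
    expected = len(rows[0][1])
    return [(lineno, len(cols)) for lineno, cols in rows if len(cols) != expected]

# Hypothetical PED content: the header declares an extra "ethnicity" column,
# but the last sample row does not supply a value for it.
example = (
    "#family_id name paternal_id maternal_id sex phenotype ethnicity\n"
    "F178 child father mother 1 2 -9\n"
    "F178 father -9 -9 1 1 -9\n"
    "F178 mother -9 -9 2 1\n"
)
print(ped_rows_with_wrong_width(example))  # [(4, 6)]
```

An empty result on the real .ped file would rule this explanation out.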

  3. I tried bcftools norm, but I get:
bcftools norm -d '/mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' 

The argument to -d not recognised: /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz

I had to specify

bcftools norm -d all <vcf_file.gz>

When the process finished, the last line of output was:

Lines   total/split/realigned/skipped:	14961513/0/0/0

Is this normal?
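
The summary line lists four counters (total, split, realigned, skipped) and, as printed here, has no counter for removed duplicates, so by itself it does not show whether -d removed anything. A minimal sketch for pulling the counters out of such a line; the helper name parse_norm_summary is hypothetical and the format is assumed to match the line above:

```python
def parse_norm_summary(line):
    """Split a bcftools norm summary line into a {label: count} dict."""
    labels_part, counts_part = line.split(":", 1)
    # Drop the leading word "Lines" and split the slash-separated labels.
    labels = labels_part.split()[-1].split("/")
    counts = [int(c) for c in counts_part.strip().split("/")]
    return dict(zip(labels, counts))

summary = parse_norm_summary("Lines   total/split/realigned/skipped:\t14961513/0/0/0")
print(summary)
```

Comparing the number of records in the input and output files would be a more direct way to see whether any duplicates were dropped.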

  4. I reran the pipeline, but with no luck:
[2019-10-06T12:21Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-decompose.vcf.gz
[2019-10-06T12:21Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini-combine.conf
[2019-10-06T12:21Z] =============================================
[2019-10-06T12:21Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-06T12:21Z] see: https://github.com/brentp/vcfanno
[2019-10-06T12:21Z] =============================================
[2019-10-06T12:21Z] vcfanno.go:115: found 61 sources from 5 files
[2019-10-06T12:21Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-06T12:21Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: af_adj_exac_sas not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: af_esp_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:21Z] vcfanno.go:194: Info Error: CLNDN not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-06T12:26Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-06T12:26Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-06T12:26Z] vcfanno.go:248: annotated 148115 variants in 275.68 seconds (537.3 / second)
[2019-10-06T12:26Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz
[2019-10-06T12:26Z] GEMINI: create database with vcf2db
[2019-10-06T12:26Z] skipping 'MMQ' because it has Number=R
[2019-10-06T12:26Z] Ethnicity None
[2019-10-06T12:26Z] Traceback (most recent call last):
[2019-10-06T12:26Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-06T12:26Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-06T12:26Z]   File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
[2019-10-06T12:26Z]     self.samples = self.create_samples()
[2019-10-06T12:26Z]   File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
[2019-10-06T12:26Z]     vals = [r[i] for r in rows]
[2019-10-06T12:26Z] IndexError: list index out of range
[2019-10-06T12:26Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmpopsnbp30/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmpopsnbp30/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
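
The traceback ends in `vals = [r[i] for r in rows]` inside `create_samples`, which raises `IndexError` whenever one row is shorter than the column index being read. The log does not show what `rows` contains. Assuming (and this is only an assumption) that it holds the rows of the `.ped` file passed to vcf2db, the following Python sketch checks for rows with fewer than the six standard PED columns. The helper name `find_short_ped_rows` and the example rows are hypothetical, not taken from this run:

```python
def find_short_ped_rows(ped_lines, min_fields=6):
    """Return (line_number, n_fields) for data rows with fewer than min_fields columns."""
    problems = []
    for number, line in enumerate(ped_lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        if len(fields) < min_fields:
            problems.append((number, len(fields)))
    return problems


# Hypothetical PED content: the father's row lacks the phenotype column.
example_ped = [
    "#family_id\tname\tpaternal_id\tmaternal_id\tsex\tphenotype",
    "FAM1\tchild\tfather\tmother\t1\t2",
    "FAM1\tfather\t0\t0\t1",
    "FAM1\tmother\t0\t0\t2\t1",
]

short_rows = find_short_ped_rows(example_ped)
print(short_rows)

# The same ragged rows reproduce the error pattern seen in the traceback.
rows = [line.split() for line in example_ped[1:]]
try:
    vals = [r[5] for r in rows]
    raised = False
except IndexError:
    raised = True
print("IndexError raised:", raised)
```

If a check like this finds nothing in the real `.ped` file, the cause lies elsewhere and the assumption above does not hold.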

@kokyriakidis
Author

@naumenko-sa Do we have any news?

@naumenko-sa
Contributor

naumenko-sa commented Oct 11, 2019

Hi @kokyriakidis !

  1. You have removed vcfanno from the config, but the error is still there. I think that is because you have tools_on: gemini. That causes vcfanno to annotate with genomes/Hsapiens/hg38/config/vcfanno/gemini.conf, which includes the same variation/gnomad_exome.vcf.gz file. Could you please try removing tools_on: gemini (i.e. turning gemini off) to confirm that the failing step is indeed vcfanno?

  2. I've removed duplicates from variation/gnomad_exome.vcf.gz:
    Before removing duplicates there were 3 variants at chr1:146984765

tabix gnomad_exome.vcf.gz chr1:146984765-146984765
chr1	146984765	3	G	T	7794.38
chr1	146984765	3	G	A	7794.38
chr1	146984765	3	G	A	8470.69

after removing duplicates with

bcftools norm -dall -Oz gnomad_exome.vcf.gz > gnomad_exome.no_duplicates.vcf.gz
tabix gnomad_exome.no_duplicates.vcf.gz

the duplicate record is gone, but the allele A is gone as well

#CHROM	POS	ID	REF	ALT	QUAL	FILTER
chr1	146984765	rs781933389	G	T	7794.38	PASS

I've reported this as a bug:
samtools/bcftools#1089

  3. I used vt, an alternative tool, to remove duplicates:
    https://github.com/naumenko-sa/bioscripts/blob/master/variation/vcf.remove_duplicates.sh

this time the allele A is kept:

chr1	146984765	3	G	A	8470.69
chr1	146984765	3	G	T	7794.38
  4. I've replaced gnomad_exome.vcf.gz with gnomad_exome.no_duplicates.vcf.gz in the bcbio installation:
cd genomes/Hsapiens/hg38/variation
mv gnomad_exome.vcf.gz gnomad_exome.with_duplicates.vcf.gz
ln -s gnomad_exome.no_duplicates.vcf.gz gnomad_exome.vcf.gz
  5. I am running your WES trio project as a test.

  6. I've submitted a fixed gnomad_exome recipe that removes duplicates to cloudbiolinux:
    filtered non-unique variants in hg38/gnomad_exome. using the single v… chapmanb/cloudbiolinux#325
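
To make the intended behaviour concrete: of the three records at chr1:146984765 quoted above, only the repeated G>A record is a duplicate, so a correct de-duplication should leave one G>T and one G>A. Here is a minimal Python sketch of that rule, keyed on (CHROM, POS, REF, ALT). The function name is hypothetical, and this is not a reimplementation of vt or bcftools. It keeps the first record seen for each key, whereas the vt output above retained the G>A record with QUAL 8470.69.

```python
def drop_duplicate_records(records):
    """Keep the first record for each (CHROM, POS, REF, ALT) key, preserving order."""
    seen = set()
    kept = []
    for rec in records:
        chrom, pos, _id, ref, alt = rec[:5]
        key = (chrom, pos, ref, alt)
        if key in seen:
            continue  # same site and same alleles as an earlier record
        seen.add(key)
        kept.append(rec)
    return kept


# The three records reported by tabix before de-duplication.
records = [
    ("chr1", 146984765, "3", "G", "T", 7794.38),
    ("chr1", 146984765, "3", "G", "A", 7794.38),
    ("chr1", 146984765, "3", "G", "A", 8470.69),
]

deduplicated = drop_duplicate_records(records)
alts = [rec[4] for rec in deduplicated]
print(alts)
```

Both alleles survive here, which is the outcome the bcftools run above failed to produce.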

SN

@kokyriakidis
Author

kokyriakidis commented Oct 11, 2019

Thanks so much for all your help Sergey!

So, to get it working, do I just update bcbio after the cloudbiolinux PR is merged, or do I have to repeat the steps you did?

@naumenko-sa
Contributor

Yes, you'd need to run bcbio_nextgen.py upgrade -u skip --data, and let me know if you have any issues during installation or with your trio. I have not finished a trio run on my side (it is still running), so there might still be issues.
SN

@kokyriakidis
Author

Ok! I am waiting for your confirmation! Thanks again for helping and testing!

@kokyriakidis
Author

kokyriakidis commented Oct 12, 2019

It seems that I get this error now when running with GEMINI:

[2019-10-12T08:23Z] Timing: joint squaring off/backfilling
[2019-10-12T08:23Z] Timing: variant post-processing
[2019-10-12T08:23Z] multiprocessing: postprocess_variants
[2019-10-12T08:23Z] Finalizing variant calls: F178CHILD, gatk-haplotype
[2019-10-12T08:23Z] Calculating variation effects for F178CHILD, gatk-haplotype
[2019-10-12T08:23Z] Annotate VCF file: F178CHILD, gatk-haplotype
[2019-10-12T08:23Z] Annotate with dbSNP
[2019-10-12T08:23Z] =============================================
[2019-10-12T08:23Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-12T08:23Z] see: https://github.com/brentp/vcfanno
[2019-10-12T08:23Z] =============================================
[2019-10-12T08:23Z] vcfanno.go:115: found 1 sources from 1 files
[2019-10-12T08:23Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-12T08:23Z] vcfanno.go:194: Info Error: rs_ids not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:248: annotated 158712 variants in 275.09 seconds (576.9 / second)
[2019-10-12T08:28Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated.vcf.gz
[2019-10-12T08:28Z] Filtering for F178CHILD, gatk-haplotype
[2019-10-12T08:28Z] Removing variants with missing alts from /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gatk-haplotype/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated.vcf.gz.
[2019-10-12T08:28Z] bgzip _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt.vcf
[2019-10-12T08:28Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt.vcf.gz
[2019-10-12T08:28Z] Cutoff-based soft filtering /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gatk-haplotype/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt.vcf.gz with TYPE="snp" && (MQRankSum < -12.5 || ReadPosRankSum < -8.0 || QD < 2.0 || FS > 60.0 || (QD < 10.0 && AD[0:1] / (AD[0:1] + AD[0:0]) < 0.25 && ReadPosRankSum < 0.0) || MQ < 30.0) : F178CHILD
[2019-10-12T08:28Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt-filterSNP.vcf.gz
[2019-10-12T08:28Z] Cutoff-based soft filtering /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gatk-haplotype/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt-filterSNP.vcf.gz with TYPE="indel" && (ReadPosRankSum < -20.0 || QD < 2.0 || FS > 200.0 || SOR > 10.0 || (QD < 10.0 && AD[0:1] / (AD[0:1] + AD[0:0]) < 0.25 && ReadPosRankSum < 0.0)) : F178CHILD
[2019-10-12T08:28Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-annotated-nomissingalt-filterSNP-filterINDEL.vcf.gz
[2019-10-12T08:28Z] Prioritization for F178CHILD, gatk-haplotype
[2019-10-12T08:28Z] multiprocessing: split_variants_by_sample
[2019-10-12T08:28Z] Timing: prepped BAM merging
[2019-10-12T08:28Z] Timing: validation
[2019-10-12T08:28Z] multiprocessing: compare_to_rm
[2019-10-12T08:28Z] Timing: ensemble calling
[2019-10-12T08:28Z] Timing: validation summary
[2019-10-12T08:28Z] Timing: structural variation
[2019-10-12T08:28Z] Timing: structural variation
[2019-10-12T08:28Z] Timing: structural variation ensemble
[2019-10-12T08:28Z] Timing: structural variation validation
[2019-10-12T08:28Z] multiprocessing: validate_sv
[2019-10-12T08:28Z] Timing: heterogeneity
[2019-10-12T08:28Z] Timing: population database
[2019-10-12T08:28Z] multiprocessing: prep_gemini_db
[2019-10-12T08:28Z] Multi-allelic to single allele
[2019-10-12T08:28Z] normalize v0.5
[2019-10-12T08:28Z] options:     input VCF file                                  -
[2019-10-12T08:28Z]          [o] output VCF file                                 -
[2019-10-12T08:28Z]          [w] sorting window size                             10000
[2019-10-12T08:28Z]          [n] no fail on reference inconsistency for non SNPs true
[2019-10-12T08:28Z]          [q] quiet                                           false
[2019-10-12T08:28Z]          [d] debug                                           false
[2019-10-12T08:28Z]          [r] reference FASTA file                            /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa
[2019-10-12T08:28Z] decompose v0.5
[2019-10-12T08:28Z] options:     input VCF file        -
[2019-10-12T08:28Z]          [s] smart decomposition   true (experimental)
[2019-10-12T08:28Z]          [o] output VCF file       -
[2019-10-12T08:28Z] stats: no. variants                 : 147535
[2019-10-12T08:28Z]        no. biallelic variants       : 146972
[2019-10-12T08:28Z]        no. multiallelic variants    : 563
[2019-10-12T08:28Z]        no. additional biallelics    : 580
[2019-10-12T08:28Z]        total no. of biallelics      : 148115
[2019-10-12T08:28Z] Time elapsed: 3.03s
[2019-10-12T08:28Z] stats: biallelic
[2019-10-12T08:28Z]           no. left trimmed                      : 0
[2019-10-12T08:28Z]           no. right trimmed                     : 218
[2019-10-12T08:28Z]           no. left and right trimmed            : 0
[2019-10-12T08:28Z]           no. right trimmed and left aligned    : 0
[2019-10-12T08:28Z]           no. left aligned                      : 2
[2019-10-12T08:28Z]        total no. biallelic normalized           : 220
[2019-10-12T08:28Z]        multiallelic
[2019-10-12T08:28Z]           no. left trimmed                      : 0
[2019-10-12T08:28Z]           no. right trimmed                     : 0
[2019-10-12T08:28Z]           no. left and right trimmed            : 0
[2019-10-12T08:28Z]           no. right trimmed and left aligned    : 0
[2019-10-12T08:28Z]           no. left aligned                      : 0
[2019-10-12T08:28Z]        total no. multiallelic normalized        : 0
[2019-10-12T08:28Z]        total no. variants normalized            : 220
[2019-10-12T08:28Z]        total no. variants observed              : 148115
[2019-10-12T08:28Z]        total no. reference observed             : 0
[2019-10-12T08:28Z] Time elapsed: 3.25s
[2019-10-12T08:28Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-decompose.vcf.gz
[2019-10-12T08:28Z] Annotating /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.vcf.gz with vcfanno, using /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini-combine.conf
[2019-10-12T08:28Z] =============================================
[2019-10-12T08:28Z] vcfanno version 0.3.2 [built with go1.12.1]
[2019-10-12T08:28Z] see: https://github.com/brentp/vcfanno
[2019-10-12T08:28Z] =============================================
[2019-10-12T08:28Z] vcfanno.go:115: found 61 sources from 5 files
[2019-10-12T08:28Z] vcfanno.go:145: using 2 worker threads to decompress bgzip file
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AFR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AF_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ACR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AC_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ANR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_AMR' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_ASJ' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_EAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_FIN' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_NFE' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_OTH' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_SAS' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'AN_POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'POPMAX' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Male' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] api.go:811: WARNING: using op 'self' when with Number='1' for 'Hom_Female' from '/RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/gnomad_exome.vcf.gz' can result in out-of-order values when the query is multi-allelic
[2019-10-12T08:28Z] api.go:812:        : this is not an issue if the query has been decomposed.
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: max_aaf_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: Hom_Female not found in header >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: clinvar_sig not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: af_adj_exac_sas not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: af_esp_all not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:28Z] vcfanno.go:194: Info Error: CLNDN not found in INFO >> this error/warning may occur many times. reporting once here...
[2019-10-12T08:32Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/exac.vcf.gz
[2019-10-12T08:32Z] bix.go:251: chromosome chrM not found in /RED/RESOURCES/bcbio/genomes/Hsapiens/hg38/variation/esp.vcf.gz
[2019-10-12T08:32Z] vcfanno.go:248: annotated 148115 variants in 277.93 seconds (532.9 / second)
[2019-10-12T08:32Z] tabix index _mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz
[2019-10-12T08:32Z] GEMINI: create database with vcf2db
[2019-10-12T08:32Z] skipping 'MMQ' because it has Number=R
[2019-10-12T08:32Z] Ethnicity None
[2019-10-12T08:32Z] Traceback (most recent call last):
[2019-10-12T08:32Z]   File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
[2019-10-12T08:32Z]     impacts_extras=a.impacts_field, aok=a.a_ok)
[2019-10-12T08:32Z]   File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
[2019-10-12T08:32Z]     self.samples = self.create_samples()
[2019-10-12T08:32Z]   File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
[2019-10-12T08:32Z]     vals = [r[i] for r in rows]
[2019-10-12T08:32Z] IndexError: list index out of range
[2019-10-12T08:32Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmp6wl69_05/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 245, in <module>
    main(**kwargs)
  File "/RED/TOOLS/bin/bcbio_nextgen.py", line 46, in main
    run_main(**kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 50, in run_main
    fc_dir, run_info_yaml)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/pipeline/main.py", line 188, in variant2pipeline
    samples = population.prep_db_parallel(samples, run_parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 403, in prep_db_parallel
    output = parallel_fn("prep_gemini_db", to_process)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 921, in __call__
    if self.dispatch_one_batch(iterator):
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/utils.py", line 55, in wrapper
    return f(*args, **kwargs)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/distributed/multitasks.py", line 383, in prep_gemini_db
    return population.prep_gemini_db(*args)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 50, in prep_gemini_db
    gemini_db = create_gemini_db(ann_vcf, data, gemini_db, ped_file)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/variation/population.py", line 142, in create_gemini_db
    do.run(cmd, "GEMINI: create database with vcf2db")
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 26, in run
    _do_run(cmd, checks, log_stdout, env=env)
  File "/RED/RESOURCES/bcbio/anaconda/lib/python3.6/site-packages/bcbio/provenance/do.py", line 106, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/RED/TOOLS/bin/vcf2db.py /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic-annotated-gemini.vcf.gz /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/bcbiotx/tmp6wl69_05/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype.db
skipping 'MMQ' because it has Number=R
Ethnicity None
Traceback (most recent call last):
  File "/RED/TOOLS/bin/vcf2db.py", line 924, in <module>
    impacts_extras=a.impacts_field, aok=a.a_ok)
  File "/RED/TOOLS/bin/vcf2db.py", line 228, in __init__
    self.samples = self.create_samples()
  File "/RED/TOOLS/bin/vcf2db.py", line 598, in create_samples
    vals = [r[i] for r in rows]
IndexError: list index out of range
' returned non-zero exit status 1.

Without GEMINI, everything worked fine.

@kokyriakidis
Author

Hi Sergey and the rest!

Will sticking to gnomad 2.1 help with this issue?

@roryk
Collaborator

roryk commented Oct 15, 2019

This is still due to a PED file problem. Could you send along the PED file you are using?

@kokyriakidis
Author

178F_PED.zip

@roryk
Collaborator

roryk commented Oct 15, 2019

Thanks, could you also pass along the YAML file you are using?

@kokyriakidis
Author

F178.zip

@roryk
Collaborator

roryk commented Oct 15, 2019

Thanks, how about this file? /mnt/36642bae-9ec9-4100-88a2-ac173a20ea16/WORKDIR/TRIO_BRAZILIANS/F178/work/gemini/_mnt_36642bae-9ec9-4100-88a2-ac173a20ea16_WORKDIR_TRIO_BRAZILIANS_F178-gatk-haplotype-nomultiallelic.ped

@kokyriakidis
Author

@roryk The PED file is fine, right?

@roryk
Collaborator

roryk commented Oct 15, 2019

Hi @kokyriakidis,

Yes, it looks fine to me. Could you also send along the VCF file or a snippet from it so I can test locally?

@kokyriakidis
Author

The whole project is here:

https://drive.google.com/drive/folders/1BStwCZFMcFL7-6AruM723mbpSWH0XozS?usp=sharing

@roryk
Collaborator

roryk commented Oct 15, 2019

Thanks, the PED file and the VCF file don't have the same names. The PED file has the names as F178CHILD but the VCF file has them named F178_CHILD.
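
A mismatch like this can be checked before the GEMINI step runs. The following is a minimal sketch, not part of bcbio or vcf2db: the helper names `ped_sample_ids`, `vcf_sample_ids`, and `compare_samples` are hypothetical, and it assumes a whitespace-delimited PED file whose second column is the sample name, plus a plain or gzip/bgzip-compressed VCF with a standard `#CHROM` header line.

```python
import gzip


def ped_sample_ids(ped_path):
    """Return sample names (second column) from a PED file, skipping '#' lines."""
    ids = []
    with open(ped_path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.split()
            if len(fields) > 1:
                ids.append(fields[1])
    return ids


def vcf_sample_ids(vcf_path):
    """Return sample names from the #CHROM header line of a VCF."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # Columns after the nine fixed VCF fields are sample names.
                return line.rstrip("\n").split("\t")[9:]
    return []


def compare_samples(ped_path, vcf_path):
    """Return (only_in_ped, only_in_vcf) as sorted lists of sample names."""
    ped = set(ped_sample_ids(ped_path))
    vcf = set(vcf_sample_ids(vcf_path))
    return sorted(ped - vcf), sorted(vcf - ped)
```

With the names mentioned in this comment, `compare_samples` would list `F178CHILD` as present only in the PED file and `F178_CHILD` as present only in the VCF file.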

@kokyriakidis
Author

@roryk Sorry, this was from another run. Please check the link above again. I have uploaded the updated run (which also fails).

@roryk
Collaborator

roryk commented Oct 15, 2019

Thanks, it looks like the PED file has the ethnicity column but no entry for it. If you add an ethnicity column to the PED file and populate it with -9 it should fix this problem.
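
As a stopgap, the padding could be scripted. This is a minimal sketch under assumptions, not bcbio or vcf2db code: `pad_ped_ethnicity` is a hypothetical helper, and it assumes a tab-delimited PED file with the six standard columns, an optional `#` header line, and ethnicity as the seventh column.

```python
def pad_ped_ethnicity(lines, missing="-9"):
    """Return PED lines in which six-column rows gain a seventh ethnicity field.

    Assumes tab-delimited rows. Rows that do not have exactly six fields
    (for example, rows that already carry an ethnicity value) are returned
    unchanged, and blank lines are dropped.
    """
    padded = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) == 6:
            # Header rows get the column name, data rows get the missing value.
            fields.append("ethnicity" if line.startswith("#") else missing)
        padded.append("\t".join(fields))
    return padded
```

The function works on a list of lines, so the result can be inspected before anything is written back to disk.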

@kokyriakidis
Author

Hmm, thanks! Maybe update the README so others know about that.

@roryk
Collaborator

roryk commented Oct 15, 2019

Thanks, I think we shouldn't be adding that column unless it is there, so it is a bug.

@kokyriakidis
Author

kokyriakidis commented Oct 15, 2019

Running the command again after changing the PED file caused the same error. Should I delete the work folder and run it again?

@roryk
Collaborator

roryk commented Oct 15, 2019

Heya, it will still be using the broken PED file if it is on the disk. I think if you just delete the PED file in the GEMINI directory, it will populate a new one.
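
For reference, clearing the stale file could look like the sketch below. It is an illustration only: `remove_stale_ped` is a hypothetical helper, and it assumes the generated PED files sit as `*.ped` directly under `work/gemini`, as the path in the log above suggests.

```python
from pathlib import Path


def remove_stale_ped(work_dir, dry_run=True):
    """List (and optionally delete) *.ped files under <work_dir>/gemini.

    With dry_run=True nothing is deleted; the matching paths are only returned.
    """
    gemini_dir = Path(work_dir) / "gemini"
    stale = sorted(gemini_dir.glob("*.ped"))
    if not dry_run:
        for ped in stale:
            ped.unlink()
    return stale
```

Running it once with the default `dry_run=True` shows which files would be removed before anything is actually deleted.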

@roryk
Collaborator

roryk commented Oct 15, 2019

I'll update the documentation to mention we require the ethnicity as well, sorry about that.

@kokyriakidis
Author

It finally ran successfully! Thanks a lot to all of you for the help!

@roryk
Collaborator

roryk commented Oct 15, 2019

Great! Sorry for the problems!

@roryk roryk closed this as completed Oct 15, 2019
roryk added a commit that referenced this issue Oct 21, 2019
Ethnicity is not required in the PED file description but we were
treating it as if it was, leading to errors with standard PED
files.

Addresses issue brought up in #2961.