Not a BGZF file #9

Closed
jaredo opened this issue Nov 2, 2016 · 5 comments

@jaredo
Contributor

jaredo commented Nov 2, 2016

Hello again!

I am still receiving the above warning message. This is on an Ubuntu 16.04 VM. The output looks fine as far as I can tell.

$ octopus-0.1.4-alpha/bin/octopus -I ~/Dropbox/Public/NA12878.bam  -R /mnt/genomes/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa -o test.vcf.gz
[2016-11-01 17:19:17] <INFO> ------------------------------------------------------------------------
[2016-11-01 17:19:17] <INFO> octopus v0.1.4-alpha
[2016-11-01 17:19:17] <INFO> Copyright (c) 2016 University of Oxford
[2016-11-01 17:19:17] <INFO> ------------------------------------------------------------------------
Not a BGZF file: /home/joconnell/workspace/test.vcf.gz
[2016-11-01 17:19:17] <INFO> Done initialising calling components in 388ms
[2016-11-01 17:19:17] <INFO> Invoked calling model: individual
[2016-11-01 17:19:17] <INFO> Detected 1 sample: "NA12878"
[2016-11-01 17:19:17] <INFO> Writing calls to "/home/joconnell/workspace/test.vcf.gz"
[2016-11-01 17:19:17] <INFO> ------------------------------------------------------------------------
[2016-11-01 17:19:17] <INFO>      current      |                   |     time      |     estimated   
[2016-11-01 17:19:17] <INFO>      position     |     completed     |     taken     |     ttc         
[2016-11-01 17:19:17] <INFO> ------------------------------------------------------------------------
[2016-11-01 17:19:50] <INFO>  chr1:197003592             6.4%             32s           8.03m
[2016-11-01 17:20:08] <INFO>  chr2:136820959            10.8%             50s           7.06m
[2016-11-01 17:20:20] <INFO>  chr3:129195987            15.0%           1.05m           6.01m
[2016-11-01 17:20:48] <INFO>    chr4:3026996            15.1%           1.51m           8.63m
[2016-11-01 17:21:07] <INFO>  chr5:111993204            18.7%           1.81m           8.03m
[2016-11-01 17:21:30] <INFO>   chr6:26036096            19.5%           2.21m           9.21m
[2016-11-01 17:21:43] <INFO>  chr7:117071382            23.3%           2.41m           8.08m
[2016-11-01 17:21:58] <INFO>   chr8:67039906            25.5%           2.68m           7.95m
[2016-11-01 17:22:06] <INFO>  chr9:135715793            29.9%           2.80m           6.68m
[2016-11-01 17:22:16] <INFO> chr10:118949728            33.7%           2.96m           5.93m
[2016-11-01 17:22:36] <INFO>  chr11:27625597            34.6%           3.30m           6.33m
[2016-11-01 17:22:53] <INFO>  chr12:52856682            36.3%           3.60m           6.40m
[2016-11-01 17:23:11] <INFO>  chr13:32840198            37.4%           3.88m           6.60m
[2016-11-01 17:23:23] <INFO>  chr14:88349470            40.3%           4.08m           6.18m
[2016-11-01 17:23:39] <INFO>  chr15:48649338            41.8%           4.35m           6.16m
[2016-11-01 17:23:57] <INFO>   chr16:3723247            42.0%           4.66m           6.56m
[2016-11-01 17:23:58] <WARN> Skipping region chr16:3826935-3827292 as there are too many haplotypes
[2016-11-01 17:24:24] <INFO>  chr17:41144546            43.3%           5.10m           6.80m
[2016-11-01 17:24:47] <INFO>  chr18:21060436            44.0%           5.48m           7.11m
[2016-11-01 17:25:00] <INFO>  chr19:41852114            45.4%           5.70m           7.01m
[2016-11-01 17:25:14] <INFO>  chr20:10568782            45.7%           5.93m           7.20m
[2016-11-01 17:25:27] <INFO>  chr21:27201382            46.6%           6.15m           7.23m
[2016-11-01 17:25:57] <INFO>  chr22:42471627            48.0%           6.65m           7.40m
[2016-11-01 17:26:10] <INFO>  chrX:154014372            53.0%           6.88m           6.25m
[2016-11-01 17:26:12] <INFO>               -             100%           6.91m               -
[2016-11-01 17:26:12] <INFO> Finished calling 3,095,693,983bp, total runtime 6.91m
[2016-11-01 17:26:12] <INFO> ------------------------------------------------------------------------
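
One way to double-check that the finished output really is BGZF-compressed, independent of the warning, is to inspect its first bytes. The sketch below is a simplified check based on the BGZF block header layout in the SAM specification (gzip magic, the FEXTRA flag, and a `BC` extra subfield of length 2 at the start of the extra field). The function names are hypothetical, and it assumes the `BC` subfield comes first, which the format does not strictly require. A zero-byte or unreadable file fails the check.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: returns true if the buffer starts with a BGZF-style
// gzip header. Simplification: assumes "BC" is the first extra subfield.
inline bool looks_like_bgzf(const unsigned char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    if (size < 16) return false;            // too short to hold the header
    return data[0] == 0x1f && data[1] == 0x8b  // gzip magic
        && data[2] == 0x08                     // CM: deflate
        && (data[3] & 0x04) != 0               // FLG: FEXTRA set
        && data[12] == 'B' && data[13] == 'C'  // BGZF subfield identifier
        && data[14] == 2 && data[15] == 0;     // SLEN == 2 (little-endian)
}

// Hypothetical helper: reads the first 16 bytes of `path` and applies the check.
// A missing, unreadable, or empty file yields false.
inline bool file_looks_like_bgzf(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::array<char, 16> buf{};
    in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
    const auto n = static_cast<std::size_t>(in.gcount());
    return looks_like_bgzf(reinterpret_cast<const unsigned char*>(buf.data()), n);
}
```

Calling `file_looks_like_bgzf("test.vcf.gz")` after the run finishes would then say whether the header has this shape; it does not validate the rest of the file.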
@iqbal-lab

Is that 6 minutes to variant call a human ??

@dancooke dancooke self-assigned this Nov 2, 2016
@dancooke dancooke added the bug label Nov 2, 2016
@dancooke
Member

dancooke commented Nov 2, 2016

Hi Jared,

Thanks for reporting this. Although I can't reproduce the issue on my machine, I think I know what's going on. Basically octopus will try to generate an index for compressed VCF (or BCF) output. This happens on destruction of the handle to the file. When the handle gets moved around early in the run (before anything has been written) the 'moved from' handle gets destroyed. I think the old handle remains in a valid state in your setup and therefore attempts to build an index on the empty file, which triggers this warning.

This shouldn't have any adverse effects downstream, but I've added some additional checks to stop the index being generated on empty handles. Would you mind checking out the develop branch and seeing if the warning is still being generated?

Many thanks,
Dan
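
The mechanism described in the comment above can be illustrated with a minimal sketch. This is not octopus's actual code: `OutputHandle`, its members, and the `indexed` log are hypothetical stand-ins, and pushing a path onto a vector stands in for building a real index. The point is that a moved-from object is still destroyed, so a destructor that indexes unconditionally would also run for the moved-from, empty handle. The guard in the destructor skips that case.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a compressed-output handle; not octopus's real class.
class OutputHandle {
public:
    // `indexed` records which paths would be indexed (stand-in for an index build).
    OutputHandle(std::string path, std::vector<std::string>& indexed)
        : path_(std::move(path)), indexed_(&indexed) {}

    // Moving transfers ownership and explicitly marks the source as empty.
    OutputHandle(OutputHandle&& other) noexcept
        : path_(std::move(other.path_)),
          indexed_(other.indexed_),
          records_(other.records_),
          owns_file_(other.owns_file_) {
        other.records_ = 0;
        other.owns_file_ = false;
    }

    OutputHandle(const OutputHandle&) = delete;
    OutputHandle& operator=(const OutputHandle&) = delete;
    OutputHandle& operator=(OutputHandle&&) = delete;

    void write(const std::string& record) {
        assert(owns_file_);  // writing through a moved-from handle is a bug
        (void)record;        // a real handle would append the record here
        ++records_;
    }

    ~OutputHandle() {
        // Guard: index only if this handle still owns a file with content.
        if (owns_file_ && records_ > 0) indexed_->push_back(path_);
    }

private:
    std::string path_;
    std::vector<std::string>* indexed_;
    std::size_t records_ = 0;
    bool owns_file_ = true;
};
```

Without the `owns_file_` and `records_` checks, the destructor of the moved-from handle would also attempt the index step. What a moved-from `std::string` contains is valid but unspecified, which may be why the behaviour differs between setups.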

@jaredo
Contributor Author

jaredo commented Nov 2, 2016

> Is that 6 minutes to variant call a human ??

@iqbal-lab Alas no, this is on a ~100MB test .bam. Still, we are seeing running times of ~75 minutes with --fast enabled on 30x WGS data on a 20-core node.

@dancooke checking develop now

@iqbal-lab

Wow.

@jaredo
Contributor Author

jaredo commented Nov 2, 2016

@dancooke

I can confirm that I no longer see the warning message in the develop branch. Thanks for the fast response.

@jaredo jaredo closed this as completed Nov 2, 2016