tabix Error [E::hts_idx_push] chromosome blocks not continuous #624

Closed
inti opened this issue Oct 1, 2014 · 19 comments

@inti

inti commented Oct 1, 2014

Hi @chapmanb,
I think I have tracked down the error message I reported earlier in #613.
It turns out that what happens is exactly what the message says ...

Briefly, the pipeline sorts the BED file by chromosome and position, but for some reason the rows are not ordered properly by chromosome name. Running the pipeline's sort command gives the output below, in which rows for scaffold_104 are interleaved with rows for scaffold_1040, scaffold_1043, and scaffold_1047. That is exactly what "chromosome blocks not continuous" means.

[ipedroso@jimi work]$ sort -k1,1 -k2,2n 1_2014-09-26_populus_trichocarpa_diversity_1-sort-callable-callableblocks.bed | grep scaffold_104 
scaffold_1040   1586    2955
scaffold_104    0       242
scaffold_1040   566     606
scaffold_1040   709     1271
scaffold_1040   3113    5047
scaffold_1040   5177    6391
scaffold_1040   6502    13319
scaffold_104    2774    3373
scaffold_104    14249   16134
scaffold_104    16283   16356
scaffold_104    16478   42093
scaffold_1043   0       203
scaffold_1043   342     551
scaffold_1043   867     985
scaffold_1043   1182    2397
scaffold_1043   2509    3187
scaffold_1043   3553    11938
scaffold_1043   12773   12835
scaffold_104    421     456
scaffold_104    3529    4705
scaffold_104    4825    4939
scaffold_104    5546    5668
scaffold_104    5904    7718
scaffold_104    42225   44057
scaffold_104    44182   44766
scaffold_104    68438   68580
scaffold_104    68726   69439
scaffold_1047   105     559
scaffold_104    73523   73694
scaffold_104    75867   75954
scaffold_1047   677     2510
scaffold_1047   2634    2982
scaffold_1047   3364    12767
scaffold_1047   12871   13201
scaffold_104    8097    8805
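
The condition tabix complains about can be checked before indexing. A minimal sketch, assuming a whitespace-delimited BED file with the sequence name in column 1; `check_contiguous` is a hypothetical helper name, not part of bcbio, tabix, or htslib:

```shell
# Hypothetical helper: report the first sequence name that reappears
# after rows for a different name have already been seen.
# Reads the files given as arguments, or standard input if there are none.
check_contiguous() {
    awk '
        /^#/ { next }                      # skip comment/header lines
        $1 != prev {                       # a new block of rows starts here
            if ($1 in seen) {              # ...but this name already had a block
                printf "line %d: %s is not contiguous\n", NR, $1
                bad = 1
                exit                       # jump to END, which sets the status
            }
            seen[$1] = 1
            prev = $1
        }
        END { exit bad }                   # 0 if every name forms one block
    ' "$@"
}
```

On the grep output above it would flag the third row, where scaffold_1040 reappears after a scaffold_104 row. It only checks that names are grouped; it does not check that positions are ascending within a block.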

I found at https://www.biostars.org/p/64687/ that adding the V option to the chromosome key of sort should solve the problem:

[ipedroso@jimi work]$ sort -k1,1V -k2,2n 1_2014-09-26_populus_trichocarpa_diversity_1-sort-callable-callableblocks.bed | grep scaffold_104 
scaffold_104    140844  142121
scaffold_104    142240  145935
scaffold_104    146082  146515
scaffold_1040   566     606
scaffold_1040   709     1271
scaffold_1040   1586    2955
scaffold_1040   3113    5047
scaffold_1040   5177    6391
scaffold_1040   6502    13319
scaffold_1043   0       203
scaffold_1043   342     551
scaffold_1043   867     985
scaffold_1043   1182    2397
scaffold_1043   2509    3187
scaffold_1043   3553    11938
scaffold_1043   12773   12835
scaffold_1047   105     559
scaffold_1047   677     2510
scaffold_1047   2634    2982
scaffold_1047   3364    12767
scaffold_1047   12871   13201

And it actually works :). Here is the original sort, which fails at the tabix step:

[ipedroso@jimi work]$ sort -k1,1 -k2,2n 1_2014-09-26_populus_trichocarpa_diversity_1-sort-callable-callableblocks.bed > bed.bed
[ipedroso@jimi work]$ bgzip -c bed.bed > bed.bed.gz
[ipedroso@jimi work]$ tabix -f -p bed bed.bed.gz 
[E::hts_idx_push] chromosome blocks not continuous
tbx_index_build failed: bed.bed.gz

And with the V option added, the same steps run without the error:

sort -k1,1V -k2,2n 1_2014-09-26_populus_trichocarpa_diversity_1-sort-callable-callableblocks.bed > bed.bed
bgzip -c bed.bed > bed.bed.gz
tabix -f -p bed bed.bed.gz 

The Biostars post says that the V option may not be available in all sort installations, so I am not sure this fix will work everywhere.
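
One way to cope with the portability concern is to probe for `-V` at run time and fall back to a plain sort. This is only a sketch: `sort_bed_portable` is a hypothetical name, and forcing `LC_ALL=C` is an assumption made here to keep the ordering independent of the user's locale, not something the pipeline is known to do.

```shell
# Hypothetical helper: sort BED-like rows (name, start, ...) so that every
# sequence name forms one contiguous block, with or without "sort -V".
# Reads the files given as arguments, or standard input if there are none.
sort_bed_portable() {
    if printf 'a\n' | sort -V >/dev/null 2>&1; then
        # Version sort is available: scaffold_2 sorts before scaffold_10.
        LC_ALL=C sort -k1,1V -k2,2n "$@"
    else
        # Fallback: byte order in the C locale. Names are not in "natural"
        # order any more, but rows with the same name still stay together.
        LC_ALL=C sort -k1,1 -k2,2n "$@"
    fi
}
```

For example, `sort_bed_portable input.bed | bgzip -c > bed.bed.gz` followed by `tabix -f -p bed bed.bed.gz`, where `input.bed` stands for the callable-blocks file used above.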

@roryk
Collaborator

roryk commented Oct 1, 2014

Weird, works fine for me:

sort -k1,1 -k2,2n sort_error.bed
scaffold_104    0       242
scaffold_104    421     456
scaffold_104    2774    3373
scaffold_104    3529    4705
scaffold_104    4825    4939
scaffold_104    5546    5668
scaffold_104    5904    7718
scaffold_104    8097    8805
scaffold_104    14249   16134
scaffold_104    16283   16356
scaffold_104    16478   42093
scaffold_104    42225   44057
scaffold_104    44182   44766
scaffold_104    68438   68580
scaffold_104    68726   69439
scaffold_104    73523   73694
scaffold_104    75867   75954
scaffold_1040   566     606
scaffold_1040   709     1271
scaffold_1040   1586    2955
scaffold_1040   3113    5047
scaffold_1040   5177    6391

What version of sort is on your machine? I have an old version:

rory@clotho:~$ sort --version
sort (GNU coreutils) 5.93

The -V option won't do it because you need a super new version of sort, which we can't guarantee.

@inti
Author

inti commented Oct 1, 2014

Yep, it must be a version thing.

Does it work for you with sort -k1,1V -k2,2n?

[ipedroso@jimi ~]$ sort --version
sort (GNU coreutils) 8.22
Copyright (C) 2013 Free Software Foundation, Inc.
Licencia GPLv3+: GPL de GNU versión 3 o posterior
http://gnu.org/licenses/gpl.html.
Esto es software libre: usted es libre de cambiarlo y redistribuirlo.
No hay NINGUNA GARANTÍA, hasta donde permite la ley.

Escrito por Mike Haertel y Paul Eggert.

@roryk
Collaborator

roryk commented Oct 1, 2014

Hi Inti,

Nope, -V doesn't work on my older version of sort. What version are you using? It is weird that the first command doesn't work!

@inti
Author

inti commented Oct 1, 2014

Sorry, the version got hidden in my previous message:

[ipedroso@jimi ~]$ sort --version
sort (GNU coreutils) 8.22

@lpantano
Collaborator

lpantano commented Oct 1, 2014

Hi,

I have 8.21 and it works with and without -V ...

@inti
Author

inti commented Oct 1, 2014

Hi Lorena,
What OS are you on?
I am on CentOS:

[ipedroso@jimi ~]$ uname -a
Linux jimi 3.10.0-123.8.1.el7.x86_64 #1 SMP Mon Sep 22 19:06:58 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

@lpantano
Collaborator

lpantano commented Oct 1, 2014

Linux miro 3.13.0-34-generic #60-Ubuntu SMP Wed Aug 13 15:45:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

this is weird :)

@inti
Author

inti commented Oct 1, 2014

Is it possible to package sort with the bcbio-nextgen installation using CloudBioLinux?
Perhaps that would avoid this ... although it seems to affect only me :(

@roryk
Collaborator

roryk commented Oct 1, 2014

Works fine in CentOS 5 with an old version of sort. The changelog between 8.21 and 8.22 doesn't have any clues. Uhhh... I'm at a loss.
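
One variable the version comparison does not cover is the locale. GNU sort compares keys using the current locale's collation rules unless `LC_ALL=C` is in effect, and inti's `sort --version` output earlier in the thread is localized in Spanish, so that machine was not running in the plain C locale. Whether this explains the difference is not established anywhere in this thread. A small diagnostic sketch (`locale_changes_order` is a hypothetical name):

```shell
# Hypothetical diagnostic: exits 0 when sorting the file under the current
# locale gives a different order than sorting it under the C locale,
# 1 when both orders agree, and 2 if sort itself fails.
locale_changes_order() (
    # The subshell body keeps these variables out of the caller's shell.
    current=$(sort -k1,1 -k2,2n "$1") || exit 2
    bytewise=$(LC_ALL=C sort -k1,1 -k2,2n "$1") || exit 2
    [ "$current" != "$bytewise" ]
)
```

If `locale_changes_order bed.bed` succeeds on the affected machine, rerunning the pipeline's sort with `LC_ALL=C` in front of it would be a cheap experiment before suspecting a bug in a particular coreutils release.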

@inti
Author

inti commented Oct 1, 2014

sort -k1,1 -k2,2 works fine on my Mac OS laptop with

LONCO:~ inti$ sort --version
sort (GNU coreutils) 5.93

@lpantano
Collaborator

lpantano commented Oct 1, 2014

The only thing I can think of is compiling sort locally and adding it to your PATH, so that bcbio finds it first.
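
A sketch of that workaround, assuming a newer sort has already been built and installed under a private prefix. `$HOME/local/bin` is a placeholder, not a path from this thread, and `prepend_path` is a hypothetical helper:

```shell
# Put a directory at the front of PATH unless it is already first,
# so that executables in it shadow the system-wide ones.
prepend_path() {
    case "$PATH" in
        "$1":*|"$1") ;;                      # already first: nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

prepend_path "$HOME/local/bin"   # placeholder install location
command -v sort                  # shows which sort the shell will now run
```

The change only affects the current shell and the processes started from it, so bcbio would have to be launched from the same session (or the line added to the shell's startup file).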

@inti
Author

inti commented Oct 1, 2014

The most recent version works: sort (GNU coreutils) 8.23. This is odd.

@inti inti closed this as completed Oct 1, 2014
@roryk
Collaborator

roryk commented Oct 1, 2014

Hm, maybe a bug in 8.22? The changelog doesn't suggest one, but maybe?

@baroslava

Hello,
I received the same error message when I tried to index a BAM file. The BAM file was produced by CrossMap, converting an hg19 BAM file to hg38. I have overcome the trouble with the headers, but the final BAM file cannot be indexed for proper use. I still get the message [E::hts_idx_push] chromosome blocks not continuous. Any suggestions? Thanks. B

@chapmanb
Member

Is this referring to usage/indexing with bcbio? If you're having a general problem with CrossMap or samtools indexing, the htslib (http://www.htslib.org/) or CrossMap folks (http://crossmap.sourceforge.net/#contact) would be more able to help. My guess is you need to sort the BAM file after running CrossMap.
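
For reference, a coordinate sort followed by indexing would typically look like the sketch below. It assumes samtools 1.3 or later is on the PATH (the thread does not say which version was in use), and `sort_and_index_bam` and its file arguments are hypothetical:

```shell
# Coordinate-sort a BAM file and index the sorted result.
# $1 = input BAM, $2 = output BAM (placeholders chosen by the caller).
sort_and_index_bam() {
    samtools sort -o "$2" "$1" &&    # write the sorted copy to $2
    samtools index "$2"              # index only if sorting succeeded
}
```

It would be called as `sort_and_index_bam crossmap_output.bam crossmap_output.sorted.bam`, with both names standing in for real files.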

@baroslava

I have tried sorting the BAM file after CrossMap. It did not help at all.

@chapmanb chapmanb mentioned this issue Sep 16, 2015
@genecell

It's a little strange here: my sort version is 8.22 and only sort -k 1,1 -k2,2n works for me, not sort -k 1,1V -k2,2n:

sort (GNU coreutils) 8.22
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and Paul Eggert.

@jingydz

jingydz commented Mar 23, 2024

$ tabix -p vcf xxx.PASS.mac1.vcf.gz
[E::hts_idx_push] Chromosome blocks not continuous
tbx_index_build failed: xxx.PASS.mac1.vcf.gz

How should I handle this for a VCF file?

@jingydz

jingydz commented Mar 23, 2024

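I have resolved this issue by separating the header from the records, sorting the records, and compressing and indexing the result: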

zcat xxx.PASS.vcf.gz |grep "^#" >xxx.PASS.vcf.gz.header
zcat xxx.PASS.vcf.gz |grep -v "^#" >xxx.PASS.vcf.gz.record
sort -k 1,1V -k2,2n xxx.PASS.vcf.gz.record >xxx.PASS.vcf.gz.record.sort
cat xxx.PASS.vcf.gz.header xxx.PASS.vcf.gz.record.sort |bgzip -c >xxx.PASS.sort.vcf.gz
tabix -p vcf xxx.PASS.sort.vcf.gz
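
The same steps can be wrapped in a small function. This is a sketch rather than a drop-in tool: `sort_vcf_text` is a hypothetical name, it expects an uncompressed VCF, and it uses `LC_ALL=C sort -k1,1` instead of `-k1,1V` so that it does not depend on version-sort support. Contigs then come out in byte order (chr10 before chr2), which still keeps each contig in one block.

```shell
# Hypothetical helper: print a VCF with its header lines first and its
# records sorted by contig name (byte order) and then by position.
# $1 = path to an uncompressed VCF file.
sort_vcf_text() {
    grep '^#' "$1"                                        # header lines, order kept
    grep -v '^#' "$1" | LC_ALL=C sort -k1,1 -k2,2n        # records, sorted
}
```

With the file names from the commands above, that would be `zcat xxx.PASS.vcf.gz > xxx.PASS.vcf`, then `sort_vcf_text xxx.PASS.vcf | bgzip -c > xxx.PASS.sort.vcf.gz`, then `tabix -p vcf xxx.PASS.sort.vcf.gz`.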
