BAM output in ADAM appears to be corrupt #962

Closed
fnothaft opened this Issue Feb 25, 2016 · 7 comments

Comments

Projects
None yet
3 participants
@fnothaft
Member

fnothaft commented Feb 25, 2016

@fnothaft fnothaft self-assigned this Feb 25, 2016

@fnothaft fnothaft referenced this issue in BD2KGenomics/toil-scripts Feb 25, 2016

Closed

ADAM output is corrupt in S3 #117

@heuermh heuermh modified the milestone: 0.19.1 Feb 25, 2016

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Feb 26, 2016

Member

No dice on reading the per-partition BAM file:

~/adam$ samtools view part-r-00000
[E::hts_hopen] fail to open file 'part-r-00000' 
[E::hts_open_format] fail to open file 'part-r-00000' 
samtools view: failed to open "part-r-00000" for reading: Success

I don't think we write the BAM header with each partition, which might be scoobying this test. Let me change that and retry...

Member

fnothaft commented Feb 26, 2016

No dice on reading the per-partition BAM file:

~/adam$ samtools view part-r-00000
[E::hts_hopen] fail to open file 'part-r-00000' 
[E::hts_open_format] fail to open file 'part-r-00000' 
samtools view: failed to open "part-r-00000" for reading: Success

I don't think we write the BAM header with each partition, which might be scoobying this test. Let me change that and retry...

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Feb 26, 2016

Member

OK, same thing if I write the BAM header as part of each partition. IDK...?

Time to go all stone ages and toLocalIterator this up...

Member

fnothaft commented Feb 26, 2016

OK, same thing if I write the BAM header as part of each partition. IDK...?

Time to go all stone ages and toLocalIterator this up...

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Feb 26, 2016

Member

OK, have a WIP patch I am testing now at https://github.com/fnothaft/adam/tree/single-bam-localiter-ehf. So far, seems to work. Ugly as sin, but I'll clean it up later.

Member

fnothaft commented Feb 26, 2016

OK, have a WIP patch I am testing now at https://github.com/fnothaft/adam/tree/single-bam-localiter-ehf. So far, seems to work. Ugly as sin, but I'll clean it up later.

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Feb 26, 2016

Member

The patch works but the performance is terrible because SAMRecordWritable seems to fall back on Java serialization. I'm working on a new patch... TBD.

Member

fnothaft commented Feb 26, 2016

The patch works but the performance is terrible because SAMRecordWritable seems to fall back on Java serialization. I'm working on a new patch... TBD.

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Feb 26, 2016

Member

OK, got a working fix! PR is on the way shortly...

Member

fnothaft commented Feb 26, 2016

OK, got a working fix! PR is on the way shortly...

fnothaft added a commit to fnothaft/adam that referenced this issue Feb 26, 2016

[ADAM-962] Fix corrupt single-file BAM output.
It seems like we were doing something incorrectly when writing the header.
Additionally, we now write a correct end-of-file. Resolves #962.

fnothaft added a commit to fnothaft/adam that referenced this issue Feb 26, 2016

[ADAM-962] Fix corrupt single-file BAM output.
It seems like we were doing something incorrectly when writing the header.
Additionally, we now write a correct end-of-file. Resolves #962.

fnothaft added a commit to fnothaft/adam that referenced this issue Feb 26, 2016

[ADAM-962] Fix corrupt single-file BAM output.
It seems like we were doing something incorrectly when writing the header.
Additionally, we now write a correct end-of-file. Resolves #962.

fnothaft added a commit to fnothaft/adam that referenced this issue Feb 26, 2016

[ADAM-962] Fix corrupt single-file BAM output.
It seems like we were doing something incorrectly when writing the header.
Additionally, we now write a correct end-of-file. Resolves #962.

fnothaft added a commit to fnothaft/adam that referenced this issue Feb 26, 2016

[ADAM-962] Fix corrupt single-file BAM output.
It seems like we were doing something incorrectly when writing the header.
Additionally, we now write a correct end-of-file. Resolves #962.

fnothaft added a commit to fnothaft/cgl-docker-lib that referenced this issue Feb 26, 2016

@fnothaft fnothaft referenced this issue in BD2KGenomics/cgl-docker-lib Feb 26, 2016

Merged

Hotfix for ADAM/BAM output issue. #89

@fnothaft fnothaft modified the milestones: 0.20.0, 0.19.1 May 18, 2016

@jpdna

This comment has been minimized.

Show comment
Hide comment
@jpdna

jpdna Jun 6, 2016

Member

Can this issue be closed?

Member

jpdna commented Jun 6, 2016

Can this issue be closed?

@jpdna jpdna added the bug label Jun 6, 2016

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Jun 6, 2016

Member

No, the open PR still needs a fix.

Member

fnothaft commented Jun 6, 2016

No, the open PR still needs a fix.

@heuermh heuermh referenced this issue Jun 7, 2016

Closed

Release ADAM version 0.20.0 #1048

47 of 61 tasks complete

fnothaft added a commit to fnothaft/adam that referenced this issue Jul 16, 2016

[ADAM-962] Fix corrupt single-file BAM output.
It seems like we were doing something incorrectly when writing the header.
Additionally, we now write a correct end-of-file. Resolves #962.

fnothaft added a commit to fnothaft/adam that referenced this issue Jul 19, 2016

[ADAM-962] Fix corrupt single-file BAM output.
It seems like we were doing something incorrectly when writing the header.
Additionally, we now write a correct end-of-file. Resolves #962.

@heuermh heuermh closed this in #964 Jul 19, 2016

cket added a commit to cket/cgl-docker-lib that referenced this issue Sep 7, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment