BAM header is not getting set on partition 0 with headerless BAM output format #916
Comments
FYI @tomwhite added support for merging BAMs in GATK4; see ReadsSparkSink.java.

I've got a fix for this prepped; just cleaning up a unit test failure. Should be good to go in 15 minutes.
fnothaft added a commit to fnothaft/adam that referenced this issue on Jan 12, 2016:

Resolves bigdatagenomics#916. Makes several modifications that should eliminate the header attach issue when writing back to SAM/BAM:
* Writes the SAM/BAM header as a single file.
* Instead of trying to attach the SAM/BAM header to the output format via a singleton object, we pass the path to the SAM/BAM header file via the Hadoop configuration.
* The output format reads the header from HDFS when creating the record writer.
* At the end, once we've written the full RDD and the header file, we merge all via Hadoop's FsUtil.
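The flow described in the commit message can be sketched language-agnostically. The Python sketch below is a simplified stand-in, not ADAM's actual implementation: plain local files stand in for HDFS, a dict stands in for the Hadoop Configuration, and the config key and function names are hypothetical. It mimics the four steps: write the header once, pass its path through a shared configuration, have each partition writer emit headerless records, then concatenate everything at the end.

```python
import os


def write_sam_like_output(records_by_partition, header_lines, out_dir):
    """Sketch of the header-via-configuration pattern (all names hypothetical):
    the header is written to a single file, its path is passed via a config
    mapping (standing in for the Hadoop Configuration), each partition writer
    consults that config, and a final step concatenates header + partitions
    (standing in for the Hadoop-side merge)."""
    os.makedirs(out_dir, exist_ok=True)

    # 1. Write the SAM/BAM-style header as a single file.
    header_path = os.path.join(out_dir, "_header")
    with open(header_path, "w") as f:
        f.write("\n".join(header_lines) + "\n")

    # 2. Pass the header path via the (simulated) configuration,
    #    instead of a singleton object shared across JVMs.
    conf = {"sam.header.path": header_path}

    # 3. Each partition writer emits only records (headerless output format).
    part_paths = []
    for i, records in enumerate(records_by_partition):
        part_path = os.path.join(out_dir, f"part-{i:05d}")
        # A real record writer would read conf["sam.header.path"] here to
        # configure itself; the partition files themselves stay headerless.
        with open(part_path, "w") as f:
            for rec in records:
                f.write(rec + "\n")
        part_paths.append(part_path)

    # 4. Merge header + partition files into the final output.
    final_path = os.path.join(out_dir, "merged.sam")
    with open(final_path, "w") as out:
        for p in [conf["sam.header.path"]] + part_paths:
            with open(p) as src:
                out.write(src.read())
    return final_path
```

Because the header lives in its own file and is merged in first, it appears exactly once at the top of the output, regardless of which partition is written first.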
fnothaft added further commits to fnothaft/adam that referenced this issue (one on Jan 12, two on Jan 13, and five on Jan 14, 2016), each with the same commit message as above.
The bug that will not die... Reported by @almussel. See #676, #691, #711, #712, #721...
@almussel will get us Spark logs tomorrow.