[ADAM-962] Fix corrupt single-file BAM output. #964

Merged
merged 1 commit into from Jul 19, 2016

4 participants

fnothaft commented Feb 26, 2016

It seems like we were doing something incorrectly when writing the header. Additionally, we now write a correct end-of-file. Resolves #962.
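
For readers unfamiliar with the end-of-file issue: the SAM/BAM specification defines a fixed 28-byte empty BGZF block as the end-of-file marker, and many tools warn about or reject BAM files that lack it. The following is a minimal sketch for checking a written file, not code from this patch. It is in Python rather than ADAM's Scala, the helper name `has_bgzf_eof` is hypothetical, and it assumes the "correct end-of-file" mentioned above refers to this marker.

```python
import os

# The 28-byte empty BGZF block that the SAM/BAM specification
# defines as the end-of-file marker.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600"
    "42430200" "1b00" "0300"
    "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker.

    Hypothetical helper for illustration; not part of ADAM or Hadoop-BAM.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Seek to the last 28 bytes and compare them with the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

A check like this only confirms the trailing marker; it says nothing about whether the header block itself was written correctly.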

AmplabJenkins commented Feb 26, 2016

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1098/

Build result: FAILURE

GitHub pull request #964 of commit f27ffe6 automatically merged.
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/964/merge^{commit} # timeout=10
 > git branch -a --contains 7031d8874959dffd46097e0aaa7c72d1db5aded6 # timeout=10
 > git rev-parse remotes/origin/pr/964/merge^{commit} # timeout=10
Checking out Revision 7031d8874959dffd46097e0aaa7c72d1db5aded6 (origin/pr/964/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 7031d8874959dffd46097e0aaa7c72d1db5aded6
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centos
Triggering ADAM-prb ? 2.6.0,2.10,1.5.2,centos
Touchstone configurations resulted in FAILURE, so aborting...
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

AmplabJenkins commented Feb 26, 2016

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1099/
Test FAILed.

AmplabJenkins commented Feb 26, 2016

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1100/

Build result: FAILURE

GitHub pull request #964 of commit 3687d7e automatically merged.
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/964/merge^{commit} # timeout=10
 > git branch -a --contains b40c1ab # timeout=10
 > git rev-parse remotes/origin/pr/964/merge^{commit} # timeout=10
Checking out Revision b40c1ab (origin/pr/964/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f b40c1ab97a031932bc12a9fe1ec10f356324bfee
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centos
Triggering ADAM-prb ? 2.6.0,2.10,1.5.2,centos
Touchstone configurations resulted in FAILURE, so aborting...
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

Member

fnothaft commented Feb 26, 2016

Jenkins, retest this please.

AmplabJenkins commented Feb 26, 2016

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1101/

Build result: FAILURE

GitHub pull request #964 of commit 3687d7e automatically merged.
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/964/merge^{commit} # timeout=10
 > git branch -a --contains b40c1ab # timeout=10
 > git rev-parse remotes/origin/pr/964/merge^{commit} # timeout=10
Checking out Revision b40c1ab (origin/pr/964/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f b40c1ab
 > git rev-list b40c1ab # timeout=10
Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centos
Triggering ADAM-prb ? 2.6.0,2.10,1.5.2,centos
Touchstone configurations resulted in FAILURE, so aborting...
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

Member

heuermh commented Mar 30, 2016

This worked for me in at least one instance.

Is this just waiting on fixing unit test failures? I would guess there is simply a problem with AlignmentRecordRDDFunctionsSuite.checkFiles not handling binary file encoding correctly. I can look into that.
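
To illustrate the kind of problem guessed at here: a file comparison that reads BAM output as text can mis-decode or normalize bytes, whereas a byte-for-byte comparison cannot. The sketch below is an assumption-laden illustration in Python, not the actual `checkFiles` implementation (which is not shown in this thread); the name `same_bytes` and the chunk size are hypothetical.

```python
def same_bytes(path_a, path_b, chunk_size=65536):
    """Compare two files byte for byte, without any text decoding.

    Hypothetical helper for illustration only.
    """
    # Opening in "rb" mode avoids charset decoding and newline translation,
    # both of which can corrupt a comparison of compressed binary data.
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:
                # Both files reached end-of-file at the same point.
                return True
```

Note that a byte-level comparison of BAM files is strict: two files holding the same records can still differ if they were compressed with different settings.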

Member

fnothaft commented Mar 30, 2016

> Is this just waiting on fixing unit test failures? I would guess there is simply a problem with AlignmentRecordRDDFunctionsSuite.checkFiles not handling binary file encoding correctly. I can look into that.

IIRC, this was waiting on HadoopGenomics/Hadoop-BAM#80, but honestly, I'm a bit foggy.

Member

heuermh commented Mar 30, 2016

In a branch with this PR applied and Hadoop-BAM at version 7.4.1-SNAPSHOT (including the PR linked above), I still see similar test failures.

Member

fnothaft commented Mar 30, 2016

Ah, OK, that confirms that I am remembering wrong... If you wouldn't mind looking into checkFiles, I'd be appreciative.

Member

heuermh commented Jun 3, 2016

Could you rebase and push this? I'd like to confirm that there are still issues with Hadoop-BAM version 7.5.0 before changing checkFiles.

@jpdna jpdna added the bug label Jun 3, 2016

@heuermh heuermh modified the milestone: 0.20.0 Jun 5, 2016

Member

fnothaft commented Jul 16, 2016

This is good to go!

AmplabJenkins commented Jul 16, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1338/
Test PASSed.

Member

heuermh commented Jul 18, 2016

Some thought to ordering between this and #1060 may be necessary, since the latter bumps Hadoop-BAM to 7.6.0. I propose merging this first and then dealing with any 7.6.0-related changes in behaviour separately.

Member

fnothaft commented Jul 18, 2016

I don't think Hadoop-BAM 7.6.0 will impact this patch, but I am OK with merging this before #1060.

@heuermh heuermh modified the milestone: 0.20.0 Jul 18, 2016

[ADAM-962] Fix corrupt single-file BAM output.
It seems like we were doing something incorrectly when writing the header.
Additionally, we now write a correct end-of-file. Resolves #962.

Member

fnothaft commented Jul 19, 2016

Rebased!

AmplabJenkins commented Jul 19, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1350/
Test PASSed.

@heuermh heuermh merged commit 4de31aa into bigdatagenomics:master Jul 19, 2016

1 check passed

default Merged build finished.

Member

heuermh commented Jul 19, 2016

Thank you, @fnothaft!

fnothaft added a commit to fnothaft/adam that referenced this pull request Aug 8, 2016

[ADAM-676] Clean up header issues for sharded files.
Resolves #676. In #964, we resolved the "header not set" issues for single-file
SAM/BAM output. This change propagates this fix to sharded SAM/BAM output, and
VCF.

fnothaft added commits with the same message to fnothaft/adam that referenced this pull request on Aug 30, Sep 6, and Sep 7, 2016.