[ADAM-436] Optionally output original qualities to fastq #467

Merged
fnothaft merged 1 commit into bigdatagenomics:master from ryan-williams/fastq on Nov 7, 2014

@ryan-williams
Member

ryan-williams commented Nov 7, 2014

Some improvements to fastq-writing flow / adam2fastq:

  • optionally write out original qualities or "recalibrated" ones
  • run adam2fastq through the single-or-paired-fastq interface based on the number of file arguments passed
  • fix a bug where the projection was leaving out needed fields
  • add an optional additional sanity check when writing paired-fastq
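
The second bullet (choosing the single or paired interface from the number of file arguments) can be sketched as follows. This is a minimal Python illustration, not the Scala code in this PR; `output_mode` is a hypothetical helper name.

```python
def output_mode(paths):
    """Pick the writing mode from the number of output paths.

    Hypothetical helper: one path means a single FASTQ file,
    two paths mean one file per mate of a pair.
    """
    if len(paths) == 1:
        return "single"
    if len(paths) == 2:
        return "paired"
    raise ValueError(f"expected 1 or 2 output paths, got {len(paths)}")


if __name__ == "__main__":
    print(output_mode(["reads.fq"]))
    print(output_mode(["reads_1.fq", "reads_2.fq"]))
```

Rejecting any other count up front keeps a mistyped command line from silently writing a partial output.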
@fnothaft

Member

fnothaft commented Nov 7, 2014

+1, LGTM. Squash?

@ryan-williams

Member

ryan-williams commented Nov 7, 2014

squashed

@AmplabJenkins

AmplabJenkins commented Nov 7, 2014

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/377/

Build result: FAILURE

GitHub pull request #467 of commit 889e653 automatically merged.
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-slave-01 (centos) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/467/merge^{commit} # timeout=10
Checking out Revision 28c877b288071db8a061148f71e6f5e4f161ef14 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 28c877b288071db8a061148f71e6f5e4f161ef14
 > git rev-list a7a2569f68f2981425bb14e2fffc1bdbd8024b8c # timeout=10
Triggering ADAM-prb » 1.0.4,centos
Triggering ADAM-prb » 2.2.0,centos
Triggering ADAM-prb » 2.3.0,centos
ADAM-prb » 1.0.4,centos completed with result FAILURE
ADAM-prb » 2.2.0,centos completed with result FAILURE
ADAM-prb » 2.3.0,centos completed with result FAILURE
Test FAILed.

@ryan-williams

Member

ryan-williams commented Nov 7, 2014

huh, could be real failures, let me look

@ryan-williams

Member

ryan-williams commented Nov 7, 2014

right, FastqRecordConverter puts fastq qualities into AlignmentRecord.qual, which is asymmetric with my change here to default to AR.origQual.

I guess I'll have the writing code try to write (origQual|qual) and fall back to writing the other if the first choice doesn't exist...

@fnothaft @massie any thoughts on whether origQual or qual inherently makes more sense on each end? what stage are the "recalibrated" qualities typically computed at?
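
The fallback idea in this comment (try one of origQual/qual, then fall back to the other) can be sketched like this. It is a Python illustration under assumptions, not the code that was merged: the record is modeled as a plain dict with `qual` and `origQual` keys, and `pick_qualities` is a hypothetical name.

```python
def pick_qualities(record, prefer_original):
    """Return the preferred quality string, falling back to the other field.

    `record` is a dict standing in for an AlignmentRecord (an assumption
    made for this sketch); a missing field is represented by None.
    """
    if prefer_original:
        order = ("origQual", "qual")
    else:
        order = ("qual", "origQual")
    for field in order:
        value = record.get(field)
        if value is not None:
            return value
    raise ValueError("record has neither qual nor origQual")


if __name__ == "__main__":
    # A read loaded from FASTQ only has `qual`, so asking for the
    # original qualities falls back to it.
    read = {"qual": "IIII", "origQual": None}
    print(pick_qualities(read, prefer_original=True))
```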

@AmplabJenkins

AmplabJenkins commented Nov 7, 2014

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/379/

Build result: FAILURE

GitHub pull request #467 of commit 4df9b44 automatically merged.
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-slave-01 (centos) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/467/merge^{commit} # timeout=10
Checking out Revision 887c78f7f44a265ef8e256958d8a2937feb6f931 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 887c78f7f44a265ef8e256958d8a2937feb6f931
 > git rev-list a7a2569f68f2981425bb14e2fffc1bdbd8024b8c # timeout=10
Triggering ADAM-prb » 1.0.4,centos
Triggering ADAM-prb » 2.2.0,centos
Triggering ADAM-prb » 2.3.0,centos
ADAM-prb » 1.0.4,centos completed with result FAILURE
ADAM-prb » 2.2.0,centos completed with result FAILURE
ADAM-prb » 2.3.0,centos completed with result FAILURE
Test FAILed.

@fnothaft

Member

fnothaft commented Nov 7, 2014

@ryan-williams I'd go with qual. origQual will only be populated if you've run BQSR before. There's not necessarily a clear reason to align reads, run BQSR, and then realign the reads, but I'm sure someone's done it before.

@ryan-williams

Member

ryan-williams commented Nov 7, 2014

thanks, that makes sense

fastq-writing bug-fixes and cleanups
- unify code-paths for single-/paired-fastq writing

  also use recalibrated qualities

- shorten `-validation` argument to adam2fastq

- add first/second-in-pair to adam2fastq projection

  these are necessary for outputting “/1”/“/2”!

- optional extra pair-checking to adam2fastq

- add extra check to fastq test suite

- whitespaces
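
The "/1"/"/2" suffix and the optional pair check mentioned in this commit message can be sketched as below. This is a Python illustration under assumptions, not the adam2fastq implementation: reads are modeled as `(name, first_in_pair, second_in_pair)` tuples, and `mate_suffix` and `unpaired_names` are hypothetical helpers.

```python
from collections import defaultdict


def mate_suffix(first_in_pair, second_in_pair):
    """Map the two pair flags to the FASTQ read-name suffix."""
    if first_in_pair and not second_in_pair:
        return "/1"
    if second_in_pair and not first_in_pair:
        return "/2"
    raise ValueError("read must be exactly one of first or second in pair")


def unpaired_names(reads):
    """Return the sorted names that lack exactly one first and one second mate.

    `reads` is an iterable of (name, first_in_pair, second_in_pair) tuples,
    a simplified stand-in for projected alignment records.
    """
    counts = defaultdict(lambda: [0, 0])
    for name, first, second in reads:
        entry = counts[name]  # create the entry even if neither flag is set
        if first:
            entry[0] += 1
        if second:
            entry[1] += 1
    return sorted(name for name, entry in counts.items() if entry != [1, 1])


if __name__ == "__main__":
    reads = [("a", True, False), ("a", False, True), ("b", True, False)]
    print(unpaired_names(reads))
```

Without the first/second-in-pair flags in the projection, `mate_suffix` would have nothing to decide on, which is why the commit adds them.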
@AmplabJenkins

AmplabJenkins commented Nov 7, 2014

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/380/

Build result: FAILURE

GitHub pull request #467 of commit 2716937 automatically merged.
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-slave-01 (centos) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/467/merge^{commit} # timeout=10
Checking out Revision a34074082d54668273cacb5528d8aaed1b9c6ea8 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f a34074082d54668273cacb5528d8aaed1b9c6ea8
 > git rev-list 887c78f7f44a265ef8e256958d8a2937feb6f931 # timeout=10
Triggering ADAM-prb » 1.0.4,centos
Triggering ADAM-prb » 2.2.0,centos
Triggering ADAM-prb » 2.3.0,centos
ADAM-prb » 1.0.4,centos completed with result FAILURE
ADAM-prb » 2.2.0,centos completed with result FAILURE
ADAM-prb » 2.3.0,centos completed with result FAILURE
Test FAILed.

@ryan-williams

Member

ryan-williams commented Nov 7, 2014

hrm, it seems like jenkins ran the previous SHA somehow? the line numbers it's giving don't even match up to the commit 2716937

@ryan-williams

Member

ryan-williams commented Nov 7, 2014

@fnothaft can you ask jenkins to retest this?

@fnothaft

Member

fnothaft commented Nov 7, 2014

Jenkins, retest this please.

@ryan-williams

Member

ryan-williams commented Nov 7, 2014

also, since you asked me to squash this earlier, I went ahead and wrote a script to squash my commits together that includes all the commit messages, bulleted and indented :)
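
The script itself is not shown in this thread. As a rough guess at the formatting step only, a squashed message with each original commit bulleted and its body indented could be assembled like this in Python; `squashed_message` is a hypothetical name, and the git plumbing that collects the messages is left out.

```python
def squashed_message(title, messages):
    """Build one commit message from a title and the original commit messages.

    Each original subject line becomes a "- " bullet; any body text is
    indented by two spaces under its bullet, separated by blank lines.
    """
    lines = [title]
    for message in messages:
        subject, _, body = message.strip().partition("\n")
        lines.append(f"- {subject.strip()}")
        body = body.strip()
        if body:
            lines.append("")
            for body_line in body.splitlines():
                stripped = body_line.strip()
                lines.append(f"  {stripped}" if stripped else "")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(squashed_message(
        "fastq-writing bug-fixes and cleanups",
        ["unify code-paths\n\nalso use recalibrated qualities", "whitespaces"],
    ))
```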

@AmplabJenkins

AmplabJenkins commented Nov 7, 2014

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/381/
Test PASSed.

@ryan-williams

Member

ryan-williams commented Nov 7, 2014

huzzah!

@ryan-williams

Member

ryan-williams commented Nov 7, 2014

weird that it ran the old code right after I pushed the new code...

@fnothaft

Member

fnothaft commented Nov 7, 2014

huzzah indeed!

> weird that it ran the old code right after I pushed the new code...

I think I kicked off the new build (by asking Jenkins) before you pushed the updated code.

fnothaft added a commit that referenced this pull request Nov 7, 2014

Merge pull request #467 from ryan-williams/fastq
[ADAM-436] Optionally output original qualities to fastq

@fnothaft fnothaft merged commit 198e00d into bigdatagenomics:master Nov 7, 2014

1 check passed

default Merged build finished.
@fnothaft

Member

fnothaft commented Nov 7, 2014

Merged! Thanks @ryan-williams.