[ADAM-883] Add caching to Transform pipeline. #884

Merged
merged 1 commit into from Nov 19, 2015

3 participants
@fnothaft
Member

fnothaft commented Nov 18, 2015

The Transform pipeline in the CLI has several stages (e.g., sort, indel
realignment, BQSR) that trigger recomputation. If you are running a single
stage off of local storage/HDFS/Tachyon, this is OK. However, if you're running
multiple stages, or you are loading data from S3, etc., this can lead to serious
performance degradation. To address this, I've added caching statements around
these stages. Additionally, I've added a hook so that the user can specify the
storage level to use for caching. Resolves #883.
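
For readers unfamiliar with the pattern, here is a minimal sketch of persisting an RDD at a user-chosen storage level before running several stages on it. It is not the code from this PR: the object name, the two stage functions, and the use of the first command-line argument as the storage-level hook are all hypothetical. Only `StorageLevel.fromString`, `persist`, and `unpersist` are standard Spark calls.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object CachingSketch {

  // Hypothetical stand-ins for pipeline stages. Each ends in an action,
  // so an unpersisted input is re-evaluated from its source every time.
  def countNonEmpty(reads: RDD[String]): Long =
    reads.filter(_.nonEmpty).count()

  def totalBases(reads: RDD[String]): Long =
    reads.map(_.length.toLong).fold(0L)(_ + _)

  def main(args: Array[String]): Unit = {
    // Hypothetical hook: the first argument names the storage level.
    // fromString throws IllegalArgumentException on an unknown name.
    val level = StorageLevel.fromString(args.headOption.getOrElse("MEMORY_ONLY"))

    val conf = new SparkConf().setAppName("caching-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Toy data standing in for reads loaded from remote storage.
      val reads = sc.parallelize(Seq("ACGT", "TTGA", "", "GGCC"))

      // Mark the input for caching; it is materialized by the first action
      // and reused by the second instead of being recomputed.
      reads.persist(level)

      val nonEmpty = countNonEmpty(reads)
      val bases = totalBases(reads)
      println(s"non-empty reads: $nonEmpty, total bases: $bases")

      // Release the cached blocks once no further stage needs them.
      reads.unpersist()
    } finally {
      sc.stop()
    }
  }
}
```

Passing a level such as `MEMORY_AND_DISK` or `DISK_ONLY` instead of the default trades memory pressure against re-read cost, which is presumably why the level is left to the user.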

AmplabJenkins commented Nov 18, 2015

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1016/
Test PASSed.

Member

fnothaft commented Nov 19, 2015

Fixed nit and rebased.

AmplabJenkins commented Nov 19, 2015

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1018/
Test PASSed.

heuermh added a commit that referenced this pull request Nov 19, 2015

Merge pull request #884 from fnothaft/caching
[ADAM-883] Add caching to Transform pipeline.

@heuermh heuermh merged commit 5845b15 into bigdatagenomics:master Nov 19, 2015

1 check passed

default Merged build finished.
Member

heuermh commented Nov 19, 2015

Thanks!
