[ADAM-883] Add caching to Transform pipeline. #884

The Transform pipeline in the CLI has several stages (e.g., sort, indel realignment, BQSR) that trigger recomputation of the data they consume. If you are running a single stage off of local storage, HDFS, or Tachyon, the cost is tolerable. However, if you are running multiple stages, or loading data from a remote store such as S3, the repeated recomputation can seriously degrade performance. To address this, I've added the appropriate caching statements. I've also added a hook that lets the user specify the storage level to use for caching. Resolves bigdatagenomics#883.
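For reviewers less familiar with Spark caching, here is a minimal sketch of the general pattern, not the code in this PR. It assumes Spark is on the classpath and runs in local mode. The names `CachingSketch`, `runStages`, and the way the storage level name is passed in are hypothetical and used only for illustration. The Spark calls (`persist`, `unpersist`, `StorageLevel.fromString`) are standard.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Hypothetical illustration; the object and method names are not from ADAM.
object CachingSketch {

  // Runs a stand-in "stage" and then two actions on its output.
  // Without persist(), each action would recompute the map from the input.
  def runStages(input: RDD[Int], storageLevelName: String): (Long, Int) = {
    // Parse a user-supplied name such as "MEMORY_ONLY" or "MEMORY_AND_DISK".
    val level = StorageLevel.fromString(storageLevelName)

    // Mark the stage output for caching at the requested storage level.
    val stageOutput = input.map(_ * 2).persist(level)

    // The first action materializes and caches the data.
    val count = stageOutput.count()
    // The second action reads from the cache where possible.
    val max = stageOutput.max()

    // Release the cached blocks once downstream work is done.
    stageOutput.unpersist()
    (count, max)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("caching-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Assumption: the storage level arrives as the first argument.
      val levelName = args.headOption.getOrElse("MEMORY_ONLY")
      val (count, max) = runStages(sc.parallelize(1 to 100), levelName)
      println(s"count=$count max=$max")
    } finally {
      sc.stop()
    }
  }
}
```

Taking the storage level as a string keeps the choice in the user's hands: `MEMORY_ONLY` is cheapest when the data fits in memory, while `MEMORY_AND_DISK` avoids recomputation from a slow source when it does not.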

Commits on Nov 19, 2015