Add cache argument to loadFeatures, additional Feature timers #1427

Merged
merged 2 commits into from Mar 14, 2017

4 participants
@heuermh
Member

heuermh commented Mar 7, 2017

Fixes #1321

@coveralls

coveralls Mar 7, 2017

Coverage Status

Coverage increased (+0.04%) to 76.442% when pulling dd17834 on heuermh:feature-rdd-cache into 07c1982 on bigdatagenomics:master.

@@ -1156,18 +1157,20 @@ class ADAMContext(@transient val sc: SparkContext) extends Serializable with Log
* @return Returns a FeatureRDD.
*/
def loadGff3(filePath: String,
+ cache: Boolean = true,

@fnothaft

fnothaft Mar 7, 2017

Member

I would set the storage level instead of caching these. Provides more flexibility, esp. if you're running out-of-core.

@heuermh

heuermh Mar 7, 2017

Member

I thought storage level was for rdd.persist, as used in cli Transform? This uses rdd.cache.

@fnothaft

fnothaft Mar 7, 2017

Member

rdd.cache is equivalent to rdd.persist(StorageLevel.MEMORY_ONLY).

@heuermh

heuermh Mar 7, 2017

Member

Ah, sorry if I'm being thick. So we should pass the StorageLevel through the feature APIs instead, and then replace the call to rdd.cache in FeatureRDD with rdd.persist(storageLevel)?

@fnothaft

fnothaft Mar 7, 2017

Member

No prob; we all have our thick days. What you suggested (passing the StorageLevel) is what I was thinking.
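
The change agreed on in this thread can be sketched roughly as follows. This is a minimal illustration, not the code merged in this pull request: the object `StorageLevelSketch`, the helper `persistIfRequested`, and the parameter name `optStorageLevel` are hypothetical, and a plain `RDD[Int]` stands in for a feature RDD. It assumes Spark is on the classpath and uses only `RDD.persist`, `RDD.cache`, and `RDD.getStorageLevel`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object StorageLevelSketch {

  // Hypothetical helper: persist the RDD at the requested storage level,
  // or leave it unpersisted when no level is given.
  def persistIfRequested[T](rdd: RDD[T],
                            optStorageLevel: Option[StorageLevel]): RDD[T] = {
    optStorageLevel.foreach(level => rdd.persist(level))
    rdd
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[1]").setAppName("storage-level-sketch")
    val sc = new SparkContext(conf)
    try {
      // rdd.cache() persists with the default level, MEMORY_ONLY.
      val cached = sc.parallelize(1 to 10).cache()
      println(cached.getStorageLevel == StorageLevel.MEMORY_ONLY)

      // A caller working out-of-core can allow spilling to disk instead.
      val spillable = persistIfRequested(
        sc.parallelize(1 to 10), Some(StorageLevel.MEMORY_AND_DISK))
      println(spillable.getStorageLevel == StorageLevel.MEMORY_AND_DISK)

      // Passing None skips persistence entirely.
      val unpersisted = persistIfRequested(sc.parallelize(1 to 10), None)
      println(unpersisted.getStorageLevel == StorageLevel.NONE)
    } finally {
      sc.stop()
    }
  }
}
```

Taking an optional storage level rather than a Boolean flag lets a caller keep the `MEMORY_ONLY` behaviour of `rdd.cache`, switch to a level such as `MEMORY_AND_DISK` when the data does not fit in memory, or disable persistence altogether.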

@AmplabJenkins

AmplabJenkins Mar 7, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1845/
Test PASSed.

@coveralls

coveralls Mar 8, 2017

Coverage Status

Coverage increased (+0.007%) to 76.406% when pulling 31769e7 on heuermh:feature-rdd-cache into 07c1982 on bigdatagenomics:master.

@AmplabJenkins

AmplabJenkins Mar 8, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1846/
Test PASSed.

@heuermh heuermh modified the milestone: 0.23.0 Mar 8, 2017

@heuermh heuermh added this to Triage in Release 0.23.0 Mar 8, 2017

@fnothaft

fnothaft Mar 14, 2017

Member

Jenkins, test this please.

@coveralls

coveralls Mar 14, 2017

Coverage Status

Coverage increased (+0.2%) to 76.562% when pulling 31769e7 on heuermh:feature-rdd-cache into 07c1982 on bigdatagenomics:master.

@AmplabJenkins

AmplabJenkins Mar 14, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1874/
Test PASSed.

@fnothaft fnothaft merged commit 9938d3c into bigdatagenomics:master Mar 14, 2017

2 checks passed

coverage/coveralls Coverage increased (+0.2%) to 76.562%
default Merged build finished.

@fnothaft

fnothaft Mar 14, 2017

Member

Merged! Thanks @heuermh!

@heuermh heuermh deleted the heuermh:feature-rdd-cache branch Mar 15, 2017

@heuermh heuermh referenced this pull request Mar 16, 2017

Merged

BQSR refactor for perf improvements #1423

4 of 4 tasks complete

@heuermh heuermh moved this from Triage to Completed in Release 0.23.0 Mar 21, 2017
