Add cache argument to loadFeatures, additional Feature timers #1427

Merged
fnothaft merged 2 commits into bigdatagenomics:master from heuermh:feature-rdd-cache on Mar 14, 2017

Conversation

@heuermh
Member

heuermh commented Mar 7, 2017

Fixes #1321

@coveralls

coveralls commented Mar 7, 2017

Coverage increased (+0.04%) to 76.442% when pulling dd17834 on heuermh:feature-rdd-cache into 07c1982 on bigdatagenomics:master.

@@ -1156,18 +1157,20 @@ class ADAMContext(@transient val sc: SparkContext) extends Serializable with Log
   * @return Returns a FeatureRDD.
   */
  def loadGff3(filePath: String,
               cache: Boolean = true,

@fnothaft

fnothaft Mar 7, 2017

Member

I would set the storage level instead of caching these. That provides more flexibility, especially if you're running out-of-core.

@heuermh

heuermh Mar 7, 2017

Author Member

I thought storage level was for rdd.persist, as used in cli Transform? This uses rdd.cache.

@fnothaft

fnothaft Mar 7, 2017

Member

rdd.cache() is shorthand for rdd.persist(StorageLevel.MEMORY_ONLY).
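
The equivalence can be checked directly against Spark's RDD API. The following is a minimal sketch, assuming Spark is on the classpath and a local master is acceptable; the object name and app name are illustrative and do not come from this pull request.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object CacheVersusPersist {
  def main(args: Array[String]): Unit = {
    // Local, single-threaded context; the app name is arbitrary.
    val sc = new SparkContext(
      new SparkConf().setMaster("local[1]").setAppName("cache-vs-persist"))
    try {
      // cache() marks the RDD for in-memory storage only.
      val cached = sc.parallelize(1 to 10).cache()
      assert(cached.getStorageLevel == StorageLevel.MEMORY_ONLY)

      // persist(level) accepts any storage level, for example one that
      // spills partitions to disk when they do not fit in memory.
      val persisted = sc.parallelize(1 to 10).persist(StorageLevel.MEMORY_AND_DISK)
      assert(persisted.getStorageLevel == StorageLevel.MEMORY_AND_DISK)
    } finally {
      sc.stop()
    }
  }
}
```

A Boolean cache flag can only select the first behavior, which is why a storage level parameter is the more flexible choice for data that does not fit in memory.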

@heuermh

heuermh Mar 7, 2017

Author Member

Ah, sorry if I'm being thick. So we should pass the StorageLevel around the feature APIs instead, and then replace the call to rdd.cache in FeatureRDD with rdd.persist(storageLevel)?

@fnothaft

fnothaft Mar 7, 2017

Member

No prob; we all have our thick days. What you suggested (passing the StorageLevel) is what I was thinking.
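
One way the agreed change could look is sketched below. This is not the code merged in this pull request: `loadLines`, its `optStorageLevel` parameter, and the default value are hypothetical names chosen for illustration, and the loader reads from an in-memory sequence rather than a GFF3 file. It assumes Spark is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object StorageLevelSketch {

  // Hypothetical loader, not ADAM's loadGff3: it only illustrates replacing
  // a Boolean cache flag with an optional storage level.
  def loadLines(sc: SparkContext,
                lines: Seq[String],
                optStorageLevel: Option[StorageLevel] = Some(StorageLevel.MEMORY_ONLY)): RDD[String] = {
    val rdd = sc.parallelize(lines)
    // None leaves the RDD unpersisted; Some(level) calls rdd.persist(level).
    optStorageLevel.foreach(level => rdd.persist(level))
    rdd
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[1]").setAppName("storage-level-sketch"))
    try {
      val lines = Seq("chr1\t.\tgene\t1\t100", "chr1\t.\texon\t1\t50")

      // The default matches what rdd.cache() would have done.
      assert(loadLines(sc, lines).getStorageLevel == StorageLevel.MEMORY_ONLY)

      // Callers working out-of-core can ask for a level that spills to disk.
      val spillable = loadLines(sc, lines, Some(StorageLevel.MEMORY_AND_DISK_SER))
      assert(spillable.getStorageLevel == StorageLevel.MEMORY_AND_DISK_SER)

      // Callers can also opt out of persistence entirely.
      assert(loadLines(sc, lines, None).getStorageLevel == StorageLevel.NONE)
    } finally {
      sc.stop()
    }
  }
}
```

Using an `Option` keeps a single parameter for both decisions: whether to persist at all, and at which level.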

@AmplabJenkins

AmplabJenkins commented Mar 7, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1845/

@coveralls

coveralls commented Mar 8, 2017

Coverage increased (+0.007%) to 76.406% when pulling 31769e7 on heuermh:feature-rdd-cache into 07c1982 on bigdatagenomics:master.

@AmplabJenkins

AmplabJenkins commented Mar 8, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1846/

@heuermh heuermh modified the milestone: 0.23.0 Mar 8, 2017
@heuermh heuermh added this to Triage in Release 0.23.0 Mar 8, 2017
@fnothaft
Member

fnothaft commented Mar 14, 2017

Jenkins, test this please.

@coveralls

coveralls commented Mar 14, 2017

Coverage increased (+0.2%) to 76.562% when pulling 31769e7 on heuermh:feature-rdd-cache into 07c1982 on bigdatagenomics:master.

@AmplabJenkins

AmplabJenkins commented Mar 14, 2017

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1874/

@fnothaft fnothaft merged commit 9938d3c into bigdatagenomics:master Mar 14, 2017
2 checks passed
coverage/coveralls: Coverage increased (+0.2%) to 76.562%
default: Merged build finished.
@fnothaft
Member

fnothaft commented Mar 14, 2017

Merged! Thanks @heuermh!

@heuermh heuermh deleted the heuermh:feature-rdd-cache branch Mar 15, 2017
@heuermh heuermh mentioned this pull request Mar 16, 2017
4 of 4 tasks complete
@heuermh heuermh moved this from Triage to Completed in Release 0.23.0 Mar 21, 2017