Conversation

ilganeli commented Jul 1, 2015

Updated all usages within the MLlib module to use blocking = false.
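
For context, a minimal sketch of the call pattern this PR changes (the variable name is hypothetical, not an actual MLlib call site; RDD.unpersist in Spark at this time defaulted to blocking = true):

```scala
// Illustrative only; `prevIterationRDD` is a hypothetical name.
// RDD.unpersist(blocking: Boolean = true) waits until all executors
// have dropped the cached blocks before returning.

// Before this PR: synchronous; the caller blocks on block removal.
prevIterationRDD.unpersist()

// After this PR: asynchronous; the caller continues immediately while
// the block managers delete the old blocks in the background.
prevIterationRDD.unpersist(blocking = false)
```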

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.


SparkQA commented Jul 1, 2015

Test build #36269 has started for PR 7160 at commit f31f0c1.

Ilya Ganelin added 3 commits July 1, 2015 10:59
@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.


SparkQA commented Jul 1, 2015

Test build #36269 has finished for PR 7160 at commit f31f0c1.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test FAILed.


SparkQA commented Jul 1, 2015

Test build #36272 has started for PR 7160 at commit 381dc42.


SparkQA commented Jul 1, 2015

Test build #36272 has finished for PR 7160 at commit 381dc42.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.


SparkQA commented Jul 1, 2015

Test build #36273 has started for PR 7160 at commit d976187.


SparkQA commented Jul 1, 2015

Test build #36273 has finished for PR 7160 at commit d976187.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

mengxr (Contributor) commented Jul 1, 2015

@ilganeli I'm not sure whether this is a safe change. If the old blocks are not deleted from the block managers, new RDDs might not be able to cache, or might have to downgrade their storage level. Since it is hard to tell when the next caching happens, we set blocking to true to be safe. It would be great if you could share some benchmark results, so we can see the trade-off and decide.
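
To make the concern concrete, here is a hedged, self-contained sketch of the race (all names hypothetical; assumes the Spark 1.x API): with blocking = false, unpersist returns before the executors have freed the old blocks, so an immediately following persist can find the memory store still occupied.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical standalone example; not MLlib code.
object UnpersistRaceSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("unpersist-race").setMaster("local[2]"))

    // Fill the memory store with an old cached RDD.
    val oldRDD = sc.parallelize(1 to 1000000).persist(StorageLevel.MEMORY_ONLY)
    oldRDD.count() // materialize the cache

    // Asynchronous unpersist: returns before the blocks are actually removed.
    oldRDD.unpersist(blocking = false)

    // If the old blocks have not been freed yet, some partitions of newRDD
    // may fail to cache: MEMORY_ONLY never spills, so uncached partitions
    // are silently recomputed on every later use.
    val newRDD = sc.parallelize(1 to 1000000).map(_ * 2)
      .persist(StorageLevel.MEMORY_ONLY)
    newRDD.count()

    sc.stop()
  }
}
```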

ilganeli (Author) commented Jul 1, 2015

@mengxr: @jkbradley initially reported this, so perhaps he should weigh in here?

andrewor14 (Contributor) commented:

I think the concern is that the memory cache is not fully LRU. If a new, big block shows up when the memory cache is almost full, there's a chance the new block will end up being dropped straight to disk instead.

Here's how this issue is related: if the asynchronous unpersist happens too slowly, a workload that previously would have benefited from having the new block cached in memory may now drop that block directly to disk. Whether this actually happens is largely workload-dependent (see the sketch after this comment for one way to observe it).

@ilganeli Have you had a chance to benchmark the performance of a few algorithms before and after this change? It would be good to see whether this does indeed speed up many workloads.
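
For anyone trying to reproduce the drop-to-disk effect described above, a small hedged sketch (assumes an active SparkContext named sc and RDDs cached with MEMORY_AND_DISK): SparkContext.getRDDStorageInfo, a developer API available in this era of Spark, reports per-RDD memory and disk usage across executors.

```scala
// Assumes an active SparkContext `sc`.
// getRDDStorageInfo reports, for each cached RDD, how many bytes
// currently sit in memory versus on disk across all executors.
sc.getRDDStorageInfo.foreach { info =>
  println(s"RDD ${info.id} '${info.name}': " +
    s"${info.memSize} bytes in memory, ${info.diskSize} bytes on disk")
}
// A nonzero diskSize for the newly persisted RDD while the old RDD's
// blocks are still being removed would confirm the drop-to-disk effect.
```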

jkbradley (Member) commented:

Good points; I suppose unpersisting without blocking might cause other slowdowns, which could in turn cause other timeouts. This does seem like the kind of change that would require running all of the MLlib tests in spark-perf on a cluster to check for speed changes. Perhaps a better solution in the meantime is simply to adjust timeout settings as needed.

@mengxr I'm OK with closing the JIRA if it does not seem worth the trouble of testing. @ilganeli I doubt we have the bandwidth to test right now, but do let us know if you'd like to keep this open & run tests.

ilganeli (Author) commented Jul 2, 2015

All - thanks for the clarification and for weighing in. I don't have any pipelines set up that would exercise this with the necessary level of rigor, so I think it's fine to close this for now. Thanks.


jkbradley (Member) commented:

OK, sorry for the trouble! Can you please close this PR? (We can't)

ilganeli (Author) commented Jul 3, 2015

Will do once I have access to a PC. No trouble at all.


ilganeli closed this Jul 3, 2015