[SPARK-13727] [SQL] SparkConf.contains does not consider deprecated keys #11568

@bomeng bomeng commented Mar 8, 2016

What changes were proposed in this pull request?

The contains() method does not return results consistent with get() when the key has a deprecated alternative. For example:
import org.apache.spark.SparkConf
val conf = new SparkConf()
conf.set("spark.io.compression.lz4.block.size", "12345") // logs a deprecation warning
conf.get("spark.io.compression.lz4.block.size") // returns 12345
conf.get("spark.io.compression.lz4.blockSize") // returns 12345
conf.contains("spark.io.compression.lz4.block.size") // returns true
conf.contains("spark.io.compression.lz4.blockSize") // returns false

The fix makes contains() consistent with get().
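
For readers without a Spark build at hand, the mismatch can be reproduced with a small self-contained model. This is only a sketch: `MiniConf`, `alternatives`, and `DeprecatedKeyDemo` are hypothetical names invented for illustration, not Spark's classes, and the lookup logic is a simplified assumption about how get() falls back to deprecated keys.

```scala
import scala.collection.mutable

// Hypothetical stand-in for SparkConf; not Spark's actual implementation.
object DeprecatedKeyDemo {
  // Assumed mapping: current key -> deprecated keys that may still carry its value.
  val alternatives: Map[String, Seq[String]] = Map(
    "spark.io.compression.lz4.blockSize" -> Seq("spark.io.compression.lz4.block.size")
  )

  class MiniConf {
    private val settings = mutable.Map.empty[String, String]

    def set(key: String, value: String): MiniConf = { settings(key) = value; this }

    // Looks up the exact key first, then any deprecated alternative that is set.
    def getOption(key: String): Option[String] =
      settings.get(key).orElse(
        alternatives.getOrElse(key, Seq.empty[String]).collectFirst {
          case alt if settings.contains(alt) => settings(alt)
        })

    // Pre-patch behaviour: only the exact key is checked.
    def contains(key: String): Boolean = settings.contains(key)
  }

  def main(args: Array[String]): Unit = {
    val conf = new MiniConf().set("spark.io.compression.lz4.block.size", "12345")
    println(conf.getOption("spark.io.compression.lz4.blockSize")) // prints Some(12345)
    println(conf.contains("spark.io.compression.lz4.blockSize"))  // prints false
  }
}
```

The value is reachable through the current key, yet contains() reports it as absent, which is the inconsistency described above.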

How was this patch tested?

I've added a unit test case for this. Unit tests should be sufficient.

@@ -351,7 +351,16 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging {
   def getAppId: String = get("spark.app.id")

   /** Does the configuration contain a given parameter? */
-  def contains(key: String): Boolean = settings.containsKey(key)
+  def contains(key: String): Boolean = {
+    if (settings.containsKey(key)) {
Contributor

It always bothers me when I see if (true) true. I think settings.containsKey(key) || ... would be a bit nicer. What do you think?

+      // try to find the settings in the alternatives
+      configsWithAlternatives.get(key).flatMap { alts =>
+        alts.collectFirst { case alt if contains(alt.key) => true }
+      }.isDefined
Contributor

It's far too complicated. Would configsWithAlternatives.get("one").contains(...) work here? What is this supposed to do?

Contributor Author

  1. configsWithAlternatives.get(key) looks up the alternatives for the key. It returns an Option of Seq, since a key can have multiple alternatives.
  2. Each alternative's key is then checked to see whether it is already defined in the conf; contains(alt.key) here is a recursive call.
  3. collectFirst returns only the first match.
  4. Finally, isDefined reports whether such a match was found.

Remember that configsWithAlternatives.get("one") returns an Option of Seq[]. Any suggestions to simplify?
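
The four steps above can be spelled out one at a time in a standalone sketch. `Alt`, the sample keys, and the `settings` set are hypothetical stand-ins, and the recursive contains(alt.key) call is simplified here to a plain set lookup, so this illustrates the shape of the expression rather than the patch itself.

```scala
object CollectFirstSteps {
  // Hypothetical stand-in for an alternate-config entry; only the key matters here.
  case class Alt(key: String)

  // Hypothetical data: current key -> deprecated alternatives.
  val configsWithAlternatives: Map[String, Seq[Alt]] =
    Map("new.key" -> Seq(Alt("old.key.v1"), Alt("old.key.v2")))

  // Keys assumed to have been set explicitly.
  val settings: Set[String] = Set("old.key.v2")

  def hasAlternativeSet(key: String): Boolean = {
    // Step 1: Option[Seq[Alt]]; None when the key has no alternatives.
    val alts: Option[Seq[Alt]] = configsWithAlternatives.get(key)
    // Steps 2 and 3: take the first alternative whose key is set and map it to true.
    val firstMatch: Option[Boolean] = alts.flatMap { as =>
      as.collectFirst { case alt if settings.contains(alt.key) => true }
    }
    // Step 4: a defined Option means a match was found.
    firstMatch.isDefined
  }
}
```

With this data, `hasAlternativeSet("new.key")` is true because `old.key.v2` is set, while a key with no entry in the map yields false at step 1.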

Member

OK, what about ...flatMap(_.find(contains(_.key)))...? Maybe that's too terse. The case statement mapping to true seemed unnecessary. If I'm still missing something, pardon, just do what you see fit.
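
A hedged sketch of that suggestion follows. As far as I can tell, the nested placeholder in contains(_.key) would be expanded by Scala into a function passed to contains, so the sketch uses an explicit lambda instead. `Alt`, `settings`, and the function names are hypothetical, and the recursive contains call is again replaced by a set lookup.

```scala
object FindVersusExists {
  // Hypothetical stand-in for an alternate-config entry.
  case class Alt(key: String)

  // Keys assumed to have been set explicitly.
  val settings: Set[String] = Set("old.key")

  // find returns the matching alternative itself, so no `case ... => true` is needed.
  def viaFind(alts: Option[Seq[Alt]]): Boolean =
    alts.flatMap(_.find(a => settings.contains(a.key))).isDefined

  // exists asks the same question without building an intermediate Option.
  def viaExists(alts: Option[Seq[Alt]]): Boolean =
    alts.exists(_.exists(a => settings.contains(a.key)))
}
```

Both functions return the same result for any input; the second form is the direction the later revisions in this thread take.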

Contributor Author

I've provided another approach in the latest code I pushed. It avoids the flatMap and uses exists for the check. Can you take a look? Thanks.

bomeng commented Mar 8, 2016

@jaceklaskowski @srowen I've made some changes. How about the logic this time?

settings.containsKey(key) || {
// try to find the settings in the alternatives
val alts = configsWithAlternatives.get(key)
if (alts.isDefined) alts.get.exists { alt => contains(alt.key) } else false
Member

I like this better; can it just be alts.isDefined && alts.get.exists ...?

bomeng commented Mar 8, 2016

@srowen Thanks for the suggestion. I've pushed the changes to make it more concise.

@andrewor14
Contributor

ok to test @vanzin

settings.containsKey(key) || {
// try to find the settings in the alternatives
val alts = configsWithAlternatives.get(key)
alts.isDefined && alts.get.exists { alt => contains(alt.key) }
Contributor

just do

val containsAlternatives =
  configsWithAlternatives.get(key).toSeq.flatten.exists { alt => contains(alt.key) }
settings.containsKey(key) || containsAlternatives
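
To see why toSeq.flatten removes the need for an isDefined check, here is a minimal standalone sketch of the same shape. `Alt`, the sample keys, and the immutable `settings` set are hypothetical stand-ins for Spark's internals, used only so the snippet runs on its own.

```scala
object FlattenExists {
  // Hypothetical stand-in for an alternate-config entry.
  case class Alt(key: String)

  // Hypothetical data: current key -> deprecated alternatives.
  val configsWithAlternatives: Map[String, Seq[Alt]] =
    Map("new.key" -> Seq(Alt("old.key")))

  // Keys assumed to have been set explicitly.
  val settings: Set[String] = Set("old.key")

  def contains(key: String): Boolean = {
    // Option[Seq[Alt]] becomes a Seq holding zero or one Seq[Alt];
    // flatten then yields a plain Seq[Alt] that is empty for unknown keys.
    val containsAlternatives =
      configsWithAlternatives.get(key).toSeq.flatten.exists { alt => contains(alt.key) }
    settings.contains(key) || containsAlternatives
  }
}
```

A missing entry simply produces an empty sequence, and exists on an empty sequence is false, so no explicit Option handling is left. One design note: because `containsAlternatives` is a val, it is evaluated before the `||`, so the alternatives are scanned even when the key itself is set.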

bomeng commented Mar 10, 2016

Made some changes based on @andrewor14's comments. Thanks!

@andrewor14
Contributor

LGTM

SparkQA commented Mar 10, 2016

Test build #52803 has finished for PR 11568 at commit eede12b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 10, 2016

Test build #52808 has finished for PR 11568 at commit 6061b86.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

vanzin commented Mar 10, 2016

Ok, merging to master. Thanks!

vanzin commented Mar 10, 2016

(BTW I'll fix the title during merge since this is not a sql change.)

@asfgit asfgit closed this in 235f4ac Mar 10, 2016
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016

Author: bomeng <bmeng@us.ibm.com>

Closes apache#11568 from bomeng/SPARK-13727.
@bomeng bomeng deleted the SPARK-13727 branch April 9, 2016 15:07