-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-13727] [SQL] SparkConf.contains does not consider deprecated keys #11568
Conversation
@@ -351,7 +351,16 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging { | |||
def getAppId: String = get("spark.app.id") | |||
|
|||
/** Does the configuration contain a given parameter? */ | |||
def contains(key: String): Boolean = settings.containsKey(key) | |||
def contains(key: String): Boolean = { | |||
if (settings.containsKey(key)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It always bothers me when I see if (true) true
. I think settings.containsKey(key) || ...
would be a bit nicer. What do you think?
// try to find the settings in the alternatives | ||
configsWithAlternatives.get(key).flatMap { alts => | ||
alts.collectFirst { case alt if contains(alt.key) => true } | ||
}.isDefined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's far too complicated. Would configsWithAlternatives.get("one").contains(...)
work here? What is this supposed to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It will find the alternatives using the key: configsWithAlternatives.get(key), it will return an Option of Seq, since alternative could be multiple;
- It will then try to use the key of each alternative to see if it is already defined in the conf, here contains(alt.key) is a recursive call;
- collectFirst will just get the first matching
- the final step isDefined() is to see whether we find such a match or not.
Remember configsWithAlternatives.get("one") returns Option of Seq[], any suggestion to simplify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK what about ...flatMap(_.find(contains(_.key)))...
? maybe that's too terse. The case statement mapping to true seemed unnecessary. If I'm still missing something, pardon, just do what you see fit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've provided another approach in my latest code I pushed. It avoids the flatMap and use exist to check. Can you take a look? Thanks.
@jaceklaskowski @srowen I've some changes. How about the logic this time? |
settings.containsKey(key) || { | ||
// try to find the settings in the alternatives | ||
val alts = configsWithAlternatives.get(key) | ||
if (alts.isDefined) alts.get.exists { alt => contains(alt.key) } else false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this better; can it just be alts.isDefined && alts.get.exists ...
?
@srowen thanks for the suggestion. i've pushed the changes to make it more concise. |
ok to test @vanzin |
settings.containsKey(key) || { | ||
// try to find the settings in the alternatives | ||
val alts = configsWithAlternatives.get(key) | ||
alts.isDefined && alts.get.exists { alt => contains(alt.key) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just do
val containsAlternatives =
configsWithAlternatives.get(key).toSeq.flatten.exists { alt => contains(alt.key) }
settings.containsKey(key) || containsAlternatives
Made some changes based on @andrewor14 comments. Thanks! |
LGTM |
Test build #52803 has finished for PR 11568 at commit
|
Test build #52808 has finished for PR 11568 at commit
|
Ok, merging to master. Thanks! |
(BTW I'll fix the title during merge since this is not a sql change.) |
The contains() method does not return consistently with get() if the key is deprecated. For example, import org.apache.spark.SparkConf val conf = new SparkConf() conf.set("spark.io.compression.lz4.block.size", "12345") # display some deprecated warning message conf.get("spark.io.compression.lz4.block.size") # return 12345 conf.get("spark.io.compression.lz4.blockSize") # return 12345 conf.contains("spark.io.compression.lz4.block.size") # return true conf.contains("spark.io.compression.lz4.blockSize") # return false The fix will make the contains() and get() more consistent. I've added a test case for this. (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) Unit tests should be sufficient. Author: bomeng <bmeng@us.ibm.com> Closes apache#11568 from bomeng/SPARK-13727.
What changes were proposed in this pull request?
The contains() method does not return consistently with get() if the key is deprecated. For example,
import org.apache.spark.SparkConf
val conf = new SparkConf()
conf.set("spark.io.compression.lz4.block.size", "12345") # display some deprecated warning message
conf.get("spark.io.compression.lz4.block.size") # return 12345
conf.get("spark.io.compression.lz4.blockSize") # return 12345
conf.contains("spark.io.compression.lz4.block.size") # return true
conf.contains("spark.io.compression.lz4.blockSize") # return false
The fix will make the contains() and get() more consistent.
How was this patch tested?
I've added a test case for this.
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
Unit tests should be sufficient.