Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-26594][SQL] DataSourceOptions.asMap should return CaseInsensitiveMap #24062

Closed
wants to merge 2 commits into from

Conversation

xuanyuanking
Copy link
Member

What changes were proposed in this pull request?

The DataSourceOptions use an immutable string-to-string map internally in which keys are case-insensitive, so while we call asMap, it should also return a case-insensitive map.
Here make DataSourceOptions.asMap return CaseInsensitiveMap.

How was this patch tested?

New case in DataSourceOptionsSuite.

@xuanyuanking
Copy link
Member Author

ping @zsxwing.

@SparkQA
Copy link

SparkQA commented Mar 11, 2019

Test build #103353 has finished for PR 24062 at commit 8b06606.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -92,8 +94,8 @@ public DataSourceOptions(Map<String, String> originalMap) {
}
}

public Map<String, String> asMap() {
return new HashMap<>(keyLowerCasedMap);
public CaseInsensitiveMap<String> asMap() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was a discussion that returning CaseInsensitiveMap was a bad design before (IIRC). I think that's the reason why it return a regular map instead. Why don't we just clearly document what it returns? I think this can be a conservative compromise as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any link for the discussion? I didn't find in #19925.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I talked with @cloud-fan and @rxin. I might remember this wrongly. CaseInsensitiveMap is an internal class from catalyst. We shouldn't expose it like this anyway.

@cloud-fan
Copy link
Contributor

Can we wait for #24025 ?

@dongjoon-hyun
Copy link
Member

Ya. Let's proceed #24025 first.

@xuanyuanking
Copy link
Member Author

Yeah, let's wait for #24025 finish first.

@SparkQA
Copy link

SparkQA commented Mar 12, 2019

Test build #103358 has finished for PR 24062 at commit 83e1960.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -104,4 +104,9 @@ class DataSourceOptionsSuite extends SparkFunSuite {

assert(options.paths().toSeq == Seq("c", "d\"e"))
}

test("asMap") {
val map = new DataSourceOptions(Map("fooBar" -> "x").asJava).asMap
Copy link
Member Author

@xuanyuanking xuanyuanking Mar 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After #24025, below test passed.

val map = new CaseInsensitiveStringMap(Map("fooBar" -> "x").asJava).asCaseSensitiveMap()
assert(map.get("fooBar") == "x")

#24094 also add asCaseSensitiveMap in CaseInsensitiveStringMap for Hadoop Configurations. So this PR is no more needed.

@xuanyuanking
Copy link
Member Author

As mentioned above, close this PR and maybe the jira can also be closed after #24025.

@xuanyuanking xuanyuanking deleted the SPARK-26594 branch March 20, 2019 15:33
@chiragsanghvi18
Copy link

This File DataSourceOptions, what is it renamed as in Spark-3 scala 2.12 code? As the complete directory structure is changed.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Jul 16, 2021

Hi, @chiragsanghvi18 . It sounds like a git command question instead of Apache Spark question.

This File DataSourceOptions, what is it renamed as in Spark-3 scala 2.12 code? As the complete directory structure is changed.

Please try the following. SPARK-27106 removed it instead of renaming.

$ git log --oneline -n1 --all --full-history -- sql/core/src/main/java/org/apache/spark/sql/sources/v2/DataSourceOptions.java
2a80a4cd39 [SPARK-27106][SQL] merge CaseInsensitiveStringMap and DataSourceOptions

@chiragsanghvi18
Copy link

Hi, @chiragsanghvi18 . It sounds like a git command question instead of Apache Spark question.

This File DataSourceOptions, what is it renamed as in Spark-3 scala 2.12 code? As the complete directory structure is changed.

Please try the following. SPARK-27106 removed it instead of renaming.

$ git log --oneline -n1 --all --full-history -- sql/core/src/main/java/org/apache/spark/sql/sources/v2/DataSourceOptions.java
2a80a4cd39 [SPARK-27106][SQL] merge CaseInsensitiveStringMap and DataSourceOptions

Thanks a ton, you're a savior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants