Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-33667][SQL][2.4] Respect the spark.sql.caseSensitive config while resolving partition spec in v1 SHOW PARTITIONS #30627

Conversation

MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Dec 6, 2020

What changes were proposed in this pull request?

Preprocess the partition spec passed to the V1 SHOW PARTITIONS implementation ShowPartitionsCommand, and normalize the passed spec according to the partition columns w.r.t the case sensitivity flag spark.sql.caseSensitive.

Why are the changes needed?

V1 SHOW PARTITIONS is case sensitive in fact, and doesn't respect the SQL config spark.sql.caseSensitive which is false by default, for instance:

spark-sql> CREATE TABLE tbl1 (price int, qty int, year int, month int)
         > USING parquet
         > PARTITIONED BY (year, month);
spark-sql> INSERT INTO tbl1 PARTITION(year = 2015, month = 1) SELECT 1, 1;
spark-sql> SHOW PARTITIONS tbl1 PARTITION(YEAR = 2015, Month = 1);
Error in query: Non-partitioning column(s) [YEAR, Month] are specified for SHOW PARTITIONS;

The SHOW PARTITIONS command must show the partition year = 2015, month = 1 specified by YEAR = 2015, Month = 1.

Does this PR introduce any user-facing change?

Yes. After the changes, the command above works as expected:

spark-sql> SHOW PARTITIONS tbl1 PARTITION(YEAR = 2015, Month = 1);
year=2015/month=1

How was this patch tested?

By running the affected test suites:

$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly org.apache.spark.sql.hive.execution.HiveCatalogedDDLSuite"

… resolving partition spec in v1 `SHOW PARTITIONS`

Preprocess the partition spec passed to the V1 SHOW PARTITIONS implementation `ShowPartitionsCommand`, and normalize the passed spec according to the partition columns w.r.t the case sensitivity flag  **spark.sql.caseSensitive**.

V1 SHOW PARTITIONS is case sensitive in fact, and doesn't respect the SQL config **spark.sql.caseSensitive** which is false by default, for instance:
```sql
spark-sql> CREATE TABLE tbl1 (price int, qty int, year int, month int)
         > USING parquet
         > PARTITIONED BY (year, month);
spark-sql> INSERT INTO tbl1 PARTITION(year = 2015, month = 1) SELECT 1, 1;
spark-sql> SHOW PARTITIONS tbl1 PARTITION(YEAR = 2015, Month = 1);
Error in query: Non-partitioning column(s) [YEAR, Month] are specified for SHOW PARTITIONS;
```
The `SHOW PARTITIONS` command must show the partition `year = 2015, month = 1` specified by `YEAR = 2015, Month = 1`.

Yes. After the changes, the command above works as expected:
```sql
spark-sql> SHOW PARTITIONS tbl1 PARTITION(YEAR = 2015, Month = 1);
year=2015/month=1
```

By running the affected test suites:
- `v1/ShowPartitionsSuite`
- `v2/ShowPartitionsSuite`

Closes apache#30615 from MaxGekk/show-partitions-case-sensitivity-test.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 4829781)
Signed-off-by: Max Gekk <max.gekk@gmail.com>
@SparkQA
Copy link

SparkQA commented Dec 6, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36898/

@SparkQA
Copy link

SparkQA commented Dec 6, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36898/

@SparkQA
Copy link

SparkQA commented Dec 6, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36899/

@SparkQA
Copy link

SparkQA commented Dec 6, 2020

Test build #132298 has finished for PR 30627 at commit ae64df8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 6, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36899/

@SparkQA
Copy link

SparkQA commented Dec 6, 2020

Test build #132299 has finished for PR 30627 at commit e45f11b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @MaxGekk .
Merged to branch-2.4.

dongjoon-hyun pushed a commit that referenced this pull request Dec 6, 2020
…while resolving partition spec in v1 `SHOW PARTITIONS`

### What changes were proposed in this pull request?
Preprocess the partition spec passed to the V1 SHOW PARTITIONS implementation `ShowPartitionsCommand`, and normalize the passed spec according to the partition columns w.r.t the case sensitivity flag  **spark.sql.caseSensitive**.

### Why are the changes needed?
V1 SHOW PARTITIONS is case sensitive in fact, and doesn't respect the SQL config **spark.sql.caseSensitive** which is false by default, for instance:
```sql
spark-sql> CREATE TABLE tbl1 (price int, qty int, year int, month int)
         > USING parquet
         > PARTITIONED BY (year, month);
spark-sql> INSERT INTO tbl1 PARTITION(year = 2015, month = 1) SELECT 1, 1;
spark-sql> SHOW PARTITIONS tbl1 PARTITION(YEAR = 2015, Month = 1);
Error in query: Non-partitioning column(s) [YEAR, Month] are specified for SHOW PARTITIONS;
```
The `SHOW PARTITIONS` command must show the partition `year = 2015, month = 1` specified by `YEAR = 2015, Month = 1`.

### Does this PR introduce _any_ user-facing change?
Yes. After the changes, the command above works as expected:
```sql
spark-sql> SHOW PARTITIONS tbl1 PARTITION(YEAR = 2015, Month = 1);
year=2015/month=1
```

### How was this patch tested?
By running the affected test suites:
```
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly org.apache.spark.sql.hive.execution.HiveCatalogedDDLSuite"
```

Closes #30627 from MaxGekk/show-partitions-case-sensitivity-test-2.4.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@MaxGekk MaxGekk deleted the show-partitions-case-sensitivity-test-2.4 branch December 11, 2020 20:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants