[SPARK-43541][SQL][3.4] Propagate all Project tags in resolving of expressions and missing columns #41220

Closed
MaxGekk wants to merge 1 commit into apache:branch-3.4 from MaxGekk:fix-using-join-3.4

MaxGekk (Member) commented May 18, 2023

What changes were proposed in this pull request?

In this PR, I propose to propagate all tags of a `Project` node while resolving expressions and missing columns in `ColumnResolutionHelper.resolveExprsAndAddMissingAttrs()`.

This is a backport of #41204.
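
The idea of tag propagation can be illustrated with a minimal Python sketch. It is not Spark's code: the `Project` class, the `hidden_output` tag name, and both helper functions are hypothetical stand-ins, and the sketch assumes that the failure comes from a rebuilt node starting without the tags of the node it replaces.

```python
class Project:
    """Toy stand-in for a projection node that carries side-channel tags."""

    def __init__(self, project_list, tags=None):
        self.project_list = list(project_list)
        self.tags = dict(tags or {})

    def copy_tags_from(self, other):
        # Copy tags from `other` without overwriting tags already set on this node.
        for key, value in other.tags.items():
            self.tags.setdefault(key, value)


def add_missing_attrs_dropping_tags(project, missing):
    # Rebuilds the node with extra columns; the new node starts with no tags.
    return Project(project.project_list + missing)


def add_missing_attrs_keeping_tags(project, missing):
    # Same rebuild, but the tags of the original node are carried over.
    new_project = Project(project.project_list + missing)
    new_project.copy_tags_from(project)
    return new_project


# Hypothetical tag marking columns hidden by a USING join.
original = Project(["key"], tags={"hidden_output": ["t1.key", "t2.key"]})

lost = add_missing_attrs_dropping_tags(original, ["t1.key"])
kept = add_missing_attrs_keeping_tags(original, ["t1.key"])

print(lost.tags)  # {}
print(kept.tags)  # {'hidden_output': ['t1.key', 't2.key']}
```

In this toy model, any later step that relies on the tag to find qualified columns such as `t1.key` would only succeed on the second variant.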

Why are the changes needed?

To fix the bug reproduced by the query below:

spark-sql (default)> WITH
                   >   t1 AS (select key from values ('a') t(key)),
                   >   t2 AS (select key from values ('a') t(key))
                   > SELECT t1.key
                   > FROM t1 FULL OUTER JOIN t2 USING (key)
                   > WHERE t1.key NOT LIKE 'bb.%';
[UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `t1`.`key` cannot be resolved. Did you mean one of the following? [`key`].; line 4 pos 7;

Does this PR introduce any user-facing change?

No. It fixes a bug, so the query above returns the expected result: a.

How was this patch tested?

By a new test added to using-join.sql:

$ PYSPARK_PYTHON=python3 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z using-join.sql"

and the related test suites:

$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly org.apache.spark.sql.hive.HiveContextCompatibilitySuite"

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit 09d5742)

github-actions bot added the SQL label May 18, 2023

MaxGekk (Member, Author) commented May 18, 2023

@dongjoon-hyun @cloud-fan Could you review this backport, please?

dongjoon-hyun (Member) left a comment

+1, LGTM. Thank you, @MaxGekk.

dongjoon-hyun pushed a commit that referenced this pull request May 18, 2023
…expressions and missing columns

### What changes were proposed in this pull request?
In this PR, I propose to propagate all tags of a `Project` node while resolving expressions and missing columns in `ColumnResolutionHelper.resolveExprsAndAddMissingAttrs()`.

This is a backport of #41204.

### Why are the changes needed?
To fix the bug reproduced by the query below:
```sql
spark-sql (default)> WITH
                   >   t1 AS (select key from values ('a') t(key)),
                   >   t2 AS (select key from values ('a') t(key))
                   > SELECT t1.key
                   > FROM t1 FULL OUTER JOIN t2 USING (key)
                   > WHERE t1.key NOT LIKE 'bb.%';
[UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `t1`.`key` cannot be resolved. Did you mean one of the following? [`key`].; line 4 pos 7;
```

### Does this PR introduce _any_ user-facing change?
No. It fixes a bug, so the query above returns the expected result: `a`.

### How was this patch tested?
By a new test added to `using-join.sql`:
```
$ PYSPARK_PYTHON=python3 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z using-join.sql"
```
and the related test suites:
```
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly org.apache.spark.sql.hive.HiveContextCompatibilitySuite"
```

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
(cherry picked from commit 09d5742)

Closes #41220 from MaxGekk/fix-using-join-3.4.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

dongjoon-hyun (Member) commented

Merged to branch-3.4 for Apache Spark 3.4.1.

snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
GladwinLee pushed a commit to lyft/spark that referenced this pull request Oct 10, 2023
catalinii pushed a commit to lyft/spark that referenced this pull request Oct 10, 2023