Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-11574: [Rust][DataFusion] Upgrade sqlparser to support parsing all TPC-H queries #9456

Closed
wants to merge 3 commits into from

Conversation

Dandandan
Copy link
Contributor

@Dandandan Dandandan commented Feb 9, 2021

This PR completes parsing support for the TPC-H queries.
TPC-H 6 had a small change before compared to the original (0.06 instead of .06) and the syntax for query 22 substring(c_phone from 1 for 2) is now supported.

@github-actions
Copy link

github-actions bot commented Feb 9, 2021

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Dandandan . Looks good to me

@@ -386,7 +386,7 @@ fn create_logical_plan(ctx: &mut ExecutionContext, query: usize) -> Result<Logic
// where
// l_shipdate >= date '1994-01-01'
// and l_shipdate < date '1994-01-01' + interval '1' year
// and l_discount between 0.06 - 0.01 and 0.06 + 0.01
// and l_discount between .06 - 0.01 and .06 + 0.01
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it really matters, but my reading of the TPCH spec (validation query) has 0.06 rather than .06.

http://tpc.org/tpc_documents_current_versions/pdf/tpc-h_v2.18.0.pdf

Screen Shot 2021-02-09 at 5 47 20 PM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm weird. I generated the queries some time ago in here: https://github.com/ballista-compute/sqlparser-rs/tree/main/tests/queries/tpch . maybe the code to generate the queries has been changed slightly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just ran https://github.com/databricks/tpch-dbgen again, it still generates the values as .06

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think at the time I went through and copied them from the TPC-H document directly to ensure they were 100% aligned.

@@ -2052,7 +2057,8 @@ mod tests {
fn select_simple_aggregate_with_groupby_non_column_expression_nested_and_not_resolvable(
) {
// The query should fail, because age + 9 is not in the group by.
let sql = "SELECT ((age + 1) / 2) * (age + 9), MIN(first_name) FROM person GROUP BY age + 1";
let sql =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a whitespace only change

@alamb
Copy link
Contributor

alamb commented Feb 10, 2021

I think this one is ready . FYI @andygrove

@alamb
Copy link
Contributor

alamb commented Feb 10, 2021

Looks like there was a conflict with #9399, sorry @Dandandan

@alamb alamb added the needs-rebase A PR that needs to be rebased by the author label Feb 10, 2021
@Dandandan
Copy link
Contributor Author

Thanks, fixed @alamb

@alamb alamb closed this in 6cfbd22 Feb 10, 2021
nevi-me pushed a commit to nevi-me/arrow that referenced this pull request Feb 13, 2021
…all TPC-H queries

This PR completes parsing support for the  TPC-H queries.
TPC-H 6 had a small change before compared to the original (`0.06` instead of `.06`) and the syntax for query 22 `substring(c_phone from 1 for 2)` is now supported.

Closes apache#9456 from Dandandan/sqlparser_upgrade

Authored-by: Heres, Daniel <danielheres@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
sgnkc pushed a commit to sgnkc/arrow that referenced this pull request Feb 17, 2021
…all TPC-H queries

This PR completes parsing support for the  TPC-H queries.
TPC-H 6 had a small change before compared to the original (`0.06` instead of `.06`) and the syntax for query 22 `substring(c_phone from 1 for 2)` is now supported.

Closes apache#9456 from Dandandan/sqlparser_upgrade

Authored-by: Heres, Daniel <danielheres@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
…all TPC-H queries

This PR completes parsing support for the  TPC-H queries.
TPC-H 6 had a small change before compared to the original (`0.06` instead of `.06`) and the syntax for query 22 `substring(c_phone from 1 for 2)` is now supported.

Closes apache#9456 from Dandandan/sqlparser_upgrade

Authored-by: Heres, Daniel <danielheres@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
michalursa pushed a commit to michalursa/arrow that referenced this pull request Jun 13, 2021
…all TPC-H queries

This PR completes parsing support for the  TPC-H queries.
TPC-H 6 had a small change before compared to the original (`0.06` instead of `.06`) and the syntax for query 22 `substring(c_phone from 1 for 2)` is now supported.

Closes apache#9456 from Dandandan/sqlparser_upgrade

Authored-by: Heres, Daniel <danielheres@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Rust - DataFusion Component: Rust needs-rebase A PR that needs to be rebased by the author
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants