Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand SQL formatter to LakeFormation #1684

Merged
merged 5 commits into from
Oct 25, 2022
Merged

Conversation

LeonLuttenberger
Copy link
Contributor

Feature or Bugfix

  • Feature

Detail

  • Expand SQL formatter to LakeFormation

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@malachi-constant

This comment was marked as outdated.

@malachi-constant

This comment was marked as outdated.

@malachi-constant

This comment was marked as outdated.

@malachi-constant

This comment was marked as outdated.

@malachi-constant
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: 23cdef4
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@malachi-constant

This comment was marked as outdated.

@malachi-constant

This comment was marked as outdated.

@LeonLuttenberger LeonLuttenberger self-assigned this Oct 13, 2022
@LeonLuttenberger LeonLuttenberger added major release Will be addressed in the next major release and removed major release Will be addressed in the next major release labels Oct 13, 2022
@LeonLuttenberger LeonLuttenberger added this to the 3.0.0 milestone Oct 13, 2022
@LeonLuttenberger LeonLuttenberger marked this pull request as ready for review October 13, 2022 20:37
@malachi-constant
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: 23cdef4
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@malachi-constant
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubStandardCodeBuild8C06-llutOAimTATs
  • Commit ID: 23cdef4
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Copy link
Contributor

@jaidisido jaidisido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe LakeFormation uses Redshift Spectrum under the hood so I don't believe the Presto Engine definition we have would work for conditions like IN. Could we please just check that we don't need to create a new Engine definition for it? Looks good otherwise, thanks!

@LeonLuttenberger
Copy link
Contributor Author

I believe LakeFormation uses Redshift Spectrum under the hood so I don't believe the Presto Engine definition we have would work for conditions like IN. Could we please just check that we don't need to create a new Engine definition for it? Looks good otherwise, thanks!

@jaidisido The only difference I found between how the engines handle literals was that in how Presto and Hive escape string quotations. However, PartiQL does it the same way that Presto does (' within a string is escaped as '').

I'm not sure what you mean by IN though? Presto and PartiQL seem to have the same syntax for IN (values separated by commas enclosed by parenthesis). There is an issue here thought that customers have no way of passing us a list of parameters that go into IN, and have us parse it as such.

For example, if they give us

sql="SELECT * FROM table WHERE id IN :ids",
params={"ids": [1, 2, 3]}

we will parse this as SELECT * FROM table WHERE id IN ARRAY[1, 2, 3]. This query would therefore be invalid for both Presto and PartiQL. I'm not sure how to solve this gap though, as we need to parse lists as arrays.

Copy link
Contributor

@malachi-constant malachi-constant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, holding off on approval until discussion started by @jaidisido is resovled.

@jaidisido
Copy link
Contributor

I believe LakeFormation uses Redshift Spectrum under the hood so I don't believe the Presto Engine definition we have would work for conditions like IN. Could we please just check that we don't need to create a new Engine definition for it? Looks good otherwise, thanks!

@jaidisido The only difference I found between how the engines handle literals was that in how Presto and Hive escape string quotations. However, PartiQL does it the same way that Presto does (' within a string is escaped as '').

I'm not sure what you mean by IN though? Presto and PartiQL seem to have the same syntax for IN (values separated by commas enclosed by parenthesis). There is an issue here thought that customers have no way of passing us a list of parameters that go into IN, and have us parse it as such.

For example, if they give us

sql="SELECT * FROM table WHERE id IN :ids",
params={"ids": [1, 2, 3]}

we will parse this as SELECT * FROM table WHERE id IN ARRAY[1, 2, 3]. This query would therefore be invalid for both Presto and PartiQL. I'm not sure how to solve this gap though, as we need to parse lists as arrays.

The Array one is tricky indeed, but I meant for even simpler examples like dt.datetime, this would parse to TIMESTAMP '...' which I don't believe would be accepted in a partiql LakeFormation/Redshift query. I think we need to test this formatter against some actual queries submitted to the engine to determine our gaps

@malachi-constant
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: 3e0d7ec
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@malachi-constant
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: 3e0d7ec
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@malachi-constant
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubStandardCodeBuild8C06-llutOAimTATs
  • Commit ID: 3e0d7ec
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@malachi-constant
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubDistributedCodeBuild6-jWcl5DLmvupS
  • Commit ID: d0f375a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@malachi-constant
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubLoadTests5656BB24-s6u9F3qN9oFy
  • Commit ID: d0f375a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@malachi-constant
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: GitHubStandardCodeBuild8C06-llutOAimTATs
  • Commit ID: d0f375a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Copy link
Contributor

@jaidisido jaidisido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks great, thank you for taking the time to deep dive partiql!

@LeonLuttenberger LeonLuttenberger merged commit 2d73100 into release-3.0.0 Oct 25, 2022
@LeonLuttenberger LeonLuttenberger deleted the sql-formatter branch October 25, 2022 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants