Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(optimizer)!: Preserve DPipe in simplify_concat #3317

Merged
merged 4 commits into from Apr 15, 2024

Conversation

VaggelisD
Copy link
Collaborator

Fixes TobikoData/sqlmesh#2439

The simplify_concat rule would merge a list of binary exp.DPipe concatenations into one exp.Concat function call while constant folding any running string literals on the way. That (optimized) expression would then be generated as CONCAT(a, b, ...) which might not have the same semantics as a || b in every dialect e.g. tsvector in Postgres.

This PR aims to preserve exp.DPipe by tagging any exp.Concat expressions generated by simplify_concat with dpipe_source=True, which will cause them to be generated back as a list of (now optimized) binary exp.DPipe concatenations

@georgesittas
Copy link
Collaborator

Isn't the issue local to simplify_concat? Why do we need to introduce yet another arg to Concat for this? Imo this should be as simple as checking the type of the input expression and then ensuring we return that type before returning, eg. if we have a Concat at the end but the original type was DPipe, then do something like this (without the sql part):

def concat_to_dpipe_sql(self: Generator, expression: exp.Concat) -> str:
return self.sql(reduce(lambda x, y: exp.DPipe(this=x, expression=y), expression.expressions))

@VaggelisD
Copy link
Collaborator Author

@georgesittas Check out the latest commit, reduce did simplify things a lot, thanks!

Copy link
Collaborator

@georgesittas georgesittas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

sqlglot/optimizer/simplify.py Outdated Show resolved Hide resolved
Co-authored-by: Jo <46752250+georgesittas@users.noreply.github.com>
sqlglot/optimizer/simplify.py Outdated Show resolved Hide resolved
@georgesittas georgesittas merged commit 290e408 into main Apr 15, 2024
5 checks passed
@georgesittas georgesittas deleted the vaggelisd/simplify_concat branch April 15, 2024 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect parsing of tsvector || tsvector [postgres]
2 participants