Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat:implement postgres style 'overlay' string function #8117

Merged
merged 13 commits into from
Nov 14, 2023

Conversation

Syleechan
Copy link
Contributor

Which issue does this PR close?

https://www.postgresql.org/docs/current/functions-string.html#:~:text=overlay%20(%20string,4)%20%E2%86%92%20Thomas

Closes #.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions labels Nov 10, 2023
@2010YOUY01
Copy link
Contributor

Thank you for your contribution!

Could you move tests to sqllogictest, and add SQL reference?

@alamb alamb changed the title feat:implement posgres style 'overlay' string function feat:implement postgres style 'overlay' string function Nov 10, 2023
@alamb
Copy link
Contributor

alamb commented Nov 12, 2023

Thank you for your contribution!

Could you move tests to sqllogictest, and add SQL reference?

I agree -- thank you @Syleechan

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this contribution @Syleechan - I agree with @2010YOUY01 that all this needs are some end to end sql tests (in sqllogictest): https://github.com/apache/arrow-datafusion/tree/main/datafusion/sqllogictest

# Conflicts:
#	datafusion/expr/src/built_in_function.rs
#	datafusion/physical-expr/src/functions.rs
#	datafusion/proto/proto/datafusion.proto
#	datafusion/proto/src/generated/pbjson.rs
#	datafusion/proto/src/generated/prost.rs
#	datafusion/proto/src/logical_plan/from_proto.rs
@github-actions github-actions bot added sql SQL Planner sqllogictest SQL Logic Tests (.slt) labels Nov 13, 2023
@Syleechan
Copy link
Contributor Author

Thanks @2010YOUY01, @alamb . I have added the sqllogic test and sql reference.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this contribution @Syleechan -- I think this is looking very nice

I had a few small suggestions, but they can also be done as follow on PRs if you prefer.

Thanks again!

('123', 'abc', 4, 5),
('abcdefg', 'qwertyasdfg', 1, 7),
('xyz', 'ijk', 1, 2),
('Txxxxas', 'hom', 2, 4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please also test NULL handling here? I think your code handles it properly, but it would be good to have this in the SQL tests. We can do this as a follow on PR if you prefer.

Specifically

Suggested change
('Txxxxas', 'hom', 2, 4)
('Txxxxas', 'hom', 2, 4),
(NULL, 'hom', 2, 4), -- expect NULL output
('Txxxxas', 'hom', NULL, 4), -- expect NULL output
('Txxxxas', 'hom', 2, NULL) -- expect NULL output

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have added these cases, thanks.

- **str**: String expression to operate on.
- **substr**: the string to replace part of str.
- **pos**: the start position to replace of str.
- **count**: the count of characters to be replaced from start position of str.If not specified, will use substr length instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- **count**: the count of characters to be replaced from start position of str.If not specified, will use substr length instead.
- **count**: the count of characters to be replaced from start position of str. If not specified, will use substr length instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved, thanks.

```
overlay(str PLACING substr FROM pos [FOR count])
```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the example made it much easier for me to see what was going on in postgres. Can we add an example here as well?

Suggested change
For example, `overlay('Txxxxas' placing 'hom' from 2 for 4) → Thomas`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To resolve the conflict, I have added this, thanks

# Conflicts:
#	datafusion/proto/proto/datafusion.proto
#	datafusion/proto/src/generated/pbjson.rs
#	datafusion/proto/src/generated/prost.rs
Copy link
Member

@Ted-Jiang Ted-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Syleechan lgtm, thx, will merge this, some pr is depend on this,

@Ted-Jiang Ted-Jiang merged commit 4535551 into apache:main Nov 14, 2023
23 checks passed
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great -- thank you @Syleechan and @Ted-Jiang

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions physical-expr Physical Expressions sql SQL Planner sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants