Skip to content

[CALCITE-6999] Invalid unparse for TRIM in PrestoDialect#4360

Merged
mihaibudiu merged 1 commit intoapache:mainfrom
xuzifu666:support_trim_presto_mysql
May 8, 2025
Merged

[CALCITE-6999] Invalid unparse for TRIM in PrestoDialect#4360
mihaibudiu merged 1 commit intoapache:mainfrom
xuzifu666:support_trim_presto_mysql

Conversation

@xuzifu666
Copy link
Copy Markdown
Member

@xuzifu666 xuzifu666 force-pushed the support_trim_presto_mysql branch 3 times, most recently from 135157d to b9c2983 Compare May 4, 2025 10:22
Copy link
Copy Markdown
Member

@xiedeyantu xiedeyantu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@caicancai caicancai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the BigQuery dialect files were modified ?

@xuzifu666
Copy link
Copy Markdown
Member Author

Why the BigQuery dialect files were modified ?

The logic can be reused by other dialects,so move it to RelToSqlConverterUtil avoid duplicate code. @caicancai

@caicancai
Copy link
Copy Markdown
Member

There are two reasons. The first is that you cannot be sure that every dialect has this function and the format is the same. The second is that you did not modify the above comment.

@xuzifu666
Copy link
Copy Markdown
Member Author

xuzifu666 commented May 5, 2025

There are two reasons. The first is that you cannot be sure that every dialect has this function and the format is the same. The second is that you did not modify the above comment.

Yes, here need to adapt system which support trim/ltrim/rtrim functions and presto is supported well(which test result in JIRA:https://issues.apache.org/jira/browse/CALCITE-6999), and I had updated the comments with presto link. thanks for reminder @caicancai

@caicancai
Copy link
Copy Markdown
Member

There are two reasons. The first is that you cannot be sure that every dialect has this function and the format is the same. The second is that you did not modify the above comment.

Yes, here need to adapt system which support trim/ltrim/rtrim functions and presto is supported well(which test result in JIRA:https://issues.apache.org/jira/browse/CALCITE-6999), and I had updated the comments with presto link. thanks for reminder @caicancai

So now all dialects support trim function parsing?

@xuzifu666
Copy link
Copy Markdown
Member Author

There are two reasons. The first is that you cannot be sure that every dialect has this function and the format is the same. The second is that you did not modify the above comment.

Yes, here need to adapt system which support trim/ltrim/rtrim functions and presto is supported well(which test result in JIRA:https://issues.apache.org/jira/browse/CALCITE-6999), and I had updated the comments with presto link. thanks for reminder @caicancai

So now all dialects support trim function parsing?

This can not confirm,but which need to use(like BigQuery/Presto) to convert right way can use the method.

* <a href="https://prestodb.io/docs/current/functions/string.html#trim-string-varchar">
* Presto Trim Function</a>.
*/
public static void unparseTrim(SqlWriter writer, SqlCall call, int leftPrec,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this function is specific to BigQuery, the function name should reflect that.
I would not abbreviate BQ in the JavaDoc, why force people to try to guess what it is when you can just write it down?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes,make sense, this method not only used for BigQuery,can be used by any need convert trim(both from) to trim/ltrim/rtrim functions, so method name may not need to be changed, I had updated the bbreviate BQ in JavaDoc.Thanks for suggestion @mihaibudiu

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the current name is too generic, this function is not suitable for unparsingTrim for all dialects, as the name would suggest
I think unparseTrimBigQuery is better, or unparseTrimLR if you want.
Many dialects can share it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I thought unparseTrimLR is better, so changed the function to unparseTrimLR~ @mihaibudiu

call.operand(2).unparse(writer, leftPrec, rightPrec);

// If the trimmed character is a non-space character, add it to the target SQL.
// eg: TRIM(BOTH 'A' from 'ABCD'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing parens in comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// eg: TRIM(BOTH 'A' from 'ABCD'
// Output Query: TRIM('ABC', 'A')
String value = requireNonNull(valueToTrim.toValue(), "valueToTrim.toValue()");
if (!value.matches("\\s+")) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this correct?
is trim(s, "\n") the same as trimming all whitespaces?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here should be correct, I had test for it:

select trim('   aa c', ' ');
>aa c

trim with '\n'

select trim('   aa c', '\n');
>   aa c

trim(s, "\n") is not the same as trimming all whitespaces.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about a literal tab character (not the \t escape sequence)?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the regexp \s+ will match a tab, or even a sequence of spaces and tabs and newlines

Copy link
Copy Markdown
Member Author

@xuzifu666 xuzifu666 May 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes,you are right,maybe we can use StringUtils.isWhitespace(value) to instead of it? which can only match white spaces. @mihaibudiu

@xuzifu666
Copy link
Copy Markdown
Member Author

Hi @mihaibudiu all comments had been addressed, whether is OK currently?

// eg: TRIM(BOTH 'A' from 'ABCD')
// Output Query: TRIM('ABC', 'A')
String value = requireNonNull(valueToTrim.toValue(), "valueToTrim.toValue()");
if (!StringUtils.isWhitespace(value)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't think this is correct.
The spec says that the default value is just a single space.
So the test should be for value.equals(" "), or it can be removed altogether.

Copy link
Copy Markdown
Member Author

@xuzifu666 xuzifu666 May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, I agree with it and had changed it, PTAL thanks @mihaibudiu

@xuzifu666 xuzifu666 requested a review from mihaibudiu May 8, 2025 01:39
Copy link
Copy Markdown
Contributor

@mihaibudiu mihaibudiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can squash the commits

@mihaibudiu mihaibudiu added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label May 8, 2025
@xuzifu666 xuzifu666 force-pushed the support_trim_presto_mysql branch from 7aa2abb to 089c626 Compare May 8, 2025 05:20
@xuzifu666
Copy link
Copy Markdown
Member Author

Thanks for review,I had squashed commits. @mihaibudiu

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud bot commented May 8, 2025

@mihaibudiu mihaibudiu merged commit 06355b6 into apache:main May 8, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

LGTM-will-merge-soon Overall PR looks OK. Only minor things left.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants