Support CONVERT expressions #1048

Merged: alamb merged 1 commit into apache:main from lovasoa:convert-expressions on Nov 20, 2023

Contributor

lovasoa commented Nov 18, 2023

fixes #1047

adds support for the following CONVERT syntaxes:

  • CONVERT('héhé' USING utf8mb4) (MySQL, Postgres)
  • CONVERT('héhé', CHAR CHARACTER SET utf8mb4) (MySQL)
  • CONVERT(DECIMAL(10, 5), 42) (MSSQL) - the type comes first


coveralls commented Nov 18, 2023

Pull Request Test Coverage Report for Build 6915102482

  • 59 of 69 (85.51%) changed or added relevant lines in 7 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.007%) to 87.708%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/dialect/mod.rs | 1 | 2 | 50.0% |
| src/dialect/mssql.rs | 1 | 2 | 50.0% |
| src/dialect/redshift.rs | 0 | 2 | 0.0% |
| src/ast/mod.rs | 14 | 17 | 82.35% |
| src/parser/mod.rs | 33 | 36 | 91.67% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/ast/mod.rs | 1 | 78.94% |

Totals:
Change from base Build 6914235970: -0.007%
Covered Lines: 17845
Relevant Lines: 20346

💛 - Coveralls

lovasoa force-pushed the convert-expressions branch from b222845 to e5399b3 on November 18, 2023 16:36
Contributor

tobyhede commented Nov 19, 2023

The "simple" function syntax already works (e.g. CONVERT(DECIMAL(10, 5), 42)).

You could possibly make the code simpler by keeping this behaviour and adding the more complex CONVERT parsing only when it is detected.

Something like the following in the parser:

if self.parse_keyword(Keyword::USING) {
    // build the CONVERT expression
} else {
    // default to the existing parse_function
}

This approach would remove the need to check the argument order, and the CONVERT struct would not need so many Option fields, since it would only ever be used for the "complex" CONVERT statements.

Contributor Author

lovasoa commented Nov 20, 2023

@tobyhede, we cannot check self.parse_keyword(Keyword::USING) after having already parsed the first argument as a data type. In the MySQL syntax, the first argument is an expression, not a data type.
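
The ordering constraint in this comment can be sketched with a toy parser. This is a minimal illustration and not the code from this PR: the `Convert` enum, the `expect` helper, `parse_convert_args`, and the pre-split token input are all hypothetical, and each expression or data type is assumed to be a single token.

```rust
/// Toy result type (hypothetical); the real AST uses richer types than strings.
#[derive(Debug, PartialEq)]
enum Convert {
    Using { expr: String, charset: String },
    ToType { expr: String, data_type: String, charset: Option<String> },
}

/// Take the next token or report what was expected.
fn expect<'a>(it: &mut impl Iterator<Item = &'a str>, what: &str) -> Result<&'a str, String> {
    it.next().ok_or_else(|| format!("expected {}", what))
}

/// Parse the tokens between `CONVERT(` and `)`.
fn parse_convert_args(tokens: &[&str], type_before_value: bool) -> Result<Convert, String> {
    let mut it = tokens.iter().copied();
    if type_before_value {
        // MSSQL order: the first argument is a data type.
        let data_type = expect(&mut it, "data type")?.to_string();
        if expect(&mut it, "comma")? != "," {
            return Err("expected comma after data type".to_string());
        }
        let expr = expect(&mut it, "expression")?.to_string();
        return Ok(Convert::ToType { expr, data_type, charset: None });
    }
    // Otherwise the first argument is an expression, so it must be parsed
    // as one before USING can be looked for.
    let expr = expect(&mut it, "expression")?.to_string();
    match expect(&mut it, "USING or comma")? {
        "USING" => {
            let charset = expect(&mut it, "charset")?.to_string();
            Ok(Convert::Using { expr, charset })
        }
        "," => {
            let data_type = expect(&mut it, "data type")?.to_string();
            // Optional MySQL suffix: CHARACTER SET <charset>
            let rest: Vec<&str> = it.collect();
            let charset = match rest.as_slice() {
                [] => None,
                ["CHARACTER", "SET", cs] => Some(cs.to_string()),
                _ => return Err("expected CHARACTER SET <charset>".to_string()),
            };
            Ok(Convert::ToType { expr, data_type, charset })
        }
        other => Err(format!("unexpected token {}", other)),
    }
}
```

The decision between "type first" and "expression first" has to be made before the first argument is consumed, which is why the sketch branches on a dialect-supplied flag rather than on the USING keyword.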

Contributor Author

lovasoa commented Nov 20, 2023

> The "simple" function syntax already works (eg CONVERT(DECIMAL(10, 5), 42))

Does it? I think CONVERT(DECIMAL(10, 5), 42) is parsed incorrectly today: DECIMAL(10, 5) would be parsed as an expression (a function call) instead of a data type.

Contributor

alamb commented Nov 20, 2023

> Does it? I think CONVERT(DECIMAL(10, 5), 42) is parsed incorrectly today. DECIMAL(10, 5) would be parsed as an expression (a function call) instead of a data type.

I agree this is what it seems to do:

$ echo "select CONVERT(DECIMAL(10, 5), 42)" > /tmp/foo.sql
$ cargo run --example cli -- /tmp/foo.sql
    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/examples/cli /tmp/foo.sql`
Parsing from file '/tmp/foo.sql' using GenericDialect
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] Parsing sql 'select CONVERT(DECIMAL(10, 5), 42)
'...
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] parsing expr
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] parsing expr
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] parsing expr
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] prefix: Value(Number("10", false))
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: Comma, location: Location { line: 1, column: 26 } }
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] 0: , 1: 5 2: )
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] next precedence: 0
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] parsing expr
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] prefix: Value(Number("5", false))
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: RParen, location: Location { line: 1, column: 29 } }
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] 0: ) 1: , 2: 42
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] next precedence: 0
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] prefix: Function(Function { name: ObjectName([Ident { value: "DECIMAL", quote_style: None }]), args: [Unnamed(Expr(Value(Number("10", false)))), Unnamed(Expr(Value(Number("5", false))))], filter: None, null_treatment: None, over: None, distinct: false, special: false, order_by: [] })
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: Comma, location: Location { line: 1, column: 30 } }
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] 0: , 1: 42 2: )
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] next precedence: 0
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] parsing expr
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] prefix: Value(Number("42", false))
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: RParen, location: Location { line: 1, column: 34 } }
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] 0: ) 1: EOF 2: EOF
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] next precedence: 0
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] prefix: Function(Function { name: ObjectName([Ident { value: "CONVERT", quote_style: None }]), args: [Unnamed(Expr(Function(Function { name: ObjectName([Ident { value: "DECIMAL", quote_style: None }]), args: [Unnamed(Expr(Value(Number("10", false)))), Unnamed(Expr(Value(Number("5", false))))], filter: None, null_treatment: None, over: None, distinct: false, special: false, order_by: [] }))), Unnamed(Expr(Value(Number("42", false))))], filter: None, null_treatment: None, over: None, distinct: false, special: false, order_by: [] })
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] get_next_precedence() TokenWithLocation { token: EOF, location: Location { line: 0, column: 0 } }
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] 0: EOF 1: EOF 2: EOF
2023-11-20T19:50:50.259Z DEBUG [sqlparser::parser] next precedence: 0
Round-trip:
'SELECT CONVERT(DECIMAL(10, 5), 42)'
Parse results:
[
    Query(
        Query {
            with: None,
            body: Select(
                Select {
                    distinct: None,
                    top: None,
                    projection: [
                        UnnamedExpr(
                            Function(
                                Function {
                                    name: ObjectName(
                                        [
                                            Ident {
                                                value: "CONVERT",
                                                quote_style: None,
                                            },
                                        ],
                                    ),
                                    args: [
                                        Unnamed(
                                            Expr(
                                                Function(
                                                    Function {
                                                        name: ObjectName(
                                                            [
                                                                Ident {
                                                                    value: "DECIMAL",
                                                                    quote_style: None,
                                                                },
                                                            ],
                                                        ),
                                                        args: [
                                                            Unnamed(
                                                                Expr(
                                                                    Value(
                                                                        Number(
                                                                            "10",
                                                                            false,
                                                                        ),
                                                                    ),
                                                                ),
                                                            ),
                                                            Unnamed(
                                                                Expr(
                                                                    Value(
                                                                        Number(
                                                                            "5",
                                                                            false,
                                                                        ),
                                                                    ),
                                                                ),
                                                            ),
                                                        ],
                                                        filter: None,
                                                        null_treatment: None,
                                                        over: None,
                                                        distinct: false,
                                                        special: false,
                                                        order_by: [],
                                                    },
                                                ),
                                            ),
                                        ),
                                        Unnamed(
                                            Expr(
                                                Value(
                                                    Number(
                                                        "42",
                                                        false,
                                                    ),
                                                ),
                                            ),
                                        ),
                                    ],
                                    filter: None,
                                    null_treatment: None,
                                    over: None,
                                    distinct: false,
                                    special: false,
                                    order_by: [],
                                },
                            ),
                        ),
                    ],
                    into: None,
                    from: [],
                    lateral_views: [],
                    selection: None,
                    group_by: Expressions(
                        [],
                    ),
                    cluster_by: [],
                    distribute_by: [],
                    sort_by: [],
                    having: None,
                    named_window: [],
                    qualify: None,
                },
            ),
            order_by: [],
            limit: None,
            limit_by: [],
            offset: None,
            fetch: None,
            locks: [],
            for_clause: None,
        },
    ),
]

Contributor

alamb left a comment

Looks good to me. Thank you for the contribution @lovasoa

Comment thread src/ast/mod.rs
/// The target data type
data_type: Option<DataType>,
/// The target character encoding
charset: Option<ObjectName>,
Contributor

FWIW Postgres calls this "conversion name" but that appears to be an ObjectName so 👍

https://www.postgresql.org/docs/8.2/functions-string.html
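
To illustrate how optional `data_type` and `charset` fields can cover all three syntaxes from the PR description, here is a simplified sketch. It is not the AST code from this PR: plain strings stand in for the real expression, data type, and object-name types, and the `target_before_value` flag is an assumption made for the example.

```rust
use std::fmt;

/// Simplified stand-in for the CONVERT AST node (strings replace real types).
struct Convert {
    expr: String,
    data_type: Option<String>,
    charset: Option<String>,
    /// Hypothetical flag: true for the MSSQL `CONVERT(type, value)` order.
    target_before_value: bool,
}

impl fmt::Display for Convert {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "CONVERT(")?;
        match (&self.data_type, &self.charset) {
            // MSSQL: CONVERT(type, value)
            (Some(ty), None) if self.target_before_value => {
                write!(f, "{}, {}", ty, self.expr)?
            }
            // CONVERT(value, type)
            (Some(ty), None) => write!(f, "{}, {}", self.expr, ty)?,
            // MySQL: CONVERT(value, type CHARACTER SET charset)
            (Some(ty), Some(cs)) => {
                write!(f, "{}, {} CHARACTER SET {}", self.expr, ty, cs)?
            }
            // MySQL / Postgres: CONVERT(value USING charset)
            (None, Some(cs)) => write!(f, "{} USING {}", self.expr, cs)?,
            // Neither target given: print the bare expression.
            (None, None) => write!(f, "{}", self.expr)?,
        }
        write!(f, ")")
    }
}
```

Keeping both fields optional lets one node round-trip every form, at the cost of allowing combinations (such as neither field set) that no dialect actually produces.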

Comment thread src/dialect/mssql.rs
|| ch == '_'
}

/// SQL Server has `CONVERT(type, value)` instead of `CONVERT(value, type)`
Contributor

🤯
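
A dialect-level switch for the argument order might look roughly like the sketch below. The trait here is a trimmed-down, hypothetical stand-in rather than the crate's actual `Dialect` trait, and the method name `convert_type_before_value` and the helper `render_convert` are assumptions made for illustration.

```rust
/// Hypothetical, trimmed-down stand-in for a dialect trait.
trait Dialect {
    /// Whether CONVERT takes the type first: `CONVERT(type, value)`.
    /// Defaults to false, i.e. `CONVERT(value, type)`.
    fn convert_type_before_value(&self) -> bool {
        false
    }
}

struct MySqlDialect;
struct MsSqlDialect;

// MySQL keeps the default order.
impl Dialect for MySqlDialect {}

// SQL Server overrides the default: the type comes first.
impl Dialect for MsSqlDialect {
    fn convert_type_before_value(&self) -> bool {
        true
    }
}

/// Render a two-argument CONVERT call in the order the dialect expects.
fn render_convert(dialect: &dyn Dialect, value: &str, data_type: &str) -> String {
    if dialect.convert_type_before_value() {
        format!("CONVERT({}, {})", data_type, value)
    } else {
        format!("CONVERT({}, {})", value, data_type)
    }
}
```

A default method on the trait means only the dialects with the unusual order need to override anything.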

alamb changed the title from "add support for CONVERT expressions" to "Support CONVERT expressions" on Nov 20, 2023
alamb merged commit c905ee0 into apache:main on Nov 20, 2023
lovasoa deleted the convert-expressions branch on November 20, 2023 19:57
Development

Successfully merging this pull request may close these issues.

support mysql's CONVERT(string USING charset)