Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UNION ALL should strip table identifiers in its resulting schema #10706

Closed
phillipleblanc opened this issue May 29, 2024 · 1 comment · Fixed by #11082
Closed

UNION ALL should strip table identifiers in its resulting schema #10706

phillipleblanc opened this issue May 29, 2024 · 1 comment · Fixed by #11082
Assignees
Labels
bug Something isn't working

Comments

@phillipleblanc
Copy link
Contributor

phillipleblanc commented May 29, 2024

edit:
This bug cannot be fixed as-is due to a requirement that we keep the table identifiers to prevent a duplicate column issue. See #5410 for more context on that issue

The PR that will close this issue only fixes it for the unparser scenario: #11082

Describe the bug

The schema that is the result of a UNION ALL should not have any table qualifiers, as the table information has effectively been erased and is no longer a valid reference.

Consider the following SQL:

SELECT table1.foo FROM table1
UNION ALL
SELECT table2.foo FROM table2

The logical schema from this UNION ALL should be just foo, but it is currently taking the first input's schema directly, resulting in a schema of table1.foo.

This came up as an issue when trying to validate the unparser works correctly for UNION ALL statements, which I added in #10603

Slightly modifying the above example, if I add an ORDER BY clause to the input SQL:

SELECT table1.foo FROM table1
UNION ALL
SELECT table2.foo FROM table2
ORDER BY foo

Then the resulting unparsed SQL will be:

SELECT table1.foo FROM table1
UNION ALL
SELECT table2.foo FROM table2
ORDER BY table1.foo

Because the unparser takes the schema directly from the Union node when writing out the column expressions.

To Reproduce

Running this test fails, when it should succeed:

#[test]
fn roundtrip_statement_with_dialect() -> Result<()> {
    struct TestStatementWithDialect {
        sql: &'static str,
        expected: &'static str,
        parser_dialect: Box<dyn Dialect>,
        unparser_dialect: Box<dyn UnparserDialect>,
    }
    let tests: Vec<TestStatementWithDialect> = vec![
        TestStatementWithDialect {
            sql: "SELECT j1_id FROM j1
                  UNION ALL
                  SELECT tb.j2_id as j1_id FROM j2 tb
                  ORDER BY j1_id
                  LIMIT 10;",
            expected: r#"SELECT j1.j1_id FROM j1 UNION ALL SELECT tb.j2_id AS j1_id FROM j2 AS tb ORDER BY j1_id ASC NULLS LAST LIMIT 10"#,
            parser_dialect: Box::new(GenericDialect {}),
            unparser_dialect: Box::new(UnparserDefaultDialect {}),
        },
    ];

    for query in tests {
        let statement = Parser::new(&*query.parser_dialect)
            .try_with_sql(query.sql)?
            .parse_statement()?;

        let context = MockContextProvider::default();
        let sql_to_rel = SqlToRel::new(&context);
        let plan = sql_to_rel
            .sql_statement_to_plan(statement)
            .unwrap_or_else(|e| panic!("Failed to parse sql: {}\n{e}", query.sql));

        let unparser = Unparser::new(&*query.unparser_dialect);
        let roundtrip_statement = unparser.plan_to_sql(&plan)?;

        let actual = format!("{}", &roundtrip_statement);
        println!("roundtrip sql: {actual}");
        println!("plan {}", plan.display_indent());

        assert_eq!(query.expected, actual);
    }

    Ok(())
}

Expected behavior

The table qualifiers are stripped from the schema of a UNION ALL LogicalPlan node.

Additional context

No response

@phillipleblanc
Copy link
Contributor Author

take

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
1 participant