Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spark3: Support for LATERAL VIEW clause #2687

Merged
merged 19 commits into from Feb 18, 2022

Conversation

R7L208
Copy link
Contributor

@R7L208 R7L208 commented Feb 18, 2022

Brief summary of the change made

  • update sqlfluff/core/parser/grammar/anyof.py to allow exclude named argument to be an iterable
  • created LateralViewClauseSegment in sqlfluff/dialects/dialect_spark3.py to match/parse the clause
  • removed unneeded comments & TODOs
  • Add exclusion for LATERAL keyword in AliasExpressionSegment
  • Redefine FromExpressionElementSegment from sqlfluff/dialects/dialect_ansi.py to include LateralViewClauseSegment
  • Updated L026 to be default off for Spark3 due to false positives in LATERAL VIEW (and structs are applicable here as well)

Are there any other side effects of this change that we should be aware of?

  • I added a try/except in sqlfluff/core/parser/grammar/anyof.py that will use any() to exclude segments then revert to the previously existing code block on a TypeError
  • Updated docstring to reflect all dialects in the exclusion list for L026

Pull Request checklist

  • Please confirm you have completed any of the necessary steps below.

  • Included test cases to demonstrate any code changes, which may be one or more of the following:

    • .yml rule test cases in test/fixtures/rules/std_rule_cases.
    • .sql/.yml parser test cases in test/fixtures/dialects (note YML files can be auto generated with tox -e generate-fixture-yml).
    • Full autofix test cases in test/fixtures/linter/autofix.
    • Other.
  • Added appropriate documentation for the change.

  • Created GitHub issues for any relevant followup/future enhancements if appropriate.

@codecov
Copy link

codecov bot commented Feb 18, 2022

Codecov Report

Merging #2687 (52d8fc9) into main (0307c76) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##              main     #2687   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          163       163           
  Lines        12025     12034    +9     
=========================================
+ Hits         12025     12034    +9     
Impacted Files Coverage Δ
src/sqlfluff/core/parser/grammar/anyof.py 100.00% <100.00%> (ø)
src/sqlfluff/dialects/dialect_spark3.py 100.00% <100.00%> (ø)
src/sqlfluff/rules/L026.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0307c76...52d8fc9. Read the comment docs.

Copy link
Member

@tunetheweb tunetheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot cleaner now! But spotted a few more things.

src/sqlfluff/dialects/dialect_spark3.py Outdated Show resolved Hide resolved
src/sqlfluff/dialects/dialect_spark3.py Outdated Show resolved Hide resolved
src/sqlfluff/dialects/dialect_spark3.py Outdated Show resolved Hide resolved
src/sqlfluff/dialects/dialect_spark3.py Outdated Show resolved Hide resolved
R7L208 and others added 3 commits February 18, 2022 13:07
Co-authored-by: Barry Pollard <barry_pollard@hotmail.com>
Comment on lines 1533 to 1535
AnyNumberOf(
Ref("AliasExpressionSegment"),
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last question, I promise!

Wouldn't this allow this?:

    LATERAL VIEW OUTER EXPLODE(ARRAY()) AS tbl_name AS c_age;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep asking 🙂 Updated (and will include CI updates as well) to fix this.

Using the below instead.

        Ref("SingleIdentifierGrammar", optional=True),
        Ref("AliasExpressionSegment", optional=True),

I ran the query with AS tbl_name AS c_age; and it failed. The first AS isn't allowed like you raised.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested the second AS and going to push one more commit then should be good to go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also removed two other test cases that were not valid so thank you for catching that initial case.

Comment on lines 1538 to 1542
Sequence(
"AS",
Ref("SingleIdentifierGrammar"),
optional=True,
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My read of the above link is this should be:

Suggested change
Sequence(
"AS",
Ref("SingleIdentifierGrammar"),
optional=True,
),
Sequence(
"AS",
Delimited(
Ref("SingleIdentifierGrammar")
),
optional=True,
),

Is that so? And if so should we add a test case for multiple aliases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, not firing on all cylinders today.

        Sequence(
            "AS",
            Delimited(
                Ref("SingleIdentifierGrammar")
            ),
        ),

Should be correct. I forgot to remove the optional clause. I'll add a case for multiple column aliases

Copy link
Member

@tunetheweb tunetheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got there in the end!
😁

@tunetheweb tunetheweb merged commit ef27896 into sqlfluff:main Feb 18, 2022
@R7L208 R7L208 deleted the r7l208/spark3-lateral-view-clause branch August 30, 2022 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants