Rough autofix for L028 #2757

OTooleMichael · 2022-03-02T09:43:22Z

This adds autofixing for L028.

However, L028 is maybe flawed from the bottom up? The changes are subject to opinion, and maybe not all of them are wanted or needed.

Remove aliases when we desire unqualified references.
Add aliases when we desire qualified references.
Add a fix_inconsistent_to setting that controls whether inconsistent references are fixed to unqualified or qualified when encountered.
Check the STRUCT case when possible. In a.struct_value FROM tbl_name, a must be a struct, as a is not the table reference. If a is intended to be the table, this code will not run. If we assume it's valid code and "fix" it, the fix will not change whether it runs (valid code will remain valid, but now linted; invalid code will remain invalid).
fix_inconsistent_to also works on struct dialects.
Added tests to cover the alias case, the schema.table case and the quoted case. Some of these cases were a bit weird in the past and still are. Namely, SELECT schema.table.col1, table.col2 FROM schema.table is consistent and qualified.
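
To make the intended behaviour above concrete, here is a toy model in Python. It is only a sketch and not the code in this PR: it works on plain dotted strings rather than parsed segments, the names fix_references, is_qualified and mode are hypothetical, and it ignores aliases, quoting and the schema.table case. A reference is treated as qualified only when its first part is the table name, so a.struct_value counts as an unqualified struct access.

```python
from typing import List


def is_qualified(ref: str, table: str) -> bool:
    """Toy check: qualified only when the part before the first dot is the table."""
    head, sep, _ = ref.partition(".")
    return bool(sep) and head == table


def fix_references(
    refs: List[str],
    table: str,
    mode: str = "consistent",  # hypothetical: "consistent", "qualified" or "unqualified"
    fix_inconsistent_to: str = "qualified",
) -> List[str]:
    """Return the references rewritten to the desired style (toy model only)."""
    flags = [is_qualified(ref, table) for ref in refs]
    if mode == "consistent":
        # Already consistent (all qualified or all unqualified): nothing to fix.
        if all(flags) or not any(flags):
            return list(refs)
        target = fix_inconsistent_to
    else:
        target = mode
    if target not in ("qualified", "unqualified"):
        raise ValueError(f"unknown target style: {target!r}")

    fixed = []
    for ref, qualified in zip(refs, flags):
        if target == "qualified" and not qualified:
            # Covers plain columns and struct accesses such as a.struct_value.
            fixed.append(f"{table}.{ref}")
        elif target == "unqualified" and qualified:
            # Strip the leading "table." prefix.
            fixed.append(ref[len(table) + 1:])
        else:
            fixed.append(ref)
    return fixed


mixed = fix_references(["bar", "my_tbl.foo"], "my_tbl")
struct_case = fix_references(["a.struct_value"], "tbl_name", mode="qualified")
```

In this model a mixed query is pushed to whichever style fix_inconsistent_to names, and a struct access is qualified by prefixing the table rather than being mistaken for an already qualified column.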

Are there any other side effects of this change that we should be aware of?

The rule now catches and understands structs better, but also differently.

Pull Request checklist

Please confirm you have completed any of the necessary steps below.
Included test cases to demonstrate any code changes, which may be one or more of the following:
- [x] .yml rule test cases in test/fixtures/rules/std_rule_cases.

Notes:

Typing is off: references = list(sc.recursive_crawl("object_reference")) yields items that have methods like is_qualified but are typed as List[BaseSegment]. We can't run any sensible guard statement like assert isinstance(ref, (A, B, C)) because ObjectRefs are reimplemented in various dialects directly from BaseSegment.
The rule inheritance, Rule_L028(Rule_L020), feels like a bad idea: in this particular case it serves very little purpose other than making things hard to read and making method/call signatures weird.
The issue with creating more complex tree elements occurs again. A table Str_ref may be quoted or naked depending on the query, and it is hard to construct the correct tree types. Without correct types, how will future rules fix things correctly?
Inheritance essentially just does this block:

if context.segment.is_type("select_statement"):
    select_info = get_select_statement_info(context.segment, context.dialect)
    if not select_info:
        return None

    # Work out if we have a parent select function
    parent_select = None
    for seg in reversed(context.parent_stack):
        if seg.is_type("select_statement"):
            parent_select = seg
            break

Setting up a function for this is probably more reusable. On a side note, get_select_statement_info might be a good candidate for memoizing.
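
A minimal sketch of what such a function could look like, assuming only that segments expose is_type(). The Segment class here is a hypothetical stand-in for sqlfluff's BaseSegment, and find_parent_select is an illustrative name, not something that exists in the codebase.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Segment:
    """Hypothetical stand-in for BaseSegment; only is_type() is modelled."""

    type: str
    name: str = ""

    def is_type(self, *seg_types: str) -> bool:
        return self.type in seg_types


def find_parent_select(parent_stack: Sequence[Segment]) -> Optional[Segment]:
    """Return the nearest enclosing select_statement, or None if there is none."""
    # Walk from the innermost parent outwards and stop at the first select.
    for seg in reversed(parent_stack):
        if seg.is_type("select_statement"):
            return seg
    return None


stack = [
    Segment("file"),
    Segment("select_statement", "outer"),
    Segment("from_clause"),
    Segment("select_statement", "inner"),
    Segment("where_clause"),
]
nearest = find_parent_select(stack)
```

With a helper like this, a rule could call it directly instead of inheriting from Rule_L020 just to get the loop. Memoizing get_select_statement_info is a separate question, since it would need a cache key for the segment and dialect arguments.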

codecov · 2022-03-02T10:06:24Z

Codecov Report

Merging #2757 (9b9565f) into main (78bb1bd) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main     #2757   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          163       163           
  Lines        12232     12264   +32     
=========================================
+ Hits         12232     12264   +32

Impacted Files	Coverage Δ
src/sqlfluff/rules/L028.py	`100.00% <100.00%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 78bb1bd...9b9565f. Read the comment docs.

tunetheweb

Not had too much of a look at the code, as I've a more fundamental question on the new attribute and whether it's even needed.

Love the fact you support STRUCTS now for this rule!

src/sqlfluff/core/rules/config_info.py

src/sqlfluff/rules/L028.py

test/fixtures/rules/std_rule_cases/L028.yml

tunetheweb · 2022-03-02T10:32:31Z

test/fixtures/rules/std_rule_cases/L028.yml

+  fail_str: SELECT bar FROM my_tbl WHERE foo
+  fix_str: SELECT my_tbl.bar FROM my_tbl WHERE my_tbl.foo


Ohh interesting that it affects WHERE clause too! Did it always do that? Do we need a simple "pass" test for that?

There is some clever stuff about managing self references in certain dialects.
Redshift can ref a created column.

That said

SELECT a, b, a + b AS col_created_right_here, col_created_right_here + 1 AS sub_self_ref FROM tbl

Is also valid in Redshift, Snowflake and BQ
So.....

Ahhh that actually could be quite a large problem.

SELECT
    tbl.a,
    tbl.b,
    tbl.a + tbl.b AS col_created_right_here,
    col_created_right_here + 1 AS sub_self_ref -- noqa: disable=L028
FROM tbl

This noqa is very much intentional (I normally want qualified references, but this is actually a self reference and I know it). Will it be correctly respected?

I think we recently discovered that noqa doesn't work in the YAML files.

My original comment was we have a fail and fix test case for WHERE reference, but no pass test case. Should we have one? In theory fail/fix should be sufficient but I always like to have a pass test case in these instances too.

Still wonder if we should have a simple pass_str test for this?

Re: noqa in YAML tests. This should now work correctly for the usual YAML tests (test/rules/yaml_test_cases_test.py::test__rule_test_case). Until recently, pass tests used a weird code path that skipped parts of the linter. As of a couple weeks ago, everything should work.

I'll add a test for this.

Test added for fail and pass with noqa.

Later I'd like to revisit this / make another rule for lateral column references - they cause huge problems.

A rule like "lateral self references must begin with __self_".

Then there would be some interaction between this and other rules.
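
As a rough illustration of that idea, the check could look something like the sketch below. This is not a proposed sqlfluff rule: it takes the select targets as hypothetical (expression, alias) pairs instead of parsed segments, finds identifiers with a simple regex, and cannot tell a lateral alias apart from a real column of the same name.

```python
import re
from typing import List, Optional, Tuple

# Bare identifiers only: skip anything that follows a dot (e.g. tbl.col).
IDENTIFIER = re.compile(r"(?<![\w.])[A-Za-z_]\w*")
SELF_PREFIX = "__self_"


def unprefixed_lateral_refs(
    targets: List[Tuple[str, Optional[str]]]
) -> List[str]:
    """List lateral references to earlier aliases that lack the __self_ prefix."""
    seen_aliases = set()
    violations = []
    for expression, alias in targets:
        for name in IDENTIFIER.findall(expression):
            # Only aliases defined by an earlier select target can be lateral refs.
            if name in seen_aliases and not name.startswith(SELF_PREFIX):
                violations.append(name)
        if alias:
            seen_aliases.add(alias)
    return violations


# The select targets from the Redshift/Snowflake/BQ example in this thread.
flagged = unprefixed_lateral_refs(
    [
        ("a", None),
        ("b", None),
        ("a + b", "col_created_right_here"),
        ("col_created_right_here + 1", "sub_self_ref"),
    ]
)
```

A naming convention like this would also give L028 an easy way to recognise intentional self references without a noqa.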

src/sqlfluff/rules/L028.py

tunetheweb · 2022-03-02T15:52:42Z

@OTooleMichael what's your thoughts on this change and v0.11.0? Think we're ready to go on that but happy to hold off if you think you'll get this finished out in next day or so?

OTooleMichael · 2022-03-02T16:33:39Z

> @OTooleMichael what's your thoughts on this change and v0.11.0? Think we're ready to go on that but happy to hold off if you think you'll get this finished out in next day or so?

I'll fix this in under 12 hours from now.
The bigger deal is the bug fix to L049, which currently "fixes" code into unparsable output :) It mangled a tonne of my files.

tunetheweb · 2022-03-02T16:41:34Z

Cool. Will hold off for this then. And that other one!

OTooleMichael · 2022-03-03T12:22:59Z

This should be ready

tunetheweb

Looks pretty good to me now, but have some more comments for you.

@barrywhart could you give it a review too since you know the rules code better than me.

tunetheweb · 2022-03-03T13:45:01Z

src/sqlfluff/rules/L028.py

       structs which trigger false positives. It can be enabled with the
       ``force_enable = True`` flag.


Since you now support structs in this rule do we still need this note?
And the force_enable flag?

Also nit but if keeping it, can we update text to "This rule is disabled by default for BigQuery, Hive and Redshift due to their use of"

To what should we change the text?

I'd probably leave the force_enable flag for now (it's an issue for a bigger release). Removing it is a more serious change that could be breaking, and it probably needs more thought than I can give it before pushing out 0.11.1.

Sounds good.

At the moment the text just mentions BigQuery, but it's three dialects this affects so think we should mention the other two: "This rule is disabled by default for BigQuery, Hive and Redshift due to their use of..."

Or alternatively should say "This rule is disabled by default for dialects like BigQuery due to their use of..." to avoid having to keep it in sync in future, but do think listing all three is more friendly to users.

src/sqlfluff/rules/L028.py

tunetheweb · 2022-03-03T13:49:16Z

test/fixtures/rules/std_rule_cases/L028.yml

+  fail_str: SELECT bar FROM my_tbl WHERE foo
+  fix_str: SELECT my_tbl.bar FROM my_tbl WHERE my_tbl.foo


Still wonder if we should have a simple pass_str test for this?

src/sqlfluff/rules/L028.py

tunetheweb requested changes Mar 2, 2022

OTooleMichael force-pushed the L028_autofix branch from e879234 to ef74a8d March 3, 2022 12:00

OTooleMichael requested a review from tunetheweb March 3, 2022 12:00

OTooleMichael added 4 commits March 3, 2022 14:39

Rough autofix for L028

d64a2aa

Fix Yaml duplicate key

486112f

cleaned up

e23cf6e

flip back docs

1c06f0c

OTooleMichael force-pushed the L028_autofix branch from 00f287b to 1c06f0c March 3, 2022 13:40

tunetheweb reviewed Mar 3, 2022

tunetheweb mentioned this pull request Mar 3, 2022

L049 bug: correct over zealous = --> IS #2760

Merged

1 task

barrywhart reviewed Mar 3, 2022

tests and cleaning

60cb096

OTooleMichael force-pushed the L028_autofix branch from 6acc997 to 60cb096 March 3, 2022 16:22

OTooleMichael added 5 commits March 3, 2022 17:38

Correct Notes

2457146

Correct Notes

fe04e6d

changes for coverage

015092b

fix coverage

f77f84b

Merge branch 'main' into L028_autofix

0f2019b

OTooleMichael requested a review from tunetheweb March 3, 2022 20:46

Merge branch 'main' into L028_autofix

9b9565f

tunetheweb approved these changes Mar 4, 2022

tunetheweb merged commit 267b6bb into sqlfluff:main Mar 4, 2022
