Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

L016: "sqlfluff fix" adds too many newlines #1382

Conversation

barrywhart
Copy link
Member

Fixes #1344

L016 "Line is too long" has a bug that causes it to behave pathologically if it finds a line with an inline comment that is:

  • Is on a line with other code
  • Exceeds the maximum line length even after moving it to a line by itself

E.g.

    ,--  the "y variable" and uses_small_subject_line to be the "x variable" in terms of the regression line.

In this case, L016 keeps adding blank lines endlessly until it hits the sqlfluff fix iteration limit (typically 10).

Pull Request checklist

  • Please confirm you have completed any of the necessary steps below.

  • Included test cases to demonstrate any code changes, which may be one or more of the following:

    • .yml rule test cases in test/fixtures/rules/std_rule_cases.
    • .sql/.yml parser test cases in test/fixtures/parser (note YML files can be auto generated with python test/generate_parse_fixture_yml.py or by running tox locally).
    • Full autofix test cases in test/fixtures/linter/autofix.
    • Other.
  • Added appropriate documentation for the change.

  • Created GitHub issues for any relevant followup/future enhancements if appropriate.

@@ -391,7 +391,6 @@ def _eval(self, segment, raw_stack, **kwargs):

# We'll need the indent, so let's get it for fixing.
line_indent = []
idx = 0
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My IDE identified this as an unused variable, so I removed it. This is not related to the fix.

# this case, the rule should do nothing, otherwise it
# triggers an endless cycle of "fixes" that simply keeps
# adding blank lines.
self.logger.info(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewers: The lines inside the if are very similar to existing lines 407-413, only the condition is different.

@@ -0,0 +1,13 @@
SELECT
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is sort of a real-world test for the fix, adapted from the existing file mentioned in the issue, benchmarks/bench/bench_002_pearson_fix.sql.

During the analysis of the issue, three rules were primarily involved in cleaning up the file, and this seemed like a good test case to add to the suite.

@barrywhart barrywhart changed the title Issue 1344: "sqlfluff fix" adds too many newlines Issue 1344: L016: "sqlfluff fix" adds too many newlines Sep 11, 2021
Copy link
Member

@tunetheweb tunetheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - nice find!

One question: Why didn't the timing query bomb out when the LOOP LIMIT was reached? Should that be changed from a WARNING to an ERROR to help us spot these issues in the future?

@barrywhart
Copy link
Member Author

The loop limit did stop it, but it was adding one additional blank line each time, so it added a bunch of them before it stopped.

I think I'd be okay changing it to an error if we also raised the default limit a bit, maybe to 20 or 25.

Related: Did you see my note about how it was moving the comma up one line at a time? IMHO, that's at best a poorly written rule, probably a bug. Imagine if there's a really long block of comments between the initial location of a leading comma and its final, desired, trailing location above. It's poor design for the linter to require 'n' iterations to move a comma by 'n' lines. Think I should create an issue for that?

@barrywhart barrywhart merged commit 1acb2a7 into sqlfluff:main Sep 12, 2021
@tunetheweb
Copy link
Member

The loop limit did stop it, but it was adding one additional blank line each time, so it added a bunch of them before it stopped.
I think I'd be okay changing it to an error if we also raised the default limit a bit, maybe to 20 or 25.

My point is reaching the loop limit implies there’s a problem surely? But that wasn’t surfaced. Should reaching the loop limit throw an exception rather than silently swallowing it?

Related: Did you see my note about how it was moving the comma up one line at a time? IMHO, that's at best a poorly written rule, probably a bug. Imagine if there's a really long block of comments between the initial location of a leading comma and its final, desired, trailing location above. It's poor design for the linter to require 'n' iterations to move a comma by 'n' lines. Think I should create an issue for that?

Possibly. It might as well skip that it if can. But it’s not the only rule that can result in a rerun surfacing more fixes.

@barrywhart
Copy link
Member Author

barrywhart commented Sep 12, 2021

Yes, reaching the loop limit most likely means there was a problem. I'm a little torn about having it throw an exception:

  • Pros: Issues get surfaced and reported.
  • Cons: Perhaps we improved the user's SQL. Are we talking about stopping the run, or reporting an exception within that file and still processing other files? Do we still rewrite their SQL file?

I'm generally +1 on your suggestion, but want to consider the details. I'll create an issue for this so we can discuss and fix it.

@barrywhart
Copy link
Member Author

See #1386 re: loop limit.

@barrywhart
Copy link
Member Author

See #1387 re: moving the comma one line at a time.

@tunetheweb tunetheweb changed the title Issue 1344: L016: "sqlfluff fix" adds too many newlines L016: "sqlfluff fix" adds too many newlines Sep 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sqlfluff fix adding too many newlines to bench_002_pearson.sql
2 participants