Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FORMATTING] Comments move their lines after formatting #365

Closed
primeapple opened this issue Aug 1, 2022 · 6 comments · Fixed by #388
Closed

[FORMATTING] Comments move their lines after formatting #365

primeapple opened this issue Aug 1, 2022 · 6 comments · Fixed by #388
Labels

Comments

@primeapple
Copy link
Contributor

primeapple commented Aug 1, 2022

Input data

select
  name, -- student name
  age -- student age
from
  students
-- ending comment
/* another ending comment */

Expected Output

select
  name, -- student name
  age -- student age
from
  students
-- ending comment
/* another ending comment */

Actual Output

select
  name,
  -- student name
  age -- student age
from
  students -- ending comment
  /* another ending comment */

Usage

I made these using the default settings on https://sql-formatter-org.github.io/sql-formatter/

I would be interested in getting this fixed, if you could guide me in the right direction. Seems to be similar to #50

@primeapple primeapple added the bug label Aug 1, 2022
@nene
Copy link
Collaborator

nene commented Aug 1, 2022

Currently the formatter works so that it's blind regarding original whitespace. Basically it sees the above code like this:

<SELECT> <IDENT name> <COMMA> <COMMENT "student name"> <IDENT age> <COMMENT "student age">

Or to put it another way. For the formatter these two select statements look exactly the same:

select
  name, -- student name
  age -- student age
from
  students; -- comment

select
  name,
  -- student name
  age
  -- student age
from
  students;
  -- comment

This is currently by design. The formatter preserves no original formatting, completely reindenting the code regardless of how the whitespace was distributed originally. In that regard, it's expected that comments might be positioned differently than they were in the input. Like in your example there's identifier age followed by comment and identifier students followed by comment, but you're expecting them to be formatted differently. That's not really compatible with the current design of the formatter.

Another question is: how should the comments be placed? There's really two main options here: a) keep them at the end of preceding line, b) place each comment on a separate line. At the moment the formatter leans towards the first approach, though not quite purely. It basically just places them at the position where the next ordinary (non-comment) token would go.

Note: the comments placement works much better when organized like so:

select
  -- student name
  name,
  -- student age
  age
from
  -- comment
  students;

In general, I think it might be a better approach to just place all comments on a separate line. In my experience, it's nearly always possible to organize your comments so that each one is on a separate line, but it's not really possible to organize all comments so that they all are at the end of a preceding line (e.g. when there are multiple single-line comments after one another).

@primeapple
Copy link
Contributor Author

Thank you for your answer!

How I see it there are three options for comments, that reference a single line:

  1. making them above the line
  2. making them behind the content of the line
  3. making them below the line

Option 1 would be preferred by you and this is totally fine. I think it is the best option as well.
Option 2 is something that some users do, however they would probably be fine with Option 1 as well.
Option 3 is something I never saw before. This is generally seen as a comment to the following line, so this should be avoided.

So what would the best behaviour be?
In my opinion:

  • If the comment is on a line with no content, just indent it, do NOT move it.
  • If the comment is on the end of the line, put it on the line above. This could be configurable via a setting, so that it could also stay where it is

What should never happen:

  • moving a comment that was on a line without content to a line with content:
select *
from
city
-- final comment

currently gets formatted to

select
  *
from
  city -- final comment

This is only a problem for single line comments, with multiline comments everything is fine:

select 
id 
from city
/* final comment */

gets converted to

select
  id
from
  city
  /* final comment */
  • moving a comment that is on a line with content to the following line, since this changes the context of the comment (Option 3):
select 
id, -- this is an id
name
from
city

currently gets formatted to:

select
  id,
  -- this is an id
  name
from
  city

To answer your questions:

Another question is: how should the comments be placed? There's really two main options here: a) keep them at the end of preceding line, b) place each comment on a separate line. At the moment the formatter leans towards the first approach, though not quite purely. It basically just places them at the position where the next ordinary (non-comment) token would go.

Consistency is the key, they should always be on a seperate line. If the comment was at the end of the preceding line, place them above the preceding line.

Like in your example there's identifier age followed by comment and identifier students followed by comment, but you're expecting them to be formatted differently. That's not really compatible with the current design of the formatter.

Yes, I expect them to be formatted differently, because their position changes the understanding of the code. Consider this sample output of the formatter:

select
  id,
  -- integer ALWAYS below 20
  age
from
  city

What column is the comment now referencing? We'll never know.

@nene
Copy link
Collaborator

nene commented Aug 2, 2022

I pretty much agree with you. I don't remember there being any changes to formatting of comments since this old issue #50. Perhaps it's time to for a small overhaul.

From the top of my head I can see two things we can do to improve the formatting of comments:

  • Make the comments always be on a separate line. The multi-line comments are already treated like so, so it would be just making single-line comments behave the same. This should be pretty simple to implement. (That would fix your first "this should never happen".)
  • Preserving the original placement of a comment. This involves first storing the original preceding whitespace of a comment and then later formatting the comment accordingly. A bit more work, and perhaps some tricky edge-cases pop up. Don't know. But in general I think it shouldn't be too hard to do.

The latter thing also ties in with issue #329, which also asks for ability to preserve existing whitespace.

@primeapple
Copy link
Contributor Author

Preserving the original placement of a comment. This involves first storing the original preceding whitespace of a comment and then later formatting the comment accordingly. A bit more work, and perhaps some tricky edge-cases pop up. Don't know. But in general I think it shouldn't be too hard to do.

So this would leave comments like from city -- a table as they are?
I would be fine with this, maybe you could even add an option to always add them before this line, to make comments more consistent.

@nene
Copy link
Collaborator

nene commented Aug 2, 2022

maybe you could even add an option to always add them before this line, to make comments more consistent.

That would be trickier. Moving a comment after the line it's on is simple: one needs to just insert a newline. But moving a comment back is much more involved... we'd need to backtrack the formatting progress to the start of current line. Plus I'd rather avoid adding additional options if possible.

@nene
Copy link
Collaborator

nene commented Aug 15, 2022

This is now released in 9.2.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants