Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--cursor-offset output is wrong sometimes #1981

Closed
josephfrazier opened this issue Jun 5, 2017 · 7 comments · Fixed by #4397
Closed

--cursor-offset output is wrong sometimes #1981

josephfrazier opened this issue Jun 5, 2017 · 7 comments · Fixed by #4397
Labels
locked-due-to-inactivity Please open a new issue and fill out the template instead of commenting. type:bug Issues identifying ugly output, or a defect in the program

Comments

@josephfrazier
Copy link
Collaborator

I just wanted to note that since the cursorOffset option (introduced in #1637) works by tracking the cursor position relative to an AST node (rather than a CST token), it can produce incorrect results. For example:

$ echo "return         15" | prettier --cursor-offset 15 # the cursor is positioned immediately before `15`
return 15;
15

(related to prettier/prettier-vscode#12 (comment))

josephfrazier added a commit to josephfrazier/prettier that referenced this issue Jun 5, 2017
Since the `cursorOffset` option (introduced in prettier#1637) works by tracking
the cursor position relative to an AST node (rather than a CST token),
it can produce incorrect results.

See prettier#1981
josephfrazier added a commit to josephfrazier/prettier that referenced this issue Jun 5, 2017
This partially addresses
prettier#1981 by ensuring that the
cursor cannot jump outside of its node upon formatting. It could be
improved upon by keeping track of the offset relative to the end of the
node as well.
josephfrazier added a commit to josephfrazier/prettier that referenced this issue Jun 5, 2017
@josephfrazier
Copy link
Collaborator Author

Hmm, after working on #1983, I see that the issue with this particular test case is that findNodeAtOffset only returns the closest SourceElement (see here). However, for the purposes of cursor tracking, we want the closest node possible, while the range-formatting is what needs the closest SourceElement. I'll open another PR with the relevant changes.

@vjeux
Copy link
Contributor

vjeux commented Jun 5, 2017

Ohh, so the issue is that it's based on Statements and not every node. Interesting!

josephfrazier added a commit to josephfrazier/prettier that referenced this issue Jun 5, 2017
josephfrazier added a commit to josephfrazier/prettier that referenced this issue Jun 5, 2017
Since the `cursorOffset` option (introduced in prettier#1637) works by tracking
the cursor position relative to an AST node (rather than a CST token),
it can produce incorrect results.

See prettier#1981
vjeux pushed a commit that referenced this issue Jun 6, 2017
* Add test demonstrating SourceElement-relative cursor translation

See #1981 (comment)

* Translate cursor relative to nearest node, not SourceElement

This partially address #1981

See #1981 (comment)

* Add test demonstrating incorrect cursor translation

Since the `cursorOffset` option (introduced in #1637) works by tracking
the cursor position relative to an AST node (rather than a CST token),
it can produce incorrect results.

See #1981
@vjeux
Copy link
Contributor

vjeux commented Jun 9, 2017

Is this fixed now?

@josephfrazier
Copy link
Collaborator Author

Not quite, there's still the bug mentioned here: #1989 (comment)

@yukulele
Copy link

yukulele commented Aug 29, 2017

other bug example:

input: while (true) { |}
output : while (true) {}|
expected: while (true) {|}

(where | reprensent cursor position )

@lydell lydell added the type:bug Issues identifying ugly output, or a defect in the program label Sep 5, 2017
ds300 pushed a commit to ds300/prettier that referenced this issue Apr 20, 2018
@ds300
Copy link
Contributor

ds300 commented Apr 20, 2018

I've been looking into fixing cursorOffset tracking recently, and I intend to submit a PR that will make it at least a lot better than it is currently, if not perfect. Will document what I've learned so far here now.

prettier currently decides where the cursor should end up using the following method:

  1. Find cursorNode, i.e. the smallest AST node which encompasses the cursor. It might not be a 'leaf' node.

    e.g.

    return      15
    

    gets broken down into something like this (i'm heavily paraphrasing from memory):

    ReturnExpression {
      loc: { start: 0, end: 14 }
      children: [
        NumberExpression {
          value: '15',
          loc: { start: 12, end: 14 }
        }
      ]
    }
    

    If the cursor was at position 12, (return |15), the selected node would be the NumberExpression, since the cursor lies in the range specified by its source location entry. If the cursor was at position 11, (return | 15) the selected node would be the ReturnExpression.

  2. Calculate the relative offset of the cursor from the start of cursorNode. So at position 12 in the previous example, the cursors relative offset would be 0 since it is at the beginning of the NumberExpression node. And if the cursor was at position 11, its relative offset would be 11 since it is 11 characters away from the start of the ReturnExpression node.

  3. While transforming the AST in order to produce prettier's internal 'doc' representation, check whether the current AST node is the cursorNode, and if so insert a 'cursor placeholder' in the output doc before the node.

  4. While printing the doc to a string (the part where prettier decides how to actually format the code), insert the placeholder into the string. (it's actually still arrays at this point, but that's just an implementation detail)

  5. Before returning the formatted string, find the index of the cursor placeholder and then slice it out of the string. This index is supposed to then be the starting offset of the cursorNode in the final output string.

  6. Add that index to the relative offset calculated during step 2 and return that as the output cursorOffset along with the formatted code.

This approach has two problems right now

  1. If cursorNode has extra whitespace, or other characters that get deleted, and the cursor is positioned after those characters, its output position is going to be wrong, since that deletion is not accounted for at any point. Haven't found a good fix for this yet, but a decent first attempt involved using the nearest leaf node as cursorNode in certain situations. Hope to find something better.

  2. Step 3 is buggy. It doesn't handle preceding comments properly. e.g.

    // hello
    func|tion () {
    ...
    

    becomes

    // h|ello
    function () {
    ...
    

    I've got a fix for that, and could submit it as a separate PR if there's immediate interest in having that fixed. I get the impression there isn't though, since nobody has opened an issue about it :) – even this Issue hasn't seen any activity for over 6 months, so probably cursorOffset is not being used much in the wild at the moment.

@ds300
Copy link
Contributor

ds300 commented Apr 21, 2018

Again, publicly keeping track of my own progress on this, in case I get distracted and it helps someone else one day:

A potential approach to solving the first issue mentioned above is as follows:

  1. instead of only inserting one 'cursor placeholder' into the doc, insert one either side of the cursorNode so that we can get a chunk of text representing the cursor node during the printing stage
  2. During the printing stage, extract out the original cursorNode text and compare it against the new version. If they are identical, then stick with the previous approach.
  3. Otherwise, split both old and new text by the empty string, creating two arrays of individual chars
  4. Insert a Symbol into the old array at the cursor offset.
  5. diff the arrays using the diff npm module
  6. Walk through the diff and make a note of the final text offset at the point where the cursor symbol is removed
  7. use that + the offset of the cursorNode in the output text to determine the final cursorOffset.

I've tried this out. It seems to work very nicely. Need to do more extensive testing though.

@lock lock bot added the locked-due-to-inactivity Please open a new issue and fill out the template instead of commenting. label Jul 30, 2018
@lock lock bot locked as resolved and limited conversation to collaborators Jul 30, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
locked-due-to-inactivity Please open a new issue and fill out the template instead of commenting. type:bug Issues identifying ugly output, or a defect in the program
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants