Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: editing subtree that depends-on-column: ... #3218

Closed
wants to merge 7 commits into from

Conversation

rooney
Copy link
Contributor

@rooney rooney commented Mar 26, 2024

... invalidation should be based on the edit's end-column.

Fix #3217

…ues: before and after parse-tree structures should be the same

The edit in the test is just changing an identifier from "c" to "c1234".
Such edit should not affect the resulting parse tree's structure (the parse tree's S-expression should stay the same).
…update strings-read assertion

After the fix on ts_subtree_edit(), fewer characters are read
…rder now reflects the edit correctly (just the identifier being renamed)
@maxbrunsfeld
Copy link
Contributor

maxbrunsfeld commented Apr 3, 2024

Hi, I don't know what behavior this new code would implement, but the existing behavior is what we want, I believe.

The existing behavior is that if the external scanner's get_column method is called when scanning a token, then any edit that occurs before that token on a line (thus shifting the token's start column) will invalidate the token. This is, of course, in addition to the normal invalidation behavior, where tokens are invalidated if an edit intersects the range of text that was read when scanning them.

@maxbrunsfeld
Copy link
Contributor

I'm going to close this out because I believe the existing test is correct, and your change to the test would break the desired behavior. If I'm missing some more subtle reason why this invalidation logic should be changed, then feel free to explain further, and we can reopen.

@rooney
Copy link
Contributor Author

rooney commented Apr 3, 2024

Hi @maxbrunsfeld

Alright, I have reworked my PR so it's much simpler now -- only adding two checks and nothing else (no test change, and all the tests still pass).

master...rooney:tree-sitter:col-dependent-fix
(I assume it's pretty clear since it's only a two-liner, but please let me know if I should explain)

The change is needed because there are some cases where a token-that-depends-on-column should be invalidated but isn't. Let me demo it using a simplest-as-possible example:

// grammar.js

  rules: {
    source_file: ($) => seq(/[ \t\r\n]+/, choice($.odd, $.even), $.x),
    x: () => "x",
  },
  externals: ($) => [$.odd, $.even],

Here, our grammar will: 1. consume leading whitespace, 2. consult the external scanner to see if the column is at odd/even position, and 3. match "x".

(the scanner is stateless):

// scanner.c

enum TokenType { ODD, EVEN };

bool tree_sitter_coldemo_external_scanner_scan(void *payload, TSLexer *lexer,
                                               const bool *valid_symbols) {
  lexer->result_symbol = lexer->get_column(lexer) % 2 ? ODD : EVEN;
  return true;
}

Now, here's how the playground goes on master:
MOV to GIF
See how it stays "odd" even though the positions are correctly updating.

And here's how the playground goes after the two-line patch:
Mov to Gif Conversion
See how it correctly updates the odd/even now.

tree-sitter-coldemo.zip (contains only grammar.js and src/scanner.c)

To be extra sure, I also have checked this new patch against the complete tree-sitter-haskell -- all tests there still pass running this patch.

@maxbrunsfeld
Copy link
Contributor

Thanks for the response, and nice work. As part of your branch, can you add your new example scanner as a test to the Tree-sitter test suite?

@maxbrunsfeld
Copy link
Contributor

Reopening.

@rooney
Copy link
Contributor Author

rooney commented Apr 4, 2024

Reopening.

Thanks @maxbrunsfeld

  • OK test added, made sure it doesn't pass without the patch.

Also, added an optimization for when the edits don't actually move the columns. (The test for this optimization checks that there's fewer bytes being read by the recorder).
master...rooney:tree-sitter:col-dependent-fix

Also have re-run tree-sitter-haskell test with the optimization, all good there.

@amaanq
Copy link
Member

amaanq commented Apr 4, 2024

I don't think it can't be reopened, can you open another PR @rooney?

@maxbrunsfeld
Copy link
Contributor

Oh sorry, thought that I reopened it. The GitHub UI mislead me a bit there.

@rooney
Copy link
Contributor Author

rooney commented Apr 4, 2024

sure, np, created another PR #3257

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parsing-After-Editing Test Case is incorrect (and passing)
3 participants