Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(matlab)!: replace parser #4944

Merged
merged 1 commit into from Jun 19, 2023
Merged

Conversation

acristoffers
Copy link
Contributor

I wrote a MATLAB parser which has more features than the current one:

  • It detects commands. The code
variable P(1, 2, 3) semidefinte

returns a function_call token with 2 argument tokens, which is how matlab sees it. I use a scanner to correctly detect the line, so it won't mix it with a function call followed by a identifier.

  • It detects string escapes and formatting. This returns a string with two tokens inside, differentiating escape sequences from formatting options. Unfortunately there is an issue with tree-sitter that prevents me from doing it with single-quoted string at the moment, but it works fine with double-quoted ones.
"Format: \t %2.3f"
  • It detects line continuations, lambdas, parfor statement, arguments statements (a kind of MATLAB block), two more operators, superclass function call, and when it comes to classes, it correctly detects and generates specific tokens for everything (MATLAB has some weird constructs, like the properties block).

The parser also generates more tokens, which allows to better handle not only syntax highlighting, but specially textobjects. For instance, matrices and cells are composed of rows with values, which allows the creation of textobjects to operate on those, which is something I was missing a lot. At this point I believe I have implemented everything in MATLAB's grammar.

@clason
Copy link
Contributor

clason commented Jun 12, 2023

  • Commit message should be feat(matlab)!: replace parser (breaking change).

  • tags.scm are not used in this project, so please drop that file.

  • Consider adding an injections.scm (at least for comments, but ideally for any language -- pure C? -- where it makes sense).

  • Consider adding @spell captures where appropriate.

The full list of allowed captures (which is different from upstream/Helix) is here: https://github.com/nvim-treesitter/nvim-treesitter/blob/master/CONTRIBUTING.md#highlights.

@acristoffers acristoffers force-pushed the master branch 2 times, most recently from 45dcc90 to d71654a Compare June 12, 2023 09:06
@acristoffers
Copy link
Contributor Author

Commit message should be feat(matlab)!: replace parser (breaking change).

Fixed.

tags.scm are not used in this project, so please drop that file.

Removed.

Consider adding an injections.scm (at least for comments, but ideally for any language -- pure C? -- where it makes sense).

Was already there, only comments. MATLAB does not allow injections.

Consider adding @spell captures where appropriate.

Was already there, at the end of the file. For strings and comments only, as most other projects do.

The full list of allowed captures (which is different from upstream/Helix) is here: https://github.com/nvim-treesitter/nvim-treesitter/blob/master/CONTRIBUTING.md#highlights.

I fixed the function call one, but it seems to be the only trully different from what I had already.

@clason
Copy link
Contributor

clason commented Jun 12, 2023

We also have indents; you could try adding at least initial support for them? (Not required.)

@clason
Copy link
Contributor

clason commented Jun 12, 2023

You also need to change the commit hash in the lockfile.json.

@clason clason added the breaking change changes that users need to adapt to label Jun 12, 2023
@clason clason changed the title Replaces MATLAB parser. feat(matlab)!: replace parser Jun 12, 2023
@clason clason requested a review from lucario387 June 12, 2023 12:10
@acristoffers
Copy link
Contributor Author

We also have indents; you could try adding at least initial support for them? (Not required.)

I forgot about it, thank you for the reminder. It's in the repo now. And I also changed the commit in the lockfile.

@acristoffers
Copy link
Contributor Author

Just did two small fixes to use the @boolean and @error captures.

@clason
Copy link
Contributor

clason commented Jun 12, 2023

Note that the locals captures also differ from upstream: https://github.com/nvim-treesitter/nvim-treesitter/blob/master/CONTRIBUTING.md#locals

Copy link
Member

@lucario387 lucario387 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indents.scm looks fine to me now 👍

@clason
Copy link
Contributor

clason commented Jun 12, 2023

@structure query is illegal in nvim-treesitter (for the moment).

@acristoffers
Copy link
Contributor Author

@structure query is illegal in nvim-treesitter (for the moment).

Just saw it. Already changed it to be just @keyword.

@acristoffers
Copy link
Contributor Author

I had a calloc in the parser which was unnecessary, so I changed the code to just use the stack instead.

@acristoffers
Copy link
Contributor Author

When I first wrote this I was thinking about taking screenshots of the existing grammar and mine to show the difference, and then I thought it would be too much trouble and just wrote that text. Now I just realized that Helix uses the existing one too, so I only had to open helix and neovim to compare both. Night and day.

parser

@acristoffers
Copy link
Contributor Author

I added textobjects to some classes constructs, which required a small grammar adaptation.

@acristoffers
Copy link
Contributor Author

The change in order revealed that I had a named node and some aliases with the same name, but totally different contexts. Fixed that.

@acristoffers
Copy link
Contributor Author

MATLAB accepts cell-indexing as a function identifier, a{1}(1). Fixed that.

@acristoffers acristoffers force-pushed the master branch 2 times, most recently from 208c8b3 to 787c183 Compare June 13, 2023 20:09
@acristoffers
Copy link
Contributor Author

Fixed the matrix parsing, it now follows MATLAB way of doing it a bit closer.

@amaanq
Copy link
Member

amaanq commented Jun 14, 2023

There are some design choices and many issues with the grammar and queries that I do not agree with/are incorrect and I think are better fixed upstream before adding here. Let me know if you'd be welcome to a PR on my end, that'd be easier than writing a bunch of comments here

@acristoffers
Copy link
Contributor Author

There are some design choices and many issues with the grammar and queries that I do not agree with/are incorrect and I think are better fixed upstream before adding here. Let me know if you'd be welcome to a PR on my end, that'd be easier than writing a bunch of comments here

Sure, go ahead.

@amaanq
Copy link
Member

amaanq commented Jun 15, 2023

auto-merge was automatically disabled June 19, 2023 06:16

Head branch was pushed to by a user without write access

@acristoffers
Copy link
Contributor Author

I added a file that should not be in the commit. The last force-push only removes that file.

@amaanq
Copy link
Member

amaanq commented Jun 19, 2023

Is the pr ready to be merged?

@acristoffers
Copy link
Contributor Author

Yes.

@acristoffers
Copy link
Contributor Author

@clason clason disabled auto-merge June 19, 2023 07:43
@clason clason merged commit df3f47a into nvim-treesitter:master Jun 19, 2023
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change changes that users need to adapt to
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants