Use new TextBuffer LanguageMode API #16087
Signed-off-by: Nathan Sobo <nathan@github.com>
|
😓 Finally ✅ . |
|
/cc @atom/maintainers - Curious if anyone has thoughts or opinions on the API changes described above ( |
@Alhadis @lierdakil this change would mean that any TextMate grammars that are assigned programmatically are required to have a |
|
I'm fine with having a I'm more concerned with the usage of a human-readable identifier instead of a |
It's definitely a fairly major change. My problem with the current paradigm is that the concept of scope names is kind of TextMate-specific. With most parsing systems, syntax tree nodes aren't named in this hierarchical scheme like Practically speaking, I think grammars' names should be just as unique as their I agree that some of the names like I think one solution is to fix some of the |
|
My approach would be to introduce a new field simply called Non-alphanumeric characters are replaced with dashes, so that Thoughts? :) |
|
The word
id: "ruby"
name: "Ruby"
scopeName: "source.ruby"
Can you elaborate on what concrete problem you're seeing with using the grammar's name to identify it? |
|
The |
|
If a package author changes the |
|
I also wouldn't filter out grammars simply because they contain an |
|
We could go with same process as the atom config. i.e. the name is "fortranFixedForm" and if you want to later change how it's automatically reformatted for display then you use the optional title field to set it to "Fortran (Fixed Form)" |
|
Possibly, but that still wouldn't address something like, say, |
|
Ideally, it'd be great if identifiers matched whatever followed I'm all for having a SSOT, and I'm all for getting rid of the weird, nonsensical scope-name conventions used by TextMate (which have never made sense to me; the
Personally, I feel the |
v1v2What am I missing? |
I see your point. I guess @damieng I think the issue is that the |
Even that's inconsistent, though. The same field-name is used for describing the scopes of successful matches. E.g., the It's all a horrid mess, in all honesty... |
|
@Alhadis @damieng @50Wliu I've updated the PR to remove the use of language names in Tree-sitter grammars will have their own ids that will look a bit different: Thanks to @as-cii for helping come up with this safer path forward. Does this seem reasonable to everyone? I think it is less risky and disruptive than what I originally proposed. |
We will still wait to tokenize a buffer until it is visible in *some* tab, but we no longer enable and disable tokenization as different editors become visible.
|
LGTM. |
Summary: setGrammar is being deprecated soon, see atom/atom#16087 Based on the changes recommended in the pull request changing the used API for setting language modes. Gating the change to just atom 1.24 and above. Reviewed By: tjfryan Differential Revision: D6776948 fbshipit-source-id: 29553915de77f9fa876b439bdb986cb3332139d5
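A minimal sketch of how a package might bridge the two APIs while setGrammar still works. Note that it uses feature detection rather than the version check the commit above describes, so it is a swapped-in technique and not that commit's actual code. The helper name setEditorLanguage and the atomEnv parameter (standing in for the atom global) are hypothetical.

```javascript
// Hypothetical helper: prefer the buffer-level API from atom/atom#16087 and
// fall back to the older editor-level API when it is not available.
// `atomEnv` stands in for the `atom` global so the sketch can run outside Atom.
function setEditorLanguage(atomEnv, editor, scopeName) {
  const grammars = atomEnv.grammars;
  if (typeof grammars.assignLanguageMode === 'function') {
    // Newer Atom: the language mode belongs to the buffer, so every editor
    // showing this buffer picks up the change.
    grammars.assignLanguageMode(editor.getBuffer(), scopeName);
    return 'assignLanguageMode';
  }
  // Older Atom: look the grammar up by scope name and assign it to the editor.
  const grammar = grammars.grammarForScopeName(scopeName);
  if (grammar) {
    editor.setGrammar(grammar);
    return 'setGrammar';
  }
  // No grammar registered for this scope name; leave the editor untouched.
  return 'none';
}
```

Inside Atom this might be called as setEditorLanguage(atom, editor, 'source.js'). The return value only reports which path was taken and is there to make the sketch easy to check.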
Depends on atom/text-buffer#260
This is a continuation of work started in #15713, with the overall goal of allowing Atom to use parsing systems other than TextMate grammars.
One Language Per Buffer
The main user-facing change introduced in this PR is that language modes (previously called TokenizedBuffers) now belong to a TextBuffer, not an individual TextEditor. This means that if you have a file open in 3 different tabs, the file will only be parsed once instead of 3 times for the purpose of syntax highlighting. It also means that when you select a language for the file using the grammar selector, your choice will apply in all 3 places.
TextEditor API Changes
Previously, the way to configure an editor's language was to assign to the editor a TextMate Grammar, and the editor would internally construct a TokenizedBuffer which would use the first-mate APIs to parse the buffer.
Now, the TextEditor class is no longer responsible for constructing its parser directly; it relies on the buffer having a language mode which provides information about syntax highlighting, folding, commenting, and suggested indentation.
This means that the .getGrammar and .setGrammar methods don't really make sense any more, and neither does the grammar parameter to the TextEditor constructor. These APIs will continue to work for backwards-compatibility, but will be deprecated at some point.
Other API Changes
Other language-related APIs like TextEditorRegistry.setGrammarOverride were designed around the TextMate concept of scope names - strings like source.js that are intended to fit into TextMate's scope selector system. Though scope selectors will continue to be a concept in Atom, we will no longer assume that these selectors use the TextMate-style naming conventions. Where possible, I'd like to start referring to these strings as language ids, not scope names.
Since TextEditors no longer construct their own language modes, I've moved this functionality into the existing GrammarRegistry object exposed as atom.grammars. It is implemented in the following methods:
atom.grammars.assignLanguageMode(buffer, languageId) - Assign the buffer a language mode that uses the language with the given id. For TextMate grammars, we use the scopeName (e.g. source.js) as an ID. Tree-sitter grammars will have different-looking ids (probably just javascript, go, etc).
atom.grammars.autoAssignLanguageMode(buffer) - Assign the buffer a language mode that uses the most appropriate language, based on the buffer's file extension and content.
atom.grammars.maintainLanguageMode(buffer) - Assign the buffer a language mode and continue to update the language mode as additional language packages are loaded, or the buffer's file path changes.
The TextEditorRegistry is no longer responsible for anything to do with grammars. The grammar-related methods will continue to work but they will be deprecated sometime soon.
Remaining Tasks
Rename TokenizedBuffer to TextMateLanguageMode (I'm going to do this in a separate PR to keep this PR's diff more readable).
Next Steps
It should now be possible to integrate Tree-sitter parsers into Atom just by changing the GrammarRegistry and the PackageManager. The core editor no longer assumes anything about how its parser works. Once this is merged, I will begin doing this.
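As a rough illustration of the three atom.grammars methods listed under Other API Changes, here is a minimal sketch of how calling code might pick between them. The function name chooseLanguageMode and the options object are hypothetical, and the registry is passed in as a parameter (assumed to be atom.grammars) so the snippet can be run outside Atom. It relies only on the method names and arguments given in the description, not on any return values.

```javascript
// Sketch of the three registry entry points described above. `grammars` is
// expected to be atom.grammars; it is a parameter so this runs outside Atom.
// The function name and the `options` shape are hypothetical.
function chooseLanguageMode(grammars, buffer, options = {}) {
  if (options.languageId) {
    // Explicit choice, e.g. 'source.js' for a TextMate grammar.
    grammars.assignLanguageMode(buffer, options.languageId);
    return 'explicit';
  }
  if (options.keepUpdated) {
    // Keep re-evaluating as language packages load or the file path changes.
    grammars.maintainLanguageMode(buffer);
    return 'maintained';
  }
  // One-off guess based on the buffer's file extension and content.
  grammars.autoAssignLanguageMode(buffer);
  return 'auto';
}
```

Inside Atom this might be called as chooseLanguageMode(atom.grammars, editor.getBuffer(), {languageId: 'source.js'}). Because the language mode lives on the buffer, every editor showing that buffer would be affected by the call.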