Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show Tree-sitter grammars in grammar selector #18738

Merged
merged 37 commits into from Jun 21, 2019

Conversation

@Aerijo
Copy link
Member

commented Jan 24, 2019

Requirements for Adding, Changing, or Removing a Feature

  • Fill out the template below. Any pull request that does not include enough information to be reviewed in a timely manner may be closed at the maintainers' discretion.
  • The pull request must contribute a change that has been endorsed by the maintainer team. See details in the template below.
  • The pull request must update the test suite to exercise the updated functionality. For guidance, please see https://flight-manual.atom.io/hacking-atom/sections/writing-specs/.
  • After you create the pull request, all status checks must be pass before a maintainer reviews your contribution. For more details, please see https://github.com/atom/atom/tree/master/CONTRIBUTING.md#pull-requests.

Issue or RFC Endorsed by Atom's Maintainers

Fixes #17029 (I endorse it...)
Closes atom/language-html#226

Description of the Change

Adds an optional argument to atom.grammars.getGrammars to allow Tree-sitter grammars as well. It defaults to false to avoid any backwards compatibility issues, though I'd be fine with making it true or removing it entirely, given that Tree-sitter is now on by default for widely used languages like JS, C, etc.

Tests have not been added yet. Still need to style the grammar selector and fix my Dev mode ideally.

Alternate Designs

Aside from the above, the grammar selector element constructor also needs additional styling to mark the type of grammar. Leaving TextMate grammars as they are is fine and least confusing, but Tree-sitter grammars often share a name with these, so they look like duplicate entries. I was thinking of adding Ts before the scope, but I would ask @simurai or someone to show me how to do it (my attempt failed miserably...).

Possible Drawbacks

Current design - making the argument a direct parameter could be awkward if we decide to add more options in future. Making it an object would future proof it.

If we enable it by default, it may break some community packages that ... handle grammars? That aren't already somewhat broken if they are ignoring Tree-sitter anyway? I'm not convinced this is an issue, so would like some thoughts on it.

Verification Process

Manually overrided it in my init script. Dev mode currently does not open for me, and I have not worked out why.

Release Notes

Show Tree-sitter grammars in grammar selector

@Aerijo

This comment has been minimized.

Copy link
Member Author

commented Jan 25, 2019

Adding the workaround here for reference while this is in progress. Adding the following to your init.js file should allow you to see TS grammars (or you can translate to CoffeeScript for init.coffee)

atom.grammars.getGrammars = () => {
  return [...atom.grammars.textmateRegistry.getGrammars(), ...Object.values(atom.grammars.treeSitterGrammarsById)]
}

It will show apparent duplicates for lots of languages, so you might want to filter out TextMate grammars with the same name.

@simurai

This comment has been minimized.

Copy link
Member

commented Feb 12, 2019

Leaving TextMate grammars as they are is fine and least confusing, but Tree-sitter grammars often share a name with these, so they look like duplicate entries. I was thinking of adding Ts before the scope

Or write Ts out to Tree-sitter?

image

Although I wonder... For users that don't follow Atom development or read the Atom blog, do they know what "Tree-sitter" means? We could also be more descriptive and say that there is a normal and an "advanced" version:

image

@Aerijo

This comment has been minimized.

Copy link
Member Author

commented Feb 12, 2019

@simurai Writing our the full Tree sitter is fine, but I don’t like calling it advanced; that would also be just as confusing IMO. They are already referred to as Tree-sitter in the core setting, and “advanced” is a generic word that could refer to anything, including potential future supported grammar engines.

E.g., if the Sublime grammar engine gets ported successfully (even a community package), there could end up be three types of grammar.

It would be best if there was a reliable way to tell users about Tree-sitter vs TextMate grammars, and why they’d pick one over the other. Unfortunately, they’re unlikely to go to the grammar selector package README for this. Maybe a small description somewhere on the prompt could link to an explanation?

@Aerijo Aerijo closed this Feb 12, 2019

@Aerijo Aerijo reopened this Feb 12, 2019

@Aerijo

This comment was marked as resolved.

Copy link
Member Author

commented Feb 12, 2019

Did not mean to close

@50Wliu

This comment has been minimized.

Copy link
Member

commented Feb 12, 2019

Is it possible for the Grammar Selector to hide grammars that also have a tree-sitter variant?

@Aerijo

This comment has been minimized.

Copy link
Member Author

commented Feb 12, 2019

@50Wliu Yes, there should be a setting for that (normal users don’t need TextMate, but there are still cases where I would want to specify the TextMate one without disabling things)

@simurai

This comment has been minimized.

Copy link
Member

commented Feb 13, 2019

Ohh.. It just occurred to me that JavaScript (Tree-siter) might also be confusing because it looks like it's some sort of "flavor" of JavaScript and not the parsing system used. How about having green badges that just say Tree-sitter in front of the scopes? I'll push that.

screen shot 2019-02-13 at 2 05 13 pm

Maybe a small description somewhere on the prompt could link to an explanation?

That would be helpful. Not sure how to describe it in the very small space left. Without, users that are curious would have to google it and hopefully find a blog post like https://github.blog/2018-10-31-atoms-new-parsing-system/.

@Arcanemagus

This comment has been minimized.

Copy link
Member

commented Feb 13, 2019

That would be helpful. Not sure how to describe it in the very small space left.

Can those Tree-sitter badges have alt-text added on hover?

@simurai

This comment has been minimized.

Copy link
Member

commented Feb 13, 2019

Can those Tree-sitter badges have alt-text added on hover?

Yes, here with a title attribute:

title

Does the following description make sense?

A fast parsing system with improved syntax highlighting, code folding, syntax-aware selection.
@Aerijo

This comment has been minimized.

Copy link
Member Author

commented Feb 13, 2019

My suggestion:

(Recommended) A faster parser with improved syntax highlighting & code navigation support.

Shorter and, in case the user can't tell from the listed improvements, explicitly says it's recommended.

simurai and others added some commits Feb 13, 2019

@Aerijo

This comment has been minimized.

Copy link
Member Author

commented Feb 17, 2019

OK, couple of outstanding issues I think can be fixed in this PR:

  1. Order of TS and TM grammars is random when both have same name. E.g., C has TS first, but C++ has TM first.
  2. Choosing TM grammar does not actually choose the TM grammar. done
let grammars = this.textmateRegistry.getGrammars()
if (includeTreesitter) grammars = grammars.concat(Object.values(this.treeSitterGrammarsById))
return grammars
getGrammars (params = {includeTreeSitter: false}) {

This comment has been minimized.

Copy link
@Aerijo

Aerijo Feb 17, 2019

Author Member

Turned into options object so we can add / change settings later without breaking changes

Maybe add TM-TS duplicate removal here? It would be slightly easier, and allow it to be used by all packages instead of making them do it themselves

@Aerijo

This comment has been minimized.

Copy link
Member Author

commented Jun 15, 2019

I've been considering returning Tree-sitter grammars by default, or explicit opt in. I've gone with by default, because they are supposed to be the preferred grammar now. There were a couple of hitches with other parts of GrammarRegistry that expose grammars, but I think everything works (the specs pass).

The failing specs seem completely unrelated; the mac build failed to load, throwing

[command]/bin/bash --noprofile --norc /Users/vsts/agent/2.153.1/work/_temp/d77d45f0-4dd1-4d72-a48e-8ec869406f78.sh
internal/modules/cjs/loader.js:596
    throw err;
    ^

Error: Cannot find module 'colors'
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:594:15)
    at Function.Module._load (internal/modules/cjs/loader.js:520:25)
    at Module.require (internal/modules/cjs/loader.js:650:17)
    at require (internal/modules/cjs/helpers.js:20:18)
    at Object.<anonymous> (/Users/vsts/agent/2.153.1/work/1/s/script/test:5:1)
    at Module._compile (internal/modules/cjs/loader.js:702:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:713:10)
    at Module.load (internal/modules/cjs/loader.js:612:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:551:12)
    at Function.Module._load (internal/modules/cjs/loader.js:543:3)

I think this is ready to be reviewed and tried in beta.

@Aerijo Aerijo referenced this pull request Jun 15, 2019

Closed

MacOS CI broken #19543

@nathansobo nathansobo self-assigned this Jun 19, 2019

@nathansobo
Copy link
Contributor

left a comment

I really love the UX you came up with that explains what a Tree Sitter grammar is to the user. Nice work. From a code perspective, this all seems legit. I'm concerned about the API implications however and have a comment with more details about it inline.

return grammar !== atom.grammars.nullGrammar && grammar.name;
});

if (atom.config.get('grammar-selector.hideDuplicateTextMateGrammars')) {

This comment has been minimized.

Copy link
@nathansobo

nathansobo Jun 19, 2019

Contributor

In this case, do you really need the second call to atom.grammars.getGrammars({ textMateOnly: true }? It seems like you could simply iterate through the grammars array twice, once to populate the blacklist of Tree Sitter grammars and then a second time to filter. It's not a huge deal though.

This comment has been minimized.

Copy link
@Aerijo

Aerijo Jun 20, 2019

Author Member

Yeah, cleaned it up

Show resolved Hide resolved src/grammar-registry.js

Aerijo added some commits Jun 20, 2019

@Aerijo Aerijo closed this Jun 20, 2019

@Aerijo Aerijo reopened this Jun 20, 2019

@nathansobo

This comment has been minimized.

Copy link
Contributor

commented Jun 20, 2019

The CI failure is really odd.

@nathansobo

This comment has been minimized.

Copy link
Contributor

commented Jun 20, 2019

Okay, I merged master into this branch after merging #19580, so hopefully we get better behavior out of the tests.

@nathansobo

This comment has been minimized.

Copy link
Contributor

commented Jun 20, 2019

Before we merge, I'm going to spend some time seeing if we can actually return tokens from the methods on grammar. The current shims won't break the current API, but I also think they don't really perform as we advertise in the documentation, which could be surprising. Let's see what I can come up with.

@nathansobo

This comment has been minimized.

Copy link
Contributor

commented Jun 20, 2019

Sorry. After digging into this I really don't think we can sanely shim tokenizeLine. It's intended to carry the state of the first-mate parser across calls, and my scan of all the package code out there shows that people are indeed using it. I think we should avoid returning objects that violate the documented interface of Grammar without an explicit option. And maybe we leave that option undocumented for now. I can take care of making the relevant changes.

Aerijo added some commits Jun 21, 2019

@Aerijo

This comment has been minimized.

Copy link
Member Author

commented Jun 21, 2019

OK, it's now set back up to require opt in for Tree-sitter. The change took a few tries, because there are a couple of places grammars are exposed, and it wasn't consistent to begin with (see forEachGrammar, which exposed TS grammars for the auto assignment methods).

I left the shims in, because they are better than nothing. They are not meant to be functional though.

@nathansobo

This comment has been minimized.

Copy link
Contributor

commented Jun 21, 2019

Thanks so much! It will be really nice to be able to select these grammars via the UI. Really, the functionality was incomplete without this.

@nathansobo nathansobo merged commit 9fdcc95 into atom:master Jun 21, 2019

1 check passed

Atom Pull Requests #20190621.5 succeeded
Details
@nathansobo

This comment has been minimized.

Copy link
Contributor

commented Jun 21, 2019

A nice next step would be to start thinking about how we can expose the unique capabilities of TreeSitter grammars and language modes in our official API.

@maxbrunsfeld

This comment has been minimized.

Copy link
Contributor

commented Jun 21, 2019

Nice work @Aerijo and @simurai. Looks great.

Alhadis added a commit to Alhadis/.atom that referenced this pull request Aug 14, 2019

@JulianHQ JulianHQ referenced this pull request Aug 14, 2019

Open

Syntax highlighting wrong for '->', '=' and '::' #366

1 of 1 task complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.