Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: ESLint Language Plugins #99

Merged
merged 28 commits into from Mar 24, 2023
Merged

feat: ESLint Language Plugins #99

merged 28 commits into from Mar 24, 2023

Conversation

nzakas
Copy link
Member

@nzakas nzakas commented Dec 6, 2022

Summary

This RFC specifies a plugin format that would allow ESLint plugins to fully define their own languages, effectively expanding ESLint from a JavaScript-focused linter into a more general-purpose linter.

I would love some feedback from:

  • @B2o5T - given your experience with graphql-eslint, I'm hoping you have some feedback on if this would make your life easier.
  • @jeddy3, @ybiquitous - would this make a compelling case for reformulating stylelint as an ESLint plugin?
  • @DavidAnson - would this make a compelling case for reformulating markdownlint as an ESLint plugin?
  • @bradzacher, @JoshuaKGoldberg - would this make life easier for TypeScript-ESLint?

The goal here is to take the boring parts of a linter (file finding, configuration, etc.) and separate that out from the JS-specific parts so no one needs to rebuild the boring parts over and over again.

Related Issues

@nzakas nzakas added Initial Commenting This RFC is in the initial feedback stage and removed triage labels Dec 6, 2022
@DavidAnson
Copy link

Thank you, @nzakas, it’s an honor to be called out and this is a very interesting possibility to think about!

Having looked over the proposal on my phone during lunch, here are some initial thoughts about how it might apply to markdownlint:

  • Some rules take advantage of the markdown-it parser, and I think that fits in with your proposal, though the parse tree that produces is very flat (just blocks and nested inlines). However, a significant number of rules alert on things that are syntactically “invalid” and therefore don’t get parsed into appropriate structures. This is a big challenge for me and leads to a lot of direct analysis of the input text that completely ignores the parser. I don’t love this at all, but I also don’t see a better way much of the time. I don’t know if this is a unique scenario or would fit within what you envision.
  • The markdown-it parser does not emit column or granular line information for its tokens and this causes a significant amount of trouble for me. I have looked at augmenting the parser a couple of times, but not done so because it seems like it could be a large investment. Again, I’m not sure how well that plays with the architecture you propose as consumers of the Markdown language will probably expect detailed information?
  • There is precedent for inline comments to ignore lines/sections, but I think that would convert fairly seamlessly.
  • This would be a major breaking change for markdownlint users. Maybe they’ll be happy to do it!
  • The “boring parts” you are trying to save me from are indeed boring and time-consuming, but for the most part, that’s all sunk cost and markdownlint-cli2 seems to generally support the kind of scenarios people want. There is definite benefit to the ecosystem from unification, but this part of your value proposition is not quite as meaningful for me. :)

@ota-meshi
Copy link
Member

ota-meshi commented Dec 7, 2022

I'm the maintainer of eslint-plugin-vue, and a member and user of Stylelint.

If the integration works well, that would be great! It might also be possible to unify the editor integration features maintained by each linter.

However I have a question. Any idea how vue's language options would be configured if Stylelint was integrated via ESLint core?
Currently, to lint *.vue with ESLint, parse with vue-eslint-parser and ESLint rules process the AST of HTML and <script> tags.
To lint *.vue with Stylelint, parse with postcss-html and Stylelint rules process the AST of the <style> tags and style attributes.
This means we need to divert multiple parsers and different language rules.
I'd love to know if you have any thoughts on how this might change.

@RedGuy12
Copy link

RedGuy12 commented Dec 7, 2022

This means we need to divert multiple parsers and different language rules.

This could also allow us to use @html-eslint and eslint-plugin-html together.

Copy link

@bradzacher bradzacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this make life easier for TypeScript ESLint?

No, I don't think it would make much of a difference for us.

At a glance it looks like it would actually make things harder for us as we would need to duplicate ESLint's code around SourceFile creation.

As I mentioned in the rewrite discussion - we purposely want our parser to produce an ESTree-compatible AST and we want TypeScript to be treated like JavaScript as much as possible by ESLint so that we can leverage the rule ecosystem (including the core ESLint rules) without anyone needing to make any changes anywhere.


* [GraphQL-ESLint](https://github.com/B2o5T/graphql-eslint)
* [eslint-plugin-json](https://www.npmjs.com/package/eslint-plugin-json)
* [typescript-eslint](https://typescript-eslint.io)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't think that @typescript-eslint hacks anything substantial.
On pure JS code our parser will produce the exact same AST as espree (well it would have some additional, empty properties, but otherwise the same AST).

In parsing TS we still produce an AST that is compatible with the ESTree spec, with some new nodes and properties to support TS-specific language features.
For the most part the fact that the user is using TS instead of JS should be invisible to plugin authors, barring some edge-cases.

I would call some of the extension rules we have as hacky - though most of that hacking is because we want to avoid fully forking a rule to save us having to take on the maintenance burden of merging in upstream changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If memory serves, you also needed to create your own scope manager?

FWIW, parseForESLint() will still work, though I think it itself is pretty hacky and would love to move away from it in the future.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If memory serves, you also needed to create your own scope manager?

We did need to, yes, but that's because certain decisions were made early on in the AST definition that caused conflicts with ESLint's rules.

For example - types use Identifier nodes for their names. Seems like a good idea in theory, but in practice this is an issue because the scope manager is built fairly naively and just sees most Identifiers as possible variable references - causing false positives for no-unused-vars, no-undef, etc.

We tried augmenting the lint rule with an extra rule (like eslint-plugin-react/jsx-uses-vars does) to mark identifiers as used to hide them from no-unused-vars, but that was hacky and inaccurate.
A new scope manager was the best option to ensure the ecosystem would work as expected.

FWIW the scope manager is essentially just a fork of eslint-scope - adding additional code paths so that the scope analysis can understand type declarations and such (I ended up rewriting most of it from the ground up to modernise the code, but at its core it is a fork as I used eslint-scope as a reference).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fun fact - I also did a similar line of work for Flow. hermes-eslint is a package not unlike typescript-eslint - it enables full support for Flow code in ESLint rules (the hermes parser can parse Flow code and it includes visitor keys, as well as a scope manager that understands flow types)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, super interesting, I wasn't aware of hermes-eslint.

* [eslint-plugin-json](https://www.npmjs.com/package/eslint-plugin-json)
* [typescript-eslint](https://typescript-eslint.io)

Instead of forcing others to rewrite what is core linting functionality, or asking people to produce an ESTree format AST for some language other than JavaScript, we can make everyone's lives easier by providing formal language support such that languages can be plugged in to ESLint easily.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for TS/Flow - IMO producing an ESTree AST is paramount as it allows them to leverage the existing rule ecosystem.

for other things (css, markdown, json, graphql, etc) definitely agree that producing some ESTree-like AST is bad and difficult.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been working on JSON(C|5)?, YAML, TOML, Astro, and Svelte, (and Vue) parsers, and I agree they're difficult.
https://github.com/ota-meshi/eslint-plugin-jsonc
https://github.com/ota-meshi/eslint-plugin-yml
https://github.com/ota-meshi/eslint-plugin-toml
https://github.com/ota-meshi/eslint-plugin-astro
https://github.com/ota-meshi/eslint-plugin-svelte
https://github.com/vuejs/eslint-plugin-vue
But I think it was easier than thinking about integrating these editors and whatnot.

(※ Vue parser is not something I started with. Also, the svelte plugin is not official.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also written a JSON/JSON-C parser, and I did it with the specific goal of making it ESTree-like for easier linting:
https://github.com/humanwhocodes/momoa

data: object;
loc: LocationRange;
suggest: Array<Suggestion>;
fix(fixer: Fixer): Iterable<FixOp>;
Copy link
Sponsor Member

@bmish bmish Dec 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AST-based autofixing (and suggestions) is a popular idea that's mentioned in the rewrite discussion (eslint/eslint#16557). It's also how various other linters do autofixing (including ember-template-lint, a handlebars template linter that I help maintain. As an aside, we've been interested to use ESLint under the hood for ember-template-lint for a long time, so excited to see this RFC).

When adding support for new languages, it seems like AST-based autofixing would be a critical feature to have ready, for a number of reasons:

  1. Support the migration of other linters that are already using AST-based autofixing without having to rewrite those autofixers to be based on string manipulation.
  2. Avoid getting off to a bad start by requiring all the new language plugins to use string-manipulation-based autofixers that could soon be obsolete.
  3. We could even decide to ONLY allow AST-based autofixing for language plugins. If that's the goal, the best time to enforce it is before any string-manipulation-based autofixers have been written. If that isn't the goal, then perhaps just make AST-based autofixing the default/recommendation for language plugins.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue with ast vs string autofixing is that you need a printer to convert the AST back to code and that printer has to make decisions about where to put braces or parentheses that aren't always encoded in the AST.

For example - const x = (1); the AST doesn't encode the location of the =, the existence of the parentheses, or the existence of the semicolon (I.e const x= 1 produces the exact same AST). Some of the info can be inferred from ranges sometimes - but a lot cannot.

So unless ESLint wants to start making stylistic decisions about code it prints after fixes - it needs a solution for that (like using a CST). Which to me is outside the scope of this particular RFC.

Copy link
Sponsor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A CST still might be better done before this RFC, though, for all the reasons indicated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is nothing about this RFC that requires a CST, nor is there anything about this RFC that precludes AST autofixing. However, I don't think it's prudent to bundle AST mutation with this RFC or make it a dependency. What a language decides to return from parse() is up to it - AST, CST, whatever. Each language can decide whether or not it wants to allow AST mutation, and it may not be possible to mutate the ASTs of different languages in the same way.

Fundamentally, AST mutation just requires:

  1. The ability to mutate the AST. For ESLint's architecture, this would likely need to be applied after linting vs. during traversing like in Babel. This is already the way autofixing works, so that wouldn't be an issue.
  2. The ability to generate the source code given a set of AST/CST nodes.

Both of these can be added on to a language definition later.

@nzakas
Copy link
Member Author

nzakas commented Dec 8, 2022

@DavidAnson thanks for taking a look! Indeed, in order for markdown linting to work in ESLint, we'd need to get line and column information. I haven't looked too deeply at markdown-it (though I have tinkered a bit). Would switching to Remark be an option? It appears they already output an AST with location information.

@nzakas
Copy link
Member Author

nzakas commented Dec 8, 2022

@ota-meshi Thanks for reviewing! I would need to look closer at vue-eslint-parser. Does it create an entire AST containing all HTML, CSS, and JS as one structure? Is there an example output anywhere I can look at?

Am I correct in that you also maintain astro-eslint-parser? Would you have the same concerns with that?

@DavidAnson
Copy link

I don't think remark existed when I started markdownlint for Node. If it's a good choice here, there is already a remark-lint package that could be a more natural fit. I haven't looked much into swapping the parser because it would be very disruptive. (Though I should consider that when I look at getting better line/column info out of markdown-it.)

@nzakas
Copy link
Member Author

nzakas commented Dec 8, 2022

@DavidAnson oh that's right, I've seen remark-lint before. IMHO, it's a lot more difficult to use due to the rule format, so it seems like another alternative would be welcome. I haven't looked closely, but what would prevent you from using the Remark AST and translating it into your current AST format? (TypeScript-ESLint does this, translating the TypeScript AST into an ESTree-compatible format.)

@DavidAnson
Copy link

what would prevent you from using the Remark AST and translating it into your current AST format

Time? Intelligence? Maybe nothing? :) One thing I'd want from switching is a more useful tree, so I'd maybe only translate to keep extensions working.

@nzakas
Copy link
Member Author

nzakas commented Dec 8, 2022

@DavidAnson I doubt intelligence is the gating factor but I do understand time. :) I'd be interested in exploring this with you if you're open to it. I feel fairly confident that there would be a way to use remark for parsing and then create an adapter to make existing markdownlint rules work.

@DavidAnson
Copy link

I appreciate the offer! I want to get the current round of releases out and then I should be in a better place to explore parser stuff. I assume this none of this is blocking for your proposal?

@ybiquitous
Copy link

Hi @nzakas, thanks for sharing the RFC.

This idea sounds good to end-users. But as a Stylelint maintainer, I also have a similar concern to @ota-meshi's one.

To lint *.vue with Stylelint, parse with postcss-html and Stylelint rules process the AST of the <style> tags and style attributes.
This means we need to divert multiple parsers and different language rules.

#99 (comment)

The current implementation of Stylelint is mainly based on the PostCSS AST. If compatibility would break, the Stylelint community might not reap the idea’s benefits.

@ota-meshi
Copy link
Member

@ota-meshi Thanks for reviewing! I would need to look closer at vue-eslint-parser. Does it create an entire AST containing all HTML, CSS, and JS as one structure? Is there an example output anywhere I can look at?

vue-eslint-parser parses result of <style> content as raw text. Also, currently the <style> tag is not included in the normal AST (it is possible to access the original information using a parser service).
If you want to check the AST, you can check it at https://astexplorer.net/, but it doesn't include the <style> tag so it may not be of value to you.

If you want to check the AST of postcss-html, you can check it at https://ota-meshi.github.io/postcss-html/. Inside <style> tags and style attributes are parsed as CSS.

Am I correct in that you also maintain astro-eslint-parser? Would you have the same concerns with that?

I maintain astro-eslint-parser (still experimental).
I have almost the same concerns with astro-eslint-parser. It doesn't parse CSS now.

@nzakas
Copy link
Member Author

nzakas commented Dec 14, 2022

@DavidAnson that's correct, we can move forward either way. Just offering to help out after the fact. :)

@ybiquitous my idea is to still parse into PostCSS AST format, but that would be done inside of ESLint, so (in theory) rules running against PostCSS nodes would still work.

@ota-meshi Hmm, I'm still not getting a good picture of how this is currently working. You are using vue-eslint-parser for both ESLint and Stylelint currently, correct? Are you using the same output in both places? Do you actually have a PostCSS AST embedded inside of an ESTree AST?

@ota-meshi
Copy link
Member

@ota-meshi Hmm, I'm still not getting a good picture of how this is currently working. You are using vue-eslint-parser for both ESLint and Stylelint currently, correct?

I use vue-eslint-parser to parse *.vue in ESLint, but not vue-eslint-parser to parse *.vue in Stylelint.
Use postcss-html to parse *.vue with Stylelint.

Do you actually have a PostCSS AST embedded inside of an ESTree AST?

I haven't embedded it yet. But I think we may need to embed the PostCSS AST inside the vue-eslint-parser parsed (ESTree-like) AST if it's optimal for the Language Plugins.

@JamesHenry
Copy link
Member

JamesHenry commented Dec 15, 2022

@nzakas It's confusing and a shame that I wasn't called out on the typescript-eslint note, and if I had been I could have also noted the other massive (5M downloads per month) ecosystem project which is relevant here: https://github.com/angular-eslint/angular-eslint

@angular-eslint/templater-parser users the Angular compiler to parse Angular-flavoured HTML and turn it into an AST which ESLint can traverse: https://github.com/angular-eslint/angular-eslint/blob/main/packages/template-parser/src/index.ts

@yeonjuan thank you for starting the html-eslint project - there will be a good amount of overlap between template rules in angular-eslint and html-eslint, maybe we could figure out a way to collaborate?

@nzakas
Copy link
Member Author

nzakas commented Dec 15, 2022

@ota-meshi Ah, I see! That makes sense. So the way this could work would be to create a processor that splits out the JS and CSS parts so they can be parsed separately. Then, you would be able to configure rules just for the JS and CSS parts.

@JamesHenry I didn't intentionally omit you -- I just haven't seen you in other discussions so you're weren't top-of-mind. That is why the RFC process requires at least 28 days, to make sure we are open for feedback from folks we don't think of right from the start. So it looks like you are parsing into one AST that contains everything, correct? Would this proposal make life easier for you in any way?

@ota-meshi
Copy link
Member

So the way this could work would be to create a processor that splits out the JS and CSS parts so they can be parsed separately. Then, you would be able to configure rules just for the JS and CSS parts.

Hmm.. I don't think returning partial text in the processor is a good option for me if it is the same as the current system.
I think there would be problems such as parsers and rules reporting wrong indentation since they cannot know the whole document. For example, the indentation of css in the style attribute.
If we use a processor, I think we will get better results if the processor returns the whole text in duplicate and each parser processes it.

@nzakas
Copy link
Member Author

nzakas commented Dec 26, 2022

@ota-meshi part of the role of processors is to normalize that information by mapping the whole into parts and vice versa.

Another possibility that this proposal opens up: You can parse an entire file into a single AST, where parts are ESTree and parts are PostCSS. Then, you can enable both JS and CSS rules on the file, and assuming there are no naming collisions between nodes in different sections of the AST, the rules should just work.

@ota-meshi
Copy link
Member

Another possibility that this proposal opens up:

I like the idea of including JS and CSS (and HTML) information in one tree, but I'm a little concerned that it would work well with the check rules for each language plugin.
I think that the vue (and HTML) parser may need to be able to know each language configuration (or at least parser configuration information) for CSS, SASS, SCSS, Less, Stylus, etc.
I am concerned that parsing on its own may result in loss of compatibility with each language's checking rules.

@nzakas
Copy link
Member Author

nzakas commented Dec 28, 2022

@ota-meshi your current system can continue to work, you won't need to switch over. So the worst case is that you keep doing things the way that you are. I think ultimately, though, treating a Vue file or an Astro file as its own type of AST made up of HTML, CSS, and JS would allow the most flexibility.

* After reading a file, ESLint passes the file information to `parse()` to create a raw parse result.
* The raw parse result is then passed into `createSourceCode()` to create the `SourceCode` object that ESLint will use to interact with the code.

#### The `validateOptions()` Method
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could also allow customizing merging behavior. In fact, instead of the validateOptions() method, this could be a languageOptionsSchema property. We could then use the provided schema in place of the current languageOptions schema.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For convenience, we could move deepMerge (I assume this merge strategy was intended for language options) into ObjectSchema. Then, specifying a schema that just validates the object as a whole wouldn't be much more complicated than specifying a validateOptions() method:

// An ESLintLanguage implementation
{
    // ...

    languageOptionsSchema: {
        merge: "deepAssign",
        validate(options) {
            // ....
        }        
    }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, that is an interesting idea. That would definitely give the same level of control to language implementers while also providing a bit more structure than a function that just throws an error. I like it!

@mnkiefer
Copy link
Contributor

@nzakas: I just stumbled upon this RFC and I'm very excited to hear about this development. May I give this a try with our custom ESLint plugin @sap/eslint-plugin-cds and report back?

In our case, we currently hack into the rule creation such that writing our CDS language rules is identical to ESLint. Instead of traversing an AST, we use the CDS language model (CSN) with its own selectors and APIs and a rule report wrapper to adjust our normalise our rule reports to ESLint.

@nzakas
Copy link
Member Author

nzakas commented Mar 20, 2023

@mnkiefer of course! We don't have a prototype to play with yet, but all feedback is welcome.

Copy link
Member

@mdjermanovic mdjermanovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! I just left a comment on the remaining open question.


## Open Questions

### Should we eliminate `:exit` from selectors?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this question was more important for earlier versions of this RFC where language implementations were expected to handle selectors. Now that parsing and processing of selectors stays in the ESLint core, this doesn't seem that closely related to the goals of this RFC, so we could evaluate this change separately?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, definitely.

@nzakas nzakas merged commit 91e184d into main Mar 24, 2023
@mnkiefer
Copy link
Contributor

@mnkiefer of course! We don't have a prototype to play with yet, but all feedback is welcome.

@nzakas: Thank you! I've read the RFC and can report that this approach could make our current plugin for the CDS language much simpler, depending on whether the following scenario could be handled by ESLint:

As already mentioned, our plugin does not rely on an AST but the CDS language model (CSN) which is computed from multiple files. Also, most of our CDS lint rules can't be considered for each file in isolation but only in the context of the entire model.

Hence, for the first parsed file belonging to a certain model: We compute the model, run all of the lint rules on it and cache the results. If the next file is part of that model, the lint results for that file can simply be collected from cache and be reported.

I'm wondering whether it would be possible for ESLint's LintResultCache to receive results for files it has not yet seen?

@nzakas nzakas deleted the languages branch March 27, 2023 19:46
@nzakas
Copy link
Member Author

nzakas commented Mar 27, 2023

@mnkiefer This seems like a much larger discussion that is prudent to do here. Can you open a discussion on the eslint/eslint repo so we can continue there?

I'm wondering whether it would be possible for ESLint's LintResultCache to receive results for files it has not yet seen?

No. The way you're describing this working isn't the way ESLint works internally, so I'm not sure it will be a good fit without significant changes.

@mnkiefer
Copy link
Contributor

mnkiefer commented Apr 5, 2023

@mnkiefer This seems like a much larger discussion that is prudent to do here. Can you open a discussion on the eslint/eslint repo so we can continue there?

I'm wondering whether it would be possible for ESLint's LintResultCache to receive results for files it has not yet seen?

No. The way you're describing this working isn't the way ESLint works internally, so I'm not sure it will be a good fit without significant changes.

@nzakas: Thanks for getting back to me on this. I've opened a discussion here. 👍🏼

bmish added a commit to bmish/eslint-rfcs that referenced this pull request May 21, 2023
* main:
  chore: Add Mastodon status post to workflow (eslint#110)
  feat: ESLint Language Plugins (eslint#99)
  feat: support for testing invalid rule schemas and runtime exceptions (eslint#103)
  feat!: check for parsing errors in suggestion fixes (eslint#101)
nzakas added a commit to eslint/eslint that referenced this pull request Apr 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Final Commenting This RFC is in the final week of commenting Initial Commenting This RFC is in the initial feedback stage
Projects
None yet