RFC: Biome Plugins Proposal #1762

arendjr · 2024-02-06T21:44:03Z

arendjr
Feb 6, 2024
Maintainer

Introduction

A few weeks ago, I wrote an RFC to collect ideas and preferences for Biome's upcoming plugin system. First of all, I would like to thank everyone who responded! A lot of great suggestions were received, both in the RFC itself and also in follow-up discussions we had on our Discord.

In this RFC I would like to propose a more concrete direction. It will become quite the read, so feel free to jump to sections that pique your interest. I expect this to be the last RFC for a while, so while you can leave suggestions and critiques here of course, if you want to follow our progress, I suggest hopping onto the #dev-plugins channel on our Discord. Things may evolve considerably from what I'm writing down here.

Before diving into the deep end, let me give you a quick summary of what to expect:

As expected, the community seems greatly in favor of supporting JS/TS plugins.
We discovered a query language called GritQL that would give users a high-performance and easy-to-use way to both query and transform syntax trees.
In this RFC I will explore both options and see how we can combine the two to try to get the best of both worlds.
Of all the plugin use cases, users seem to value custom lint rules the most, so I will try to explore what kind of plugin API could best facilitate this use case.
Finally, I will also briefly address what some consider to be the elephant in the room: how does Biome intend to compete with ESLint's TypeScript plugin?

With that out of the way, let's dig in.

GritQL plugins

I would like to start by discussing GritQL. In the first RFC several use cases were discussed: formatting, linting, transformation, a query engine, as well as codemods. But if you look carefully, they are all different masks that hide the same face:

A query engine would allow us to find specific code patterns in a code base
A transformation engine would allow us to transform those code patterns into new patterns
A codemod is essentially a one-off, sometimes complex, transformation
A linter splits the phases: it queries for patterns that it reports diagnostics on, and optionally provides "fixers" that are again transformations.
A formatter is effectively a transformer with an emphasis on trivia (where should the whitespace go?)

At this point it should be clear the keywords here are querying and transformation. Those are GritQL's bread and butter.

Syntax

A large part of the appeal of GritQL is in its syntax. At first glance, Grit queries look like the code you're trying to match. Consider the following query:

`console.log("Hello world!");`

Can you guess what it matches? The nice thing is that it doesn't just match console.log("Hello world!"), it also matches console.log('Hello world!') (notice the single quotes) and console . log ( "Hello world!" ). This is because it does structural matching based on the syntax tree of a program.

Transformations are strikingly easy too. Imagine you want to transform foo.bar && foo.bar() into foo.bar?.(), and you want it to work regardless of the identifier path of the function being called. Here is the GritQL to do it:

`$path && $path()` => `$path?.()`

There's a whole lot more it can do, but suffice to say that I was convinced. If you want to read up some more about GritQL, I recommend having a look at the tutorial: https://docs.grit.io/tutorials/gritql
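To make the role of metavariables such as `$path` a bit more concrete, here is a minimal sketch of structural matching with bindings in plain JavaScript. It is not how GritQL is implemented. The node shapes and the helper names (`matchPattern`, `sameTree`, `metavar`, and the small node constructors) are assumptions made up for this illustration.

```javascript
// Hypothetical, hand-built syntax trees; a real parser produces richer nodes.
const ident = (name) => ({ kind: "Identifier", name });
const member = (object, property) => ({ kind: "Member", object, property });
const call = (callee) => ({ kind: "Call", callee });
const and = (left, right) => ({ kind: "LogicalAnd", left, right });
const metavar = (name) => ({ kind: "Metavariable", name });

// Deep structural equality between two trees.
function sameTree(a, b) {
  if (a === b) return true;
  if (a === null || b === null || typeof a !== "object" || typeof b !== "object") {
    return false;
  }
  const keys = Object.keys(a);
  if (keys.length !== Object.keys(b).length) return false;
  return keys.every((key) => sameTree(a[key], b[key]));
}

// Matches `node` against `pattern` and returns the metavariable bindings,
// or null when there is no match. A metavariable that occurs twice must
// bind structurally equal subtrees.
function matchPattern(pattern, node, bindings = new Map()) {
  if (pattern.kind === "Metavariable") {
    if (bindings.has(pattern.name)) {
      return sameTree(bindings.get(pattern.name), node) ? bindings : null;
    }
    bindings.set(pattern.name, node);
    return bindings;
  }
  if (node === null || typeof node !== "object" || node.kind !== pattern.kind) {
    return null;
  }
  for (const [key, expected] of Object.entries(pattern)) {
    if (key === "kind") continue;
    if (expected !== null && typeof expected === "object") {
      if (matchPattern(expected, node[key], bindings) === null) return null;
    } else if (expected !== node[key]) {
      return null;
    }
  }
  return bindings;
}

// Pattern for `$path && $path()`.
const pattern = and(metavar("path"), call(metavar("path")));

// Tree for `foo.bar && foo.bar()` and for `foo.bar && foo.baz()`.
const fooBar = () => member(ident("foo"), ident("bar"));
const matching = and(fooBar(), call(fooBar()));
const nonMatching = and(fooBar(), call(member(ident("foo"), ident("baz"))));

const result = matchPattern(pattern, matching); // Map with "path" bound to foo.bar
const miss = matchPattern(pattern, nonMatching); // null
```

Because the comparison works on trees rather than text, quote style and whitespace never enter the picture, and the second occurrence of `$path` only matches when it refers to the same path as the first.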

Alternatives Considered

Of course GritQL isn't the only language designed for this purpose. There are a lot of query languages out there, but fewer that also support transformation. I'll limit this section to alternatives that support both.

Comby is an alternative that doesn't rely on tree parsing. This can make it a bit faster than other approaches, but the downside is that it has less syntactic awareness of parsed programs. And given that Biome already parses code into a tree, it doesn't look like a great match for us.

Semgrep and ast-grep both use pattern matching that looks and acts a lot like what GritQL does, although both seem to have a preference for wrapping their patterns in YAML files. We could of course try to take the pattern syntax only, but then we quickly find it to be limited in functionality compared to GritQL's syntax. For instance, where GritQL has direct syntax support for things such as conditions and nested patterns, ast-grep has to fall back to complex-looking YAML structures for more advanced patterns: https://ast-grep.github.io/guide/rule-config.html#rule-file

Finally, it should be mentioned that Morgante, the founder of Grit.io, quickly joined the conversations and showed willingness to cooperate with us. Having their support certainly helped us to confirm our choice.

GritQL in Biome

Looking back at the use cases that we want to support, I believe it is a no-brainer to support GritQL in the transformation and codemod use cases. We haven't fully fleshed out yet how we would integrate the query engine use case into Biome, but one possibility that I find appealing is to add a sidebar to our VS Code extension, and let users submit GritQL queries from there. It would give our users a syntax-aware Search/Replace feature right in their editor.

For formatting, things are more subtle. GritQL hasn't been designed with support for transforming trivia in mind. However, if we create a GritQL implementation on top of Biome's CST, which is very much trivia-aware, we might be able to use it here too. Fingers crossed.

That leaves the linter. Let's imagine how we could use GritQL to implement Biome's noImplicitBoolean rule, which disallows JSX attributes without value, such as in <input disabled />. Here's how we could write this rule with GritQL:

or {
    `<$component $attrs />`,
    `<$component $attrs>$...</$component>`
} where {
    $attrs <: some $attr => diagnostic(
      message = "Use explicit boolean values for boolean JSX props.",
      fixer = `$attr={true}`,
      fixerDescription = "Add explicit `true` literal for this attribute",
      category = "quickFix",
      applicability = "always"
    ) where $attr <: r"[\w-]+"
}

You can see the fixer for this rule in action for yourself: https://app.grit.io/studio?key=kPdo1E85sb-gxiTFLA9Ag

The diagnostic() function isn't defined by GritQL, but the language allows for custom functions. We could use it to report the given message when linting, and to apply the replacement in the fixer when the user asks for it.

noImplicitBoolean isn't a complex rule to begin with, but it's hard to beat the expressiveness of the above snippet. As an added bonus, Grit already supports Markdown pattern files which would allow us to conveniently store a rule like the above together with the Markdown documentation for it.
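Coming back to the diagnostic() function: the idea that one record can serve both linting and fixing can be sketched as follows. This is only an illustration in plain JavaScript. The record shape (`range`, `replacement`) and the helpers `report` and `applyFixes` are hypothetical and not part of Biome or GritQL.

```javascript
// Hypothetical record that a `diagnostic()` call might hand to the host.
const source = "<input disabled />";
const diagnostics = [
  {
    message: "Use explicit boolean values for boolean JSX props.",
    range: { start: 7, end: 15 }, // span of `disabled` in `source`
    replacement: "disabled={true}",
    applicability: "always",
  },
];

// Lint mode: only report the messages, leave the source untouched.
function report(diags) {
  return diags.map((d) => `${d.range.start}-${d.range.end}: ${d.message}`);
}

// Fix mode: apply the replacements, working backwards through the text
// so that earlier offsets stay valid.
function applyFixes(text, diags) {
  const fixable = diags
    .filter((d) => d.applicability === "always")
    .sort((a, b) => b.range.start - a.range.start);
  let output = text;
  for (const d of fixable) {
    output = output.slice(0, d.range.start) + d.replacement + output.slice(d.range.end);
  }
  return output;
}

const messages = report(diagnostics);
const fixed = applyFixes(source, diagnostics); // "<input disabled={true} />"
```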

JS/TS plugins

Given the overwhelming preference of our users to write plugins in JavaScript or TypeScript, I think it only makes sense to try to support them in Biome. It's a natural fit for web developers, but the main concern that needs addressing is performance. One of Biome's main selling points is that it is so much faster than the established alternatives. And the main reason it is so much faster, is precisely because it is not written in JavaScript. So if we allow plugins written in JavaScript, we need to be careful to avoid letting their performance drag down Biome as a whole. To achieve this, I would suggest the following strategy:

We stick with Biome's current strategy of integrating common rules into Biome's core. Just because something can be a plugin, doesn't mean it has to be. This also means that plugins that become popular in the community may become candidates to be integrated directly into Biome.
Where possible, GritQL plugins should be preferred over JavaScript ones, since those are expected to run much faster. Depending on how fast they actually end up being, we may even allow GritQL plugins in Biome's core, which is something I frankly don't expect for JavaScript plugins.
JavaScript plugins should not do their own syntax tree traversal, since this would introduce a lot of serialization overhead and computation on the plugin side. Instead, plugins should only cause overhead on the code patterns they are actually interested in.

Linter API

As I like to say, "A code example speaks a thousand words". Let's look at what the above noImplicitBoolean rule could look like as a pure JS plugin:

import Biome, { transform, into } from "$biome-plugin";

Biome.traversal.onEnter({ kind: "JsxAttribute" }, (attr) => {
  if (attr.initializer == null) {
    Biome.linter.reportDiagnostic({
      message: "Use explicit boolean values for boolean JSX props.",
      fixer: () => transform(attr, into`${attr}={true}`),
      fixerDescription: "Add explicit `true` literal for this attribute",
      category: "quickFix",
      applicability: "always"
    });
  }
});

The principle here is straightforward: The plugin registers a callback which is invoked for AST nodes of a given kind. On those nodes, it can apply some analysis, and report diagnostics when it finds issues.
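The registration principle can be sketched from the host's point of view. The following is a simplified stand-in written in JavaScript, whereas the real traversal would live in Biome's Rust core. `createTraversal`, `walk`, and the `children` field are assumptions for this sketch, and the filter only looks at `kind`.

```javascript
// Hypothetical host-side registry; only `kind` filters are supported here.
function createTraversal() {
  const listeners = new Map(); // node kind -> callbacks

  function onEnter(filter, callback) {
    const callbacks = listeners.get(filter.kind) ?? [];
    callbacks.push(callback);
    listeners.set(filter.kind, callbacks);
  }

  // The host walks the tree; plugins are only called for kinds they asked for.
  function walk(node) {
    for (const callback of listeners.get(node.kind) ?? []) {
      callback(node);
    }
    for (const child of node.children ?? []) {
      walk(child);
    }
  }

  return { onEnter, walk };
}

const traversal = createTraversal();
const reported = [];

traversal.onEnter({ kind: "JsxAttribute" }, (attr) => {
  if (attr.initializer == null) {
    reported.push(attr.name);
  }
});

// Stand-in for the tree of `<input disabled type="text" />`.
const tree = {
  kind: "JsxSelfClosingElement",
  name: "input",
  children: [
    { kind: "JsxAttribute", name: "disabled", initializer: null },
    { kind: "JsxAttribute", name: "type", initializer: '"text"' },
  ],
};

traversal.walk(tree);
```

After the walk, `reported` holds only the `disabled` attribute. The element node itself never reaches the plugin, because nothing registered for its kind.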

At this point I would propose two intentional limitations for the AST nodes passed to callbacks:

They wouldn't contain any trivia. Linter rules tend to ignore those anyway, so it would be a waste of the serialization overhead. For us syntax nerds: This is where Biome's internal CST nodes are converted into AST nodes.
They wouldn't contain child statements. This doesn't affect the example above, but a rule that matches functions, for instance, would only be able to analyze the bodies of those functions by setting up additional callbacks for nested nodes. This is in support of the strategy mentioned earlier, that JavaScript plugins shouldn't do their own traversal, and again helps to avoid serialization overhead.

If either of those limitations turns out to cause too much friction, we can always introduce options that would allow rules to opt-in to additional serialization.
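As a rough illustration of the two limitations, the conversion from a CST node to the AST node handed to a plugin could look something like the sketch below. The field names (`leadingTrivia`, `trailingTrivia`, `body`) and the function `toPluginNode` are assumptions for this example and do not reflect Biome's actual node layout.

```javascript
// Fields this sketch treats as trivia or as nested statements (assumed names).
const OMITTED_FIELDS = new Set(["leadingTrivia", "trailingTrivia", "body"]);

// Recursively copies a node, leaving out trivia and child statements.
function toPluginNode(value) {
  if (Array.isArray(value)) return value.map(toPluginNode);
  if (value === null || typeof value !== "object") return value;
  const node = {};
  for (const [key, child] of Object.entries(value)) {
    if (!OMITTED_FIELDS.has(key)) {
      node[key] = toPluginNode(child);
    }
  }
  return node;
}

// Stand-in for the CST of a small function declaration.
const cstFunction = {
  kind: "JsFunctionDeclaration",
  leadingTrivia: "// Adds two numbers.\n",
  id: { kind: "JsIdentifier", name: "add", leadingTrivia: " ", trailingTrivia: "" },
  parameters: [
    { kind: "JsIdentifier", name: "a", trailingTrivia: "" },
    { kind: "JsIdentifier", name: "b", leadingTrivia: " " },
  ],
  body: [
    { kind: "JsReturnStatement", argument: { kind: "JsBinaryExpression", operator: "+" } },
  ],
};

const pluginNode = toPluginNode(cstFunction);
const savedCharacters =
  JSON.stringify(cstFunction).length - JSON.stringify(pluginNode).length;
```

The plugin still sees the function's name and parameters, while the comment, the whitespace, and the function body are never serialized. A rule that cares about the `return` statement would register a separate callback for that node kind.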

But the most interesting aspect of the noImplicitBoolean plugin above is the fixer that is used. The fixer callback uses a transform() function that I'll explain in the next section.

Transformation API

Remember that GritQL snippet from above, where we replaced function calls preceded by a test for the callee with optional chaining?

`$path && $path()` => `$path?.()`

Here is what it looks like if we used the proposed JS API:

import Biome, { matches, transform, into } from "$biome-plugin";

Biome.traversal.onEnter({ kind: "JsLogicalExpression", operator: "&&" }, (expr) => {
  if (expr.right.kind === "JsCallExpression" && matches(expr.left, expr.right.callee)) {
    transform(expr, into`${expr.left}?.()`);
  }
});

We've seen the traversal API above with the linter API. And we've seen a transform() call as well, since the linter's fixer also used it. It might be time to explain how it works...

transform() takes an AST node (the first argument) and transforms it into a snippet (the second argument). Often the node to be transformed is the one being visited, but it could be any other node the callback has access to, such as expr.left. The snippet is expressed as an array where each element is either a node, or a string. Snippets can be conveniently created with the into tagged template helper.

The reason for using snippets instead of just flattening everything to a string is simple: We can't reliably turn it into a string, because we've lost information when we took Biome's internal CST nodes and exposed only part of their information as AST nodes to the script. So instead, we pass references to the AST nodes (along with replacement strings) back to Rust, where the complete replacement is constructed for us.
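To illustrate how a snippet could be built and resolved, here is a minimal sketch. The `into` helper below is one possible implementation of the tagged template described above. The `render` function and the `originalText` table stand in for the Rust side and are purely hypothetical, as is the `id` field on the node handle.

```javascript
// Tagged template helper: interleaves literal strings and interpolated nodes.
function into(strings, ...nodes) {
  const snippet = [];
  strings.forEach((text, index) => {
    if (text !== "") snippet.push(text);
    if (index < nodes.length) snippet.push(nodes[index]);
  });
  return snippet;
}

// Stand-in for the host: it still knows the full source text of every node,
// including trivia that was never sent to the plugin.
const originalText = new Map([[1, "foo /* cached */ .bar"]]);

function render(snippet) {
  return snippet
    .map((part) => (typeof part === "string" ? part : originalText.get(part.id)))
    .join("");
}

// The plugin only holds a lightweight handle to the node.
const left = { id: 1, kind: "JsStaticMemberExpression" };

const snippet = into`${left}?.()`; // [left, "?.()"]
const replacement = render(snippet); // "foo /* cached */ .bar?.()"
```

Note how the comment inside the member expression survives the transformation even though the plugin never saw it. That would not be possible if the plugin had to flatten the node to a string itself.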

Grit Patterns

Of course, if JavaScript plugins are not allowed to perform AST traversal themselves, it makes matching certain types of traversal and analysis annoyingly complex. But what if instead of using multiple listeners to detect nested patterns, a Grit query could be used to match instead?

For instance, consider this transformation for flattening nested if-else statements:

import Biome, { grit, transform, into } from "$biome-plugin";

Biome.traversal.onQuery(
  grit`if ($cond) { $consequent } else { $alternative }`,
  { where: { alternative: grit`if ($cond2) { $consequent2 } else { $alternative2 }` } },
  (node, { cond, cond2, consequent, consequent2, alternative2 }) => {
    transform(node, into`if (${cond}) {
      ${consequent}
    } else if (${cond2}) {
      ${consequent2}
    } else {
      ${alternative2}
    }`);
  }
);

It allows for far richer queries than plain node traversal can, and also has performance benefits since all the traversal and query matching happens in native code inside Biome. But meanwhile developers writing their own rules or transformations still have the full flexibility of JavaScript available in processing the matches.

Of course the above syntax is merely a teaser, but I think if we go in this direction, we can truly get the best of both worlds.

Choice of Engine

One big question that will inevitably need answering if we want to support JavaScript plugins is, which engine will we use to implement it? I haven't made a decision here, so I'll just list the options as I see them:

V8 (Deno)

V8 seems to be the go-to engine people like to embed. Node and Deno are both built on it, and of course the world's most-used browser uses it. And thanks to Deno, it's even relatively easy to embed in a Rust program, because they have crates that we can use for exactly that purpose. It even has a permissions model built-in so we can offer a certain amount of sandboxing around plugins.

Deno also has built-in TypeScript support, but that support is implemented in the CLI; not the lower-level Rust crates. To some extent that's a blessing for us, since Deno uses SWC for transpiling TypeScript, while we would probably like to use our own infrastructure for that. Deno already provides us with a module loader interface where we could implement this.

To be clear, I think embedding Deno is probably our most-promising road to offering JavaScript plugin support. But I'll list some alternatives for the sake of completeness.

V8 (Node)

Of course we could also embed V8 through Node.js, and I even found a useful presentation about it: https://speakerdeck.com/jlkiri/node-dot-js-in-rust-how-to-do-it-and-what-to-expect-from-it

That said, with the exception of NPM compatibility, which may be easier to achieve, it seems like going with Node.js mostly has downsides. Integration into our Rust codebase would be more difficult, there's no sandboxing, and we'd inherit CommonJS vs. ESM incompatibilities.

SpiderMonkey

SpiderMonkey is Firefox's engine, and it also has Rust bindings we can use. It's quite a bit more low-level since we'd be dealing with the engine directly, but it appears SpiderMonkey is also a relatively popular engine for embedding, such as with CouchDB and MongoDB.

Performance-wise I wouldn't expect major differences from V8.

JavaScriptCore (Bun)

JavaScriptCore made some waves recently for becoming the engine of choice for Bun, the high-performance alternative to Node and Deno. It seems JSC's main performance advantage is in startup times, which would be a nice benefit for loading plugins too, of course.

Unfortunately, I haven't really found an official embedding API for Bun, and we'd either have to interface to Bun through C APIs, or use JSC's C++ APIs directly.

QuickJS

Another low-level engine, and because it has no JIT it will be significantly slower than the prior options. That said, it will help to keep Biome's binary from growing as much compared to the others and it has excellent startup performance.

Custom engine

As QuickJS shows, writing a JS engine from scratch isn't entirely infeasible, especially without a JIT. Biome already has a parser, and a custom engine could start execution without needing transpilation or an extra parsing step. Bindings to our CST model could also be very lean. So while a custom engine could never beat any of the JIT engines in raw performance, it could have the lowest overhead of any of the options.

As fun as it is to theorize about this however, I wouldn't suggest going down this road unless someone volunteers with way too much time on their hands :)

Ecosystem Support

Another big question that comes up when offering JavaScript plugins is whether they can load third-party modules or not. Offering compatibility with either the NPM or Deno ecosystems would open up a lot of possibilities for further integrations. It would also be a lot of work. Realistically, we would need to either embed Deno, Node, or Bun if we want to have a shot at NPM compatibility. Deno has an advantage here, in that tapping into the Deno ecosystem will already bring a lot of benefits while being a lot easier to support than full NPM compatibility. Either way, I think it's best to temper user expectations here, and not make any promises on this front for now.

One final remark I'd like to make here: Even though NPM may seem ubiquitous, not every web project uses it. Biome certainly doesn't use it. While I can see the obvious value in allowing plugins to load NPM modules, I would be hesitant towards requiring the use of NPM for installing Biome plugins.

TypeScript support

So far I've talked only about JavaScript specifically, but adding TypeScript support shouldn't be too hard for our plugin system.

When running a TypeScript plugin, Biome can simply strip the type annotations and hand over the resulting JavaScript to the engine that we settle on. One caveat is that no type-checking will be done when running a TypeScript plugin. Which leads us to...

An alternative for `typescript-eslint`

I will give a little bit of special attention to what might be the most popular ESLint plugin: typescript-eslint. Some users have made it very clear that not supporting everything typescript-eslint can do is a dealbreaker for them. This included a suggestion that our plugins would need to support full NPM compatibility so that we could use the typescript package that way. There are a few things to address here:

The Biome team has announced their intention to build their own TypeScript-compatible type inference system directly into Biome's core. This should not require any plugins or external dependencies to get to work.
Even so, some users expressed concerns that a new type inference implementation would lack too many features to be useful enough for their use cases. I don't think such concerns are entirely unwarranted, and we may be open to support users that prefer to extract type information from the TypeScript Language Server as well. It looks like we may already have a volunteer willing to help out in this area (shoutout @DaniGuardiola), but we will need all the help here we can get.

Neither of the above approaches has much overlap with the plugins discussed in this RFC however, so if you're interested in this domain, I suggest hopping onto our Discord and checking out the #dev-type-inference channel instead.

Wrapping Up

This RFC proposed two different paths towards supporting plugins in Biome. I intend to begin by exploring how we can implement GritQL plugins first, since I expect this to be less effort, and GritQL plugins should already be able to handle a large number of use cases for Biome. JS/TS plugins will potentially be even more powerful, but implementing support for them will likely also be more effort. And with the inclusion of Grit-matching in the API for our JS/TS plugins, it makes sense to complete GritQL support first.

Exploring the integration of a JS engine into Biome could be done in parallel, so if someone is already interested in starting this work, I would happily support them. I may be able to offer mentoring opportunities here as well. If this speaks to you, please reach out on the #dev-plugins channel on our Discord server.

That should hopefully wrap things up for now. There's a lot of work to be done, so from here there'll be more coding and fewer RFCs... If you do have questions, suggestions, or concerns, feel free to leave them here. If you would like to follow our progress, I would again recommend our #dev-plugins channel :)

Thanks for reading!

morgante · 2024-02-06T23:27:39Z

morgante
Feb 6, 2024

I just wanted to chime in that I'm fully in support of this and Grit is excited to share technology with Biome. 🚀 We're planning our first open source release in the next month.

I'm particularly interested in the ideas around interop between JS plugins and GritQL queries. It's along the lines of what I have already thought about for a Grit SDK and should theoretically be able to combine the expressiveness of JS with the performance of Rust (accessed through declarative queries).

3 replies

ilevyor · 2024-02-07T14:01:40Z

ilevyor
Feb 7, 2024

For formatting, things are more subtle. GritQL hasn't been designed with support for transforming trivia in mind. However, if we create a GritQL implementation on top of Biome's CST, which is very much trivia-aware, we might be able to use it here too.

This is an exciting idea! GritQL is currently backed by TreeSitter, which gives a lot of flexibility in supporting many languages, and modifying the grammars to suit our particular needs. I'd love to exchange notes on GritQL internals and Biome's CST to further explore this idea.

5 replies

raulfdm Feb 7, 2024

Is GritQL open source though?

morgante Feb 7, 2024

We are planning to open source it.

jamiebuilds · 2024-02-08T06:35:17Z

jamiebuilds
Feb 8, 2024

Why make the formatter have plugins at all?

Most prettier plugins are for new grammars (which is notably not enabled by GritQL), and I’m not sure it would be in the interest of Biome’s architecture to add new grammars at this level. There’s an entire pipeline to consider that sits higher than the formatter

The only other prettier plugins of any note are for things Biome should probably just support out of the box (jsdoc, tailwind)

Adding plugins to the formatter seems to accept a pretty massive cost for little to no benefit

4 replies

arendjr Feb 8, 2024
Maintainer Author

Some users had valid examples that would require formatter plugins or custom implementation into Biome’s core otherwise. For instance, some even maintain a custom Prettier fork just to comply with WordPress formatting rules.

GritQL supports those use cases, so if we have GritQL anyway, I hardly see a reason not to use it here. It definitely wouldn’t be a massive cost anymore.

fbartho Feb 18, 2024

I contribute to a prettier plugin that organizes, sorts, and groups import expressions.

If this can easily be implemented without a plugin, that's fine!

I could easily see a plugin API which is just a way to load GritQL code-mods assuming GritQL can express this!

https://github.com/IanVS/prettier-plugin-sort-imports

arendjr Feb 18, 2024
Maintainer Author

Thanks @fbartho , that’s a great use case! I suspect if we wanted to enable such a plugin, it might be more convenient if we give formatter plugins access to the JS API too. But could be @morgante has a good idea how to solve this with pure GritQL :)

morgante Feb 19, 2024

This would definitely be solvable with pure GritQL, since it's mostly just rearranging a list based on some properties within it.

jamiebuilds · 2024-02-08T07:00:49Z

jamiebuilds
Feb 8, 2024

A note on typescript eslint and the possibility of building a type checker in Biome.

The approach described in the Biome docs to eventually add a “stricter subset” of TS is almost certainly impossible to achieve without some sort of major change to how TS itself is being built

Put simply, it’s too much work, and people’s tolerance for differences to the official type-checker is nonexistent

That doesn’t mean that Biome couldn’t do things in the space. Quite a few TS-ESLint rules would be enabled by a robust control flow graph and data flow analysis.

If Biome did go down the path of doing type analysis, it would be far more successful to actually be less strict, effectively giving up in some key places, such as T extends U ? Y : Z where you’d effectively have to implement all of TypeScript’s type checker in order to answer T extends U. This would never be useful as a type checker but could maybe be useful to query for basic type information in lint rules.

The other alternative for that is to directly interface with the TypeScript language server. If Biome is to never build a replacement type checker, then users will always have to be running TypeScript alongside Biome anyways, might as well query from it. This has obvious perf implications, so you’d want to avoid it in certain critical paths, but if it’s just for linting it seems acceptable with thoughtful scheduling (tbh you might want to consider cleanly dividing linting that requires typings and linting that does not)

2 replies

arendjr Feb 8, 2024
Maintainer Author

I feel you’re misunderstanding the intentions of what Biome is trying to achieve with their custom type inference. It’s there to power lint rules, not to implement a type checker.

It will indeed use “type flow analysis” to do that.

The other alternative for that is to directly interface with the TypeScript language server.

As mentioned in the RFC, someone is investigating this angle as well.

DaniGuardiola Feb 8, 2024
Collaborator

This is what I've been arguing, and while I respect the official effort and want to see how far it gets, I'm looking into integrating actual TypeScript support for type-aware rules. Josh Goldberg (typescript-eslint maintainer) has been helping out, and I've also chatted with Ben about possibly integrating his Ezno checker (which is a WIP Rust TypeScript alternative) in the long run.

Interfacing with the language server is something I want to explore though there'd still need to be support for static runs outside of an IDE context - I also have some concerns regarding how rules are authored since, in typescript-eslint, they rely on the TypeScript compiler API and I'm not sure how much of that is available through LSP (including the upcoming assignability check APIs that the typescript-eslint folks are waiting on).

jamiebuilds · 2024-02-08T07:24:30Z

jamiebuilds
Feb 8, 2024

I think this is far too optimistic of what GritQL would actually be useful for.

There are different levels of analysis required by different kinds of lint rules.

Some operate directly on the structure of the code. These have largely been replaced by formatters.

The majority of lint rules out there require a lot more analysis than that. They need to look at scope information, or the control flow, or even type information.

Even further, there are a lot of lint rules that need to pull in libraries for parsing or validating or any number of other things

GritQL seems great for codemods. But it doesn’t seem useful for lint rules outside of the most straightforward kinds of lint rules, which also happen to be the easiest to just put in Biome (if they aren’t there already, as is the case with the example given in this RFC)

There are alternatives to GritQL that can query higher levels of analysis such as CodeQL, but that may also go too far in the other direction and be too confusing to be useful

18 replies

morgante Feb 8, 2024

It's not my decision if Biome wants to use GritQL or not. If anyone has questions about how Grit works, I'm happy to engage. If Biome wants to use Grit, we're happy to collaborate. Otherwise, I don't think I need to defend against incorrect assumptions about how it works.

ematipico Feb 9, 2024
Maintainer

@morgante

This isn't a technical comment, but it's a comment about your engagement with a member of our community. I have read the whole thread, plus all the messages that were edited. I think you have taken this too personally and interacted with them inappropriately.

I would like to keep things technical without judgement and accusation. Jamie's first message looked neutral and technical to me, based on the knowledge he gathered from your documentation (here, I assume that the documentation was written by you or one of your collaborators), and he drew his own conclusions, saying that GritQL might not add enough value as a linter to a Biome plugin system.

Your first response:

Why are you speculating about what GritQL can and cannot do without spending time learning it? We certainly have many rules that look at control flow and scopes.

This tone seems personal to me. If the documentation you - yourself and your collaborators - wrote doesn't provide all the cases that you claim grit provides, I would expect a message that explains it.

Also, I want to assume that your documentation has everything, but it's also possible that a user cannot reach a specific chapter of the documentation. If the user doesn't read something, it's also possible that some parts of the documentation aren't visible. It's fine to share the part of the documentation that the user didn't read, but it's not fine to say:

again affirms you haven't spent any time to understand what you're disparaging

You should teach us about your tool, with an informative tone without being personal. Let's have a technical discussion. You should not be degrading other people for not reading your documentation enough. Maybe your documentation isn't clear enough?

As a maintainer of Biome and Astro, and Rome and webpack-cli in the past, I understand the frustration, but we should never judge other users for not knowing a tool.

This is a formal warning, and I expect an apology to our user Jamie.

morgante Feb 9, 2024

I don't think it's a good idea to continue polluting this thread with debates over tone, so I can follow up about that elsewhere.

I'm sorry for my part in dragging things in that direction, and apologize to Jamie for being overly aggressive in my reaction.

yacinehmito Feb 22, 2024

In order to make JS plugins fast, Biome is going to need to add a querying system. You've already mentioned one in the RFC. If anything I think it's not ambitious enough. This querying system should be designed as an actor model, it should operate at a higher abstraction level, and it should only pull in parts of the AST that it actually requires. Much like how the green tree works already.

If Biome has its own querying system, the plugin API becomes by design extensible and can change as Biome grows into supporting more use cases, whereas embedding a DSL like GritQL limits Biome to the capabilities of that DSL. It looks to me to be a safer path if one wants to be able to properly react to unknown unknowns. Those are bound to happen as Biome grows into a bundler, a type-checker, a test runner and whatnot.

jamiebuilds · 2024-02-08T07:46:47Z

jamiebuilds
Feb 8, 2024

Currently Biome can be installed as a binary and executed before you run npm install

This is an enormous benefit for CI usage and it would be a shame to give that up for plugins

At the same time, asking users to install things with Cargo is a pretty high bar.

So I would consider making Biome capable of installing its own dependencies. Not trying to be its own fully fledged package manager, it could build on top of the npm registry and package.json

7 replies

DaniGuardiola Feb 8, 2024
Collaborator

+1, love the idea of bundling linter deps in the repo too, it simplifies things and works well for Yarn v2+.

fbartho Feb 18, 2024

Expecting biome to be installed on the system before the user's normal package manager executes is not a convenience (IMHO) for many CI approaches: you end up having to add another CI step, and you get an out-of-band version management issue.

If this project is meant to format JS/TypeScript + NodeJS projects, it should be fair to install it as a node_module

morgante Feb 18, 2024

Not requiring node_modules does not imply that Biome (and plugins) can't also be installed via your normal package manager. This is already true today: Biome is available as a standalone executable, but also installable from npm. I don't think anything with plugins would change that.

fbartho Feb 18, 2024

I guess I was just alarmed by a mix of details in other threads: embedding Deno, having Biome install its own plugins, and so on. Soon you have to manage a lock file for Biome, security disclosures need to go into GitHub, etc.

arendjr Feb 18, 2024
Maintainer Author

Hehe, no worries. I expect Biome plugins would simply be checked into your repository, similar to how Yarn plugins also work. That means you wouldn’t need a separate install step for them and lockfiles are also not necessary.

It’s also similar in approach to using the ESLint rulesdir plugin, where you simply specify a directory to load custom rules from.

jamiebuilds · 2024-02-08T07:51:31Z

jamiebuilds
Feb 8, 2024

If Biome does go down the route of JS plugins, asking the plugin authors to compile their code ahead of time provides an opportunity for optimization.

This would allow Biome to avoid requiring any kind of additional runtime, meaning no node/deno/bun APIs, besides the APIs you provide it.

During this phase, you could compile to byte code or executables with Hermes or QuickJS. You can ensure all the code is bundled, optimized, and minified. So by the time you want to load the plugin you have very low initialization costs.

10 replies

Conaclos Feb 9, 2024
Maintainer

There are many options to assess: Libjs, njs, Boa, llrt, workerd, ...

arendjr Feb 13, 2024
Maintainer Author

Boa seems an interesting candidate indeed. It's written in Rust and its implementation is split into a few crates. Maybe we could cherry-pick a few of the crates to build something tightly integrated with the rest of Biome. Worthy of some more investigation!

ematipico Feb 13, 2024
Maintainer

FWIW, I don't think I'd be comfortable integrating Hermes into Biome

Actually, what we should consider is static Hermes, which has a different purpose. It's not oriented to react native

arendjr Feb 13, 2024
Maintainer Author

@ematipico As cool as I think Static Hermes is (very), I don't think it's suitable for this use case. It's basically a full compiler for translating JS (and to some extent: TS) into C and then to native binaries. Doing that on-demand would give terrible latency, since you'd always have a full compile step before being able to run anything. That's all fine for mobile apps, where you can use regular Hermes during development and then compile binaries for final distribution, but I don't think it makes for a decent plugin system. There's a couple of problems here that I see:

Bundling Hermes and LLVM with Biome would probably make our binary even bigger than bundling V8.
The latency for compiling on-demand would be many times what any JIT compiler can offer.
Checking in binaries isn't really a workable solution, since they're platform-specific and would give issues for teams that use multiple operating systems (including teams on Mac who use Linux runners in CI).
Developing plugins would become very tedious if you have to compile them for every dev-test cycle. You'd be combining the downsides of static languages (slow compilation) with the downsides of dynamic languages (no type safety). Like mobile apps, that wouldn't be an issue if we had regular Hermes as an option as well, but then we need to implement two runtimes...

arendjr Feb 13, 2024
Maintainer Author

The Boa devs seem quite receptive to the idea of using their engine in Biome btw: boa-dev/boa#3673

nstepien · 2024-02-16T00:13:19Z

nstepien
Feb 16, 2024
Sponsor

And the main reason it is so much faster, is precisely because it is not written in JavaScript. So if we allow plugins written in JavaScript, we need to be careful to avoid letting their performance drag down Biome as a whole.

I think the main reason Biome is so fast is that it's multi-threaded, uses a background process, and reuses the same parse data for linting/formatting/...
Of course if it was written in JS it'd still be much slower, but the overall architecture has a bigger impact IMO, and that architecture would amortize the cost of JS plugins to some extent: the same JS plugin could work on multiple files in parallel thanks to the architecture.

Biome could be slower than JS tools if it was poorly designed, but it's not!

1 reply

jamiebuilds Feb 28, 2024

I would not underestimate the importance of manual memory management and Rust's performance. For example, Biome's parsing mechanisms (see biome_rowan/etc) do not currently exist in any JavaScript parser and I believe if someone tried to build one it would be unreasonably slow for tooling that rapidly edits these trees.

I would also not underestimate the weight of serialization between Rust and JavaScript or the limitations you'd need to impose that would make JavaScript plugins notably less powerful than Rust ones. I don't think you could reasonably implement "real-time" typing features with invalid programs and such, you'd need to wait for a valid AST all the time because the cost of the CST would be far too high.

nstepien · 2024-02-16T00:34:40Z

nstepien
Feb 16, 2024
Sponsor

Something I've not seen mentioned is that plugins could run in WinterCG-compatible workers.

Biome doesn't need to support plugins importing npm dependencies, plugin authors can bundle them instead.
On the other hand, native modules like fs cannot be bundled, so it'd depend on the worker environment. Not all npm dependencies will work in those environments. Tradeoffs!

3 replies

Conaclos Feb 16, 2024
Maintainer

Requiring bundled code could also improve performance for loading and executing code.

fbartho Feb 22, 2024

Why do you say “Biome doesn’t need to support plugins importing npm dependencies”?

I disagree — declaring that all plugins can’t have dependencies severely limits the problems that can be solved, or the scale of the plugins ahead of time.

That might be an appropriate choice, but I don’t think that’s an “obvious” constraint that can “just” be stated as fact.

At a minimum, plugins will need an API surface area from Biome, and people like to use their own tools and helpers, which might be external dependencies.

nstepien Feb 22, 2024
Sponsor

Why do you say “Biome doesn’t need to support plugins importing npm dependencies”?

NPM deps can be bundled, so biome doesn't need to support resolving NPM/node_modules imports.
I'm not saying plugins shouldn't use NPM packages.

acalvino4 · 2024-04-26T14:57:09Z

acalvino4
Apr 26, 2024

Just read that Cloudflare is using Grit as well: https://blog.cloudflare.com/lessons-from-building-an-automated-sdk-pipeline
Always good to have validation (and investment) from the big guys.

0 replies

rotu · 2024-05-08T21:21:02Z

rotu
May 8, 2024

Say I want to write a rule that forbids calling process.exit(). I know GritQL can easily check that the syntax process.exit() does not literally occur in my code.

But especially in a language as dynamic as JavaScript, there are many ways of laundering/obfuscating this code:

e.g.

```js
import('data:text/javascript,process.exit()')
```

e.g.

```js
const x = 'pro'+'cess';
const y = 'exit';
globalThis[x][y]();
```

Obviously you can't perfectly tell what a piece of code is going to do by static analysis, but I'd want the ability to make both positive and negative behavioral rules like "this code MUST call process.exit" or "this code MUST NEVER call process.exit".
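
To make the gap concrete, here is a deliberately naive stand-in for a purely syntactic rule. It is a regular expression rather than GritQL, and `mentionsProcessExit` is a hypothetical helper, but it misses the computed-member variant for the same reason a literal pattern match would: the name `process.exit` never appears in the source and is only assembled at runtime.

```js
// Hypothetical helper: a text-level check for a literal `process.exit(` call.
function mentionsProcessExit(source) {
  return /\bprocess\s*\.\s*exit\s*\(/.test(source);
}

const direct = "process.exit();";
const computed =
  "const x = 'pro'+'cess';\nconst y = 'exit';\nglobalThis[x][y]();";

const results = {
  direct: mentionsProcessExit(direct), // the literal call is found
  computed: mentionsProcessExit(computed), // the laundered call is not
};
```

Catching the second case would need some notion of what `x` and `y` evaluate to, which is semantic rather than syntactic information.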

8 replies

morgante May 8, 2024

Ex. here's an example of how easy it is to subvert any of the built-in rules.

These limitations have nothing to do with GritQL. Biome, as it is today, does not contain type-level information.

rotu May 9, 2024

GritQL is a query engine on top of the graph. It can query whatever information is available in the graph. You are holding it to expectations that no other linter currently follows.

If you would like to add more semantic information to be available for querying, by all means add it.

I'm probably misunderstanding the tool here. Looking at the documentation for GritQL made me think it is a pattern matching and query tool for only the syntax tree, and that it does not have the ability to represent richer semantic details like "what line of code defines the symbol?".

There is clearly a difference between the "biome graph" and the AST which I missed and I think you were trying to convey above. I think it would help my intuition a ton to see an example of GritQL on a structure which is not merely AST such as the overlaid dataflow you mention:

We have actually had other graphs overlaid on top (ex. dataflow graph) and would like to expose any graphs Biome already has for fast/easy querying. There is nothing intrinsically saying GritQL can only query ASTs.

As for your statement on my semi-obfuscated code:

In general, I really think you should consider what you're trying to do with your tools. If your repo contains code like this then that is the problem.

I agree that my example is unreasonable code. There will be unreasonable code and I'm trying to wrap my head around how that works with query-based rules. Your example is a good one, and rightly demonstrates that the current status quo is nothing to write home about, so perhaps I'm thinking too far outside the box!

morgante May 9, 2024

Looking at the documentation for GritQL made me think it is a pattern matching and query tool for only the syntax tree, and that it does not have the ability to represent richer semantic details like "what line of code defines the symbol?".

That is just semantic information, which could theoretically always be overlaid on top of an AST. Ultimately the AST is what any tool starts with which you can enrich with additional symbolic information.

For example, here's how you can match on the type of a symbol in GritQL if a type tree is available:

`$symbol` where {
   $x = type($symbol),
   $x <: string // match any case where the type of $symbol is resolved as a string
}

Anyways, I think most of your objections are simply with what the state of the art is in most linting tools (including Biome), not GritQL. If Biome ever does add more semantic information, that will be queryable via functions.

There will be unreasonable code and I'm trying to wrap my head around how that works with query-based rules.

I don't know why you are picking on query-based rules. The hard/important part is constructing the semantic tree—once it's available, built-in rules and plugins can both use it.
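
As a rough illustration of that point, the sketch below builds one tiny "semantic model" (a map from symbol names to resolved types) and lets both a built-in rule and a plugin rule query it through the same `typeOf` function. Everything here is hypothetical: the model shape, `createSemanticModel`, `typeOf`, and both rules are assumptions for illustration, not Biome or GritQL APIs.

```js
// Hypothetical: the semantic model is just a lookup table in this sketch.
function createSemanticModel(entries) {
  const types = new Map(entries);
  return { typeOf: (symbol) => types.get(symbol) ?? "unknown" };
}

const model = createSemanticModel([
  ["userName", "string"],
  ["retryCount", "number"],
]);

// A "built-in" rule and a "plugin" rule, both relying on the same model.
const builtInRule = (symbol) => model.typeOf(symbol) === "string";
const pluginRule = (symbol) => model.typeOf(symbol) !== "unknown";

const symbols = ["userName", "retryCount", "mystery"];
const stringSymbols = symbols.filter(builtInRule);
const resolvedSymbols = symbols.filter(pluginRule);
```

The expensive part is producing the table; once it exists, neither rule cares whether it is compiled into the tool or loaded as a plugin.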

Anyways, I encourage you to consider that the purpose of linters is generally to catch common mistakes from developers, not to enforce 100% correctness. If you are looking for that, I'm afraid you will be disappointed by ~every tool there is.

rotu May 9, 2024

I’m sold on query-based rules. It makes sense, once you point out how functions allow for reacting to conditions that are not local, syntactic properties.

And linters don’t prove that things are rigorously, structurally true of code. Because when an engine gets that persnickety, we call it a type system instead :-).

morgante May 9, 2024

Because when an engine gets that persnickety, we call it a type system instead :-).

Indeed. Hopefully Biome will eventually integrate one, but it's quite a ways off.

alexgorbatchev · 2024-05-23T02:41:28Z

alexgorbatchev
May 23, 2024

I'd like to chime in here. I work in an enterprise environment, and our workspace has about 20 ESLint rules specific to the repo itself. They primarily target developer experience and very often deal with the file system and file paths. I believe these kinds of rules to be a necessity when your project has 100+ contributors. We are unable to move to Biome because the loss of these rules would be detrimental to the developer experience.

I think the plugin system has to be flexible enough to support basic language features such as code sharing, functions, conditional statements, etc., and deal with the external file system at the very least.

I imagine this situation is commonplace in large-scale projects.

3 replies

We have a Remix project in which we agreed that every route that has a loader MUST also export an error boundary, so it can handle its own errors without them bubbling up to the root error boundary.

To enforce that easily, I've created a custom ESLint rule that looks like this 👇🏽

// @ts-check
/** @type {import('eslint').Rule.RuleModule} */
export default {
  meta: {
    type: 'problem',
    fixable: 'code',
    docs: {
      description: `Loader + ErrorBoundary`,
    },
    messages: {
      unexpectedProcessEnv: `When exporting a loader, the route must also provide an ErrorBoundary.`,
    },
  },
  create(context) {
    const sourceCode = context.sourceCode;

    let fileHasLoader = false;
    let importGenericErrorBoundaryWasAdded = false;
    let exportErrorBoundaryWasAdded = false;
    let fileHasErrorBoundary = false;
    /**
     * Every page route that renders something must have a default export.
     * If it only has the loader, it means it's an API route so we don't need
     * to have an ErrorBoundary
     */
    let hasDefaultExport = false;

    return {
      ExportNamedDeclaration(node) {
        if (node.declaration) {
          // Handle FunctionDeclaration (including async functions) and VariableDeclaration
          if (
            node.declaration.type === 'FunctionDeclaration' ||
            node.declaration.type === 'VariableDeclaration'
          ) {
            // For FunctionDeclaration, the function name is directly available
            if ('id' in node.declaration && node.declaration.id.name) {
              if (node.declaration.id.name === 'ErrorBoundary') {
                fileHasErrorBoundary = true;
              } else if (node.declaration.id.name === 'loader') {
                fileHasLoader = true;
              }
            }

            // For VariableDeclaration, iterate through declarations to find the names
            if ('declarations' in node.declaration) {
              for (const declaration of node.declaration.declarations) {
                if (declaration.id && 'name' in declaration.id) {
                  if (declaration.id.name === 'ErrorBoundary') {
                    fileHasErrorBoundary = true;
                  } else if (declaration.id.name === 'loader') {
                    fileHasLoader = true;
                  }
                }
              }
            }
          }
        }

        // Handling other named exports (e.g., export { loader };)
        if (node.specifiers) {
          for (const specifier of node.specifiers) {
            if (specifier.exported && specifier.exported.name) {
              if (specifier.exported.name === 'ErrorBoundary') {
                fileHasErrorBoundary = true;
              } else if (specifier.exported.name === 'loader') {
                fileHasLoader = true;
              }
            }
          }
        }
      },
      ExportDefaultDeclaration() {
        hasDefaultExport = true;
      },
      'Program:exit'(node) {
        if (fileHasLoader && hasDefaultExport && !fileHasErrorBoundary) {
          context.report({
            node: sourceCode.getScope(node).block,
            message: 'When exporting a loader, an `ErrorBoundary` must also be exported.',
            fix(fixer) {
              const { text: fileText, ast } = context.sourceCode;

              let fixers = [];

              if (
                !importGenericErrorBoundaryWasAdded &&
                !fileText.includes("from '$ui/GenericErrorBoundary'")
              ) {
                fixers.push(
                  fixer.insertTextBefore(
                    ast.body[0],
                    "import { GenericErrorBoundary } from '$ui/GenericErrorBoundary'\n",
                  ),
                );
                importGenericErrorBoundaryWasAdded = true;
              }

              if (
                !exportErrorBoundaryWasAdded &&
                !fileText.includes('export const ErrorBoundary')
              ) {
                fixers.push(
                  fixer.insertTextAfter(
                    ast.body[ast.body.length - 1],
                    '\n\nexport const ErrorBoundary = GenericErrorBoundary',
                  ),
                );

                exportErrorBoundaryWasAdded = true;
              }

              return fixers;
            },
          });
        }
      },
    };
  },
};

And it works pretty well for enforcing team agreements.
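
Stripped of the AST traversal and the autofix, the decision this rule makes is small, which is useful to keep in mind when thinking about what a plugin API needs to express. A sketch in plain JavaScript, where `needsErrorBoundary` and its input shape are hypothetical names for illustration:

```js
// Hypothetical helper: takes the export names collected from a route file.
function needsErrorBoundary({ namedExports, hasDefaultExport }) {
  const names = new Set(namedExports);
  // API-only routes (a loader without a default export) are exempt.
  return names.has("loader") && hasDefaultExport && !names.has("ErrorBoundary");
}

const pageWithoutBoundary = needsErrorBoundary({
  namedExports: ["loader"],
  hasDefaultExport: true,
});
const apiRoute = needsErrorBoundary({
  namedExports: ["loader"],
  hasDefaultExport: false,
});
const pageWithBoundary = needsErrorBoundary({
  namedExports: ["loader", "ErrorBoundary"],
  hasDefaultExport: true,
});
```

Only the first case should be reported; the bulk of the original rule is spent collecting `namedExports` from the different export syntaxes and generating the fix.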
