I just wanted to chime in that I'm fully in support of this and Grit is excited to share technology with Biome. 🚀 We're planning our first open source release in the next month. I'm particularly interested in the ideas around interop between JS plugins and GritQL queries. It's along the lines of what I have already thought about for a Grit SDK and should theoretically be able to combine the expressiveness of JS with the performance of Rust (accessed through declarative queries).
-
This is an exciting idea! GritQL is currently backed by TreeSitter, which gives a lot of flexibility in supporting many languages, and modifying the grammars to suit our particular needs. I'd love to exchange notes on GritQL internals and Biome's CST to further explore this idea.
-
Why make the formatter have plugins at all?
Most prettier plugins are for new grammars (which is notably not enabled by GritQL), and I’m not sure it would be in the interest of Biome’s architecture to add new grammars at this level. There’s an entire pipeline to consider that sits higher than the formatter.
The only other prettier plugins of any note are for things Biome should probably just support out of the box (jsdoc, tailwind).
Adding plugins to the formatter seems to accept pretty massive cost for little to no benefit.
-
A note on typescript eslint and the possibility of building a type checker in Biome.
The approach described in the Biome docs to eventually add a “stricter subset” of TS is almost certainly impossible to achieve without some sort of major change to how TS itself is being built.
Put simply, it’s too much work, and people’s tolerance for differences to the official type-checker is nonexistent.
That doesn’t mean that Biome couldn’t do things in the space. Quite a few TS-ESLint rules would be enabled by a robust control flow graph and data flow analysis.
If Biome did go down the path of doing type analysis, it would be far more successful to actually be less strict, effectively giving up in some key places, such as
The other alternative for that is to directly interface with the TypeScript language server. If Biome is to never build a replacement type checker, then users will always have to be running TypeScript alongside Biome anyway, so it might as well query from it.
This has obvious perf implications, so you’d want to avoid it in certain critical paths, but if it’s just for linting it seems acceptable with thoughtful scheduling (tbh you might want to consider cleanly dividing linting that requires typings and linting that does not).
-
I think this is far too optimistic of what GritQL would actually be useful for.
There are different levels of analysis required by different kinds of lint rules. Some operate directly on the structure of the code. These have largely been replaced by formatters.
The majority of lint rules out there require a lot more analysis than that. They need to look at scope information, or the control flow, or even type information. Even further, there are a lot of lint rules that need to pull in libraries for parsing or validating or any number of other things.
GritQL seems great for codemods. But it doesn’t seem useful for lint rules outside of the most straightforward kinds of lint rules, which also happen to be the easiest to just put in Biome (if they aren’t there already, as is the case with the example given in this RFC).
There are alternatives to GritQL that can query higher levels of analysis such as CodeQL, but that may also go too far in the other direction and be too confusing to be useful.
-
Currently Biome can be installed as a binary and executed before you run
This is an enormous benefit for CI usage and it would be a shame to give that up for plugins.
At the same time, asking users to install things with Cargo is a pretty high bar.
So I would consider making Biome capable of installing its own dependencies. Not trying to be its own fully fledged package manager, it could build on top of the npm registry and package.json.
-
If Biome does go down the route of JS plugins, asking the plugin authors to compile their code ahead of time provides an opportunity for optimization. This would allow Biome to avoid requiring any kind of additional runtime, meaning no node/deno/bun APIs, besides the APIs you provide it. During this phase, you could compile to bytecode or executables with Hermes or QuickJS. You can ensure all the code is bundled, optimized, and minified. So by the time you want to load the plugin you have very low initialization costs.
-
I think the main reason Biome is so fast is that it's multi-threaded, uses a background process, and reuses the same parse data for linting/formatting/... Biome could be slower than JS tools if it was poorly designed, but it's not!
-
Something I've not seen mentioned is that plugins could run in WinterCG-compatible workers. Biome doesn't need to support plugins importing npm dependencies; plugin authors can bundle them instead.
-
Just read that Cloudflare is using Grit as well: https://blog.cloudflare.com/lessons-from-building-an-automated-sdk-pipeline
-
Say I want to write a rule that forbids calling
But especially in a language as dynamic as JavaScript, there are many ways of laundering/obfuscating this code, e.g.
Obviously you can't perfectly tell what a piece of code is going to do by static analysis, but I'd want the ability to make both positive and negative behavioral rules like "this code MUST call
-
I'd like to chime in here. I work in an enterprise environment; our workspace has about 20 eslint rules specific to the repo itself. They primarily target developer experience and very often deal with the file system and file paths. I believe these kinds of rules to be a necessity when your project has 100+ contributors. We are unable to move to Biome because the loss of these rules would be detrimental to the developer experience. I think the plugin system has to be flexible enough to support basic language features such as code sharing, functions, conditional statements, etc., and at the very least deal with the external file system. I imagine this situation is commonplace in large-scale projects.
-
Introduction
A few weeks ago, I wrote an RFC to collect ideas and preferences for Biome's upcoming plugin system. First of all, I would like to thank everyone who responded! A lot of great suggestions were received, both in the RFC itself and also in follow-up discussions we had on our Discord.
In this RFC I would like to propose a more concrete direction. It will become quite the read, so feel free to jump to sections that pique your interest. I expect this to be the last RFC for a while, so while you can of course leave suggestions and critiques here, if you want to follow our progress, I suggest hopping onto the `#dev-plugins` channel on our Discord. Things may evolve considerably from what I'm writing down here.
Before diving into the deep end, let me give you a quick summary of what to expect:
With that out of the way, let's dig in.
GritQL plugins
I would like to start by discussing GritQL. In the first RFC several use cases were discussed: formatting, linting, transformation, a query engine, as well as codemods. But if you look carefully, they are all different masks that hide the same face:
At this point it should be clear the keywords here are querying and transformation. Those are GritQL's bread and butter.
Syntax
A large part of the appeal for GritQL is in its syntax. At first glance, Grit queries look like the code you're trying to match. Consider the following query:
`console.log("Hello world!");`
Can you guess what it matches? The nice thing is that it doesn't just match `console.log("Hello world!")`, it also matches `console.log('Hello world!')` (notice the single quotes) and `console . log ( "Hello world!" )`. This is because it does structural matching based on the syntax tree of a program.
Transformations are strikingly easy too. Imagine you want to transform `foo.bar && foo.bar()` into `foo.bar?.()`, and you want it to work regardless of the identifier path of the function being called. Here is the GritQL to do it:
There's a whole lot more it can do, but suffice it to say that I was convinced. If you want to read up some more about GritQL, I recommend having a look at the tutorial: https://docs.grit.io/tutorials/gritql
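For readers who would like to see the optional-chaining rewrite spelled out imperatively, here is a minimal TypeScript sketch of the same idea. It is not GritQL, and it is not how Grit or Biome implement matching: the `Expr` node shapes and the `sameTree`, `toOptionalCall`, and `print` helpers are all made up for illustration. What it shows is that comparing trees rather than text is what makes the rewrite work for any callee path.

```typescript
// Toy expression AST. These node shapes are hypothetical, not Biome's or Grit's.
type Expr =
  | { kind: "ident"; name: string }
  | { kind: "member"; object: Expr; property: string }
  | { kind: "call"; callee: Expr; optional: boolean }
  | { kind: "and"; left: Expr; right: Expr };

// Structural equality: compares tree shape only, so whitespace and quoting
// in the original source text never enter the picture.
function sameTree(a: Expr, b: Expr): boolean {
  if (a.kind === "ident" && b.kind === "ident") return a.name === b.name;
  if (a.kind === "member" && b.kind === "member")
    return a.property === b.property && sameTree(a.object, b.object);
  if (a.kind === "call" && b.kind === "call")
    return a.optional === b.optional && sameTree(a.callee, b.callee);
  if (a.kind === "and" && b.kind === "and")
    return sameTree(a.left, b.left) && sameTree(a.right, b.right);
  return false;
}

// Rewrites `x && x()` into `x?.()` for any callee path x.
// Every other expression is returned unchanged.
function toOptionalCall(expr: Expr): Expr {
  if (
    expr.kind === "and" &&
    expr.right.kind === "call" &&
    !expr.right.optional &&
    sameTree(expr.left, expr.right.callee)
  ) {
    return { kind: "call", callee: expr.left, optional: true };
  }
  return expr;
}

// Prints the toy AST back to source text.
function print(expr: Expr): string {
  switch (expr.kind) {
    case "ident":
      return expr.name;
    case "member":
      return `${print(expr.object)}.${expr.property}`;
    case "call":
      return `${print(expr.callee)}${expr.optional ? "?." : ""}()`;
    case "and":
      return `${print(expr.left)} && ${print(expr.right)}`;
  }
}
```

Compared with a declarative pattern, this takes noticeably more code for a single rewrite, which is part of the appeal of a query language for this kind of task.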
Alternatives Considered
Of course GritQL isn't the only language designed for this purpose. There are a lot of query languages out there, but fewer that also support transformation. I'll limit this section to alternatives that support both.
Comby is an alternative that doesn't rely on tree parsing. This can make it a bit faster than other approaches, but the downside is that it has less syntactic awareness of parsed programs. And given that Biome already parses code into a tree, it doesn't look like a great match for us.
Semgrep and `ast-grep` both use pattern matching that looks and acts a lot like what GritQL does, although both seem to have a preference for wrapping their patterns in YAML files. We could of course try to take the pattern syntax only, but then we quickly find it to be limited in functionality compared to GritQL's syntax. For instance, where GritQL has direct syntax support for things such as conditions and nested patterns, `ast-grep` has to fall back to complex-looking YAML structures for more advanced patterns: https://ast-grep.github.io/guide/rule-config.html#rule-file
Finally, it should be mentioned that Morgante, the founder of Grit.io, quickly joined the conversations and showed willingness to cooperate with us. Having their support certainly helped us to confirm our choice.
GritQL in Biome
Looking back at the use cases that we want to support, I believe it is a no-brainer to support GritQL in the transformation and codemod use cases. We haven't fully fleshed out yet how we would integrate the query engine use case into Biome, but one possibility that I find appealing is to add a sidebar to our VS Code extension, and let users submit GritQL queries from there. It would give our users a syntax-aware Search/Replace feature right in their editor.
For formatting, things are more subtle. GritQL hasn't been designed with support for transforming trivia in mind. However, if we create a GritQL implementation on top of Biome's CST, which is very much trivia-aware, we might be able to use it here too. Fingers crossed.
That leaves the linter. Let's imagine how we could use GritQL to implement Biome's `noImplicitBoolean` rule, which disallows JSX attributes without value, such as in `<input disabled />`. Here's how we could write this rule with GritQL:
You can see the fixer for this rule in action for yourself: https://app.grit.io/studio?key=kPdo1E85sb-gxiTFLA9Ag
The `diagnostic()` function isn't defined by GritQL, but the language allows for custom functions. We could use it to report the given message when linting, and to apply the replacement in the `fixer` when the user asks for it.
`noImplicitBoolean` isn't a complex rule to begin with, but it's hard to beat the expressiveness of the above snippet. As an added bonus, Grit already supports Markdown pattern files which would allow us to conveniently store a rule like the above together with the Markdown documentation for it.
JS/TS plugins
Given the overwhelming preference of our users to write plugins in JavaScript or TypeScript, I think it only makes sense to try to support them in Biome. It's a natural fit for web developers, but the main concern that needs addressing is performance. One of Biome's main selling points is that it is so much faster than the established alternatives. And the main reason it is so much faster, is precisely because it is not written in JavaScript. So if we allow plugins written in JavaScript, we need to be careful to avoid letting their performance drag down Biome as a whole. To achieve this, I would suggest the following strategy:
Linter API
As I like to say, "A code example speaks a thousand words". Let's look at what the above `noImplicitBoolean` rule could look like as a pure JS plugin:
The principle here is straightforward: the plugin registers a callback which is invoked for AST nodes of a given kind. On those nodes, it can apply some analysis, and report diagnostics when it finds issues.
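To illustrate the register-a-callback principle, here is a rough TypeScript sketch of what the host side of such an API might do. None of this is Biome's actual plugin API: `RuleRegistry`, `AstNode`, `Diagnostic`, the `"JsxAttribute"` kind string, and the crude text-based check are all assumptions made for the sake of the example.

```typescript
// Hypothetical node and diagnostic shapes; not Biome's real plugin API.
interface AstNode {
  kind: string;
  text: string;
}
interface Diagnostic {
  message: string;
  node: AstNode;
}
type Report = (diagnostic: Diagnostic) => void;
type Listener = (node: AstNode, report: Report) => void;

class RuleRegistry {
  private listeners = new Map<string, Listener[]>();

  // A plugin calls this to subscribe to exactly one node kind.
  on(kind: string, listener: Listener): void {
    const list = this.listeners.get(kind) ?? [];
    list.push(listener);
    this.listeners.set(kind, list);
  }

  // The host visits the nodes and only invokes the callbacks that were
  // registered for each node's kind; plugins never traverse the tree themselves.
  run(nodes: AstNode[]): Diagnostic[] {
    const diagnostics: Diagnostic[] = [];
    for (const node of nodes) {
      for (const listener of this.listeners.get(node.kind) ?? []) {
        listener(node, (d) => diagnostics.push(d));
      }
    }
    return diagnostics;
  }
}

// A toy stand-in for the rule: flag JSX attributes that have no "=value" part.
const registry = new RuleRegistry();
registry.on("JsxAttribute", (node, report) => {
  if (!node.text.includes("=")) {
    report({ message: "Use explicit boolean values for boolean JSX props.", node });
  }
});
```

Because the host owns the traversal, it can skip calling into JavaScript entirely for node kinds that no plugin has subscribed to.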
At this point I would propose two intentional limitations for the AST nodes passed to callbacks:
If either of those limitations turns out to cause too much friction, we can always introduce options that would allow rules to opt-in to additional serialization.
But the most interesting aspect of the snippet above is the fixer that is used. The fixer callback uses a `transform()` function that I'll explain in the next section.
Transformation API
Remember that GritQL snippet from above, where we transformed function calls preceded by a test for the callee with a use of optional chaining?
Here is what it looks like if we used the proposed JS API:
We've seen the traversal API above with the linter API. And we've seen a `transform()` call as well, since the linter's fixer also used it. It might be time to explain how it works...
`transform()` takes an AST node (the first argument) and transforms it into a snippet (the second argument). Often the node to be transformed matches the one being visited, but it could be another, such as `expr.left` in the example above. The snippet is expressed as an array where each element is either a node, or a string. Snippets can be conveniently created with the `into` tagged template helper.
The reason for using snippets instead of just flattening everything to a string is simple: We can't reliably turn it into a string, because we've lost information when we took Biome's internal CST nodes and exposed only part of their information as AST nodes to the script. So instead, we pass references to the AST nodes (along with replacement strings) back to Rust, where the complete replacement is constructed for us.
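As a rough sketch of how an `into` tagged template helper could produce such a snippet, consider the TypeScript below. Only the name `into` and the "array of nodes and strings" shape come from the description above; the `AstNode` shape with an `id` field, the `render` function, and the `sourceById` lookup are assumptions standing in for whatever the Rust side would really do.

```typescript
// Hypothetical AST node handle: just enough for the host to find the real node.
interface AstNode {
  id: number;
  kind: string;
}

// A snippet is an ordered mix of literal strings and node references.
type Snippet = Array<AstNode | string>;

// Tagged template helper: interleaves the literal parts with the interpolated
// values, keeping node references as references instead of stringifying them.
function into(strings: TemplateStringsArray, ...parts: Array<AstNode | string>): Snippet {
  const snippet: Snippet = [];
  strings.forEach((literal, i) => {
    if (literal !== "") snippet.push(literal);
    if (i < parts.length) snippet.push(parts[i]);
  });
  return snippet;
}

// Stand-in for the native side, which still knows the full source text
// (including trivia) behind every node reference.
function render(snippet: Snippet, sourceById: Map<number, string>): string {
  return snippet
    .map((part) => (typeof part === "string" ? part : sourceById.get(part.id) ?? ""))
    .join("");
}
```

With a node handle `callee`, the expression ``into`${callee}?.()` `` would yield a two-element snippet: the node reference followed by the string `"?.()"`. The script never needs the node's full text, which is the point of passing references back.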
Grit Patterns
Of course, if JavaScript plugins are not allowed to perform AST traversal themselves, it makes matching certain types of traversal and analysis annoyingly complex. But what if instead of using multiple listeners to detect nested patterns, a Grit query could be used to match instead?
For instance, consider this transformation for flattening nested `if-else` statements:
It allows for far richer queries than plain node traversal can, and also has performance benefits since all the traversal and query matching happens in native code inside Biome. But meanwhile developers writing their own rules or transformations still have the full flexibility of JavaScript available in processing the matches.
Of course the above syntax is merely a teaser, but I think if we go in this direction, we can truly get the best of both worlds.
Choice of Engine
One big question that will inevitably need answering if we want to support JavaScript plugins is, which engine will we use to implement it? I haven't made a decision here, so I'll just list the options as I see them:
V8 (Deno)
V8 seems to be the go-to engine people like to embed. Node and Deno are both built on it, and of course the world's most-used browser uses it. And thanks to Deno, it's even relatively easy to embed in a Rust program, because they have crates that we can use for exactly that purpose. It even has a permissions model built-in so we can offer a certain amount of sandboxing around plugins.
Deno also has built-in TypeScript support, but that support is implemented in the CLI, not the lower-level Rust crates. To some extent that's a blessing for us, since Deno uses SWC for transpiling TypeScript, while we would probably like to use our own infrastructure for that. Deno already provides us with a module loader interface where we could implement this.
To be clear, I think embedding Deno is probably our most-promising road to offering JavaScript plugin support. But I'll list some alternatives for the sake of completeness.
V8 (Node)
Of course we could also embed V8 through Node.js, and I even found a useful presentation about it: https://speakerdeck.com/jlkiri/node-dot-js-in-rust-how-to-do-it-and-what-to-expect-from-it
That said, with the exception of NPM compatibility, which may be easier to achieve, it seems like going with Node.js mostly has downsides. Integration into our Rust codebase would be more difficult, there's no sandboxing, and we'd inherit CommonJS vs. ESM incompatibilities.
SpiderMonkey
SpiderMonkey is Firefox's engine, and it also has Rust bindings we can use. It's quite a bit more low-level since we'd be dealing with the engine directly, but it appears SpiderMonkey is also a relatively popular engine for embedding, such as with CouchDB and MongoDB.
Performance-wise I wouldn't expect major changes from V8.
JavaScriptCore (Bun)
JavaScriptCore made some waves recently for becoming the engine of choice for Bun, the high-performance alternative to Node and Deno. It seems JSC's main performance advantage is in startup times, which would be a nice benefit for loading plugins too, of course.
Unfortunately, I haven't really found an official embedding API for Bun, and we'd either have to interface to Bun through C APIs, or use JSC's C++ APIs directly.
QuickJS
Another low-level engine, and because it has no JIT it will be significantly slower than the prior options. That said, it will help to keep Biome's binary from growing as much compared to the others and it has excellent startup performance.
Custom engine
As QuickJS shows, writing a JS engine from scratch isn't entirely infeasible, especially without a JIT. Biome already has a parser, and a custom engine could start execution without needing transpilation or an extra parsing step. Bindings to our CST model could also be very lean. So while a custom engine could never beat any of the JIT engines in raw performance, it could have the lowest overhead of any of the options.
As fun as it is to theorize about this however, I wouldn't suggest going down this road unless someone volunteers with way too much time on their hands :)
Ecosystem Support
Another big question that comes up when offering JavaScript plugins is whether they can load third-party modules or not. Offering compatibility with either the NPM or Deno ecosystems would open up a lot of possibilities for further integrations. It would also be a lot of work. Realistically, we would need to embed either Deno, Node, or Bun if we want to have a shot at NPM compatibility. Deno has an advantage here, in that tapping into the Deno ecosystem will already bring a lot of benefits while being a lot easier to support than full NPM compatibility. Either way, I think it's best to temper user expectations here, and not make any promises on this front for now.
One final remark I'd like to make here: Even though NPM may seem ubiquitous, not every web project uses it. Biome certainly doesn't use it. While I can see the obvious value in allowing plugins to load NPM modules, I would be hesitant towards requiring the use of NPM for installing Biome plugins.
TypeScript support
So far I've talked only about JavaScript specifically, but adding TypeScript support shouldn't be too hard for our plugin system.
When running a TypeScript plugin, Biome can simply strip the type annotations and hand over the resulting JavaScript to the engine that we settle on. One caveat is that no type-checking will be done when running a TypeScript plugin. Which leads us to...
An alternative for `typescript-eslint`
I will give a little bit of special attention to what might be the most popular ESLint plugin: `typescript-eslint`. Some users have made it very clear that supporting everything `typescript-eslint` can do is a dealbreaker for them. This included a suggestion that our plugins would need to support full NPM compatibility so that we could use the `typescript` package that way. There are a few things to address here:
Neither of the above approaches has much overlap with the plugins discussed in this RFC however, so if you're interested in this domain, I suggest hopping onto our Discord and checking out the `#dev-type-inference` channel instead.
Wrapping Up
This RFC proposed two different paths towards supporting plugins in Biome. I intend to begin by exploring how we can implement GritQL plugins first, since I expect this to be less effort, and GritQL plugins should already be able to handle a large number of use cases for Biome. JS/TS plugins will potentially be even more powerful, but implementing support for them will likely also be more effort. And with the inclusion of Grit-matching in the API for our JS/TS plugins, it makes sense to complete GritQL support first.
Exploring the integration of a JS engine into Biome could be done in parallel, so if someone is already interested in starting this work, I would happily support them. I may be able to offer mentoring opportunities here as well. If this speaks to you, please reach out on the `#dev-plugins` channel on our Discord server.
That should hopefully wrap things up for now. There's a lot of work to be done, so from here there'll be more coding and fewer RFCs... If you do have questions, suggestions, or concerns, feel free to leave them here. If you would like to follow our progress, I would again recommend our `#dev-plugins` channel :)
Thanks for reading!