TypeScript compiler in Rust #5432

@kitsonk

This is long, but please read before you comment. If you are interested in following this, use the subscribe button to the right instead of just adding a comment.

Since the Deno v1 announcement there has been a lot of community interest in the TypeScript compiler in Rust that was mentioned. This issue is intended to provide information on that. First a few points:

  • I am a long-term collaborator on Deno. I am not a committer. This means these are entirely my own opinions, but ones I have formed through discussion with a wide range of people and from my personal experience.
  • There is a high likelihood that this issue will get noisy. Please think first before adding to that noise. A lot of people are eager to contribute to Deno and TypeScript. If you haven't been contributing to Deno and related projects or the TypeScript compiler itself for a while, it would be best to be an observer here and see if you can help out. We will all make better progress if we continue in a semi-organised fashion.
  • Deno ❤️ TypeScript. TypeScript is here to stay. A lot of hard work, sweat and tears have been invested by lots of people to make TypeScript as a language a reality, and that has always been built on a compiler that is written in TypeScript. My opinion, which I know I am not alone in holding, is that TypeScript would not be as great as it is if it hadn't been written in TypeScript. That being said, the language TypeScript is built on, JavaScript, is a slippery beast when it comes to performance. One could argue that one of the things that makes it hard to optimise is the lack of strong types. That irony has a direct impact on the ability of tsc to maximise performance.
  • Sometimes we are imprecise in our language. In the v1 announcement, the intent was born out of the challenges of making TypeScript a first class language in Deno, and we want to get to a better situation where the gap in performance between starting up JavaScript in Deno and starting up TypeScript in Deno narrows. We have one opinion on the approach, which we will express here, but we are open to taking the path that delivers the best outcomes.

The Problem

We need a bit of context about what problem we are really trying to solve here. Type checking, especially with a structural type system built on top of a weakly typed underlying language, is expensive. As TypeScript has grown as a language, and its type checking has become more and more advanced and safe, it has become increasingly "expensive" to do this type of checking. Deno treats TypeScript as a first class language. At the moment Deno also always runs TypeScript through the TypeScript compiler and treats any type errors as it would treat a JavaScript runtime error.

This is "expensive". If you look at Deno benchmarks you will see that there is currently a cost of about 10x just spinning up the compiler to get some TypeScript transpiled into JavaScript.

[Image: Deno benchmarks chart]

We have tried a few things to improve that, but there is still a big cost, and we want to try to narrow that gap significantly. For other things in Deno we have had really good experiences moving things we used to do in JavaScript to Rust. There are several reasons for the performance improvement, but the most compelling is that it is just plain easier to get performant code in Rust. If you resort to non-obvious structures in JavaScript, you can often get lots of performance from v8, but it is really really really hard work. Our text encoder is an example. We started with a spec-compliant implementation, following the structures laid out in the IDL. Its performance was abysmal. We then implemented a fairly "magical" non-obvious implementation in JavaScript/TypeScript which dramatically increased performance, but even that wasn't enough. We got even more performance by moving it to Rust: even with the overhead of "copying" the data in and out of the JS sandbox, it is faster than what we could get out of heavily optimised JavaScript. Could we have gotten more out of our JS implementation... maybe... but it just gets too hard.

How does it work today?

I realise a lot of folks might not understand well where we are at today, so some background context is helpful. It sort of works like this...

How the CLI is put together

  • v8 runs JavaScript in a sandbox, called an isolate.
  • rusty_v8 is a Rust crate that provides bindings to v8 from Rust, to allow Deno to interface with v8 (which is built in C++).
  • Deno core is a Rust crate that provides a basic communication path between a v8 isolate and Rust. Sending a message, including data, between an isolate and Rust is what we call an "op", and so Deno internal isolate code "ops" to Rust and Rust replies back.
  • deno_typescript is a Rust crate that allows us to transpile TypeScript ES Modules into a single file JavaScript System bundle which can be run in an isolate. This is effectively bootstrap code which allows us to author TypeScript for internal code. It also allows us to create snapshots, which is a v8 feature which allows a running JavaScript sandbox's state to be serialised and dumped to a file.
  • During the build process we create two snapshots: the runtime/cli snapshot, which provides all the basic runtime environment where Deno CLI code runs; and the compiler snapshot. The compiler snapshot is effectively a web worker which contains typescript.js (tsc if you will) and our infrastructure code to manage it.
  • In the compiler bundle that the snapshot is built on, we instantiate a program with the TypeScript compiler, which with our infrastructure generates TypeScript source files for each of the lib files that we use in our runtime environment. We also currently use that as an "old program" that we feed the compiler, but recent conversations have made me realise that this likely doesn't do anything. The speed-up we are seeing likely comes from having the source files for the lib files in memory as part of the snapshot, since they are reused on subsequent compilations.
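
As a rough illustration of the "op" pattern described above, here is a minimal sketch in which both sides are plain TypeScript. In Deno the host side is Rust and the message crosses the isolate boundary; every name here (HostOps, sendOp, "op_fetch_source") is hypothetical and not part of Deno's API.

```typescript
// Hypothetical sketch of an "op": the isolate side sends a named message
// with a serialised payload, the host side looks up a handler and replies.
// In Deno the host is Rust; here it is plain TypeScript for illustration.
type OpHandler = (requestJson: string) => string;

class HostOps {
  private readonly handlers = new Map<string, OpHandler>();

  // Register a handler under an op name.
  register(name: string, handler: OpHandler): void {
    this.handlers.set(name, handler);
  }

  // Route a serialised request to its handler and return the serialised reply.
  dispatch(name: string, requestJson: string): string {
    const handler = this.handlers.get(name);
    if (handler === undefined) {
      throw new Error(`unknown op: ${name}`);
    }
    return handler(requestJson);
  }
}

// Isolate-side helper: serialise the request, "op" to the host, parse the reply.
function sendOp<Req, Res>(host: HostOps, name: string, request: Req): Res {
  return JSON.parse(host.dispatch(name, JSON.stringify(request))) as Res;
}

// Example wiring: a made-up op that returns module source from memory.
const demoSources = new Map<string, string>([
  ["/mod.ts", "export const x = 1;"],
]);
const demoHost = new HostOps();
demoHost.register("op_fetch_source", (requestJson) => {
  const { specifier } = JSON.parse(requestJson) as { specifier: string };
  const source = demoSources.get(specifier);
  return JSON.stringify(
    source === undefined ? { error: `not found: ${specifier}` } : { source },
  );
});
```

The point of the sketch is only the shape of the exchange: every request and reply is serialised, which is why many small round trips for individual files add up.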

"Running" TypeScript code

  • When you try to use some TypeScript (most commonly via deno run), the Deno CLI will take the module passed and see if it is in the Deno cache, or if it needs to be fetched. If it is in the cache and it is a local module, Deno checks whether the source has been modified or the --reload flag is set; if it is remote, it only checks whether the --reload flag is set. If the transpiled version of the module is in the cache and "valid", Deno will just load it (see below).
  • If there is a TypeScript file that needs to be compiled/transpiled, Deno will spin up the compiler isolate from the snapshot and then will send it a message to compile the module.
  • Inside the infrastructure code for the compiler in Deno, we currently use the ts.preProcessFile() plus other logic to determine the dependencies of the root file. As those dependencies are identified, the compiler "ops" to Rust to resolve and fetch those modules.
  • There is logic in Rust to deal with identification of "type substitutions" for TypeScript files. For example if you are loading a JavaScript file, but you want to use type definitions for the compiler to type check against instead of the JavaScript file, Deno supports using things like X-TypeScript-Types header to identify those files. The fetching logic will resolve all of that before returning the type definition to the compiler instead of the JavaScript file.
  • When the sources are resolved, they are returned back to the compiler and the compiler infrastructure caches them in memory in order to be ready for them to be requested by the TypeScript compiler.
  • Deno then creates a program with the TypeScript compiler. This will then trigger the TypeScript compiler to request source files, which we start feeding via the provided Deno CompilerHost. We substitute .d.ts files for JavaScript files where appropriate.
  • We check the pre-emit diagnostics and if there are any (that we don't ignore), we serialise them and return them to Rust and stop the compilation.
  • Deno then does an emit on the program, which will cause the TypeScript compiler to start to "write" out files. The Deno CompilerHost will take these writes and "op" back into Rust to have them added to the cache.
  • Finally we finish the compilation and return to Rust, which usually spins down the compiler.
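
Two of the decisions in the steps above can be sketched in a few lines: whether a cached emit can be reused, and which file the compiler should type check against. This is a simplified sketch, not Deno's actual logic. The CacheEntry shape and both function names are hypothetical, how Deno detects a modified source is not specified here so it is passed in as a flag, and header names are assumed to be lower-cased already.

```typescript
// Hypothetical view of what the cache knows about a module.
interface CacheEntry {
  isLocal: boolean;        // local file, as opposed to a remote module
  sourceModified: boolean; // assumption: the caller has already detected this
  hasEmit: boolean;        // transpiled JavaScript is present in the cache
}

// Decide whether the cached emit can be loaded without running the compiler.
function canReuseCachedEmit(entry: CacheEntry | undefined, reload: boolean): boolean {
  if (entry === undefined) return false; // not in the cache at all
  if (reload) return false;              // --reload invalidates local and remote
  if (entry.isLocal && entry.sourceModified) return false; // only local files are checked
  return entry.hasEmit;
}

// Pick the specifier the compiler should type check against. If the response
// for a JavaScript file carried an X-TypeScript-Types header, the type
// definition is substituted. Resolving the header value relative to the
// module URL is omitted from this sketch.
function typeCheckTarget(
  specifier: string,
  headers: { [name: string]: string | undefined },
): string {
  const types = headers["x-typescript-types"];
  return types !== undefined ? types : specifier;
}
```

For example, a remote module with a cached emit is reused unless --reload is passed, while a modified local module always goes back through the compiler.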

Loading JavaScript into the isolate

  • When a file is in the cache, where it is either direct JavaScript, or JavaScript transpiled from a TypeScript source, Deno loads it into the runtime isolate (or a worker isolate) as an ES Module. Everything from "userland" is considered an ES module.
  • Loading the module triggers v8's dependency analysis, which in turn will start requesting other modules to be loaded into the isolate. Those are then fetched from the cache. If the root module was TypeScript, most of the time the compilation will already be done and the emitted file is in the cache ready to go. In the case of not statically identifiable dynamic imports or if the root module was just JavaScript, hitting a non-transpiled module is possible, and the compiler would be loaded and the compilation would happen like above.
  • Once v8 has all the modules, it instantiates them and code starts running. 🎉
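
The loading flow above can be sketched as a walk over the module graph that reads from the cache and falls back to the compiler only on a miss. This is a toy model with hypothetical names (EmittedModule, loadModuleGraph). In Deno it is v8 that drives the requests, and the import list comes from v8's own analysis rather than a precomputed array.

```typescript
// Hypothetical cached module: emitted JavaScript plus its static imports.
interface EmittedModule {
  code: string;
  imports: string[];
}

// Stand-in for spinning up the compiler isolate for a single module.
type CompileFn = (specifier: string) => EmittedModule;

// Load the root and everything it imports. Modules come from the cache when
// present; a miss triggers a compile and the result is cached for next time.
// Returns the specifiers with dependencies before their dependents.
function loadModuleGraph(
  root: string,
  cache: Map<string, EmittedModule>,
  compile: CompileFn,
): string[] {
  const loaded: string[] = [];
  const visited = new Set<string>();

  const visit = (specifier: string): void => {
    if (visited.has(specifier)) return; // already handled (also breaks cycles)
    visited.add(specifier);
    let mod = cache.get(specifier);
    if (mod === undefined) {
      mod = compile(specifier); // cache miss: compile on demand
      cache.set(specifier, mod);
    }
    for (const dep of mod.imports) {
      visit(dep);
    }
    loaded.push(specifier);
  };

  visit(root);
  return loaded;
}
```

When the root module was TypeScript and was compiled up front, the compile callback should never fire. It only comes into play for cases like a JavaScript root that imports a TypeScript module that has not been transpiled yet.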

The Approach

Ok, so what do we do about it? The discussions I have had on the subject with various people have always been about an evolutionary approach. This approach is also what I personally feel is right for Deno. Solving everyone's problems is tough, but if solving Deno's problems helps out everyone else, all the better 🥳.

Here are the major items that I think we need to tackle, and likely in the order presented here:

  • Move the dependency graph analysis into Rust. @bartlomieju has this in progress in refactor: rewrite TS dependency analysis in Rust #5029. It also needs to be integrated into the compiler, which Bartek and I planned on working on together. This means we would be able to fully fetch all the sources that the compiler would need before we spin up the compiler. While ts.preProcessFile() has been super useful, there are a few bugs with it that have been long outstanding, and with our added logic of supporting type substitutions it is just better/easier/more testable to have all of that in Rust. It will speed things up a little bit by not having to "op" back and forth sending individual files.
  • Provide a "no check" capability for Deno, where TypeScript is simply stripped of its types (and transformed for things like experimental decorators) and then loaded into Deno. This is discussed in Slow TypeScript compile time #4323 but likely needs its own issue to deliver the feature (which I will take care of). This is covered in Not Type Checking TypeScript #5436. In some of the conversations I have had, people have suggested that making this the default might actually make more sense. That is something we should discuss. Not type checking TypeScript would certainly speed things up, especially when you are developing, with type checking saved for those special occasions, like a pre-commit hook.
  • Get a really good performance analysis of the TypeScript compiler under Deno. Now that we have the v8 inspector working under Deno, we should be able to get flame charts for compilations. There is also a lot of instrumentation infrastructure in the TypeScript compiler that we haven't looked at enabling or using in Deno. Some of it is quite Node.js opinionated, so contributing back enhancements that make it a bit more isomorphic, or easier to enable, would I am sure benefit everyone, as well as making it easier to observe the performance of the compiler. Admittedly we don't know specifically where we are eating up time.
  • Look at lexing/parsing of TypeScript in Rust and sending a serialised AST to the compiler isolate. This is a seriously complex issue. swc is our preferred choice for Rust lexing/parsing of JavaScript and TypeScript. It is the engine behind deno fmt (along with the excellent dprint built on top of it, which is how we became aware of swc in the first place) and deno doc. swc, like most ECMAScript parsers, produces an ESTree-like AST structure. The TypeScript AST is significantly different from that... it is far more aligned to a Roslyn AST. It would be naïve of me (or anyone) to think that it would be trivial to transform swc's AST to a TypeScript one, but it may be worth the effort, or at least worth exploring a simple transform to see if we can generate an AST (TS SourceFile) that TypeScript can consume. If this did work, we could use the TypeScript compiler to just do type checking. (It was pointed out that a reference implementation of an ESTree-like AST to TS AST conversion exists in typescript-eslint.)
  • Investigate further improvements in type checking performance in TypeScript that could be gained by optimising for v8. This would benefit everyone, not just Deno. It would be hard work, but I believe the TypeScript core team are amenable to contributions along those lines. Once we have a baseline of performance, there may be a lot we can do there.
  • Try to move type checking to Rust. Personally, I am not convinced on this, but it is a logical conclusion of "moving to Rust". I see lots of downsides to this, and very few upsides, unless we find that 99% of the time spent in TypeScript is type checking, and even then it is naïve to think that Rust in and of itself would be the solution to that problem. There is a lot of blood, sweat and tears put into the type checking in TypeScript by a lot of people who understand it a lot better than any of us ever will. What I do understand is that what the type checker does is very different from, say, the "hot paths" that tend to run fast in Rust, like lexing and parsing large text files. So even if it were ported over and kept in sync with the TypeScript based type checker, it may not perform that much faster. My opinion is that there are a lot of other things to do before we even consider this, if ever.
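
To make the first item more concrete, here is a minimal sketch of walking a dependency graph up front so that every source is known before the compiler is spun up. It is written in TypeScript for readability, while the proposal is to do this in Rust. The regex scanner is a crude stand-in for a real parser such as swc, specifier resolution is skipped (specifiers are treated as already-resolved keys), dynamic imports are ignored, and the names scanImports and collectGraph are hypothetical.

```typescript
// Crude static import/export scanner. A regex is only good enough for a
// sketch; it will be fooled by comments and string contents.
const IMPORT_RE = /(?:import|export)\s+(?:[^'"]*?\s+from\s+)?["']([^"']+)["']/g;

function scanImports(source: string): string[] {
  const found: string[] = [];
  const re = new RegExp(IMPORT_RE.source, "g"); // fresh regex, fresh lastIndex
  let match: RegExpExecArray | null;
  while ((match = re.exec(source)) !== null) {
    found.push(match[1]);
  }
  return found;
}

// Walk the graph breadth-first from the root and return every module the
// compiler would need, each listed once, in discovery order.
function collectGraph(root: string, sources: Map<string, string>): string[] {
  const seen = new Set<string>([root]);
  const queue: string[] = [root];
  const order: string[] = [];
  while (queue.length > 0) {
    const specifier = queue.shift() as string;
    const source = sources.get(specifier);
    if (source === undefined) {
      throw new Error(`cannot resolve module: ${specifier}`);
    }
    order.push(specifier);
    for (const dep of scanImports(source)) {
      if (!seen.has(dep)) {
        seen.add(dep);
        queue.push(dep);
      }
    }
  }
  return order;
}

// Example input: four in-memory modules keyed by their specifier.
const exampleSources = new Map<string, string>([
  ["/main.ts", 'import { a } from "/a.ts";\nimport "/side.ts";\n'],
  ["/a.ts", 'export * from "/b.ts";\nexport const a = "value";\n'],
  ["/b.ts", "export const b = 1;\n"],
  ["/side.ts", 'import { b } from "/b.ts";\n'],
]);
const exampleOrder = collectGraph("/main.ts", exampleSources);
// exampleOrder is ["/main.ts", "/a.ts", "/side.ts", "/b.ts"]
```

With the full list in hand, all sources can be handed to the compiler in one go instead of being requested one "op" at a time.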
