Public API for compilers #1739

Open
kitsonk opened this issue Feb 11, 2019 · 42 comments
Labels
feat (new feature, which has been agreed to/accepted) · public API (related to "Deno" namespace in JS)

Comments

@kitsonk
Contributor

kitsonk commented Feb 11, 2019

For tracking purposes, please don't work on this without discussing it with Ry or me.

Having a public API that is similar to how we perform TypeScript compilation is a good idea. It would allow JS->JS transpilation (e.g. for those who need custom Babel plugins or Flow) or other languages (e.g. CoffeeScript).

Related to #1738 and some other work in rationalising the compiler APIs internally, but we should be able to support loading a compiler in a web worker and instructing the privileged side what resources should be sent to that runtime compiler.

@ry ry added this to the v0.4 milestone Feb 19, 2019
@daniele-orlando

Dart and Elm are other candidate languages.

@islishude

I think Dartlang is better than TypeScript.

@kitsonk
Contributor Author

kitsonk commented Mar 1, 2019

@islishude then you are probably looking at the wrong project.

@afinch7
Contributor

afinch7 commented Mar 1, 2019

What you really want is an extensible module loading system (not resolution, which, as already discussed, is prone to problems). I have quite a few ideas on this one, and even some existing code. I've been working on a similar idea for a while now, and I think the only good way to design this is with a two-stage loading process:

  1. First "load" (what this means really depends on the media type) modules to JS or TS (possibly with embedded Wasm) source, since these are the only formats the TypeScript compiler understands (other than JSON).
  2. Compile using something similar to the existing TS compiler.

The first step would be the only part that is user extensible.

@kitsonk
Contributor Author

kitsonk commented Mar 2, 2019

The public compiler API would need to do the following (a rough sketch of the resulting message flow follows the list):

  1. Allow a user to register a compiler. That registration would include the extensions and media types that it should compile. For security and consistency reasons, TypeScript media types and extensions should be disallowed, but JSON and JavaScript could potentially be registered. This would be a special type of web worker. I am not certain if this should be a new op or if we add non-standard customisations to the options of new Worker(url, options) to provide the appropriate information.
  2. When the media type is encountered, Rust would post a message to the worker providing the module specifier and referrer.
  3. The userland compiler would be able to fetch resources using the fetch_module_meta_data op. We would enforce the read file/network security on this (unlike the built-in compiler).
  4. The userland compiler would post back to Rust, via the web worker API, the compiled module code, source map and any diagnostics.
  5. The userland compiler would need to be able to be unregistered. (I guess we would just use Worker.terminate().)
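
To make these steps concrete, here is a rough sketch of the message protocol they imply. Every name and shape below is hypothetical; nothing like this exists in Deno today.

// Hypothetical protocol only; none of these names are real Deno APIs.

/** Step 2: privileged (Rust) side -> registered compiler worker. */
interface CompileRequest {
  specifier: string; // fully qualified module specifier
  referrer: string;  // the module that requested it
}

/** Step 4: compiler worker -> privileged side. */
interface CompileResponse {
  specifier: string;
  code: string; // emitted JavaScript
  sourceMap?: string;
  diagnostics?: string[];
}

// Inside the registered compiler worker (step 1 would have associated it with,
// say, the "application/coffeescript" media type and the ".coffee" extension):
self.onmessage = async (event) => {
  const { specifier, referrer } = event.data as CompileRequest;
  // Step 3: fetch the source; the privileged side would enforce read/net permissions.
  const source = await (await fetch(specifier)).text();
  // The user-supplied compile step (CoffeeScript, Babel, etc.) is stubbed out here:
  const compile = (src: string): CompileResponse => ({ specifier, code: src, diagnostics: [] });
  self.postMessage(compile(source));
};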

@afinch7
Contributor

afinch7 commented Mar 3, 2019

You specified separating compilers by media type, but I don't think that extension (media type) is really a reliable way to do this (how would you handle no extension?). I guess you could decide on a list of supported media types, but I feel that would be very limiting. You could load modules by manifest and have the manifest tell the loader system what sort of loader to use, but that already sounds way too much like package.json if you ask me. That might be fine if some thought was put into it, but I don't think that's really what the Deno community wants.

In general, the more complicated your expectations get, the more difficult this will be to implement, and the more bugs we will encounter in the process.
It might be much simpler to decide that user compilers should be trusted code and expect them to handle the retrieval of resources required to complete their tasks; thus the nomenclature "loader" would be more fitting.
The expectations for said loaders could be as simple as:

  1. Loaders will be given a module's fully qualified URL (this could be generated using native URL parsing, i.e. new URL(specifier, referrer ? referrer : defaultUrl)).
  2. Loaders will also be given a module's referrer information: origin URL, source code, source map, media type, source loader (which loader was used to provide this resource?), etc.
  3. Loaders would be expected to take the information from 1 and 2 and, as accurately as possible, return a TS or JS module that represents that module's URL (or error out if not possible).
  4. Loaders should be given the ability to error out on a request for any reason (and should be encouraged to error out as soon as possible).
  5. Loader priority should be determined by their order in the list of configured loaders, and each loader must be given an attempt and error out before trying the next one.
  6. Loaders should be designed to be platform agnostic, so they can be integrated into tooling like a TypeScript language services plugin. This would most likely be achieved by having the implementation pass the loaders platform-specific implementations of a shared resource accessor API.

The new dynamic import could be used to load these "loaders" as modules with a defined structure.
I tried to describe this as well as I could, but I figured I might be able to better represent my ideas as TypeScript interfaces, so I also included a simple example implementation: https://gist.github.com/afinch7/4356a4377ec20dc336456d4639777578.
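
Roughly, the loader shape described in points 1-6 might look like the following (hypothetical names; the gist above has a fuller version):

// Hypothetical sketch of the loader contract described above.
interface ReferrerInfo {
  url: string;
  source?: string;
  sourceMap?: string;
  mediaType?: string;
  loader?: string; // which loader produced the referrer
}

interface LoadedModule {
  url: string;
  mediaType: "application/javascript" | "application/typescript";
  source: string;
}

interface Loader {
  /** Return a JS/TS module for `url`, or throw as early as possible if it can't. */
  load(url: URL, referrer?: ReferrerInfo): Promise<LoadedModule>;
}

// Point 5: try loaders in their configured order; each must error out before
// the next one is attempted.
async function loadModule(
  loaders: Loader[],
  url: URL,
  referrer?: ReferrerInfo,
): Promise<LoadedModule> {
  for (const loader of loaders) {
    try {
      return await loader.load(url, referrer);
    } catch {
      // fall through to the next configured loader
    }
  }
  throw new Error(`No loader could handle ${url.href}`);
}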

This would absolve the implementation of a lot of potentially complicated responsibilities, and even allow loaders to make security decisions about the content they are attempting to load (like a browser would if you tried to make a cross-origin request).

A simple approach like this could support just about any use case, and could be universal enough to enable parity between the Deno compiler and a TypeScript language services plugin (seamless, accurate editor integration is rare for systems like this). It also wouldn't be limited to JavaScript transpilers or JSON processing; you could very easily compile just about anything into a JavaScript or TypeScript module. FlatBuffers definitions could be compiled to a TypeScript module at runtime, or C++, C, or even Rust source code could be compiled to Wasm and embedded into a JS/TS module at runtime. You could have a nearly completely language-agnostic platform, and give the developer the information to actually use it effectively.

@afinch7
Contributor

afinch7 commented Mar 3, 2019

This might require a preflight check of sorts to check for redirects (#1742), or give Deno control of fetching the module "entry point".

@kitsonk
Contributor Author

kitsonk commented Mar 5, 2019

You specified separating compilers by media type, but I don't think that extension (media type) is really a reliable way to do this

This is exactly how we do it today. Media type != extension.

(how would you handle no extension?)

Without a media type, that is insecure. We wouldn't want to allow a file like that to be processed... Files without extensions require a media type.

It might be much simpler to decide that user compilers should be trusted code and expect them to handle the retrieval of resources required to complete their tasks; thus the nomenclature "loader" would be more fitting.

We don't trust our own compiler. It does not do module resolution. That is up to privileged/Rust, as is appropriate.

There is nothing preventing someone from implementing a loader and eval'ing code today with the right Deno permissions. A public API for a userland compiler needs to follow the pattern of the built-in compiler.

@afinch7
Contributor

afinch7 commented Mar 5, 2019

Those are valid concerns. You want something that doesn't require any user setup or config, and my approach would require user configuration to work; thus it falls outside the Deno philosophy.

It will always be distributed as a single executable - and that executable will be sufficient software to run any deno program. Given a URL to a deno program, you should be able to execute it with nothing more than the 50 megabyte deno executable.

I think that pretty much settles what direction Deno should go, but I still have my concerns with the idea of an untrusted compiler. My main concern is: how can you in any way trust the code a compiler emits if you don't have full trust in the compiler as an end user?

In general, I think we are on completely different pages right now, so I want to do whatever is needed to get us onto the same page on this one.

@rdeforest

rdeforest commented Apr 21, 2019

[snip]

I still have my concerns with the idea of an untrusted compiler. My main concern is: how can you in any way trust the code a compiler emits if you don't have full trust in the compiler as an end user?

[snip]

Just a lurker here, but I think what @kitsonk means by trusted/untrusted isn't what you think. You're right that one can't delegate code generation without the risk that the output will do something unwanted. Ken Thompson addressed this famously in his "Reflections on Trusting Trust" presentation in 1984.

The objective of treating the compiler as untrusted is to limit the damage it can do. Isolating the compiler prevents it from (for example) invoking /bin/bash in hopes of exploiting that unrelated program. The isolation is a reduction in attack surface as part of a defense-in-depth strategy.

The reason it's acceptable to risk nefarious or broken compiler output is because there is no architectural way to avoid the risk. The risk has to be addressed at a different layer, such as via module signing, code review, webs of trust, insurance mechanisms, etc.

I hope my comment is helpful.

@oldrich-s

I propose using the service worker API to provide the compiler API:

#2676 (comment)

@kitsonk
Contributor Author

kitsonk commented Jul 23, 2019

Service Workers aren't really suitable for a public compiler API; Service Workers are a specific class of Web Workers anyway. The existing compiler is implemented as a web worker, and a specific class of web worker would, IMO, also be suitable for the public compiler API, as laid out above.

@brandonkal
Contributor

TypeScript media types and extensions should be disallowed

Strongly disagree with this. Put it behind a permission flag, but we should have the possibility to use the existing Babel ecosystem or other tools to preprocess TypeScript files.

@rsp
Contributor

rsp commented Jan 15, 2020

It could be useful to be able to use a custom compiler for TypeScript as well, like ttypescript or reflec-ts. I would generally avoid it for performance reasons, but it might be useful for some experiments, unless there is a bootstrapping problem, i.e. the main entry file that defines a custom TS compiler would itself have to be compiled by the built-in TS compiler, a chicken-and-egg problem.
Some things could be done with custom transformers if supported by #2089/#2927/#3442.

@JimLynchCodes

I don't know if I'm in the right place, but I would like to write ClojureScript and directly or indirectly run it through deno! 🙃 🙌

@kitsonk
Contributor Author

kitsonk commented Feb 10, 2020

If there is a JavaScript based Clojure compiler, then that would likely be possible to accomplish with this feature.

@bartlomieju bartlomieju modified the milestones: v1.0, future Feb 24, 2020
@Soremwar
Contributor

@kitsonk I assume this isn't a goal for 1.0, is it?

@lucacasonato
Member

No, as indicated by its "future" milestone.

@kitsonk
Contributor Author

kitsonk commented Dec 4, 2020

@GeoffreyBooth thanks for the input! This is a really slow-burn issue for Deno, but it would be great to align. I think "resolve" and "load" hooks would be great. @piscisaureus did some thinking and POCing around this semi-recently.

Deno currently doesn't need a way to determine format; effectively, a single media type resolves to a specific format, or it is expected that whatever transforms it can handle whatever variants of that media type. I will take a look at the PR thread!

@ghost

ghost commented Dec 4, 2020

@RDambrosio016 Trust me, I fully understand (I once looked at their 8kb+ block).
Even trying to parse plain JS requires a whole lot of code.
TS is still a relatively dynamic language, so it's not easy to type check.
I was using TSC as an example of a poorly performing "compiler" written in JS.

The key point is that I believe Deno should try to support more than just JS compilers, or at least make it easier to use something that isn't JS.

There are also plenty of other compilers that emit JS that I believe are not written in JS, for example dart2js and the ClojureScript compiler. (I may be wrong.)

@shadowtime2000

I guess this is kind of like require hooks in Node.js. I think another use would be template engines whose templates are compiled to JS functions, so you don't have to read the file and compile it yourself; you can just import it.

@auvipy

auvipy commented Sep 11, 2021

How far away are we from this happening?

@kitsonk
Contributor Author

kitsonk commented Sep 11, 2021

Quite a lot.

@mimbrown

mimbrown commented Mar 4, 2022

@kitsonk it looks like the module resolution API for your deno_graph module is pretty much there, no? I've used the module a little bit and it seems to work quite smoothly. Can that be reworked back into Deno?

@kitsonk
Contributor Author

kitsonk commented Mar 5, 2022

@mimbrown it is already part of Deno, as it is what is used to do module resolution, but directly as a Rust crate.

We are unlikely to expose it as a built-in API, because it is already available as a JavaScript/Wasm API.

@mimbrown

mimbrown commented Mar 5, 2022

Yes, I am aware; sorry, my comment was not at all clear. What I meant was: the createGraph function exposed by the deno_graph module has a set of options that allow for user-defined module resolution and loading. You're giving users hooks to override the default behavior. I'll copy the options interface here:

interface CreateGraphOptions {
  /**
   * A callback that is called with the URL string of the resource to be loaded
   * and a flag indicating if the module was required dynamically. The callback
   * should resolve with a `LoadResponse` or `undefined` if the module is not
   * found. If there are other errors encountered, a rejected promise should be
   * returned.
   *
   * @param specifier The URL string of the resource to be loaded and resolved
   * @param isDynamic A flag that indicates if the module was being loaded
   *   dynamically
   */
  load?(
    specifier: string,
    isDynamic: boolean,
  ): Promise<LoadResponse | undefined>;
  /** The type of graph to build. `"all"` includes all dependencies of the
   * roots. `"typesOnly"` skips any code only dependencies that do not impact
   * the types of the graph, and `"codeOnly"` only includes dependencies that
   * are runnable code. */
  kind?: "all" | "typesOnly" | "codeOnly";
  /** When identifying a `@jsxImportSource` pragma, what module name will be
   * appended to the import source. This defaults to `jsx-runtime`. */
  jsxImportSourceModule?: string;
  /** An optional callback that will be called with a URL string of the resource
   * to provide additional meta data about the resource to enrich the module
   * graph. */
  cacheInfo?(specifier: string): CacheInfo;
  /** An optional callback that allows the default resolution logic of the
   * module graph to be "overridden". This is intended to allow items like an
   * import map to be used with the module graph. The callback takes the string
   * of the module specifier from the referrer and the string URL of the
   * referrer. The callback then returns a fully qualified resolved URL string
   * specifier or an object which contains the URL string and the module kind.
   * If just the string is returned, the module kind is inferred to be ESM. */
  resolve?(specifier: string, referrer: string): string | ResolveResult;
  /** An optional callback that can allow custom logic of how type dependencies
   * of a module to be provided. This will be called if a module is being added
   * to the graph that is non-typed source code (e.g. JavaScript/JSX) and
   * allow resolution of a type only dependency for the module (e.g. `@types`
   * or a `.d.ts` file). */
  resolveTypes?(specifier: string): TypesDependency | undefined;
  /** An optional callback that returns `true` if the sub-resource integrity of
   * the provided specifier and content is valid, otherwise `false`. This allows
   * for items like lock files to be applied to the module graph. */
  check?(specifier: string, content: Uint8Array): boolean;
  /** An optional callback that returns the sub-resource integrity checksum for
   * a given set of content. */
  getChecksum?(content: Uint8Array): string;
  /** An optional string to be used when generating an error when the integrity
   * check of the module graph fails. */
  lockFilename?: string;
  /** An optional record of "injected" dependencies to the module graph. This
   * allows adding things like TypeScript's `"types"` values into the graph. */
  imports?: Record<string, string[]>;
}

The load, resolve, and resolveTypes hooks look like they were at least somewhat inspired by @GeoffreyBooth's comment above. You can tell me if I'm wrong. Anyway, this API seems like it would work almost as-is for a custom Deno loader. This is what I'm envisioning:

/** @file myLoader.ts */

// Illustrative, unpinned import: LoadResponse and ResolveResult are deno_graph's types.
import type { LoadResponse, ResolveResult } from "https://deno.land/x/deno_graph/mod.ts";

export function load(specifier: string, isDynamic: boolean): Promise<LoadResponse | undefined> {
  // custom logic.
}

export function resolve(specifier: string, referrer: string): string | ResolveResult {
  // custom logic.
}

Used as:

deno run --loader=./myLoader.ts ./myScript.ts

Deno would then look at the loader file and override the default behavior depending on the functions that are exposed. I think it's already been noted that Workers aren't actually a great solution to this, even though they seem like they would be at first glance. This seems to be the simplest solution I can see, and it's already working fine for the deno_graph module.
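
For reference, calling deno_graph's existing JS API with one of these hooks looks roughly like this (a sketch; the unpinned module URL and the root specifier are only illustrative):

// Rough sketch of createGraph with a custom resolve hook, per the
// CreateGraphOptions interface above. Pin a version of deno_graph in real use.
import { createGraph } from "https://deno.land/x/deno_graph/mod.ts";

const graph = await createGraph("https://deno.land/std/examples/welcome.ts", {
  // Trivial import-map-style override: rewrite bare "lib/" specifiers.
  resolve(specifier, referrer) {
    return specifier.startsWith("lib/")
      ? new URL(specifier.slice("lib/".length), "https://example.com/").href
      : new URL(specifier, referrer).href;
  },
});

console.log(JSON.stringify(graph, undefined, 2));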

@masx200

masx200 commented Jul 17, 2022

https://nodejs.org/dist/latest-v18.x/docs/api/esm.html#loaders

https://rollupjs.org/guide/en/#plugins-overview

@Leo-Mu

Leo-Mu commented Jul 28, 2022

I think ReScript will be the next big thing. Its syntax is very similar to Rust, many Rust developers like it, and it is backed by Facebook and moves forward with the React ecosystem.

@ejsmith

ejsmith commented Aug 4, 2022

@ry @kitsonk has there been any movement on this? If I wanted to create a language that compiles to JS and is then run like any other file in Deno, is there any hook I can currently use to do the transpile action on the fly, the way you do with TypeScript files?

@kitsonk
Contributor Author

kitsonk commented Aug 4, 2022

This continues to not be a priority. Even if there were a community-contributed solution, it might not get accepted. The issue is still open because it is a potentially desirable feature, but it is not "just around the corner".

The main reason is that we have really hardened the integration of TypeScript into Deno, and in a lot of ways we have federated how it is handled. We no longer have a straightforward pipeline of a simple "here is a TypeScript file, give me JavaScript". There are very good reasons for this, mostly to do with performance. In a lot of cases, the TypeScript compiler isn't even spun up and we don't use its emit any more, meaning its main purpose is to type check code.

We also now only emit code when it is needed, and emitted/transformed code is no longer cached. Back when we had a much clearer way of registering an extension and media type to a transpiler, this was feasible; now it would require implementing user APIs to make it work. There are now multiple paths that code can take through things and multiple points where this would need to be plugged in: deno_graph enforces all the content typing and dependency analysis, eszip is used to determine what goes into a deno compile bundle, we still have deno bundle which uses the swc bundler, there is deno_emit which handles the actual emit, and we still have an internal cache in the CLI that we manage. It is a fairly complex process to get a file off a web server (or read it locally) and get it to the point where it is JavaScript and ready to go into V8 to be executed.

We removed Deno.emit() and moved its functionality to a Wasm library under deno_emit. It still needs some work, but that feels like the more likely way forward for these types of things: exposing the "internals" of Deno to users via user-loadable modules.
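
A minimal sketch of calling it (this assumes the deno.land/x/emit module and its transpile export; the exact return shape may differ between versions):

// Sketch only: assumes deno.land/x/emit's `transpile` export, which returns a
// map of module specifier -> emitted JavaScript. Pin a version in real use.
import { transpile } from "https://deno.land/x/emit/mod.ts";

const root = new URL("./mod.ts", import.meta.url);
const result = await transpile(root);
console.log(result.get(root.href));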

In theory, it could be done now, in userland, without exposing anything. People can transpile their code to JavaScript, load the runtime code via data URLs or object URLs, and then dynamically import() it. Making that whole process easier might be a great community project and might help make the case that there is broad community interest in using Deno for this type of thing.
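
For example, a userland "compile and import" round trip that works today could look like this (toModule is a trivial stand-in for a real compiler, and the input path is hypothetical):

// Works in current Deno: produce JavaScript however you like, then import it
// through a data: URL. `toModule` stands in for a real compiler (EJS, etc.).
const toModule = (src: string) => `export default ${JSON.stringify(src)};`;

const source = await Deno.readTextFile("./template.ejs"); // hypothetical input
const js = toModule(source);
// btoa expects Latin-1; non-ASCII sources would need a different encoding step.
const dataUrl = "data:application/javascript;base64," + btoa(js);
const mod = await import(dataUrl);
console.log(mod.default);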

@ejsmith

ejsmith commented Aug 5, 2022

@kitsonk thank you for the very thorough response. :-) I'm looking to do something similar: EJS templates being transformed into JS and run. I'd like to avoid having a compilation step. I will look into your idea of using dynamic imports.

@masx200

masx200 commented Aug 12, 2022

@matthewp

@kitsonk I understand your position here; I just wanted to explain the difficulty of pulling this off in userland. There are really two ways you can do it, and both have bad tradeoffs that you don't really want. It's because a custom file type, like .foo, might be depended on by a .js/.ts file too.

That means it's not enough to just transpile the files that Deno doesn't understand. You have to transpile all of them in order to rewrite JS imports. The two ways I have seen this done are:

  1. Transpile the entire codebase ahead-of-time. This can be slow in large codebases. You might as well just run a bundler, as it's the same thing.
  2. Rewrite all files and bypass the module loader entirely. This is what Vite does. It allows you to load things lazily as you are creating your own module loader. But you have to rewrite all imports into non-imports.

(2) is problematic because it doesn't and can't match ESM semantics entirely. And since you only want to do this in dev mode, you are risking dev/prod differences.

tl;dr: you can't write a module loader in userland without bypassing Deno's module loader, and that's not something anyone really wants to do.

@mimbrown

@matthewp Just throwing another option in here, if you want a way to transpile any kind of file to JS on the fly when it gets imported. This is easy on a server, where the server code can do the transformation before serving the file. To achieve the same behavior for local imports, we can use an import map to map our local code folder to a server running on localhost, something like:

{
  "imports": {
    "./src/": "http://localhost:8000/"
  }
}

Now we can have a server sitting between import whatever from "./some-custom-file.custom-ext" and the yet-to-be-transpiled code on our local machine. I did a POC for myself using the Svelte compiler. I don't know the performance implications of using a localhost server, and the caching isn't optimal (which is why I filed #15509), but it works.
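
A minimal sketch of such a localhost transform server (the ./src path layout and the transform step are placeholders for whatever compiler gets plugged in):

// Serves files from ./src, transformed to JavaScript on the fly. The transform
// here is a trivial placeholder; a real setup would invoke the Svelte/EJS/etc.
// compiler instead.
const transform = (src: string) => `export default ${JSON.stringify(src)};`;

Deno.serve({ port: 8000 }, async (req) => {
  const path = new URL(req.url).pathname; // e.g. /some-custom-file.custom-ext
  const source = await Deno.readTextFile(`./src${path}`);
  return new Response(transform(source), {
    headers: { "content-type": "application/javascript" },
  });
});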

@reggi

reggi commented Oct 25, 2022

@mimbrown yeah, I had the same idea; I also outlined it here: https://dev.to/reggi/proposal-the-as-ts-language-server-52in

@reggi

reggi commented Oct 26, 2022

Today Vercel / Next.js dropped Turbopack, a Rust-based transpiler/bundler designed to be a successor to webpack. Curious if the Deno team has any interest in using this natively in Deno in the future, given it is built in Rust.

@symful

symful commented Jan 30, 2023

@mimbrown Something like this? (Please check the examples; it doesn't have a proper readme :P)
