Add dynamic export name resolving to Synthetic Modules #52333

Open
Danielku15 opened this issue Apr 2, 2024 · 5 comments
Labels
feature request Issues that request new features to be added to Node.js.

Comments

@Danielku15

What is the problem this feature will solve?

I want to dynamically resolve and create values for the individual imports (named and default) resolved by a module. The SyntheticModule (and related spec) expects that all named exports are registered in advance and then resolved from a fixed map.

There is no way to react dynamically to the imports of a module and return values on demand.

When using CommonJS and vm.runInNewContext, we can supply a new Proxy with a get accessor to inject a fully dynamic module. For ESM code, we would first need to parse the code to find out which named imports might occur before we can generate a SyntheticModule.

This makes it impossible to port code like this to ESM:
https://github.com/microsoft/vscode-extension-test-runner/blob/4554fc9a5b5633a944eb8bbc21ab5e9cc4b1018d/src/extract/evaluate.ts#L30-L125
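
For comparison, the CommonJS approach described above can be sketched roughly as follows. This is a minimal illustration and not the code of the linked extension: `stubRequire`, `accessed`, and the evaluated snippet are hypothetical names made up for the example.

```javascript
const vm = require("node:vm");

// Records every property that the evaluated code reads from a stubbed module.
const accessed = [];

// Hypothetical replacement for require: every module is a Proxy whose
// properties are created on demand, so no export list is needed up front.
function stubRequire(specifier) {
  return new Proxy({}, {
    get(_target, property) {
      accessed.push(`${specifier}.${String(property)}`);
      return () => `stub:${String(property)}`;
    },
  });
}

const code = `
  const { describe, it } = require("mocha");
  result = describe("suite") + "," + it("case");
`;

const sandbox = { require: stubRequire, result: null };
vm.runInNewContext(code, sandbox);

console.log(sandbox.result); // "stub:describe,stub:it"
console.log(accessed);       // [ 'mocha.describe', 'mocha.it' ]
```

The stub never needs to know which names the code will ask for, which is exactly what the fixed export list of a SyntheticModule prevents.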

What is the feature you are proposing to solve the problem?

A Node-specific vm.DynamicSyntheticModule could be added which allows registering callbacks to generate the default or named exports on-the-fly.

Whether the returned value is cached, or Node calls the callback every time a value is resolved, should be configurable.

class Test { }

// vm.DynamicSyntheticModule and vm.UnknownExportError are the proposed APIs; neither exists in Node.js today.
const module = new vm.DynamicSyntheticModule(name => {
    if (name === null) { // import * as m from 'module'; new m.MyClass()
        return {
            MyClass: Test
        };
    } else if (name === undefined) { // import MyClass from 'module'; new MyClass();
        return Test;
    } else if (name === 'MyClass') { // import { MyClass } from 'module'; new MyClass();
        return Test;
    } else {
        // Some known error type to trigger the SyntaxError message; null and undefined are valid return values.
        throw new vm.UnknownExportError(name);
    }
});
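
To make the caching setting concrete, here is a small sketch in plain JavaScript of the two behaviours. It does not touch the module system at all; `createExportResolver` and its `cache` option are hypothetical names chosen for this illustration.

```javascript
// Wraps a resolve callback. With cache enabled, each export name is resolved
// once and the value is reused; otherwise the callback runs on every lookup.
function createExportResolver(resolve, { cache = true } = {}) {
  const resolved = new Map();
  return function getExport(name) {
    if (cache && resolved.has(name)) {
      return resolved.get(name);
    }
    const value = resolve(name);
    if (cache) {
      resolved.set(name, value);
    }
    return value;
  };
}

let calls = 0;
const resolveExport = (name) => `${name}#${++calls}`;

const cached = createExportResolver(resolveExport, { cache: true });
const uncached = createExportResolver(resolveExport, { cache: false });

console.log(cached("MyClass"), cached("MyClass"));     // MyClass#1 MyClass#1
console.log(uncached("MyClass"), uncached("MyClass")); // MyClass#2 MyClass#3
```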

This is somewhat similar to what module.link offers: there we can create modules dynamically based on an identifier.

What alternatives have you considered?

I'm not aware of any proper alternative. Currently I need to stick to CJS for my use cases. But as CJS does not support top-level await, and the ecosystem is transitioning away from CJS, incompatibilities are growing.

As I'm evaluating single files and replacing all modules, I could try parsing the AST in advance and collecting the imported names per module. But this is very costly.
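
As a rough idea of what collecting the imported names per module could look like, here is a sketch that uses a regular expression instead of a real parser. It is an approximation only: it handles simple static import statements and will be fooled by comments, strings, or unusual formatting. `collectImportedNames` is a hypothetical helper name.

```javascript
// Maps each module specifier to the set of names imported from it.
// "default" marks a default import and "*" marks a namespace import.
function collectImportedNames(source) {
  const bySpecifier = new Map();
  const importRe = /import\s+([^'"]+?)\s+from\s+(['"])(.*?)\2/g;
  for (const [, clause, , specifier] of source.matchAll(importRe)) {
    const names = bySpecifier.get(specifier) ?? new Set();
    // Named imports: { a, b as c } contributes "a" and "b".
    const braces = clause.match(/\{([^}]*)\}/);
    if (braces) {
      for (const part of braces[1].split(",")) {
        const imported = part.trim().split(/\s+as\s+/)[0];
        if (imported) names.add(imported);
      }
    }
    // Whatever remains outside the braces is a default or namespace import.
    const rest = clause.replace(/\{[^}]*\}/, "").split(",");
    for (const part of rest.map((s) => s.trim()).filter(Boolean)) {
      names.add(part.startsWith("*") ? "*" : "default");
    }
    bySpecifier.set(specifier, names);
  }
  return bySpecifier;
}

const imports = collectImportedNames(`
import {a, b, c} from 'x';
import { d } from 'x';
import * as e from 'y';
`);
console.log(Object.fromEntries([...imports].map(([k, v]) => [k, [...v]])));
```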

@Danielku15 Danielku15 added the feature request Issues that request new features to be added to Node.js. label Apr 2, 2024
@Jamesernator

The SyntheticModule (and related spec)

Yes this is how the specification works for JS. Node doesn't have the ability to change this at their whim, in fact there even was a proposal for what you're suggesting but it ultimately wound up going nowhere.

As I'm evaluating single files and replacing all modules, I could try parsing the AST in advance and collecting the imported names per module.

Node already internally does this for ESM importing CJS. While the internal module isn't exposed, you can install the library from npm anyway.

@Danielku15
Author

Yes this is how the specification works for JS. Node doesn't have the ability to change this at their whim, in fact there even was a proposal for what you're suggesting but it ultimately wound up going nowhere.

Interesting link; unfortunately for me, my need never made it through. I've read the specs for vm.SyntheticModule and was trying to point out the difference between what the current module system offers and what I'd need. That's where my idea originates: I saw how simple the SyntheticModule implementation is with its map, and thought it should be fairly simple to resolve exports on-the-fly (with optional caching?) in a new module implementation (besides the official standard ones).

The complexity of making Node understand a new custom Module class seems fairly high. I peeked into the code, and the coupling to the two built-in variants is quite high too.

Node already internally does this for ESM importing CJS. While the internal module isn't exposed, you can install the library from npm anyway.

Sounds a bit like the reverse case of mine:

  • Node translates CJS to ESM when imported
  • I'm (currently) transpiling TypeScript or ESM-JS to CJS, which has fewer features (e.g. no top-level await).
  • And even if I parse the ESM myself, collect exports and then evaluate, Node will need to parse the ESM code again when passing it to vm.SourceTextModule

Background info: I'm working on a VS Code extension for Mocha and need to evaluate all the test files to collect the unit tests they contain. I have an AST-based extraction, but this has its own limitations.

@Jamesernator

I'm still confused about what the goal is here. If you want to run actual ESM, you might as well just use vm.SourceTextModule. If you need to know what the exports are prior to evaluation, .namespace already exposes them immediately after linking:

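import vm from "node:vm"; // requires --experimental-vm-modules

const mod = new vm.SourceTextModule(`
    export const foo = "foo";
    export const bar = "bar";
`);

// yourLinker is your own linker callback; this module has no imports, so it is never invoked.
await mod.link(yourLinker);
// Reflect.ownKeys lists the export names without reading the still-uninitialized bindings.
// prints: ["bar", "foo"] (namespace keys are sorted)
console.log(Reflect.ownKeys(mod.namespace).filter((key) => typeof key === "string"));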

And even if I parse the ESM myself, collect exports and then evaluate, Node will need to parse the ESM code again when passing it to vm.SourceTextModule

If the goal here is to replace just some modules, then you can just use sourceTextModule.createCachedData() and re-use that cache data to reconstruct all of the graph except whatever few modules need replacement.

If CJS is what is actually used as output for the tests (by way of transpiling via TS), then I'd just suggest using require as that is what will actually be run in Node anyway.

@Danielku15
Author

Danielku15 commented Apr 4, 2024

The goal is to dynamically create exports based on what another module imports.

I imagine this:

// vm.DynamicModule is the hypothetical API being proposed; it does not exist.
async function evaluate(code) {
  const mod = new vm.SourceTextModule(code, { context: contextifiedObj });
  await mod.link(async (specifier, ref) => {
    return new vm.DynamicModule(async importSpecifier => {
      if (importSpecifier === 'a') { return 42; }
      if (importSpecifier === undefined) { return { a: 13, b: 'bye' }; }
      return 'hello';
    });
  });
  await mod.evaluate();
}
evaluate(`
import {a, b, c} from 'x';
import { d } from 'x';
import * as e from 'y';

console.log(a, b, c, d, e);
// 42, 'hello', 'hello', 'hello', { a: 13, b: 'bye' }
`)

When evaluating, I start from a string of source code and cannot know in advance what it will import. I want to generate each export dynamically based on the import specifier.

In my actual use case, every dependency besides a few functions is a Proxy object, which again provides values dynamically.

In the CJS world, I just define that the require function returns Proxy objects, and all modules become fully dynamic at runtime.

The goal is to evaluate a given piece of code standalone in a sandbox with all dependencies stubbed.

The problem: I don't know in advance which exports are imported without parsing the code myself and collecting them. Hence I want to react to the import specifiers in a callback and create the values dynamically.

@Jamesernator

Yes, you can't do this without the rejected dynamic modules proposal or just parsing.

Looking at your use case more carefully, it seems like in the vast majority of cases you could get away with literally just finding all valid identifiers† in the file and generating mocks for those.

i.e. Just do this:

import vm from "node:vm"; // requires --experimental-vm-modules

// Unanchored and global, so matchAll can find every identifier in the text
const IDENTIFIER_NAME = /[$_\p{ID_Start}][$\u200C\u200D\p{ID_Continue}]*/gu;

const sourceText = `
    import { a, b } from "bar";
    import { c } from "baz";
    import * as bizz from "buzz";
`;

const identifiers = new Set();
// Yes this captures things like keywords and other identifiers, but it really doesn't matter as those imports just won't be used by the module
for (const [identifier] of sourceText.matchAll(IDENTIFIER_NAME)) {
    identifiers.add(identifier);
}

// mockThing stands for whatever stub value the imports should resolve to
const syntheticModule = new vm.SyntheticModule([...identifiers], () => {
    for (const identifier of identifiers) {
        syntheticModule.setExport(identifier, mockThing);
    }
});

† Technically imports/exports could be strings as well e.g. import { "foo-bar" as baz } from "mod", so if you wanted to be really defensive you'd also find all string literals.
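
A defensive variant of the identifier scan that also picks up string literals might look like the sketch below. The helper name `collectCandidateExportNames` is hypothetical, template literals are ignored, and the string contents are taken raw, without processing escape sequences.

```javascript
// Identifier-like tokens anywhere in the source text.
const IDENTIFIER = /[$_\p{ID_Start}][$\u200C\u200D\p{ID_Continue}]*/gu;
// Single- or double-quoted string literals on one line.
const STRING_LITERAL = /(["'])((?:\\.|(?!\1)[^\\\n])*)\1/g;

// Returns every identifier and every string literal body as a candidate export name.
function collectCandidateExportNames(sourceText) {
  const names = new Set();
  for (const [identifier] of sourceText.matchAll(IDENTIFIER)) {
    names.add(identifier);
  }
  for (const [, , content] of sourceText.matchAll(STRING_LITERAL)) {
    names.add(content);
  }
  return names;
}

const names = collectCandidateExportNames(
  `import { "foo-bar" as baz } from "mod";`
);
console.log(names.has("foo-bar"), names.has("baz")); // true true
```

The resulting set over-approximates the real import names, which is harmless here because unused exports on the synthetic module are simply never read.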

Projects
Status: Pending Triage

2 participants