-
Notifications
You must be signed in to change notification settings - Fork 29.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vm: add dynamic import callback and streamline initialize import.meta callback to match it #19717
Conversation
1dc6a0b
to
7f3823f
Compare
doc/api/vm.md
Outdated
@@ -437,6 +437,9 @@ changes: | |||
`cachedData` property of the returned `vm.Script` instance. | |||
The `cachedDataProduced` value will be set to either `true` or `false` | |||
depending on whether code cache data is produced successfully. | |||
* `linker` {function} See [`module.link()`][] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: {function}
-> {Function}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sidenote: maybe we can add a lint rule for this 😓
doc/api/vm.md
Outdated
@@ -891,6 +894,7 @@ associating it with the `sandbox` object is what this document refers to as | |||
[`script.runInContext()`]: #vm_script_runincontext_contextifiedsandbox_options | |||
[`script.runInThisContext()`]: #vm_script_runinthiscontext_options | |||
[`url.origin`]: https://nodejs.org/api/url.html#url_url_origin | |||
[`module.link()`]: #vm_module_link_linker |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of ABC-order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing else here follows ABC order so i have no idea where to put it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, it seems so(
Feel free to ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I am wrong, there is an ASCII order:
`
+ Uppercase letter`
+ lowercase letter- Uppercase letter
- lowercase letter
So this goes after [`eval()`]:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
impressive detective work, thank you very much
doc/api/vm.md
Outdated
@@ -437,6 +437,9 @@ changes: | |||
`cachedData` property of the returned `vm.Script` instance. | |||
The `cachedDataProduced` value will be set to either `true` or `false` | |||
depending on whether code cache data is produced successfully. | |||
* `linker` {function} See [`module.link()`][] | |||
* `specifier` {string} | |||
* `scriptOrModule` {Script | Module} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both Script
and Module
need to be added in customTypesMap
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am not mistaken, in this way, in the end of the object literal, after the url
module types:
- 'URLSearchParams': 'url.html#url_class_urlsearchparams'
+ 'URLSearchParams': 'url.html#url_class_urlsearchparams',
+
+ 'Module': 'vm.html#vm_class_vm_module',
+ 'Script': 'vm.html#vm_class_vm_script'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be great to see this in. Just a few questions and comments on naming would be nice to clarify.
lib/internal/errors.js
Outdated
@@ -763,6 +763,8 @@ E('ERR_HTTP_INVALID_HEADER_VALUE', | |||
E('ERR_HTTP_INVALID_STATUS_CODE', 'Invalid status code: %s', RangeError); | |||
E('ERR_HTTP_TRAILER_INVALID', | |||
'Trailers are invalid with this transfer encoding', Error); | |||
E('ERR_IMPORT_MODULE_DYNAMICALLY_CALLBACK_MISSING', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps just ERR_DYNAMIC_IMPORT_CALLBACK_MISSING
?
lib/internal/modules/cjs/loader.js
Outdated
@@ -646,7 +646,11 @@ Module.prototype._compile = function(content, filename) { | |||
var compiledWrapper = vm.runInThisContext(wrapper, { | |||
filename: filename, | |||
lineOffset: 0, | |||
displayErrors: true | |||
displayErrors: true, | |||
importModuleDynamically: experimentalModules ? async (specifier) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, perhaps just dynamicImport
or resolveDynamicImport
?
lib/repl.js
Outdated
@@ -249,7 +250,15 @@ function REPLServer(prompt, | |||
} | |||
script = vm.createScript(code, { | |||
filename: file, | |||
displayErrors: true | |||
displayErrors: true, | |||
async linker(specifier, scriptOrModule) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will only be called for top-level dynamic imports right? If so, won't scriptOrModule
always be the top-level script
above?
It looks like this is exactly the same sort of signature as importModuleDynamically
in the vm.runInThisContext
case? If so, why not give it the same name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linkers are always inherited if the child import doesn't have one. as for the signature, importModuleDynamically
above was the old name, i forgot to change it, and so it appears i'm missing a test :p
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linkers are always inherited if the child import doesn't have one
So if I return a custom VM module instance, that does not have its own dynamic import hook, then this linker will be applied? But it will still only apply to dynamic import?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I return a custom VM module instance, that does not have its own dynamic import hook, then this linker will be applied?
yep
But it will still only apply to dynamic import?
vm.Module uses the linker for static and dynamic import requests, the only difference is when the function is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean to say that this same function is run for resolving the static import dependencies of dynamic import modules? But then it should take a Module object as the return value not a Namespace surely?
Also this code should be altered to return module records that aren't yet executed, as it will break circular references running an explicit loader.import
here.
Please tell me I'm misinterpreting something here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vm.Module uses the linker for static and dynamic import requests, the only difference is when the function is called.
Do you mean to say that this same function is run for resolving the static import dependencies of dynamic import modules? But then it should take a Module object as the return value not a Namespace surely?
As far as I can tell neither of the above statements apply to this vm.createScript
linker
API. Please let me know if I am wrong, but it sees that the linker
is acting only as a dynamic import hook, returning the executed namespace object.
As such, perhaps just call it resolveDynamicImport
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i can call it resolveDynamicImport
but i would like it to accept objects and vm.Module instances as return values. as to circulars, i'm not sure what you mean. we already handle dynamic import with loader.import(), this doesn't really change that
86f4ee9
to
cddc07e
Compare
lib/vm.js
Outdated
const linker = options.linker; | ||
const { callLinkerForNamespace } = require('internal/vm/module'); | ||
options.importModuleDynamically = async (specifier) => { | ||
return callLinkerForNamespace(linker, specifier, this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When linker
is undefined, it seems like dynamic import is still throwing here.
It could be worthwhile still defaulting to the default Node loader here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only way to get this far is if options.linker
is a function, otherwise yes import() rejects with an error about a missing linker. i think defaulting to the process loader would be confusing/unexpected behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for clarifying, of course you've defined the error for this case as well - looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good except changing to a multimap
src/env.h
Outdated
}; | ||
|
||
std::unordered_map<int, loader::ModuleWrap*> id_to_module_wrap_map; | ||
std::unordered_map< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't this have collisions? We should probably ensure that even if there is a collision we can have the duplicates properly stored. Perhaps use a multimap
here?
src/env.h
Outdated
|
||
std::unordered_map<int, loader::ModuleWrap*> id_to_module_wrap_map; | ||
std::unordered_map< | ||
v8::Global<v8::Module>, loader::ModuleWrap*, ModuleGlobalHash> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above, multimap
to handle collisions.
src/node_contextify.h
Outdated
@@ -9,6 +9,8 @@ | |||
namespace node { | |||
namespace contextify { | |||
|
|||
enum SourceType { kScript = 6633, kModule }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to stay out of the way of other users of v8's API, fwiw it's "node" on a phone
67be43b
to
992bbf3
Compare
f68dbc0
to
8761724
Compare
54adddc
to
10ae336
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see things progressing here.
.loaderPromise; | ||
return loader.import(specifier, url); | ||
}, | ||
}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice not to have to set these up for every module if we can just have a single general function that handles these as we do currently.... is it not possible to maintain the previous fallback mechanism while distinguishing contexts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fallback method leaves too much at risk imo. if a module wrap is nullptr for some reason I would rather a dynamic import request failed than went to the context loader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively since this object is the same for all ES modules, can we define this as a constant and pass the same object each time?
That does require receiving a second argument in importModuleDynamically to get the referrer though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that won't really help since ModuleWrap picks these off anyway?
What I'm getting at is that any part of the module pipeline where we can reduce the amount of code running, we should, in the name of supporting fast loading of 1000s of modules. While we're setting this up now, it's worth thinking about, as refactoring of these types of interactions gets trickier as the code gets more complex.
It would just be nice to have a more streamlined approach than defining closures per-module, at least so far as there isn't any technical reason we can't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when I have time I'll set up some benchmarks for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance work will definitely be a great thing to start working on! But note that once you set up these closures, likely no one will ever change this to something simpler. The best time to write simple code is when you first write it, and conceptual simplicity itself is an important goal - performance aside too.
I'm not blocking anything on this, just commenting here.
doc/api/vm.md
Outdated
@@ -437,6 +437,9 @@ changes: | |||
`cachedData` property of the returned `vm.Script` instance. | |||
The `cachedDataProduced` value will be set to either `true` or `false` | |||
depending on whether code cache data is produced successfully. | |||
* `resolveDynamicImport` {Function} See [`module.link()`][] | |||
* `specifier` {string} | |||
* `scriptOrModule` {Script | Module} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the second argument here is necessary for these use cases if it will always be the top-level script? (I know we're going in circles around this somewhat, but helps to clarify!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it's quite likely most people won't assign a handler to each individual script or module instance so they will depend on it being inherited. I can leave a note explaining this behaviour if you think that would help people.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then perhaps just document it as parentScript
as this format of the function is always a script? Alternatively might it make sense to pass this as a this
binding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parentScriptOrModule works I guess, I'll change it in a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
Tip: for easier reviews: https://github.com/nodejs/node/pull/19717/files?w=1 |
Thanks a lot for working on this! Some high level feedback:
|
@jkrems if i make it per-context we come back to the issue with interacting with node's builtin loader, i don't want to have any indirect method of handing off a dynamic import request to node's loader because of how this presents an escape from the vm in an unexpected way. i'm also not sure what |
@devsnek This looks to mostly affect and be tested with
So blocking or redirecting imports seems to not really add anything - other than a confusing and/or polluted module map in that context (does anything track that the same resolved URL gets loaded only once?). The interesting bit would be both handling and ensuring a consistent module map for other contexts/realms. To be honest: I'm having some problems following what exactly this PR is aiming for which makes it hard to review. Maybe adding some high level notes on the overall idea might be helpful. :) |
|
@jkrems that seems oddly specific, seemlingly akin to vm.Script having |
const referrer = scriptOrModule.url || | ||
getURLFromFilePath(filename === 'repl' ? | ||
path.join(process.cwd(), filename) : filename).href; | ||
return loader.import(specifier, referrer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come this can work without any isNamespace
checks then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't work, it's wrong and needs a test :D
Yeah, so thinking about this a little further, I would feel much more comfortable not treating the Supporting a namespace return value seems necessary if we are to be able to delegate the dynamic import to other loaders regardless, so while I know the x.mjs new vm.Script({
resolveDynamicImport (specifier, referrer) {
// eg reference own contextual import
return import(specifier);
}
}); |
That would prevent cycles. |
@bmeck you can't have cycles between vm.Script and vm.Module as vm.Script can't be imported itself. |
I'm happy to accept namespaces, we just have to wait a bit more |
@guybedford my concern was around it matching constraints and capabilities of |
@bmeck the only advantage is a separation of concerns, and simpler mental model for script dynamic import linking in not needing to understand the details of I can also get behind renaming back to It's only worth waiting if we think it's a nicer API to wait for. Must admit I'm somewhat on the fence, but in such a case perhaps momentum of (A) should win. |
we're almost there 😄 #20016 |
6eb4a61
to
44919ba
Compare
@guybedford this accepts module namespaces now 🎉 |
doc/api/errors.md
Outdated
@@ -1023,6 +1023,12 @@ made to mark a stream and dependent of itself. | |||
`http2.connect()` was passed a URL that uses any protocol other than `http:` or | |||
`https:`. | |||
|
|||
<a id="ERR_LINKER_CALLBACK_MISSING"></a> | |||
### ERR_LINKER_CALLBACK_MISSING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ERR_DYNAMIC_IMPORT_LINKER_MISSING
or something similar.
doc/api/vm.md
Outdated
@@ -437,6 +437,9 @@ changes: | |||
`cachedData` property of the returned `vm.Script` instance. | |||
The `cachedDataProduced` value will be set to either `true` or `false` | |||
depending on whether code cache data is produced successfully. | |||
* `resolveDynamicImport` {Function} See [`module.link()`][] | |||
* `specifier` {string} | |||
* `scriptOrModule` {vm.Script | vm.Module} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove space around |
.
self note: repl needs a test, not quite sure how to yet though |
44919ba
to
c804a2e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing work getting the v8 integration necessary here, I thought that would have taken months!
I've just done an initial review for now, will provide further feedback soon.
const { | ||
initializeImportMetaMap, | ||
importModuleDynamicallyMap | ||
} = require('internal/process/esm_loader'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an issue with circular references or something? Surely it's already in the graph at this point that it can be hoisted?
} = require('internal/process/esm_loader'); | ||
initializeImportMetaMap.set(module, (meta) => { meta.url = url; }); | ||
importModuleDynamicallyMap.set(module, async (specifier) => { | ||
const loader = await require('internal/process/esm_loader') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should share the same binding of esm_loader as above.
doc/api/vm.md
Outdated
@@ -437,6 +437,9 @@ changes: | |||
`cachedData` property of the returned `vm.Script` instance. | |||
The `cachedDataProduced` value will be set to either `true` or `false` | |||
depending on whether code cache data is produced successfully. | |||
* `resolveDynamicImport` {Function} See [`module.link()`][] | |||
* `specifier` {string} | |||
* `scriptOrModule` {vm.Script|vm.Module} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably note the return value expected here - vm.Module | ModuleNamespace
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of interest, what is the default context for a vm.Module
instance? I'm just wondering if a module were to be a thenable to a vm.Module
instance if that would enable the sandbox code to break out into the non-sandbox context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default context is the context it is created in, i'm not sure how one escapes via thenables though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither am I, but it's an interesting thought experiment.
So the context is set on vm.Module
construction, not on instantiate or link etc? That seems to be what avoids a possible escalation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default context is the context it is created in
Minor correction: it’s always the top context.
So the context is set on
vm.Module
construction, not on instantiate or link etc? That seems to be what avoids a possible escalation.
That, and that there seems to be no reason to allow mixing of contexts in a module graph anyway.
return getWrapFromModule(m, scriptOrModule, linker); | ||
} | ||
|
||
async function callLinkerForNamespace(linker, specifier, scriptOrModule) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if returning a vm.Module instance that has not been linked (and that has dependencies)? Can we have a test for this error case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is already tested by testScript
702cc14
to
db39f07
Compare
I’d still like to take a more in-depth look at this. I would appreciate it if everyone could hold off on landing this for now. |
if (specifier === 'vm://Y') { | ||
assert.strictEqual(scriptOrModule, moduleX); | ||
return new vm.Module('export default 5', { url: 'vm://Y' }); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@devsnek neither of the above sources have static imports, so I don't think this covers the unlinked test case I described?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will also inherit the resolveDynamicImport function for that which makes me think we need to change the name again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely resolveDynamicImport
should never be inherited from Script to Module, only the linker should be inherited from Module to Module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the hang-up we are having is differentiating scripts and modules too much and differentiating dynamic linking and static linking too much. if we just have "linker" and document that it handles all the things I think it should be fine
6064fd1
to
3c19204
Compare
3c19204
to
eee5c1f
Compare
i'm going to try and reapproach this in a few days, i'll close this for now |
Fixes #19570
Fixes #19363
Commits:
src: flatten ContextifyScript
src, vm: implement scoped import.meta and dynamic import
repl: enable dynamic import
This might be a little rough around the edges particularly with the c++ changes, i wasn't quite sure about the difference between the two map types except for that one had the methods i wanted 😓
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes