support dynamic requires #2310

Open
wants to merge 13 commits into
base: master

Conversation

boneskull
Collaborator

@boneskull boneskull commented Jun 7, 2024

Description

  • This PR adds support for dynamic requires via loadFromMap() and link() (in import-lite.js and link.js, respectively). importLocation()'s signature has also been modified to support this.

    To use this feature in either function, the following must be true:

    1. The moduleTransforms option (in the appropriate options parameter) must not be present. These are asynchronous module transforms, which cannot be used by dynamic require.
    2. The ReadPowers param must be a proper ReadPowers object (not just a ReadFn) and must contain the new readSync and isAbsolute functions, as well as filePathFromURL.
    3. The PackagePolicy of the CompartmentDescriptor must have a dynamic: true flag (see below)

    If all of the above are true, then a compartment will be allowed to dynamically require something. If that thing still cannot be found in the compartment map, the sync "fallback" exit hook (see next item) will be executed.

  • The new importNowHook property can be provided via options; it is a synchronous exit module import hook.

  • ReadPowers.readSync() is necessary because loading a module synchronously requires reading its file synchronously.

  • ReadPowers.isAbsolute() is necessary to determine if the module specifier of a dynamic require is absolute. If it is, it could be just about anything, and must be loaded via the user-provided importNowHook (the sync exit module import hook).

    Note: It's possible to do more work here to check if the module specifier belongs to any known compartment to avoid using the fallback--this would mean storing absolute paths somewhere when traversing node modules, then, at import time, determining if the module specifier refers to a child path of some compartment's absolute path--but IMO this is a good first pass.

  • As an alternative to moduleTransforms, synchronous module transforms may be provided via the new syncModuleTransforms object. In a non-dynamic-require use-case, if present, syncModuleTransforms are combined with the moduleTransforms option; all sync module transforms are module transforms, but not all module transforms are sync module transforms.

  • All builtin parsers are now synchronous. User-defined parsers can be async, but this will disable dynamic require support.

  • @endo/evasive-transform now exports evadeCensorSync() in addition to evadeCensor(). This is possible because I've swapped the async-only source-map with source-map-js, which is a fork of the former before it went async-only. source-map-js claims comparable performance.

  • PackagePolicy now allows a dynamic flag. Piggybacking on options was considered, but the content of options is intended to be unknown to Endo. Thus, we needed a new property.
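
To make the preconditions from the first item and the new policy flag concrete, here is a rough sketch. It is not the code in this PR: `canDynamicallyRequire` is a hypothetical helper, the package names are invented, and the `resources` shape of the policy is assumed from existing Endo policies. The power name `filePathFromURL` is copied from the description above.

```javascript
// Hypothetical policy fragment: package names are invented, and the
// resources -> package policy shape is an assumption.
const policy = {
  resources: {
    'some-native-addon': {
      packages: { 'node-gyp-build': true },
      dynamic: true, // the new flag on PackagePolicy
    },
  },
};

// Hypothetical helper mirroring the three preconditions listed above.
const canDynamicallyRequire = ({ options = {}, readPowers, packagePolicy }) => {
  // 1. Async module transforms rule out dynamic require.
  if (options.moduleTransforms !== undefined) {
    return false;
  }
  // 2. A bare ReadFn is not enough; the sync powers must all be present.
  if (typeof readPowers !== 'object' || readPowers === null) {
    return false;
  }
  const required = ['readSync', 'isAbsolute', 'filePathFromURL'];
  if (!required.every(name => typeof readPowers[name] === 'function')) {
    return false;
  }
  // 3. The package policy must opt in explicitly.
  return Boolean(packagePolicy) && packagePolicy.dynamic === true;
};
```

Under these assumptions, a compartment whose policy lacks `dynamic: true` is refused even when the read powers are complete.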

Security Considerations

Dynamically requiring an exit module (e.g., a Node.js builtin) requires a user-defined hook, which has the same security considerations as a user-defined exit module hook.

Swapping out a dependency (source-map-js for source-map) incurs risk.

Scaling Considerations

n/a

Documentation Considerations

Should be announced as a user-facing feature

Testing Considerations

I've added some fixtures and tested around the conditionals I've added, but am open to any suggestions for additional coverage.

Compatibility Considerations

This increases ecosystem compatibility considerably; use of dynamic require in the ecosystem is not rare.

For example, most packages which ship a native module will be using dynamic require, because the filepath of the build artifact is dependent upon the platform and architecture.
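
As an illustration of that pattern (a sketch only; the directory layout and the name `nativeBindingSpecifier` are assumptions, not taken from any particular package):

```javascript
// Hypothetical illustration of a platform-dependent require target.
const nativeBindingSpecifier = (platform, arch) =>
  `./prebuilds/${platform}-${arch}/binding.node`;

// A real package would then do something along the lines of:
//   module.exports = require(
//     nativeBindingSpecifier(process.platform, process.arch),
//   );
// The argument to require() is only known at runtime, so a static scan of
// the source cannot tell which file will be loaded.
```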

Upgrade Considerations

Dynamic imports cannot be used without providing a readSync() and isAbsolute() in the readPowers parameter. Both readSync() and isAbsolute() are now generated by makeReadPowersSloppy(), which means that any consumer using this method who previously expected a dynamic require to fail will now receive a different error (presumably due to the missing policy item).

To avoid this, I could create a separate function to provide a ReadPowers including readSync() and isAbsolute(), instead of changing makeReadPowersSloppy; please advise.

Otherwise, everything else should be backwards-compatible, as long as source-map-js does as it says on the tin.

Users of @endo/evasive-transform may note that native modules are neither downloaded nor compiled (due to the switch from source-map to source-map-js).

@boneskull boneskull self-assigned this Jun 18, 2024
@boneskull boneskull marked this pull request as ready for review June 18, 2024 20:57
@boneskull
Collaborator Author

boneskull commented Jun 18, 2024

If anyone can point me in the direction of why the tests are failing, that'd be helpful; otherwise I'll just plug at it. Plugged.

compartmentDescriptor.modules = moduleDescriptors;

let { policy } = compartmentDescriptor;
policy = policy || Object.create(null);
Collaborator Author

this is for typescript

Comment on lines +483 to +495
if ('packages' in policy && typeof policy.packages === 'object') {
  for (const [pkgName, policyItem] of entries(policy.packages)) {
    if (
      !(pkgName in compartmentDescriptor.modules) &&
      pkgName in compartmentDescriptor.scopes &&
      policyItem
    ) {
      compartmentDescriptor.modules[pkgName] =
        compartmentDescriptor.scopes[pkgName];
    }
  }
}
Collaborator Author

@boneskull boneskull Jun 18, 2024

I am unsure if this is correct, but it creates links between compartment descriptors based only on policy--where Endo would not have detected them otherwise.

Checking if dynamic: true is in policy at this point causes many tests to fail, because this ends up being a code path which many tests take. Why? Because the behavior is opt-out.

With regards to that, in addition to the policy, perhaps we should add a dynamic option to ImportLocationOptions or whathaveyou, so you can explicitly opt-in to using importNowHook. Right now, it's an opt-out based on the lack of a moduleTransforms (async module transforms) option and the shape of readPowers. It's not until later that we take the policy into account. Another way to put it: we're creating an importNowHook because a) we have the tools to do so, and b) something might dynamically require something else.

If we did that, I'd be able to delete some LoC, and we'd reduce the risk of introducing backwards-incompatible changes. Thoughts?

Comment on lines +507 to +568
// Collate candidate locations for the moduleSpecifier,
// to support Node.js conventions and similar.
const candidates = [moduleSpecifier];
for (const candidateSuffix of searchSuffixes) {
  candidates.push(`${moduleSpecifier}${candidateSuffix}`);
}

for (const candidateSpecifier of candidates) {
  const candidateModuleDescriptor = moduleDescriptors[candidateSpecifier];
  if (candidateModuleDescriptor !== undefined) {
    const { compartment: candidateCompartmentName = packageLocation } =
      candidateModuleDescriptor;
    const candidateCompartment = compartments[candidateCompartmentName];
    if (candidateCompartment === undefined) {
      throw Error(
        `compartment missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`,
      );
    }
    // modify compartmentMap to include this redirect
    const candidateCompartmentDescriptor =
      compartmentDescriptors[candidateCompartmentName];
    if (candidateCompartmentDescriptor === undefined) {
      throw Error(
        `compartmentDescriptor missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`,
      );
    }
    candidateCompartmentDescriptor.modules[moduleSpecifier] =
      candidateModuleDescriptor;
    // return a redirect
    /** @type {RedirectStaticModuleInterface} */
    const record = {
      specifier: candidateSpecifier,
      compartment: candidateCompartment,
    };
    return record;
  }

  // Using a specifier as a location.
  // This is not always valid.
  // But, for Node.js, when the specifier is relative and not a directory
  // name, they are usable as URL's.
  const moduleLocation = resolveLocation(
    candidateSpecifier,
    packageLocation,
  );
Collaborator Author

copypasta

Member

This is a lot of duplication. I think this is worth trying to bounce on a trampoline.

Collaborator Author

Yeah. This is a good idea, but is also unfortunate because it will take me longer to land this.
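
For readers unfamiliar with the term, one reading of the trampoline suggestion is sketched below: the shared logic is written once as a generator that yields each potentially asynchronous step, and a sync or async driver resumes it. All names here (`loadModule`, `syncTrampoline`, `asyncTrampoline`) are hypothetical and this is not the implementation in this PR; error forwarding via `iterator.throw()` is omitted for brevity.

```javascript
// Hypothetical sketch of a trampoline; none of these names come from this PR.
// The shared logic is written once and yields each I/O step.
function* loadModule(read, location) {
  const bytes = yield read(location);
  return { location, size: bytes.length };
}

// Synchronous driver: every yielded value is passed back as-is.
const syncTrampoline = (generatorFn, ...args) => {
  const iterator = generatorFn(...args);
  let step = iterator.next();
  while (!step.done) {
    step = iterator.next(step.value);
  }
  return step.value;
};

// Asynchronous driver: every yielded value is awaited before resuming.
const asyncTrampoline = async (generatorFn, ...args) => {
  const iterator = generatorFn(...args);
  let step = iterator.next();
  while (!step.done) {
    // eslint-disable-next-line no-await-in-loop
    step = iterator.next(await step.value);
  }
  return step.value;
};
```

With this shape, an import hook could call `asyncTrampoline(loadModule, read, location)` and an import-now hook could call `syncTrampoline(loadModule, readSync, location)`, so the candidate-search logic would exist in one place.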

Comment on lines +554 to +581
let moduleBytes;
try {
  moduleBytes = readSync(moduleLocation);
} catch (err) {
  if (err && err.code === 'ENOENT') {
    // might be an exit module. use the fallback `exitModuleImportNowHook` to import it
    // eslint-disable-next-line no-continue
    continue;
  }
  throw err;
}
Collaborator Author

how do we know something is an exit module? maybe we should check the compartment descriptor, and only use the fallback if the thing doesn't exist there?

Member

I have not arrived at a hard decision for this question. Falling through the bottom of the import hook should be enough, but might not be. An option we did not have when I first wrote this: we can identify “host modules” by the presence of a prefix: like node: or endo:. These would be guaranteed to escape the Node.js style mappings.

In the interest of keeping coupling to a specific specifier resolution strategy low, I am leaning heavily toward “pass through if nothing in the compartment map matches.”
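
The "pass through if nothing matches" option could look roughly like the following sketch. `makeImportNowWithFallback` and `lookup` are hypothetical stand-ins; only `exitModuleImportNowHook` is a name that appears in this PR.

```javascript
// Hypothetical sketch; `lookup` stands in for the compartment map search.
const makeImportNowWithFallback = (lookup, exitModuleImportNowHook) =>
  specifier => {
    const found = lookup(specifier);
    if (found !== undefined) {
      return found;
    }
    // Nothing in the compartment map matched, so treat the specifier as an
    // exit module without inspecting it for prefixes such as "node:".
    return exitModuleImportNowHook(specifier);
  };
```

The design choice is that the hook never interprets the specifier itself, which keeps it independent of any particular resolution strategy.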

Comment on lines +565 to +655
  if (moduleBytes !== undefined) {
    /** @type {string | undefined} */
    let sourceMap;
    // eslint-disable-next-line no-await-in-loop
    const envelope = parse(
      moduleBytes,
      candidateSpecifier,
      moduleLocation,
      packageLocation,
      {
        compartmentDescriptor,
        readPowers,
        sourceMapHook:
          sourceMapHook &&
          (nextSourceMapObject => {
            sourceMap = JSON.stringify(nextSourceMapObject);
          }),
      },
    );
    const {
      parser,
      bytes: transformedBytes,
      record: concreteRecord,
    } = envelope;

    // Facilitate a redirect if the returned record has a different
    // module specifier than the requested one.
    if (candidateSpecifier !== moduleSpecifier) {
      moduleDescriptors[moduleSpecifier] = {
        module: candidateSpecifier,
        compartment: packageLocation,
      };
    }
    /** @type {StaticModuleType} */
    const record = {
      record: concreteRecord,
      specifier: candidateSpecifier,
      importMeta: { url: moduleLocation },
    };

    let sha512;
    if (computeSha512 !== undefined) {
      sha512 = computeSha512(transformedBytes);

      if (sourceMapHook !== undefined && sourceMap !== undefined) {
        sourceMapHook(sourceMap, {
          compartment: packageLocation,
          module: candidateSpecifier,
          location: moduleLocation,
          sha512,
        });
      }
    }

    const packageRelativeLocation = moduleLocation.slice(
      packageLocation.length,
    );
    packageSources[candidateSpecifier] = {
      location: packageRelativeLocation,
      sourceLocation: moduleLocation,
      parser,
      bytes: transformedBytes,
      record: concreteRecord,
      sha512,
    };
    for (const importSpecifier of getImportsFromRecord(record)) {
      strictlyRequiredForCompartment(packageLocation).add(
        resolve(importSpecifier, moduleSpecifier),
      );
    }

    return record;
  }
}
Collaborator Author

copypasta

 * @param {SyncModuleTransforms} moduleTransforms
 * @returns {ParseFn}
 */
export const mapParsersSync = (
Collaborator Author

should probably just change mapParsers to handle both SyncModuleTransforms and ModuleTransforms...?

Comment on lines 671 to 669
importNowHook = () => {
  throw new Error(
    `Dynamic require not allowed in compartment ${q(compartmentDescriptor.name)}`,
  );
};
Collaborator Author

I'm not sure what else to do here.

Looking for the dynamic flag too early causes other tests to break, and would potentially need dynamic: true in the policy for attenuators (see here)

Comment on lines +707 to +742
/**
 * @typedef FsPromisesApi
 * @property {(filepath: string) => Promise<string>} realpath
 * @property {WriteFn} writeFile
 * @property {ReadFn} readFile
 */

/**
 * @typedef FsAPI
 * @property {FsPromisesApi} promises
 * @property {ReadSyncFn} readFileSync
 */

/**
 * @typedef UrlAPI
 * @property {(location: string | URL) => string} fileURLToPath
 * @property {(path: string) => URL} pathToFileURL
 */

/**
 * @typedef CryptoAPI
 * @property {typeof import('crypto').createHash} createHash
 */
Collaborator Author

not strictly necessary, but I found it helpful. YMMV

Member

@kriskowal kriskowal left a comment

Preliminary feedback.

Comment on lines 42 to +45
const freeze = Object.freeze;

const entries = Object.entries;
Member

Suggested change
- const freeze = Object.freeze;
- const entries = Object.entries;
+ const { entries, freeze } = Object;

Comment on lines +478 to +480
let { policy } = compartmentDescriptor;
policy = policy || Object.create(null);
Member

Does TypeScript not like this pattern?

Suggested change
- let { policy } = compartmentDescriptor;
- policy = policy || Object.create(null);
+ const { policy = create(null) } = compartmentDescriptor;

Member

Ah, though if we’re coming off of JSON and using the in operator, it might be better to ensure that the policy has a null proto, something like:

const policy = { __proto__: null, ...(compartmentDescriptor.policy || {}) };

Or:

const policy = assign(create(null), compartmentDescriptor.policy || {});
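
A small standalone demonstration of why the null prototype matters for `in` checks (a sketch; the JSON string is an invented example and the variable names are only for illustration):

```javascript
const { assign, create } = Object;

// Parsed JSON yields ordinary objects, which inherit from Object.prototype.
const fromJson = JSON.parse('{"packages": {}}');
// `in` walks the prototype chain, so inherited names count as present.
const inheritedHit = 'toString' in fromJson;

// Copying onto a null-prototype object leaves only own properties visible.
const policy = assign(create(null), fromJson);
const ownOnly = 'toString' in policy;
const stillThere = 'packages' in policy;
```

Here `inheritedHit` is `true`, `ownOnly` is `false`, and `stillThere` is `true`, so a package that happens to share a name with an `Object.prototype` member cannot be mistaken for a policy entry.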

Collaborator Author

To answer your first question: no; it is inferred to be {} instead of CompartmentDescriptor.

I'll take a few putts at it, but if I bogey I'm just going to leave it as-is.

// associates modules with compartment descriptors based on policy
// which wouldn't otherwise be there
if ('packages' in policy && typeof policy.packages === 'object') {
  for (const [pkgName, policyItem] of entries(policy.packages)) {
Member

Style nit: I’ve generally elsewhere avoided abbreviating package to pkg. policyItem should probably be something like compartmentPolicy or packagePolicy to increase the specificity of “item” to the scope it covers.

try {
  moduleBytes = readSync(moduleLocation);
} catch (err) {
  if (err && err.code === 'ENOENT') {
Member

This is coupled too closely to Node.js. The coupling should only exist in the Node.js powers. We have an analogous maybeRead that returns undefined if the file is not found. We will need a maybeReadSync power.
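
A sketch of what such a power might look like, assuming it is built inside the Node.js-specific powers from an existing `readSync`. `makeMaybeReadSync` is a hypothetical name, not an existing Endo function.

```javascript
// Hypothetical sketch: the ENOENT check is the Node.js-specific detail that
// would live in the Node.js powers rather than in the import hook.
const makeMaybeReadSync = readSync => location => {
  try {
    return readSync(location);
  } catch (error) {
    if (error && error.code === 'ENOENT') {
      return undefined; // not found: the caller can try the next candidate
    }
    throw error; // anything else is a real failure
  }
};
```

The import hook would then only test for `undefined` and would no longer need to know about Node.js error codes.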


@@ -55,6 +87,44 @@ export const loadFromMap = async (readPowers, compartmentMap, options = {}) => {
assign(create(null), languageForExtensionOption),
);

/**
* Object containing options and read powers which fulfills all requirements
Member

Suggested change
- * Object containing options and read powers which fulfills all requirements
+ * Object containing options and read powers that fulfills all requirements

@@ -55,6 +87,44 @@ export const loadFromMap = async (readPowers, compartmentMap, options = {}) => {
assign(create(null), languageForExtensionOption),
);

/**
* Object containing options and read powers which fulfills all requirements
* for creation of a {@link ImportNowHookMaker}, thus enabling dynamic import
Member

Suggested change
- * for creation of a {@link ImportNowHookMaker}, thus enabling dynamic import
+ * for creation of a {@link ImportNowHookMaker}, thus enabling dynamic import.


/**
* Object containing options and read powers which is incompatible with
* creation of a {@link ImportNowHookMaker}, thus disabling dynamic import
Member

Suggested change
- * creation of a {@link ImportNowHookMaker}, thus disabling dynamic import
+ * creation of an {@link ImportNowHookMaker}, thus disabling dynamic import.

@@ -4,7 +4,7 @@
  * @module
  */

-import { SourceMapConsumer } from 'source-map';
+import { SourceMapConsumer } from 'source-map-js';
Member

I’d be happy to review this change to evasive-transform separately to make this one smaller and land it faster.

Comment on lines +167 to +212
if (behavior.type === 'SYNC') {
  const { importNowHook: exitModuleImportNowHook, syncModuleTransforms } =
    behavior.options;
  makeImportNowHook = makeImportNowHookMaker(
    /** @type {SyncReadPowers} */ (readPowers),
    entryCompartmentName,
    {
      compartmentDescriptors: compartmentMap.compartments,
      searchSuffixes,
      exitModuleImportNowHook,
    },
  );
  ({ compartment, pendingJobsPromise } = link(compartmentMap, {
    makeImportHook,
    makeImportNowHook,
    parserForLanguage,
    languageForExtension,
    globals,
    transforms,
    syncModuleTransforms,
    __shimTransforms__,
    Compartment,
  }));
} else {
  // sync module transforms are allowed, because they are "compatible"
  // with async module transforms (not vice-versa)
  const moduleTransforms = /** @type {ModuleTransforms} */ ({
    ...behavior.options.syncModuleTransforms,
    ...behavior.options.moduleTransforms,
  });
  ({ compartment, pendingJobsPromise } = link(compartmentMap, {
    makeImportHook,
    parserForLanguage,
    languageForExtension,
    globals,
    transforms,
    moduleTransforms,
    __shimTransforms__,
    Compartment,
  }));
}
Member

I need to think more about what it would take to make this work with a single call to link, deferring the problem of differentiating these cases to the link implementation. That might be by passing a sync: true option into link.

Maybe this would give us more options: SES currently accepts both importHook and importNowHook but only consults one or the other. We could change Compartment#import to consult importNowHook before importHook, proceeding to the latter only if the former returns undefined. Maybe that would let us get more bang from importNowHook. Or perhaps we just need that trampoline.

@boneskull
Collaborator Author

Going to extract the changes to @endo/evasive-transform into a separate PR.

@boneskull
Collaborator Author

Ref: #2332

cc @kriskowal

@boneskull
Collaborator Author

Once #2332 is merged, I can rebase this onto master, which should eliminate the commit from this PR's history. So let's sit on it until then.

This change:

1. Creates a minimal interface for the `fs`, `url`, and `crypto` objects as passed into `makeReadPowers()`. This makes it easier to duck-type the objects.
2. Fixes the invalid type of `MaybeReadPowers`; properties (defined thru `@property`) are ignored in a `@typedef` if that `@typedef` does not extend `object`/`Object`.
3. Adds a necessary type assertion in `powers.js`
4. Adds return type to `makeReadPowersSloppy()`

# Conflicts:
#	packages/compartment-mapper/src/types.js
This change replaces [source-map](https://npm.im/source-map) with [source-map-js](https://npm.im/source-map-js), which is a fork of the former.  Crucially, `source-map-js` is a synchronous, pure-JS implementation.

A consequence of this is that `makeLocationUnmapper` is now synchronous.  It is internal, however.

`evadeCensor` now just wraps `evadeCensorSync` in a `Promise`.  Tests have been changed to use `evadeCensorSync` directly.

In the `makeLocationUnmapper` implementation, an assertion for the truthiness of `ast.loc` has been moved _before_ instantiation of `SourceMapConsumer`, where it maybe should have been in the first place.
Is this a breaking change?  I don't know.

This doesn't mean that parsers _cannot_ be async--just rather that the ones we have are not.
This adds a new prop to `ReadPowers`: `isAbsolute()`. This function is used to detect an absolute module specifier, which is then loaded as an exit module in `importNowHook`, since it could be pointing to anything.
…quire support

- improve error messages
- better tests to support the `node-gyp-build` use-case

3 participants