Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use explicit key for tracking dependencies in bundler #835

Closed
wants to merge 1 commit into from

Conversation

motiz88
Copy link
Contributor

@motiz88 motiz88 commented Jun 17, 2022

Summary:
Changes collectDependencies so it is responsible for providing a key with each extracted dependency. This key should be stable across builds and when dependencies are reordered, and must be locally unique within a given module.

NOTE: This is a similar concept to the key prop in React. It's also used for a similar purpose - traverseDependencies is not unlike a tree diffing algorithm.

Previously, the name property ( = the first argument to require etc) implicitly served as this key in both collectDependencies and DeltaBundler, but since the addition of require.context dependency descriptors in #821, this is no longer strictly correct. (This diff therefore unblocks implementing require.context in DeltaBundler.)

Instead of teaching DeltaBundler to internally re-derive suitable keys from the dependency descriptors - essentially duplicating the logic in collectDependencies - the approach taken here is to require collectDependencies to return an explicit key as part of each descriptor.

NOTE: Keys should be considered completely opaque. As an implementation detail (that may change without notice), we hash the keys to obfuscate them and deter downstream code from depending on the information in them.

Note that it's safe for multiple descriptors to resolve to a single module (as of D37194640 (fc29a11)), so from DeltaBundler's perspective it's now perfectly valid for collectDependencies to not collapse dependencies at all (e.g. even generate one descriptor per require call site) as long as it provides a unique key with each one.

WARNING: This diff exposes a preexisting bug affecting optional dependencies (introduced in #511) - see FIXME comment in graphOperations.js for the details. This will require more followup in a separate diff.

Differential Revision: D37205825

Summary:
Changes `collectDependencies` so it is responsible for providing a key with each extracted dependency. This key should be *stable* across builds and when dependencies are reordered, and *must* be locally unique within a given module.

NOTE: This is a similar concept to the [`key`](https://reactjs.org/docs/lists-and-keys.html#keys) prop in React. It's also used for a similar purpose - `traverseDependencies` is not unlike a tree diffing algorithm.

Previously, the `name` property ( = the first argument to `require` etc) *implicitly* served as this key in both `collectDependencies` and DeltaBundler, but since the addition of `require.context` dependency descriptors in facebook#821, this is no longer strictly correct. (This diff therefore unblocks implementing `require.context` in DeltaBundler.)

Instead of teaching DeltaBundler to internally re-derive suitable keys from the dependency descriptors - essentially duplicating the logic in `collectDependencies` - the approach taken here is to require `collectDependencies` to return an *explicit* key as part of each descriptor.

NOTE: Keys should be considered completely opaque. As an implementation detail (that may change without notice), we Base64-encode the keys to obfuscate them and deter downstream code from depending on the information in them. (We do this on the assumption that Base64 encoding performs better than hashing.)

Note that it's safe for multiple descriptors to resolve to a single module (as of D37194640 (facebook@fc29a11)), so from DeltaBundler's perspective it's now perfectly valid for `collectDependencies` to not collapse dependencies at all (e.g. even generate one descriptor per `require` call site) as long as it provides a unique key with each one.

WARNING: This diff exposes a **preexisting** bug affecting optional dependencies (introduced in facebook#511) - see FIXME comment in `graphOperations.js` for the details. This will require more followup in a separate diff.

Differential Revision: D37205825

fbshipit-source-id: 3559ee6a2f3c30017812ff009f0fe4c442e30029
@facebook-github-bot facebook-github-bot added CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported labels Jun 17, 2022
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D37205825

@motiz88
Copy link
Contributor Author

motiz88 commented Jun 17, 2022

cc @EvanBacon - see if you can work on top of this branch for your next set of require.context changes. I'm unlikely to land it before I'm back from PTO in early July though. The good news: I don't expect to make further changes to this area of DeltaBundler, except maybe to fix the aforementioned bug around optional dependencies.

EvanBacon added a commit to EvanBacon/metro that referenced this pull request Jun 19, 2022
@EvanBacon EvanBacon mentioned this pull request Jun 22, 2022
1 task
facebook-github-bot pushed a commit that referenced this pull request Aug 17, 2022
Summary:
Continued work for adding `require.context` to Metro, which started in #821. The overall design is discussed in microsoft/rnx-kit#1515. The user-facing API is closely modelled on [Webpack's `require.context`](https://webpack.js.org/guides/dependency-management/#requirecontext).

**This feature is experimental and unsupported.** We will document and announce this for general consumption once it's stable.

## How it works

When we encounter a dependency that has `contextParams` (see #821), we will now generate a corresponding *virtual module* in the dependency graph, with dependencies on every file in the file map that matches the context params.

```
┌─────────┐  require.context('./ctx', ...)   ┌−−−−−−−−−−−−−−−−−−┐     ┌──────────┐
│ /bundle │ ───────────────────────────────▶ ╎ <virtual module> ╎ ──▶ │ /ctx/bar │
└─────────┘                                  └−−−−−−−−−−−−−−−−−−┘     └──────────┘
                                               │
                                               │
                                               ▼
                                             ┌──────────────────┐
                                             │     /ctx/foo     │
                                             └──────────────────┘
```

Crucially, we keep this context module up-to-date as file change events come in, so that HMR / Fast Refresh continues to work reliably and transparently. This accounts for the bulk of the complexity of the implementation and tests.

## Main pieces of the implementation

* [collectDependencies] Done in #821: Extract `require.context` calls as dependencies with added metadata. (Behind a flag)
* [metro-file-map] HasteFS: API for querying the file map for the set of files that match a particular resolved context.
* [DeltaBundler] DeltaCalculator: Logic to mark a previously-generated context module as dirty when the set of files it matches has changed.
* [DeltaBundler] graphOperations: Logic to "resolve" context params to a virtual module stored in the graph object, and forward the resolved params to the transformer to generate the module's body and dependencies.
* [DeltaBundler] graphOperations: API for querying the graph for the set of currently active contexts that match a particular file. (Used by DeltaCalculator)
* [metro] transformHelpers, contextModuleTemplates: Logic to generate the contents of a virtual context module by querying the file map. Includes various templates to implement the different `require.context` modes.
* [metro-runtime] `require()`: A stub for `require.context` that helps us throw a meaningful error if the feature is not enabled.

Tests:
* `require-context-test`: a new integration test suite that builds and executes various bundles that use `require.context`. In particular this covers the subset of the `require.context` runtime API that we support.
* `DeltaCalculator-context-test`: a new test suite covering the changes to DeltaCalculator that are specific to `require.context` support.
* `traverseDependencies-test`: expanded and refactored from its previous state. Covers the changes to graphOperations.

## Future work

At the moment, every time we invalidate a context module, we scan the entire file map to find the up-to-date matches and then regenerate the module from scratch (including passing it through the transformer).

Two open areas of investigation are:
1. Optimising the *initial* scan over the file map - e.g. representing it as a path tree to drastically limit the number of files we need to match against.
2. Optimising the incremental case - e.g. directly using the file addition/deletion events we receive from the file map to update the generated module in-place.

At least (2) is essential before we treat this feature as stable.

There's also room to generalise the "virtual modules" concept/infrastructure here to support other use cases. For now everything is very much coupled to the `require.context` use case.

EvanBacon's original PR summary follows.

----

<details>

- Continued work for adding `require.context` to Metro #821
- This PR adds the ability to match files given a "require context". This is similar to the existing method for matching against a regex, but this enables users to search upwards in a directory, search shallow, and match a regex filter.
- Adds ability to pass a `Buffer` to `transformFile` as a sort of virtual file mechanism. This accounts for most the changes in `packages/metro/src/DeltaBundler/Transformer.js`, `packages/metro/src/DeltaBundler/Worker.flow.js`, `packages/metro/src/DeltaBundler/WorkerFarm.js`.
- Since we collapse `require.context` to `require` I've also added a convenience function in dev mode which warns users if they attempt to access `require.context` without enabling the feature flag.
- Made `DeltaCalculator` aware of files being added.
- `graphOperations` has two notable changes:
  1. When resolving dependencies with context, we attach a query to the absolute path (which is used for indexing), this query has a hex hash of the context -- used hex instead of b64 for filename safety.
  2. We pass the require context to `processModule` and inside we transform the dependency different based on the existence of a context module.
- When collecting the delta in `_getChangedDependencies` we now iterate over added and deleted files to see if they match any context modules. This is only enabled when the feature flag is on.
- In `packages/metro/src/lib/contextModule.js` we handle a number of common tasks related to context handling.
- In `packages/metro/src/lib/contextModuleTemplates.js` we generate the virtual modules based on the context. There are a number of different modules that can be created based on the `ContextMode`. I attempted to keep the functionality here similar to Webpack so users could interop bundlers. The most notable missing feature is `context.resolve` which returns the string path to a module, this doesn't appear to be supported in Metro. This mechansim is used for ensuring a module must be explicitly loaded by the user. Instead I wrapped the require values in getters to achieve the same effect.
- We implement the `lazy` mode as well but this requires the user to have async bundles already setup, otherwise the module will throw a runtime error.

### Notice

I've withheld certain optimizations in order to keep this PR simple but still functional. We will want to follow up with a delta file check to require context so we aren't iterating over every file on every update. This feature can be seen in my [test branch](main...EvanBacon:@evanbacon/context/graph-resolver).

In my [test branch](https://github.com/facebook/metro/compare/main...EvanBacon:%40evanbacon/context/graph-resolver?expand=1), I built the feature on top of #835 so I know it works, should only require minor changes to graphOperations to get them in sync.

Pull Request resolved: #822

Test Plan:
I've (motiz88) reviewed and expanded the tests from the original PR, as well as fixed several bugs and removed some accidental complexity that got introduced.

EvanBacon's original PR test plan follows.

 ---

- [ ] Unit tests -- pending motiz88 approving the implementation.

### Local testing

`metro.config.js`
```js
const { getDefaultConfig } = require("expo/metro-config");

const config = getDefaultConfig(__dirname);
const path = require("path");

config.watchFolders = [
  path.resolve(__dirname),
  path.resolve(__dirname, "../path/to/metro"),
];

config.transformer = {
  // `require.context` support
  unstable_allowRequireContext: true,
};

module.exports = config;
```

index.js
```tsx
console.log(require.context("./").keys());
```

Start a dev server, when changes to the file system occur, they should be reflected as automatic updates to the context. This does lead to the issue of having multiple contexts triggering multiple sequential reloads, I don't think this is a blocking issue though.

Also tested with async loading enabled, and the context:  `require.context("./", undefined, undefined, 'lazy')`.

### Behavior

Notable features brought over to ensure this `require.context` functions as close to the original implementation as possible:

- `require.context` does not respect platform extensions, it will return every file matched inside of its context.
- `require.context` can match itself.
- A custom 'empty' module will be generated when no files are matched, this improves the user experience.
- All methods in `require.context` are named to improve the debugging.
- We always match against a `./` prefix. This prefix is also returned in the keys.
- Modules must be loaded by invoking the context as a function, e.g. `context('./module.js')`

</details>

Reviewed By: robhogan

Differential Revision: D38575227

Pulled By: motiz88

fbshipit-source-id: 503f69622f5bebce5d2db03e8f4c4705de169383
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants