Add support for packageLock option #93

isaacs · 2020-05-28T07:56:20Z

Based on #92, land that first

This change updates a few things, resulting in a much more resilient and reasonable approach to finding the dependencies of link targets, which is relevant for workspaces and for interpreting pnpm trees, and reduces the "load my whole dev folder" antipattern. Prior to this change, in loadActual, if a link target was found in a node_modules folder, then the parent of that node_modules folder would be loaded, along with its children. While this is seems like a reasonable way to handle pnpm trees, it has the following problems: 1. If you are developing a bunch of projects right in a `node_modules` folder, then it would sometimes result in loading ALL of your JS projects in the actualTree! This is rarely what you want, and had some surprising behavior when I encountered it myself. (I develop quite a lot of projects in a node_modules folder for convenience.) 2. If you have a situation where a package is _not_ in a node_modules folder, but it does have some missing dependencies which are in fact resolved by deps already in the tree, it would show up as missing. With this change, the loadActual process will walk up the parent directories of any top nodes, searching for any missing edges. If a folder is found at `${dir}/node_modules/${name}`, then a "dummy" node is created for the parent, such that the parent/fsParent resolution algorithm will find a hit and resolve to the named dep. It will not load any _other_ dependencies in the dummy node, however, and will not walk up higher than the nearest common ancestor of the project root and the link target. Furthermore, this uncovered a bug where setting the parent of a node to a new parent with the same `root` was causing it to unlist itself and all children from the root.meta and root.inventory, but only _relist_ itself, and not its children. This bug was previously not hit in test because the node demonstrating it was within a node_modules folder, and so was fully loaded along with all of its siblings. Lastly, because dummy nodes are never "extraneous" (they're just there as placeholders, they're not "real" anyway), had to update a few shrinkwrap files in fixtures to not get weird results. An upcoming refactor of the Shrinkwrap class will make this incorrect metadata less of a hazard moving forward.

Since the Shrinkwrap class was developed prior to figuring out exactly how it would be used in the various Arborist methods, some things ended up growing in a somewhat inelegant direction. - Nodes were fetching their resolved and integrity values in multiple places. - The Shrinkwrap internal data tried to keep up both the new and old data up to date as the tree changed, but this was not always possible. - Dependencies of link targets were not properly reflected in the legacy shrinkwrap metadata. - The Node class and buildIdealTree methods both had to dive deep into the internals of the YarnLock class, which should be an implementation detail of the Shrinkwrap class. Now, the Shrinkwrap object keeps a reference to the root tree node found at its own path. Rather than try to keep the legacy metadata in sync at all times, we _only_ build up the legacy shrinkwrap data when calling `this.commit()`, by walking the `node.children` and `node.target.children` maps appropriately. Also, `this.yarnLock` is updated on commit(), and a `checkYarnLock()` method is added to return the spec that should be fetched, and update the provided options object with resolved and integrity metadata. All of the logic for setting node metadata from the Shrinkwrap is now done in `Shrinkwrap.add(node)`, making it much simpler and harder to get wrong. This means that: - `yarn.lock` files are respected more fully (including their resolved and integrity expectations) - We don't have cases where both `extraneous` and `dev`/`optional` are set in the lockfile (which is always an error -- extraneous nodes cannot be either dev or optional!) - Link target dependencies are accurately reflected in the legacy shrinkwrap data, rather than creating invalid `../` entries. Fix: #82 Fix: #84

Do not load the actual tree multiple times if loadActual() is called multiple times on the same Arborist object.

Added while debugging why making buildIdealTree respect the existing actualTree was not working as expected.

Fix #79 If there is no package-lock.json or npm-shrinkwrap.json file present, then base the idealTree on the actualTree if available. This minimizes surprising meta-dep updates (and in some cases, extremely slow install times!) when not using a package-lock.json file. A caveat here is that it _may_ result in a lockfile being saved for the first time with `resolved: null` for some deps, if they were present in the `node_modules` folder to start with.

Compare the mtime of the hidden lockfile at node_modules/.package-lock.json against all package folders in node_modules, and ensure that all packages in node_modules exist in the lockfile data. This of course still can be thwarted by editing a single file within a package folder, but not adding or deleting any files or folders, or modifying files in nested subfolders in a package in any way whatsoever. However, it prevents the overwhelming majority case where this would bite us, which is someone installing something with yarn or previous versions of npm. In those cases, we ignore the hidden lockfile, meaning that any loadActual call will read the files on disk as normal. Fix #91

This was causing a weird as hell race condition due to the prior commit that checks if the hidden lockfile is up to date. If the test fixture was created with a >10ms gap between the hidden lockfile being created and the package contents being created, then the hidden lockfile wouldn't be respected, and it'd have to load the package.json from disk. However, since the package.json was missing, this meant that the semver node always had a node.package of {}. That timing issue would only hit if the test fixture generation took more than 10ms, AND the delay landed in just the right spot in the test. Reproduced by adding a ton of loging, and then running the test in a tight loop along with 6 other tests going at the same time, and even then, it only usually triggered after 5-10 minutes. It wasn't a problem before, because a hidden lockfile contains everything necessary to load the tree. So, when there was no safety check to ensure that the hidden lock is newer than the packages it describes, the lack of a package.json file was never a problem.

Re #46

isaacs added 8 commits May 27, 2020 14:28

make Arborist.loadActual() parallel-safe

f282506

Do not load the actual tree multiple times if loadActual() is called multiple times on the same Arborist object.

more extensive tests for deduper

2b08c0f

Added while debugging why making buildIdealTree respect the existing actualTree was not working as expected.

Add support for packageLock option

f9070fa

Re #46

isaacs mentioned this pull request May 28, 2020

[BUG] Configs needing to be added/respected #46

Closed

6 tasks

ruyadorno approved these changes May 29, 2020

View reviewed changes

ruyadorno closed this in 9077c4b May 29, 2020

wraithgar deleted the isaacs/package-lock-config branch April 22, 2021 17:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for packageLock option #93

Add support for packageLock option #93

isaacs commented May 28, 2020

Add support for packageLock option #93

Add support for packageLock option #93

Conversation

isaacs commented May 28, 2020