Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements packageIterator #205

Open
wants to merge 3 commits into
base: master
from

Conversation

@arcanis
Copy link
Contributor

arcanis commented Jan 3, 2020

Offers a fix for the problem described in #199 (comment)

This PR offers a way to alter the default package path generation. The default implementation still uses path.join('/.../node_modules', packageName) (which cannot work for PnP when using the link protocol for the reasons described here), but the PnP resolver (automatically configured thanks to #174) will directly return the path to the package regardless of the dependency type, shortcutting the default logic.

One big advantage of this diff is that it's semver-minor, which avoids having to upgrade all third-party consumers (which I really don't have the bandwidth to do 馃槄).

};

var defaultPackageIterator = function defaultPackageIterator(x, start, getPackageCandidates, opts) {
return getPackageCandidates();

This comment has been minimized.

Copy link
@nicolo-ribaudo

nicolo-ribaudo Jan 5, 2020

Contributor

Nit: this is really confusing. I saw a function with the same name defined right before this function, which takes 3 arguments. It took me a while to realize that this isn't calling that function, but the one passed as a parameter.

Maybe the one on line 39 could be renamed to getPackageNodeModulesCandidates?

This comment has been minimized.

Copy link
@ljharb

ljharb Jan 5, 2020

Member
Suggested change
return getPackageCandidates();
return defaultGetPackageCandidates();
};

var defaultPackageIterator = function defaultPackageIterator(x, start, getPackageCandidates, opts) {
return getPackageCandidates();

This comment has been minimized.

Copy link
@ljharb

ljharb Jan 5, 2020

Member
Suggested change
return getPackageCandidates();
return defaultGetPackageCandidates();
return dirs;
};

var defaultPackageIterator = function defaultPackageIterator(x, start, getPackageCandidates, opts) {

This comment has been minimized.

Copy link
@ljharb

ljharb Jan 5, 2020

Member
Suggested change
var defaultPackageIterator = function defaultPackageIterator(x, start, getPackageCandidates, opts) {
var defaultPackageIterator = function defaultPackageIterator(x, start, defaultGetPackageCandidates, opts) {
lib/async.js Outdated Show resolved Hide resolved
lib/sync.js Outdated Show resolved Hide resolved
lib/sync.js Outdated Show resolved Hide resolved
lib/sync.js Show resolved Hide resolved
readme.markdown Show resolved Hide resolved
@nicolo-ribaudo

This comment has been minimized.

Copy link
Contributor

nicolo-ribaudo commented Jan 5, 2020

@arcanis Is packageIterator meant to return the possible package folders or the possible resolved path?

i.e. for resolve("foo/bar") should it return "/.../node_modules/foo/bar" or "/.../node_modules/foo"? I am asking because this PR does the first thing, while https://github.com/yarnpkg/berry/blob/3e324d5383de1f4143c0afc4829c33bd1b8e5f50/packages/plugin-compat/extra/resolve/normalize-options.js#L22 does the second one.

nicolo-ribaudo added a commit to nicolo-ribaudo/yarn-berry that referenced this pull request Jan 5, 2020
lib/async.js Outdated Show resolved Hide resolved
lib/async.js Outdated Show resolved Hide resolved
lib/async.js Outdated Show resolved Hide resolved
lib/sync.js Outdated Show resolved Hide resolved
@arcanis

This comment has been minimized.

Copy link
Contributor Author

arcanis commented Jan 5, 2020

Some context on my last few commits:

  • About isDirectory, I found out it got added in #190 for performance reasons. Since it doesn't have semantic meaning, checking the existence of the parent directory is good enough to preserve the original intent, so that's what I did (this avoids having to make packageIterator sync or async depending on the caller).

  • Regarding @nicolo-ribaudo's question I initially meant to return the package folder only (which isn't the case atm). However, after a closer inspection of the current architecture I feel like this would require to do extra work in the code itself (to split the subpath) for little reason (since it's already done later in the resolution). So I think keeping the current behavior is preferable (ie @scope/foo/bar will produce /n_m/@scope/foo/bar, not /n_m/@scope/foo).

  • The isCore change simply removes an unreachable call in the sync codepath (it could only be reached if the request was both an absolute path and a core module, which isn't possible) and moves the isCore check in the async codepath to be done before trying the n_m resolution, similar to what happens in the sync codepath. Due to the way the check was done it doesn't change anything, except that packageIterator won't be called if the package is a core module.

var file = path.join(dir, x);
loadAsFile(file, opts.package, onfile);
}
isDirectory(path.dirname(dir), function (err, status) {

This comment has been minimized.

Copy link
@ljharb

ljharb Jan 6, 2020

Member

why the extra path.dirname call? shouldn't the dirs list already have that done?

This comment has been minimized.

Copy link
@arcanis

arcanis Jan 6, 2020

Author Contributor

dirs may point to files, for example if you do resolve something like foo/package.json then dirs will be ['/n_m/foo/package.json'] (that makes dir a misnomer, do you want me to rename it into entry or similar?).

This comment has been minimized.

Copy link
@ljharb

ljharb Jan 6, 2020

Member

i mean, yes, but in that case foo should be the entry that's a directory and package.json should be the entry that's not.

This comment has been minimized.

Copy link
@arcanis

arcanis Jan 6, 2020

Author Contributor

That's why I call dirname: since dir is /n_m/foo/package.json, the isDir check has to be done against its parent. Even if it was /n_m/foo (which would happen if the request was just foo), the right behavior would be to check against /n_m (since foo is allowed to be a file per the Node resolution rules), so the dirname would still stand.

@ljharb ljharb force-pushed the arcanis:mael/package-iterator branch from 847d93a to 4d138d7 Jan 6, 2020
@ljharb

This comment has been minimized.

Copy link
Member

ljharb commented Jan 6, 2020

This is a lot of change in one PR; would you mind opening up a separate PR with all the bugfixes and refactors, and associated tests, so we can get this semver-minor PR as small as possible?

@arcanis arcanis mentioned this pull request Jan 6, 2020
@ljharb ljharb force-pushed the arcanis:mael/package-iterator branch from 4116357 to af016eb Jan 7, 2020
@ljharb ljharb force-pushed the arcanis:mael/package-iterator branch 2 times, most recently from 99aa42e to c8a6050 Jan 7, 2020
@@ -265,19 +274,18 @@ module.exports = function resolve(x, options, callback) {
if (dirs.length === 0) return cb(null, undefined);
var dir = dirs[0];

isDirectory(dir, isdir);
isDirectory(path.dirname(dir), isdir);

This comment has been minimized.

Copy link
@ljharb

ljharb Jan 7, 2020

Member

when is the dirname not going to be a directory? the intention here, i believe, is to see if dir points to a file or not. It seems like hoisting the path.join prevents that from being determined.

This comment has been minimized.

Copy link
@arcanis

arcanis Jan 7, 2020

Author Contributor

The dirname isn't a directory when it doesn't exist. From what I gather from #190 and #116, this code was only meant to speed up the resolution by checking whether the n_m folder existed before doing the file stat calls.

Checking whether the parent of the subpath (ie /n_m/foo for foo/hello) exists has the same behaviours: if foo/hello can be resolved, then /n_m/foo will necessarily be a directory.

This comment has been minimized.

Copy link
@arcanis

arcanis Jan 7, 2020

Author Contributor

In other words:

request partial resolution isDirectory (before) isDirectory (after)
foo /n_m/foo /n_m /n_m
foo/bar /n_m/foo/bar /n_m /n_m/foo
@foo/bar /n_m/@foo/bar /n_m /n_m/@foo

The effect will be the same before and after: the two stat calls that the code would have made before #190 (taking the first line as an example, one to check whether /n_m/foo.js exists and another to check whether /n_m/foo/index.js exists) won't be necessary because resolve is able to find out that the directory supposed to contain them doesn't even exist.

This comment has been minimized.

Copy link
@ljharb

ljharb Jan 7, 2020

Member

I see. Do you think that this perf improvement could also be split out into a separate PR (or even just a separate commit prior to your addition)?

This comment has been minimized.

Copy link
@arcanis

arcanis Jan 7, 2020

Author Contributor

In this PR it's not a perf improvement, it's necessary to make my PR work with the changes brought by #190 (because with my PR we don't have access to n_m paths anymore in loadNodeModules - instead we only have /n_m/<subpath>). I guess I could revert it and re-apply it, but I'm not sure it would be much clearer 馃

This comment has been minimized.

Copy link
@ljharb

ljharb Jan 7, 2020

Member

I'm mainly concerned about an unintentional breakage, and my hope would be to release this particular change as a patch prior to releasing this PR as a minor.

This comment has been minimized.

Copy link
@arcanis

arcanis Jan 7, 2020

Author Contributor

I think #190 has only been released in the 2.0.0-next.0 branch, so for the purpose of the patch release the path.dirname thing can be removed entirely. Do you want me to open a separate PR against the 1.x branch?

This comment has been minimized.

Copy link
@ljharb

ljharb Jan 7, 2020

Member

#192 was the PR into the 1.x branch.

@@ -38,6 +38,14 @@ var maybeUnwrapSymlink = function maybeUnwrapSymlink(x, opts) {
return x;
};

var getPackageCandidates = function getPackageCandidates(x, start, opts) {
var dirs = nodeModulesPaths(start, opts, x);

This comment has been minimized.

Copy link
@nicolo-ribaudo

nicolo-ribaudo Jan 13, 2020

Contributor

Since this function is the same in sync.js, maybe it could be extracted to a separate file?

This comment has been minimized.

Copy link
@ljharb

ljharb Jan 13, 2020

Member

I thought about that, but then it's a separate file that has to be supported for the rest of v1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can鈥檛 perform that action at this time.