RFC: file: specifier changes

iarna commented Feb 28, 2017

Easy reading link for the new specification. This also introduces written specifications for npm features, something we intend to extend out to all of npm's activity.

# The Problem We're Solving

File type dependency specifiers are confusing. How they interact across npm's many varied commands (npm install --save, npm update, npm outdated, npm shrinkwrap) has never been defined in one place. Most of the behaviors "happened by default". That is, the minimum implementation was done to make them install ok and the rest was left as a side effect.

Currently npm update and npm outdated simply don't do anything for local dependencies. If you've updated the source you have to manually reinstall to get a copy of it. How npm shrinkwrap saves local dependencies to npm-shrinkwrap.json and how it resolves them has varied over time as well.

## Root Causes

When local dependencies were added, we didn't have any process around defining behavior or ensuring that all use cases were specified.

## Proposal

In addition to actually specifying behavior (often simply writing down what
things currently do), we propose an important breaking change:

• file:-type specifiers that refer to directories will be soft deprecated, their behavior being identical to the new link: specifier and their existence becoming a footnote in the documentation.
• A new specifier type, link:, will be introduced for linking local directories. For the duration of npm@5, file: specifiers that refer to directories will be treated identically to link: specifiers.
• link: specifiers will result in a symlink or junction made from the specifier path into your node_modules. On Windows try a junction and if that fails, try a symlink. If both fail, the error from the junction should be used.
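
The Windows fallback order described in the last bullet can be sketched as follows. This is only an illustration, not npm's implementation: `linkOnWindows` and the `makeLink` callback are hypothetical names, and the actual link creation is injected so that the sketch stays self-contained.

```typescript
// Hypothetical sketch of the junction-then-symlink fallback; not npm's code.
type LinkType = "junction" | "dir";

// Caller-supplied function that attempts to create a link of the given type
// and throws if it cannot.
type MakeLink = (type: LinkType) => void;

function linkOnWindows(makeLink: MakeLink): LinkType {
  try {
    makeLink("junction"); // first choice: a junction
    return "junction";
  } catch (junctionError) {
    try {
      makeLink("dir"); // second choice: a directory symlink
      return "dir";
    } catch {
      // Both attempts failed: report the junction error, as the RFC asks.
      throw junctionError;
    }
  }
}
```

In Node.js, `makeLink` could wrap `fs.symlinkSync(target, path, type)`, whose `type` argument accepts `'junction'` and `'dir'` on Windows.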

This RFC essentially brings linklocal's behavior into core.

## RISKS

• Diagnostic information changes need to be handled delicately in order to not increase user confusion.
• Some users may be unhappy with the changes to file: semantics.

timoxley commented Mar 1, 2017

 Nice! Why new link: vs just using file:?

iarna commented Mar 1, 2017

 The reasons for introducing the new specifier are:

• Have a specifier that immediately tells someone reading a package.json what it will do.
• Not have fights over what a valid file: URL looks like.

Originally there was a plan to actually hard deprecate the file: specifier and remove it, but we decided that was pointlessly disruptive. Instead file: will be an alias for link:.

timoxley commented Mar 1, 2017

 @iarna Sounds reasonable 👍 Have you considered the impact of having different module resolution paths when using symlinks? I currently only use symlinks in development, while in production I let the current file: behaviour do its thing. Without the symlinks npm is able to move many shared packages to the top level, decreasing overall install time. This could have significant impact on deploy times if local dependencies share a lot of binary dependencies.

iarna commented Mar 1, 2017

 @timoxley It's a fair question. As it's specced right now it would do that. Maybe, if the linked module is inside the top level install root we should flatten its deps down, instead of keeping them inside the linked module. The module loader would still be able to pick them up and it'd not increase disk usage.

timoxley commented Mar 1, 2017

 "if the linked module is inside the top level install root we should flatten its deps down"

 @iarna that sounds good. Being aware of where symlinks resolve to could perhaps be considered a good general improvement to the dedupe/flattening algorithm.

 * Attempting to install a specifer that has a windows drive letter will produce an error on non-Windows systems.
 * A valid file: specifier points is:

thefourtheye Mar 1, 2017

Did you mean "A valid file: specifier also is" like in the next item? I am finding it difficult to understand this line.

legodude17 Mar 1, 2017

I would say it should be "Also, a valid file: specifier is:"

 Historically, these ambiguous specifiers were also allowed in the package.json. Starting in npm@5 using an ambiguous specifier in your shrinkwrap will be depricated and will warn. In npm@6 it will be an

mantoni Mar 1, 2017

Typo: depricated -> deprecated

mantoni commented Mar 1, 2017

 This is great! I would like to know if you considered supporting the current npm ln use-case? I'm thinking of link:module-name to resolve to the "global" link which in turn links to the actual module directory. Contrary to the relative link paths, this wouldn't enforce a specific directory structure. What do you think?

### legodude17 left a comment

This seems like it would really help a lot of people. 🎆

 * A valid file: specifier points is:
   * a valid package file. That is, a .tar, .tar.gz or .tgz containing /package.json.
   * OR, a directory that contains a package.json

legodude17 Mar 1, 2017

Should that 'OR' be removed or should all of these start with 'OR'? Consistency.

iarna Mar 2, 2017

There are only two of them and the OR separates the two. So I'm not sure what you mean by "all of these".

   * a valid package file. That is, a .tar, .tar.gz or .tgz containing /package.json.
   * OR, a directory that contains a package.json
 * And a valid file: specifier also is:

legodude17 Mar 1, 2017

Shouldn't this not be a bullet? Or just remove it.

iarna Mar 2, 2017

It should. It's part of the top level list.

 The preinstall for file-type specifiers MUST be run AFTER the finalize phase as the symlink may be a relative path reaching outside the current project root and a symlink that resolves in .staging won't resolve in the package's final resting place.

legodude17 Mar 1, 2017

Would the preinstall for a package depended on via link: be run from its original folder, or somewhere else?

iarna Mar 2, 2017

Basically this is me trying to not screw people who are using pwd (which would be relative to where your package root's node_modules is).

Folks who are using cwd would always get the same answer (the destination of the link).

  example-package@1.0.0 /path/to/example-package
  +-- a -> Copied from: link:../a

legodude17 Mar 1, 2017

Again, is this repetition intentional?

  example-package@1.0.0 /path/to/example-package
  +-- a -> link:../a

legodude17 Mar 1, 2017

Why are there two parts with the same thing? Is that intentional?

iarna commented Mar 2, 2017

 So after some discussion in air between @zkat and me, we've reached a consensus that we should go ahead with just file: and not add a new specifier. I'll update the RFC as appropriate.

  example-package@1.0.0 /path/to/example-package
  +-- a -> file:../a

legodude17 Mar 2, 2017

Still, is this repetition intentional?

zkat Mar 2, 2017

what repetition? directories and package names can be different: a -> file:../b is perfectly valid

iarna Mar 2, 2017

One of these is with unicode enabled, one without. I wanted to specify both.

legodude17 Mar 2, 2017

Oh. Sorry. I didn't notice that.

  Package  Current  Wanted  Latest  Location
  a        MISSING  LOCAL   LOCAL   example-package

legodude17 Mar 2, 2017

I would suggest that it should also log where it is looking for the package.

iarna Mar 2, 2017

As zkat says below, changing outdated to be more useful is really a separate RFC.

legodude17 Mar 2, 2017

Ok, cool. Sounds like a really good thing.

zkat left a comment

 #### File type specifiers pointing at directories

 File-type specifiers that point at directories will necessarily not do anything for fetch and extract phases.

zkat Mar 2, 2017

How do you feel about pacote doing this automagically on pacote.extract?

iarna Mar 2, 2017

Hrm. I don't think pacote has quite enough information currently to resolve this sort of thing. The specifier and the target destination aren't enough. You also need to know the location of the module that required it, because the specifier is relative to THAT.

zkat Mar 2, 2017

Can/should we add this to realize-package-specifier? We use where there already.

iarna Mar 2, 2017

Maybe, I mean, we might actually already be resolving it, come to think of it. It just requires some end-to-end care.

 file:///foo/bar reference the same package.
 * … or a relative path (eg ../path/to/thing, path\to\subdir). Leading slashes on a file specifier will be removed, that is file://../foo/bar references the same package as file:../foo/bar. The latter is

zkat Mar 2, 2017

Can you add a quick clarification that file://foo/bar is considered relative, too? At least that's what I assume from reading this. It's a weird, sandwichy corner case, but it's probably worth actually specifying:

• file:///foo -> Absolute /foo
• file://foo -> Relative, same as ./foo
• file:/foo -> Absolute /foo

iarna Mar 2, 2017

The number of leading slashes changes nothing:

file:///foo -> /foo
file://foo -> /foo
file:/foo -> /foo

With no leading slashes, relative paths are evident:

file:foo -> ./foo
file:../foo -> ../foo

I'm proposing that if you have .. as your first path element then we ignore any leading slashes:

file:/../foo -> ../foo
file://../foo -> ../foo
file:///../foo -> ../foo

I think this would eliminate whole classes of likely errors, particularly around users who are trying to make these things work like URLs. Also /../ is not a construct that would ever normally exist. It's a nonsensical noop in and of itself.
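
The slash-handling rules proposed in this comment can be sketched as a small normalizer. This is a rough illustration under stated assumptions, not npm's actual parser: `normalizeFileSpec` is a hypothetical name, and only POSIX-style forward slashes are handled (Windows drive letters and backslashes are out of scope here).

```typescript
// Hypothetical sketch of the proposed file: path rules; not npm's code.
function normalizeFileSpec(spec: string): string {
  if (!spec.startsWith("file:")) {
    throw new Error(`not a file: specifier: ${spec}`);
  }
  const rest = spec.slice("file:".length);
  // Drop every leading slash; how many there were never matters.
  const stripped = rest.replace(/^\/+/, "");
  const hadLeadingSlash = stripped.length !== rest.length;

  // ".." as the first path element: leading slashes are ignored entirely.
  if (stripped === ".." || stripped.startsWith("../")) {
    return stripped;
  }
  // Any number of leading slashes otherwise means an absolute path.
  if (hadLeadingSlash) {
    return "/" + stripped;
  }
  // No leading slashes: a path relative to the requiring package.
  if (stripped === "." || stripped.startsWith("./")) {
    return stripped;
  }
  return "./" + stripped;
}
```

Under these rules `file://foo` normalizes to `/foo`, which is the reading given in this reply rather than the relative reading suggested in the question above it.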

 * Attempting to install a specifer that has a windows drive letter will produce an error on non-Windows systems. * A valid file: specifier points is: * a valid package file. That is, a .tar, .tar.gz or .tgz containing

zkat Mar 2, 2017

We should make sure the error message from this is VERY CLEAR that we determine tarballness based on file extension. I know we talked about this a lot, but I still want to avoid linuxy/unixy folks pulling their hair out if they, say, try to install a tarball they downloaded from a webservice that does not specify a filename and they end up with an unsuffixed filename that is still technically a tarball.

Or, for example, if someone tries to npm install ~/.npm/.pacote/content/deadbeef, which do not have .tgz suffixes.
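
A minimal sketch of the extension-based check discussed here, with an error message that states the rule explicitly. The names `hasTarballExtension` and `nonTarballMessage` and the message wording are hypothetical, and treating the match as case-sensitive is an assumption the RFC text does not settle.

```typescript
// Hypothetical sketch; the extension list comes from the RFC text quoted above.
const TARBALL_EXTENSIONS = [".tar", ".tar.gz", ".tgz"];

// True when the filename ends in one of the recognized tarball extensions.
// Case-sensitive matching is an assumption of this sketch.
function hasTarballExtension(filename: string): boolean {
  return TARBALL_EXTENSIONS.some((ext) => filename.endsWith(ext));
}

// Builds an error message that spells out the extension rule, so a user with
// an unsuffixed tarball can tell why the file was rejected.
function nonTarballMessage(filename: string): string {
  return (
    `"${filename}" is not recognized as a package tarball: ` +
    `tarballs are detected by file extension (${TARBALL_EXTENSIONS.join(", ")}). ` +
    `Rename the file with one of these extensions if it is a tarball.`
  );
}
```

With this check, a path such as `~/.npm/.pacote/content/deadbeef` would be rejected even if its contents are a valid tarball, which is exactly the case the message needs to explain.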

 #### File type specifiers pointing at tarballs

 File-type specifiers pointing at a .tgz or .tar.gz or .tar file will

zkat Mar 2, 2017

missing backtick (`) here after .tar.gz

 dependencies of the linked package will be hoisted to the top level as usual. If the module is outside the package root then dependencies will be installed inside the linked module's node_modules folder.

zkat Mar 2, 2017

Is this going to be done as a separate npm run within that project folder (possibly a subprocess?) or will we build a single ideal tree with some special semantics, and do it in a single npm run? This seems trickier to me than what this single sentence describes. 🤔

iarna Mar 2, 2017

Nah, single thing, it doesn't even require much in the way of special semantics. It's basically the same logic as how we define "global" mode currently.

  Package  Current  Wanted  Latest  Location
  a        MISSING  LOCAL   LOCAL   example-package

zkat Mar 2, 2017

oh god Location is such a horrible, confusing name when we factor this in. I really bloody wish we had a different name for that column.

zkat Mar 2, 2017

Additionally, I wish we had somewhere to put more information about the package getting installed, such as from or something, because the package name by itself doesn't give us much. But that's something for a separate RFC.

iarna Mar 2, 2017

Yup, I'd love to change that too

legodude17 Mar 2, 2017

Wait, I am confused. So package is the name, and Location is the path on disk? Or is it the path to it through the dependency chain? Or is it the package that depended on this?

zkat Mar 2, 2017

@legodude17 npm help outdated will answer your questions here.

legodude17 Mar 2, 2017

@zkat thanks for the tip. You are right that Location is a very confusing name for that. Maybe Parent?

zkat Mar 2, 2017

@legodude17 we'll talk about that once there's an RFC for it. Talking about it in here is bound to get lost in oblivion (and is offtopic for this PR)

legodude17 Mar 2, 2017

Sounds like a good idea. Do you have an estimated time for this?

iarna Mar 4, 2017

Nope, not part of the current schedule. I have on my todo-list writing up an up-to-date "state of NPM" blog post to bring everyone up-to-date on our goals and plans and expectations.

changed the title from "RFC: file: specifier changes & link: specifier" to "RFC: file: specifier changes" Mar 4, 2017

mlucool commented Mar 8, 2017

 Using symlinks with node requires a flag to be set for it to work correctly: nodejs/node#8749 (review). Is the intent to change how node handles symlinks also?

 One nice feature about how file works today is for peer dependencies. Because it copies instead of symlinking, peers with global singletons work as expected. When this moves to symlinks only, the resolution will make certain cases impossible to resolve.

 Example: Module A is a react component and has react as a peerDependency and devDependency. While working on A, I want to test it locally in Module B, so I use file:/path/to/a. Today it will do the right thing and share the global version of react, as it is copied. The dependency resolution won't include a second react. If instead we did a symlink, this would not work because React is installed in A's node_modules and would not be the same instance as in B's node_modules.

 Let me know if that example needs more clarification or if I have misunderstood the RFC.

zkat commented Mar 8, 2017

 @mlucool my understanding of this case is that it would continue to work so long as Module A is anywhere inside the directory structure for B when you link it. That is:

 module-b \ | - module-a | \ | node_modules \ node_modules \ react

 module-a will have its react hoisted to the top. It's worth being more explicit about how we handle peerDeps in this situation. I hopefully didn't misunderstand the RFC in this case.

 @iarna it might be worth specifying the logic for peerDependencies and devDependencies for these linked submodules in the RFC, cause I don't think I'm that clear on it rn.

mlucool commented Mar 8, 2017

 @zkat In my case A is always outside of B. We do not have a monorepo. Still, this seems to imply that as long as it is not a tar it'll use symlinks. My current hack (which works quite well) is to create a symlink-like util that does:

 1. Uninstall A
 2. Install via file:/path/to/a
 3. Start a file watcher for all files in A's package.json:files
 4. Start a file watcher for A's package.json to start from 1 again

 Now any change I make appears nearly instantly. Clearly this is a hack, but it means I can separate components across repos and not worry about dependency issues.

zkat commented Mar 8, 2017

 @mlucool if you're not going to use a monorepo structure, you could literally just run npm pack and then npm install file://path/to/module-a/module-a-1.2.3.tgz, which will have the same behavior as npm i file://path/to/module-a currently does. We're not changing local tarballs. It's a single extra step, but also something the CLI was already doing (it had to, in order to install the package)

zkat commented Mar 8, 2017

 and yes, we will always create symlinks for directory specs. The difference is that we hoist deps normally iff the target package is in a subdir of the current package.

mlucool commented Mar 8, 2017

 @zkat I could do that to keep my hack working. I am still a little unsure about how this hoisting will play out because you can imagine something like:

 examples
   \ example1
     \ package.json
 components
   \ component1
     \ package.json

 If I did a file: in example1, this would not work as far as I can tell now (assuming this is still a React example). I believe this paradigm is relatively common also.

 Any comments on the node flag required for this to work?

 P.S. I am a BIG fan of better support for symlinks, per my commit to node.

zkat commented Mar 9, 2017

 @mlucool It's worth noting that it's tremendously unlikely for us at npm to try to change module resolution, including symlink stuff, ourselves. afaik, there's no intent for us to actually go and push for a change on that end in order to land this.

 More likely, I think, is discussing the possibility of taking that flag into account when deciding how to build the hoisted tree -- some users will want that symlink preservation, some won't. I'll defer to @iarna about whether she thinks this is something worth exploring, with a note that any current hacks around directory dependencies can continue to exist with that npm pack thing I mentioned above.

 @iarna does it make sense to you to read process.env.NODE_PRESERVE_SYMLINKS from the env and hoist across symlink boundaries if we find that flag?

mlucool commented Mar 9, 2017

 IMO if something is outside of a module you should never touch its contents.

iarna commented Mar 10, 2017

 @mlucool As @zkat says, changing the default Node.js module loader behavior isn't on the table, which is why 'preserve-symlinks' landed as an option. Thank you for bringing this up! What you described isn't a use-case we had discussed previously. I think having it change install behavior based on NODE_PRESERVE_SYMLINKS is actually mandatory. npm as part of installation has to model how Node.js would load the package to correctly determine where to put its dependencies. This is a detail we've thus far not had to deal with because npm install has treated symlinks as opaque things it can't know details of. So yeah, if you have NODE_PRESERVE_SYMLINKS then npm will install transitive dependencies of linked dependencies in your local project, not in the symlinked module. That is, it will work the same way it works when the symlinked module is in a subdirectory of your project.
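
The placement rule that emerges from this exchange can be sketched roughly as below. This is an illustration only, not npm's tree-building code: `whereToInstallDeps` and `isInside` are hypothetical names, and both paths are assumed to be absolute, already normalized, and POSIX-style.

```typescript
// Hypothetical sketch of where a linked module's dependencies would go.
type DepsLocation = "project" | "linked-module";

// True when target lies strictly below root. Assumes normalized absolute
// POSIX paths, so a plain prefix comparison is enough for this sketch.
function isInside(root: string, target: string): boolean {
  const base = root.endsWith("/") ? root : root + "/";
  return target.startsWith(base);
}

function whereToInstallDeps(
  projectRoot: string,
  linkTarget: string,
  preserveSymlinks: boolean
): DepsLocation {
  // With symlink preservation the loader resolves from the link's location,
  // so transitive dependencies go in the local project.
  if (preserveSymlinks) {
    return "project";
  }
  // Otherwise dependencies are hoisted only when the link target sits inside
  // the project; an outside target keeps them in its own node_modules.
  return isInside(projectRoot, linkTarget) ? "project" : "linked-module";
}
```

A caller would derive `preserveSymlinks` from the `NODE_PRESERVE_SYMLINKS` environment variable mentioned above; how exactly that value is interpreted is left out of the sketch.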

octogonz commented Oct 28, 2017

 Hi @iarna,

 We're having an issue with the "file://" specifier not being handled correctly by NPM 5. We have an automated tool that generates a package.json with references like this:

   "dependencies": {
     "yargs": "~4.6.0",
     "z-schema": "~3.18.3",
     "project1": "file:./projects/project1.tgz",
     "project2": "file:./projects/project2.tgz",

 The "project1.tgz" archive contains a package.json file like this:

   {
     "name": "project1",
     "version": "0.0.0",
     "private": true,

 When NPM 4 installs it, the node_modules/project1/package.json looks like this, as expected:

   "version": "0.0.0"

 However, NPM 5 for some reason writes it like this:

   "version": "file:projects/project1.tgz"

 Was this an intentional design change? Is it an NPM 5 bug? It's causing trouble because read-package-tree chokes on this JSON value. We get an error like this:

   ERROR: Failed to parse package.json for project1: Invalid version: "file:projects/project1.tgz"

octogonz commented Oct 30, 2017

 I'm pretty sure this is an NPM bug. I've opened #19006

adrian-gierakowski commented Mar 6, 2018

 @iarna what happened to this proposal? has it been rejected? btw. looks like something similar has been implemented in yarn some time ago

zkat commented Mar 6, 2018

 @adrian-gierakowski it was merged and implemented as part of npm@5.0.0

