This repository has been archived by the owner on Aug 11, 2022. It is now read-only.

RFC: file: specifier changes #15900

Closed
wants to merge 210 commits into from

iarna
Contributor

@iarna iarna commented Feb 28, 2017

Easy reading link for the new specification. This also introduces written specifications for npm features, something we intend to extend out to all of npm's activity.

The Problem We're Solving

File type dependency specifiers are confusing. How they interact across npm's many varied commands (npm install --save, npm update, npm outdated, npm shrinkwrap) has never been defined in one place. Most of the behaviors "happened by default". That is, the minimum implementation was done to make them install ok and the rest was left as a side effect.

Currently npm update and npm outdated simply don't do anything for local dependencies. If you've updated the source you have to manually reinstall to get a copy of it. How npm shrinkwrap saves local dependencies to npm-shrinkwrap.json and how it resolves them has varied over time as well.

Root Causes

When local dependencies were added, we didn't have any process around defining behavior or ensuring that all use cases were specified.

Proposal

In addition to actually specifying behavior (often simply writing down what
things currently do), we propose an important breaking change:

  • file:-type specifiers that refer to directories will be soft deprecated, their behavior being identical to the new link: specifier and their existence becoming a footnote in the documentation.
  • A new specifier type, link:, will be introduced for linking local directories. For the duration of npm@5, file: specifiers that refer to directories will be treated identically to link: specifiers.
  • link: specifiers will result in a symlink or junction made from the specifier path into your node_modules. On Windows try a junction and if that fails, try a symlink. If both fail, the error from the junction should be used.
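The Windows fallback order described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not npm's actual implementation: `linkDirectory` and its `platform` and `fsImpl` parameters are hypothetical names introduced here so the fallback can be exercised without a Windows machine, and resolving the target to an absolute path is an assumption made for simplicity (Node.js requires junction targets to be absolute).

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: link `target` (a local package directory) at `linkPath`
// (for example node_modules/a). Returns which kind of link was created.
function linkDirectory(target, linkPath, platform = process.platform, fsImpl = fs) {
  // Assumption: always link to an absolute path.
  const absTarget = path.resolve(target);

  if (platform !== 'win32') {
    // The type argument is only meaningful on Windows.
    fsImpl.symlinkSync(absTarget, linkPath, 'dir');
    return 'symlink';
  }

  try {
    // On Windows, try a junction first.
    fsImpl.symlinkSync(absTarget, linkPath, 'junction');
    return 'junction';
  } catch (junctionError) {
    try {
      // If the junction fails, fall back to a directory symlink.
      fsImpl.symlinkSync(absTarget, linkPath, 'dir');
      return 'symlink';
    } catch (symlinkError) {
      // If both fail, report the junction error, as the proposal asks.
      throw junctionError;
    }
  }
}
```

Passing a stub for `fsImpl` makes it possible to check that the junction error, not the symlink error, is the one that surfaces when both attempts fail.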

This RFC essentially brings linklocal's behavior into core.

RISKS

  • Diagnostic information changes need to be handled delicately in order to not increase user confusion.
  • Some users may be unhappy with the changes to file: semantics.

@timoxley
Contributor

timoxley commented Mar 1, 2017

Nice! Why new link: vs just using file:?

@iarna
Contributor Author

iarna commented Mar 1, 2017

The reasons for introducing the new specifier are:

  1. Have a specifier that immediately tells someone reading a package.json what it will do.
  2. Not have fights over what a valid file: URL looks like.

Originally there was a plan to actually hard deprecate the file: specifier and remove them, but we decided that was pointlessly disruptive. Instead file: will be an alias for link:.

@timoxley
Contributor

timoxley commented Mar 1, 2017

@iarna Sounds reasonable 👍

Have you considered the impact of having different module resolution paths when using symlinks?
I currently only use symlinks in development, while in production I let the current file: behaviour do its thing. Without the symlinks npm is able to move many shared packages to the top level, decreasing overall install time. This could have significant impact on deploy times if local dependencies share a lot of binary dependencies.

@iarna
Contributor Author

iarna commented Mar 1, 2017

@timoxley It's a fair question. As it's specced right now it would do that. Maybe, if the linked module is inside the top level install root we should flatten its deps down, instead of keeping them inside the linked module. The module loader would still be able to pick them up and it'd not increase disk usage.

@timoxley
Contributor

timoxley commented Mar 1, 2017

if the linked module is inside the top level install root we should flatten its deps down

@iarna that sounds good. Being aware of where symlinks resolve to perhaps could be considered a good general improvement to the dedupe/flattening algorithm.

* Attempting to install a specifer that has a windows drive letter will
produce an error on non-Windows systems.

* A valid `file:` specifier points is:
Contributor

Did you mean "A valid file: specifier also is" like in the next item? I am finding it difficult to understand this line.

Contributor

I would say it should be "Also, a valid file: specifier is:"


Historically, these ambiguous specifiers were also allowed in the
`package.json`. Starting in `npm@5` using an ambiguous specifier in your
shrinkwrap will be depricated and will warn. In `npm@6` it will be an
Contributor

Typo: depricated -> deprecated

@mantoni
Contributor

mantoni commented Mar 1, 2017

This is great! I would like to know if you considered supporting the current npm ln use-case? I'm thinking of link:module-name to resolve to the "global" link which in turn links to the actual module directory. Contrary to the relative link paths, this wouldn't enforce a specific directory structure. What do you think?

Contributor

@legodude17 legodude17 left a comment

This seems like it would really help a lot of people. 🎆

* A valid `file:` specifier points is:
* a valid package file. That is, a `.tar`, `.tar.gz` or `.tgz` containing
`<dir>/package.json`.
* OR, a directory that contains a `package.json`
Contributor

Should that 'OR' be removed or should all of these start with 'OR'? Consistency.

Contributor Author

There are only two of them and the OR separates the two. So I'm not sure what you mean by "all of these".

* a valid package file. That is, a `.tar`, `.tar.gz` or `.tgz` containing
`<dir>/package.json`.
* OR, a directory that contains a `package.json`
* And a valid `file:` specifier also is:
Contributor

Shouldn't this not be a bullet? Or just remove it.

Contributor Author

It should. It's part of the top level list.

The `preinstall` for file-type specifiers MUST be run AFTER the
`finalize` phase as the symlink may be a relative path reaching outside the
current project root and a symlink that resolves in `.staging` won't resolve
in the package's final resting place.
Contributor

Would the preinstall for a package depended on via link: be run from its original folder, or somewhere else?

Contributor Author

@iarna iarna Mar 2, 2017

Basically this is me trying to not screw people who are using pwd (which would be relative to where your package root's node_modules).

Folks who are using cwd would always get the same answer (the destination of the link).

```
example-package@1.0.0 /path/to/example-package
+-- a -> Copied from: link:../a
```
Contributor

Again, is this repetition intentional?

```
example-package@1.0.0 /path/to/example-package
+-- a -> link:../a
```
Contributor

Why are there two parts with the same thing? Is that intentional?

@iarna
Contributor Author

iarna commented Mar 2, 2017

So after some discussion in air between @zkat and me, we've reached a consensus that we should go ahead with just file: and not add a new specifier. I'll update the RFC as appropriate.

@coveralls

Coverage Status

Coverage remained the same at 86.013% when pulling 43db052 on link-specifier into a08189f on latest.


```
example-package@1.0.0 /path/to/example-package
+-- a -> file:../a
```
Contributor

Still, is this repetition intentional?

Contributor

what repetition? directories and package names can be different: a -> file:../b is perfectly valid

Contributor Author

One of these is with unicode enabled, one without. I wanted to specify both.

Contributor

@legodude17 legodude17 Mar 2, 2017

Oh. Sorry. I didn't notice that.


```
Package Current Wanted Latest Location
a MISSING LOCAL LOCAL example-package
```
Contributor

I would suggest that it should also log where it is looking for the package in.

Contributor Author

As zkat says below, changing outdated to be more useful is really a separate RFC.

Contributor

Ok, cool. Sounds like a really good thing.

Contributor

@zkat zkat left a comment

pretty excited about this. Thanks for writing the spec up <3

#### File type specifers pointing at directories

File-type specifiers that point at directories will necessarily not do
anything for `fetch` and `extract` phases.
Contributor

How do you feel about pacote doing this automagically on pacote.extract?

Contributor Author

Hrm. I don't think pacote has quite enough information currently to resolve this sort of thing. The specifier and the target destination aren't enough. You also need to know the location of the module that required it, because the specifier is relative to THAT.

Contributor

Can/should we add this to realize-package-specifier? We use where there already.

Contributor Author

Maybe. I mean, we might actually already be resolving it, come to think of it. It just requires some end-to-end care.

`file:///foo/bar` reference the same package.
* … or a relative path (eg `../path/to/thing`, `path\to\subdir`). Leading
slashes on a file specifier will be removed, that is 'file://../foo/bar`
references the same package as same as `file:../foo/bar`. The latter is
Contributor

Can you add a quick clarification that file://foo/bar is considered relative, too? At least that's what I assume from reading this. It's a weird, sandwichy corner case, but it's probably worth actually specifying:

  • file:///foo -> Absolute /foo
  • file://foo -> Relative, same as ./foo
  • file:/foo -> Absolute /foo

Contributor Author

@iarna iarna Mar 2, 2017

Number of leading slashes changes nothing:

file:///foo -> /foo
file://foo -> /foo
file:/foo -> /foo

With no leading slashes, relative paths are evident:

file:foo -> ./foo
file:../foo -> ../foo

I'm proposing that if you have .. as your first path element then we ignore any leading slashes:

file:/../foo -> ../foo
file://../foo -> ../foo
file:///../foo -> ../foo

I think this would eliminate whole classes of likely errors, particularly around users who are trying to make these things work like URLs. Also /../ is not a construct that would ever normally exist. It's a nonsensical noop in and of itself.
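The slash-handling rules proposed in this comment can be sketched as a small normalizer. This is only an illustration of the rules as stated above, not the code npm shipped: `resolveFileSpec` is a hypothetical name, Windows drive letters and backslashes are deliberately ignored, and the handling of a leading `./` is an assumption added so the output stays tidy.

```javascript
// Hypothetical helper: turn a file: specifier into the path it refers to,
// following the leading-slash rules described in the comment above.
function resolveFileSpec(spec) {
  if (!spec.startsWith('file:')) {
    throw new TypeError(`not a file: specifier: ${spec}`);
  }
  const rest = spec.slice('file:'.length);
  const stripped = rest.replace(/^\/+/, ''); // drop any run of leading slashes
  const hadLeadingSlash = stripped.length !== rest.length;
  const firstElement = stripped.split('/')[0];

  // A leading ".." always means a relative path; leading slashes are ignored.
  if (firstElement === '..') return stripped;
  // Otherwise one, two or three leading slashes all mean an absolute path.
  if (hadLeadingSlash) return '/' + stripped;
  // No leading slash: relative to the package that declared the dependency.
  // Assumption: an explicit "./" prefix is kept as-is rather than doubled.
  return firstElement === '.' ? stripped : './' + stripped;
}
```

Note that under these rules `file://foo` resolves to the absolute path `/foo`, which differs from the "relative" reading suggested in the review comment this reply answers.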

* Attempting to install a specifer that has a windows drive letter will
produce an error on non-Windows systems.
* A valid `file:` specifier points is:
* a valid package file. That is, a `.tar`, `.tar.gz` or `.tgz` containing
Contributor

We should make sure the error message from this is VERY CLEAR that we determine tarballness based on file extension. I know we talked about this a lot, but I still want to avoid linuxy/unixy folks pulling their hair out if they, say, try to install a tarball they downloaded from a webservice that does not specify a filename and they end up with an unsuffixed filename that is still technically a tarball.

Or, for example, if someone tries to npm install ~/.npm/.pacote/content/deadbeef, which do not have .tgz suffixes.
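A rough sketch of the extension check and the kind of explicit message being asked for here. The suffix list comes from the RFC text quoted above; `looksLikeTarball`, `tarballHint` and the message wording are hypothetical, and the case-sensitive comparison is an assumption.

```javascript
// Suffixes the quoted RFC text names for package tarballs.
const TARBALL_SUFFIXES = ['.tgz', '.tar.gz', '.tar'];

// True when the filename would be treated as a tarball purely by its suffix.
function looksLikeTarball(filename) {
  return TARBALL_SUFFIXES.some((suffix) => filename.endsWith(suffix));
}

// Hypothetical message: returns null for tarball-looking names, otherwise a
// hint that spells out that detection is based on the file extension.
function tarballHint(filename) {
  if (looksLikeTarball(filename)) return null;
  return (
    `"${filename}" is not treated as a tarball: detection is based on the ` +
    `file extension (${TARBALL_SUFFIXES.join(', ')}). Rename the file, or ` +
    'point at a directory that contains a package.json.'
  );
}
```

With a hint like this, an unsuffixed download such as the `deadbeef` content file mentioned above would produce a message naming the accepted extensions instead of a generic failure.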


#### File type specifiers pointing at tarballs

File-type specifiers pointing at a `.tgz` or `.tar.gz or `.tar` file will
Contributor

missing backtick (`) here after .tar.gz

dependencies of the linked package will be hoisted to the top level as usual.

If the module is outside the package root then dependencies will be installed inside
the linked module's `node_modules` folder.
Contributor

Is this going to be done as a separate npm run within that project folder (possibly a subprocess?) or will we build a single ideal tree with some special semantics, and do it in a single npm run? This seems trickier to me than what this single sentence describes. 🤔

Contributor Author

Nah, single thing, it doesn't even require much in the way of special semantics. It's basically the same logic as how we define "global" mode currently.


```
Package Current Wanted Latest Location
a MISSING LOCAL LOCAL example-package
```
Contributor

oh god Location is such a horrible, confusing name when we factor this in. I really bloody wish we had a different name for that column.

Contributor

Additionally, I wish we had somewhere to put more information about the package getting installed, such as from or something, because the package name by itself doesn't give us much. But that's something for a separate RFC.

Contributor Author

Yup, I'd love to change that too

Contributor

Wait, I am confused. So package is the name, and Location is the path on disk? Or is it the path to it through the dependency chain? Or is it the package that depended on this?

Contributor

@legodude17 npm help outdated will answer your questions here.

Contributor

@zkat thanks for the tip. You are right that Location is a very confusing name for that. Maybe Parent?

Contributor

@legodude17 we'll talk about that once there's an RFC for it. Talking about it in here is bound to get lost in oblivion (and is offtopic for this PR)

Contributor

Sounds like a good idea. Do you have an estimated time for this?

Contributor Author

Nope, not part of the current schedule. I have on my todo-list writing up an up-to-date "state of NPM" blog post to bring everyone up-to-date on our goals and plans and expectations.

@iarna iarna changed the title from "RFC: file: specifier changes & link: specifier" to "RFC: file: specifier changes" Mar 4, 2017
@mlucool

mlucool commented Mar 8, 2017

Using symlinks with node requires a flag to be set for it to work correctly: nodejs/node#8749 (review). Is the intent to change how node handles symlinks also?

One nice feature about how file works today is for peer dependencies. Due to the copy instead of a symlink, peers with global singletons work as expected. When this moves to symlinks only, the resolution will make certain cases impossible to resolve. Example:

  • Module A is a react component and has react as a peerDependency and devDependency
  • While working on A, I want to test this locally in Module B , so I use file:/path/to/a.
    • Today it will do the right thing and share the global version of react as it is copied. The dependency resolution won't include a second react.
    • If instead we did a symlink, this would not work because React is installed in A's node_modules and would not be the same instance as in B's node_modules.

Let me know if that example needs more clarification or if I have misunderstood the RFC.

@zkat
Contributor

zkat commented Mar 8, 2017

@mlucool my understanding of this case is that it would continue to work so long as Module A is anywhere inside the directory structure for B when you link it.

That is:

module-b
\
 | - module-a
 |   \
 |     node_modules
 \
  node_modules
  \
    react

module-a will have its react hoisted to the top. It's worth being more explicit about how we handle peerDeps in this situation. I hopefully didn't misunderstand the RFC in this case.

@iarna it might be worth specifying the logic for peerDependencies and devDependencies for these linked submodules in the RFC, cause I don't think I'm that clear on it rn.

@mlucool

mlucool commented Mar 8, 2017

@zkat In my case A is outside of B always. We do not have a monorepo. Still, this seems to imply that as long as it is not a tarball it'll use symlinks.

My current hack (which works quite well) is to create a symlink-like util that does:

  1. Uninstall A
  2. Install via file:/path/to/a
  3. Start a file watcher for all files in A's package.json:files
  4. Start a file watcher for A's package.json to start from 1 again

Now for any change I make it appears nearly instantly. Clearly this is a hack, but it maintains that I can separate components across repos and not worry about dependency issues.

@zkat
Contributor

zkat commented Mar 8, 2017

@mlucool if you're not going to use a monorepo structure, you could literally just run npm pack and then npm install file://path/to/module-a/module-a-1.2.3.tgz, which will have the same behavior as npm i file://path/to/module-a currently does. We're not changing local tarballs. It's a single extra step, but also something the CLI was already doing (it had to, in order to install the package)

@zkat
Contributor

zkat commented Mar 8, 2017

and yes, we will always create symlinks for directory specs. The difference is that we hoist deps normally iff the target package is in a subdir of the current package.

@mlucool

mlucool commented Mar 8, 2017

@zkat I could do that to keep my hack working. I am still a little unsure about how this hoisting will play out because you can imagine something like:

examples
\
  example1
    \ 
     package.json
components
\
  component1
   \ 
    package.json

If I did a file: in example1, this would not work as far as I can tell now (assuming this is still a React example). I believe this paradigm is relatively common also.

Any comments on the node flag required for this to work?

P.S. I am a BIG fan of better support for symlinks, per my commit to node.

@zkat
Contributor

zkat commented Mar 9, 2017

@mlucool It's worth noting that it's tremendously unlikely for us at npm to try to change module resolution, including symlink stuff, ourselves. afaik, there's no intent for us to actually go and push for a change on that end in order to land this.

More likely, I think, is discussing the possibility of taking that flag into account when deciding how to build the hoisted tree -- some users will want that symlink preservation, some won't. I'll defer to @iarna about whether she thinks this is something worth exploring, with a note that any current hacks around directory dependencies can continue to exist with that npm pack thing I mentioned above.

@iarna does it make sense to you to read process.env.NODE_PRESERVE_SYMLINKS from the env and hoist across symlink boundaries if we find that flag?

@mlucool

mlucool commented Mar 9, 2017

IMO if something is outside of a module you should never touch its contents.

@iarna
Contributor Author

iarna commented Mar 10, 2017

@mlucool As @zkat says, changing the default Node.js module loader behavior isn't on the table, which is why 'preserve-symlinks' landed as an option.

Thank you for bringing this up! What you described isn't a use-case we had discussed previously.

I think having it change install behavior based on NODE_PRESERVE_SYMLINKS is actually mandatory. npm as part of installation has to model how Node.js would load the package to correctly determine where to put its dependencies. This is a detail we've thus far not had to deal with because npm install has treated symlinks as opaque things it can't know details of.

So yeah, if you have NODE_PRESERVE_SYMLINKS then npm will install transitive dependencies of linked dependencies in your local project, not in the symlinked module. That is, it will work the same way it works when the symlinked module is in a subdirectory of your project.
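The placement rule discussed across the last few comments can be sketched as below. This is a simplified model under stated assumptions, not npm's installer logic: `dependencyRootFor` is a hypothetical name, the `env` parameter exists only so the behavior can be tested, and treating the flag as enabled only when its value is `'1'` is an assumption based on how Node.js documents `NODE_PRESERVE_SYMLINKS`.

```javascript
const path = require('node:path');

// Hypothetical helper: decide under which directory the dependencies of a
// directory-type file: dependency should be installed.
function dependencyRootFor(projectRoot, linkTarget, env = process.env) {
  const root = path.resolve(projectRoot);
  const target = path.resolve(root, linkTarget);
  const rel = path.relative(root, target);

  // The target is inside the project when the relative path does not climb out.
  const insideProject =
    rel !== '' && rel.split(path.sep)[0] !== '..' && !path.isAbsolute(rel);

  // Assumption: the flag counts as set only when its value is '1'.
  const preserveSymlinks = env.NODE_PRESERVE_SYMLINKS === '1';

  // Inside the project, or with preserved symlinks: hoist into the project.
  if (insideProject || preserveSymlinks) return root;
  // Otherwise the linked module keeps its dependencies in its own node_modules.
  return target;
}
```

For the examples/components layout raised earlier in the thread, `component1` lies outside `example1`, so in this model its dependencies stay with `component1` unless the flag is set.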

@octogonz

Hi @iarna,

We're having an issue with the "file://" specifier not being handled correctly by NPM 5. We have an automated tool that generates a package.json with references like this:

"dependencies": {
    "yargs": "~4.6.0",
    "z-schema": "~3.18.3",
    "project1": "file:./projects/project1.tgz",
    "project2": "file:./projects/project2.tgz",

The "project1.tgz" archive contains a package.json file like this:

{
  "name": "project1",
  "version": "0.0.0",
  "private": true,

When NPM 4 installs it, the node_modules/project1/package.json looks like this, as expected:

  "version": "0.0.0"

However, NPM 5 for some reason writes it like this:

  "version": "file:projects/project1.tgz"

Was this an intentional design change? Is it an NPM 5 bug?

It's causing trouble because read-package-tree chokes on this JSON value. We get an error like this:

ERROR: Failed to parse package.json for project1: Invalid version: "file:projects/project1.tgz"

@octogonz

I'm pretty sure this is an NPM bug. I've opened #19006

dhei added a commit to microsoft/appcenter-sdk-react-native that referenced this pull request Nov 30, 2017
…m install` on npm5

problem: npm4 install local dependencies by copying over the local package but npm5 install local dependencies as symlink. This cause issues for our TestApp that install local appcenter* packages because "Header Search Paths" can't find react native modules.

solution: `npm pack appcenter && npm install appcenter.tgz` will package up and then install the tgz files, which avoid the symlinks. (see npm/npm#15900 (comment))
lkraav added a commit to lkraav/pure that referenced this pull request Dec 15, 2017
* we want to have assets available via simple `npm install`
* npm 5 changed `file:` semantics into `link:`, breaks local installs
* https://www.npmjs.com/package/install-local is cumbersome to use

[1]: npm/npm#15900
lkraav added a commit to lkraav/pure that referenced this pull request Dec 16, 2017
* we want to have assets available via simple `npm install`
* npm 5 changed `file:` semantics into `link:`, breaks local installs
* https://www.npmjs.com/package/install-local is cumbersome to use

[1]: npm/npm#15900
@iarna iarna deleted the link-specifier branch February 1, 2018 01:32
evocateur added a commit to lerna/lerna that referenced this pull request Feb 13, 2018
This helps avoid publishing broken packages when using npm5's new relative link specifiers

npm/npm#15900
@adrian-gierakowski

@iarna what happened to this proposal? has it been rejected?

btw. looks like something similar has been implemented in yarn some time ago

@zkat
Contributor

zkat commented Mar 6, 2018

@adrian-gierakowski it was merged and implemented as part of npm@5.0.0

evocateur added a commit to lerna/lerna that referenced this pull request Mar 8, 2018
@imyangyong imyangyong mentioned this pull request Jun 11, 2020