Yarn Plug'n'Play: Getting rid of node_modules #101

@arcanis
Member

arcanis commented Sep 13, 2018

Note: This PR's discussion should be focused on the high-level design. A separate PR (yarnpkg/yarn#6382) has been opened on the code repository to discuss the implementation details.

A pdf has been generated and made available here for easier reading. Note that this PR remains the reference location for the most up-to-date information regarding this proposal.

Hi folks!

In this RFC we propose a new, alternative, and entirely optional way to resolve dependencies installed on the disk, in order to solve the issues caused by the incomplete knowledge Node has of the dependency tree. We also detail the actual implementation we went with, describing the rationale behind the design choices we made.

I'll keep it short here since there's much to discuss in the document itself, but here are some highlights:

  • Installs run using Plug'n'Play are up to 70% faster than regular ones (sample app)
  • Starting from this PR, Yarn will now be on the path to make yarn install a no-op on CI
  • Yarn will now be able to tell you precisely when you forgot to list packages in your dependencies
  • Your applications will boot faster through a hybrid approach of static resolutions

This is only a high-level description of some of the benefits unlocked by Plug'n'Play; I encourage you to take a look at the document for more information about the specific design choices - and if anything is missing, please ask and I'll do my best to explain it in more depth!

I should mention that we've been using it in production inside Facebook for about two weeks now, and haven't hit any issues since then. Now that it has passed this trial by fire, we feel confident enough that this is the right solution to share it openly, so that we can all iterate on it.

Working on this project has been super exciting for me, and I can't wait to see the new possibilities it will unlock! Especially from a tooling perspective, having a unified indirection that allows package managers to dictate how dependencies are loaded unlocks incredible new patterns and makes it easier and safer for tools to integrate with it.

Paging some community members that have been made aware of the project during its development and helped us in various ways, either through actual contributions (kudos to @imsnif for implementing yarn unplug!) or by their feedback:

@orta

orta commented Sep 13, 2018

I read the paper. I love the concept; moving the resolver into an exportable .pnp.js file is a particularly smart idea.

@davidnagli

davidnagli commented Sep 13, 2018

Really love this idea! Would this have to be implemented at the package manager level, the bundler level, or both?

@arcanis

Member

arcanis commented Sep 13, 2018

@davidnagli Mostly package managers. Bundlers could, however, consume the .pnp.js file to do special processing and provide better integration with the dependency tree (such as workspaces).

@azz

azz commented Sep 13, 2018

⚰️node_modules

This is fantastic!

A few random thoughts:

  • This will probably break a lot of tools and libraries that rely on node_modules existing, like read-pkg-up (7m weekly downloads). A migration strategy would need to be documented.

  • How will bin commands be handled? Currently they're in node_modules/.bin which is added to the $PATH. Will that still exist or be moved to the cache? (I guess PATH=$PATH:$(yarn bin) could help in that case.)

  • Has this approach been tested with other tools that modify node resolution, like esm?

    # will this work?
    node -r esm -r .pnp.js index.js 
  • In order to solve this, Plug'n'Play details a special case when a package makes a require call to a package it doesn't own but that the top-level has listed as one of its dependencies.

    This would have to include peerDependencies, right?

  • I might be missing something, but why have relative paths from the workspace root to the yarn cache? Could the path to the yarn cache be added to the environment and then the paths look more like $YARN_CACHE/lodash/4.0.0/flatMap.js?

@zkochan

zkochan commented Sep 13, 2018

For reference, I always wanted to symlink packages directly from the store in pnpm (related issue: pnpm/pnpm#1001).

We also did a partial implementation (using the --independent-leaves flag). However, too many packages in the ecosystem currently rely on their real location, so we decided to make the change later. But it would be a lot faster than the current algos used by npm/yarn/pnpm.

Lots of packages already don't work with pnpm because of its strict node_modules. This is one of the reasons pnpm isn't adopted as much as Yarn or npm. I see in the RFC that it is planned to make this hooked resolution algorithm "strict" as well. I think it would be good for pnpm as we don't have the power to make the ecosystem fix itself. Even fixes that we contribute are sometimes not merged and published for years.

So this is a very brave design. I wonder how it will be welcomed when all the issues arise, and whether the design will be adjusted or you will insist on keeping the strict resolution (that scenario would be best for pnpm).

@mischnic

mischnic commented Sep 13, 2018

Strange that npm has just released a similar concept (a package manager called crux): https://blog.npmjs.org/post/178027064160/next-generation-package-management

@schmod

schmod commented Sep 13, 2018

My $0.02 is that any changes to replace the require.resolve algorithm should take place in Node.js core itself, and be guided by Node's community process instead of the (single) commercial entities behind Yarn and npm.

An awful lot of existing code and infrastructure depends on the current behavior of require.resolve, and will break if those semantics are changed. The changes being discussed have the potential to fork or break the Node/NPM ecosystem, and should not be taken lightly.

I don't want to have to worry about whether my dependencies are compatible with node_modules, pnp, or crux.

If a future version of Node allows package managers to supply an alternative implementation of require.resolve, that's fine, but there should be explicit guidelines about how that should work, and how developers should write package-manager-agnostic code. This would be a semver-major change to Node.

@Droogans

Droogans commented Sep 13, 2018

How is this going to change yarn checksum behaviors, if at all? I noticed that it wasn't even mentioned.

@arcanis

Member

arcanis commented Sep 13, 2018

@schmod Note that the goal here isn't to sidestep the Node processes. Rather, we want to show a practical implementation that we know is solid enough for a first iteration, and use it to ignite discussions.

If you want to compare it to something else, see it as what Boost sometimes is to the standard C++ library: it's only after boost::thread and boost::filesystem were proven to work that WG21 felt confident enough to use them as a base for the standard library, knowing the design was of good quality not only in theory but also in practice.

@donaldpipowitch

donaldpipowitch commented Sep 13, 2018

I'm always interested in changes like this one. 👏

I guess this would break TypeScript's @types resolution logic, which expects these modules inside node_modules. What do you think about this change @DanielRosenwasser? Would you support that in TypeScript?

@boblauer

boblauer commented Sep 13, 2018

An important note from the crux readme:

You can still install things in your node_modules folder and those versions will be used in preference to the cached version. This opens a path to live-editing of dependencies (sometimes a necessary debugging technique) ...

Will this be possible with pnp as well? I poke around in node_modules very often to debug and better understand the libraries I'm using.

@arcanis

Member

arcanis commented Sep 13, 2018

Will this be possible with pnp as well? I poke around in node_modules very often to debug and better understand the libraries I'm using.

yarn unplug <pkg-name> will put a copy of the specified package into .pnp/unplugged. You can then inspect/edit this package as you see fit, and once you're done you just have to yarn unplug --clear-all to go back to normal 😃

@deepsweet

deepsweet commented Sep 13, 2018

The current implementation overrides Module._load, but Node 10 recently released a new API that we plan to use to register into the resolver.

Could you please provide a link to that new API?

@arcanis

Member

arcanis commented Sep 13, 2018

Cf. loader hooks. Unfortunately, this API only applies to imports coming from ES modules.
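
For context, here is a minimal sketch of what such a hook looks like with the experimental Node 10 loader API (the flag names and the resolve signature below reflect that experimental API and may change; this is not part of the RFC itself):

    // pnp-loader.mjs - hypothetical sketch of a resolve hook; run with:
    //   node --experimental-modules --loader ./pnp-loader.mjs ./app.mjs
    export async function resolve(specifier, parentModuleURL, defaultResolve) {
      // Relative and absolute requests don't need the dependency tree
      if (specifier.startsWith('.') || specifier.startsWith('/')) {
        return defaultResolve(specifier, parentModuleURL);
      }
      // A PnP-aware loader would map the bare specifier through the .pnp.js
      // resolution tables here, instead of walking node_modules folders
      return defaultResolve(specifier, parentModuleURL);
    }

Note that it only covers import statements; regular require calls never go through it, which is why overriding Module._load is still needed for CommonJS.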

@davidk01

davidk01 commented Sep 13, 2018

How does this work with modules that wrap native code? Is the idea that no more native modules will be allowed? The example given is node-sass, but there are plenty of other modules that require post-install logic.

@Aghassi

Aghassi commented Sep 13, 2018

Noticed the bit about postinstall going away. I agree with the premise for security. I'm curious how that would impact tools like https://github.com/typicode/husky that help developer workflow but rely on a post-install task to set up. Is there a story around how these types of tools would be able to exist in this new world?

@transitive-bullshit

transitive-bullshit commented Sep 13, 2018

Really interesting proposal -- Glad to see that the yarn team is thinking radically towards the future!

One issue with relying on unplugged is that 99% of my use cases for yarn link are to link to a version of a dependency that I'm developing locally in parallel, and maintaining that symlink is important so I can edit the "real" checkout of a dependency instead of a temporary throwaway version in the .pnp/unplugged folder.

Aside from this issue, I'd love to hear the yarn team's thoughts on @azz's questions. I think @schmod's excellent point is answered pretty well by @arcanis's analogy to boost, but it would be great to understand what the Node and NPM folks think early on in this process.

@zkat I'd really love to hear your thoughts on how this proposal differs from and is analogous to some of the design decisions npm has been making with crux.

Thanks && I love the Node community because of awesome developments like this!

@rivertam

rivertam commented Sep 13, 2018

Also on the post-install topic, but a point I find less agreeable:

While native modules have their usefulness, WebAssembly is becoming a more and more serious candidate for a portable bytecode as the months pass.

This impacts a whole class of packages which genuinely rely on actually using some other language. For example, I've built a binding for a camera that interacts with video4linux2. While the camera has official bindings for both Node and C++, it doesn't have bindings for wasm. For this project we genuinely need the low-level capabilities, speed, and parallelism that C++ provides, but we can't use wasm because none of the APIs are available for wasm.

The solutions proposed in this proposal don't impact this in any way, as far as I can tell, and we'll be able to use pnp across all our projects. However, I'm just a little nervous reading that sentence.

@saschagrunert

saschagrunert commented Sep 13, 2018

Very smart and clean concept for next-generation dependency handling with Yarn. Great! 👍

@arcanis

Member

arcanis commented Sep 13, 2018

The solutions proposed in this proposal don't impact this in any way, as far as I can tell, and we'll be able to use pnp across all our projects. However, I'm just a little nervous reading that sentence.

@rivertam Just to be clear: postinstall scripts will be supported, there is no plan to deprecate them in this RFC. I guess I should have been clearer 😃

One issue with relying on unplugged is that my 99% of use case for yarn link is to link to a version of a dependency that I'm developing locally in parallel, and maintaining that symlink is important so I can edit the "real" checkout of a dependency instead of a temporary throwaway version in the .pnp/unplugged folder.

@transitive-bullshit I have some research to do on this side - I believe yarn link can be made compatible with PnP without problem - we would just have to do the same thing as unplug, except that we would generate a symlink instead of a real folder. The main problem will be to reconcile both dependency trees, but I believe it can be done without too much issue 🙂

This will probably break a lot of tools and libraries that rely node_modules existing, like read-pkg-up (7m weekly downloads). A migration strategy would need to be documented.

@azz Yep, totally. For most packages, using require.resolve is enough, but some other might require a bit more tooling. From my experiments, it wasn't super common, and could be fixed without too much efforts. Since projects like Webpack and Babel happen to work just fine, I'm confident we should be mostly fine 👍
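
As a concrete example of the kind of change that is usually enough, a tool that hardcodes a node_modules path can switch to asking Node's resolver instead (the some-pkg name below is just a placeholder):

    // Before: assumes the package physically lives under ./node_modules
    const path = require('path');
    const pkgJsonPath = path.join(process.cwd(), 'node_modules', 'some-pkg', 'package.json');

    // After: delegates to Node's resolver, so it works with node_modules, PnP,
    // or anything else, as long as the resolver is registered (e.g. node -r ./.pnp.js)
    const resolvedPkgJsonPath = require.resolve('some-pkg/package.json');
    const pkg = require(resolvedPkgJsonPath);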

How will bin commands be handled? Currently they're in node_modules/.bin which is added to the $PATH. Will that still exist or be moved to the cache? (I guess PATH=$PATH:$(yarn bin) could help in that case.)

@azz The binaries are "hidden in the cache", but you can access them through yarn run <bin-name> just like before, which does the heavy lifting of figuring out which tools are provided by your dependencies.

This would have to include peerDependencies, right?

@azz Not entirely sure what you mean, but this "fallback to the top-level" doesn't consider whether things are a dependency or not: if a package makes a require call to something it doesn't own (either through its dependencies or its peer dependencies, which are resolved at install-time), the resolution falls back to whatever the top-level has listed in its own dependencies.

I might be missing something, but why have relative paths from the workspace root to the yarn cache?

@azz They were initially absolute, but I made them relative to the location of the .pnp.js file (which is at the root of the project). They could be made relative to an environment variable, even if I'm not sure that's something we really want to encourage. Opinions welcome!


@petetnt

petetnt commented Sep 14, 2018

re: modules that would break under this plan. The similar npm POC tink includes a todo that states:

add fallback where "incompatible" packages get dumped into node_modules (and tagged as such in package-map.json)

Would that be an option for yarn pnp? Or a mechanism to mark certain dependencies incompatible?

@daKmoR

daKmoR commented Sep 14, 2018

There is a somewhat similar idea of using a map to provide package names to the browser:
https://github.com/domenic/package-name-maps

As far as I can see, each map stores completely different information,
so I could see the creation of such a "package-name-map" as a post-install step, using pnp as a source of "truth".

Since the main goal of such package-name-maps is a flat installation for the browser, I'm wondering whether it should "work" combined with "yarn flat" or whether it should have its own way of creating the flat map (via the pnp API?).

I'm just curious :)

@arcanis

Member

arcanis commented Sep 14, 2018

Would that be an option for yarn pnp? Or a mechanism to mark certain dependencies incompatible?

We already have the concept of "unplugged modules" (primarily meant for debug purposes, and triggered manually through yarn unplug). We'll extend this so that Yarn will automatically unplug packages with post-install scripts, as they are the most common issue people will have.

Packages with incompatibilities related to crossing package boundaries won't be subject to this automatic unplug, because it wouldn't help them: even if we were to unplug them, they would still be alone in their node_modules, preventing them from interacting with anything else.

And to be extremely clear: manual unplug (through yarn unplug) is only meant to be used for debugging purposes. I don't recommend any package use it as a formal installation step (part of the reason why it cannot be configured through the package.json at the moment). Package authors should instead find ways to work within the package constraints they are given (which are the same ones as currently, except that we now enforce them), otherwise they'll just make the developer experience of their tools worse for their users.

So as the main goal of such a package-name-maps is a flat installation for the browser, I'm wondering it should "work" combined with "yarn flat" or if it should have its own way creating the flat map (via pnp api?).

That sounds like a prime candidate for a first experiment! I was made aware of the proposal during the PnP development, and while I still think the single-file approach is the one that will work best for the Node ecosystem, generating package-name-maps files from the PnP API sounds super valuable. If anyone wants to give it a try, send me a mail with your results, whether positive or negative!

@nerumo

nerumo commented Sep 14, 2018

How do Docker image builds make use of this? The cache can't be injected at docker build time, since volume mounting isn't available during docker build (yet).

@tracker1

tracker1 commented Sep 14, 2018

@zkochan IIRC, symbolic links on Windows require elevated privileges.

@zkochan

zkochan commented Sep 14, 2018

@tracker1 That is why pnpm uses junctions on Windows.
If you have any questions, I am ready to answer them in our gitter chat.
You can find some info about how we solved these issues for Windows on our FAQ page.

In a nutshell, the solution that pnpm uses for creating the non-flat node_modules works on all operating systems.

@geoffdavis92

geoffdavis92 commented Sep 14, 2018

This proposal sounds great; increased performance of installation and storage is always welcome in my book.

I'm curious how this would affect the ability to inspect individual modules' contents/source in the event that I run into an issue and want to check the code:

  1. Would there be additional yarn commands created that could "explore" the package's dependencies?
  2. Could the .pnp.js file contain an API that could act as a sort of REPL to view package contents, perhaps opening them with an arg-passed terminal text editor or with cat or bat?
@millette

millette commented Sep 14, 2018

FYI, a couple of deno discussions related to packaging:

@arcanis

Member

arcanis commented Sep 14, 2018

I'm curious how this would affect the ability to inspect individual modules' contents/source in the even that I run into an issue and want to check the code:

yarn unplug will copy a package from the cache and put it into a directory, suitable for quick exploration and patching.

Could the .pnp.js file contain an API that could act as a sort of repl to view package contents, perhaps opening them with a arg-passed terminal text editor or with cat or bat?

That sounds out of scope for this proposal 🙂

@geoffdavis92

geoffdavis92 commented Sep 14, 2018

@arcanis awesome, thanks for the reply 👍

@i0natan

i0natan commented Sep 16, 2018

May I ask, probably due to my own ignorance, what's wrong with node_modules or any other way of storing all dependencies within the project?

@transitive-bullshit

transitive-bullshit commented Sep 16, 2018

@i0natan read the pdf linked in the original post 😃 specifically the motivation section.


@KenanSulayman

KenanSulayman commented Sep 17, 2018

This proposal is an applaudable effort, although I would appreciate a more generic standards process. I can already see npm Inc. trying to rebuild this feature, just a lot less competently, directly competing with this and then forcing their implementation onto every single developer who hasn't adopted yarn yet.

That said, I have tried the pnp implementation provided by this PR and unfortunately found that it doesn't work with any of our codebases, specifically when using Node "binary" scripts (in our case webpack-dev-server is used).

It seems the third-party binary being called from a script specified in package.json fails to reference dependencies of the project:

...
...
ERROR in multi /Users/xxx/Library/Caches/Yarn/v3/npm-webpack-dev-server-3.1.4-9a08d13c4addd1e3b6d8ace116e86715094ad5b4/node_modules/webpack-dev-server/client?http://0.0.0.0:8085 webpack/hot/dev-server ./src/index.tsx
Module not found: Error: Can't resolve 'cache-loader' in '/Users/xxx/some/directory'
 @ multi /Users/xxx/Library/Caches/Yarn/v3/npm-webpack-dev-server-3.1.4-9a08d13c4addd1e3b6d8ace116e86715094ad5b4/node_modules/webpack-dev-server/client?http://0.0.0.0:8085 webpack/hot/dev-server ./src/index.tsx
...
...

The cache-loader dependency is referenced from the webpack configuration file of the local project and loaded by the webpack-dev-server third-party "binary"-script, which then fails to locate the cache-loader dependency.

How did you solve this for the build-processes at Facebook, or if you didn't have this problem to begin with, what is the suggested approach to dealing with third-party tooling in scripts defined in package.json?

@arcanis

Member

arcanis commented Sep 17, 2018

@KenanSulayman webpack-dev-server should work - cf. the following PR that adds it to the sample app. Are you sure the Plug'n'Play loaders are correctly configured? You also need to configure the plugin under the resolveLoader key, since Webpack ignores the resolve field when resolving its loaders.

What is the suggested approach to dealing with third-party tooling in scripts defined in package.json?

When they use their own require.resolve implementation (like Webpack, which uses enhanced-resolve), you might need to add a plugin. For all other scripts, running them through yarn run and/or yarn node should be enough.
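
To make the resolveLoader point above concrete, here is a minimal sketch of the Webpack configuration side. It assumes the PnP resolver is exposed as an enhanced-resolve plugin; the pnp-webpack-plugin name and its moduleLoader helper are illustrative assumptions, not something specified in this thread:

    // webpack.config.js - sketch of wiring a PnP resolver into Webpack
    const PnpWebpackPlugin = require('pnp-webpack-plugin');

    module.exports = {
      resolve: {
        // Resolves the regular requires made by application code
        plugins: [PnpWebpackPlugin],
      },
      resolveLoader: {
        // Loaders (cache-loader, babel-loader, ...) go through a separate
        // resolver chain, hence the extra entry under resolveLoader
        plugins: [PnpWebpackPlugin.moduleLoader(module)],
      },
    };

The same plugin goes in both places because Webpack resolves application modules and loaders through two separate resolver chains.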

@CrabDude

Contributor

CrabDude commented Sep 18, 2018

FWIW, the post-install cache conflict raised in the pdf would be ameliorated by including the NODE_MODULE_VERSION in a cache entry's hash if

  1. A post-install script ran
  2. The resulting physical (filesystem) entries differ. (i.e., file content/existence)

This could be repeated for known meta-values of significance (e.g., OS version for fsevents). In the future, packages could even pre-declare meta-values to be considered for cache heuristics.

EDIT: This would be consistent with the accepted caching RFC Idempotent Install still to be implemented.
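
For illustration, a sketch of how such a cache key could be derived (the function and field names are made up; NODE_MODULE_VERSION is available at runtime as process.versions.modules):

    // Hypothetical cache key derivation: fold the native ABI version into the
    // hash only when a post-install script actually ran for the package
    const crypto = require('crypto');

    function cacheKey({name, version, integrity, ranPostInstall}) {
      const hash = crypto.createHash('sha256');
      hash.update(`${name}@${version}`);
      hash.update(integrity); // checksum of the original package archive
      if (ranPostInstall) {
        // process.versions.modules is the NODE_MODULE_VERSION (native ABI version)
        hash.update(`abi:${process.versions.modules}`);
      }
      return hash.digest('hex').slice(0, 16);
    }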

@DanielRosenwasser

DanielRosenwasser commented Sep 18, 2018

Hey @arcanis, thanks for putting this together. I'd like to try to discuss the impact this might have on the TypeScript community. While I'll be traveling for the next few weeks, maybe we can schedule a call or something in the near future.

From my personal view, the high-level goals sound like a great idea. Apart from the use-cases given here, when it comes to powering tools like TypeScript (including editor scenarios for both TypeScript and JavaScript in tools like VS and VS Code), resolution tends to be a pretty costly process that can cause delays. Minimizing that sounds great!

However, there are a few issues I'd like to raise.

Alternative Resolution

You may already know, but TypeScript effectively overlays a "mirrored" resolution process to find files it's interested in. Specifically:

  • When resolving files without file extensions, in addition to .js files, TypeScript first searches for .ts, .tsx, and .d.ts files.
  • When resolving foo from node_modules, in addition to */node_modules/foo, TypeScript resolves from */node_modules/@types/foo
  • When resolving from package.json, in addition to resolving from the main field, TypeScript resolves from the types field (as well as the typings field)
    • In TypeScript 3.1, we will have something for version selection called typesVersions.

So even if TypeScript had some sort of resolution support to plug into Yarn PnP, PnP itself is oblivious to what TypeScript is actually trying to search for.

Arbitrary Code Execution

I think @liftM covered some of this already on #101 (comment), but I think another broadly-applicable motivating scenario would be helpful here.

Let's say we were able to resolve this issue, and that language services that power editors were able to require or start up a server from the .pnp.js file. Ideally, this file just runs, users are happy, and they go about their lives.

Now imagine someone places a .pnp.js file at the root of a repository, clones it, and opens an editor there. That editor now either has the choice of executing arbitrary code or turning itself into a security notification carnival. The former is obviously not ideal, and the latter leads to a very undesirable user experience. Users either hit accept anyway, or the editor experience shuts down entirely.

Bifurcation

Much as these ideas are great, npm's simultaneous effort on tink (formerly crux) means that there are potentially two approaches for us to support, which isn't ideal.

Ideas & Mitigations

I think that @sokra had some good insights over Twitter.

  • A static file is definitely easier to analyze and doesn't introduce arbitrary code execution problems.
  • The complete mapping of files in packages makes it fully possible for any tool to overlay its own resolution process which is crucial to TypeScript.
  • Having a tool that can actually be imported or spawned as a server/daemon to understand this new feature is absolutely a great idea to ensure tools like Flow can still work without a full reimplementation.
  • Collaborating with npm and leveraging .package-map.json format could produce some of the aforementioned wins while avoiding bifurcation.
@arcanis

Member

arcanis commented Sep 18, 2018

So even if TypeScript had some sort of resolution support to plug into Yarn PnP, PnP itself is oblivious to what TypeScript is actually trying to search for.

The Plug'n'Play API is split into two parts: resolveToUnqualified, and resolveUnqualified. The first one is the static resolution that converts lodash/foo into /path/to/cache/lodash-1.2.3/foo. The second one converts /path/to/cache/lodash-1.2.3/foo into /path/to/cache/lodash-1.2.3/foo/index.js.

So in your case, your resolver would just have to use resolveToUnqualified in order to get the basic path, which you would then continue resolving as you see fit. You can see an example on the webpack resolver, which uses resolveToUnqualified then defers to the rest of the enhanced-resolve plugins to compute the rest of the resolution.
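
For illustration, a sketch of what such an overlay could look like (this assumes the .pnp.js file can be required directly and that resolveToUnqualified takes a request and an issuer path; the extension list is TypeScript-flavored but arbitrary):

    // Hypothetical resolver layering custom file-level resolution on top of PnP
    const fs = require('fs');
    const path = require('path');
    const pnp = require('./.pnp.js'); // generated by yarn install at the project root

    function resolveWithExtensions(request, issuer, exts = ['.ts', '.tsx', '.d.ts', '.js']) {
      // Step 1 (resolveToUnqualified): 'lodash/foo' -> '/path/to/cache/lodash-1.2.3/foo'
      const unqualified = pnp.resolveToUnqualified(request, issuer);

      // Step 2: the tool's own equivalent of resolveUnqualified
      // (extension probing, index files, types field, ...)
      for (const ext of exts) {
        if (fs.existsSync(unqualified + ext)) return unqualified + ext;
      }
      const index = path.join(unqualified, 'index.js');
      return fs.existsSync(index) ? index : null;
    }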

Now imagine someone places a .pnp.js file at the root of a repository, clones it, and opens an editor there.

It's a great point, I hadn't considered it before. It's worth noting that this problem already exists today, though: while not native to the editor, various extensions already transparently execute JavaScript files obtained from freshly cloned projects - a classic example being the eslint plugin for VS Code.
