Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Singleton Packages RFC. #23

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

usergenic
Copy link

Suggested feature for support of a singleton property in package.json to guard against installation of multiple versions/instances of a package the package tree.

Anticipates, but does not define course for resolving version conflicts among singleton packages. (That would be a separate RFC.)

@arcanis
Copy link

arcanis commented Dec 12, 2018

This is something that we (Yarn) are interested in supporting as well. My comments:

  • One question I have is regarding the expected efficiency. Finding the right set of packages that satisfy multiple constraints is I think NP-complete, so I'd prefer any spec to mention that the package managers are allowed to ask for an explicit version lock even if a theoretical solution was possible. I know that you've said the resolution conflict was to be defined in a different RFC, but this particular point affects the semantic of the field, so I think it should be answered here.

  • I'm a bit concerned that this change will prompt everyone to add the singleton field to all their packages because "hey the package manager will optimize it better this way!", which will cause a lot of conflicts issues and potential pain that will last forever. It should be made crystal clear that this feature should only be used if there is no way for the package to work properly without it and that using it for optimization is absolutely the wrong thing to do.

  • While it doesn't affect npm directly since they don't implement this feature yet, this also has ramifications regarding the Yarn workspaces. Is a singleton unique in a single workspace, or across the whole worktree? I need to think a bit more before having a recommendation on this topic, but I'd appreciate if you could think about it as well.

@mantoni
Copy link

mantoni commented Dec 12, 2018

There has been a related discussion some years ago.

@dbatiste
Copy link

@arcanis : your last point reminded me of this discussion on Scoped Custom Element Registries, which has some support. If such scoped element registry becomes possible, the scenario of a component that defines a custom registry scope and a dependency on a component that, which today, must only be "defined" once, then an approach that makes these components unique to the whole dependency tree could be problematic.

@bmeck
Copy link

bmeck commented Dec 12, 2018

Another discussion of this topic happened as well. It seems there are multiple of these.

@arcanis
Copy link

arcanis commented Dec 12, 2018

I remember - the suggestion to use peer dependencies to define singletons looked interesting (but unfortunately relies on the ecosystem doing the right thing, which is a assumption that's been proven wrong multiple times). I wonder if the preferPeer you mentioned would help.

@usergenic
Copy link
Author

usergenic commented Dec 13, 2018

Thanks for feedback and questions, @arcanis.

One question I have is regarding the expected efficiency. Finding the right set of packages that satisfy multiple constraints is I think NP-complete, so I'd prefer any spec to mention that the package managers are allowed to ask for an explicit version lock even if a theoretical solution was possible. I know that you've said the resolution conflict was to be defined in a different RFC, but this particular point affects the semantic of the field, so I think it should be answered here.

I am assuming that package managers already produce an effective union of semver constraints such that transitive dependencies on a@^1.2.0 and a@=1.3.0 would resolve to a@1.3.0 even if a@1.4.0 were available? Is this not correct? I wasn't operating under the assumption that singletons would add a deeper behavior to these calculations.

I'm working on separate RFC to encourage npm to essentially implement a fully compatible resolutions property to Yarn's including its' Selective Version Resolutions, since this provides the necessary top-level conflict resolution specification.

I'm a bit concerned that this change will prompt everyone to add the singleton field to all their packages because "hey the package manager will optimize it better this way!", which will cause a lot of conflicts issues and potential pain that will last forever. It should be made crystal clear that this feature should only be used if there is no way for the package to work properly without it and that using it for optimization is absolutely the wrong thing to do.

I agree. Optimized package semver union resolutioning, if not a thing, should be a thing that is separate from this thing.

While it doesn't affect npm directly since they don't implement this feature yet, this also has ramifications regarding the Yarn workspaces. Is a singleton unique in a single workspace, or across the whole worktree? I need to think a bit more before having a recommendation on this topic, but I'd appreciate if you could think about it as well.

I am going to have to get more familiar with Yarn workspaces. I'm quite familiar with Lerna but also haven't used it in conjunction with workspaces. Will explore.

@steveklabnik
Copy link

Just to chip in briefly here: Rust and Cargo have this distinction; https://doc.rust-lang.org/stable/cargo/reference/build-scripts.html#the-links-manifest-key

It works well, but there's one big issue: this means that major version bumps of these kinds of libraries split the ecosystem. When those are big, foundational libraries, it can lead to a lot of pain. In Rust, this has mostly manifested as the libc package. This is transitively used by the vast majority of the ecosystem, and can only be included once. This means that the upgrade from 0.1 to 0.2 was a really big pain. You could only use packages that depended on identical versions to each other.

The JavaScript ecosystem may not have situations like this, but I figured it was worth mentioning. Happy to answer any specific questions about it. Good luck!

@ljharb
Copy link
Contributor

ljharb commented Dec 13, 2018

The JS ecosystem definitely already does - React, eslint, babel, etc - and it's generally handled by every participant in that "singleton" ecosystem adding a peer dep on it. (not that it's the best solution)

@daKmoR
Copy link

daKmoR commented Jan 2, 2019

It's a new year 🎉
So anything we can do to help this along?

@zkat zkat mentioned this pull request Jan 10, 2019
@daKmoR
Copy link

daKmoR commented Jan 21, 2019

a friendly reminder if there is anything we can do to help?

@zkat
Copy link
Contributor

zkat commented Jan 24, 2019

Hey y'all! Thanks a lot for submitting this!

Overall, we think this is a good idea. This feature does require us to land #27's implementation first, imo. Our plan right now is to land that implementation, see how it feels, and then see about this RFC and implementing it.

Historically, npm has been a package manager that worked pretty hard to guarantee that you wouldn't run into dependency hell[1], and a feature like this one is a major step back into that hole (after we stopped installing peerDependencies by itself, which sort-of-kind-of pushed out the issue). If we're gonna introduce dependency hell into npm, I really want to make sure we have a good way to resolve conflicts.

I appreciate the discussion that's been going on in this thread, and I hope it keeps going while we take the necessary steps to accept it. Thanks everyone for adding the context, and thanks again to @usergenic for taking the time to write and submit it. Hang in there, we'll get this in, and I think this looks a lot like what we'll end up accepting in the end, if not exactly like it!

[1]: In npm parlance, dependency hell is when you have two different packages that require the same package at incompatible versions, a problem that's universal to package managers and that npm resolves through dependency nesting.

@darcyclarke darcyclarke moved this from In Design to To do in Community & Open Source Nov 6, 2019
@darcyclarke darcyclarke added Backlog a "backlogged" item that will be tracked in a Project Board Release 7.x semver:major backwards-incompatible breaking changes and removed Agenda will be discussed at the Open RFC call semver:minor new backwards-compatible feature labels Nov 15, 2019
@darcyclarke darcyclarke removed this from To do in Community & Open Source Nov 15, 2019
@dzearing
Copy link

@darcyclarke @zkat What's the latest on this particular one?

This concept looks like a viable replacement to peer dependencies, which we're struggling to rationalize in various places across Microsoft and the oss ecosystem.

There are some basic things we see:

  • Today, dependencies are duplicated to resolve exclusive semvers. This is the default policy.
  • Producers need a way to declare themselves as "singletons", breaking from default policy. They know which policy is right for their own code. Packages that include React context or theming information are good examples of things that would declare themselves to only have 1 instance. (e.g. package.json could have "policy": "singleton")
  • Consumers need a way to override package policies. Sometimes producers don't care about dupes but consumers do. Things like d3 which can be chunky are good candidates to dedupe even if the default policy allows duplication.

The current answer for reducing duplication is to use peerDependencies. This only allows middle layer libraries a way to opt out of duplicating. But it isn't enforced well, and the strict mode work to error on unmet peers is making it even more difficult to understand who is at fault, when to use correctly, and how to reconcile.

This RFC would help us eliminate peer dependencies as a concept. React could say: "I expect to be a singleton." NPM installs could error when React gets duplicated, and could provide guidance on the best course of action. Apps would have a way to override that when they disagree with default policies. Things would make sense.

Very few people really understand peers but I think most would understand singleton declarations. I would love to see peers deprecated in favor of this.

@dzearing
Copy link

dzearing commented Jul 12, 2022

I would offer a suggestion to improve the feature; in addition to a singleton flag on producer package.json definitions, I would also allow consumers to override that default. There are edge cases where it's ok to go against the producer's recommendation but they need to be intentional. It might look something like this:

dependencyPolicies: [

  // allow duplication of react
  { name: ["react", "react-dom"], policy: "default" },
  
  // prevent duplication of mui, wildcard support
  { name: "@mui/*", policy: "singleton" } 
]

React might be a case where it declares itself a singleton. It's the most used peer dependency I know. However some cases could allow multiple react instances to live in the dependency graph. E.g. test scenarios.

D3 is bulky and might want to be a singleton by default. However, some pages might not care about bulkiness or are not using the full library, so the duplication might be acceptable.

When violations occur, one viable suggestion might be to override the policy through this mechanism. (E.g., if react 16 and 17 get installed with it marked singleton, the tool could suggest a policy override like shown above.)

@arcanis
Copy link

arcanis commented Jul 12, 2022

D3 is bulky and might want to be a singleton by default. However, some pages might not care about bulkiness or are not using the full library, so the duplication might be acceptable.

Fwiw, Yarn supports this use case; package authors can list a dependency as both a regular dependency AND a peer dependency. When that happens, Yarn prefers the peer dependency if fulfilled, but silently fallbacks to the regular dependency if not. So in your case, it'd look like:

{
  "dependencies": {
    "d3": "^x.y.z"
  },
  "peerDependencies": {
    "d3": "*"
  }
}

We have thought about Singleton dependencies, but so far decided it wouldn't be a good strategy for us. Such packages would affect every single node from the dependency tree, which I'm worried would lead to irreconcilable version conflicts - unlikely in regular projects, but much more likely in monorepos. If someone really wants a singleton (but, really, peerDependencies are enough if used well), then overrides via the resolutions field should provide a decent opt-in.

@dzearing
Copy link

dzearing commented Jul 12, 2022

Hey @arcanis thanks for the insight and the link, I really appreciate your thoughts here!

You described singletons and peers well. I would add a few comments.

I would consider peers a reused workaround to an early design limitation. They made sense in npm 2.x where they provided a workaround for installing dependencies in a predictable path location relative to another package. They were designed for plugin scenarios (e.g. grunt plugins.)

Later when the resolution and installation logic was changed to flatten the node_modules structure more, they were repurposed into what they are today: a non-obvious way for consumers to (optionally) opt out of the final resolution, but still provide semver requirements which generate warnings (or errors with the right flag set) when those expectations are unmet... all in hopes that the app can decide on the final single version of a dependency. And, it's a naively hopeful idea - there is no guarantee that all consumers in the final app's dependency graph will use them correctly. All it takes is for 1 thing to list "react": "^0.15" as a dependency for a dupe to silently be introduced. Consider that npm install --save react will do the wrong thing for libraries, by default.

I think we should also consider that peers are consumer-only metadata, yet producers are the ones who own their code as it changes over time, and whether it's safe for duplication or not. Consider the sideEffects flag that is used for webpack tree-shaking. The owners of the code know whether or not the code has side effects. Imagine if sideEffects were defined in the consumer package.json. Weird, right? The consumer isn't going to follow whether or not duplication is safe in their downstream dependencies. The package owner knows if they export react context or theme data. They should be allowed to define the default policy. Peers push that responsibility downstream, which doesn't cover producer expectations and is prone to failure by default.

But, there are indeed scenarios where consumers want singletons for things that technically would be safe to duplicate, and primarily this is about controlling bundle size. This is the only reason why the argument for peers in their current form makes sense to me. We want 1 copy of react because we don't want to bloat the bundle size, and because we're likely to hit issues with the setup. However what doesn't make sense is to call this concept "peer dependency". This made sense when we were talking about folder structure peers in npm 2.x. The term "peer dependency" hasn't really explained expectations since (e.g. still don't know how to fix peers 😆) We should use vocabulary that explains the purpose and concept and have tools that do the right thing by default (e.g. yarn add some-singleton installs a compliant version that doesn't violate policy.)

This is where I believe we have an opportunity to do better by providing a mechanism with explicit vocabulary that formally solves the need to deduplicate in a clear way that can be defined at the producer level, overridden by the consumer, and enforced by the package manager.

We are considering trialing this idea within our dependency graphs across some of our large repos. Basically we'd add policy data to our packages, write a separate prototype tool which traverses the install graph post-install, reads policy metadata, and fails on violations. This would help us learn about edge cases and where things don't work. We have multiple yarn monorepos with 2000+ source packages to try this in (mixing both react and react-native projects, which adds to the complexity.)

If it works well, then my hope would be that we could move this concept into the package managers to enforce. It would be much more efficient since the graph and definitions are already parsed.

I would definitely welcome your thoughts and ideas. Thanks again for the response and background.

@ljharb
Copy link
Contributor

ljharb commented Jul 12, 2022

Peers make sense for any ecosystem with a core - eslint, babel, webpack, typescript, react, etc. It declares compatibility, and does already require something to be a singleton - at least within the highest subgraph that declares it.

@arcanis
Copy link

arcanis commented Jul 12, 2022

All it takes is for 1 thing to list "react": "^0.15" as a dependency for a dupe to silently be introduced. Consider that npm install --save react will do the wrong thing for libraries, by default.

Yes, I agree. However, my take is different: I think we should do better to educate people and to have sane default, not abandon peer dependencies. It's unfortunately more difficult for npm to do that (because of the semantic changes they made around peer dependencies in npm 8), but that's the approach we want to take with Yarn.

For instance, I think it would be acceptable to have a preferPeer field (similar to preferDev) that would instruct yarn add (or npm add) to add the package as a peer dependency by default, unless the user explicitly states in the command line they'd prefer to add it as a regular dependency. It would extend the existing system rather than attempt to replace it. Yes, it may not solve every single problem, but it costs very little, so imo it's worth a shot before thinking about taking more extreme measures.

yet producers are the ones who own their code as it changes over time, and whether it's safe for duplication or not

Yes ... in most cases. But if there's one thing I'm sure, there are many use cases, and some of them rely on keeping the dependency tree subgraphs estranged 🙂

To give you an example, I myself have a third-party tool, which you install as a dependency in your project, and which lists next and react in its own dependencies field (using pinned versions). When you run the tool, a Next.js server is spawned, acting as a light GUI for a separate daemon. In this context, it'd be quite bad if my GUI react was forced to share its version with the main app! They would have absolutely no connection whatsoever.

@ljharb
Copy link
Contributor

ljharb commented Jul 12, 2022

preferPeer would be just as problematic as preferDev is - there does not exist a package that is always a peer, a dev, or a runtime dep. Everything has use cases as every category.

@arcanis
Copy link

arcanis commented Jul 12, 2022

Right, hence why those settings are suggestions, but not requirements. Those who won't use those packages the ninety-nine-percentil-way exist (as I said in my post I'm even one of them!), but are more likely than not to be advanced enough to know that the dependency must then be moved in dependencies, not peerDependencies (either manually, or by adding the appropriate CLI options when running yarn add).

@dzearing
Copy link

Responding to both above:

Peers make sense for any ecosystem with a core - eslint, babel, webpack, typescript, react, etc. It declares compatibility, and does already require something to be a singleton - at least within the highest subgraph that declares it.

I think singleton policies make sense. But the word "peer" doesn't indicate that's the intent to an average developer - to avoid introducing duplication. The vocab is part of the confusion (the lack of enforcement and configurability is the other part.)

To give you an example, I myself have a third-party tool, which you install as a dependency in your project, and which lists next and react in its own dependencies field (using pinned versions). When you run the tool, a Next.js server is spawned, acting as a light GUI for a separate daemon. In this context, it'd be quite bad if my GUI react was forced to share its version with the main app! They would have absolutely no connection whatsoever.

Good food for thought and I see your point. Let's say react declared itself as a singleton.

  • App depends on react ^18
  • YourTool depends on react ^17

Developer installs YourTool within App as a dependency. Package manager complains about singleton violation, explaining the options to the developer:

  1. Allow for the duplication of react (add policy override to package.json) (yes, this one!)
  2. Downgrade react to 17 (nope)
  3. Update YourTool to a version which includes React 18 in its semver requirement. (nope)

This feels better than blindly taking on a duplicate when typical intent is to deduplicate. In your case, it's fine to override the default because your scenarios are sandboxed; your app does not host YourTool within the same react-managed dom node on a single page.

This experience would be so much more clear to explain to users. They would get a resolvable problem when they create dupes for things not intended to live multiple times in the same install. Sandboxed cases are the edge case where overriding policy makes sense. If 2 separate parts of a page both render separately with React, it's possible, but not desirable, to duplicate react. So to me it seems that producers should pick a best-case-scenario policy as the default. (Which usually means duplication is fine.)

I can think of edge cases. What about when your App has a mix of middle layer libraries which should dedupe react, along with YourTool which are ok to duplicate? This might need to be something that policy overrides consider. E.g. exemptions may need to be specific to a particular relationship. Even with the edge cases though, it still seems more easier to grok by using clear language to define expectations. Flags like preferPeer or preferDev - i understand the intent, but the language I think could create more confusion.

Ultimately we see this question asked over and over: should I list this dependency as a "peerDependency" or "dependency"?

The answer is always vague. The real question is: is it ok to duplicate this dependency? Different scenarios require different answers.

I stumbled upon this issue in material-table as an example:

mbrn/material-table#472

Reflecting on this... why was mui listed as a dependency in the first place? Why did they change it to a peer dep, and did they change it for the right reasons? Why didn't they change other things in their list to peers which might be dangerous to duplicate? (like @emotion/styled which might introduce class name collisions?)

This happens quite often across the ecosystem. We hear it internally at Microsoft with our own component libraries and utility packages. We want developers to do the right thing by default, without ambiguity, and with rails to guide them when they need to deviate from the common path.

@darcyclarke darcyclarke added the Agenda will be discussed at the Open RFC call label Jul 13, 2022
@dzearing
Copy link

dzearing commented Jul 15, 2022

I've summarized my thoughts on how to address peer dependencies here: https://hackmd.io/@dzearing/BJifNnpsq

The important parts are at the bottom but the TLDR:

  • app hosts can define their own policy - they are the source of truth
  • packages which expect to be singletons can define a default policy which can be overridden explicitly (to address @arcanis very valid scenario but keep end users informed)
  • the client enforces rules (this could be iterated as a separate library outside the client until we like it)
  • the client gives a friendly way to resolve the issues it finds (error messages which lead users to resolution)

This can all live side by side with peers so there isn't a conflict but if accepted and integrated, I think it covers a superset of scenarios and has some nice graceful fail behaviors that end users would appreciate, and at that point peers could be deprecated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Backlog a "backlogged" item that will be tracked in a Project Board Enhancement new feature or improvement Needs Discussion is pending a discussion semver:major backwards-incompatible breaking changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet