Should I trust npm? #12045
Comments
Putting this here for completeness: http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm |
I'd like to add to that list:
The nature of small modules requires us to spread our web of trust quite wide. When we install a package, we implicitly trust that package's maintainers, and all of that package's dependencies' maintainers, and so on, from then into the future. The ability for unpublished packages to be reclaimed by anyone is a critical security issue, because the web of trust can expand to "the world" on the whim of a maintainer. And then someone whom I never wished to trust suddenly appears inside my dependency tree. Granted, I trusted the maintainer who unpublished, but I never wanted to trust "everyone"; and why should that happen as a result of his decision to unpublish? I am very pleased that in the recent blog post, npm pledges to address this issue:
Here is my response to that:
Please, dispel the fear I have of running |
First, see my update to #12017. I hope the additional detail there is useful. As I said on #12017, in the part before you quoted:
by which I mean it was rapidly turning into a dogpile. I stand by that assessment.
So, that said:
That may be. Tuesday was a very trying day. However, there are multiple community interests to be balanced. #12017 was a community attempt to discuss and define policy. I believe that I have a responsibility, as team lead, to make clear to the community the extent to which those attempts will be successful. The npm CLI is a community project. The npm registry, and npm's policies, are not. I didn't, and don't, feel that allowing a conversation about policy that is outside the scope of the community's ability to effect changes to those policies was in the best interests of either npm or the community, especially in the heat of the moment.
That's a reasonable take, but to me it feels incomplete. A package repository where anybody can publish, without gatekeepers approving submissions, is a commons, and like all commons brings together people with differing, and sometimes conflicting, needs. You're correct that by publishing a package to the registry, a developer is committing to sharing that package to the world (and, because of how npm's replication works, is doing so in a way that any truly private information included as part of that publish can't be erased). However, there are good reasons, some of which I've already outlined elsewhere, why you as a developer may want to, or may be compelled to, take a package down. I would suggest to you that any trust placed in the npm registry instead of your fellow developers is misplaced. A package distribution system like most OS package distribution systems, where each package is carefully vetted and approved before being made available to users, is designed for that kind of trust. However, in a system where you are obtaining packages directly as they were provided by other developers, your trust is entirely in them. There are cases where npm will intervene (as we did with As such, the duty of care for using that repository falls on you as users of the registry rather than us as providers of it. There's simply no other way to deal with a registry offering hundreds of thousands of packages with thousands of new versions of packages published each day. Specifically with respect to package availability, this is why companies frequently set up their own caching proxies or mirrors for npm – so that issues with the primary registry don't prevent them from getting things done.
This has always been true, both on an operational level, and at a policy level. Our registry team has done a phenomenal job at improving the robustness and reliability of npm to the point that most users never even think about the registry anymore, but it's fundamentally a distributed system. From the policy side, there are always going to be reasons why the people responsible for providing the resources to keep the registry system going are going to have to make judgment calls to preserve access to that registry for everyone.
It's reasonable to be concerned that it's so easy to remove packages. I think my position is probably the most extreme in favor of preserving that privilege for individual developers (I really do believe that part of the freedom of using a commons is to have the ability to withdraw from it), but I do think it's reasonable to put in place systems to mitigate the damage that abrupt package disappearances can cause. If you [read Isaac's postmortem](http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm), in the section entitled, you'll see that he agrees. Changes are coming there.
I fought the un-unpublishing of
I'm here.
At the package level, yes. At the individual package version level, no, mostly due to some quirks we've only recently become aware of. That said, that's a request for the registry team, and not information that I personally can get for you immediately. I believe it to be comparatively rare. @azer's mass unpublish has to be one of the largest single takedowns we've seen, if not the very largest.
A list of reasons includes:
#1 and #3 are typically believed to be time-sensitive by the unpublishers, due to the fact that secrets are involved. This is a moot point, because the act of publishing public packages puts them into the registry's replication stream, and many, if not most, replicas don't do anything with unpublish events coming down the follower stream. That said, it can be very stressful to have a package name for an unreleased tool or project publicly visible, and making the affected users go through npm support (if, e.g. their login credentials aren't working, or they're trying to unpublish the packages using the wrong account) has historically been very stressful for them.
Me too. I'll leave this issue open for a little while, but because this is a discussion topic and not a bug, I'll close it within a week or so if you don't first. Trying to keep this issue tracker usable is another one of my enormous problems. |
@othiym23, just read your post. Came in right after mine! I realize much of it is off-topic here. I have contacted npm support with my issue. I suppose the only part relevant here would be my suggestion for |
@othiym23 Without going into the rest of your reply, point 3 is absolutely not a valid reason for unpublishing. Once sensitive information has been published, even for a split second, it has to be assumed to be circulated widely - as there is no way to measure the real impact. Resetting credentials and such isn't optional anymore at that point, and so unpublishing wouldn't help there. |
Secrets include more than just credentials. If somebody accidentally includes a database dump containing personally identifying information such as tax identifiers or home addresses, that's not so simple to fix as a credential roll, and can lead to threats at least as significant as a host compromise. We agree that that information has been let loose and can't be completely brought back in, but that doesn't mean that there's no point in any mitigation at all. Also, please reread this portion of my reply:
This is about why users unpublish packages, not which of those reasons are valid. |
Fair enough. I actually thought you were referring to credentials alone, but in this context it makes more sense. |
@othiym23 Thank you so much for your quick, and excellent, reply! I'm glad my concerns can be discussed here. Thanks again for your time and consideration. I really appreciate it. |
:) I agree with the point that unpublish can't go away. Your mention of a commons makes me rethink my expectations and honestly the whole premise of this thread. I'm going to close this and dwell on my misconceptions. @othiym23, thanks for being here ;) |
@othiym23, where should we be having the discussion on the npm policies then? Can you provide a link? I still stand by my original opinion that the cli unpublish should only hide a package, and allow existing versions to still be downloaded when directly referenced. Alternatively, an option/convention that allowed the cli to automatically reference git repositories, instead of the npm repository, would certainly fall within the realm of the cli project. I'm thinking you just have a system-wide switch that makes the cli use git references, where provided, but defaults to using npm. Extra points for an option to automatically fork dependencies into your own repo and merge when updates are applied. All of this, of course, would be unnecessary with an unpublish policy like nuget. I'd also argue unpublish shouldn't be there just for "critical bugs". If those occur, republish the old version with a newer version number, and possibly allow the owner to set a warning message against the unpublished version that displays when it is restored. Of course, hide that version in all other scenarios, but don't break those who are already using it. |
@dolkensp Keep in mind that it is very difficult and iffy to implement semantic versioning for Git references, and the mutability of Git tags doesn't help with that. |
I agree with @dolkensp. Un-publishing a package should hide it. Shrinkwrapping is completely pointless if you allow things to be deleted from the registry. If I shrinkwrap something that means it's already downloaded in my node_modules directory. If I wipe out my node_modules directory and run npm install I expect my modules to be downloaded exactly as described in my npm-shrinkwrap.json file. If that isn't the case I must check in all of my node_modules to source control thus making shrinkwrapping totally worthless. Also, if I wanted to go the route of not checking my node_modules into source control and then realize that a package was deleted but it's still in my node_modules directory, then I'm just going to republish it. Unpublish didn't solve anything. It just created extra work for me. Checking into source control also isn't ideal. My node_modules easily get over 200MB with not many dependencies IMO. You either need to get rid of the ability to delete packages or get rid of the "npm shrinkwrap" command since it's broken and dangerous. |
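To make the "unpublish should only hide" proposal concrete, here is a minimal sketch of what those semantics might look like. It is a toy in-memory model, not how the npm registry works; the class `SoftUnpublishRegistry`, its method names, and the package name used with it are all hypothetical.

```javascript
// Hypothetical toy registry: "unpublish" hides a version instead of deleting it.
// This illustrates the proposal in this thread, not npm's actual behavior.
class SoftUnpublishRegistry {
  constructor() {
    // Maps "name@version" to { name, version, tarball, hidden }.
    this.entries = new Map();
  }

  publish(name, version, tarball) {
    const key = `${name}@${version}`;
    if (this.entries.has(key)) {
      throw new Error(`${key} is already published`);
    }
    this.entries.set(key, { name, version, tarball, hidden: false });
  }

  // Mark the version as hidden; the tarball itself is kept.
  unpublish(name, version) {
    const entry = this.entries.get(`${name}@${version}`);
    if (!entry) {
      throw new Error(`${name}@${version} was never published`);
    }
    entry.hidden = true;
  }

  // What search or range resolution would see: hidden versions are excluded.
  listVersions(name) {
    return [...this.entries.values()]
      .filter((entry) => entry.name === name && !entry.hidden)
      .map((entry) => entry.version);
  }

  // What an install pinned to an exact version (e.g. from a shrinkwrap file)
  // would see: hidden versions are still served.
  fetchExact(name, version) {
    const entry = this.entries.get(`${name}@${version}`);
    return entry ? entry.tarball : undefined;
  }
}
```

Under this model, a hidden version drops out of `listVersions`, so nobody newly discovers it, while `fetchExact` keeps serving anyone who already pinned it. It deliberately ignores the cases raised earlier in the thread where content has to actually be removed, such as leaked secrets or legal complaints.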
Here's the thing though: in this "commons", the right to unpublish is pre-existing. That's been given, and I'm not sure you can take it away now. I'd love to know where to continue that conversation. I do think the shrinkwrap functionality should come with a warning about the possibility of this scenario; I think that's only fair to our future devs. |
nevermind it's there: https://docs.npmjs.com/cli/shrinkwrap#caveats RTFM I'm saying to myself. |
https://github.com/npm/policies @dolkensp, @caseyhoward See also this tweet from @ashleygwilliams:
|
hey everyone! thanks for the mention, @othiym23. i'm in the process of working on a document that will explain the direction we plan on going with |
@othiym23 Thanks - I see they don't have anything open currently. Will wait and see what the blog post has to say. |
What if someone created a service where people could depend on their npm packages being there? It would basically be a proxy for the npm registry. You would add it as a registry and it would behave just like npm. The only difference is that it would cache any version of any package that someone has installed. The npm unpublish command would unpublish from npm but keep it in the cache. Are there any legal reasons someone couldn't do this? What if everyone started using the new service because they needed to be able to actually depend on their dependencies? Wouldn't unpublishing be somewhat pointless? |
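The caching behavior described in this comment could be sketched roughly as follows. This is a toy model only: the upstream registry is stood in for by a plain `Map`, and `CachingMirror` and the package names are hypothetical, not an existing tool or API.

```javascript
// Hypothetical caching mirror: anything fetched once stays available,
// even if it later disappears upstream. The upstream is modelled as a
// Map from "name@version" to tarball contents (an assumption for
// illustration; a real mirror would talk to a registry over HTTP).
class CachingMirror {
  constructor(upstream) {
    this.upstream = upstream;
    this.cache = new Map();
  }

  fetch(name, version) {
    const key = `${name}@${version}`;

    // Serve from the cache first: published versions are treated as immutable.
    if (this.cache.has(key)) {
      return this.cache.get(key);
    }

    const tarball = this.upstream.get(key);
    if (tarball === undefined) {
      throw new Error(`${key} is not available upstream or in the cache`);
    }

    // Remember it so a later upstream unpublish does not break installs.
    this.cache.set(key, tarball);
    return tarball;
  }
}
```

Note the limitation this makes visible: the mirror only protects versions that somebody fetched through it before they were unpublished. A version nobody installed via the mirror is still gone once it is removed upstream.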
@caseyhoward what you describe already somewhat exists, from my understanding. There's nothing stopping you from creating your own npm registry mirror with these features, and then pointing your npm client at it. There are actually a lot of packages out there that provide mirror features for npm that are quite good. Legally, I don't think there are any issues. I don't think you could charge for this service, but I'm no lawyer, so take that with a grain of salt. My dream is to have it be decentralized and based on a block chain and/or bittorrent... but I'm not sure we are there yet. I've heard of similar requests for github itself over the years and haven't seen a legitimate contender actually rise. |
Such a mirror would be prone to copyright / trademark / etc. complaints as well. However, when marketed correctly, it's very unlikely that charging for it would be a problem from a legal point of view. But yes, consult a lawyer and so on and so on. EDIT: This is a good read regarding use of blockchains. |
@joepie91 thanks for that link! |
@jamesjnadeau I know there's nothing stopping me from creating my own mirror, that's what gave me the idea. It's just that it requires a bit of extra work. That's why I mentioned creating a publicly available mirror, so everyone doesn't have to set up their own. I'm also not suggesting making any money off of it. @joepie91 I know there are still copyright issues. Those are special cases. I just don't want people to take down their packages, breaking all my stuff, by simply using "npm unpublish". It should be a more manual process to get something removed. |
Your cool-headedness in the face of such a rare and catastrophic event has earned both my trust and my admiration. Thank you. |
I sent this as a DM on Twitter to @ashleygwilliams before she shut down her account for a break, but I think it's a viable solution to unpublished packages.
|
Given how long Lodash deprecation notices have been around, even in big projects like Grunt, I'd say that would need to be measured in 'years'... |
@joshmanders I'm not sure this is the place for continuing that discussion. You should start a new issue on https://github.com/npm/policies/issues |
One can build a wrapper on top of IPFS to gain the benefit of permanent addressing. |
@beenotung IPFS does not persist data unless clients explicitly seed it, so that is not in any way guaranteed to work, nor am I convinced that it's worth it considering the limited value and added complexity (both from a technical and political perspective - people are going to have Opinions about their workstation seeding packages to others). |
@joepie91 no matter what the default setting is, it won't make things worse to allow people to contribute to the network (as a seeder, at least of the packages they are using), if we can reduce the friction of doing so. |
@beenotung It depends. There's a well-known phenomenon where in a situation of shared responsibility, each individual will feel less responsible (or not responsible at all) for the outcome. Adding a decentralized layer can actually make availability worse, by making the (previously) centralized party feel less responsible for ensuring that availability. This has consequences in particular for less popular data, which can vanish entirely in that scenario. It's certainly not as simple as "it can't hurt to add it", and that's not even taking into account the extra complexity it introduces, also from an architectural point of view; decentralized systems are notoriously difficult to upgrade, and this means that adding an IPFS layer would essentially 'freeze' the architecture of the registry in many aspects. |
@joepie91, for the incentive part, would it help if it worked like bitcoin or filecoin? Otherwise, you remind me of sometimes getting dead seeds in BitTorrent. What if we add a rule like: if a user downloads X packages, they must seed Y packages? For the complexity part, maybe this should be an alternative to npm instead of an upgrade. |
I feel it may work because the dependency of go IPFS is managed by IPFS itself. |
@beenotung To my knowledge, nobody has reliably solved the proof-of-storage problem yet (Filecoin included). Until that happens - and there's good reason to believe that this can't ever be done perfectly - such incentivization schemes are not viable. This also applies to any other kind of accounting scheme; decentralized networks do not have a central point of trust, therefore it's impossible to reliably verify that the reported numbers are correct. A peer could easily lie about its seeding. |
@joepie91 thanks for pointing that out clearly. If you still want to continue in this direction: But the availability problem may still appear if very few nodes are storing it and they go offline at the same time :/ |
What if we have a gossip protocol to make sure rare resources have higher priority to be stored than popular ones? Might that help? e.g. when a newcomer wants to download hot package A, he must also download rare packages B, C, D. If we later find that the node is online but does not keep B, C, D and did not hand over gracefully, that node may receive a penalty? …
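As a rough illustration of the "rare packages first" idea in this comment, the selection step could look like the sketch below. It assumes a node somehow already knows how many replicas each package has; the function `assignSeeds`, its parameters, and the package names are hypothetical. It covers only the selection, not the gossip itself, and does nothing about the verification problem raised above (a peer could still lie about what it stores).

```javascript
// Hypothetical selection rule: a newcomer who requests one package is also
// assigned the `extraCount` least-replicated other packages to store.
// `replicaCounts` maps package name -> number of nodes known to store it;
// how a node would learn these counts is outside this sketch.
function assignSeeds(replicaCounts, requested, extraCount) {
  const rarest = Object.entries(replicaCounts)
    .filter(([name]) => name !== requested)
    // Fewest replicas first; break ties by name so the result is deterministic.
    .sort(([nameA, countA], [nameB, countB]) =>
      countA - countB || nameA.localeCompare(nameB))
    .slice(0, extraCount)
    .map(([name]) => name);

  return [requested, ...rarest];
}
```

For example, with replica counts `{ A: 500, B: 1, C: 2, D: 2, E: 40 }`, a newcomer requesting `A` with `extraCount` of 3 would be asked to store `A`, `B`, `C`, and `D`.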
In #12017
@othiym23, you said the following, and I'm deeply troubled by it.
I've waited 24 hours now since I saw this, but the ban has yet to be lifted. This statement and action is eye opening for me and I hope for the rest of the community.
I have valid concerns, and opinions that I would like to share, and I’d like to think I have something to add to the conversation that has not been said.
I can empathize with your concerns and desire for a quiet evening. I hope you had one. That's your choice to make, and I hope you have evenings such as you desire for the rest of your days. That’s your right and choice.
However, your choice to ignore, redirect, or turn off notifications you receive should not dictate how and when I can contribute to the conversation. I can understand stopping a conversation for other reasons, but because you personally don't want to hear it at that moment in time is not a valid reason I think this community should accept.
Your action of taking away my ability to add to that conversation hurts me as a contributing member of the community, and the community as a whole. ( you can say I’m speaking here, but you are missing my point )
Your statement and action, from my perspective, come from a selfish reason, and not with the community's best interest in mind.
When you don't allow community members to contribute, you break the trust that is established between you as the open source administrator, and us as a community who wishes to use your software and give back in any way we can.
Trust is a fickle thing though.
Trust is built upon commitments and following them through with actions. When you publish a package to npm, you are making a commitment that this package is available for the world to use.
You've taken an action to share your code with the world, and I should trust and expect that package to be available in npm, and npm will make it available to me in the future if I need it. Perhaps my expectation is wrong, but I share it with many others.
If you, as a package maintainer, take that package down, I lose trust in you, but also in npm as a place I can depend upon to get me the software I need. That’s what a package manager is for, right?
A couple years ago, when shrinkwrap first came out, and was preached as a way to avoid the need to check your npm dependencies into version control, the community was told that's the 'preferred method' (https://www.npmjs.org/doc/misc/npm-faq.html#should-i-check-my-node_modules-folder-into-git; the docs have been updated since, but a trip to the Wayback Machine will show you what I'm talking about). We trusted you, as the project’s creators, to tell us the best way to use it.
Now, over the past 2 years, we've had several incidents that make me call into question the above commitments and trust that has been established. I can no longer trust that npm will have the package I need in the future. I now need to take actions to deal with that reality.
Without this trust and expectation of what the future holds, the use cases for npm as a package manager are extremely limited. This hurts the software, and more importantly the business that npm uses to sustain itself from my understanding.
What this long-winded explanation is getting at is that there has been a series of decisions, commitments, and actions taken by this project’s maintainers that have eroded the trust of its users.
I can't trust that a package will always be available.
I can't trust npm will keep a published package around.
I can’t trust they will respect my actions of unpublishing something from npm.
I can’t trust that project maintainers will at least listen to my concerns.
I can’t trust…..
I imagine the number of people taking a look at how much they trust, need, and depend on npm right now is huge. I’m actively taking steps to mitigate how much I depend on this project to be available, and at what point in my development process I make use of it. I’ve talked to others doing the same.
I’m taking actions that demonstrate my loss of trust with this project. In doing so, I can see multiple ways in which the npm organization is much less involved with the work I produce. This series of thoughts doesn't make me want to open up my wallet for you anytime soon. Quite the opposite.
Is this what you want your community members to be thinking and doing right now?
Here's what I originally wanted to ask in #12017 :
Is there any data available that we can cite to see how many packages have been unpublished? A list of the reasons most users unpublish a package would be extremely helpful for this conversation. I'd be interested to hear who wants to keep unpublish around other than the project maintainers. I haven't heard that opinion voiced, and I'd love to hear more about that side.