Investigate alternatives to NPM #291

Closed
prettydiff opened this Issue Mar 24, 2016 · 10 comments

@prettydiff
Owner

Goal

The intention is to find a distribution solution so that Pretty Diff can leave NPM.

Primary Problem

These two quotes from NPM co-founder and CTO Laurie Voss are the straw that broke the camel's back:

Un-un-publishing is an unprecedented action that we're taking given the severity and widespread nature of breakage, and isn't done lightly,

This action puts the wider interests of the community of NPM users at odds with the wishes of one author; we picked the needs of the many.

Conflict to Open Source

I fundamentally disagree. The needs of a package owner completely and absolutely outweigh those of everybody else. This is a fundamental issue of open source software. The commercial resolution to this condition exists in various forms of warranties, contracts, and documented obligations. Where such documents are neither expressed nor implied, the open source software exists for use entirely at the risk of the consuming dependent. This is the reality of open source software and is stated as such in the most popular open source software licenses.

NPM takes its position because it is a revenue-generating business that sells infrastructure. The most important quality to that business is availability (as in any infrastructure model), which, in this case, puts the business into conflict with basic principles of open source software. This is an exceedingly rare edge case, even if absolutely fundamental. When availability is in question, the value of the service is called into question and the brand is severely harmed.

The reason why this conflict likely exists (speculation) is because NPM traffic growth is largely driven by a dependency model. Use of dependencies requires a choice between freely available work (and assuming all risks of that work and its availability) versus doing that work yourself locally.

Personal Thoughts and Convictions of Liberty

The quoted statement from NPM is the opposite of liberty. I hold the concepts of liberty and leadership above all other sociopolitical establishments, and they are the primary reason I continue to serve in the US Army after 19 years.

If the subject at hand were gay rights or racial distinctions, instead of software, people would not be so generally silent on the subject. There would be loud, angry boycotts and protests. While the community of NPM users and leaders may distinguish these subjects differently, liberty does not. The logical conclusions then are that:

  • The community of NPM is not aware of liberty as a concept (ignorant)
  • The community of NPM is aware of liberty but finds it antagonistic to other competing values (authoritative)

Regardless of whether the thoughts and actions opposing liberty are willful, as an adherent to the practice of supporting and upholding liberty as a paramount virtue I must remove myself from that community and no longer contribute works toward it. I believe a willful opposition to liberty on principle alone is perhaps the greatest of evils and all that it takes for evil to thrive is for good persons to do nothing.

For the definition of liberty I subscribe to the preeminent treatise On Liberty by John Stuart Mill. I have included a few select quotes from the second chapter, the first of which is the thesis of the essay.

If all mankind minus one, were of one opinion, and only one person were of the contrary opinion, mankind would be no more justified in silencing that one person, than he, if he had the power, would be justified in silencing mankind.


Let us suppose, therefore, that the government is entirely at one with the people, and never thinks of exerting any power of coercion unless in agreement with what it conceives to be their voice. But I deny the right of the people to exercise such coercion, either by themselves or by their government. The power itself is illegitimate. The best government has no more title to it than the worst. It is as noxious, or more noxious, when exerted in accordance with public opinion, than when in opposition to it.


Strange it is, that men should admit the validity of the arguments for free discussion, but object to their being "pushed to an extreme;" not seeing that unless the reasons are good for an extreme case, they are not good for any case. Strange that they should imagine that they are not assuming infallibility, when they acknowledge that there should be free discussion on all subjects which can possibly be doubtful, but think that some particular principle or doctrine should be forbidden to be questioned because it is so certain, that is, because they are certain that it is certain. To call any proposition certain, while there is any one who would deny its certainty if permitted, but who is not permitted, is to assume that we ourselves, and those who agree with us, are the judges of certainty, and judges without hearing the other side.

Security Implications

Package Reassignment

NPM can arbitrarily reassign a package name without owner consent or awareness. I understand the occasional need for this and that such power is used exceedingly rarely. NPM can also violate a package maintainer's intention to unpublish by arbitrarily publishing the same code under the same name for continuity. When these two concerns coincide, namespace assignment and management of all associated code are fundamentally lost to the package maintainers. This presents immediate risks to product management (for distribution), and as a secondary consequence the associated branding and product health decisions are lost to maintainers. The third level of consequences applies to disruptions of contractual obligations that arise from loss of control of maintained packages.

Weak Software and a Community of Dependency Hell

The problem that precipitated the NPM action and quotes above is a catastrophic break from the removal of a single tiny artifact. This could be characterized as plucking a single straw and instantly destroying all the buildings in a grand city. The problem there is not the result of an NPM directed action or the malicious convictions of a single evil agent. The problem is a community entirely reliant upon a system of dependencies, without regard for integrity or health, with NPM directly serving as the primary enabler.

This community problem exists because weak software developers value convenience above other more necessary considerations. Audits are not performed (not regularly, not ever) over included dependencies, and the dependencies of the primary dependencies are certainly not thought of. These dependencies are frequently included without constraint or safety of any kind. Developers who are new to software development take this weakness as standard practice, and so the weakness becomes the new baseline. The consequence is that discussions and thoughts of architecture are avoided outright, thereby avoiding ownership of responsibility, and so it becomes the user's fault when software breaks due to the dependency madness.

This problem is observed in numerous issues posted to the Atom-Beautify project. Atom-Beautify is a large project containing many dependencies. Even though the project itself is well-written and does everything it can to reinforce concepts of health and integrity, it still suffers from dependency hell due to the madness that comes in from certain grandchild dependencies. When a micro nth level dependency fails to come across the line there is only one solution: We tell the user to kindly reinstall. Clearly, this is a user error (due to no fault of the user) that we are powerless to fix.

Secondary Reasons

Additional and unrelated frustrations that break Pretty Diff:

Tertiary Reason

It is hard to qualify NPM download numbers. This is absolutely not related to any project health or security concern, so it is absolutely minor. NPM download counts appear to be total downloads. This includes counts for specifically requested packages, packages downloaded as any dependency type, packages further down the dependency chain, and various versions of the same package on the same dependent and/or dependency tree. This generates some level of confusion in that a package could get 30 (hypothetical) download counts when a single unrelated package is downloaded.
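The inflation described above can be sketched with a toy dependency graph. This is only an illustration under stated assumptions: the package names are invented, and the counting rule (one download per occurrence in the nested tree, with no deduplication) is an assumption for the example, not a documented description of how NPM computes its numbers.

```python
from collections import Counter

# Hypothetical dependency graph: package -> direct dependencies.
# Every name here is invented for illustration.
DEPENDENCIES = {
    "app": ["framework", "logger"],
    "framework": ["util", "logger"],
    "logger": ["util"],
    "util": [],
}


def count_downloads(root, graph):
    """Count one download per occurrence of a package in the nested tree.

    Assumption: every dependent fetches its own copy of each dependency,
    so a package reached by several paths is counted several times.
    """
    counts = Counter()
    stack = [root]
    while stack:
        name = stack.pop()
        counts[name] += 1
        stack.extend(graph[name])
    return counts


counts = count_downloads("app", DEPENDENCIES)
manual_requests = 1  # only "app" was asked for by a person
total_downloads = sum(counts.values())
```

Under these assumptions a single manual request for `app` produces seven counted downloads, three of them for `util`, which nobody requested directly.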

The challenges:

  • There is no way to determine external downloads apart from NPM internal (NPM dependency model) downloads
  • There is no way to determine unique (manual) requests apart from total downloads, which speaks to desirability versus unintended inclusion
  • There is no way to determine download traffic source

See also: #250

It would be nice to have more qualified numbers. In the reality of open source the numbers matter very little. In the rest of the world it is a validation. The rest of the world includes:

  • potential investment
  • family (from whom I rob time to write open source)
  • software developers who don't contribute to open source
@prettydiff
Owner

Examining http://jspm.io

@prettydiff
Owner

In order to prevent potential malicious entities from taking the NPM names reserved by user austincheney I will not unpublish from NPM. Instead Pretty Diff will upgrade to a new major version number when publication at an alternate venue is announced. Pretty Diff will continue to retain version 1.x.x at NPM, but it will no longer be maintained. This plan also minimizes any disruptions associated with this move.

@prettydiff
Owner

Custom (self-hosted) package management

Another solution I have been thinking about is to simply write a new Node application that does the job of NPM, but without a registry. You don't need a centrally managed registry, because the web has already solved this problem and applied the solution with great success in the past.

URI to the rescue

URI is always unique, is universal, and can be applied with varying degrees of specificity. Its universal nature solves the NPM namespace collision problem while simultaneously ignoring the friction of public/private or commercial/free.

Listing of basic requirements for package management

  • A data file that defines the package. Something like NPM's package.json.
  • A means to tarball a directory cross-OS.
  • A location where versions are saved
  • A default version, latest, that symlinks to the latest version tarball
  • A notification service (to beacon out when a new package version is available)
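The first two requirements can be sketched in a few lines. This is a minimal sketch, not the design of any existing tool: it assumes the data file is a `package.json` with `name` and `version` fields, and the `name_version.tar.gz` naming scheme and the function name `build_package` are hypothetical choices made for the example.

```python
import json
import tarfile
from pathlib import Path


def build_package(source_dir, output_dir):
    """Read the package data file and write a gzip tarball of the directory.

    Assumptions: the data file is named package.json and has "name" and
    "version" fields; output_dir lies outside source_dir.
    """
    source = Path(source_dir)
    meta = json.loads((source / "package.json").read_text(encoding="utf-8"))
    name, version = meta["name"], meta["version"]

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    archive = out / f"{name}_{version}.tar.gz"

    # tarfile is part of the standard library, so the same code runs on
    # any OS that has Python; arcname keeps local paths out of the archive.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=name)
    return archive
```

Calling `build_package("path/to/project", "path/to/output")` returns the path of the tarball it wrote.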

Example of an applied solution

The Pretty Diff project has its own domain at http://prettydiff.com/ so I will use this as a potential demonstration. Let's presume there is a http://prettydiff.com/package which contains an index file that points to the default version via symlink, so when you go to that address a tarball is returned. Let's also presume there is a http://prettydiff.com/package/versions directory, so for example http://prettydiff.com/packages/versions/1.16.37

When applied in this way the version format is completely custom so long as there is an address that points to a tarball of the matching version, such as http://prettydiff.com/packages/versions/1.16.37--alpha3. If you want the package to be private then move it to a web server behind a firewall. Even better would be to move it to an internally available address/domain scheme. Since this idea is reliant upon URI, the privacy of a given package is vested in the ability to resolve an address.
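The addressing scheme can be sketched as a small helper. The function name `version_url` is hypothetical, and the layout (a `versions/` path segment under the package address) is taken from the example addresses above rather than from any published specification.

```python
from urllib.parse import quote, urljoin


def version_url(base, version=None):
    """Return the address of one version, or the default address if none.

    The version string is treated as opaque text, so any custom format
    works as long as the server has a tarball at the resulting address.
    """
    root = base if base.endswith("/") else base + "/"
    if version is None:
        return root  # the default (latest) package lives at the base address
    return urljoin(root, "versions/" + quote(version, safe=""))


default_address = version_url("http://prettydiff.com/packages")
release_address = version_url("http://prettydiff.com/packages", "1.16.37")
alpha_address = version_url("http://prettydiff.com/packages", "1.16.37--alpha3")
```

Because the helper never parses the version, a private package only needs a different `base`, for example an address that resolves solely inside a company network.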

No service

The great thing about this is that it's an application and not a service. The only required service is an HTTP server (a simple one comes prepackaged with Node). The application must run in response to a command that creates a new package tarball, puts that tarball in the proper location (a version directory on the web server), updates the default package symlink, and optionally beacons that a new version is published.
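The publish step could look roughly like the following. It is a sketch under assumptions: the directory name `versions`, the link name `latest`, and the function name `publish` are hypothetical, the beacon is left out, and creating symlinks may need extra privileges on Windows.

```python
import os
import shutil
from pathlib import Path


def publish(archive, web_root):
    """Copy a tarball into the versions directory and repoint "latest".

    Assumptions: web_root is the directory the HTTP server exposes, and
    the server follows symlinks when serving files.
    """
    archive = Path(archive)
    root = Path(web_root)
    versions = root / "versions"
    versions.mkdir(parents=True, exist_ok=True)

    published = versions / archive.name
    shutil.copy2(archive, published)

    # Build the new link beside the old one, then rename it into place so
    # a client never observes a missing "latest".
    latest = root / "latest"
    staging = root / "latest.tmp"
    if staging.is_symlink() or staging.exists():
        staging.unlink()
    staging.symlink_to(Path("versions") / archive.name)
    os.replace(staging, latest)
    return published
```

The link target is relative, so the whole web root can be moved or mirrored without breaking `latest`.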

Beacons

This, like everything else in this idea, is completely open to custom definition. It would just have to be defined in the package's data file so that the format and point of access are known to the distant end where the package is used.

A client application

Unfortunately a client application would also be required to, at least, unpack the tarball. Some optional (ideal) qualities for the client-side app:

  • Install the package to a location defined in the command line PATH variable so that the package can be executed as a global command
  • Check the version against the last accessed beacon so that update notifications can be optionally presented
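A client along these lines might start as below. This is a minimal sketch with hypothetical function names. Since the version format is custom, the update check only compares version strings for equality. The path check rejects members that would land outside the install directory, but it is not a complete defence against hostile archives.

```python
import tarfile
from pathlib import Path


def unpack(archive, install_dir):
    """Extract a package tarball into install_dir and return that directory."""
    dest = Path(install_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:
        for member in tar.getmembers():
            # Refuse entries such as "../x" that resolve outside dest.
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return dest


def update_available(installed_version, beacon_version):
    """Report whether the last beacon names a different version.

    Versions are opaque strings here, so this cannot tell newer from older.
    """
    return installed_version != beacon_version
```

Installing as a global command would then be a matter of choosing an `install_dir` that is already listed in the PATH variable.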

Potential problems

  • discoverability - It could be hard to find relevant or superior packages without a centralized registry.
@prettydiff prettydiff added Underway and removed Not started labels Mar 28, 2016
@prettydiff
Owner

Writing alternate solution for NPM. The application is called biddle.

@prettydiff
Owner

Pretty Diff is now published with biddle. The publication point is: http://prettydiff.com/downloads/prettydiff/

Automated distribution and installation is now as easy as:

biddle install http://prettydiff.com/downloads/prettydiff/prettydiff_latest.zip
@ackalker
ackalker commented Nov 21, 2016 edited

Sad to see npm/npm#10686 mentioned (and still not fixed). I actually bisected the thing down to a single commit (a botched refactoring/variable renaming attempt). To my knowledge, no-one ever looked at that bad commit as a source of the problem. I think my comment simply got drowned out in the incredible noise from people pasting entire npm logs while trying to prove that they had "the same problem"...

@prettydiff
Owner
prettydiff commented Nov 21, 2016 edited

The one that really REALLY did it for me was #5082.

The problem for me isn't that there are defects in software or even the severity of the defects. All software has defects, some of which are hard to solve even for the people best positioned to provide the solutions. My problem is that the defects are so incredibly hard to solve that they remain unsolved for so long. Normally in open source software defects will remain open for a long time due to the limited time of volunteers. NPM has a paid staff, hundreds of contributors, thousands of participants, and millions of users. Limited manpower isn't an argument for NPM when problems are so dangerously critical.

This is a problem for me, because it is symptomatic of bigger problems. Why would such foundational problems be so hard to solve or so challenging to detect? Is it because of how the software is organized, managed, or executed? This line of thinking led me to believe the primary problem of NPM is its over-reliance on modularity.

When I was looking into issue 5082, I had trouble following the code. The application was divided into many different modules written by NPM staff. The NPM code style is really sloppy, which opens the code to some ambiguity, and the tar component was binary heavy with great use of typed arrays. Each of these qualities compounded the problem and likely prevented entry from many people willing to solve this problem. Had the various components been centralized into a single application the problem would have been easier to detect. Even though the code would still be sloppy, the problem would be confined to a single known location, so at the very least you would know where to start and could test your way through until the test criteria are firmly defined.

Although I identified a technical hurdle limiting understandability of the code, it wasn't until I heard the responses to the left-pad problem that I realized the problem isn't technical at all. It is purely a management problem of the code. This is a design problem more so than an execution problem. This is the thinking that led me to design an alternative. I knew I could do this not because of confidence in my programming skills, but because those responses identified a unique problem that NPM is less willing or able to solve itself.

@prettydiff prettydiff added Pending Release and removed Underway labels Dec 2, 2016
@prettydiff prettydiff added this to the 2.1.15 milestone Dec 2, 2016
@prettydiff
Owner

Use via biddle is now available in production. The release of branch 2.1.15 with issue #381 will mark a resolution to this issue. Once a convenient name and catalogue system (key-value pairs) are worked out, OS- and language-agnostic self-hosted/managed packages via CDN will be the future.

@prettydiff prettydiff referenced this issue Dec 2, 2016
Merged

2.1.15 #395

@prettydiff prettydiff closed this in #395 Dec 4, 2016
@prettydiff
Owner

Publishing updates with biddle is broken.

@prettydiff prettydiff reopened this Dec 4, 2016
@prettydiff prettydiff modified the milestone: 2.1.16, 2.1.15 Dec 4, 2016
@prettydiff prettydiff added Underway and removed Underway labels Dec 4, 2016
@prettydiff
Owner

Closing and working biddle defect from #396 and https://github.com/prettydiff/biddle/issues

@prettydiff prettydiff closed this Dec 4, 2016