Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: Provide standards around integrity between source code and published package #77

Open
shaunwarman opened this Issue Dec 10, 2018 · 85 comments

Comments

Projects
None yet
@shaunwarman
Copy link

shaunwarman commented Dec 10, 2018

Right now, it seems that most maintainers may publish their packages from their local environment. There should be a way to verify what is published against the public source code or specific git sha to maintain transparency of what is being published. Not only will this mitigate out of sync issues or accidents, but will provide greater confidence that additions aren't added as they are published (potentially malicious).

Not sure if this is the best place for this, but after reading through other issues and recent resources I thought I better put this down somewhere. And it brings up the discussion of maintainers permissions to not only package registry, but SCM as well.

@ljharb

This comment has been minimized.

Copy link
Contributor

ljharb commented Dec 10, 2018

npm doesn’t require a repo, and any package with a prepublish build process (read: every Babel user) will correctly have the published package not matching any sha in their git repo.

I’m not sure how we’d be able to do any sort of verification in a consistent and automated way.

@mcollina

This comment has been minimized.

Copy link
Member

mcollina commented Dec 11, 2018

The only way to guarantee this is to compute an hash of all the content of a module (in prepublish) and cryptographically sign it. However, there is no easy way to know:

a. what are the keys that are allowed to sign a given package
b. associate those keys with npm profiles

People have been asking for a similar feature to npm for the last 2-3 years.

@shaunwarman

This comment has been minimized.

Copy link
Author

shaunwarman commented Jan 8, 2019

Thanks @ljharb @mcollina and sorry for the late reply.

This article by @skonves gave me a nice reminder of this topic. And it seems like there are tools in his tbv and another in npm-verified that attempt to do similar.

There would need to be a blessed, yet optional process to add some sort of "trusted": true || false or "verified": true || false metadata that npm or other package managers would need to set. Optional because of the common cases of transpiled code (e.g. babel, etc) and because this is an expensive task with trade offs.

@sompylasar

This comment has been minimized.

Copy link

sompylasar commented Jan 9, 2019

My 1 cent out of 2 (don't have time for more): npm doesn't require a repo, but a package verification service on top of npm might require whatever is needed, and package authors that want to conform and get the "verified" badge will try and make sure their package is good. For example there are similar efforts to standardize open source repositories: https://github.com/todogroup/repolinter — there could be tools that help package authors to manage packages efficiently (I personally miss the kind of tool that would automatically set up CI and releases for me reproducibly and reliably).

@ljharb

This comment has been minimized.

Copy link
Contributor

ljharb commented Jan 9, 2019

Indeed, that’s something npm can solve - but i don’t think it’s something node, and thus we, can.

@niftylettuce

This comment has been minimized.

Copy link

niftylettuce commented Feb 13, 2019

Chiming in here, we really need something like this... koa-router was just transferred to a relatively unknown user on GitHub and the package name was apparently sold

ZijianHe/koa-router@bd780c9#diff-04c6e90faac2675aa89e2176d2eec7d8R3

Screenshot (in case commit is force deleted):

screen shot 2019-02-13 at 2 38 39 pm

I've version locked koa-router, which is downloaded 135K+ times per week, and subscribed on https://libraries.io/npm/koa-router to get a notification when a new version is published of this package.

HN: https://news.ycombinator.com/item?id=19156707

@ljharb

This comment has been minimized.

Copy link
Contributor

ljharb commented Feb 13, 2019

@niftylettuce i'd suggest reporting that to npm; i doubt their TOS permits the sale of a package name.

@niftylettuce

This comment has been minimized.

Copy link

niftylettuce commented Feb 13, 2019

I did report that to NPM, and here was there response...

screen shot 2019-02-13 at 2 47 19 pm

I also reached out to the author of the package koa-router and received a very negative response which I don't wish to share publicly out of respect since I see no malicious version of the package published yet.

@justinmchase

This comment has been minimized.

Copy link

justinmchase commented Feb 13, 2019

@niftylettuce You did the right thing here even if nothing negative comes of it. Its a red flag for sure and others deserve to be alerted by this information. Good job on being alert.

@niftylettuce

This comment has been minimized.

Copy link

niftylettuce commented Feb 13, 2019

To anyone reading this, I'm building a tool to automate this nonsense, at least until Node/NPM do something about it. Email me at niftylettuce@gmail.com if you want to get notified once it's up. I'll notify everyone that posted in this thread and/or left reactions as well. It will be free and open-source.

@Enrico204

This comment has been minimized.

Copy link

Enrico204 commented Feb 13, 2019

Maybe it's time to think to an NPM alternative.

@iarna

This comment has been minimized.

Copy link
Member

iarna commented Feb 14, 2019

What's more, even if npm validated against the git repo at publish time, the user can just force push after publication. Checking this kind of thing does nothing.

If you want to validate at install time, well, I hope you enjoy your multi-hour install times. =p (Seriously: In modern npm or yarn, install time per package is under 10ms—adding a git clone to that mix would massively increase overall run time.)

This kind of action gets you no security whatsoever. It solves no actual problem. It validates nothing.

@freewil

This comment has been minimized.

Copy link

freewil commented Feb 14, 2019

@iarna what do you think about a new command for npm, I've been thinking about a npm diff that you would be able to use while upgrading a package to see actual changes. Haven't thought about it super deep, but npm update or npm install mypkg@latest followed by a npm diff or similar.

This would effectively allow you to review the code of new/updated packages, similar to how adding node_modules to version control would allow for deeper code reviews.

@6a68

This comment has been minimized.

Copy link

6a68 commented Feb 14, 2019

@niftylettuce Hey, if you're going to work on a solution, it would be way better to start a repo on github, so people can contribute ideas/feedback via issues

@iarna

This comment has been minimized.

Copy link
Member

iarna commented Feb 14, 2019

@freewil That seems much more useful, but I am concerned how you scale that out to the thousands of deps in a typical modern deployment. 'cause yeah, the one thing you asked for may have a reasonable diff, but what about the dozen transitive deps that also updated? Still, I think this would be an excellent place to begin experimenting with, to see how it feels (--dry-run --json can get you what an action would have done, in machine readable format).

@niftylettuce

This comment has been minimized.

Copy link

niftylettuce commented Feb 14, 2019

@6a68 yes it will be on GitHub, I will post the link here once I have a proof of concept up

@niftylettuce

This comment has been minimized.

Copy link

niftylettuce commented Feb 14, 2019

To anyone reading this - please do not harass, email, or contact the original maintainer of the package mentioned in the above discussion. It was never my intention for anyone to harass them. I simply wanted to raise awareness about this issue with NPM and the potential of this becoming a security issue in general. This is not the only package like this.

@niftylettuce

This comment has been minimized.

Copy link

niftylettuce commented Feb 14, 2019

Another update on the koa-router issue for anyone subscribed to this thread. The new maintainer @ZijianHe has provided us with an update ZijianHe/koa-router#494 (comment). Hopefully this eases concern and it looks like we have a new contributor in open source land.

For the CLI npm diff command, would it just accept two different versions of a package? npm diff <package> <version-a> <version-b> and diff compare tarball?

basti1302 added a commit to instana/nodejs-sensor that referenced this issue Feb 14, 2019

deps(test): pin koa-router for now
See
nodejs/package-maintenance#77 (comment)

Might be that nothing malicious is done with the package after a
transfer but let's rather be safe than sorry.
@Enrico204

This comment has been minimized.

Copy link

Enrico204 commented Feb 14, 2019

Maybe the best "way" to handle the change of a maintainer is that the new maintainer should open its own repository on NPM. They should not allow a "takeover" procedure: instead, the installation may fails with a message like "hi, this library is not maintained anymore by olddev, there is a new version in newdev". At least the developer knows that there is something going on..

@mcollina

This comment has been minimized.

Copy link
Member

mcollina commented Feb 14, 2019

I think the current way npm does this is by forcing a bump in a major release. That's enough to protect users form a malicious "takeover" in the case of a non-responsive maintainer.

@panva

This comment has been minimized.

Copy link

panva commented Feb 14, 2019

I think the current way npm does this is by forcing a bump in a major release. That's enough to protect users form a malicious "takeover" in the case of a non-responsive maintainer.

Is that so? Any way to confirm?

@mcollina

This comment has been minimized.

Copy link
Member

mcollina commented Feb 14, 2019

I've done this several time, and that's how it works. I looked in https://www.npmjs.com/policies/disputes but there is no mention about that.

@ljharb

This comment has been minimized.

Copy link
Contributor

ljharb commented Feb 14, 2019

They certainly don’t; and can’t, because a legitimate new maintainer should be able to backport fixes as needed anyways.

Any owner can always and forever publish to any previously unused version number, and that’s how it must stay.

@sompylasar

This comment has been minimized.

Copy link

sompylasar commented Feb 14, 2019

@iarna

What's more, even if npm validated against the git repo at publish time, the user can just force push after publication. Checking this kind of thing does nothing.

I assume you assume the evil user force pushes to remove malicious code?

If such force push leads to file changes, git commit ids will change. Npm would record the commit id it verified against, and a standalone checksum of published files. If there's no such commit in the repo, it's a red flag. If running the same checksum at the same commit id files results in a different checksum, it's a red flag.

I believe this verification has to be a background task in npm; if one of the red flags has triggered, this package downloads are paused to prevent further spread until they are resolved. There has to be cache invalidation mechanism that would notify downstream caches to not use the flagged package.

I'm in no way a security expert though, just brainstorming. What do the npm security people that npm aquihired say?

@sompylasar

This comment has been minimized.

Copy link

sompylasar commented Feb 14, 2019

The way I experimented in npm-verified POC is I needed a standardized package build+publish process (I used npm prepare for that). The packages that conform to the process can be marked as verified if the checks pass. The packages that do not — never. Like in Chrome, the sites that do not use HTTPS are marked as "Not Secure" by default, and even more, colored in red if they collect some user input via a form (for npm, this can be translated to using some OS APIs like process, network, and filesystem).

@freewil

This comment has been minimized.

Copy link

freewil commented Feb 14, 2019

@niftylettuce

For the CLI npm diff command, would it just accept two different versions of a package? npm diff <package> <version-a> <version-b> and diff compare tarball?

Yeah, I think that would be one version/use-case of the command. The way I'm currently thinking, it'd be nice to also have a no argument version similar to git diff where the current unstaged/uncommitted changes are shown, but that may be more complex than necessary for an initial version as it might require git/version control integration. I'd be hesitant in entangling npm with version control-specific code, but I think there is already a precedent with some commands - one that I'm aware of is npm version, which creates git tags for you.

@freewil

This comment has been minimized.

Copy link

freewil commented Feb 14, 2019

@freewil

but that may be more complex than necessary for an initial version as it might require git/version control integration.

One alternative to prevent the need for tracking changes/state would be just to add a --diff flag to npm install that would output a diff. That should be much easier since npm install already mutates package.json (which means it needs to/could be aware of the version change while running) and prevents needing to track changes/state across two commands, as I originally proposed (npm install followed by npm diff).

@sompylasar

This comment has been minimized.

Copy link

sompylasar commented Feb 15, 2019

Inbefore, I understand that npm prohibits publishing the same version twice, if that's what you were referring as a new version @ljharb

@dominykas

This comment has been minimized.

Copy link
Contributor

dominykas commented Feb 15, 2019

The lock file does not have to live in a commit, nor does it have to live inside the tar. It is trivial to automate publishing it as part of GH release artifacts. This means that you have it, in case you need it, but you don't force it onto your contributors or users (and yes, I have little love for lockfiles myself, but we do keep them around for when we need to patch production, in a similar way as I described here, ie not affecting npm installs, unless explicitly needed).

Is that something a lot of maintainers will do? Probably not. Is that something the folks in this fine maintenance WG can help with? Definitely. Are reproducible builds something that businesses would be interested in? I think so, if that can be provided without overheads. So I think it's in scope and moves towards goals of this WG?

I think there's a lot of ways to solve the original problem stated in the topic of this thread. I'm very happy to see userland attempts. Maybe we need a break here, as part of this WG, and let these attempts move forward, maybe invent some competing ideas, and bring them back to build a standard and a recommendation.

It is very clear that there is interest in this, and there is value in this in at least some contexts, so I'm honestly surprised at the opposition. What are the downsides of establishing such a standard?

@mcollina

This comment has been minimized.

Copy link
Member

mcollina commented Feb 15, 2019

Myself and a lot of other maintainers with a huge numbers of packages think lockfiles are a bad idea for modules. I hope this clarifies our position: sindresorhus/ama#479 (comment).

@mcollina

This comment has been minimized.

Copy link
Member

mcollina commented Feb 15, 2019

lockfiles are a very good idea for applications, which is the high majority of users. However, they are not really the target of this initiative.

@dominykas

This comment has been minimized.

Copy link
Contributor

dominykas commented Feb 15, 2019

@mcollina this is not about lockfiles. I don't think they make sense as part of the package or part of the repo. They do make sense as a snapshot in time to achieve reproducible builds.

Once again - they should not be published on git or npm - only kept around as a convenient metadata format.

@ljharb

This comment has been minimized.

Copy link
Contributor

ljharb commented Feb 15, 2019

@dominykas package-lock is already not published on npm; it'd be fine to keep a lockfile around if it didn't constrain npm install, but the only way to do that would be to rename it and commit it to git.

That might be reasonable, however - a sort of package-release.json that you'd update in-place in git, and would effectively be npm install --package-lock-only --package-lock && mv package-{lock,release}.json right before publishing a release.

@dominykas

This comment has been minimized.

Copy link
Contributor

dominykas commented Feb 15, 2019

@ljharb that is also an option, but it depends on the toolchain. I've literally spent this week setting up semantic-release, because we have locked masters and therefore can't have version commits. This also means that we need to push release meta data outside of git repo. We also have a pattern where we push shrinkwraps into dangling tags (i.e. a shrinkwrap-X.Y.Z for every vX.Y.Z). Basically anything but master (or npm, for various reasons) :)

I wish npm had a way to store this, and some other, release meta data, but not necessarily distribute it with the package.

eysi09 added a commit to garden-io/garden that referenced this issue Feb 19, 2019

chore(node_modules): lock koa-router version
The ownership of the koa-router package was transferred to an "unknown"
user.

See more on Github: nodejs/package-maintenance#77,
Hacker News: https://news.ycombinator.com/item?id=19156707 and the npm
blog: https://blog.npmjs.org/post/182828408610/the-security-risks-of-changing-package-owners
@styfle

This comment has been minimized.

Copy link

styfle commented Feb 19, 2019

A possible solution to this problem (integrity between source code and published package) is to provide a "builder" service that take the source code, builds it, then publishes the output.

ZEIT Now is probably 80% of the way there because builds are immutable, every build has a unique URL, the source code is captured during the build process, and you don't need a repo to publish. But if you do have a repo, you can connect it to the build system so that pushing to master (or any branch) automatically publishes.

There's a couple changes that would need to be implemented in npm:

  1. The first missing piece is some way to link the published npm package back to the builder source so that npm (and users) can verify integrity (probably add a field in the generated package.json).

  2. The second missing piece is some sort of allow-list that npm uses to trust certain builders (if anyone could host their own builder service, they could circumvent this integrity).

  3. The third piece is an optional flag that users can use to opt-in to this type of integrity check. Something like npm install --builder-integrity that will stop with an error if any package is missing this new builder field in the package.json or the builder field is not in the allow-list.

@Bene-Graham

This comment has been minimized.

Copy link

Bene-Graham commented Feb 19, 2019

A possible solution to this problem (integrity between source code and published package) is to provide a "builder" service that take the source code, builds it, then publishes the output.

ZEIT Now is probably 80% of the way there because builds are immutable, every build has a unique URL, the source code is captured during the build process, and you don't need a repo to publish. But if you do have a repo, you can connect it to the build system so that pushing to master (or any branch) automatically publishes.

There's a couple changes that would need to be implemented in npm:

1. The first missing piece is some way to link the [published npm package](https://react-tsx.now.sh) back to the [builder source](https://react-tsx.now.sh/_src) so that npm (and users) can verify integrity (probably add a field in the generated `package.json`).

2. The second missing piece is some sort of _allow-list_ that npm uses to trust certain builders (if anyone could host their own builder service, they could circumvent this integrity).

3. The third piece is an optional flag that users can use to opt-in to this type of integrity check. Something like `npm install --builder-integrity` that will stop with an error if any package is missing this new builder field in the `package.json` or the builder field is not in the _allow-list_.

I was initially thinking the same thing.

But what stops someone from having a build task that inject malicious code into the "built package"?


Wouldn't a per module sandbox / permission system to protect against certain actions (Network, Disk Access, etc) help?

@freewil

This comment has been minimized.

Copy link

freewil commented Feb 20, 2019

@styfle

A possible solution to this problem (integrity between source code and published package) is to provide a "builder" service that take the source code, builds it, then publishes the output.

I was also thinking this. Basically the idea is to have a trusted third-party do the packaging from source.

Docker Hub might be a good place to look for inspiration. With Docker, it can be an even bigger issue, as docker images are built into binary files and pushed up to a public repository. Although, Dockerfiles/images are generally much more simple.

https://blog.docker.com/2013/11/introducing-trusted-builds/

@sompylasar

This comment has been minimized.

Copy link

sompylasar commented Feb 20, 2019

But what stops someone from having a build task that inject malicious code into the "built package"?

There shouldn't be any user-provided build task code while using one of the trusted builders, only a build pipeline configuration, for example, like https://github.com/pikapkg/pack does.

Wouldn't a per module sandbox / permission system to protect against certain actions (Network, Disk Access, etc) help?

Yes, that, too, may increase confidence, and complexity. That is needed, too, but is orthogonal to this particular discussion about package build-from-source integrity.

@ljharb

This comment has been minimized.

Copy link
Contributor

ljharb commented Feb 20, 2019

The build process itself needs to be able to contain user-written transforms, if it's babel, or a bundler, to support common and valid use cases.

eysi09 added a commit to garden-io/garden that referenced this issue Feb 20, 2019

chore(node_modules): lock koa-router version
The ownership of the koa-router package was transferred to an "unknown"
user.

See more on Github: nodejs/package-maintenance#77,
Hacker News: https://news.ycombinator.com/item?id=19156707 and the npm
blog: https://blog.npmjs.org/post/182828408610/the-security-risks-of-changing-package-owners
@sompylasar

This comment has been minimized.

Copy link

sompylasar commented Feb 20, 2019

The build process itself needs to be able to contain user-written transforms, if it's babel, or a bundler, to support common and valid use cases.

It doesn't if all the common and valid use cases like babel and bundler are standardized into trusted builders. I guess there should be either no way, or a hard and impractical way, to tamper with the executable content (e.g. JavaScript code) of a trusted build by providing a malicious build step.

But again, I'm not a security expert. I pinged the Node.js SecurityWG to take a look at this discussion as I think it is rather opitionated, not engineered as a document with clear sets of requirements, tradeoffs and solutions.

@travi

This comment has been minimized.

Copy link

travi commented Feb 20, 2019

It doesn't if all the common and valid use cases like babel and bundler are standardized into trusted builders

i think you missed a very important detail in the point you quoted:

contain user-written transforms

part of the valid use of babel is to apply custom transformations beyond those provided by babel itself or even official plugins. these are often included as devDependencies. locking down approved builders that didnt allow the inclusion of those custom transformations would eliminate a significant portion of the value of tools like this, but allowing them does allow for changes that would be difficult to ensure are not malicious.

@sompylasar

This comment has been minimized.

Copy link

sompylasar commented Feb 20, 2019

part of the valid use of babel is to apply custom transformations beyond those provided by babel itself or even official plugins. these are often included as devDependencies. locking down approved builders that didnt allow the inclusion of those custom transformations would eliminate a significant portion of the value of tools like this, but allowing them does allow for changes that would be difficult to ensure are not malicious.

I'd like to see some data about how popular these are, and whether they can be standardized. They can be published as a trusted package that itself does not use them, thus building a chain of trust. I hear you, but there's always a tradeoff between security and convenience.

@ljharb

This comment has been minimized.

Copy link
Contributor

ljharb commented Feb 20, 2019

@sompylasar packages are all already implicitly trusted; the attack vector i thought you're trying to address here is when a trusted package is hijacked, for which the same risks exist for any package, including ones in the build system.

@sompylasar

This comment has been minimized.

Copy link

sompylasar commented Feb 20, 2019

@sompylasar packages are all already implicitly trusted; the attack vector i thought you're trying to address here is when a trusted package is hijacked, for which the same risks exist for any package, including ones in the build system.

Yeah I guess you're right. Well that's what I mentioned above. We all would benefit from a document (like an RFC) with the list of potential attack vectors, and how they are addressed or can be.

@skonves

This comment has been minimized.

Copy link

skonves commented Feb 20, 2019

I have actually already put together a draft of a graph of the NPM ecosystem for just that purpose. Each edge and vertex definitely has its own unique set of challenges. For the past few evenings, I have been working on a writeup, but I wouldn't mind opening it up for community feedback/contribution.

Here is the draft of the graph I have so far:
ecosystem draft

I have a draft on Medium right now, but I will move that to a more public place in a heartbeat if it means more eyes on it. Thoughts?

FWIW, my work with TBV/verifynpm and @sompylasar's similar work with npmverified seems reasonably confined to the "publishes to" edge near the top; however, I strongly agree with Ivan that understanding the full picture helps inform any proposed solution (particularly what is in and out of scope).

@JamesMGreene

This comment has been minimized.

Copy link
Contributor

JamesMGreene commented Feb 21, 2019

Wouldn't a per module sandbox / permission system to protect against certain actions (Network, Disk Access, etc) help?

I saw an interesting post about this idea from the NPM perspective not too long ago: https://hackernoon.com/npm-package-permissions-an-idea-441a02902d9b

CC @davidgilbertson

@azu azu referenced this issue Feb 23, 2019

Open

Package manager #2

@tlhunter

This comment has been minimized.

Copy link

tlhunter commented Feb 26, 2019

FWIW, we built Package Diff to gain visibility into npm package updates by directly examining the package instead of looking at the potentially-misleading source code repository:
https://diff.intrinsic.com

@sompylasar

This comment has been minimized.

Copy link

sompylasar commented Feb 26, 2019

FWIW, I built package diff for the purpose of comparing next prepared version with the latest published one for the review and version bump purposes https://github.com/sompylasar/zmey-gorynych/blob/9029972/bin/cli.js#L512

@justinmchase

This comment has been minimized.

Copy link

justinmchase commented Feb 27, 2019

@tlhunter very nice.

I had one thought, one way to get around it. What if you were using node-pre-gyp which fetches pre-compiled binaries rather than builds them from the source in the package and someone malicious had uploaded a pre-compiled binary from something other than the source that is published, how could we verify that they match?

@tlhunter

This comment has been minimized.

Copy link

tlhunter commented Feb 27, 2019

@tlhunter very nice.

I had one thought, one way to get around it. What if you were using node-pre-gyp which fetches pre-compiled binaries rather than builds them from the source in the package and someone malicious had uploaded a pre-compiled binary from something other than the source that is published, how could we verify that they match?

For the system to be perfect we'd need to throw away the concept of npm install scripts. One could perform an npm install of every package ever made, on a set schedule, and constantly diff the artifacts left behind, but even that wouldn't be perfect. Like, the fetching of pre-compiled binaries might include information about the host machine. The event-stream incident, for example, would only inject malicious code if certain criteria were met.

I'd like to see some sort of package manager service which does the following:

  1. Distributes both a tarball AND is the git repo
    1. The user wouldn't be able to generate tarballs
    2. The server would guarantee correlation between git tags and tarballs
  2. Disable all installation scripts
@justinmchase

This comment has been minimized.

Copy link

justinmchase commented Feb 28, 2019

Yeah, I agree that sounds like the only way. I could imagine npm flagging releases that were published that way and you could then opt to only accept verified dependencies if you wanted.

It would be tricky for pre-compiled binary modules as they could have a large build matrix for platforms and versions but it would be enormously helpful for confidence and security.

@shaunwarman

This comment has been minimized.

Copy link
Author

shaunwarman commented Mar 8, 2019

Interesting and related proposal in the golang ecosystem:
https://go.googlesource.com/proposal/+/master/design/25530-notary.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.