Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Security and centralization issues with NPM packages #905

Closed
braydonf opened this issue Nov 6, 2019 · 11 comments · Fixed by #910
Closed

Security and centralization issues with NPM packages #905

braydonf opened this issue Nov 6, 2019 · 11 comments · Fixed by #910
Milestone

Comments

@braydonf
Copy link
Contributor

braydonf commented Nov 6, 2019

When installing via git and verifying signatures, the integrity of the dependent packages is verified using a sha512 hash via the npm package-lock.json file. Thus if the installation of nodejs with npm is verified you can be certain that the packages are correct.

However there are a few issues:

  • There is a single point of failure in regards to installing and publishing those packages.
  • When upgrading packages that have a locked version there isn't signature verification for the release. Thus when bcoin is installed via npm or used as a dependency in a package the signature is not verified as it would when installed via git.
@braydonf
Copy link
Contributor Author

braydonf commented Nov 6, 2019

I have setup a few different explorations as to remove the dependency upon npm packages.

Git + Yarn Workspaces

This uses a feature of yarn package manager called workspaces. It solves both single point of failure and centralization of dependencies as well as signature verification (as it is not necessary). It manages several packages within one git repository, thus it's not necessary to reach out to any centralized service to install or publish a dependent package; as git itself is decentralized. The exploration has included all dependencies managed in a git repository.

https://github.com/braydonf/bcoin/tree/pkg-workspaces

Git + NPM w/ Semver

This uses a feature of npm to install dependencies via git while also resolving semantic versioning of releases. It somewhat solves the single point of failure and centralization of dependencies, however it doesn't solve the issue with verifying signatures of releases. All dependencies are currently specified via a Github remote, while it should technically be able to point to any git repository.

https://github.com/braydonf/bcoin/tree/pkg-git

Note: There is however an issue that git uses a sha1 hash for the integrity. Though git uses a hardened variation of sha1, it would be preferable to use a stronger hash algorithm. That is true in any use of git. There is progress to move towards a stronger hash algorithm in git, however it's unclear when it will land.

@tynes
Copy link
Member

tynes commented Nov 6, 2019

Great work putting together these options. I see benefits of both options but ultimately do not support a monorepo for multiple reasons, so I support git + NPM with semver.

The packages were split out to support development of hsd. It is not immediately obvious to me that hsd would be able to depend on the packages based in a different workspace after looking at their docs. I foresee the hsd community growing larger than the bcoin community in the not so distant future, and to split the dependencies would be discounting the potential contributions by people in the hsd community. For example, bcoin and hsd get to benefit from these changes, all made by an hsd contributor: https://github.com/bcoin-org/binet/pulls

It would require having to install a yarn when developing and the team has a history of being lean when it comes to depending on external things. It is a fairly large change and may impact jj contributing in the future. jj is actively developing bcrypto, so we will have to backport all of his changes.

Will ci run tests for all packages, or just bcoin?

We have a tool specifically made for doing signed release called bpkg and I have explored its usage in #836. I think that we should be doing releases using that for when people want to run bcoin in production. We could make the release signing key very obvious and sign the bundled package that bpkg --release creates. Cloning the repository would be better for development, and downloading the signed release would be better for production. Creating the release could be done in a minimal container.

All open pull requests would have to be reopened in the main bcoin repository. Any completely external projects will no longer get patches, it will become much more difficult to share code with other projects, including my personal projects. I will end up continuing to use the bcoin-org repos. The number of issues from other packages may become distracting. I do not want to spend a lot of time porting bugfixes between repos to make sure that hsd gets the fixes as well.

Could you link to the docs where it shows how signature verification works with yarn? After searching their docs, I could not find any description of how it works.

Some background on the sha1 used by git
https://github.com/git/git/blob/master/Documentation/technical/hash-function-transition.txt

Over time some flaws in SHA-1 have been discovered by security
researchers. On 23 February 2017 the SHAttered attack
(https://shattered.io) demonstrated a practical SHA-1 hash collision.

Git v2.13.0 and later subsequently moved to a hardened SHA-1
implementation by default, which isn't vulnerable to the SHAttered
attack.

Thus Git has in effect already migrated to a new hash that isn't SHA-1
and doesn't share its vulnerabilities, its new hash function just
happens to produce exactly the same output for all known inputs,
except two PDFs published by the SHAttered researchers, and the new
implementation (written by those researchers) claims to detect future
cryptanalytic collision attacks.

I agree that it would be ideal to switch to another hashing algorithm as soon as possible. It still seems doubtful that there will be a collision internal to git, I would really hope that they would move quickly if it was to become a large threat.

While cloning the repo using git, authentication is also achieved using ssh trust on first use (~/.ssh/known_hosts) or https certificate authorities.

I was unable to find a github SSHFP DNS record to check against the github.com git server's ssh public key. We could note that key in a public location, and advise people that are extra cautious to make sure the fingerprint in their known_hosts matches. Since ssh will not work when that fingerprint changes, it really only needs to be checked on first use.

Even though it would be using Github, we wouldn't be locked into using Github forever because, like you mentioned, any git remote should be able to work.

@braydonf
Copy link
Contributor Author

braydonf commented Nov 6, 2019

Another exploration:

Git Submodules (without npm or yarn)

This uses a feature of git called submodules to install dependencies via git. There is no need to run npm install or yarn install, however you need to run npm rebuild or node-gyp rebuild to build C/C++ code. It somewhat solves the single point of failure and centralization of dependencies, as the git remotes could be any git repository. However unlike the workspaces direction, the dependencies are not copied and much be fetched, which could be unavailable. If submodules where also to use submodules for dependencies, there may be issues with de-duplication of packages as semantic versioning isn't resolved. This also resolves the issues with signatures of releases as it is a feature of git, there may need to be some scripts to assist in that for verifying submodules.

https://github.com/braydonf/bcoin/tree/pkg-submodules

Use this to clone the branch:

git clone --recursive --branch pkg-submodules https://github.com/braydonf/bcoin.git

And then build the C/C++ code:

npm rebuild

Which is actually running node-gyp, and could be instead:

node-gyp rebuild

@braydonf
Copy link
Contributor Author

braydonf commented Nov 7, 2019

Another exploration:

Git Package Manager (gpm)

This uses git for package management. It solves all the issues. Signatures are verified and packages are decentralized. The dependencies can be configured with with multiple sources of mirrors. The remotes can be local files, .onion services, and standard git urls. Semantic versioning is used to resolve dependencies to avoid any duplication of dependencies.

https://github.com/braydonf/gpm
https://github.com/braydonf/bcoin/tree/pkg-gpm

git clone --branch pkg-gpm https://github.com/braydonf/bcoin.git
cd bcoin
gpm install

@braydonf
Copy link
Contributor Author

braydonf commented Nov 8, 2019

Another exploration:

Git Committed Modules

This commits all node_modules into the bcoin git repository. It does not depend on npm with the exception of it being necessary to run npm rebuild. Packages would not need to be published to a registry, and also solves the single point of failure in terms of the availability of packages. However, it does not work well with dependencies that are common to other dependencies. As a result, there would be de-duplication to handle. It also does not address the issue of signature verification when updating a module, as similar to the existing use of a lockfile.

https://github.com/braydonf/bcoin/tree/pkg-nodemodules

git clone --branch pkg-nodemodules https://github.com/braydonf/bcoin.git
cd bcoin
npm rebuild

@braydonf
Copy link
Contributor Author

braydonf commented Nov 14, 2019

@tynes

Re: Use in applications that depend on bcoin and libraries.

With yarn workspaces there would be multiple packages published, presumably to the npm repository, as is current. Other applications would continue to depend on those packages, and lock the versions using yarn.json, package-lock.json, or vendor them.

Re: It would require having to install yarn.

I think this comes with benefits. Currently the way to upgrade npm is to run npm i -g npm, which does not do signature checks. For Debian based systems yarn is upgraded and installed via apt with signature verification. Other distro have yarn available through the official package manager. See https://yarnpkg.com/en/docs/install

Re: Continous integration testing

Continous integration would run for all of the packages within the repository in the case of yarn workspaces. Currently many dependencies are not run with CI. There would also likely be some benefit in reduced overhead in running CI with Jenkins for multiple environments and configurations. There will also be opportunity for integration testing between the packages.

Re: bpkg

I think this tool is mostly for bundling releases with similar functionality to browserify. Something that is used as a last step, however would not be used to make a release for every dependency, or the method of which packages specify dependencies and install them.

Re: yarn signature verification of packages.

While yarn introduced a lockfile before npm added it, signature verification of dependencies is not a supported feature of yarn.

@braydonf
Copy link
Contributor Author

braydonf commented Nov 15, 2019

Another exploration (combination):

Git Semver + Committed Modules (works with npm and yarn)

This uses a feature of npm and yarn to install dependencies via git while also resolving semantic versioning of releases. It also solves all the issues. There isn't a single point of failure of dependencies for publishing or installing. Dependencies are guaranteed to be available at any point in previous version history. The modules are committed, and a signature is added for all dependencies, to be verified.

https://github.com/braydonf/bcoin/tree/pkg-gitmod

git clone --branch pkg-gitmod https://github.com/braydonf/bcoin.git
cd bcoin
npm rebuild

Note: Both npm rebuild and yarn rebuild can be used to build. Additionally npm and yarn can be used in development for updating dependencies, running tests, rebuilding and etc.

@pinheadmz pinheadmz added the feature Adding a new feature label Nov 18, 2019
@pinheadmz
Copy link
Member

pinheadmz commented Nov 19, 2019

As much as I love the GPM idea, and the tradition of bcoin-org writing our own tools from scratch AND the fact that it solves all our problems and adds features the world needs, I'm worried that it will be problematic for new users. I'm worried that "installing something just to install something" could potentially add more issues than it is worth, especially since Github package registry is a thing, and that's going to be confusing.

The next coolest option out of these proposals I think is Git submodules. It uses an application users already have installed, it keeps all the dependencies separate so they can still be accessed by Handshake, and removes npm (as a package repo) from the equation. I'm a little confused about how the versions in package.json relate to the installation after submodules are employed (would those just be removed from package.json?). And for a user just looking at the repo, you can see which commits a dep is from, but not it's version (e.g. bcrypto @ e454320), but I don't know if that's an issue at all, might just be slightly less human-friendly. I also really like that since each dep is installed from git, it has the complete git tree in each folder, making forks, PRs, and commit checkouts really easy for each dep (something an npm install doesn't have).

The committed module options (Git + Yarn Workspaces) I feel like are not good options simply because Handshake shares dependencies with bcoin, and bcrypto is especially sensitive, I feel like this would make the bcoin commit log really messy, and we'd be dealing with constant updates to bcrypto, possibly incompatibilities between Handshake and bcoin, etc...

Finally, Git + NPM w/ Semver was problematic for me due to the use of SSH. I got a dozen Enter passphrase for key ... messages, and nothing installed. It looked frozen and nothing was happening in the directory. I'm sure this could be fixed or I'm just doing something wrong, but again it's going to add to the pain of installing bcoin.

I'm going to attempt to summarize everything in a chart, let me know what I get wrong (@braydonf especially where npm checks pakage-lock sha512 hash instead of just the git signed commit)

proposal monorepo integrity decentralized required issues
npm (current) no sha512
package-lock.json
no npm permission to publish
Git + Yarn Workspaces yes sha1
git signed commit
possible
with git mirrors
git, yarn
Git + NPM w/ Semver no sha1
git signed commit
possible
with git mirrors
git, npm local ssh keys?
Git Submodules no sha1
git signed commit
possible
with git mirrors, but maybe not for individual deps?
git, npm 2-step install
(npm rebuild)
Git Package Manager no sha512
custom
yes
specify multiple remotes
git, GPM install-to-install,
possible confusion with git package registry
Git Committed Modules no
deps are vendored inside bcoin, but released elsewhere
sha1
git signed commit
possible
with git mirrors
git, npm de-duplication of sub-dependencies
Git Semver + Committed Modules no
deps are vendored inside bcoin, but released elsewhere
sha1
git signed commit
possible
with git mirrors
git, npm

@pinheadmz
Copy link
Member

One more thought about GPM: could it be integrated directly into bcoin? I know this sounds ironic because I'm sort of not into mono-repos, but if we could maintain this ease of installation:

git clone bcoin
cd bcoin
npm run gpm-install

I think that would help. (Or some other 3rd command there where the user wouldn't have to even know GPM is doing all the work).

@pinheadmz
Copy link
Member

I think it's important to know how any of these proposals will affect Handshake, and specifically if @chjj has any input -- especially if we are going to consider removing bcoin from npm, so that it can only be installed by a more decentralized, secure process.

@braydonf braydonf removed the feature Adding a new feature label Nov 19, 2019
@braydonf
Copy link
Contributor Author

braydonf commented Nov 20, 2019

Re: Installing yarn or gpm as necessity to install bcoin.

Committing the node_modules into git removes that necessity as npm, yarn or gpm could be used, as it would only be necessary to run npm rebuild, yarn rebuild or gpm rebuild.

This is also where packaging to be distributed via operating system package managers will be useful for releases of bcoin that bundle all dependencies. It will fit into existing package manager workflows.

Re: Git submodules.

Having the complete git tree available for each dependency is useful. There is however a de-duplication issue with that approach, to which there are not tools (npm and yarn) to automatically de-duplicate those based on semantic versioning (a.k.a semver).

Re: Issues with ssh and git when installing with npm.

I think this is also another case where having the node_modules committed helps, as instead of just a hash of the dependency for integrity, there is a mirror of the dependency available. The package.json would only be used during development in upgrading packages. Also https git urls do not user authenticate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants