Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] tarball data currupted when installing from 'chunked encoding' server. #3884

Closed
1 task done
everett1992 opened this issue Oct 12, 2021 · 5 comments
Closed
1 task done
Labels
Bug thing that needs fixing Priority 1 high priority issue Release 7.x work is associated with a specific npm 7 release

Comments

@everett1992
Copy link

everett1992 commented Oct 12, 2021

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

While using npm v7/8 to install packages from our private npm registry we'll regularly see tar extract and integrity errors

npm WARN tar TAR_ENTRY_INVALID checksum failure
npm WARN tar zlib: incorrect data check
npm WARN tarball tarball data for lodash@http://localhost:4000/lodash/-/lodash-4.17.19.tgz (sha512-JNvd8XER9GQX0v2qJgsaN/mzFCNA5BRe/j8JN9d+tWyGLSodKQHKFicdwNYzWwI3wjRnaKPsGj1XkBjx/F96DQ==) seems to be corrupted. Trying again.
npm ERR! code Z_DATA_ERROR
npm ERR! errno -3
npm ERR! zlib: incorrect data check

I can reproduce these errors using a chunked-encoding proxy in front of registry.npmjs.org. I can resolve the issue by changing make-fetch-happen's integrity check from a MinipassPipeline to a basic stream. I don't know if that really fixes the issue or just changes the timing enough to hide it.

everett1992/make-fetch-happen@4c26ad6#diff-e11a8e3d6413e0d326dca64c22417f2670168171c83d2db6da41fb7eae868aa0

Expected Behavior

npm should be able to install with any valid http server.

Steps To Reproduce

  1. In this environment...
    I've reproduced this on Ubuntu with node 16, npm v7 and npm v8. I can't reproduce with v6 but one user has reported the issue with npm v6.

  2. With this config...

proxy.js

# my example proxy, or any other http registry that uses chunked-encoding and small chunks.
registry=http://localhost:4000/
node proxy.js
  1. Run '...'
$ mkdir example
$ cd example
$ # create package.json so node_modules will be installed here
$ echo {} > package.json
$ # remove cache. cannot reproduce with cache disabled or a cache hit.
$ rm ~/.npm/_cacache node_modules -rf
$ install any package
$ npm install react
  1. See error...
npm WARN tarball tarball data for react@http://localhost:4000/react/-/react-17.0.2.tgz (sha512-gnhPt75i/dq/z3/6q/0asP78D0u592D5L1pd7M8P+dck6Fu/jJeL6iVVK23fptSUZj8Vjf++7wXA8UNclGQcbA==) seems to be corrupted. Trying again.
npm ERR! code Z_DATA_ERROR
npm ERR! errno -3
npm ERR! zlib: incorrect data check

Environment

  • OS: Ubuntu
  • Node: 16.11.0, 16.8.0
  • npm: v8.0.0, v7.21.0
@everett1992 everett1992 added Bug thing that needs fixing Needs Triage needs review for next steps Release 7.x work is associated with a specific npm 7 release labels Oct 12, 2021
@everett1992
Copy link
Author

everett1992 commented Oct 12, 2021

I think that my proxy implementation is correct because files downloaded with curl have the correct checksum and extract cleanly.

$ npm info react dist.shasum
d0b5cc516d29eb3eee383f75b62864cfb6800037
$ curl -sS http://localhost:4000/react/-/react-17.0.2.tgz | sha1sum
d0b5cc516d29eb3eee383f75b62864cfb6800037  -

Our actual private registry is written in java, so I don't think it's a bug in common.

@everett1992
Copy link
Author

I've traced thru pacote, minipass-fetch, and make-fetch-happen and the stream's integrity is correct thru here

https://github.com/everett1992/make-fetch-happen/blob/chunked-encoding/lib/remote.js#L53

But the stream here is wrong

https://github.com/everett1992/make-fetch-happen/blob/chunked-encoding/lib/cache/entry.js#L288

@lukekarrys lukekarrys added Priority 1 high priority issue and removed Needs Triage needs review for next steps labels Oct 21, 2021
everett1992 pushed a commit to everett1992/make-fetch-happen that referenced this issue Nov 2, 2021
I can reliably cause tar integrity errors while downloading packages
from npm thru an http proxy that uses chunked encoding. The error seems
to be in make-fetch-happen, minipass-fetch or one of the minipass
libraries. I don't understand the root cause but I'm unable to reproduce
the issue after applying this patch.

My guess is that something in the stack expects res.body to be a
Minipass instance but not MinipassPipeline, or creating the pipeline
starts body flowing before all listeners are connected, missing some
chunks of data.

fixes npm/cli#3884

To reproduce

1. Start the [chunked-encoding
   proxy](https://github.com/everett1992/make-fetch-happen/blob/chunked-encoding/proxy.js)
2. In a new terminal create a npm project `mkdir chunked-encoding-error
   && cd chunked-encoding-error && echo '{}' > package.json`
3. configure npm to use the registry `npm config --location project set registry
   http://localhost:4000
4. remove node_modules, and npm cache
5. install any package from the registry (this should fail).
   - I recommend a package without dependencies, like lodash or typescript, debugging is easier.
   - The error isn't 100% consistent, you may want to run this step in
     a loop
     ```
     while rm ~/.npm/_cacache node_modules package-lock.json -rf && npm install --no-save lodash; do; done
     ```

[chunked-encoding proxy]: https://github.com/everett1992/make-fetch-happen/blob/chunked-encoding/proxy.js
@booniepepper
Copy link

For some context, Caleb (and I) are build engineers at Amazon. This is blocking us from moving our company onto NPM 7 because of the way we handle internal repository mirrors.

everett1992 pushed a commit to everett1992/make-fetch-happen-tar-extract-error that referenced this issue Nov 30, 2021
Related to npm/cli#3884, npm/make-fetch-happen#63

npm v7+ throws integrity and zlib corrupted errors when installing from
a registry that uses chunked encoding.

This package reproduces that error using make-fetch-happen and tar. The
error only occurs when fetch is passed integrity and the server uses
chunked encoding. Only the `extract with integirty` test fails.
everett1992 pushed a commit to everett1992/minipass that referenced this issue Dec 1, 2021
everett1992 pushed a commit to everett1992/minipass that referenced this issue Dec 1, 2021
everett1992 pushed a commit to everett1992/minipass that referenced this issue Dec 1, 2021
4c5a106 handled a convoluted case where there is a chunk in the buffer
AND  we're in a flowing state during a write call which caused out of
order writes.

The fix was to flush the buffer before emitting the new chunk, but it
didn't account for destinations pausing the stream after flushing part
of the buffer. This caused issues in npm/pacote/npm-registry-fetch.

That specific issue is demonstrated in everett1992/make-fetch-happen-tar-extract-error
and occurs when make-fetch-happen res.body is piped to a tar.extract
stream.

Fixes isaacs#27 npm/cli#3884 make-fetch-happen#63
isaacs pushed a commit to isaacs/minipass that referenced this issue Dec 6, 2021
4c5a106 handled a convoluted case where there is a chunk in the buffer
AND  we're in a flowing state during a write call which caused out of
order writes.

The fix was to flush the buffer before emitting the new chunk, but it
didn't account for destinations pausing the stream after flushing part
of the buffer. This caused issues in npm/pacote/npm-registry-fetch.

That specific issue is demonstrated in everett1992/make-fetch-happen-tar-extract-error
and occurs when make-fetch-happen res.body is piped to a tar.extract
stream.

Fixes #27 npm/cli#3884 make-fetch-happen#63

PR-URL: #28
Credit: @everett1992
Close: #28
Reviewed-by: @isaacs
@nlf nlf mentioned this issue Dec 9, 2021
@wraithgar
Copy link
Member

npm is now on the version of minipass that had this fixed. isaacs/minipass#28.

Closing as fixed, please reopen if this is still an issue.

@Jiasm
Copy link

Jiasm commented Mar 21, 2023

For some context, Caleb (and I) are build engineers at Amazon. This is blocking us from moving our company onto NPM 7 because of the way we handle internal repository mirrors.

Our private Registry service had same problem after the migration of express.js -> koa.js, and finally concluded that koa will override the Content-Type of the return value of the stream type as Content-Type: application/octet-stream, but charset is not set. It is suspected that this is why npm v7 cannot read the tarball normally. When we manually override Content-Type to application/octet- stream; charset=utf-8 works fine.

Reference: https://github.com/koajs/koa/blob/65f9c939e173ff2fab82ee9229a2213690959426/docs/api/response.md#buffer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug thing that needs fixing Priority 1 high priority issue Release 7.x work is associated with a specific npm 7 release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants