[Bug?]: checksum is different from windows and unix for local packages #6105

Open
1 task done
Diggsey opened this issue Jan 25, 2024 · 20 comments
Labels
bug Something isn't working

Comments

@Diggsey

Diggsey commented Jan 25, 2024

Self-service

  • I'd be willing to implement a fix

Describe the bug

Yarn computes a different checksum for local packages (i.e. packages installed via a relative path) on Windows vs. Linux, causing yarn install to fail in CI.

To reproduce

  • Run yarn add ../relativePath followed by yarn install on Windows.
  • Commit the changes and attempt to run yarn install in CI on Linux.

Run yarn install
➤ YN0000: · Yarn 4.0.2
➤ YN0000: ┌ Resolution step
Resolution step
➤ YN0000: └ Completed in 0s 566ms
➤ YN0000: ┌ Post-resolution validation
Post-resolution validation
➤ YN0028: -  resolution: "platformed-browser-api@file:../browser-api#../browser-api::hash=516e63&locator=platformed-frontend%40workspace%3A."
➤ YN0028: -  checksum: f4cc44353a17885d87d70800a9a43b4d88bc68aa81c1ae9177775ef4fe5f47ecc2a469e21f3615d3bd08f7b83c3e9e7b7980ecdcc8e76ff069f1c5bf7487deab
➤ YN0028: +  resolution: "platformed-browser-api@file:../browser-api#../browser-api::hash=cab6bb&locator=platformed-frontend%40workspace%3A."
➤ YN0028: The lockfile would have been modified by this install, which is explicitly forbidden.
➤ YN0000: └ Completed
➤ YN0000: · Failed with errors in 0s 695ms
Error: Process completed with exit code 1.

Environment

System:
    OS: Windows 10 10.0.19045
    CPU: (20) x64 12th Gen Intel(R) Core(TM) i9-12900H
  Binaries:
    Node: 18.12.1 - ~\AppData\Local\Temp\xfs-2d93d2cb\node.CMD
    Yarn: 4.0.2 - ~\AppData\Local\Temp\xfs-2d93d2cb\yarn.CMD
    npm: 9.2.0 - C:\Program Files\nodejs\npm.CMD

Additional context

Related issues:
#5136
#2774

@Diggsey Diggsey added the bug Something isn't working label Jan 25, 2024
@arcanis
Member

arcanis commented Jan 25, 2024

Please attach the generated files on Windows and Linux

@arcanis arcanis added the waiting for feedback Will autoclose in a while unless more data are provided label Jan 29, 2024
@brenthompson

brenthompson commented Jan 30, 2024

I'm not the originator, but I am encountering the same issue when installing on Mac vs Windows. Here's some info I hope helps you resolve it:

I have a local package, added as a file: entry. The .zip file in the cache has a different checksum on the two OSes, apparently because of the embedded CR/LF characters on Windows and differing directory permissions.

Mac:

project.js # zipinfo /Users/me/.yarn/berry/cache/mySDK-file-9da22a02b2-10c0.zip
Archive:  /Users/me/.yarn/berry/cache/mySDK-file-9da22a02b2-10c0.zip
Zip file size: 330384 bytes, number of entries: 4
drwxr-xr-x  6.3 unx        0 b- stor 84-Jun-22 21:50 node_modules/
drwxr-xr-x  6.3 unx        0 b- stor 84-Jun-22 21:50 node_modules/mySDK/
-rw-r--r--  6.3 unx   329745 b- stor 84-Jun-22 21:50 node_modules/mySDK/myProject.js (a webpack bundle)
-rw-r--r--  6.3 unx       95 b- stor 84-Jun-22 21:50 node_modules/mySDK/package.json
4 files, 329840 bytes uncompressed, 329840 bytes compressed:  0.0%

Windows: (Git Bash)

project.js # zipinfo /c/Users/me/AppData/Local/Yarn/Berry/cache/mySDK-file-a940f94840-10c0.zip
Archive:  /c/Users/me/AppData/Local/Yarn/Berry/cache/mySDK-file-a940f94840-10c0.zip
Zip file size: 330383 bytes, number of entries: 4
drwxr-xr-x  6.3 unx        0 b- stor 84-Jun-22 21:50 node_modules/
drw-r--r--  6.3 unx        0 b- stor 84-Jun-22 21:50 node_modules/mySDK/
-rw-r--r--  6.3 unx   329745 b- stor 84-Jun-22 21:50 node_modules/mySDK/myProject.js
-rw-r--r--  6.3 unx       94 b- stor 84-Jun-22 21:50 node_modules/mySDK/package.json
4 files, 329839 bytes uncompressed, 329839 bytes compressed:  0.0%

Differences:

  • node_modules/mySDK directories have different file permissions (755 vs 644)
  • the size of the package.json differs due to CR/LF in the Windows version:
project.js (mac) # cat -e node_modules/mySDK/package.json
{$
    "name": "mySDK",$
    "main": "./mySDK.js",$
    "version": "1.0.0"$
}$

project.js (windows) # cat -e node_modules/mySDK/package.json
{^M$
    "name": "mySDK",^M$
    "main": "./mySDK.js",^M$
    "version": "1.0.0"^M$
}^M$ 

Seems it could be resolved by stripping out control characters when zipping?
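
To narrow down which entries actually differ between two cached archives, a small script can do the same comparison as the zipinfo listings above. The following is only a sketch in Python using the standard zipfile module; the function names (describe_entries, diff_archives) are made up for illustration and are not part of Yarn.

```python
import stat
import zipfile


def describe_entries(zip_source):
    """Map each entry name to (mode string, size, CRC-32) for a zip archive."""
    entries = {}
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            # The upper 16 bits of external_attr hold the Unix mode bits
            # when the archive was written with Unix-style attributes.
            mode = info.external_attr >> 16
            entries[info.filename] = (stat.filemode(mode), info.file_size, info.CRC)
    return entries


def diff_archives(left, right):
    """Return {name: (left_entry, right_entry)} for entries that differ."""
    a, b = describe_entries(left), describe_entries(right)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

Calling diff_archives with the paths of the Mac and Windows cache files should list only the entries whose permissions, size, or CRC differ, which makes it easier to tell a permissions mismatch from a line-ending mismatch.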

@arcanis
Member

arcanis commented Jan 30, 2024

Did you by any chance configure Git on Windows to automatically convert line endings?

@brenthompson

Did you by any chance configure Git on Windows to automatically convert line endings?

Not that I'm aware of, I have Git Bash installed on many Windows machines, all of them using default installation settings. I can check.

@brenthompson

brenthompson commented Jan 30, 2024

It was set on Windows, and changing it made a difference, but I still get the checksum mismatch.

Mac:

project.js # git config --get core.autocrlf 
project.js # // nothing

project.js # git ls-files packages/mySDK/vendor/* --eol
i/crlf  w/crlf  attr/                 	packages/mySDK/vendor/mySDKproject.js
i/lf    w/lf    attr/                 	packages/mySDK/vendor/package.json

Windows:

project.js # git config --get core.autocrlf
true     // must be a git bash default

project.js # git ls-files packages/mySDK/vendor/* --eol
i/crlf  w/crlf  attr/                   packages/mySDK/vendor/mySDKproject.js
i/lf    w/crlf  attr/                   packages/mySDK/vendor/package.json     <--- w/ value is different

project.js #

I added a .gitattributes file:

project.js # cat .gitattributes
packages/mySDK/vendor/package.json eol=lf

Then ran

git rm --cached -r .
git reset --hard

per https://www.aleksandrhovhannisyan.com/blog/crlf-vs-lf-normalizing-line-endings-in-git/

Windows again:

project.js # git ls-files packages/mySDK/vendor/* --eol
i/crlf  w/crlf  attr/                   packages/mySDK/vendor/mySDKproject.js
i/lf    w/lf    attr/text eol=lf        packages/mySDK/vendor/package.json    <---- w/ value now matches

project.js # cat -e node_modules/mySDK/package.json
{$
    "name": "mySDK",$
    "main": "./mySDKproject.js",$
    "version": "1.0.0"$
}$

I committed the .gitattributes file, and ran yarn --check-cache on both Mac and Windows. yarn.lock on Mac didn't change, but the one on Windows did, i.e. the checksums are still different.

Here are verbose zipinfo outputs for comparison. Again the differences are the directory permissions and number of bytes in the package.json file.
zipinfo_windows.txt
zipinfo_mac.txt

@Diggsey
Author

Diggsey commented Jan 31, 2024

Did you by any chance configure Git on Windows to automatically convert line endings?

I do not have this option set and have the issue.

I believe the permissions are the cause.

@TomppaPackage

We're having this problem also. It's causing quite a headache when Windows Devs are generating different lockfiles to those on UNIX.

@akwodkiewicz
Contributor

Folks, isn't it about checksums of packages being calculated on the compressed versions of packages vs the "raw" packages from npm? See my issue from some time ago where I learned about this: #5957

Try setting compression level to 0 in the project -- maybe the differences are due to how the compression algorithm works on various OS?

@brenthompson

brenthompson commented Feb 21, 2024

Folks, isn't it about checksums of packages being calculated on the compressed versions of packages vs the "raw" packages from npm? See my issue from some time ago where I learned about this: #5957

Try setting compression level to 0 in the project -- maybe the differences are due to how the compression algorithm works on various OS?

Thanks for the suggestion, I hadn't seen that bug. But 1) I'm not using a package from npm - it's a simple file: entry consisting of a webpack bundle + package.json, 2) my issue didn't occur after upgrading from yarn 3 to 4, we've been on v4 all along, 3) our compressionLevel was already set to 0 everywhere

And P.S. I agree this is highly annoying, seems to be pretty widespread, and so I'm baffled as to why the maintainers are ignoring it.

@brenthompson

Please attach the generated files on Windows and Linux

@arcanis kindly remove the 'waiting for feedback' tag, data has been provided

@yarnbot
Collaborator

yarnbot commented Mar 22, 2024

Hi! 👋

It seems like this issue has been marked as probably resolved, or missing important information blocking its progression. As a result, it'll be closed in a few days unless a maintainer explicitly vouches for it.

@yarnbot yarnbot added the stale Issues that didn't get attention label Mar 22, 2024
@Diggsey
Author

Diggsey commented Mar 22, 2024

Bad @yarnbot

@ezweave

ezweave commented Mar 22, 2024

Yeah, this is definitely not resolved. We had to move one of our repos to npm because of it.

@ClementValot

How do we get a "maintainer to explicitly vouch for it"?
Do we have to wave arms in the comments until it draws attention? :')

@yarnbot yarnbot removed the stale Issues that didn't get attention label Mar 22, 2024
@yarnbot
Collaborator

yarnbot commented Apr 21, 2024

Hi! 👋

It seems like this issue has been marked as probably resolved, or missing important information blocking its progression. As a result, it'll be closed in a few days unless a maintainer explicitly vouches for it.

@yarnbot yarnbot added the stale Issues that didn't get attention label Apr 21, 2024
@Diggsey
Author

Diggsey commented Apr 21, 2024

Very bad @yarnbot

@yarnbot yarnbot removed the stale Issues that didn't get attention label Apr 21, 2024
@ClementValot

Sadly this is breaking for any team that uses both Windows and Unix/macOS. The only workaround I've found is setting checksumBehavior: ignore in .yarnrc.yml, and that's too big a trade-off in security :(

@arcanis Maybe we can have a bit of reassurance that it's in someone's scope? The waiting for feedback tag is still on even though that's been addressed

@arcanis
Member

arcanis commented May 6, 2024

Sorry, this thread fell off the radar. Yarn will pack file: and git: packages, and their content needs to be the same for the checksum to pass. If the content isn't the same, then we don't know for sure whether it's inconsequential or a problem that puts your application in jeopardy.

Unfortunately, the way the packages are built may depend on your system, and that makes this process flaky. We're always looking for ways to improve that, but it's unclear right now what the solution should be.

For example, in the case of the OP the problem was about CRLF strings. Should Yarn normalize them during packing? Should it do that on all files? Probably that should exclude binary files? What if a project expects a CRLF for X or Y reason? If we can't do it safely, should we do it at all?
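
To make the trade-off in the questions above concrete, here is a rough Python sketch of what "normalize CRLF but leave binary files alone" could look like. It is not how Yarn packs files; the NUL-byte check, the 8000-byte sample size, and both function names are assumptions made for this example. It also shows the risk raised above: a text file that intentionally uses CRLF would be silently rewritten.

```python
def looks_binary(data: bytes, sample_size: int = 8000) -> bool:
    """Heuristic (an assumption of this sketch): a NUL byte near the start means binary."""
    return b"\x00" in data[:sample_size]


def normalize_line_endings(data: bytes) -> bytes:
    """Convert CRLF to LF in text content; return binary content unchanged."""
    if looks_binary(data):
        return data
    return data.replace(b"\r\n", b"\n")
```

With this, the Windows and Mac copies of a package.json like the one shown earlier in the thread would produce identical bytes, while anything that trips the binary heuristic is passed through untouched.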

That said, perhaps we could at least do a better job of highlighting the issues:

  • Detect when the Git configuration would lead to such issues
  • Detect what's the actual difference and suggest potential remediations (rather than just a failed checksum)
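
The first kind of detection could, for instance, build on the git ls-files --eol output shown earlier in the thread. The Python sketch below flags files whose index and working-tree line endings disagree (e.g. i/lf with w/crlf). It is a hypothetical helper, not an existing Yarn feature; the function names are invented, and it assumes git's usual output layout, where a tab separates the eol columns from the path.

```python
import subprocess


def find_eol_mismatches(ls_files_output: str):
    """Return (path, index_eol, worktree_eol) for files whose line endings differ."""
    mismatches = []
    for line in ls_files_output.splitlines():
        # git separates the eol columns from the path with a tab.
        info, sep, path = line.partition("\t")
        fields = info.split()
        if not sep or len(fields) < 2:
            continue  # not a line in the expected format
        index_eol = fields[0][len("i/"):]
        worktree_eol = fields[1][len("w/"):]
        if index_eol != worktree_eol:
            mismatches.append((path, index_eol, worktree_eol))
    return mismatches


def check_repository(repo_dir: str = "."):
    """Run `git ls-files --eol` in repo_dir and report mismatched files."""
    result = subprocess.run(
        ["git", "ls-files", "--eol"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return find_eol_mismatches(result.stdout)
```

Run against the Windows checkout described above, this would be expected to report packages/mySDK/vendor/package.json as lf in the index but crlf in the working tree, which is a more actionable hint than a bare checksum failure.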

@arcanis arcanis removed the waiting for feedback Will autoclose in a while unless more data are provided label May 6, 2024
@Diggsey
Author

Diggsey commented May 6, 2024

@arcanis Since this only affects file: and git: packages, Yarn could use a different mechanism for computing the hash: for git packages, the commit hash already identifies the package content, and for local files, you could use git to compute a hash in the same way.
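
For reference, the way git identifies the content of a single file can be sketched in a few lines of Python: a blob ID is the SHA-1 of a short header followed by the file bytes. This assumes a repository using git's SHA-1 object format, and the function name is made up for the example. Note that it hashes the raw bytes it is given, so a CRLF working-tree copy still hashes differently from the LF copy unless line endings are normalized first, as git itself does when staging.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Compute the ID git assigns to a blob with this content (SHA-1 object format)."""
    # Git hashes "blob <size>\0" followed by the raw content bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()
```

Hashing a whole directory the way git does would additionally require building tree objects from these blob IDs, which this sketch leaves out.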

Alternatively, Yarn could hash the ZIP file in a way that excludes the permissions metadata from the computed hash, avoiding the problem with permissions, and then have better diagnostics for CRLF/LF differences (which should in principle be fixable by the user, unlike the permissions issue).
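
A hash that ignores permissions could look roughly like the Python sketch below: it digests only entry names and file contents, so mode bits and timestamps never reach the hash. This is an illustration of the idea, not Yarn's checksum algorithm; the choice of SHA-512, the length-prefixed encoding, and the function name are assumptions. Directory entries are skipped entirely, which means empty directories would not affect the result.

```python
import hashlib
import zipfile


def content_digest(zip_source) -> str:
    """SHA-512 over entry names and file bytes, ignoring modes and timestamps."""
    digest = hashlib.sha512()
    with zipfile.ZipFile(zip_source) as archive:
        for info in sorted(archive.infolist(), key=lambda entry: entry.filename):
            if info.is_dir():
                continue  # directory entries carry only metadata here
            name = info.filename.encode("utf-8")
            data = archive.read(info)
            # Length-prefix each field so entry boundaries are unambiguous.
            digest.update(len(name).to_bytes(8, "big") + name)
            digest.update(len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()
```

Two archives that differ only in directory permissions, like the Mac and Windows ones earlier in the thread, would get the same digest under this scheme, while a CRLF/LF difference in package.json would still change it.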

A final option would be to ignore or store multiple hashes for packages where a deterministic hash cannot be easily computed.

@milahu

milahu commented May 14, 2024

for git packages, the commit hash already identifies the package content

+1, but note: validation by commit hash requires git clone
because the commit object (which has the tree hash) is not part of github archives

the way the packages are built may depend on your system, and that makes this process flaky. We're always looking for ways to improve that, but it's unclear right now what the solution should be.

sounds like yarn is trying to re-invent nix

also with nix, reproducible builds are hard
because "bad" packages can introduce non-determinism in the build process

so by default, nix packages are "input addressed":
all source files and all build scripts are reduced to one hash
and that input hash identifies pre-compiled packages in a binary cache

Trustix - Consensus and voting

So we lean into this, and allow each user to define what consensus means to them, fully scriptable in Lua. This is especially well suited to Trustix for two reasons. First, unlike other systems we’ve discussed, it is not essential to reach a consensus in Trustix. If every builder reports a different output hash, the user can simply build that package from source.

Content-addressed Nix − call for testers

what happens if __contentAddressed = true is used when the derivation is not reproducible (results in a different output contents across builds)?

It all depends of the exact scenario, but in the simplest case where there’s only one source of truth (either you’re only building locally, or there’s only one binary cache that feeds everything else), it’ll work mostly as input-addressed derivations, in that the first build will be accepted as the “truthful” build, and Nix won’t even try to rebuild it (why should it after all?)

What factors affect the reproducibility of Nix builds?

9 participants