coq.src can't be built on darwin (UTF-8 file names?) #176225

Closed
trofi opened this issue Jun 4, 2022 · 8 comments · Fixed by #176253


@trofi
Contributor

trofi commented Jun 4, 2022

Describe the bug

Hydra fails to fetch coq.src when run on darwin. This blocks even linux builds, because coq.src is a fixed-output derivation: https://hydra.nixos.org/build/179179093/nixlog/1 (linux failure: https://hydra.nixos.org/build/179240004)

Steps To Reproduce

I think it's the following suspicious UTF-8 path:

$ nix-prefetch-url --unpack https://github.com/coq/coq/archive/V8.15.2.zip
error: cannot get archive member name: Pathname coq-8.15.2/test-suite/misc/deps/αβ/UT cannot be converted from UTF-8 to current locale.

$ nix-build -A coq.src
$ ls -1 result/test-suite/misc/deps
...
'#U03b1#U03b2'

Expected behavior

darwin and linux should both be able to fetch the same tarball.

Notify maintainers

@roconnor @thoughtpolice @vbgl @Zimmi48

Metadata

nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 5.18.0, NixOS, 22.11 (Raccoon), 22.11pre382890.236cc2971ac`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.9.0pre20220530_af23d38`
 - channels(root): `"nixos"`
 - channels(slyfox): `""`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`
@trofi trofi changed the title coq.src can't be built on darwon (UTF-8 file names?) coq.src can't be built on darwin (UTF-8 file names?) Jun 4, 2022
@Zimmi48
Member

Zimmi48 commented Jun 4, 2022

The issue with fetching the tarball on macOS was discovered in #138274, and we believe this is a fetchzip bug on macOS. Until now, however, it wasn't an actual issue for users, because macOS users would get the fixed-output derivation that was downloaded on Linux and cached by Hydra (see this comment). Now it seems that Hydra caches failures as well (is that new, or were we just lucky previously that the Linux fetch happened before the macOS one?), and because of this a failure to download the source on Darwin affects everyone!

I don't really have any solution to propose except:

  • as a workaround, is it possible to remove the cached failure and retrigger the build on Linux?
  • fetchzip needs to be fixed to handle UTF-8 file names properly.

Even if Coq were modified to no longer include these file names in its source archive, this wouldn't resolve the issue for already released versions.

One last workaround idea to the fetchzip issue would be to make the fixed-output derivation not the direct result of fetching the tarball, but something that would fetch it and fix the divergent paths (e.g., by removing the corresponding tests). But I wouldn't even know how to implement this.
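As a rough illustration of the first step of that idea (finding which paths could diverge), here is a minimal Python sketch that lists the members of a zip archive whose names contain non-ASCII characters. It is not part of fetchzip or nixpkgs; the helper names and the `demo/...` paths (including `example.v`) are made up for the example, and the archive is built in memory rather than downloaded.

```python
import io
import zipfile
from typing import List


def non_ascii_members(zip_bytes: bytes) -> List[str]:
    """Return the archive member names that contain non-ASCII characters."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return [name for name in zf.namelist() if not name.isascii()]


def make_demo_zip() -> bytes:
    """Build a tiny in-memory archive with one ASCII and one non-ASCII path.

    The paths are hypothetical; they only mimic the layout from the report.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("demo/README", "plain ASCII name\n")
        zf.writestr("demo/deps/αβ/example.v", "(* placeholder *)\n")
    return buf.getvalue()


if __name__ == "__main__":
    for name in non_ascii_members(make_demo_zip()):
        print(name)
```

A fixed-up source derivation would still need to decide what to do with such members (drop them, rename them), which this sketch does not attempt.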

@vcunat
Member

vcunat commented Jun 4, 2022

I've seen this in a couple of other packages already. It's... not nice, but fortunately not a very common problem (perhaps partly because a linux builder often gets to it first, or it succeeds on a retry).

@trofi
Contributor Author

trofi commented Jun 4, 2022

Do you think fetchzip works correctly on linux? I find the deps/αβ -> deps/#U03b1#U03b2 translation very odd. Is it broken too?

@trofi
Contributor Author

trofi commented Jun 4, 2022

Unpacking is locale-dependent:

  • LANG=C -> deps/#U03b1#U03b2 (bad)
  • LANG=C.UTF-8 -> deps/αβ (ok)
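The mangled name in the LANG=C case looks like a per-character escape of the code points that the locale cannot represent. A minimal Python sketch of that mapping, assuming the `#Uxxxx` form for code points that fit in 16 bits and `#Lxxxxxx` for larger ones (as described in the unzip man page), with lowercase hex digits modeled on the listing in the report; the function name is made up for the example:

```python
def mangle_non_ascii(name: str) -> str:
    """Escape every non-ASCII character as #Uxxxx (or #Lxxxxxx above U+FFFF).

    This only imitates the observed output; it is not unzip's actual code.
    """
    out = []
    for ch in name:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # plain ASCII passes through unchanged
        elif cp <= 0xFFFF:
            out.append("#U%04x" % cp)      # 16-bit code point, 4 hex digits
        else:
            out.append("#L%06x" % cp)      # larger code point, 6 hex digits
    return "".join(out)


if __name__ == "__main__":
    print(mangle_non_ascii("deps/αβ"))
```

Since α is U+03B1 and β is U+03B2, this reproduces the directory name seen on linux.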

@trofi
Contributor Author

trofi commented Jun 4, 2022

My guess is that the mismatch happens because macOS supports UTF-8 locales as is (and possibly musl as well?), while glibc does not unless glibcLocales is pulled into unzip.

I think we have 2 routes:

  • make fetchzip work on linux (and break a bunch of fixed-output derivations that contain UTF-8 names). This should be a matter of pulling glibcLocales into unzip's setup.sh hook (not tested yet).
  • make fetchzip ASCII-only everywhere by using unzip -U in unzip's setup.sh. It would then consistently mangle non-ASCII names for everyone.

@trofi
Contributor Author

trofi commented Jun 4, 2022

Confirmed failure on musl as well:

$ nix build -f. pkgsMusl.coq.src --rebuild
error: hash mismatch in fixed-output derivation '/nix/store/mba2xjm41dm2bifpnvbn4cx0qqgknpim-source.drv':
         specified: sha256-h81nFqkuvZkMR7YLHy7laTq5yOhjMW+w6rYzncxvyD4=
            got:    sha256-DTspmwyD3Evl1CUmvUy2MonbLGUezvsHN3prmP9eK2I=
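The mismatch follows from the fact that the hash of an unpacked source covers the file names as well as the file contents. The Python sketch below illustrates this with a deliberately simplified tree digest; it is not Nix's NAR serialization, and the function name and the `example.v` paths are made up for the example.

```python
import hashlib
from typing import Dict


def tree_digest(files: Dict[str, bytes]) -> str:
    """Hash a {relative path: contents} mapping, names included.

    Simplified stand-in for a directory hash, NOT the real NAR format.
    """
    h = hashlib.sha256()
    for path in sorted(files):
        name = path.encode("utf-8")
        data = files[path]
        # Length-prefix each field so that entries cannot run into each other.
        h.update(len(name).to_bytes(8, "little"))
        h.update(name)
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()


if __name__ == "__main__":
    utf8_tree = {"deps/αβ/example.v": b"same contents"}
    mangled_tree = {"deps/#U03b1#U03b2/example.v": b"same contents"}
    # Identical contents, different names: the digests differ.
    print(tree_digest(utf8_tree) != tree_digest(mangled_tree))
```

So a platform that keeps αβ and one that writes #U03b1#U03b2 cannot agree on a single fixed-output hash, even though every file body is byte-identical.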

@trofi
Contributor Author

trofi commented Jun 4, 2022

WDYT of proposed #176253 fix? Once it's in we can fix coq.src and friends.

@Zimmi48
Member

Zimmi48 commented Jun 5, 2022

Thanks a lot for digging into this and figuring out the cause of the issue! I find your proposed fix to be the best (most principled) solution, even if it will require some work to fix past fixed-output derivations.
