Illegal instructions are hard to debug #589

Open
grahamc opened this issue Aug 17, 2018 · 7 comments

Comments

@grahamc
Member

grahamc commented Aug 17, 2018

When mixing new and old hardware, it is possible for a dependency to be compiled with extra CPU features that another builder doesn't have. This is undesirable and should be fixed, but tracking it down can be very hard. It would be a bit easier if the trap logs in dmesg were available:

[4836540.032076] traps: python3.6m[8250] trap invalid opcode ip:7ffff02bba5a sp:7fffffff2190 error:0 in dlib.cpython-36m-x86_64-linux-gnu.so[7ffff0284000+75d000]
[5078335.341628] traps: hw-prim-bits-:w[3283] trap invalid opcode ip:40d7e9 sp:7fffe9aa5e00 error:0 in hw-prim-bits-test[400000+323000]
[5189083.230510] traps: test_xgetbv[15607] trap invalid opcode ip:400602 sp:7ffffffdd0a8 error:0 in test_xgetbv[400000+1000]
[5191054.775165] traps: lt-QuickTest[19712] trap invalid opcode ip:7ffff748d796 sp:7fffffffd310 error:0 in libgf2x.so.1.0.2[7ffff748b000+26000]
[5192870.050411] traps: lt-QuickTest[30943] trap invalid opcode ip:7ffff748d796 sp:7fffffffd310 error:0 in libgf2x.so.1.0.2[7ffff748b000+26000]
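
For anyone who does have access to these lines, here is a minimal sketch of pulling the useful fields out of one. The regular expression is an assumption based only on the lines shown above, not a complete grammar of the kernel's output, and `parse_trap_line` is a hypothetical helper name. The computed offset is relative to the start of the mapping the kernel reports, which may differ from the offset into the file on disk.

```python
import re

# Assumed shape of a "traps:" line, derived only from the examples above.
TRAP_RE = re.compile(
    r"traps: (?P<comm>.+?)\[(?P<pid>\d+)\] trap (?P<trap>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r"(?: in (?P<object>.+?)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?$"
)


def parse_trap_line(line):
    """Return the fields of a dmesg trap line as a dict, or None if it does not match."""
    match = TRAP_RE.search(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    # Addresses are printed in hexadecimal without a 0x prefix.
    for key in ("ip", "sp", "base", "size"):
        if info[key] is not None:
            info[key] = int(info[key], 16)
    # Offset of the faulting instruction from the start of the reported mapping.
    info["offset"] = None if info["base"] is None else info["ip"] - info["base"]
    return info


sample = (
    "[4836540.032076] traps: python3.6m[8250] trap invalid opcode "
    "ip:7ffff02bba5a sp:7fffffff2190 error:0 in "
    "dlib.cpython-36m-x86_64-linux-gnu.so[7ffff0284000+75d000]"
)
info = parse_trap_line(sample)
print(info["comm"], info["object"], hex(info["offset"]))
# prints: python3.6m dlib.cpython-36m-x86_64-linux-gnu.so 0x37a5a
```

The object name and offset are usually enough to tell which dependency contains the offending instruction.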

@edolstra
Member

I don't see a feasible way to do that.

@grahamc
Member Author

grahamc commented Aug 17, 2018

Me neither :)

@dezgeg

dezgeg commented Aug 17, 2018

Probably you could enable core dumps inside the sandbox and add a global /proc/sys/kernel/core_pattern handler that can somehow open the stderr of a build (which presumably would be fd 2 of pid 1 in the pid namespace of the process that crashed). Doesn't exactly sound trivial though.
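
A rough sketch of what such a handler could look like, under several assumptions: core_pattern is set to a pipe of the form `|/path/to/handler %P %e %s` (the handler path is hypothetical), the handler runs with enough privilege to open the target's file descriptors, and, as a simplification, it writes to fd 2 of the crashing process itself instead of locating pid 1 of its pid namespace. The function names are made up for illustration; this is not something Nix provides.

```python
import os
import signal
import sys


def format_crash_message(pid, comm, signum):
    """Build a one-line summary of a fatal signal."""
    try:
        signame = signal.Signals(signum).name
    except ValueError:
        signame = f"signal {signum}"
    return f"crash: {comm}[{pid}] killed by {signame}\n"


def report_crash(pid, comm, signum, proc_root="/proc"):
    """Append the summary to whatever fd 2 of process `pid` refers to.

    Simplification: this uses the crashing process's own stderr, not the
    stderr of pid 1 in its pid namespace.
    """
    stderr_path = os.path.join(proc_root, str(pid), "fd", "2")
    fd = os.open(stderr_path, os.O_WRONLY | os.O_APPEND)
    try:
        os.write(fd, format_crash_message(pid, comm, signum).encode())
    finally:
        os.close(fd)


def main(argv):
    # argv corresponds to the assumed core_pattern arguments: %P %e %s
    pid, comm, signum = int(argv[0]), argv[1], int(argv[2])
    report_crash(pid, comm, signum)
    # The core image arrives on stdin; this sketch reads and discards it.
    while sys.stdin.buffer.read(1 << 20):
        pass


if __name__ == "__main__" and len(sys.argv) == 4:
    main(sys.argv[1:])
```

Finding the right pid 1 and its stderr from the global namespace is the part this leaves out, and it is probably the hard part.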

@Ericson2314
Member

If we always pass --build/build_arg, this is far less likely. CC @lheckemann. I have done this in the first few commits of https://github.com/NixOS/nixpkgs/pull/44583/commits (which, unlike the rest, I should probably make an effort to get into 18.09).

@dezgeg

dezgeg commented Aug 18, 2018

I don't think that helps at all. It doesn't make autotools enter cross-compilation mode (and we really don't want that anyway). -march=native is already blocked, and the recent case was just someone explicitly implementing their own build-time CPU feature detection.

@Ericson2314
Member

I know @lheckemann used this to build armv7 stuff on aarch64 without cross compilation. But perhaps it just helps with very coarse-grained things like that.

@dezgeg

dezgeg commented Aug 18, 2018

Yes, that doesn't help at all with the recent issue on Hydra. An x86_64 CPU which supports PCLMUL has the same target triple as one that doesn't.
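
Since the target triple can't tell such machines apart, one way to see the difference is to compare the feature flags each builder reports. A minimal sketch, assuming x86 Linux, where /proc/cpuinfo has a `flags` line and carry-less multiply shows up as `pclmulqdq`; the two sample flag lists are made up for illustration and the function names are hypothetical.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(build_flags, run_flags):
    """Flags available where a package was built but absent where it runs."""
    return sorted(set(build_flags) - set(run_flags))


def read_local_cpu_flags(path="/proc/cpuinfo"):
    """Read the flags of the machine this runs on (x86 Linux only)."""
    with open(path) as f:
        return parse_cpu_flags(f.read())


# Made-up cpuinfo excerpts for a newer and an older builder.
new_builder = "processor\t: 0\nflags\t\t: fpu sse2 pclmulqdq avx\n"
old_builder = "processor\t: 0\nflags\t\t: fpu sse2\n"

print(missing_features(parse_cpu_flags(new_builder), parse_cpu_flags(old_builder)))
# prints: ['avx', 'pclmulqdq']
```

Any flag in that list is a candidate for the instruction that traps when a package that detected features at build time runs on the older machine.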

I personally do ARMv6-on-ARMv7 builds with a local kernel hack to avoid changing any packages.

4 participants