HTTP::Client.post crashes on arm when inside a Kemal route #6954

Open
helaan opened this issue Oct 17, 2018 · 6 comments
@helaan

helaan commented Oct 17, 2018

The code is cross-compiled from an amd64 machine to arm-linux-gnueabihf using the compiler flags --cross-compile --target armv7a-unknown-linux-gnueabihf. Running the code natively on amd64 works fine, as does the code with the Kemal block removed. I'm not sure exactly what changes when the call is placed inside a route block, but I believe it should be possible to produce a crash that does not require Kemal. I've repeated the situation multiple times: each time the program crashes immediately with the same error message.

Host machine: Gentoo GNU/Linux, Amd64 laptop, LLVM 6.0.1, Crystal 0.26.1
Target machine: Gentoo GNU/Linux, Raspberry Pi 2 (armv7a-hardfloat-linux-gnueabi), LLVM 6.0.1, Crystal source code v0.26.1, with make deps already run (needed for libcrystal.a).

Reproduction steps:

  1. On the host machine: crystal build crpostcrash.cr --cross-compile --target=armv7a-unknown-linux-gnueabihf
  2. Transfer the crpostcrash.o file to the target machine
  3. On the target machine: (command printed in host compile step, paths adapted for my situation)
cc crpostcrash.o -o 'crpostcrash'  -rdynamic -lyaml  -lz `command -v pkg-config > /dev/null && pkg-config --libs --silence-errors libssl || printf %s '-lssl -lcrypto'` `command -v pkg-config > /dev/null && pkg-config --libs --silence-errors libcrypto || printf %s '-lcrypto'` -lpcre -lm -lgc -lpthread crcompiler/src/ext/libcrystal.a -levent -lrt -ldl -L/usr/lib -L/usr/local/lib
  4. Run the created binary
  5. On the target machine: curl localhost:3000

Result:
The compiled binary exits with code 7 and prints the following to stdout/stderr:

[development] Kemal is ready to lead at http://0.0.0.0:3000
Invalid memory access (signal 7) at address 0x74a0c3e4
[0x12d124] ???
[0x6b0e4] __crystal_sigfault_handler +72

Additionally, running the program through gdb shows the following:

Program received signal SIGBUS, Bus error.
0x76fd9e9c in ?? () from /lib/ld-linux-armhf.so.3
(gdb) info s
#0  0x76fd9e9c in ?? () from /lib/ld-linux-armhf.so.3
#1  0x76fe270c in ?? () from /lib/ld-linux-armhf.so.3
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Code:

require "http"
require "kemal"

get "/" do
	response = HTTP::Client.post("https://httpbin.org/post",
#		headers: HTTP::Headers{"Content-Type" => "application/json"},
#		body: %[{"test":123}]
	)
#	"#{response.status_code}: #{response.body}"
end

Kemal.run
@ysbaddaden
Contributor

Maybe tweaking the CPU/FPU with --mcpu and --mattr to match the ARM target would help. See #3424 (comment) for some RPi examples.

For a Raspberry Pi 2 target, you may specify --mcpu=cortex-a7, for example.

I think the issue is the context-switch assembly must save/restore FPU registers, but each ARM CPU comes with a different FPU with a specific instruction set...
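A minimal sketch of how the build command from the reproduction steps could be assembled with these flags. The helper name cross_build_cmd is hypothetical, and it only prints the command (a dry run) so the exact invocation can be reviewed before running it on the host. The target triple and the cortex-a7 value come from this thread; any --mattr value is passed through unchanged and is not validated.

```shell
#!/bin/sh
# Hypothetical helper: print the cross-compile command for a source file.
# $1 = source file, $2 = optional CPU name, $3 = optional feature list.
cross_build_cmd() {
  src=$1
  cpu=${2:-}
  attrs=${3:-}
  cmd="crystal build $src --cross-compile --target=armv7a-unknown-linux-gnueabihf"
  # Append the CPU/FPU tuning flags only when a value was supplied.
  if [ -n "$cpu" ]; then
    cmd="$cmd --mcpu=$cpu"
  fi
  if [ -n "$attrs" ]; then
    cmd="$cmd --mattr=$attrs"
  fi
  printf '%s\n' "$cmd"
}

# Example: show the command for a Raspberry Pi 2 build.
cross_build_cmd crpostcrash.cr cortex-a7
```

Printing instead of executing keeps the sketch safe to run anywhere; pipe the output to sh on the host once it looks right.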

@helaan
Author

helaan commented Nov 1, 2018

That doesn't seem to help: it still crashes with the same message, just at a different address. I tried compiling with only --mcpu=cortex-a7 added, as well as with --mcpu=cortex-a7 --mattr=armv7-a,a7,neon,vfp2,vfp3,vfp4, with the same result each time.

@ysbaddaden
Contributor

Sadly, something is off between armhf and LLVM, and I still haven't figured out what's happening. Maybe we should link the executable with a similar set of arguments (but adapted for C).

@helaan
Author

helaan commented Nov 1, 2018

I've tried running the same code on an RPi3 (which uses aarch64), and it works there, so I'll use that for now.

With respect to the RPi2, I wanted to try a different route. Maybe it is just stupid, maybe it highlights the issue: what if we compile to LLVM IR on the host machine and do the rest on the target? That gives the following commands:
HOST: crystal build crpostcrash.cr --cross-compile --target=armv7a-unknown-linux-gnueabihf --emit llvm-ir
TARGET: llc crpostcrash.ll; cc 'crpostcrash.s' -o 'crpostcrash' <same as before, omitted for brevity>

This gave a spectacular number of linking errors: https://gist.github.com/helaan/2cde67f7bd7fecd57e98b24ac68b0daf . Maybe I'm doing something stupid? Adding the -mcpu=cortex-a7 flag to the llc command didn't help either.

@ysbaddaden
Contributor

Oh, that's interesting, and you're doing nothing stupid.

AFAIK the errors are unsupported assembly instructions (vmov) related to the FPU (NEON registers / VFP extension registers)... again.

This confirms all ARM errors are related to the FPU and hard/soft float. I recently had VFP issues on Scaleway's ARMv7 server (that I failed to overcome) and the Alpine Linux armhf port is blocked because of it, too.
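One way to check the hard/soft float side of this, sketched under assumptions: binutils' readelf -A prints the ARM build attributes of an object, and hard-float objects are expected to carry a "Tag_ABI_VFP_args: VFP registers" line. The helper name float_abi_from_attrs is hypothetical; it reads readelf output on stdin so the cross-compiled object and libcrystal.a can be compared.

```shell
#!/bin/sh
# Hypothetical helper: classify the float ABI from `readelf -A` output on stdin.
# Prints "hard" when the VFP argument-passing attribute is present,
# otherwise "soft-or-unknown" (the attribute is simply absent in that case).
float_abi_from_attrs() {
  if grep -q 'Tag_ABI_VFP_args: VFP registers'; then
    echo hard
  else
    echo soft-or-unknown
  fi
}

# Usage on the target (assumes readelf is installed; paths as in the report):
#   readelf -A crpostcrash.o | float_abi_from_attrs
#   readelf -A crcompiler/src/ext/libcrystal.a | float_abi_from_attrs
```

If the two results differ, that would point at an ABI mismatch between the cross-compiled object and the natively built library; if both say "hard", this check rules nothing in or out.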

@helaan
Author

helaan commented Nov 2, 2018

This still seems strange to me: according to /proc/cpuinfo, the machine supports NEON, so it should not have issues with these instructions. Adding my usual CFLAGS to the cc call didn't solve it.
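For reference, the /proc/cpuinfo check can be scripted roughly as follows. This is a sketch that assumes the 32-bit ARM kernel layout, where the flags sit on a "Features" line; the function names are hypothetical, and the file path is a parameter so the check can also be run against a saved copy of cpuinfo.

```shell
#!/bin/sh
# Hypothetical helper: print the feature flags from a cpuinfo-style file.
cpu_features() {
  file=${1:-/proc/cpuinfo}
  # Take the first "Features" line and keep everything after the colon.
  grep -m1 -i '^Features' "$file" | cut -d: -f2
}

# Hypothetical helper: succeed if the named feature appears as a whole word.
has_cpu_feature() {
  feature=$1
  file=${2:-/proc/cpuinfo}
  # One flag per line, then an exact-line match so "vfp" does not match "vfpv4".
  cpu_features "$file" | tr ' \t' '\n\n' | grep -qx "$feature"
}

# Usage on the target:
#   has_cpu_feature neon && echo "NEON reported by the kernel"
```

This only shows what the kernel reports for the CPU; it says nothing about which FPU the assembler or compiler was told to target.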

I'm still not entirely convinced by this new route: the errors are severe enough that I wondered whether my variant that worked also suffered from this issue. So I commented out the Kemal parts, leaving just the HTTP::Client.post call that I confirmed working earlier, and sent it down the same route. I received the same set of errors. Either all of the code that errors out is unreachable, which would be worthy of a separate bug, or my route is broken. For now, I'm going with the latter.

I'm not ruling out FPU issues, but I think the llvm-ir route is a red herring.
