
"Illegal instruction" when running docker #56

Closed
debevv opened this issue Sep 24, 2017 · 18 comments

Comments

@debevv
Contributor

debevv commented Sep 24, 2017

Starting from meta-iot2000-bsp, I added the docker recipe from the meta-virtualization layer, but every docker-related executable fails with "Illegal instruction".
I learned that this should work after this commit, so am I missing something?
Inspecting build/tmp shows that the go-cross version used is 1.6, and it looks like the patches are being picked up.

@jan-kiszka
Collaborator

@debevv
Contributor Author

debevv commented Sep 25, 2017

I just tried to rebuild my image with the BSP from that branch, but the problem persists. I'm currently building your kas-with-docker.yml and will let you know the result as soon as it completes.

@debevv
Contributor Author

debevv commented Sep 25, 2017

Unfortunately, the build fails during do_rootfs. I attached the log of the stage. The actual error seems to be opkg_solver_install: Cannot install package docker on line 8512

log.do_rootfs.31186.txt

@jan-kiszka
Collaborator

Package does not exist - the error must have happened earlier.

BTW, are you building natively or inside the kas docker image? The latter succeeded 2 months ago here.

@debevv
Contributor Author

debevv commented Sep 25, 2017

Could you point me to a log that could be useful to identify the problem?
I'm building natively on Ubuntu 16.04.3, from a fresh git clone

@jan-kiszka
Collaborator

jan-kiszka commented Sep 25, 2017

First step would be the full console log. From that we can look into which detailed log is needed (if any).

@debevv
Contributor Author

debevv commented Sep 26, 2017

I didn't have infinite scrollback on my terminal emulator at that time, so what was left was not very useful.
Anyway, I got it to work somehow. Because the build system was clearly patching go-cross, I thought that maybe the problem was in the patches themselves, so I manually patched and compiled go 1.6.4. With that I compiled the same docker version used by meta-virtualization, and finally got no more SIGILLs from the docker executables.
I think the problem could be in the go-wrapper script, because after deleting it and creating two .bbappend files for docker and containerd like this:

do_compile_prepend(){
	export GO386=quark
}

I managed to recompile my bsp + docker recipe image, and this time it ran. So I guess that go is somehow not picking up the GO386 parameter and is consequently producing code that is not compatible with Quark.

Regarding the problem I had with the kas-with-docker.yml image, I deleted the dummy recipe in the docker folder, and the package manager stopped complaining. I don't know if this was just a coincidence though

@debevv
Contributor Author

debevv commented Sep 28, 2017

Hi, looks like the go-cross patches are working only for the docker executable. Other dependency packages like containerd and runc still give illegal instruction.

@jan-kiszka
Collaborator

jan-kiszka commented Sep 28, 2017

Not when building the jan/docker branch. I have tested the patches locally with a hello-world Go app using oe-meta-go, and that worked as well.

@debevv
Contributor Author

debevv commented Sep 28, 2017

Yes, the problem lies specifically in containerd and runc. I did a gdb run and both are crashing on a xorps instruction. But I don't understand why the instruction is inside the runtime.check() function of the glibc library. Do you know a proper (yocto) way to force a global -mno-mmx flag for all the generated code?

@jan-kiszka
Collaborator

jan-kiszka commented Sep 28, 2017

containerd and runc are built with different makefiles in the docker, but they are still built with the Go compiler, not gcc. And that is being told by go-wrapper (774746e#diff-bada823f163f0098a601378434ac3160) that it should switch to 386 mode (GOARCH).

When does it crash, directly during startup? Do you have a backtrace at hand?

@debevv
Contributor Author

debevv commented Sep 28, 2017

That's why I find it very strange. They should all use the same C library and the same (patched) go compiler.

trace.txt

@jan-kiszka
Collaborator

jan-kiszka commented Sep 28, 2017

Your code is in go/src/runtime/runtime1.go, check(), not in glibc. So your compilation via Go went wrong.

@debevv
Contributor Author

debevv commented Sep 28, 2017

But I'm using a clean BSP from jan/docker, so why is it working with docker and not with the rest?

@jan-kiszka
Collaborator

I don't have a version of my build at hand, I can only tell that I used to have problems with those two targets as well until I added that go-wrapper concept. After that, I checked all docker binaries for unsupported MMX instructions and found none anymore.
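A check of that kind can be sketched with objdump and grep. This is only an illustration, not the procedure that was actually used: the helper name `count_insn` and the binary path in the usage comment are made up, and it assumes the binutils objdump on the build host can disassemble the target binaries.

```shell
#!/bin/sh
# Hypothetical helper: counts the disassembly lines on stdin that contain
# the given mnemonic as a whole word. The name count_insn is made up.
count_insn() {
    grep -c -w -- "$1"
}

# Intended use (the binary path is a placeholder):
#   objdump -d ./containerd | count_insn xorps
```

A count of 0 means the mnemonic was not found in the disassembly. Note that grep also returns a non-zero exit status in that case.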

If you are building that branch as-is via the kas config, something must be going wrong in your setup with that wrapper. Maybe instrument it and rebuild containerd to check if it is being used at all.
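Instrumenting the wrapper could look roughly like the sketch below. The function name `log_invocation`, the `WRAPPER_LOG` variable, and the log path are all hypothetical and not part of go-wrapper. The idea is only to leave a trace each time the wrapper runs, together with the GO386 value it sees.

```shell
#!/bin/sh
# Hypothetical instrumentation; the log path and names are made up.
WRAPPER_LOG="${WRAPPER_LOG:-/tmp/go-wrapper.log}"

log_invocation() {
    # Append the arguments and the GO386 value visible to the wrapper.
    printf 'go-wrapper args: %s GO386=%s\n' "$*" "${GO386:-unset}" >> "$WRAPPER_LOG"
}

# Near the top of the wrapper script one would then call:
#   log_invocation "$@"
```

If the log file stays empty after rebuilding containerd, the wrapper is not being called at all. If it has entries but GO386 shows as unset, the wrapper runs without that variable being set.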

@debevv
Contributor Author

debevv commented Sep 28, 2017

I tried to build containerd first with only the go-wrapper, then with the wrapper and a .bbappend file that enforces GO386=quark in do_compile (I wrote about that some comments ago). In both cases it doesn't work.
containerd-only-wrapper.txt
containerd-wrapper-bbappend.txt

Are you sure the wrapper is working properly? I always see this line in the build logs:
containerd-0.2.2+git0ac3cd1be170d180b2baed755e8f0da547ceb267-r0 do_compile: build: unexpected operator

Also, I still can't get docker to run with only the wrapper. When I add my .bbappend in my layer instead, it's fine.

@debevv
Contributor Author

debevv commented Sep 28, 2017

Finally! I found the problem! It's the script indeed.
I ran go-wrapper manually and this is what I was getting:
./go-wrapper: 2: [: unexpected operator
The script is run with /bin/sh and not /bin/bash, so the == operator on line 2 is not supported. Changing it to = made it compatible with dash, which implements sh on Ubuntu 16.04.
This also explains why everything was fine in your container build: our environments were different.
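The difference can be reproduced outside the build with a small POSIX sh sketch. The helper `is_build` and the comparison against the string build are only a guess at the kind of check involved, based on the "build: unexpected operator" message in the log. This is not the actual go-wrapper code.

```shell
#!/bin/sh
# Hypothetical helper: reports whether the first argument is "build".
# Not the real go-wrapper logic, only the same kind of string test.
is_build() {
    # POSIX test only defines "=" for string equality.
    # "==" is a bash extension; dash on Ubuntu 16.04 rejects it
    # with "unexpected operator", as reported above.
    if [ "$1" = "build" ]; then
        echo yes
    else
        echo no
    fi
}

is_build build
is_build version
```

Run with either bash or dash, this prints yes and then no. Quoting "$1" also keeps the test well-formed when the script is called with no arguments.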

And maybe I have an explanation for why my bbappend files were working only for docker. I think it's in the way the packages are compiled. In the docker case, bitbake calls a custom script inside the docker source tree; in the other two cases, the static oe_runmake command is used instead. Maybe the latter was not picking up the environment variable set beforehand.

Man, I was losing my mind on this. Do you want me to submit a PR?

@jan-kiszka
Collaborator

Oh, sorry for that. I'm indeed only on distros with sh=bash, so this slips through from time to time. Please send a PR!

And, yes, that difference in building the docker binaries led to the not really clean but effective wrapper method. I tried to inject the necessary vars via bitbake means before that but eventually gave up.
