rserve container problem #4

Closed
kforner opened this issue Jun 1, 2017 · 8 comments

Comments

@kforner

kforner commented Jun 1, 2017

I'm coming back to the rserve container problem: the container does not work on my computer (nor on hcountoun's).

If I understood correctly, you said that building the Docker image on Ubuntu 14.04 produces a container that works on both 14.04 and 16.04, but building it on 16.04 produces a container that works only on 16.04, all on the same computer, right?

The post you linked (openresty/docker-openresty#39) could be an explanation, but I don't see how a kernel difference could be the culprit.

Let's suppose it's a CPU instruction set problem. Where does it come from?
If you compiled on the same computer, the host system you build on should not affect the executables produced, because either the very same compiler runs inside the Docker build or precompiled executables are used.
So I'm quite puzzled.

@dennyverbeeck
Owner

> If I understood correctly, you said that building the docker on ubuntu 14.04 produces a container that works both on 14.04 and 16.04, but building it on 16.04 produces a container that works only on 16.04, on the same computer right ?

It was on different machines. The 16.04 build was in a VirtualBox VM on my laptop running a Core i7-6820HQ; the 14.04 build was on a server with Intel Xeon E5-2630 CPUs. The image I compiled on the 14.04 server ran fine in my 16.04 VM. The one I compiled in the 16.04 VM would not run on the 14.04 server; it also exited with exit code 132.

According to this post, an exit code of 132 means the process exited because of SIGILL, i.e. an illegal instruction, which seems to confirm the hunch from the first post that there is an instruction set mismatch. The i7 in my laptop was launched 3 years later than the Xeon in the server, so it might support instructions that are not available on the Xeon, causing the SIGILL. If that is the case there really isn't anything I can do, and you should compile it on your machine and use that binary. It takes some time but you'll only need to do it once. You could give it its own tag, e.g. run `docker build -t karl/transmart-rserve .` in the transmart-rserve directory, and then modify the tmrserve service in docker-compose.yml to point to karl/transmart-rserve, to avoid accidentally picking up the DockerHub version again.
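
The exit-code reading above follows a common shell convention: when a process is killed by a signal, the reported status is 128 plus the signal number, so 132 corresponds to signal 4, which is SIGILL on Linux. A minimal sketch of that decoding (the helper name `decode_exit` is made up for illustration):

```shell
# Hypothetical helper: turn an exit status into a readable description.
# Statuses above 128 conventionally mean "killed by signal (status - 128)".
decode_exit() {
  code=$1
  if [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    # kill -l <number> prints the signal name without the SIG prefix
    echo "killed by signal $sig (SIG$(kill -l "$sig"))"
  else
    echo "exited with status $code"
  fi
}

decode_exit 132
```

On a Linux shell this should report signal 4 (SIGILL) for status 132, matching the interpretation given in the comment.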
FYI, here are the instruction sets available on the 14.04 server where rserve was built:

$ cat /proc/cpuinfo | grep flags
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx hypervisor lahf_lm ida arat epb xsaveopt pln pts dtherm
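
One way to use a flags line like this is to compare it with the flags of the machine that built the image. The sketch below assumes both flag lists are available as space-separated strings; the helper name `missing_flags` and the short "build" list are illustrative only and are not the actual flags of the i7 in this thread.

```shell
# Hypothetical helper: print every flag in the first list that is absent
# from the second list (e.g. build-machine flags vs. target-machine flags).
missing_flags() {
  for flag in $1; do            # unquoted on purpose: split on whitespace
    case " $2 " in
      *" $flag "*) ;;           # target has this flag
      *) echo "$flag" ;;        # target lacks this flag
    esac
  done
}

# On a real machine the list could come from:
#   grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2
build_flags="sse4_2 popcnt aes avx avx2 fma bmi2"          # illustrative only
target_flags="sse4_1 sse4_2 x2apic popcnt aes xsave avx"   # excerpt of the Xeon line above

missing_flags "$build_flags" "$target_flags"
```

Any flag printed here names a CPU feature the build machine has and the target lacks, which is exactly the kind of gap that can lead to an illegal instruction when code is tuned for the build machine.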

Finally, there is this post which explains a possible procedure to inspect the instruction sets used by a binary. Maybe I'll have a look tomorrow; after all, it would be nice to have a definitive answer as to why it's crashing on your machine!
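
The linked procedure is not reproduced here. As a rough sketch of the general idea, one could disassemble the binary with `objdump` and tally instruction mnemonics, then look for ones that need a CPU feature missing from the target's flags. The helper name `tally_mnemonics` is hypothetical, and the parsing assumes GNU objdump's tab-separated `-d --no-show-raw-insn` layout.

```shell
# Hypothetical helper: read `objdump -d --no-show-raw-insn` output on stdin
# and print "count mnemonic" pairs, most frequent first.
tally_mnemonics() {
  awk -F'\t' '
    NF >= 2 {                      # instruction lines: "  addr:<TAB>mnemonic operands"
      split($2, words, " ")
      if (words[1] != "") count[words[1]]++
    }
    END { for (m in count) print count[m], m }
  ' | sort -rn
}

# Usage on a real binary (the path is a placeholder):
#   objdump -d --no-show-raw-insn /path/to/binary | tally_mnemonics | head -n 20
```

For example, a mnemonic such as `vfmadd231sd` belongs to the FMA extension, and the Xeon flags listed above do not include `fma`. Note that this only shows which instructions are present in the file, not which ones actually execute, and that prefixes such as `rep` or `lock` are counted as if they were mnemonics.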

Cheers,
Denny

@kforner
Author

kforner commented Jun 1, 2017

> It was on different machines. The 16.04 was on a virtualbox VM on my laptop running a Core i7-6820HQ, the 14.04 was on a server sporting Intel Xeon E5-2630 CPUs.

OK, so now it makes perfect sense. The question now is what gets compiled with these flags.
I don't think it's R itself, since it probably comes from a Debian package and so ought to be compatible. It's probably some R packages that get compiled and installed.

@kforner
Author

kforner commented Jun 1, 2017

I think I got it!
I ran the failing rserve container on my older Xeon workstation:

% /transmart-data/R/build/bin/R
Illegal instruction (core dumped)

So it's R itself.

Apparently it is compiled from source, using this Makefile: /transmart-data/R/Makefile
And I guess the culprit is:
R_FLAGS ?= -O2 -march=native

which must pick compilation parameters for the local CPU.
I suppose setting R_FLAGS=-O2 in /transmart-data/makefile.inc is enough to test it.
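
Because the Makefile uses `?=`, GNU make only applies that default when `R_FLAGS` is not already set, so the value can be overridden from the command line or the environment without editing the file. A small sketch with a throwaway Makefile (only the `R_FLAGS` line comes from this thread; the `show` target and the temp directory are invented for the demonstration):

```shell
# Build a throwaway Makefile containing the R_FLAGS default from the thread
# plus a made-up "show" target that just prints the variable.
demo_dir=$(mktemp -d)
printf 'R_FLAGS ?= -O2 -march=native\nshow:\n\t@echo $(R_FLAGS)\n' > "$demo_dir/Makefile"

make -s -C "$demo_dir" show               # default value applies
make -s -C "$demo_dir" show R_FLAGS=-O2   # command-line override
R_FLAGS=-O2 make -s -C "$demo_dir" show   # environment override also wins over ?=
```

The first call should print the default flags including -march=native, and the other two should print only -O2. Whether the real transmart-data build picks up the override the same way depends on how its Makefiles include each other, which this sketch does not model.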

@dennyverbeeck
Owner

Great! Yes, R itself gets compiled from source, and -march=native indeed generates instructions optimized for the local CPU, so now we immediately know why there is no pre-compiled R in transmart-data 😄 It might be a good idea to remove the -march=native flag for the docker container so it runs on all systems. Then I'll add a note to the Readme explaining this, and that for additional performance R can be compiled from source as well.
I'll try to test it out tomorrow. Thanks, Karl, for your time investigating this 😄

@kforner
Author

kforner commented Jun 1, 2017

You're very welcome! I'm so glad you're taking care of this transmart-docker project.

@dennyverbeeck
Owner

dennyverbeeck commented Jun 2, 2017

I just pushed a new transmart-rserve:etriks-v4.0 image, compiled on my i7 without the -march=native flag. It runs on the older Xeon server! Could you perhaps pull it and check if it runs on your machine?

Thanks,
Denny

@kforner
Author

kforner commented Jun 2, 2017

> Could you perhaps pull it and check if it runs on your machine?

It's working just fine :)

@kforner kforner closed this as completed Jun 2, 2017
@dennyverbeeck
Owner

Perfect! Thanks Karl :)
