Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

segmentation fault in native function built by clang (misaligned stack) #6038

Closed
vicuna opened this issue Jun 11, 2013 · 20 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link

commented Jun 11, 2013

Original bug ID: 6038
Reporter: kik
Assigned to: @xavierleroy
Status: closed (set by @xavierleroy on 2015-12-11T18:27:50Z)
Resolution: fixed
Priority: high
Severity: minor
Platform: i386
OS: Linux Mint
OS Version: 15
Version: 3.12.1
Target version: 4.02.0+dev
Fixed in version: 4.02.0+dev
Category: runtime system and C interface
Tags: patch
Monitored by: @ygrek "Richard Jones"

Bug description

I'm using ocaml binding of llvm on i386 Linux.
But it crashes with segmentation fault on the same position every time.

I noticed:

  • it crashes if llvm library is built by clang++ with optimization.
  • there is "movaps xmm0, ofs(%esp)" opcode.
  • stack pointer is misaligned.

It seems clang++ expects aligned stack but ocaml runtime does not align stack.

I do not know i386 Linux ABI requests it, but I think passing aligned stack from
ocaml runtime removes surprising result.

Steps to reproduce

/* simple example generates "movaps" and request stack alignment /
/
compile: clang -O2 -c foo.c */
#include <stdio.h>
#include <caml/mlvalues.h>
#include <caml/memory.h>
#include <caml/alloc.h>
#include <caml/custom.h>

struct X
{
int a, b, c, d;
};

attribute((noinline))
int foo(const struct X *x)
{
printf("in foo\n");
return x->a;
}

value bar(value unit)
{
struct X x = { 0, 0, 0, 0 };
CAMLparam1(unit);
CAMLreturn(Val_int(foo(&x)));
}

File attachments

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 11, 2013

Comment author: @jhjourdan

This is a common problem in x86 C compilers : it is more or less a consensus that in the "modern" x86 ABI, stack should be 16-byte aligned :

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40838

I think OCaml should try to make sure stack is aligned when calling C code.

I can imagine of at least two ways to do that :

(1) As in macosx, ocamlopt keep the stack aligned, so that C functions are calld with an aligned stack
(2) We add the force_align_arg_pointer attribute to all C primitives (by using the CAMLprim macro), so that the C compiler knows that it has to generate code to realign the stack. There is few other C functions that also need to have this attribute.

(1) uses more stack space, but I do not know whether the attribute proposed in (2) is available on all the C compilers we need.

However, all this is my own opinion, and it does not count much, as I am not an ocamlopt expert.

JH

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 12, 2013

Comment author: @xavierleroy

Linux/x86-32 bits is supposed to comply to the SVR4/ELF x86-32 ABI, which states that the stack is always 4-aligned, but does not require 16-byte alignment.

AFAIK, the only x86-32 ABI that requires 16-alignment for the stack is MacOS X.

ocamlopt generates code that conforms to the corresponding ABIs. All the support for 16-alignment of the stack is there, but it is activated only for MacOS X.

Finally, keep in mind that bigger stack alignment has a cost, in terms of wasted stack space. For deeply recursive functions, this wasted space can cause a stack overflow. So, I'm not in favor of increasing alignment if there is no good reason.

Bottom line: I don't think a x86-32 compiler for Linux is ever allowed to assume 16-alignment on the stack pointer. If "clang -O2" does assume this, I'd say it is a clang/llvm problem.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 12, 2013

Comment author: @jhjourdan

Linux/x86-32 bits is supposed to comply to the SVR4/ELF x86-32 ABI, which states that the stack is always 4-aligned, but does not require 16-byte alignment.

Indeed. However, it seems like there is a consensus among C compiler writers on the fact that the stack should be 16-aligned. This has been a debate on the GCC bugtracker (see the link above), and the statu quo is that both GCC and LLVM rely on it. The glibc has been updated for this purpose.

Finally, keep in mind that bigger stack alignment has a cost, in terms of wasted stack space. For deeply recursive functions, this wasted space can cause a stack overflow. So, I'm not in favor of increasing alignment if there is no good reason.

I agree. Moreover, 16-aligning the OCaml stack might not be worth it, because Ocaml mainly does 4-bytes memory accesses. However, we do not need to 16-align the OCaml stack: we could add the force_align_arg_pointer attribute in the CAMLprim macro. It forces the C compiler not to assume that the stack is 16-aligned. It is available in not-too-old versions of LLVM and GCC.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 12, 2013

Comment author: @alainfrisch

we could add the force_align_arg_pointer attribute in the CAMLprim macro

This would force all external stub libraries to use CAMLprim. It seems that the manual indeed says this is required, but since CAMLprim does nothing, I'm not sure all stub libraries actually follow this convention (admittedly, they should be fixed).

For deeply recursive functions, this wasted space can cause a stack overflow.

It is conceivable (and cheap) to fix the stack alignment only around calls to C functions? I don't think the case of deeply recursive functions going through C matters in practice.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 12, 2013

Comment author: @jhjourdan

This would force all external stub libraries to use CAMLprim. It seems that the manual indeed says this is required, but since CAMLprim does nothing, I'm not sure all stub libraries actually follow this convention (admittedly, they should be fixed).

If such a stub does not contain CAMLprim, it is bogus (what's the point of CAMLprim if it is not used ?). Note that anyway, those stubs will be as likely to crash as now, because nothing is done currently.

It is conceivable (and cheap) to fix the stack alignment only around calls to C functions? I don't think the case of deeply recursive functions going through C matters in practice.

While possible, I am not sure it is that easy: you will have to realign the stack before pushing arguments, and restore the stack pointer after the call. However, the compiled code for evaluating the arguments can depend on the current offset between the current frame and the stack pointer, which is unknown statically...

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 12, 2013

Comment author: @alainfrisch

(what's the point of CAMLprim if it is not used ?)

Currently, the only effect of CAMLprim is within the runtime system itself (it serves as a marker to collect the list of built-in primitives with sed).

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 13, 2013

Comment author: kik

Thanks,

Bottom line: I don't think a x86-32 compiler for Linux is ever allowed to assume 16-alignment on the stack pointer. If "clang -O2" does assume this, I'd say it is a clang/llvm problem.

I checked my clang binary and source code. The binary is from Ubuntu package. So, I think it is properly configured.
The stack setting is written in

https://github.com/llvm-mirror/clang/blob/master/lib/Basic/Targets.cpp#L2860

It seems x86_32 default is "-S128" (128bit stack) and only WindowsX86_32TargetInfo uses "-S32".
I think this is intentional.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 13, 2013

Comment author: @alainfrisch

If such a stub does not contain CAMLprim, it is bogus (what's the point of CAMLprim if it is not used ?). Note that anyway, those stubs will be as likely to crash as now, because nothing is done currently.

This is true, but since Linux x86-32 is an exotic architecture these days, it is unlikely that authors of stubs will notice the problem. If this solution is picked (instead of having ocamlopt enforce the alignment), it will be important to make it widely known that CAMLprim is now required.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 14, 2013

Comment author: @xavierleroy

It is conceivable (and cheap) to fix the stack alignment only around calls to C functions?

No, as far as I can see. This would require extensive changes in the i386 code generator, plus an extra register must be reserved (in the code that pushes the arguments to the call) to act as an alternate stack pointer.

The possible workarounds seem to be:
1- Compile the stub code for the C libraries in question with GCC flag -mstackrealign
2- Put attribute force_align_arg_pointer in CAMLprim as suggested by Jacques-Henri and encourage authors of stub code to use CAMLprim
3- Impose 16-alignment of the stack throughout all OCaml-generated code, like we currently do for i386-macosx.

#2 looks like the least intrusive solution, but I wonder how much of a speed penalty this would be for the OCaml runtime itself.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 14, 2013

Comment author: @xavierleroy

Replying to myself:

I wonder how much of a speed penalty [solution #2] would be for the OCaml runtime itself.

After examination of the code generated by GCC, the answer is: probably negligible. (Just one "and" instruction. GCC already uses a frame pointer, so that cost is already paid.)

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 18, 2013

Comment author: @damiendoligez

Looks like the best solution is Xavier's option (2): add force_align_arg_pointer to CAMLprim, with all the necessary incantations in the configure script to make sure it's only used with C compilers that know it.
Who wants to provide a patch?

Out of curiosity, there's still one unanswered question: why does this affect clang and not gcc if both are supposed to assume 16-byte alignment of the stack?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 19, 2013

Comment author: kik

Out of curiosity, there's still one unanswered question: why does this affect clang and not gcc if both are supposed to assume 16-byte alignment of the stack?

Because use of SSE instructions is enabled default on clang but optional on gcc.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 19, 2013

Comment author: @jhjourdan

I submitted a patch that implements this solution.

I used a patched version of GCC that produces executables that crashes when entering a function with an unaligned stack. It appeared that some C callbacks were not annotated with CAMLprim (even in the Ocaml runtime). I have corrected many of them, but it may be the case there are still some of them.

I do not have a reliable way to test in the configure script whether the force_align_arg_pointer attribute is actually taken into account for the system compiler: I just tests it does not produce an error. With gcc, unknown attribute just produce warnings.

I think it could be a good idea to use this method for the macosx architecture: indeed, it would remove the performance penalty of keeping the Ocaml stack aligned. What do you think ?

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 19, 2013

Comment author: @xavierleroy

I do not have a reliable way to test in the configure script whether the force_align_arg_pointer attribute is actually taken into account for the system compiler

Googling suggests

#if defined(i386) && (GNUC > 4 || (GNUC == 4 && GNUC_MINOR > 1))

which seems to work for GCC and for Clang on my Mac OS X 10.8 ("Apple LLVM version 4.2"). Maybe this version test is as reliable or more than your configure-time test.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 26, 2013

Comment author: @jhjourdan

The problem with such a test is that it is indirect : some versions of clang will support the attribute, but won't be accepted by this test.

I would prefer using a conservative approach where we put the attribute when we know it does not generates an error (while still being able to generate a warning if the compiler is not recent enough).

As an other approach, we could test the attribute support using something like:

attribute((force_align_arg_pointer)) int test_realign() {
int x attribute((aligned(16)));
return (int)&x;
}

int main() {
int i;
for(i = 0; i < 4; i++) {
int res;
asm(
"sub %1, %%esp\n\t"
"call %2\n\t"
"add %1, %%esp\n\t"
: "=&a" (res)
: "r" (i
4), "r" (test_realign)
: "ecx", "edx");
if(res & 15)
return 1;
}
return 0;
}

If this test passes, we use the attribute.

Concerning the use of this attribute on macosx, I suggest using the previous test in order to deactivate ocaml stack alignment when possible. However, I do not have an Apple computer to do tests...

@vicuna

This comment has been minimized.

Copy link
Author

commented Jul 9, 2013

Comment author: @jhjourdan

I submitted a new version of the patch using this method to check alignment. If the C compiler does not support the attribute, it maintains the whole stack 16-aligned (using the existing macosx code).

I also added some more missing CAMLprims in libraries.

Given that we expect CAMLprim to be missing in some other code, and that macosx is very sensitive to misalignment, the stack is still kept aligned in this OS.

There is some specific code for mingw32, that should work, but I havn't tested it.

@vicuna

This comment has been minimized.

Copy link
Author

commented Jan 6, 2014

Comment author: Richard Jones

This does also affect gcc. See my comments here:
#5700#c10779

@vicuna

This comment has been minimized.

Copy link
Author

commented May 28, 2014

Comment author: @mshinwell

Am I right in thinking that we still need to fix this problem in OCaml?

It does appear that gcc has been presuming 16-byte alignment on x86-32 for some time.

@vicuna

This comment has been minimized.

Copy link
Author

commented May 28, 2014

Comment author: Richard Jones

Well it definitely affects Fedora. For a long time I've used this hack as a workaround. It would be good to have a proper fix in the compiler itself.

http://pkgs.fedoraproject.org/cgit/ocaml.git/tree/ocaml.spec#n274

@vicuna

This comment has been minimized.

Copy link
Author

commented Jun 2, 2014

Comment author: @xavierleroy

Fixed in version/4.02 (commit 14943) and on trunk (commit 14944) by forcing 16-alignment of stack for all x86-32 platforms except Win32/MSVC. It's not like the GCC and Clang developers left us much choice. Enjoy the new stack consumption. For the record, I still morally object to compiler writers violating ABIs, instead of evolving ABI standard documents.

@vicuna vicuna closed this Dec 11, 2015

@vicuna vicuna added this to the 4.02.0 milestone Mar 14, 2019

@vicuna vicuna added the bug label Mar 20, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.