a test for when the clang compiler method signature mismatches with the julia ccall one #978

Closed
wants to merge 1 commit into
from

Projects

None yet

4 participants

@vtjnash
Member
vtjnash commented Jun 27, 2012

The clang / llvm-gcc compiler that ships with Apple does not cast 8 bit values to the appropriate size in function arguments. I worked around this bug in the fall to correct boxing of 8 bit values and get the code to compile, but I realized this could affect ccall also (if the target code was compiled with llvm-gcc / clang) so I started to put together this testsuite for ccall to demonstrate the error.

@JeffBezanson
Member

Needs a rebase; then I will merge it.

@ViralBShah
Member

Is this the likely reason for the instability we see with clang in #1013?

@StefanKarpinski
Member

@vtjnash: can you fix this up? Should we merge it?

@ViralBShah
Member

@vtjnash can you check if this is fixed with Xcode 4.4?

@vtjnash
Member
vtjnash commented Aug 5, 2012

It wasn't fixed with clang 3.1 (although I feel I need to find a way to trigger this test more reliably). Really this needs to be fixed somehow in Julia / LLVM.

@JeffBezanson
Member

Bump

@JeffBezanson
Member

Should this be filed as a bug in clang? What should we do next?

@vtjnash
Member
vtjnash commented Nov 17, 2012

Sorry for the huge delay. I didn't know llvm well enough. The problem is with Julia not LLVM, I just didn't understand where or how to look previously.

@JeffBezanson I've come to realize that method signatures are slightly incorrect. It is very important to tell llvm the sign of the arguments (signext or zeroext). I can show you why later / on IRC / in person (in short: llvm will remove the previous bit casts because it assumes they weren't necessary). GCC is much more forgiving regarding mismatching type signatures than clang.
e.g. the method signature for jl_box_int8 in llvm optcodes should have been define void* @jl_box_int8(i8 zeroext %x) { } (before we changed this last fall to ignore the issue)

@vtjnash
Member
vtjnash commented Nov 17, 2012

http://llvm.org/docs/LangRef.html#paramattrs
Lots of interesting attributes to read about. also, clang -S -emit-llvm main.c -o - can be really informative

@StefanKarpinski
Member

I have to say it frequently seems to me like llvm should have just gone with having signed and unsigned integer types. Oh well.

@JeffBezanson
Member

I'm kind of glad it doesn't have them, since they don't really exist. All that's really there is a bit string after all.

@vtjnash do you think we should add the following:

--- a/src/intrinsics.cpp
+++ b/src/intrinsics.cpp
@@ -994,6 +994,8 @@ static Function *boxfunc_llvm(FunctionType *ft, const std::string &cname,
 {
     Function *f =
         Function::Create(ft, Function::ExternalLinkage, cname, jl_Module);
+    Function::arg_iterator AI = f->arg_begin();
+    (*AI).addAttr(Attribute::ZExt);
     jl_ExecutionEngine->addGlobalMapping(f, addr);
     return f;
 }
@vtjnash
Member
vtjnash commented Nov 18, 2012

@JeffBezanson: yes, that seems right, but you need to select zext or sext based on the sign of the argument.

I agree with Stefan. Looking at how the code generator works, they should have just gone with explicitly signed and unsigned types.

I've observed that llvm seems to mostly ignore explicit bit modification instructions, so they don't actually happen until just before the function call, and then they happen based upon this hinting. (they probably are observed more closely in the presence of load / store instructions, but I haven't investigated). Additionally, this looks like it is necessary for every function created in LLVM, even if they are purely JIT functions, for any bit type. Pointer types I think are OK without it.

Also from that link above, most ccall functions would benefit from having the sspreq attribute set, to protect the stack. Also byval seems rather interesting, although I'm not sure if it can be any more efficient that the current implementation.

@JeffBezanson
Member

I don't fully understand why we would need this, since in every call I generate the types match exactly, as required by the llvm verifier. And the type of the actual argument should matter more, since for example you would zero extend a uint8 to pass it as a signed int32.

@vtjnash
Member
vtjnash commented Nov 18, 2012

That's what this pull request was originally about. Correctly matching the types is not enough. The signature of the function must also be made to match, or (I've observed that) the assembly to guarantee conversion of the types will not be generated. And you can't pass an int8 to a function that requires an int32, their sizes don't match.

Here's my final code testcase (in c first, for readability):

int8_t ptr[300];
int8_t f2a(int8_t x) { return ptr[x]; }
int8_t f1a(int64_t x) { return f2a( (int8_t)(x&0xff) ); }

int8_t f2b(int8_t x) { return x+1; }
int8_t f1b(int64_t x) { return f2b( (int8_t)(x&0xff) ); }

int main() {
  s1 = f1a( 5000 );
  s2 = f2a( 5000 );
  return (int)s2;
}

(disclaimer: I just wrote this for readability, I didn't actually use and/or test it. It should generate something equivalent to the below LLVM, if the optimizations don't eliminate everything)

Compiling this by hand into raw llvm opt codes (this is what I really was using):

@ptr = global i8 300

define i8 @f2a(i8 zeroext %x) {
  %1 = getelementptr i8* @ptr, i8 %x
  %2 = load i8* %1, align 4
  ret i8 %2
}

define i8 @f1a(i64 %A) {
  %s1 = trunc i64 %A to i8
  %s2 = call i8 @f2a(i8 %s1)
  ret i8 %s2
}

define i8 @f2b(i8 zeroext %x) {
  %1 = add i8 %x, 1
  ret i8 %1
}

define i8 @f1b(i64 %A) {
  %s1 = trunc i64 %A to i8
  %s2 = call i8 @f2b(i8 %s1)
  ret i8 %s2
}

define i32 @main() {
  %s1 = call i8 @f1a(i64 5000)
  %s2 = call i8 @f1b(i64 5000)
  %s3 = sext i8 %s2 to i32
  ret i32 %s3
}

Experiment with removing and changing the various zeroext and truncate instructions and you will find that LLVM does not generate sufficient code to remove the extra bits if the type of the argument is not specified as zero/sign extended.

Generate the associated assembly with $ llc test.c -o -
(painful assembly output follows, comments are at the end)

    .section    __TEXT,__text,regular,pure_instructions
    .globl  _f2a
    .align  4, 0x90
_f2a:                                   ## @f2a
    .cfi_startproc
## BB#0:
                                        ## kill: EDI<def> EDI<kill> RDI<def>
    movsbq  %dil, %rax
    leaq    _ptr(%rip), %rcx
    movb    (%rax,%rcx), %al
    ret
    .cfi_endproc

    .globl  _f1a
    .align  4, 0x90
_f1a:                                   ## @f1a
    .cfi_startproc
## BB#0:
    pushq   %rax
Ltmp1:
    .cfi_def_cfa_offset 16
    movzbl  %dil, %edi
    callq   _f2a
    popq    %rdx
    ret
    .cfi_endproc

    .globl  _f2b
    .align  4, 0x90
_f2b:                                   ## @f2b
    .cfi_startproc
## BB#0:
    incb    %dil
    movb    %dil, %al
    ret
    .cfi_endproc

    .globl  _f1b
    .align  4, 0x90
_f1b:                                   ## @f1b
    .cfi_startproc
## BB#0:
    pushq   %rax
Ltmp3:
    .cfi_def_cfa_offset 16
    movzbl  %dil, %edi
    callq   _f2b
    popq    %rdx
    ret
    .cfi_endproc

    .globl  _main
    .align  4, 0x90
_main:                                  ## @main
    .cfi_startproc
## BB#0:
    pushq   %rax
Ltmp5:
    .cfi_def_cfa_offset 16
    movl    $5000, %edi             ## imm = 0x1388
    callq   _f1a
    movl    $5000, %edi             ## imm = 0x1388
    callq   _f1b
    movsbl  %al, %eax
    popq    %rdx
    ret
    .cfi_endproc

    .section    __DATA,__data
    .globl  _ptr                    ## @ptr
_ptr:
    .byte   44                      ## 0x2c


.subsections_via_symbols

Note that if zeroext is removed, the movzbl instructions will also disappear. Then both of these functions will return an incorrect answer, despite the attempt at type matching! The argument types in the function did not require it, so the callers did not bother to clear the high bits.
Further note that the return value isn't actually returned as a byte either, but as the lower 8 bits of a certain register, with the other bits unchanged. It is assumed that the caller will take care of these details, but only if this is communicated via the function argument types.

@JeffBezanson
Member

Ah, so an i8 argument with no attribute behaves as a word where the higher bits are just undefined? That strikes me as a bad way to avoid a single mov instruction. Alternatively, perhaps this can be seen as a flaw in getelementptr; it should require the offset to be extended to a pointer-size integer.

@JeffBezanson
Member

Actually, julia might be less susceptible to this problem, since internally all array indexes are word-size integers, and any other integer types are explicitly converted first. I don't think we generate any GEPs with non-word-size integer offsets.

@vtjnash
Member
vtjnash commented Dec 3, 2012

I played with this some more and found (no big surprise here) that the important thing is to be consistent (I was analyzing the above assembly slightly incorrectly). Thus, if the function is declared with zeroext in the parameter type, then LLVM will generate code that assumes the high bits of that parameter are already zero (including optimizing away explicit casts) but it will also generate code at the call site to ensure this condition is met (otherwise it will optimize those casts away).

When clang is compiling functions, it declares the function parameter types to include the zeroext / signext modifiers (e.g. it assumes that the caller will take care of the conversion). By contrast, when GCC compiles functions, it appears to take the safe approach and has both the caller and callee perform the bit truncation. Therefore, only external functions (those using addGlobalMapping, esp. those in ccall) must be annotated (to match clang's expectations) to ensure proper behavior. Internally to Julia, this parameter is optional and only serves to occasionally allow slight additional optimization.

@JeffBezanson JeffBezanson was assigned Dec 28, 2012
@vtjnash
Member
vtjnash commented Jan 6, 2013

Following up on on conversation last night, clang only applies signext/zext to i8 and i16 arguments (not pointers, floats, or larger inteters). It's probably still worth asking them what their algorithm is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment