Test failure: gfortran.dg/c_kind_params.f90 #73

Closed
fxcoudert opened this issue Dec 25, 2021 · 43 comments
Comments

@fxcoudert
Contributor

gfortran.dg/c_kind_params.f90 is failing, and has been for some time. I hoped that with the new version of varargs it would be fixed, but it's not. Somewhat reduced:

$ cat c_kind_params.f90
  subroutine param_test (a, b, c, d, e, f, g, h, i, j, k, l, m, &
                         n, o, p, q, r, s, t, u, v, w) bind(c)
  use, intrinsic :: iso_c_binding
  implicit none
    integer(c_int), value :: a, b, c, d, e, f, g, h, i, j, k, l, m, &
                             n, o, p, q, r, s, t, u
    character(c_char), value :: v
    integer(c_signed_char), value :: w

    print *, v, w
end subroutine param_test
$ cat c_kinds.c        
#include <stdint.h>

void param_test(int, int, int, int, int,
		int, int, int, int, int,
		int, int, int, int, int,
		int, int, int, int, int,
		int, char, signed char);

int main (int argc, char **argv)
{
   short int my_short = 1;
   int my_int = 2;
   long int my_long = 3;
   long long int my_long_long = 4;
   int8_t my_int8_t = 1;
   int_least8_t my_int_least8_t = 2;
   int_fast8_t my_int_fast8_t = 3;
   int16_t my_int16_t = 1;
   int_least16_t my_int_least16_t = 2;
   int_fast16_t my_int_fast16_t = 3;
   int32_t my_int32_t = 1;
   int_least32_t my_int_least32_t = 2;
   int_fast32_t my_int_fast32_t = 3;
   int64_t my_int64_t = 1;
   int_least64_t my_int_least64_t = 2;
   int_fast64_t my_int_fast64_t = 3;
   intmax_t my_intmax_t = 1;
   intptr_t my_intptr_t = 0;  
   float my_float = 1.0;

   param_test(1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3,
              1, 0, 1, 2, 3, 'y', 1);
}
$ ./bin/gfortran c_kind_params.f90 c_kinds.c && ./a.out
 y    0

where it is clear that the last argument is passed with the wrong value (1 was passed, 0 is printed). It is a sensitive test: the value displayed (0 here) can change if you compile with -O1 or remove some of the unused local variables.

A shorter version, showing a non-zero value passed (which makes the original test pass, but not the reduced testcase) is:

$ cat c_kinds.c                                                        
void param_test(int, int, int, int, int, int, int, int, int, int,
		int, int, int, int, int, int, int, int, int, int,
		int, char, signed char);

int main (void) {
  param_test (1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 0, 1, 2, 3, 'y', 1);
}
$ cat c_kind_params.f90 
subroutine param_test (a, b, c, d, e, f, g, h, i, j, k, l, m, &
                       n, o, p, q, r, s, t, u, v, w) bind(c)
  use, intrinsic :: iso_c_binding
  implicit none
  integer(c_int), value :: a, b, c, d, e, f, g, h, i, j, k, l, m, &
                           n, o, p, q, r, s, t, u
  character(c_char), value :: v
  integer(c_signed_char), value :: w

  print *, v, w
end subroutine
$ ./bin/gfortran c_kind_params.f90 c_kinds.c && ./a.out
 y   96
@iains
Owner

iains commented Dec 25, 2021

hmm .. here I am not sure why we would be using varargs anyway - the prototypes are explicit. This seems to say that we are making a mistake with packing of args on the stack - but I'm not sure why that would change with optimisation (unless some inlining or something).

darwin c-chars are signed in any event - so if you need a specific test - it would be the unsigned case.

time to look at the asm generated :)

@fxcoudert
Contributor Author

Signed or unsigned makes no difference, I've tried. I know this case doesn't have varargs, but thought somehow an issue might have slipped into the handling of args packing at the same time.

@fxcoudert
Contributor Author

OK I think I've reduced to its bare minimum. Take this pure C example:

$ cat c_kinds.c 
void param_test (int, int, int, int, int, int, int, int, int, int,
		 int, int, int, int, int, int, int, int, int, int,
		 int, char, signed char);

void foo_ (void) {
  __builtin_puts("ERROR");
}

int main (void) {
  param_test (1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 0, 1, 2, 3, 'y', 1);
}
$ cat equivalent.c 
void foo_(void);

void param_test(int a, int b, int c, int d, int e, int f, int g, int h, int i, int j,
		int k, int l, int m, int n, int o, int p, int q, int r, int s, int t,
		int u, char v, signed char w)
{
  if (w != 1)
    foo_();
}
$ ./bin/gcc c_kinds.c equivalent.c && ./a.out     

It works, as it should. Now, I replace the "param_test" function with Fortran, and boom:

$ cat c_kind_params.f90 
subroutine param_test (a, b, c, d, e, f, g, h, i, j, k, l, m, &
                       n, o, p, q, r, s, t, u, v, w) bind(c)
  use, intrinsic :: iso_c_binding
  implicit none
  integer(c_int), value :: a, b, c, d, e, f, g, h, i, j, k, l, m, &
                           n, o, p, q, r, s, t, u
  character(c_char), value :: v
  integer(c_signed_char), value :: w

  if (w /= 1) call foo
end subroutine
$ ./bin/gfortran c_kinds.c c_kind_params.f90 && ./a.out 
ERROR

The Fortran and C versions of param_test should be absolutely identical and lead to the same codegen. But:

$ ./bin/gfortran c_kind_params.f90 -S -O2              
$ ./bin/gcc equivalent.c -S -O2          
$ diff -pu equivalent.s c_kind_params.s  
--- equivalent.s	2021-12-25 20:03:06.000000000 +0100
+++ c_kind_params.s	2021-12-25 20:03:03.000000000 +0100
@@ -5,7 +5,7 @@
 	.globl _param_test
 _param_test:
 LFB0:
-	ldrsb	w0, [sp, 53]
+	ldrsb	w0, [sp, 56]
 	cmp	w0, 1
 	bne	L4
 	ret

There are more differences in the codegen at -O0, but they may be insignificant.

@iains
Owner

iains commented Dec 25, 2021

OK, we probably ought to be able to reduce the number of initial ints... probably

int a          - x0
int b          - x1
int c          - x2
int d          - x3
int e          - x4
int f          - x5
int g          - x6
int h          - x7
int i          - [sp]
int j          - [sp + 4]
int k          - [sp + 8]
int l          - [sp + 12]
int m          - [sp + 16]
int n          - [sp + 20]
int o          - [sp + 24]
int p          - [sp + 28]
int q          - [sp + 32]
int r          - [sp + 36]
int s          - [sp + 40]
int t          - [sp + 44]
int u          - [sp + 48]
char v         - [sp + 52]
signed char w  - [sp + 53]

Is what I expect.
So what we need is to dump the signature that the Fortran version is producing
My guess is that it is converting the signed char w => int.
which would be correctly positioned at [sp + 56]

edit - I'd expect the same result if you dropped i .. u from the arguments (unless the bug is dependent on the args count, of course)

@iains
Owner

iains commented Dec 25, 2021

JFTR, clang and GCC agree on the position of w in C ( so, at least, we're ABI-compliant in this case :) )

@fxcoudert
Contributor Author

The asm difference remains with args i to u removed.

The Fortran tree dump for that version is:

__attribute__((fn spec (". . . . . . . . . w . ")))
void param_test (integer(kind=4) a, integer(kind=4) b, integer(kind=4) c, integer(kind=4) d, integer(kind=4) e, integer(kind=4) f, integer(kind=4) g, integer(kind=4) h, character(kind=1) v, integer(kind=1) w)
{
  <bb 2> :
  if (w_2(D) != 1)
    goto <bb 3>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 3> :
  foo ();

  <bb 4> :
  return;

}

Notice that v is character(kind=1) v. This is supposed to be the same as the C char type. Not sure why it's not treated as such.


In fact, that Fortran version should be equivalent to having for v this:

  integer(c_signed_char), value :: v

but the codegen shows that it is different. Hum… why are these types treated differently?

@iains
Owner

iains commented Dec 25, 2021

character(kind=1) v,
integer(kind=1) w)

so it seems integer(kind=1) is being treated the same as integer(kind=4)
?

@iains
Owner

iains commented Dec 25, 2021

The asm difference remains with args i to u removed.

  integer(c_signed_char), value :: v

but the codegen shows that it is different. Hum… why are these types treated differently?

I guess Fortran either does not know (or maybe has no provision for) that the default char type is signed [there are relatively few targets with default signed char, so it might not show much except on Darwin which is probably the only one using Fortran aggressively]

@iains
Owner

iains commented Dec 25, 2021

this port has already picked up a bunch of problems with the function interfaces, so it's doing useful service for improving the compiler :)

@fxcoudert
Contributor Author

fxcoudert commented Dec 25, 2021

so it seems integer(kind=1) is being treated the same as integer(kind=4)

No, integer(kind=1) is the signed 8-bit integer type, so it's a signed char, actually. I'm pretty convinced this guy is not the culprit, but the other one is. We have many testcases passing around small integer types. It's the other one (v) that is weird here.

That's not 100% convincing (although it is plausible too);
the rules for laying out the type in the frame involve two things;
1/ size
2/ necessary alignment.

the first value appears at the start of a slot (so that has sufficient alignment for either a char (1) or an int (4)) .. and, yes, if the character kind=1 is being promoted to int, then that would be 4 bytes, which would push the second char to a 4-byte offset.

OTOH, if the second char is promoted to int, then that would need 4 bytes of alignment which would also push it to a higher offset.

(we cannot tell which is in effect without seeing what the "c-style" function signature is)

The type node for character(kind=1) is built like this:

      type = gfc_build_uint_type (gfc_character_kinds[index].bit_size); /* with bit_size == 8 */
      type = build_qualified_type (type, TYPE_UNQUALIFIED);
      snprintf (name_buf, sizeof(name_buf), "character(kind=%d)",
                gfc_character_kinds[index].kind); /* with kind == 1 */
      PUSH_TYPE (name_buf, type);

so our character(kind=1) is explicitly made unsigned. But that shouldn't affect packing, should it?

No, but it's a potential source of other bugs - so we ought to file something on BZ about that.

@iains
Owner

iains commented Dec 25, 2021

oops I meant to reply to your post, not edit it.

@fxcoudert
Contributor Author

we cannot tell which is in effect without seeing what the "c-style" function signature is

I'd like to be able to hack into the middle-end and dump that function type, but I don't know how. I mean I remember I have to set up a breakpoint in a suitable place and call debug_tree, but I can't find the right place to break at.

@fxcoudert
Contributor Author

Hey looking at the tree dump that fn spec is crazy:

__attribute__((fn spec (". . . . . . . . . w . ")))
void param_test (integer(kind=4) a, integer(kind=4) b, integer(kind=4) c, integer(kind=4) d, integer(kind=4) e, integer(kind=4) f, integer(kind=4) g, integer(kind=4) h, character(kind=1) v, integer(kind=1) w)

I mean, how could a scalar variable be writable? But I suppose this does not affect your argument layout, it is only used at middle-end level to optimise stuff. Am I right?

@iains
Owner

iains commented Dec 25, 2021

I was looking at -fdump-rtl-expand which seems to support your hunch, suggesting that v is being treated as an 8bit sub-reg of a 32b value.

debug-wise I suppose the answer is to break on expand_call or expand_function and then look at the actual function types list passed...

@fxcoudert
Contributor Author

So the tree dump for arg v is:

 <array_type 0x1059e9260
    type <integer_type 0x105844348 character(kind=1) public unsigned QI
        size <integer_cst 0x105804db0 constant 8>
        unit-size <integer_cst 0x105804dc8 constant 1>
        align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x105844348 precision:8 min <integer_cst 0x105804de0 0> max <integer_cst 0x105804d80 255>
        pointer_to_this <pointer_type 0x105847f00>>
    string-flag QI size <integer_cst 0x105804db0 8> unit-size <integer_cst 0x105804dc8 1>
    align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x1059e9260
    domain <integer_type 0x1059e9110
        type <integer_type 0x105844738 integer(kind=8) public DI
            size <integer_cst 0x105804cc0 constant 64>
            unit-size <integer_cst 0x105804cd8 constant 8>
            align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x105844738 precision:64 min <integer_cst 0x105804f48 -9223372036854775808> max <integer_cst 0x105804f60 9223372036854775807>
            pointer_to_this <pointer_type 0x1058575d0>>
        DI size <integer_cst 0x105804cc0 64> unit-size <integer_cst 0x105804cd8 8>
        align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x1059e9110 precision:64 min <integer_cst 0x105889c80 1> max <integer_cst 0x105889c80 1>>>

and that of w is:

 <integer_type 0x105847a68 integer(kind=1) public QI
    size <integer_cst 0x105804db0 type <integer_type 0x1058440a8 bitsizetype> constant 8>
    unit-size <integer_cst 0x105804dc8 type <integer_type 0x105844000 sizetype> constant 1>
    align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x105847a68 precision:8 min <integer_cst 0x1058050f8 -128> max <integer_cst 0x105805170 127>>

So I guess the problem is that the v argument should be a single scalar, not a one-element array (if I read that right). If so, it's a front-end bug :(

@iains
Owner

iains commented Dec 25, 2021

Hey looking at the tree dump that fn spec is crazy:

well there's been quite a bit of fixing up of the fn-specs in this cycle (some of which is the result of problems with this port - but seems there's more to do)

__attribute__((fn spec (". . . . . . . . . w . ")))
void param_test (integer(kind=4) a, integer(kind=4) b, integer(kind=4) c, integer(kind=4) d, integer(kind=4) e, integer(kind=4) f, integer(kind=4) g, integer(kind=4) h, character(kind=1) v, integer(kind=1) w)

I mean, how could a scalar variable be writable? But I suppose this does not affect your argument layout, it is only used at middle-end level to optimise stuff. Am I right?

Layout would only be affected by the passed-as type (so if it's a char being passed as an int, that would break things) - although that is not how C rules are being interpreted by either GCC or clang (they both agree in this case)

@iains
Owner

iains commented Dec 25, 2021

So the tree dump for arg v is:

 <array_type 0x1059e9260
    type <integer_type 0x105844348 character(kind=1) public unsigned QI
        size <integer_cst 0x105804db0 constant 8>
        unit-size <integer_cst 0x105804dc8 constant 1>
        align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x105844348 precision:8 min <integer_cst 0x105804de0 0> max <integer_cst 0x105804d80 255>
        pointer_to_this <pointer_type 0x105847f00>>
    string-flag QI size <integer_cst 0x105804db0 8> unit-size <integer_cst 0x105804dc8 1>
    align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x1059e9260
    domain <integer_type 0x1059e9110
        type <integer_type 0x105844738 integer(kind=8) public DI
            size <integer_cst 0x105804cc0 constant 64>
            unit-size <integer_cst 0x105804cd8 constant 8>
            align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x105844738 precision:64 min <integer_cst 0x105804f48 -9223372036854775808> max <integer_cst 0x105804f60 9223372036854775807>
            pointer_to_this <pointer_type 0x1058575d0>>
        DI size <integer_cst 0x105804cc0 64> unit-size <integer_cst 0x105804cd8 8>
        align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x1059e9110 precision:64 min <integer_cst 0x105889c80 1> max <integer_cst 0x105889c80 1>>>

Yikes - that supports the "w" attribute, I guess.. but not exactly an efficient way to pass a single char ;)

So I guess the problem is that the v argument should be a single scalar, not a one-element array (if I read that right). If so, it's a front-end bug :(

seems likely and likely affects all platforms - but this is the only one that's going to show it.

@iains
Owner

iains commented Dec 25, 2021

so two possible PRs:

  1. We do not honour default-signed-char (which is possibly academic to Fortran internally - but is not to c bindings)
  2. That it seems a char is being passed as one entry in a 4byte array..

@iains
Owner

iains commented Dec 25, 2021

I wonder, in passing why w is not producing this effect - and why changing the signed-ness of v makes no difference.

@iains
Owner

iains commented Dec 25, 2021

I suppose the key is character c.f. int - perhaps character is always represented as an array? (speculation).

@fxcoudert
Contributor Author

In the Fortran front-end, character variables are always arrays. We never actually have character scalars, only strings, represented as arrays. When we emit a call to a C function, with a signature that expects a scalar char value, the front-end actually builds the scalar type from the existing types. And apparently, it can fail (although why this has never mattered on other targets, I don't know.)

@iains
Owner

iains commented Dec 25, 2021

This is the first target (or at least the only one in the current set) to pack entities on the stack, so most targets will start a new PARM_BOUNDARY (most likely 8bytes on most machines) for each stack var and you would never see it on a LE machine. You might see it on a BE machine tho. Perhaps I'll experiment with powerpc-darwin9 when it frees up from this week's test cycle.

@fxcoudert
Contributor Author

That it seems a char is being passed as one entry in a 4byte array

I'm not very familiar with array_type and domain. How do you read that it is 4byte?

@iains
Owner

iains commented Dec 25, 2021

I did not deduce it from your debug_tree () output - but from the output from fdump-rtl-expand which seems to show

(insn 9 8 10 2 (set (reg/v:SI 99 [ h ])
        (reg:SI 7 x7 [ h ])) "c_kind_params.f90":1:21 -1
     (nil))
^^^ last int in a reg
(insn 10 9 11 2 (set (reg:SI 100)
        (mem/c:SI (reg/f:DI 86 virtual-incoming-args) [9 v+0 S4 A128])) "c_kind_params.f90":1:21 -1
     (nil))
^^^ load a 32b value from [sp == virtual-incoming-args]
(insn 11 10 12 2 (set (reg:QI 101)
        (subreg:QI (reg:SI 100) 0)) "c_kind_params.f90":1:21 -1
     (nil))

^^^^ this extracts the lowest 8 bits.

edit - in fairness that does not prove that it is being treated as an array - only that it is being passed as a 32b chunk (which is what's messing things up).

@fxcoudert
Contributor Author

Reported to bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103828

@fxcoudert
Contributor Author

Hum, I might have a one-line fix for it:

--- a/gcc/fortran/trans-types.c
+++ b/gcc/fortran/trans-types.c
@@ -2262,14 +2262,14 @@ gfc_sym_type (gfc_symbol * sym, bool is_bind_c)
 
   if (sym->ts.type == BT_CHARACTER
       && ((sym->attr.function && sym->attr.is_bind_c)
-         || (sym->attr.result
+         || ((sym->attr.result || sym->attr.value)
              && sym->ns->proc_name
              && sym->ns->proc_name->attr.is_bind_c)
          || (sym->ts.deferred && (!sym->ts.u.cl
                                   || !sym->ts.u.cl->backend_decl))))

It makes the tree for that argument now be:

 <integer_type 0x1073fc348 character(kind=1) public unsigned QI
    size <integer_cst 0x1073bcdb0 type <integer_type 0x1073fc0a8 bitsizetype> constant 8>
    unit-size <integer_cst 0x1073bcdc8 type <integer_type 0x1073fc000 sizetype> constant 1>
    align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x1073fc348 precision:8 min <integer_cst 0x1073bcde0 0> max <integer_cst 0x1073bcd80 255>
    pointer_to_this <pointer_type 0x1073fff00>>

which is looking good. It fixes the reduced testcase, but not the original one (with more arguments). On the full testcase, I still get a failure.

I'm tired and need to sleep, I won't make more progress on this :(

@fxcoudert
Contributor Author

OK, I'm diving back in. The testcase is still:

$ cat gee.f90    
subroutine param_test (a, b, c, d, e, f, g, h, v, w) bind(c)
  use, intrinsic :: iso_c_binding
  implicit none
  integer(c_int), value :: a, b, c, d, e, f, g, h
  character(kind=c_char, len=1), value :: v
  integer(c_signed_char), value :: w

  if (v /= 'y') call foo
  print *, 'v OK'
  if (w /= 42) call foo
  print *, 'w OK'
end subroutine

$ cat gee.c  
void param_test (int a, int b, int c, int d, int e, int f, int g,
                 int h, 
		 char v, signed char w);

void foo_ (void) {
  __builtin_abort();
}

int main (void) {
  param_test (1, 2, 3, 4, 5, 6, 7, 8, 'y', 42);
}

Now, with my patch, the trees being used for the types of the last three arguments are:

 <integer_type 0x1074705e8 integer(kind=4) public SI
    size <integer_cst 0x107430f00 type <integer_type 0x1074700a8 bitsizetype> constant 32>
    unit-size <integer_cst 0x107430f18 type <integer_type 0x107470000 sizetype> constant 4>
    align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x1074705e8 precision:32 min <integer_cst 0x107430eb8 -2147483648> max <integer_cst 0x107430ed0 2147483647>
    pointer_to_this <pointer_type 0x107471a40>>

 <integer_type 0x107470348 character(kind=1) public unsigned QI
    size <integer_cst 0x107430db0 type <integer_type 0x1074700a8 bitsizetype> constant 8>
    unit-size <integer_cst 0x107430dc8 type <integer_type 0x107470000 sizetype> constant 1>
    align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x107470348 precision:8 min <integer_cst 0x107430de0 0> max <integer_cst 0x107430d80 255>
    pointer_to_this <pointer_type 0x107473f00>>

 <integer_type 0x107473a68 integer(kind=1) public QI
    size <integer_cst 0x107430db0 type <integer_type 0x1074700a8 bitsizetype> constant 8>
    unit-size <integer_cst 0x107430dc8 type <integer_type 0x107470000 sizetype> constant 1>
    align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x107473a68 precision:8 min <integer_cst 0x1074310f8 -128> max <integer_cst 0x107431170 127>>

All three are integer_type, respectively SI, unsigned QI, and QI modes. In particular, the v argument has size 8, unit-size 1, align 8. I don't see any red flag in there.

Yet, compiling with -fdump-rtl-expand gives:

(insn 9 8 10 2 (set (mem/c:SI (plus:DI (reg/f:DI 87 virtual-stack-vars)
                (const_int -560 [0xfffffffffffffdd0])) [3 h+0 S4 A32])
        (reg:SI 7 x7 [ h ])) "gee.f90":1:21 -1
     (nil))
(insn 10 9 11 2 (set (reg:SI 93)
        (mem/c:SI (reg/f:DI 86 virtual-incoming-args) [9 v+0 S4 A128])) "gee.f90":1:21 -1
     (nil))
(insn 11 10 12 2 (set (reg:QI 94)
        (subreg:QI (reg:SI 93) 0)) "gee.f90":1:21 -1
     (nil))
(insn 12 11 13 2 (set (mem/c:QI (plus:DI (reg/f:DI 87 virtual-stack-vars)
                (const_int -564 [0xfffffffffffffdcc])) [9 v+0 S1 A32])
        (reg:QI 94)) "gee.f90":1:21 -1
     (nil))

which I think is wrong: it seems to treat v as reg:SI, which is not what we want. What I don't know is: what information is used from the tree that triggers this? I don't know where to go fishing for that information…

@iains
Owner

iains commented Dec 26, 2021

Yes, that looks equally wrong to the previous version despite that the type is now right.

I assume that the offset is still wrong in the asm too.

  • but the only real difference is:
align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x107470348 precision:8 min <integer_cst 0x107430de0 0> max <integer_cst 0x107430d80 255>

c.f.

align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x107473a68 precision:8 min <integer_cst 0x1074310f8 -128> max <integer_cst 0x107431170 127>

Where, in the gimple pipeline are you looking at the arg types?
(there is also the situation that a type T might be passed as a different one if the ABI requires it)

@iains
Owner

iains commented Dec 26, 2021

I set a breakpoint on expand_function_start and stepped through the parms:

 <parm_decl 0x105ab3600 v
    type <integer_type 0x105916348 character(kind=1) public unsigned QI
        size <integer_cst 0x105902db0 constant 8>
        unit-size <integer_cst 0x105902dc8 constant 1>
        align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x105916348 precision:8 min <integer_cst 0x105902de0 0> max <integer_cst 0x105902d80 255>
        pointer_to_this <pointer_type 0x105926000>>
    addressable used unsigned QI test.f90:1:21 size <integer_cst 0x105902db0 8> unit-size <integer_cst 0x105902dc8 1>
    align:8 warn_if_not_align:0 context <function_decl 0x105ac4300 param_test>
    arg-type <integer_type 0x105916690 character(kind=4) public unsigned SI
        size <integer_cst 0x105902f00 constant 32>
        unit-size <integer_cst 0x105902f18 constant 4>
        align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x105916690 precision:32 min <integer_cst 0x105902f30 0> max <integer_cst 0x105902ee8 4294967295>
        pointer_to_this <pointer_type 0x1059260a8>> chain <parm_decl 0x105ab3680 w>>

So something in the middle end has decided that this needs to be passed as an int
possibly because 255 cannot be represented in a signed char?
(but then we should expect everything to work if we change that type)

@fxcoudert
Contributor Author

I don't know how to look at the gimple in the middle-end (at least, outside of the -fdump options which don't give a full output), so I watch what the front-end emits. In this case, I'm looking at the tree produced by generate_local_decl and gfc_sym_type.

But I looked further down the front-end, and found this, which gave me a “oh shit” moment.

      /* Modify the tree type for scalar character dummy arguments of bind(c)
         procedures if they are passed by value.  The tree type for them will
         be promoted to INTEGER_TYPE for the middle end, which appears to be
         what C would do with characters passed by-value.  The value attribute
         implies the dummy is a scalar.  */
      if (sym->attr.value == 1 && sym->backend_decl != NULL
          && sym->ts.type == BT_CHARACTER && sym->ts.is_c_interop
          && sym->ns->proc_name != NULL && sym->ns->proc_name->attr.is_bind_c)
        gfc_conv_scalar_char_value (sym, NULL, NULL);

and then this:

void
gfc_conv_scalar_char_value (gfc_symbol *sym, gfc_se *se, gfc_expr **expr)
{
  if (sym->backend_decl)
    {
      /* This becomes the nominal_type in
         function.c:assign_parm_find_data_types.  */
      TREE_TYPE (sym->backend_decl) = unsigned_char_type_node;
      /* This becomes the passed_type in
         function.c:assign_parm_find_data_types.  C promotes char to
         integer for argument passing.  */
      DECL_ARG_TYPE (sym->backend_decl) = unsigned_type_node;

      DECL_BY_REFERENCE (sym->backend_decl) = 0;
    }

And if I print this sym->backend_decl before and after this stupid function:

## Before:
 <parm_decl 0x107910600 v
    type <integer_type 0x107770348 character(kind=1) public unsigned QI
        size <integer_cst 0x107730db0 constant 8>
        unit-size <integer_cst 0x107730dc8 constant 1>
        align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x107770348 precision:8 min <integer_cst 0x107730de0 0> max <integer_cst 0x107730d80 255>
        pointer_to_this <pointer_type 0x107773f00>>
    used unsigned QI gee.f90:1:21 size <integer_cst 0x107730db0 8> unit-size <integer_cst 0x107730dc8 1>
    align:8 warn_if_not_align:0 context <function_decl 0x107921300 param_test> arg-type <integer_type 0x107770348 character(kind=1)> chain <parm_decl 0x107910680 w>>

## After:
 <parm_decl 0x107910600 v
    type <integer_type 0x107770348 character(kind=1) public unsigned QI
        size <integer_cst 0x107730db0 constant 8>
        unit-size <integer_cst 0x107730dc8 constant 1>
        align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x107770348 precision:8 min <integer_cst 0x107730de0 0> max <integer_cst 0x107730d80 255>
        pointer_to_this <pointer_type 0x107773f00>>
    used unsigned QI gee.f90:1:21 size <integer_cst 0x107730db0 8> unit-size <integer_cst 0x107730dc8 1>
    align:8 warn_if_not_align:0 context <function_decl 0x107921300 param_test>
    arg-type <integer_type 0x107770690 character(kind=4) public unsigned SI
        size <integer_cst 0x107730f00 constant 32>
        unit-size <integer_cst 0x107730f18 constant 4>
        align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x107770690 precision:32 min <integer_cst 0x107730f30 0> max <integer_cst 0x107730ee8 4294967295>
        pointer_to_this <pointer_type 0x107780000>> chain <parm_decl 0x107910680 w>>

@fxcoudert
Contributor Author

So the front-end is emitting code for that argument with a TREE_TYPE of unsigned char and a DECL_ARG_TYPE of unsigned int. According to the comments, it's intentional, and probably a misunderstanding of the C rules (which, as far as I remember, only apply to functions without prototypes, or varargs, isn't that right?)

@iains
Owner

iains commented Dec 26, 2021

hmm.. well the comment about what C would do does not appear to agree with what either GCC or clang's C compilers actually implement - perhaps there's some misinterpretation of promotion rules that would be applied to an un-prototyped function (where things are indeed promoted)?

and indeed - there is the SI arg-type - so it's not a ME thing

@fxcoudert
Contributor Author

OK, removing that stupid line works, and makes the test pass. I'll follow up with the Fortran maintainers.

I'm wondering why it's allowed to have a type with TREE_TYPE that is something different from DECL_ARG_TYPE, to be honest. Is there a legitimate use to that weird combination?

@iains
Owner

iains commented Dec 26, 2021

Yes, totally legal - it corresponds to the situation in which a type cannot be passed naturally and has to be passed as something different according to ABI.

I guess it might be necessary to read the C std on promotion rules (I am only going from [possibly faulty] memory that the C-(rather than ABI) promotions happen when the destination type is unknown [so for unprototyped and maybe K&R style functions])

@fxcoudert
Contributor Author

@iains my memory is the same, and what I've found in 5 minutes of reading confirms that.

Thanks a lot for your help, and sorry that it wasn't a target bug after all…

@fxcoudert
Contributor Author

“Funnily” enough, the original PR that introduced this feature was https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32732
At the time (in 2007), the testcase introduced was failing on IA64 HP-UX, and only on that platform. Nevertheless, the bug was closed as fixed.

@iains
Owner

iains commented Dec 26, 2021

well, I suppose that one could add Tobias to the BZ and/or just post the patch if there are no regressions on darwin / linux (it is possible to test Solaris on the cfarm - gcc211 is what I use).

@fxcoudert
Contributor Author

I've just finished a clean patch for Fortran, with a couple more testcases as well (for those delicate targets…). Regtesting right now on aarch64-apple-darwin. But it seemed to pass on solaris before, so maybe it's delicate in a different way

@iains
Owner

iains commented Dec 26, 2021

I suppose a BE target (e.g. gcc110) would be wise - that could very well show progressions if the ABI mismatches by placing the char at [SP+3] instead of [SP] (which would be the consequence of promoting to int on a BE machine)

@fxcoudert
Contributor Author

Closing the report, it's moved to bugzilla, and I'll submit a patch later today.

@fxcoudert
Contributor Author

Regarding signedness, I've concluded that there is no actual wrong caused by the current implementation. The Fortran standard, in its C interoperability section, actually makes some interesting notes on how Fortran does not really care about signedness and can interoperate with all of char, unsigned char and signed char in the same way. We only ever pass or return arguments, by value or by address. But internally, we can use whatever we want (and most Fortran operations on character values outside the ASCII range are noted to be implementation-defined anyway).

@fxcoudert
Contributor Author

@iains
Owner

iains commented Dec 26, 2021

Yeah, from my Fortran days (f77 ;) ) I recall that it does not really have the concept of unsigned integers. So I suppose that any juggling needs to be done in the C bindings.
