Variadic function argument lists#115
Conversation
dkl
left a comment
There was a problem hiding this comment.
Some big changes, good job. It's also how I thought cva_* should be implemented. I left some comments here and there; only the cva_arg() behaviour isn't entirely clear to me yet.
src/compiler/ast-node-macro.bas
Outdated
| byval n as ASTNODE ptr _ | ||
| ) as IRVREG ptr | ||
|
|
||
| dim as ASTNODE ptr o = any |
There was a problem hiding this comment.
Yes, this will be removed.
src/compiler/ir-hlc.bas
Outdated
| ctx.exprtext += ")" | ||
|
|
||
| case AST_OP_VA_END | ||
| '' cva_start(l) := __builtin_va_end(l) |
There was a problem hiding this comment.
(probably meant cva_end() instead of cva_start() in this comment)
src/compiler/ir-hlc.bas
Outdated
| ctx.exprtext += ")" | ||
|
|
||
| case AST_OP_VA_COPY | ||
| '' cva_start(l, r) := __builtin_va_copy(l, r) |
There was a problem hiding this comment.
(probably meant cva_copy() instead of cva_start() in this comment)
| end select | ||
| end if | ||
|
|
||
| function = n |
There was a problem hiding this comment.
(It seems that the return value isn't used, so it's better to remove it to avoid confusion)
There was a problem hiding this comment.
Yes, since the ASTNODE passed is modified, will make this a sub procedure.
| end if | ||
|
|
||
| if( cSymbolType( dtype, subtype ) = FALSE ) then | ||
| return NULL |
There was a problem hiding this comment.
(maybe should do some error recovery in this case, at least delete expr)
There was a problem hiding this comment.
Looks like I was all over the place with the error handling. Error handling will be cleaned up through in next update.
| expr1 = cExpression() | ||
|
|
||
| if( hCheckForValistCompatibleType( expr1, FALSE ) = FALSE ) then | ||
| return FALSE |
There was a problem hiding this comment.
(should delete expr1 in the error case here)
| expr1 = cExpression() | ||
|
|
||
| if( hCheckForValistCompatibleType( expr1, FALSE ) = FALSE ) then | ||
| return FALSE |
There was a problem hiding this comment.
(should delete expr1 in the error case here)
| expr2 = cExpression() | ||
|
|
||
| if( hCheckForValistCompatibleType( expr2, TRUE ) = FALSE ) then | ||
| return FALSE |
There was a problem hiding this comment.
(should delete expr1 + expr2 in the error case here)
|
|
||
| else | ||
| '' pointer expression: assignment | ||
| astAdd( astNewASSIGN( expr1, expr2 ) ) |
There was a problem hiding this comment.
Similar case for the ASSIGN here - cVarOrDeref() should be used to ensure expr1 is an lvalue
There was a problem hiding this comment.
Going to use CVarOrDeref thoughout.
src/compiler/symb-keyword.bas
Outdated
|
|
||
| case CVA_LIST_BUILTIN_C_STRUCT | ||
| '' cva_list is __builtin_va_list from C definition: | ||
| '' typedef struct { |
There was a problem hiding this comment.
Is it the same for all targets? (e.g. ARM?)
There was a problem hiding this comment.
I see, I didn't pay any attention to ARM. This should be a ANY alias "char" ptr for ARM 32-bit bit so the selection of builtin type needs to check the arch as well. My tests have been on x86 & x86_64 only for win/lin.
|
This last update should address almost all the comments:
The cva_arg() behaviour: For __builtin_va_arg in gcc, we just need to give it something that is compatible. So far, I have just been trying to give it appropriate types. For example in astLoadMACRO() there is some code to promote types. Kind of like what has to be done with astNewARG()/hCheckParam()/hCheckVarargParam(). Maybe this promotion can be shifted to the parser or emitter. Without promotion can get this in gcc emitted code: For the cva_* pointer expressions, I am basing the expressions from these plain old 32-bit (or lower) C defines: So for
Now if ap itself has side-effects, I don't know, should that kind of expression even be allowed? Honestly, I only ever used va_list types directly with va_* macros in C and have not needed to do anything more complicated than that. So maybe just need to check for side effects and disallow. I tried a couple of tests, trying to come up with something that might make sense. Looks like other problems to fix: |
|
Well, I missed something, or there's a regression and I didn't have a test written. On linux x86_64, the cva_list parameter type emitted is not correct when using an extern "C" function like vsprintf, it's a mismatch. Missing a pointer level indirection .. or something. Will probably take a few days to sort it out. |
|
Made some progress, linux x86-64 still giving me some trouble with passing cva_list types as paramaters of varying levels of indirection, so I have not added those tests yet. Otherwise, this last update: dtype information was being discarded too early, so pointer levels didn't match
mapping cva_list type in fbc to __builtin_va_list in gcc, (any ptr, fbc to struct{}[1], etc)
stuff I couldn't fix, so I disabled:
stuff I tested, but haven't added tests
new problems:
Some possible solutions:
|
|
Last update cleans up some special cases but not all:
Remaining Hacks:
Remaining Work:
|
|
I think this change could be usable, but also please read following: The big change in a57b82a is that the full dtype is passed to the IR. I was struggling with remapping the types until this change. This change now allows IR-HLC to make more decisions about the symbol types. For IR-TAC, since it doesn't know what to do with the extra bits, they get filtered out after getting passed to EMIT.
IR-HLC does a good job now of mapping fbc's cva_list tyepdef to __builtin_va_list. But there are 2 defects, I am thinking that since we are exposing gcc's _builtin_va* almost directly, we should follow what gcc expects and document the differences. I'm not sure it makes sense doing something else, since we would be introducing behaviour, that while it might appear to work, I don't think C spec guarantees that it must.
|
|
(Firstly this is very cool, I'm happy you're working on this! Lack of varargs support in the gcc backend has been annoying) Forgive my ignorance (I haven't read this PR in detail), but why do you want to support returning a va_list from a function? Because va_list can be an array it isn't allowed to return it by value in C anyway, as you said. But even if you could (depending on the platform), it seems the only use for that would to return a va_list from a function that was passed to it in the first place. (Also the man page states "Each invocation of va_copy() must be matched by a corresponding invocation of va_end() in the same function", so it can't even be a copy of a va_list that was passed in, it must be the same one.) Which appears to make returning a va_list rather useless, even though you may be free to allow things in FB that C doesn't. BTW, I found this interesting reference to part of the C standard about passing va_lists by value vs by reference: |
|
@rversteegen, thanks, I've probably read that SO question before, along with many others. See also https://stackoverflow.com/questions/10648337/ Yeah, I'm convinced that it should be an error in fbc too, at least for linux x86_64, or any platform using va_list typed as an array. Practically, if we want to help users write platform independent code, and because of how the va_* API is specified for C, va_list should never be returned by value as a function return. It's only because of the underlying implementation on some platforms (win/lin 32bit & win 64bit) that I know it will work returned by value. True, va_start|copy & va_end must be in same function, but I can't find any specific restriction on va_arg(). It should be expected: pass a va_list to a function, use va_arg() on it and pass back a modified va_list. It seems reasonable that va_list could be returned through a function return using byref or ptr on all platforms. |
|
Well, according to the quote from the C standard in the SO answer I linked to: you can pass a va_list either by value or by reference to a function and use it inside that function (pass it to va_arg). If you passed it by value, you can't use the va_list anymore in the calling function (you must call va_end on it), while if you passed a pointer to it, you can continue to use it (call va_arg on it) in the calling function. I'd suggest making returning cva_list (by value) an error on all platforms, not just those where its required. Also, I know that you skipped llvm support, but it's interesting to see what llvm does. |
dkl
left a comment
There was a problem hiding this comment.
Looks good to me. This is way more difficult than I thought.
| n = astNewDeref( n, typeAddrOf( typeJoinDtOnly( n->dtype, FB_DATATYPE_VA_LIST ) ), astGetSubType( n ) ) | ||
| if( astIsVAR( n ) ) then | ||
| if( symbIsParamByval( astGetSymbol( n ) ) ) then | ||
| exit select |
There was a problem hiding this comment.
looks like exit select in the wrong place
There was a problem hiding this comment.
The first exit select is actually correct. The code immediately following was something I was trying to do but it didn't work: Here's what it should look like, with comments:
case FB_CVA_LIST_BUILTIN_C_STD
if( astIsVAR( n ) ) then
if( symbIsParamByval( astGetSymbol( n ) ) ) then
'' The dtype & subtype tell us that this is a
'' C struct array. Only the pointer to the
'' array is acutally passed by value. Taking
'' the address of this symbol actually gives
'' us the address of the passed-by-value-pointer
'' to the the array instead and we don't want that.
'' We also can't, deref the symbol directly,
'' because C won't allow deref on an array.
''
'' Give up and let gcc backend use the param var
'' symbol name as-is. The mangle modifier will
'' let C backend know it's the va_list type,
'' and exprNewVREG() won't try to deref it.
exit select
end if
end if
'' for anything else, replace the dtype here and let backend
'' handle it (though there are some casts in exprNewVREG that
'' could be solved out, TODO). Taking the address of array
'' variable (in C) is the same address as just referencing
'' the array name. e.g. "int a[10]", a = &a = &(a[0]), but
'' different pointer types
'' cast( __builtin_va_list, expr )
n->dtype = typeJoinDtOnly( n->dtype, FB_DATATYPE_VA_LIST )
In theory should work for any variable and skip any casting, not just byval param variables.
| '' use expr as-is, no side effects expected... | ||
|
|
||
| case AST_NODECLASS_DEREF | ||
| '' copy the reference to a variable, to prevent side effects |
There was a problem hiding this comment.
maybe astRemSideFx() can be re-used to build the assignment/temp var
There was a problem hiding this comment.
Yeah, I saw astRemSideFx(), astMakeRef, hGetExprRef, and this new code for cva_arg() all kind doing similar things. I'll have another look.
src/compiler/symb-keyword.bas
Outdated
|
|
||
| '' type cva_list as __va_list_tag alias "__builtin_va_list" | ||
| '' subtype mangle modifier | ||
| s = symbCloneSymbol( s ) |
There was a problem hiding this comment.
What's the purpose of the cloning here? I couldn't find anything still using the original symbol, so it seemed unused. In cMangleModifier() it makes more sense to me; it probably has to avoid marking the original type with the FB_UDTOPT_ISVALISTSTRUCT* flags because otherwise it would be emitted incorrectly.
There was a problem hiding this comment.
I cloned the symbol there so:
- that the symbKeywordTypeInit() was doing exactly the same thing as if the typedefs were declared in an include file, and
- that it was possible to refer to the original symbol '__va_list_tag.' and not have it map to the __builtin_va_list type.
I have just made a few tests, and see that this cloning doesn't work so great. I tried an earlier version where cMangleModifier back patched the original symbol, but that didn't seem right.
So a few more changes yet to come.
|
Last update is coming, hopefully can bring this to a close this weekend:
Following is a summary of what we've done here: The dtype/subtype ( The mangle modifiers and What we have, is a For me, gcc linux x86_64 is the "difficult" one, compared to the other backend va_list types, since in gcc va_list is typed as a struct array. fbc doesn't have a data type for fixed length struct array. I am suggesting documenting the differences; fbc's PARSER/AST does not give exact translation to the copy/assign semantics of the va_list type in gcc. The main difference is when assigning by value or passing arguments by value of a struct array type. If we really want to preserve the assign or pass by value, can either refer to the array element directly on the backend, for example Bottom line is, there is a lot more we could do with the For the -gen llvm backend, I need help from someone that knows more. In total I have probably only given about 50 hours in last year to making llvm back end work. Mostly trying to find a version of I think we have an immediate solution for va_list type, with llvm effort to be continued later, and we have some options for future development that won't break user code. Not perfect, but probably good enough to merge in. Updates are coming... |
|
I've add or updated a number of comments that should explain how most of this works within the compiler source code. I think this is as far as I can go with this feature for now. To gain a little speed or flexibility in the compiler, next steps would be to look at reworking FBSYMBOL's attribs, but that's more in depth than I am prepared to go just now. It's possible the astRemSideFx() and friends could be combined, or more cases added to handle more scenarios. There are a few bugs on sf.net related to side effect handling, so I expect it will get revisited when those bugs are looked at. We should see the C variadic function feature immediately usable on all 32-bit targets, win64, linux-x86_64. ARM64 might work but it needs to be tested. For any unix and *bsd targets, may need to adjust the logic that selects the default cva_list type. The work around should be only to I think this pull request is ready to merge in. |
|
While I was working on the documentation for this feature (a few new pages for the wiki, though I have not uploaded any of them to the wiki), I was also reviewing the changes made in this pull request and have a couple of reservations, but only because next thing I want to do is put out a 1.06 release. This change for varargs makes some aggressive internal changes to the compiler, especially when dealing with types, and leaves a few loose ends on the TODO list, so it might be wiser to make this the first merge after a release, which will be soon. On the other hand, if I get the builds working mostly automatic, I could probably deal with re-release if something went horribly wrong due to this feature. |
…d symb_dtypeTB (in symb-data.bas)
…copy using gcc's __builtin_va_*
On targets that implement va_list as a pointer, allow overriding gcc __builtin_va* macros with our own expressions
…ndle uninitialized __builtin_va_list directly
- determine va_list based on data type instead of backend - add mangling for va_list struct array types - enable cpp mangle tests for va_list passed by ptr or ref - add va_cva_api.bas tests for cva_list byval/byref/ptr arguments & udt/array - add crt/varargs.bas test to call vsprintf
- rename existing `vregDump() as string` to `vregDumpToStr() as string` - add vregDump() entry point
varargs: guard against null SYM when in a alias name lookup
- pass full types to emitters and let emitter decide to use it or discard it - dtype info dropped was being dropped before it gets to emitter - ir-tac can't handle full dtypes, filter out anything but dtype/ptr
- allow returning byref as ANY if it has the cva_list mangle modifier - add test for cva_list return byval/byref/ptr - __builtin_va_list not allowed as a return value in linux x86_64 - handle side effects in cva_arg() - varargs: don't addrof/deref byval CVA_LIST parameter symbols
…ias names - add symbCloneSimpleStruct() to create clones of simple UDT symbols - in cMangleModifier(), "__builtin_va_list" and alias "__builtin_va_list[]" create a clone of the struct symbol and set options FB_UDTOPT_ISVALISTSTRUCT & FB_UDTOPT_ISVALISTSTRUCTARRAY - in FBS_STRUCT, increase size of options field to long
- re-use astRemSideFx() & astMakeRef() in cVALISTFunct() when handling cva_arg() - add code comments - don't clone va_list structure, just back-patch the original struct. That's most stable for now for use with va_list type, though the behaviour is quirky. Date: Sun Dec 16 09:19:56 2018 -0500
This update adds support for variadic function argument lists for all targets by adding the following:
cva_list is a typedef symbol, and cva_start, cva_arg, cva_end, cva_copy are added as keywords.
#undefcan be used to remove any of these from the global namespace.the
alias "modifier"syntax was added to specify in fbc source code how data types are to be handled/emitted in the backends, in particular name mangling, and mapping to __builtin_va_list for gcc.To implement the the cva_* forms:
Experimental:
-z valist-as-ptrcommand line option. The cva_* expressions emitted for 32-bit gas backend should also work in 32-bit gcc backend. This command line option causes the expressions to be emitted even if backend is gcc.Known issue:
__builtin_va_listis typed as an struct array in gcc on linux 64bit, so to match mangling in c++ need to mangle in the "A1_" array type to the name.Overall, I think it's a good starting point for varargs: more testing needed, of course. All the trivial types should work, though need to pay close attention to promotions and alignments.
( ref: sf.net bug https://sourceforge.net/p/fbc/bugs/881/ )
I'll leave it to countingpine to comment on
#dump&#odumppp statements. The debug information available from fbc is handy. I call the dump procs directly from gdb. But, having something builtin to fbc might let just about anyone investigate code that appears to be buggy.