Support the vararg calling convention in C# #37

Closed
sharwell opened this issue Jan 19, 2015 · 32 comments

Comments

@sharwell
Member

The Virtual Execution System defined in ECMA-335 defines a vararg calling convention for methods which is not accessible from C#. As mentioned by @jaredpar in issue #36, both of the following signatures incur memory allocations for the purpose of passing parameters, even if those parameters are value types:

WriteLine(params object[] args)
WriteLine(params IEnumerable<object> args)

For applications which are sensitive to memory allocations but still wish to provide API methods which take a variable number of arguments, I propose including the following feature in addition to the more "friendly" methods taking object[] or IEnumerable<object>.

WriteLine(params ArgIterator args)

A method containing a final params ArgIterator argument would be emitted using the vararg calling convention. Access to the arglist IL instruction is implicit via C# code which accesses this argument.
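To illustrate the allocation cost described above, here is a minimal sketch that measures how many bytes a single call to a params object[] method allocates. The names ParamsAllocationDemo, CountArgs and MeasureOneCall are hypothetical and not part of the proposal; the sketch assumes a runtime that provides GC.GetAllocatedBytesForCurrentThread.

```csharp
using System;
using System.Runtime.CompilerServices;

static class ParamsAllocationDemo
{
    // Hypothetical helper. NoInlining keeps the JIT from optimizing
    // the argument array away at the call site.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int CountArgs(params object[] args) => args.Length;

    // Returns the number of bytes allocated on the current thread by one call.
    public static long MeasureOneCall()
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        CountArgs(1, 2, 3); // allocates an object[3] and boxes each int
        long after = GC.GetAllocatedBytesForCurrentThread();
        return after - before;
    }
}
```

The exact byte count depends on the runtime and platform, so the sketch only reports the difference rather than assuming a particular value.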

@mburbea

mburbea commented Jan 19, 2015

Isn't varargs already available via the __arglist keyword? And unless I'm mistaken this method was much slower than taking a params object[] but probably more memory efficient.

@jaredpar
Member

Why have this as a separate issue? Performance and usability are related discussions. Having them as two separate issues is going to make the conversation harder, not easier.

@sharwell
Member Author

@mburbea Two things

  1. I wasn't aware of that, so ✨ to you.
  2. That doesn't really count as "being part of C#", since it's not mentioned at all in the language specification and therefore could differ in behavior between implementations.

@mburbea

mburbea commented Jan 19, 2015

Yeah, it, like __makeref, __reftype and __refvalue, is part of a set of secret keywords in VS. I'm not sure if Mono, Roslyn or other compilers support them. They are completely undocumented, but my guess is that this is how the secret overloads of Console.WriteLine and string.Concat were written. I'm assuming that in Managed C++ for .NET 1.1 there were some more efficient means of dealing with these varargs methods, but even they were pretty much deprecated, and it has always been nearly impossible to find anything on this feature set of the CLR. I'm still annoyed there is no version of ILGenerator.EmitCall that works with constructors with varargs.

@HaloFour

The problem with __arglist is that it is painfully slow. But that is the official mechanism in .NET for supporting variable arguments. It really only exists for the purposes of supporting varargs methods via P/Invoke, e.g.:

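// wsprintf is a cdecl varargs function, so the calling convention is stated explicitly.
[DllImport("user32.dll", CallingConvention = CallingConvention.Cdecl)]
static extern int wsprintf([Out] StringBuilder lpOut, string lpFmt, __arglist);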

See also: System.ArgIterator

@mburbea

mburbea commented Jan 19, 2015

Is it __arglist or the ArgIterator struct that is slow? I've never really seen anything conclusive on the matter. And I've read, though never seen, that in C++ you apparently could avoid the iterator entirely.

@mikedn

mikedn commented Jan 19, 2015

Access to an __arglist is complicated because it requires checks to ensure type safety. varargs functions get a hidden argument which is used to pass some sort of description of the arguments: count and types. ArgIterator has to parse that description, and it does so via calls to runtime helpers, so accessing a single argument from the list ends up being more expensive than reading an element from an array.

@sharwell
Member Author

@mikedn Suppose for example you have the following method:

int Sum(params int[] values)
{
  int sum = 0;
  for (int i = 0; i < values.Length; i++)
  {
    sum += values[i];
  }

  return sum;
}

Now suppose you want to implement the same method without requiring the creation of an array. I believe you could use this:

int Sum(params ArgIterator values)
{
  int sum = 0;
  int count = values.GetRemainingCount();
  for (int i = 0; i < count; i++)
  {
    sum += __refvalue(values.GetNextArg(), int);
  }

  return sum;
}

I have not checked to determine at which point the second method is slower than the first.

@mikedn

mikedn commented Jan 19, 2015

I believe the working version for the varargs looks like the following, at least that's what I tested:

static int Sum(__arglist)
{
    var value = new ArgIterator(__arglist);
    var count = value.GetRemainingCount();
    var sum = 0;
    for (int i = 0; i < count; i++)
        sum += __refvalue(value.GetNextArg(), int);
    return sum;
}

1,000,000 Sum(__arglist(1, 2, 3)) calls take 288ms on my machine.

Can you guess how much time the params int[] version takes?

It takes 10ms. No comment 😄
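For anyone wanting to reproduce the comparison, here is a rough sketch of how the params int[] side of such a measurement could be timed with Stopwatch. The class and method names (SumTiming, TimeParamsSum) are hypothetical and this is not the code used for the numbers above. The __arglist variant is left out because it only compiles with the Microsoft C# compiler and only runs on runtimes that support the vararg calling convention.

```csharp
using System;
using System.Diagnostics;

static class SumTiming
{
    // Same shape as the params int[] method discussed earlier in the thread.
    public static int Sum(params int[] values)
    {
        int sum = 0;
        for (int i = 0; i < values.Length; i++)
        {
            sum += values[i];
        }
        return sum;
    }

    // Hypothetical harness: calls Sum(1, 2, 3) 'iterations' times and
    // returns the accumulated total together with the elapsed time.
    public static (int total, TimeSpan elapsed) TimeParamsSum(int iterations)
    {
        var stopwatch = Stopwatch.StartNew();
        int total = 0;
        for (int i = 0; i < iterations; i++)
        {
            total += Sum(1, 2, 3);
        }
        stopwatch.Stop();
        return (total, stopwatch.Elapsed);
    }
}
```

Calling SumTiming.TimeParamsSum(1000000) gives the elapsed time for one million calls. A single Stopwatch run like this is only a rough indication; results vary with the machine, the runtime, and JIT warm-up.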

@tmat
Member

tmat commented Jan 19, 2015

Yes, __arglist is legacy, not-optimized and shall not be used. It's also not portable.

@mburbea

mburbea commented Jan 20, 2015

I just tested, and the slow part seems to be the ArgIterator. Calling a method with __arglist on its own is actually pretty fast; instantiating the iterator is horribly slow. It seems to call some native code on instantiation that does all the handling. Unfortunately, C# is extremely cautious about letting you mess with the RuntimeArgumentHandle: it won't let you box it or convert it to a ValueType. I imagine that if you can extract the pointer, you might be able to parse the arguments in a wildly unsafe manner, if you know what you're doing. I suppose with some IL you might be able to pull it off, but C# isn't convinced that you can write a delegate that takes an __arglist. Edit: You can write a delegate that takes a RuntimeArgumentHandle. It can't be generic, though, since these handles can't be boxed and thus can't be used as a generic argument.
Here is a method to get the PointerValue if anyone wants to play...

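// Requires: using System; using System.Reflection; using System.Reflection.Emit;
public delegate IntPtr GetPtrType(RuntimeArgumentHandle a);

// Static so that the static GetThePtr method below can call it.
public static readonly GetPtrType GetPtr = new Func<GetPtrType>(() =>
{
    var dynamicMethod = new DynamicMethod("name", typeof(IntPtr),
        new[] { typeof(RuntimeArgumentHandle) }, true);
    var ilgen = dynamicMethod.GetILGenerator();
    // Ldarga_S takes a one-byte operand, so the argument index is passed as a byte.
    ilgen.Emit(OpCodes.Ldarga_S, (byte)0);
    // get_Value is a non-public accessor, i.e. a runtime implementation detail.
    ilgen.Emit(OpCodes.Call,
        typeof(RuntimeArgumentHandle).GetMethod("get_Value", BindingFlags.NonPublic | BindingFlags.Instance));
    ilgen.Emit(OpCodes.Ret);
    return (GetPtrType)dynamicMethod.CreateDelegate(typeof(GetPtrType));
}).Invoke();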

I'm not sure what you can do with it but maybe somebody can write some unsafe code to parse out values.

Usage

static IntPtr GetThePtr(__arglist)
{
    return GetPtr(__arglist);
}

@mikedn

mikedn commented Jan 20, 2015

Yes, the call itself is fast; the arguments are simply pushed on the stack like any other call arguments. The problem is extracting the arguments from the list: that requires parsing the signature used at the call site, and that's done via calls to runtime helper functions.

Getting the pointer won't help, you need to know the format of that signature. In general, any attempt at extracting the arguments from the pointer is a recipe for disaster.

@mburbea

mburbea commented Jan 20, 2015

Is there any way to get that data via stack crawls? Or is that what is so horribly slow?

@mikedn

mikedn commented Jan 20, 2015

Stack crawls as in System.Diagnostics.StackTrace?

Even if it were possible, how would that help? You'll end up allocating at least one object, and if you're going to do that then you may as well use the params int[] version.

@xen2

xen2 commented Jan 22, 2015

Probably too much to change/implement, but maybe that could work.

Just wondering:

  • C++/CLI allows you to have a reference on the stack (using %), avoiding a GC allocation (it is only on the stack).
  • C++/CLI doesn't allow you to have an array/string on the stack (error C3699: '%' : cannot use this indirection on type 'cli::array<System::String ^,1>')

However, since the array size is constant, it would probably be fine to allow it and use the stack (even with a dynamic size, _alloca() could work for small values; maybe an if (arraySize < 16) _alloca else GC_alloc()).

Then, any small fixed size array that is not shared (as the ones used with params) could be allocated directly on the stack, avoiding any allocation.

Stack allocated objects would then need to be mapped to C# somehow (maybe using same syntax, MyType%?).

It could also open the door to many other optimizations.

@mikedn

mikedn commented Jan 22, 2015

An array is a reference type, and reference types can never be allocated on the stack because you may end up with references to objects that have been deallocated.

Stack allocation for reference types is possible as a JIT compiler optimization: it could stack allocate if it finds that the array reference is never stored in the heap or another place that would allow the use of the object after the function returns. Unfortunately this requires some rather expensive analysis, and it's more likely to see such an optimization in an AOT compiler like .NET Native rather than in the traditional JIT compiler.

@xen2

xen2 commented Jan 22, 2015

I agree that this is not safe, but it is definitely possible (as an example, C++/CLI allows you to do so, so it can be expressed in terms of MSIL).

Of course, it should only be allowed inside an unsafe block.

Or better, have something similar to https://wiki.gnome.org/Projects/Vala/ReferenceHandling : have an "owned" pointer to make it safe (easy to know if somebody borrows a reference).

And as you said, it can also be an automatic optimization when it can be proven to be safe.

@mikedn

mikedn commented Jan 22, 2015

I'm not sure what in C++/CLI makes you think that you can stack allocate arrays. You seem to be drawing the wrong conclusion, that "reference on the stack (using %)" avoids GC allocations. And anyway, if this requires the callers to be marked unsafe it's kind of useless. unsafe has its good uses but C# is primarily a type safe language and that's not going to change.

As for reference counting & co. - it's probably easier to improve __arglist than go that way.

@d-kr

d-kr commented Jan 22, 2015

@mikedn

mikedn commented Jan 22, 2015

What does stackalloc have to do with this? You cannot use stackalloc to store reference types.
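As a small illustration of that restriction, here is a sketch that uses stackalloc with an unmanaged element type. It assumes C# 7.2 or later and a runtime that provides Span<T>; the names StackallocDemo and SumOnStack are hypothetical.

```csharp
using System;

static class StackallocDemo
{
    // Sums three ints through a stack-allocated buffer. The buffer itself
    // lives on the stack, so no array is allocated on the GC heap.
    public static int SumOnStack(int a, int b, int c)
    {
        Span<int> values = stackalloc int[3];
        values[0] = a;
        values[1] = b;
        values[2] = c;

        int sum = 0;
        foreach (int value in values)
        {
            sum += value;
        }
        return sum;
    }

    // The equivalent for a reference type is rejected by the compiler,
    // because the element type of stackalloc must be unmanaged:
    //     Span<object> refs = stackalloc object[3];
}
```

This works for int because it is an unmanaged type; it does not help with params object[] or any other array of references.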

@xen2

xen2 commented Jan 22, 2015

It does allocate on the stack (MS named it the "stack semantic operator") and you don't have to use gcnew but directly the value type ctor like a struct, without even a "new": https://msdn.microsoft.com/en-us/library/ms177191.aspx

It doesn't work on arrays/string in C++/CLI (because they have variable size), but it could be possible for MS to technically support it (using _alloca()).

I definitely agree it is not safe (except if you can guarantee references are "owned" with an additional keyword, similar to Vala, or MS research http://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf ).

Just wanted to mention this would avoid most drawbacks of params.
If not supported at language level, it could still be done internally by JIT/AOT (but could be quite heavy as you mentioned).

@mikedn

mikedn commented Jan 22, 2015

Oh, the "stack semantics" thing. That terminology is misleading, it should be called "scoped semantics" as the feature has nothing to do with the stack. Reference types can never be allocated on the stack and C++/CLI doesn't do that. It just claims "stack semantics" because it calls Dispose on the object at the end of the scope in which the object variable was declared.

@xen2

xen2 commented Jan 22, 2015

Right, my bad, sorry for the confusion.

ref class A
{
public:
    B A1; // Edit: Looks like inside the class, but it's not
    B^ A2;
};

Edit: At first I thought A1 was inside the class, but it actually is NOT inside (despite a modreq(IsByValue) in IL, and misleading C++ syntax).
I assumed it was possible in a class, and that inside a function it would be the same, but that's not the case in either case.

int main(array<System::String ^> ^args)
{
    Object% a1 = Object(); // This is actually allocated with an indirection
    Object^ a2 = gcnew Object();
}

However, I was wondering, is anything really preventing such ref to be allocated directly on the stack? (as long as GC knows about it, and ref to stack is not transferred outside of scope)

@mikedn

mikedn commented Jan 22, 2015

You can do that but it's the same trick, A1 will end up on the GC heap and A will only store a reference to it. You need to separate language syntax and semantics from implementation details.

I suppose it's technically possible to allocate reference types on the stack but neither the runtime nor the language currently allow this. Basically the reference types would have to be treated as value types and passed around by using C#'s ref which is already restricted to stack usage.

@xen2

xen2 commented Jan 22, 2015

A1 is actually embedded inside A (I checked the memory). Since both are still in GC heap, it's probably not a problem. That's why I thought stack % was doing the same. Edit: Actually it's not

Concerning ref stack alloc, I meant allocating like a value type on the stack but still include vtable (like A1) and pass this stack pointer around like a normal reference. Basically it would just replace GCalloc by stack alloc.
It would work with any function (no need for ref) as long as:

  • no refs to it are kept after the stack is removed (unsafe)
  • the GC understands that values in stack space should be traced but not reclaimed (since they are not in the GC heap)

This could be quite useful for LINQ, params and many custom scenarios that were usually avoided in game engine dev due to unnecessary GC pressure (sometimes it is hard to avoid allocating, esp. in BCL internals).

I might try to experiment with this feature later in https://github.com/xen2/SharpLang

@xen2

xen2 commented Jan 22, 2015

Of course the implementation I proposed is part of the "unsafe" realm, but could be made safe if ownership concept was added to the language (same as Vala or MS research http://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf ).

Note that http://joeduffyblog.com/2013/12/27/csharp-for-systems-programming/ mentions it in first point: "We’ve stolen a page from C++ — in areas like rvalue references, move semantics, destruction, references / borrowing — and yet retained the necessary elements of safety, and merged them with ideas from functional languages. This allows us to aggressively stack allocate objects, deterministically destruct, and more.". I guess it is probably something similar.

@mikedn

mikedn commented Jan 22, 2015

I don't know what you have checked in memory but what you're describing is not possible. Maybe the B class from your example is a traditional C++ class instead of a C++/CLI ref class. That's a completely different story.

As for the rest of the stuff, we'll see. It's one thing to do that in a research project and it's another thing to do that in an existing language with millions of lines of library and application code already written. I'm not sure if you noticed but one of the authors of that paper, Jared Parsons, is now a member of the Roslyn team and even posted in this discussion. And Joe Duffy is now the director of the language group at Microsoft.

@xen2

xen2 commented Jan 22, 2015

Sorry, after careful checking, it seems it is in fact not laid out in memory inside the class (despite the modreq(IsByValue) being emitted in MSIL -- syntax and IL can be deceiving!). I must have checked the memory too quickly, and since both instances were close in memory, I didn't see there was more in between.

Concerning stack alloc of ref types, of course I didn't say it would be good to add to Roslyn and commercial projects (lots of implications and corner cases to deal with when adding such a language feature -- note: an unsafe feature is probably much easier to add than a safe one though).
Mostly I wanted to get some advice/opinions on such an idea and discuss it, and possibly later try some experimentation, nothing more. Esp. because I know people who have already played with such ideas are here, I thought it would be a good place to ask.

Note: updated previous post to strike out the wrong assumptions. Thanks for pointing it out, I really thought it was laid out differently in memory.

@ViToni

ViToni commented Mar 6, 2015

Reading that __arglist is legacy I'm wondering what "The Right Way(TM)" would be to do it.

In my case there is a C library I would like to use which implements logging via a callback set by a function pointer and uses varargs. Using a delegate works up to the point where I need access to the varargs of the logging function, since delegates cannot be defined with an __arglist.

@IS4Code

IS4Code commented Jul 17, 2015

modreq doesn't do anything; its purpose is solely to be recognized and handled by compilers.
Java already allocates some objects on the stack.
__arglist isn't legacy, it is just undocumented and exists only for interop.

Probably the best thing is to allow it on delegates.

@Thaina

Thaina commented Sep 9, 2015

Why couldn't we just
void Func(params items)
and
void Func<T>(params<T> items)

@gafter
Member

gafter commented Jan 21, 2016

We have no expectation of ever doing this.

@gafter gafter closed this as completed Jan 21, 2016
dibarbet added a commit to dibarbet/roslyn that referenced this issue Mar 1, 2023
Add build action to ensure server builds on PRs