Proposal: Ref Returns and Locals #118

Closed
stephentoub opened this Issue Jan 28, 2015 · 167 comments
@stephentoub
Member

(Note: this proposal was briefly discussed in #98, the C# design notes for Jan 21, 2015. It has not been updated based on the discussion that's already occurred on that thread.)

Background

Since the first release of C#, the language has supported passing parameters by reference using the 'ref' keyword. This is built on top of direct support in the runtime for passing parameters by reference.

Problem

Interestingly, that support in the CLR is actually a more general mechanism for passing around safe references to heap memory and stack locations; that could be used to implement support for ref return values and ref locals, but C# historically has not provided any mechanism for doing this in safe code. Instead, developers that want to pass around structured blocks of memory are often forced to do so with pointers to pinned memory, which is both unsafe and often inefficient.

Solution: ref returns

The language should support the ability to declare ref locals and ref return values. We could, for example, now declare a function like the following, which not only accepts 'ref' parameters but which also has a ref return value:

public static ref TValue Choose<TValue>(
    Func<bool> condition, ref TValue left, ref TValue right)
{
    return condition() ? ref left : ref right;
}

With a method like that, one can now write code that passes two values by reference, with one of them being returned based on some condition:

Matrix3D left = …, right = …;
Choose(chooser, ref left, ref right).M20 = 1.0;

Based on the function that gets passed in here, a reference to either 'left' or 'right' will be returned, and the M20 field of it will be set. Since we’re trading in references, the value contained in either 'left' or 'right' is updated, rather than a temporary copy being updated, and rather than needing to pass around big structures, necessitating big copies.
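
For anyone who wants to try this, here is a minimal sketch of the same idea in the syntax that, as far as I know, eventually shipped (ref returns in C# 7.0, the ref conditional expression in C# 7.2). The Matrix3D struct below is a one-field stand-in rather than the real type, and RefDemo is a hypothetical class name used only for illustration.

```csharp
using System;

// Stand-in for a large struct; only the field used in the example is declared.
public struct Matrix3D
{
    public double M20;
}

public static class RefDemo
{
    // Returns a reference to one of the two arguments, not a copy.
    // The ref conditional form below assumes C# 7.2 or later.
    public static ref TValue Choose<TValue>(
        Func<bool> condition, ref TValue left, ref TValue right)
    {
        return ref (condition() ? ref left : ref right);
    }

    public static void Main()
    {
        Matrix3D left = default, right = default;

        // Writes through the returned reference, so 'right' itself changes.
        Choose(() => false, ref left, ref right).M20 = 1.0;

        Console.WriteLine(left.M20);  // 0
        Console.WriteLine(right.M20); // 1
    }
}
```

Note the differences from the proposal text: the return statement carries its own 'ref', and the conditional is parenthesized.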

If we don't want the returned reference to be writable, we could apply 'readonly' just as we were able to do earlier with ‘ref’ on parameters (extending the proposal mentioned in #115 to also support return refs):

public static readonly ref TValue Choose<TValue>(
    Func<bool> condition, ref TValue left, ref TValue right)
{
    return condition() ? ref left : ref right;
}
…
Matrix3D left = …, right = …;
Choose(chooser, ref left, ref right) = new Matrix3D(...); // Error: returned reference is read-only

Note that when referencing the 'left' and 'right' ref arguments in the Choose method’s implementation, we used the 'ref' keyword. This would be required by the language, just as it’s required to use the ‘ref’ keyword when passing a value to a 'ref' parameter.

Solution: ref locals

Once you have the ability to receive 'ref' parameters and to return ‘ref’ return values, it’s very handy to be able to define 'ref' locals as well. A 'ref' local can be set to anything that’s safe to return as a 'ref' return, which includes references to variables on the heap, 'ref' parameters, 'ref' values returned from a call to another method where all 'ref' arguments to that method were safe to return, and other 'ref' locals.

public static ref int Max(ref int first, ref int second, ref int third)
{
    ref int max = first > second ? ref first : ref second;
    return max > third ? ref max : ref third;
}
…
int a = 1, b = 2, c = 3;
Max(ref a, ref b, ref c) = 4;
Debug.Assert(a == 1); // true
Debug.Assert(b == 2); // true
Debug.Assert(c == 4); // true

We could also use ‘readonly’ with ref on locals (again, see #115), to ensure that the ref variables don’t change. This would work not only with ref parameters, but also with ref locals and ref returns:

public static readonly ref int Max(
    readonly ref int first, readonly ref int second, readonly ref int third)
{
    readonly ref int max = first > second ? ref first : ref second;
    return max > third ? ref max : ref third;
}
@stephentoub stephentoub changed the title from Proposal: ref returns and locals to Proposal: Ref Returns and Locals Jan 28, 2015
@theoy theoy added the Language-C# label Jan 28, 2015
@gafter gafter added the 1 - Planning label Feb 2, 2015
@MadsTorgersen MadsTorgersen was assigned by gafter Feb 2, 2015
@MgSam
MgSam commented Feb 4, 2015

If I recall, Eric Lippert blogged about this some years back and the response in the comments was largely negative.

I do not like this feature for C#. The resulting code is like an uglier version of C++, and code written with it takes longer to reason about and understand. The use-cases are not particularly compelling, and I have never run into a situation where I wished I had ref locals or return values.

@axel-habermaier
Contributor

Yes, I know very well that mutable structs should be avoided. Still, one interesting use case would be lists of mutable structs. Consider:

struct MutableStruct { public int X { get; set; } }
MutableStruct[] a = ...
List<MutableStruct> l = ..
a[3].X = 5; // changes the value of X of the struct in the array
l[3].X = 5; // compile time error

If the indexer of the List<T> class would return the value stored in the list by reference, the code above would compile, making the use of mutable structs less surprising. It is probably even more efficient as the (potentially large) struct no longer has to be copied out from the list.
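
To make the difference concrete, here is a small sketch of what works today: the array element is a variable and can be mutated in place, while the list element has to be copied out, modified, and written back. ListVsArray is a hypothetical class name.

```csharp
using System;
using System.Collections.Generic;

public struct MutableStruct
{
    public int X { get; set; }
}

public static class ListVsArray
{
    public static void Main()
    {
        var a = new MutableStruct[4];
        var l = new List<MutableStruct>(new MutableStruct[4]);

        // Array element access yields the variable itself, so this mutates in place.
        a[3].X = 5;

        // l[3].X = 5; would not compile: the List<T> indexer returns a copy.
        // The usual workaround is copy, modify, write back.
        MutableStruct tmp = l[3];
        tmp.X = 5;
        l[3] = tmp;

        Console.WriteLine(a[3].X); // 5
        Console.WriteLine(l[3].X); // 5
    }
}
```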

Unfortunately, I doubt that the return type of List<T>'s indexer can be changed for backwards compatibility reasons.

@xen2
xen2 commented Feb 7, 2015

Disclaimer: I work on a game engine, so I am probably not the typical user.

One use case where this could really help us is this one:

MyHugeStruct[] data; // we use a struct to improve data locality and reduce GC pressure
// Ideally, we would like to be able to use List<T>, but we can't take ref then
for (int i = 0; i < data.Length; ++i)
{
   // Option 1: make a local copy (slow)
   var item = data[i];

   // Option 2: to avoid making a stack copy of MyHugeStruct,
   // we have to defer to an inner loop function
   MyLoopBody(ref data[i]);

   // Option 3: using the new proposal, that would be much better:
   ref MyHugeStruct itemRef = ref data[i];
}

We end up making a separate function for the loop body, and in the case of a tight loop this can end up being quite bad:

  • We have to forward all parameters
  • Sometimes we found out with VTune that the inner loop's stack "initlocals" was taking up most (80%+) of the time if the inner loop body happened to have several locals (even if only 0 or 1 was used due to branching). This would not happen if the locals were contained and memzeroed once in the function containing the "for" loop.
  • The function is not inlined in simple cases
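
A minimal sketch of how Option 3 could look with a ref local, assuming the syntax of C# 7.0 or later. The fields of MyHugeStruct and the LoopDemo class are placeholders invented for the example.

```csharp
using System;

public struct MyHugeStruct
{
    public long A, B, C, D; // placeholder payload to make the struct "large"
    public int Counter;
}

public static class LoopDemo
{
    public static void IncrementAll(MyHugeStruct[] data)
    {
        for (int i = 0; i < data.Length; ++i)
        {
            // 'item' aliases data[i]; no copy of the struct is made,
            // and no separate loop-body function is needed.
            ref MyHugeStruct item = ref data[i];
            item.Counter++;
        }
    }

    public static void Main()
    {
        var data = new MyHugeStruct[3];
        IncrementAll(data);
        IncrementAll(data);
        Console.WriteLine(data[2].Counter); // 2
    }
}
```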

Nice to have:

  • ref this[] operator(?) so that List<> and other collections can be used (vs being forced to use arrays)
  • a ++ operator on ref to be able to loop by incrementing a pointer instead of index multiplication (but probably unsafe).

Extra (probably impossible without changing BCL):

  • A lot of struct copying could also be avoided in EqualityComparer (Dictionary) if ref could be used when large structs are used as keys.
@paulomorgado

What happens with this?

var data = GetData();
...
ref SomeStruct GetData()
{
    var ss1 = new SomeStruct();
    var ss2 = new SomeStruct();

    return ref Choose(ref ss1, ref ss2);
}
ref SomeStruct Choose(ref SomeStruct ss1, ref SomeStruct ss2)
{
    return whatever ? ref ss1 : ref ss2;
}

GetData might not be aware that Choose is returning one of its variables and returns to the caller a reference to it.

Does the value still exist after exiting GetData?

@gafter
Member
gafter commented Mar 6, 2015

@paulomorgado You would not be allowed to return a ref to a local variable or parameter.

@paulomorgado

@gafter, the only difference between my Choose method and @stephentoub's one is that mine does not have the selector passed as a delegate. Did I miss something here?

@stephentoub
Member

@paulomorgado, the compiler would only let you return a ref to something that it knew was either on the heap or that came from the caller. In my example, the ref inputs to the Choose method were all from ref parameters (or ref locals to ref parameters), so the compiler would conclude that the result of the Choose method met the criteria and would allow its returned ref to be returned. But in your example, the refs passed to Choose were not from the caller nor from the heap, such that the compiler couldn't be sure that the result of Choose was allowed to be returned, and it would error out.

@paulomorgado

@stephentoub, forget my Choose method. Yours is the best that can be done and you just published it to NuGet and I added it to my project. How can the compiler know where the return value of Choose is coming from? My GetData is just complying with the contract of Choose to get its result and pass it along, as all the code written so far and to be written in the future does.

What you're saying is that publicly exposed methods can't return refs, which reduces the usage to only private methods.

@stephentoub
Member

@paulomorgado, I understand the confusion, but that's not what I'm saying.

There would be some rules about what it would be safe to return, e.g.

  • refs to variables on the heap are safe to return
  • ref and out parameters are safe to return
  • a ref returned from another method is safe to return if all refs passed to that method were safe to return (by this same set of rules)

Forget the implementation of Choose here. Assuming Choose abides by these rules (which the compilation of Choose would enforce), in my example all of the inputs to Choose were valid to be returned, therefore the result of Choose could be returned. In your example, at least one of the inputs to Choose wasn't valid to be returned, therefore the result of Choose could not be returned. The compiler can validate that.
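
The rules can be sketched as follows, using the syntax that, as far as I know, later shipped in C# 7.0. The class and method names are hypothetical, the sketch covers only ref parameters (not out), and the method that breaks the rules is left commented out because it is expected to be rejected at compile time.

```csharp
using System;

public static class SafeToReturn
{
    private static int[] s_heap = new int[4];

    // Rule 1: a ref to heap storage (here an array element) may be returned.
    public static ref int FromHeap(int index)
    {
        return ref s_heap[index];
    }

    // Rule 2: a ref parameter came from the caller, so it may be returned.
    public static ref int FromCaller(ref int value)
    {
        return ref value;
    }

    // Rule 3: every ref argument passed to FromCaller is itself safe to
    // return, so the result of the call is safe to return as well.
    public static ref int Forward(ref int value)
    {
        return ref FromCaller(ref value);
    }

    // Expected to be rejected by the compiler: 'local' dies when the method
    // returns, so the result of the call is not safe to return.
    // public static ref int Broken()
    // {
    //     int local = 0;
    //     return ref FromCaller(ref local);
    // }

    public static void Main()
    {
        int x = 1;
        Forward(ref x) = 42;
        FromHeap(2) = 7;
        Console.WriteLine(x);           // 42
        Console.WriteLine(FromHeap(2)); // 7
    }
}
```

The caller of Forward never inspects the body of FromCaller; it only looks at what it passed in.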

@paulomorgado

@stephentoub, what I'm having trouble with is understanding how those rules can be effectively enforced.

And a proposal should have an example that works under the proposal.

@stephentoub
Member

@paulomorgado, how does my example not work under the proposal? And why do you believe the rules can't be enforced?

@paulomorgado

@stephentoub, either that or I totally missed everything.

My understanding is that there's no way the caller can take the result of your Choose method as safe to return as reference. Is there? If so, how?

@stephentoub
Member

@paulomorgado, in this example:

public static ref TValue Choose<TValue>(
    Func<bool> condition, ref TValue left, ref TValue right)
{
    return condition() ? ref left : ref right;
}

left and right are both safe to return because they came from the caller.

In this example:

public static ref int Max(ref int first, ref int second, ref int third)
{
    ref int max = first > second ? ref first : ref second;
    return max > third ? ref max : ref third;
}

first, second, and third are all safe to return because they all came from the caller. max is safe to return because the only refs it's possibly assigned to are those which are safe to return.

If I as a caller wanted to use Choose, e.g.

public static ref TValue ChooseByTime<TValue>(
    ref TValue left, ref TValue right)
{
    return Choose(() => DateTime.UtcNow.Second % 2 == 0, ref left, ref right);
}

Both left and right are safe to return because they came from the caller. Therefore all of the ref inputs to Choose are safe to return. Therefore the resulting ref from Choose is also safe to return. I don't need to worry about the implementation of Choose, because the compiler is enforcing all of these same rules on the implementation of Choose.

@paulomorgado

Both left and right are safe to return because they came from the caller. Therefore all of the ref inputs to Choose are safe to return. Therefore the resulting ref from Choose is also safe to return. I don't need to worry about the implementation of Choose, because the compiler is enforcing all of these same rules on the implementation of Choose.

But ChooseByTime is returning neither left nor right. It's returning the return value of Choose. Nothing but the implementation details of Choose is saying its return value is the same as one of its parameters. What if Choose is an implementation of an interface?

You're restricting the use of Choose to cases where it works without any safeguards or proof that it's safe.

My example shows the opposite.

@stephentoub
Member

@paulomorgado, your example wouldn't compile... the compiler would error out exactly because it doesn't abide by the rules: your call to Choose is passed ref values that are not safe to return, therefore the result of your call to Choose is not safe to return. I'm sorry if I'm not explaining this well; not sure how to convey it differently.

@stephentoub
Member

Nothing but the implementation details of Choose is saying its return value is the same as one of its parameters.

Ah, maybe this is the point of confusion. The implementation doesn't matter because the compiler assumes the worst: regardless of how a parameter is actually used, if any argument isn't safe to return, then the result of the call isn't safe to return. The compiler is conservative in that regard.

@paulomorgado

A conservative compiler that assumes the worst cannot assume the return value of Choose is safe to return.

Is this what you're proposing?

public static ref TValue ChooseByTime<TValue>(
    ref TValue left, ref TValue right)
{
    TValue result = Choose(() => DateTime.UtcNow.Second % 2 == 0, ref left, ref right);
    if (result == left) return ref left;
    else if (result == right) return ref right;
    else throw new Exception("Invalid value.");
}
@stephentoub
Member

Why do you say that? What specifically about this example do you believe is problematic?

@stephentoub
Member

Let's try something else: can you construct an implementation of Choose that will compile based on the aforementioned rules/explanations but where the caller of the method could not assume its return value was safe to return?

@paulomorgado

No I can't. Because I haven't been able to understand how this would work.

I can understand how, in your implementation of Choose, it is safe to return that reference.

What I can't understand is why its callers can safely return the same reference without intimately knowing its internals.

@stephentoub
Member

Because it wouldn't be allowed to return anything that's not safe in the case where the caller assumes it is safe. If the only thing the caller passes in are refs that are safe to return, then what could this method return?

  • one of those refs: that's safe.
  • a ref to an object it allocates on the heap: that's safe
  • a ref to some other local or parameter: that's not safe, but it's also not allowed, so it can't actually do this
  • a ref it got back from another call, but only if it passed in safe to return refs; if it passed in any non safe refs, then the returned ref would also not be safe to return, the compiler wouldn't allow it. Effectively the rules apply recursively here.

Etc.

@paulomorgado

So, this wouldn't be safe, right?

public static ref TValue ChooseByTime<TValue>(
    ref TValue left)
{
    ref TValue right = default(TValue);
    return Choose(() => DateTime.UtcNow.Second % 2 == 0, ref left, ref right);
}
@stephentoub
Member

Correct, that would not compile.

@copernicus365

Beautiful solution, I've wondered why this couldn't be done before.

@MgSam [The resulting code is like an uglier version of C++]
Because of sentiments like this (i.e. 'anything I don't personally use should never be part of the language for anybody else either, even though the CLR itself has this capability'), it means our language is needlessly crippled in places where a very easy and beautiful solution like this gives us such a capability. As the gamer showed in the comment above, this can be a big performance win in some cases.

@whoisj
whoisj commented Jul 7, 2015

👍

Anytime I can pass a pointer instead of performing a value copy, I'm all for it. Are there good reasons to pass memory by value-copy? Yes. Should it always be the case? Absolutely not.

The resulting code is like an uglier version of C++

I agree, it is not pretty but it is very descriptive. It would be nice if the ref keyword could be replaced with syntax we're all used to. Perhaps we could use * in place of ref because int* foo; is "cleaner" and "easier" to read than ref int foo;. I put "cleaner" and "easier" in quotes because it is incredibly subjective.

Yes, I know that * is generally reserved for unsafe but there's no reason the symbol cannot be reused, so long as one is reserved for a "safe" contexts and the other for an "unsafe" context.

@HaloFour
HaloFour commented Jul 7, 2015

Given the limitations listed above imposed to maintain a safe context I'm having a hard time envisioning the use cases for this feature. The real gains would seem to be in how structs can be used throughout the BCL with arrays, lists or other collection types.

@whoisj
whoisj commented Jul 7, 2015

Given the limitations listed above imposed to maintain a safe context I'm having a hard time envisioning the use cases for this feature. The real gains would seem to be in how structs can be used throughout the BCL with arrays, lists or other collection types.

Agreed. This is, in my opinion, a small step in the right direction though.

@whoisj
whoisj commented Jul 8, 2015

Would this implementation allow for ref int[] intRefs = new ref int[512];?

If it doesn't, then I am less excited than I originally was. If it does, reading ref struct[] is difficult. Is it a reference to an array of structures or an array of structure references?

Better to use struct*[] in my opinion.

@HaloFour
HaloFour commented Jul 8, 2015

I don't disagree that ref something is unattractive, however your use of * is already legal C# syntax and implies an unsafe context. I'm sure that you know that, but I thought it warranted mention.

I imagine that the array scenario would likely depend on the proposal for fixed-size buffer enhancements, #126. Once the size is determined and allocated I believe that would behave the same as a field or as a local.

@whoisj
whoisj commented Jul 8, 2015

I don't disagree that ref something is unattractive, however your use of * is already legal C# syntax and implies an unsafe context. I'm sure that you know that, but I thought it warranted mention.

I do. I also know that * is only legal within an unsafe block. Thus, the compiler could assume that * needed to be "safe" unless in an unsafe block. Therefore operations like int* p = ...; p++; would not be legal; instead int* p would have to point to safely referenced memory.

Yes, there would be complexities if devs started an unsafe block, but rules can be established for how this would work, etc.

@VSadov
Member
VSadov commented Jul 22, 2015

FYI: the PR for the initial commit of a prototype #4042

@Thaina
Thaina commented Sep 9, 2015

I support ref return

And can we have ref parameter in lambda?

@HaloFour
HaloFour commented Sep 9, 2015

@Thaina You can use ref parameters in lambdas today as long as the signature of the target delegate defines those parameters as ref:

public delegate void RefAction<T>(ref T arg);

RefAction<string> action = (ref string value) => { value = "Hello World!"; };

string x = "";
action(ref x);
Console.WriteLine(x);
@axel-habermaier
Contributor

@HaloFour: What @Thaina probably means is that you can't capture a ref-parameter in a lambda.

@Thaina
Thaina commented Sep 9, 2015

@HaloFour Sorry, I didn't know that. In which version can we use ref lambdas?

I have used Unity for such a long time that I haven't kept up with new C# features much

@HaloFour
HaloFour commented Sep 9, 2015

@axel-habermaier Maybe. That wouldn't be my first guess given the proposal they posted under, but it is terribly unspecific. IIRC ref parameter capture would be wading too close to unsafe territory since you'd basically have to stuff the address to a variable in the state machine class and the compiler could no longer control its lifetime.

@Thaina C# has always supported ref and out parameters for anonymous delegates and lambdas.

@Thaina
Thaina commented Sep 9, 2015

Oh... I never knew that we just can't write (ref i) => {}. I just need to write (ref int i) => {}

Thanks for your point

@gafter gafter removed the 1 - Planning label Nov 20, 2015
@gafter gafter added this to the C# 7 and VB 15 milestone Nov 20, 2015
@gafter gafter assigned VSadov and unassigned MadsTorgersen Nov 20, 2015
@gafter gafter added the 2 - Ready label Nov 20, 2015
@gafter gafter modified the milestone: C# 7 and VB 15 Nov 21, 2015
@gafter gafter added this to the 1.3 milestone Dec 14, 2015
@BreyerW
BreyerW commented Dec 23, 2015

Sorry for the necropost, but I have a question. I found that ref properties will be supported, but only for the getter. Why couldn't it be resolved for setters too? I mean if we have

class Foo{

public ref int Number{
get;
set;
}

}

it could be resolved to public ref int get_Number(){...} and public void set_Number(ref int){...}

And if there is reason for abandoning setters why not do like this:

class Foo{

public string Description{
ref get;
set;
}

}

so we would still be able to have a setter and a getter in one property (or is this already the case?)

@VSadov
Member
VSadov commented Dec 23, 2015

@BreyerW the main reason for disallowing setters in byref properties and indexers is that they would not be very useful. While you can make a ref for a field or an array element and return that from the getter, you cannot go the other way in the setter.
If some use pattern is discovered, restriction on byref setters can be relaxed later, so it was decided to start with not allowing them.

@BreyerW
BreyerW commented Dec 23, 2015

Thanks for the reply. I wonder: isn't avoiding copies of value types when passing to a setter method a good thing? And the next thing: is allowing a non-ref setter alongside a ref getter impossible, like I show in the second example?

EDIT:
Ah, and if there isn't any dangerous situation with a ref setter, I don't see why we have to be so strict about this. If someone finds a pattern for a ref setter, then you don't have to cook up a special C# version in the future; it would already be enabled. And ref could be defined per accessor, not per property. Obviously you are the designers, not I, so possibly there is something subtle I don't know ;).

@VSadov
Member
VSadov commented Dec 24, 2015

@BreyerW An important part here is that byref properties and indexers have assignable getters.
If a type has a byref indexer, you already can read and write elements without redundant copying.
What would be the "obvious purpose" of a setter if property is already assignable via its getter?

In particular byval setter next to a byref getter would actually make assignments ambiguous.

obj.Description = "aaa";

Is this an assignment through the getter or an invocation of the setter?

There are short and long term costs of adding language features and it is next to impossible to remove them. That motivates the design team to resist features with unclear utility or confusing behavior.
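
A minimal sketch of an "assignable getter", assuming the C# 7.0 syntax for ref-returning properties. Foo matches the name used in the question above; the backing field _number and the RefPropertyDemo class are hypothetical.

```csharp
using System;

public class Foo
{
    private int _number;

    // Getter-only property that returns a reference to the backing field.
    public ref int Number => ref _number;
}

public static class RefPropertyDemo
{
    public static void Main()
    {
        var foo = new Foo();

        // No setter exists; both statements write through the returned ref.
        foo.Number = 10;
        foo.Number += 5;

        Console.WriteLine(foo.Number); // 15
    }
}
```

Because the getter already hands out the storage location, a separate setter would have nothing left to do.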

@BreyerW
BreyerW commented Dec 24, 2015

Oh, I think I understand now why you abandoned the setter. One of the reasons is that a ref getter can work like a setter thanks to returning a ref, so there is no point in having a setter? If that's true then I completely see why you abandoned this. Thanks for the clarification; now I feel a bit dumb, I obviously overlooked that.

The only thing that might be missing is that with a ref getter there could be a problem with firing events like OnBeforeValueChange, but this is a feature of ref itself, not a flaw of the C# design.

BTW, don't forget to update PropertyInfo somehow so that CanWrite returns true if there is only a ref getter, or add a new property for signalling that there is a ref getter. I mention this because I use this class.

@alrz
Contributor
alrz commented Dec 24, 2015

Nothing mentioned about foreach. It would be nice to be able to write foreach(ref Struct item in arr).

@Thaina
Thaina commented Dec 24, 2015

@alrz Your suggestion would be impossible given how foreach is implemented. foreach uses the IEnumerable interface to return Current, which is not returned by ref from there

It needs to work the opposite way. We should have IEnumerableByRef to override Current, and let foreach check whether the collection is IEnumerableByRef; if so, it will return the item as a ref automatically

Or maybe the ref keyword should be enabled in generics, so we would use IEnumerable<ref Struct>

It would be best if MS made everything that implements IEnumerable (all things in System.Collections) also implement IEnumerableByRef when the feature is finished

@alrz
Contributor
alrz commented Dec 24, 2015

@Thaina How about this?

When iterating over an array (known at compile-time) the compiler can use a loop counter and compare with the length of the array instead of using an IEnumerator

@Thaina
Thaina commented Dec 24, 2015

@alrz Only arrays are possible with that kind of foreach, and I think it should not be a different workflow. Instead, arrays should implement IEnumerableByRef if C# gets one

@alrz
Contributor
alrz commented Dec 24, 2015

It can be simply allowed only for arrays, and then translate to for( .. ) { ref T item = arr[i]; ... }. I don't think that something like <ref Struct> would be possible because it ultimately causes to outlive the local object which is not supported by CLR, AFAIK.

@Thaina
Thaina commented Dec 24, 2015

@alrz I apologize, but I am very much against the idea of making arrays a special thing again. Actually I am against the idea of making anything a special case. We have had this special problem from the start that only arrays have an indexer that returns by ref, and now that we are trying to fix it, everything should be able to have an indexer that returns by ref as arrays do

Yeah I think <ref Struct> is overkill too. Just IEnumerableByRef is enough

@alrz
Contributor
alrz commented Dec 24, 2015

@Thaina I think nothing's wrong with special cases. foreach already has a special case for arrays to make it faster (albeit an unobservable one), and ref locals also help to make things faster (by avoiding copying), so combining these two in a use case like this would be nice.

@asvishnyakov
Contributor

👍

@MouseProducedGames

GameDev often involves initializing large, structured data to some default values. Therefore, I suggest the following pattern for consideration:

ref largeData[i] = initData;

This avoids the otherwise needless (in this scenario):

ref data = largeData[i];
data = initData;

Granted, it only saves an extra line. But that extra line really doesn't have a reason to exist, and being able to assign directly to a reference to a location in an array, list or etc would bring it in line with assigning to a value location in an array and so on.

If it's too much work, it can certainly wait, of course; it's far from critical.

@svick
Contributor
svick commented Mar 13, 2016

@MouseProducedGames I don't understand, how is that different from just largeData[i] = initData;?

@MouseProducedGames

@svick: It's an assignment to a ref, not a copy-and-assign to value.

@svick
Contributor
svick commented Mar 13, 2016

@MouseProducedGames Are you saying that largeData is an array of refs? Because I don't think that's proposed here, and I believe the CLR can't support that.

If it's not an array of refs, just a normal array of structs, then you do need to copy the data.

@MouseProducedGames

@svick You're missing the point, and misunderstanding it.

This:

ref data = largeData[i];
data = initData;

Is getting a reference to a location in an array and assigning data to that reference.

Likewise, this:
ref largeData[i] = initData
Is also getting a reference to a location in an array and assigning data to that reference. Just without one extra, and unnecessary, line.

The bottom example is simply replicating the code in the top example, only without the need for an explicit ref variable.

@HaloFour

@MouseProducedGames

That wouldn't be possible unless the array was itself of refs or pointers. You can't avoid the copy, so largeData[i] = initData is as good as it gets.

@MouseProducedGames

@HaloFour Would it help if I gave you the equivalent C++ syntax?

Here:
(&values[i]) = initData;

I guarantee, this works exactly as I've actually stated.

And no, you don't need an array of refs to get a reference to a location in an array.

@MouseProducedGames

...Oh, I see the problem. You've latched on to this statement: "It's an assignment to a ref, not a copy-and-assign to value."

Look, that was in response to this statement: "@MouseProducedGames I don't understand, how is that different from just largeData[i] = initData;?"

And assigning to an indexer often has an extra copy into the setter of the indexer. Since you can ref return from a "this[index]" indexer, this avoids copying into the setter of the indexer.

You instead deal with the ref directly.

Does that help?

@svick
Contributor
svick commented Mar 13, 2016

@MouseProducedGames So it's not about arrays, only about custom collections? For those, I believe that just having ref-returning getter (and no setter) would achieve what you want, no need for additional syntax.

@HaloFour

@MouseProducedGames

Ah, so largeData is some type with a custom indexer property, not an array? That's where I think we were getting it confused.

While @svick does mention using a "readonly" property with a ref return as the type I am curious if C# would allow treating that as if it were a settable property. Otherwise you would probably need to assign the ref to a local before you then assigned the value to the address stored in that local. And you're right that this is what the C# compiler would have to emit anyway.

@MouseProducedGames

@svick There's two threads of conversation here. One is about shortening this:

ref data = largeData[i];
// Do something with data.

to this:
ref largeData[i] // Do something with data without the need to copy to an explicit variable, when you don't need an explicit variable;

The other is clarifying that assigning to ref can save you a copy to setter, to explain one reason why you'd want to operate on the ref directly.

Edit: Sorry for the confusion.

@MouseProducedGames

Another reason for using a ref directly would be something like this:

ref data = largeData[i];
data.oneFloat = 5f;

Which is a bit needlessly long-winded compared to:
ref largeData[i].oneFloat = 5f;

@svick
Contributor
svick commented Mar 13, 2016

@HaloFour The original proposal above seems to imply this would be allowed (in the context of methods, but I assume it would work on indexers too).

@svick
Contributor
svick commented Mar 13, 2016

@MouseProducedGames I still think you don't need additional syntax for that. If largeData[i] is a ref-returning indexer with only a getter, then largeData[i] = someValue; and largeData[i].oneFloat = 5f; will do what you want. There is no need for the ref largeData[i] syntax.
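
A minimal sketch of that, assuming the C# 7.0 syntax for ref-returning indexers. RefList<T> is a hypothetical collection written for this example (it is not a BCL type), and LargeData is reduced to two fields.

```csharp
using System;

public struct LargeData
{
    public float oneFloat;
    public int intValue;
}

// Hypothetical fixed-size collection whose indexer has only a ref-returning getter.
public class RefList<T>
{
    private readonly T[] _items;

    public RefList(int count) { _items = new T[count]; }

    public int Count => _items.Length;

    // Returns a reference to the element's storage rather than a copy.
    public ref T this[int index] => ref _items[index];
}

public static class IndexerDemo
{
    public static void Main()
    {
        var largeData = new RefList<LargeData>(3);
        var initData = new LargeData { intValue = 1 };

        largeData[1] = initData;    // whole-element assignment through the ref
        largeData[1].oneFloat = 5f; // field assignment without copying the element out and back

        Console.WriteLine(largeData[1].intValue); // 1
        Console.WriteLine(largeData[1].oneFloat); // 5
    }
}
```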

@MouseProducedGames

@svick: Ah, I misunderstood your point. I thought you were talking about returning a copy from a getter, which left me quite confused.

However, if there's both a copy getter and a ref getter (for example, if the implementer wanted to make the usage explicit) then ref values[i] could make it explicit which one you're using. Other than that, though, I think you're right on that one, given that clarification.

@MouseProducedGames

@svick Final update, I think. Looks like the devs came to the same syntax you did; I think I was tripped up by trying to apply the same sort of syntax as I would in C++.

Namely:

using System;

namespace TossCSharp7b
{
    class Program
    {
        static void Main(string[] args)
        {
            LargeData[] largeData = new LargeData[3];

            largeData[1].floatValue = 5f;

            foreach (var data in largeData)
                Console.WriteLine(data);

            Console.ReadKey(true);
        }
    }


    struct LargeData
    {
        public double doubleValue;
        public int intValue;
        public float floatValue;
        public char charValue;
        public bool boolValue;

        public override string ToString()
        {
            return string.Format("double: {0}, int: {1}, float: {2}, char: {3}, bool: {4}", doubleValue, intValue, floatValue, charValue, boolValue);
        }
    }
}

Output:
double: 0, int: 0, float: 0, char: , bool: False
double: 0, int: 0, float: 5, char: , bool: False
double: 0, int: 0, float: 0, char: , bool: False

Overall, perhaps allowing a "ref" qualifier, even if pointless, might ease a transition from C++; OTOH, the above format is exactly what I tried to do with array variables when I first learned C# with, IIRC, either v1.1 or v2. Probably v1.1 on XP.

@orthoxerox

A ref-getter is a very welcome feature. It will help things like array slices work exactly like real arrays. Of course, something like IRefList<T> will be required to represent this behaviour.

@fsacer
fsacer commented Apr 2, 2016

Why doesn't the code example work in the VS '15' preview? It looks like the ternary operator doesn't support it. I'm also interested in how to reassign a reference variable (ref local), as in the following code, which doesn't compile:

public static ref int Max(ref int first, ref int second, ref int third) {
    ref int max = ref first;
    if (second > max) {
        ref max = ref second;
    }
    if (third > max) {
        ref max = ref third;
    }
    return ref max;
}
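
For reference, a sketch of how this can be written in later compilers, assuming C# 7.3 or newer, where ref reassignment is spelled `max = ref second;` (no leading `ref`). Conditional ref expressions (`cond ? ref a : ref b`) arrived in C# 7.2. The class name RefMax and the Run helper are hypothetical:

```csharp
public static class RefMax
{
    public static ref int Max(ref int first, ref int second, ref int third)
    {
        ref int max = ref first;
        if (second > max)
        {
            max = ref second; // ref reassignment: `max` now aliases `second`
        }
        if (third > max)
        {
            max = ref third;  // ...or `third`, if that one is larger still
        }
        return ref max;
    }

    public static int Run()
    {
        int a = 1, b = 9, c = 4;

        // Assigning through the returned reference overwrites the largest argument.
        Max(ref a, ref b, ref c) = 0;

        return b;
    }
}
```

Here b is the largest value, so the assignment through the returned reference sets b to 0 and leaves a and c alone.
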
@Thaina
Thaina commented May 14, 2016 edited

@bbarry @svick If it is not permitted now, it could be, if we find a way to implement it.

What I think is possible is to introduce a class TaskRef<T> in addition to class Task<T>, where TaskRef<T> implements public ref T Results { get; } and has a constructor public TaskRef(FuncRef<T>).

Then async/await could be implemented to select Task<T> or TaskRef<T> based on the return type of the async function.

And I'm fine with a ref local not being able to cross an await. But a ref local should be usable in the code that follows the await it was obtained from, up to the next await.

Code like this should actually work; CS8942 should not be reported for it:

async ref Product Produce(int i) { }
async Task Consume()
{
    ref Product X = await Produce(0);
    Console.WriteLine(X); // Still live in the block of await it got value from so no error

    ref Product Y = await Produce(1);
    Console.WriteLine(Y);
}

But code like this should not work; CS8942 would be reported in this case:

async Task Consume()
{
    ref Product X = await Produce(0);
    ref Product Y = await Produce(1);

    Console.WriteLine(X); // Throw CS8942 here
    Console.WriteLine(Y);
}

What I'm talking about is not whether it works with the current await. I'm saying that it should also work once the ref return feature is introduced, and if that needs more implementation work then we should figure out how to do it. What I'm asking about is what is technically possible, not the limits of the current implementation.

Still, ref is different from a pointer; technically it should be safe. It must not be allowed across an async block, but it should be allowed within the same async block.

@bbarry
bbarry commented May 14, 2016

@Thaina I think async ref T Method() should be best left to a separate spec. It is different than both #10902 (in that the return is ref T, not some tasklike) and this issue (in that such a spec would need to entail a state machine and some transport type TaskRef<T> or something like that) but at the same time it is clearly something that will be influenced by both issues.

As nice as production of an async ref method may be, it could be sidestepped for the immediate future of getting ref returns and ref locals out the door. This code:

async Task Consume()
{
    ref Product X = await Produce(0);
    ...
}

could be written without an additional allocation this way:

async Task Consume()
{
    var handle = await ProduceHandle(0);
    ref Product X = ref Resolve(handle);
    ...
}

Or in the slightly more frustrating way the current implementation works:

async Task Consume()
{
    var handle = await ProduceHandle(0);
    DoWork(handle);
}
void DoWork(int handle)
{
    ref Product X = ref Resolve(handle);
    ...
}

And I'm not saying this shouldn't be done. But I will say it will take considerable time to get done and should not be a cause for delaying C#7.

@densh densh referenced this issue in scala-native/scala-native May 20, 2016
Closed

Proposal: first-class support for stack allocation #115

@kfdf
kfdf commented May 25, 2016 edited

Is it correct to understand that only references to local variables are not safe to return? And if so, why not just insert a runtime check in the caller that throws an exception if the returned reference points to something that lies in the discarded stack frame? That shouldn't be too much of a performance hit, and it can be optimized away if the compiler can prove that the returned reference is safe.

@VSadov
Member
VSadov commented May 26, 2016 edited

@kfdf At the CLR level - yes, we cannot return references to locals and byval parameters. At the C# level the situation is a bit more complex because of lexical scoping.
According to the language semantics, a new set of local variables is created when control flow enters {}. As a result you should not be able to access the variable via a reference when outside of {}. From the language semantics the variable does not exist; from an implementation perspective the IL slot of the variable could be reused for something else.

So, once we have a reference to a local variable, we would not only have problems with returning it. Using it as a source of byref assignments could also be problematic and cannot rely on runtime checks.

One simple solution to this is - make ref locals single-assignment and require ref assignment to happen at declaration only.

Also it is generally preferable to have compile-time errors over run time failures - you do not want to ship something and only later discover that some scenarios may lead to crashes.

@kfdf
kfdf commented May 26, 2016 edited

One simple solution to this is - make ref locals single-assignment and require ref assignment to happen at declaration only.

Doesn't feel like C#, and what about ref parameters? Maybe just forbidding assignments to ref variables from outer scope will be enough? And, most importantly, the compiler is easy to satisfy in this case.

Also it is generally preferable to have compile-time errors over run time failures

No arguments about that. But I don't see how the current limitations on ref locals and ref returns can work in any reasonably complex scenario, with lots of structs passed around and assigned to locals, where just one not-safe-to-return struct 'taints' the returned reference and ref locals can only be used as aliases to the parameters.

@kfdf
kfdf commented May 27, 2016

One more thought: the second best thing after a compile-time error is failing fast, and that's what a check after return does, whereas currently the compiler can be satisfied by promoting a local variable to a field. It's a straightforward thing to do and is not too ugly, so it is likely to become an accepted practice. But if there is a logical error and a reference to this field is returned, the program just moves on.

@JeroMiya
JeroMiya commented Jun 2, 2016

This proposal would increase the complexity of the language, without a significant use case or benefit to offset that complexity. I'm not convinced.

@Thaina
Thaina commented Jun 2, 2016 edited

@JeroMiya I have heard words like yours from so many people who were spoiled by fast computers and don't know what a new object does to memory, how the GC works or how it impacts performance, and how structs could be used. People like these always make every little thing a class, generate lots of small garbage in memory, and box things out of ignorance.

@VSadov VSadov modified the milestone: 2.0 (Preview 1), 1.3 Jun 6, 2016
@gafter gafter modified the milestone: 2.0 (Preview 1), 2.0 (RC) Jun 20, 2016
@qrli
qrli commented Jul 7, 2016 edited

Would this compile?

class Base {
  public virtual readonly ref int Foo(readonly ref int bar) { ... }
}

class Derived : Base {
  public override ref int Foo(ref int bar) { ... }
}

In other words: Is readonly part of method signature or not?

@qrli
qrli commented Jul 7, 2016

And does readonly ref accept r-value?

@ceztko
ceztko commented Jul 10, 2016

Will ref return allow properties with code to return refs to structs inside classes or structures? For example, having Nullable return a ref to the internal value would allow some optimizations when dealing with big structs.

struct Nullable<T>
where T : struct
{
    T m_value;
    bool m_hasValue;

    ref T Value
    {
        get
        {
             if (!m_hasValue)
                 throw new Exception();
             return m_value;
        }
        set { /* ... */ }
    }
}
@VSadov
Member
VSadov commented Jul 11, 2016

@ceztko Inside classes - Yes. Inside structs - No.

That is to prevent cases like

   ref var refToNowhere = ref new Nullable<int>().RefValue; // reference outlives the referent

If a struct field needs to be directly modifiable, it would be safer and likely more efficient to just make that field public instead.

Nullable, in particular, would not allow ref Value for a different reason though. Nullable values are intentionally readonly. Value and HasValue are loosely coupled and are not supposed to be modified separately from each other.

@ceztko
ceztko commented Jul 11, 2016 edited

@VSadov Fair enough for Nullable, I tend to forget that ref allows full assignment of structures. Generically speaking about ref properties in struct, what is the semantics of the second "ref" before "new Nullable" in your example? Is it necessary or could it be omitted?

@HaloFour

@joeante The ref prefix doesn't just indicate that the value can be changed, it also represents a different data type with different semantics both for the language and the CLR. I think hiding that fact would be more confusing than not.

It's also important to note that since ref indicates a different CLR type that the CLR permits overloading ref and by-value methods:

public void Foo(int x) { ... }
public void Foo(ref int x) { ... }
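
A minimal sketch of that overload pair, for illustration. The class name RefOverloads, the string return values and the increment are hypothetical additions so that the chosen overload can be observed:

```csharp
public static class RefOverloads
{
    // Receives a copy of the argument.
    public static string Foo(int x) => "by value";

    // Receives a reference to the caller's variable and can modify it.
    public static string Foo(ref int x)
    {
        x++;
        return "by ref";
    }

    public static (string first, string second, int n) Run()
    {
        int n = 1;
        string first = Foo(n);      // binds to Foo(int)
        string second = Foo(ref n); // binds to Foo(ref int); n becomes 2
        return (first, second, n);
    }
}
```

The `ref` at the call site is what selects the second overload, so the two never conflict.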

As for readonly ref, only the ref part is relevant to the CLR and to the caller. The readonly keyword would be lost during compilation since there is no CLR metadata to actually encode it or enforce it. The caller would never know if a method was readonly ref vs. just ref.

I can understand the desire to have both performant and attractive code utilizing ref locals/returns. What about ref flavors of the operators?

    // Transforms a [[Vector4]] by a matrix.
    static public ref Vector4 operator *(ref Matrix4x4 lhs, ref Vector4 v, ref Vector4 res)
    {
        res.x = lhs.m00 * v.x + lhs.m01 * v.y + lhs.m02 * v.z + lhs.m03 * v.w;
        res.y = lhs.m10 * v.x + lhs.m11 * v.y + lhs.m12 * v.z + lhs.m13 * v.w;
        res.z = lhs.m20 * v.x + lhs.m21 * v.y + lhs.m22 * v.z + lhs.m23 * v.w;
        res.w = lhs.m30 * v.x + lhs.m31 * v.y + lhs.m32 * v.z + lhs.m33 * v.w;
        return ref res;
    }
@ceztko
ceztko commented Jul 12, 2016

@VSadov still on your example, I have the feeling that it could be solved with rules on stack allocation and ref variables scope/initialization:

{
    // refToProp must be stack allocated and initialized only in this block,
    // and not on any nested blocks. Structure1 must be stack allocated
    // and can't be deallocated before the end of the block
    ref var refToProp = ref new Structure1().RefProp;
}

Am I missing something else, or is this already in violation of other rules? I would be interested in looking at any discussions on the topic, if there are some.

@IllidanS4

@HaloFour Of course there is CLR metadata suited to encode readonly (apart from attributes). It's called modopt and modreq, which are part of a signature. It's used heavily in C++/CLI. As a matter of fact, readonly is quite similar to const in C++, thus it would be suitable to use System.Runtime.CompilerServices.IsConst on parameters as C++/CLI does.

@HaloFour

@IllidanS4

Possibly, but modopt and modreq have the problem of the requirement of being encoded into the call-site. This makes it unsuitable for use with non-nullable references, and I believe makes it just as unsuitable for use with readonly parameters. Marking a parameter as readonly shouldn't cause existing consumers to fail, which is what will happen if they are encoded via modopt/modreq.

@IllidanS4

@HaloFour Good point. However, making it part of a signature has both pros and cons. On one hand, readonly ref guarantees the value won't be modified, so a change in a library from readonly ref to just ref should be a breaking one, because it could highly affect depending code. On the other hand, change from ref to readonly ref is just an additional contract in the method code, like [Pure]. It would require modifying CLR signature resolving rules to be more benevolent in handling type modifiers, maybe differing between opt and req more significantly.

@dotnetchris

I have to reiterate, any solution needs to prevent:

foo.GetByRef("x") = 42;

It should always require capturing the ref as a local to allow modification.

@HaloFour

@dotnetchris

What about with properties?

public struct Foo {
    private int x;

    public ref int X {
        get { return ref this.x; }
    }
}

var foo = new Foo();
foo.X = 123;
@whoisj
whoisj commented Jul 15, 2016

It should always require capturing the ref as a local to allow modification.

Why?

@jaredpar jaredpar modified the milestone: 2.0 (RC), 2.0 (RTM) Jul 18, 2016
@Ziflin
Ziflin commented Aug 12, 2016

Regarding the desire to return references to List<> elements. Doesn't this open up a new set of potential bugs for users? Or are we just assuming people wouldn't do something like:

{
   var items = new List<Item>();
   items.Add( new Item( "ABC", 123 ) );
   var item0 = ref items[0]; // (is ref before var needed? hopefully not)
   items.RemoveAt( 0 );
   items.Add( new Item( "XYZ", 456 ) );
   // < item0 now references {"XYZ", 456}
}

This can be an annoying source of bugs if the modifications to the list are done in a function call after the item0 assignment (not easily visible).

The performance penalty mentioned should mostly come from this List<> style operator[] access where a copy is returned and it does not seem like a 'safe' thing to add 'ref operator[]' to List<>. But if this proposal is added without ref operator [], is everyone that cares about performance just going to reimplement List<> and add it themselves?

@whoisj
whoisj commented Aug 12, 2016 edited

@Ziflin this is why developers need to understand concurrency. If you want immutable data structures, use read-only or immutable variants.

Regardless, if items[0] returns a reference, then your scenario won't happen because item0 would continue to point to the returned reference, not to items[0], which is an indexer (basically a function).

My question (to the designers) would be: what happens when T : struct? Is the ref T a pointer to an address? If so, what happens when the List<T> resizes, or is just updated?

Is retaining the ref T advised? Seems to me that it would invalidate quickly - but C#/CLR has some fancy memory tricks it can play.

@HaloFour

@Ziflin

I can't see that being remotely possible, for a couple of reasons.

From a language perspective it's not possible to overload properties (indexer or otherwise) based on their return value, so you couldn't add an indexer that returns a ref while the existing indexer is defined.

From an implementation point of view, internally List<T> will dispose of its underlying array anytime it needs to grow. There's no way for List<T> to safely return a ref into its underlying storage.

@Ziflin
Ziflin commented Aug 12, 2016 edited

@whoisj If List's operator[] returned a reference, it would return a reference to an element of the underlying Array of items. This seems valid according to this spec and the desired behavior in @xen2 comment. Based on how List<>'s Add/Remove are implemented, my example should be the result.

@HaloFour List<>'s operator[] could in theory be changed to only return a readonly ref T. I'm not saying it would, just that this seems to be what those performance-minded comments want.
Do you mean Dispose() of? Otherwise it's just letting the GC collect the old array and this is no different than any other 'ref' to a member of a heap object (which is allowed in the spec).

I'm not really trying to argue against the points you're making, but I think this is something people that worry about performance are going to try to do. So if anything I just wanted to bring up the issues that they'd have.

@HaloFour

@Ziflin

List<>'s operator[] could in theory be changed to only return a readonly ref T.

This would be a breaking change. T and ref T are completely different types according to the CLR and they require different IL in order to work with them.

Do you mean Dispose() of? Otherwise it's just letting the GC collect the old array and this is no different than any other 'ref' to a member of a heap object (which is allowed in the spec).

"Discard" is probably the better word here. List<T> doesn't do anything to try to clear the memory of the array, it just stops using it. But what this means is that ref values pointing to the underlying array of the list may go stale as the List<T> grows:

var list = new List<int>();
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(4);
ref int first = ref list[0];

list.Add(5); // Adding a fifth element triggers the List<T> to grow and change arrays
list[0] = 6; // Updates the new array 
Debug.Assert(first == 1); // Ref still points to the old array
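
Since List<T> has no ref-returning indexer, the snippet above is hypothetical. A runnable sketch of the same effect, assuming a hand-written list (the names RefList<T> and StaleRefDemo are made up for this example) with a fixed initial capacity of 4:

```csharp
using System;

// Hypothetical growable list with a ref-returning indexer.
public sealed class RefList<T>
{
    private T[] items = new T[4];
    private int count;

    public int Count => count;

    public void Add(T item)
    {
        if (count == items.Length)
        {
            // Growing allocates a new array and copies; the old array is simply abandoned.
            Array.Resize(ref items, items.Length * 2);
        }
        items[count++] = item;
    }

    public ref T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)count) throw new ArgumentOutOfRangeException(nameof(index));
            return ref items[index];
        }
    }
}

public static class StaleRefDemo
{
    public static (int viaRef, int viaList) Run()
    {
        var list = new RefList<int>();
        list.Add(1);
        list.Add(2);
        list.Add(3);
        list.Add(4);

        ref int first = ref list[0]; // points into the current 4-element array

        list.Add(5);  // capacity exceeded: the list switches to a new array
        list[0] = 6;  // updates the new array

        return (first, list[0]); // the ref still reads from the old array
    }
}
```

Run() returns (1, 6): the ref keeps the old array alive and keeps reading from it, which is memory-safe but silently out of date.
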
@Ziflin
Ziflin commented Aug 12, 2016 edited

@HaloFour - Yes, I agree with you :) I just wanted to point these issues out to anyone that might try to make their own "OptimizedList<>". It's basically possible, but would have some hidden issues.

@jaredpar
Member

Closing as this is now implemented.

@jaredpar jaredpar closed this Sep 19, 2016
@dotnetchris

@jaredpar so does the finalized implementation allow or reject foo.GetByRef("x") = 42;

@jaredpar
Member

@dotnetchris yes. it should even allow

foo.GetByRef("x") = foo.GetByRef("x")
@dotnetchris
dotnetchris commented Sep 23, 2016 edited

@jaredpar that's very saddening, I just have to pray 99% of people don't touch this feature.

@benaadams

@dotnetchris I think it's very good!

However a readonly type extension to both ref arguments and ref locals and returns probably would also be a useful addition (e.g. for larger structs where they are passed by ref due to size; but not for modification)

@copernicus365

foo.GetByRef("x") = 42;

Looks like elegant code to me.

Please never allow foo.GetByRef("x") = 42;
This code reads that you're somehow assigning 42 as if GetByRef was a Func being assigned (s) => 42 ...

That's a lambda, not a simple equals sign. So they are clearly distinguishable. I've been on the other side of this equation before, where what I wanted wasn't done in the end, so I don't mean to rub it in. I'm just personally very glad these restrictions weren't made.

Do you prefer marking every variable with var? I don't either. So just because some people do this (even in cases where the type on the right side isn't evident, therefore making the code less clear), does that mean an arbitrary restriction should be made on the usage of var? Likewise with the examples some gave above of ugly usage, yes, you can always make ugly code. But the snippet example above is still elegant in my view, though it does signal of course a whole new world for C#, ref returns and locals, a very exciting new future.

@Ziflin
Ziflin commented Sep 23, 2016

So I'm sorry I'm still unclear on what is now possible. In C++ the more practical uses would be to set/get large value types by const T&. So is it possible to do readonly ref or did that not make it in? If not, I don't exactly see this being very useful as it appears to break encapsulation, or I'm missing some good use cases.

@benaadams

@Ziflin it improves encapsulation.

Currently if you want to return a reference to an array element (say array of structs) you need to return the whole array and an index to reference into it. So the entire array and all its data needs to be exposed for a single access.

With this change only a single element needs to be exposed.
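
A small sketch of that difference, under the assumption of a hypothetical ParticleStore class that owns an array of Particle structs. The array itself stays private and only a reference to one element is handed out:

```csharp
using System;

public struct Particle
{
    public float X;
    public int Age;
}

public sealed class ParticleStore
{
    private readonly Particle[] particles; // never handed out to callers

    public ParticleStore(params int[] ages)
    {
        if (ages.Length == 0) throw new ArgumentException("need at least one particle");
        particles = new Particle[ages.Length];
        for (int i = 0; i < ages.Length; i++) particles[i].Age = ages[i];
    }

    public int AgeAt(int index) => particles[index].Age;

    // Exposes exactly one element by reference instead of (array, index).
    public ref Particle Oldest()
    {
        int best = 0;
        for (int i = 1; i < particles.Length; i++)
        {
            if (particles[i].Age > particles[best].Age) best = i;
        }
        return ref particles[best];
    }
}

public static class EncapsulationDemo
{
    public static int Run()
    {
        var store = new ParticleStore(3, 8, 5);

        ref Particle oldest = ref store.Oldest(); // aliases the element with Age == 8
        oldest.Age = 0;                           // modifies the stored element in place

        return store.AgeAt(1);
    }
}
```

Run() returns 0 because the write through the ref local reaches the element inside the store. The caller can still modify that one element freely, which is the limitation raised in the following comments.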

@Ziflin
Ziflin commented Sep 23, 2016

Yes, but assuming you have some class that contains the array, that class is still not able to return a reference to an element in a way that prevents you from modifying it - or in a way that lets it know the element was modified. So this feature doesn't seem 'complete' without that functionality.

@HaloFour
HaloFour commented Sep 23, 2016 edited

@Ziflin

that class is still not able to return a reference to an element in a way that prevents you from modifying it

Correct me if I'm wrong, but isn't the const modifier the only way that the C++ compiler enforces this? If you were to hand a library to someone with a header and they stripped that modifier then they could modify the reference at will, right?

I think that's the issue here with C#. The CLR itself offers no concept of a ref to a readonly. The best the C# compiler could do is attach its own metadata to the return value and hope that other compilers will understand and support it. That seems like a relatively fragile solution.

@benaadams
benaadams commented Sep 23, 2016 edited

@VSadov @jaredpar can you use this (or coupled with Tuples) to present a structure of arrays as an array of structures? (or could it do in future?)

e.g.

struct DataElement
{
    public ref int data0;
    public ref int data1;
    public ref Vector4 data2;
    public ref Vector4 data3;
    public ref Vector4 data4;
}

class Data
{
    private int[] data0;
    private int[] data1;
    private Vector4[] data2;
    private Vector4[] data3;
    private Vector4[] data4;

    // Either
    public DataElement this[int i]
    {
        get
        {
            return new DataElement()
                {
                    data0 = ref data0[i],
                    data1 = ref data1[i],
                    data2 = ref data2[i],
                    data3 = ref data3[i],
                    data4 = ref data4[i]
                };
        }
    }

    // Or
    public (ref int, ref int, ref Vector4, ref Vector4, ref Vector4) this[int i]
    {
        get
        {
            return (ref data0[i],
                    ref data1[i],
                    ref data2[i],
                    ref data3[i],
                    ref data4[i]);
        }
    }
}

Structure of arrays for efficient vector processing; array of structures for easy programmability.

@Ziflin
Ziflin commented Sep 23, 2016

@HaloFour Sure, it's possible to force just about anything in C++, but returning a const T& is still more restrictive than returning a T&. For those wanting a performance benefit for methods previously returning a T by value, a readonly ref is the closest match.

I was just hoping it would be done more correctly in C# or at least support similar use cases. But it seems this is mostly just improving cases where you were already wanting a modifiable reference.

@HaloFour

@Ziflin

It would certainly be possible by going the same route as C++/CLI and placing a modreq(IsConst) on the return parameter of the method to mark it as a constant. However, without explicit support for it any compiler could simply ignore that modifier when consuming your assembly and overwrite your data. I would assume that the vast majority of compilers would need to be modified in order to properly support ref returns anyway, so maybe that's not a big deal.

@jaredpar
Member

@dotnetchris

that's very saddening,

Why? That behavior is consistent with every other feature in C# which can return a location. Consider for instance if that code returned an array instead of a ref value:

int[] GetArray(string s)

This method gives virtually the same behavior as the one I listed:

GetArray("x")[0] = GetArray("x")[0]
@whoisj
whoisj commented Sep 23, 2016

Why?

Because you've broken the illusion of immutability? 😏

@jaredpar
Member
jaredpar commented Sep 23, 2016 edited

@benaadams

However a readonly type extension to both ref arguments and ref locals and returns probably would also be a useful addition

I agree. But for it to be useful you need to take it one step further. Consider for example this code:

void M(readonly ref BigStruct s)
{
  Console.WriteLine(s.ToString());
}

In this case the argument is taken by ref presumably to avoid copying a large struct. However in order to execute the ToString call the compiler will fully copy the value to the stack. Oops 😦

This is the behavior of C# when you call a struct method on a readonly location. Without a copy it would be possible for the struct to violate readonly by modifying its state within the method.

This logic doesn't just apply to methods, but to properties as well. Hence passing a struct by readonly ref is only advantageous compared to passing by value if you read fields off of it. Any use of methods or properties and you're better off passing it the standard way.

In order to get around this we need to be able to mark struct methods in such a way that the compiler knows they aren't mutating. That way it can invoke the method directly vs. having to go through a copy on the stack.

There are two proposals for how to do that:

  • readonly structs: ability to tag an entire struct as readonly. For such structs the type of this in non-constructor members would be readonly ref T instead of ref T.
  • readonly members on structs: ability to tag a struct member as readonly. For that member the type of this would be readonly ref T.
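
The defensive copy described above can be observed directly. This sketch assumes C# 7.2 or newer, where the read-only reference parameter is spelled `in` rather than the `readonly ref` used in this thread, and where `readonly struct` exists. The types Counter, FrozenCounter and DefensiveCopyDemo are hypothetical:

```csharp
// Ordinary mutable struct: the compiler cannot assume its methods are non-mutating.
public struct Counter
{
    public int Value;

    public int Increment()
    {
        Value++;
        return Value;
    }
}

// A readonly struct promises that no instance member mutates `this`,
// so calls through a read-only reference need no defensive copy.
public readonly struct FrozenCounter
{
    public readonly int Value;

    public FrozenCounter(int value)
    {
        Value = value;
    }

    public int Next() => Value + 1;
}

public static class DefensiveCopyDemo
{
    // `c` is a read-only reference, so Increment runs on a hidden copy of it.
    private static int Bump(in Counter c) => c.Increment();

    // No copy is needed here because FrozenCounter is a readonly struct.
    private static int Peek(in FrozenCounter c) => c.Next();

    public static (int returned, int after, int peeked) Run()
    {
        var counter = new Counter { Value = 10 };
        int returned = Bump(in counter); // the copy was incremented to 11

        var frozen = new FrozenCounter(10);
        int peeked = Peek(in frozen);

        return (returned, counter.Value, peeked);
    }
}
```

Run() returns (11, 10, 11): Increment reported 11, but the caller's counter still holds 10 because the mutation happened on the compiler's copy.
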
@benaadams

I agree. But for it to be useful you need to take it one step further.

Yes, definitely a different issue; readonly structs are problematic. I see #115 as the main one, with #12364 and #3202 as additional ones.

ref returns and locals as they stand are an amazing addition! Thank you very much!

@VSadov
Member
VSadov commented Sep 23, 2016

@benaadams - ref fields are not allowed in the CLR, with a few very special exceptions, which are ref-like structs that are stack-only.
So the structure you suggest is not currently possible. It might be possible in theory, if the concept of ref-like stack-only types were made more general.

@Ziflin
Ziflin commented Sep 23, 2016

@jaredpar Yes, readonly structs would be nice. We actually had that in our C#-like language that compiled to C++ and it worked well. We also cheated slightly and had the C++ compiler automatically treat them as 'const T&' parameters.

However, how does a readonly struct prevent the encapsulation issue of:
transform.GetPositionByRef() = position;

In this case, having the position (say a Vector3) type be readonly does not help prevent the assignment to the transform. Can this be fixed without resorting to a C++ like const modifier: const ref Vector GetPositionByRef() const {...} ? (We did not want to do this in our language as it seemed to greatly increase the learning curve.)

@jaredpar
Member

@Ziflin

However, how does a readonly struct prevent the encapsulation issue of:
transform.GetPositionByRef() = position;

No. The proposal about readonly struct only refers to the ability of the struct to modify itself via instance methods. The mechanism for doing so is changing this to be typed as readonly ref T instead of the normal ref T.

The GetPositionByRef method though can control whether or not callers can assign into the returned value. Using readonly ref as the return prevents assignment irrespective of whether or not the struct itself is readonly.

@Ziflin
Ziflin commented Sep 23, 2016

The GetPositionByRef method though can control whether or not callers can assign into the returned value. Using readonly ref as the return prevents assignment irrespective of whether or not the struct itself is readonly.

Ok, so then there is a proposal / feature planned for returning a readonly ref? That's mostly what I've been trying to figure out. And can this be used by properties?

@benaadams

@Ziflin I think it's captured in #115, though that deals with ref parameters so it may need to be extended for ref returns, now that they are a thing.

@jaredpar
Member

@Ziflin

As @benaadams pointed out, #115 has it a bit. But it does need to be extended for readonly structs and refs to be complete. It's on my list of items to write up.

@Ziflin
Ziflin commented Sep 23, 2016

@jaredpar @benaadams Ok great. I'm definitely with @joeante (in #115) in his desire to see C# perform as well as it can and this seems like one of the last issues I've had with moving to C# from C++ for game engine development. I guess keeping C# as clean a language as possible with that is the hard part.

@xoofx
Member
xoofx commented Oct 13, 2016

In this case the argument is taken by ref presumably to avoid copying a large struct. However in order to execute the ToString call the compiler will fully copy the value to the stack. Oops

@jaredpar, I'm not sure I understand why readonly ref should be interpreted as a ref readonly... Couldn't we have both:

  • readonly ref, i.e. you can do whatever you want on the struct behind the ref, but you can't modify the ref
  • ref readonly, i.e. you cannot modify the struct behind the ref and you can't modify the ref (this would be different from a C++ const). This would typically allow passing a ref to a readonly field (something we can't do today)

I'm really missing the readonly ref behavior there...

@whoisj
whoisj commented Oct 13, 2016

@xoofx please, can we avoid the taint of C here with its const int const *ptr nonsense?

I'm having a difficult time thinking of a real scenario where anyone would want a mutable pointer to an immutable object. The function could too easily just replace the object with its own, mutate it as it sees fit, then return control to the caller, who would then have a new struct without realizing it. Seems rife with misuse and danger.

@xoofx
Member
xoofx commented Oct 13, 2016 edited

a real scenario where any would want an mutable pointer to an immutable object

No scenario, I don't propose this (as I said above, the pointer ref readonly is not mutable).

@Thaina
Thaina commented Oct 13, 2016 edited

In the C# world a ref to a struct is not a pointer; it is the object itself. It can only be an immutable pointer to a mutable object.

And by the same standard that makes static internal protected the same as protected internal static, ref readonly and readonly ref must mean the same thing.

@jaredpar
Member

@xoofx

I'm really missing the readonly ref behavior there...

Reading your comment I think there may be a bit of a terminology difference. Let me elaborate a bit on the operations for a ref that could be affected by readonly:

  • re-pointing: a ref is really just a pointer that is safe. Hence just like you can change the address a pointer refers to, you could also change the location a ref points to.
  • mutating the target: this is modifying the memory a ref points to. In the case of a class it would be changing it to refer to a new instance (or null). In the case of a struct though it's mutating the contents directly.

Attaching readonly semantics to a ref could choose to affect one, or both of these operations.

When I say readonly ref I'm referring to protecting against mutating the target. I definitely understand the inclination to say that syntactically the readonly modifies the ref so perhaps it should be guarding against re-pointing.

At this time though the language doesn't allow for re-pointing of ref values. I have a lot of skepticism that it would ever be allowed. Mostly because it is of fairly limited use. Midori made heavy use of ref locals / returns and there was only one case in our extremely large code base where we ever wanted to allow for a re-point operation. Additionally allowing for re-pointing complicates the lifetime rules around ref locals significantly. Hence it's low use, extra complication ... less likely to happen.

My skepticism aside though, assume we did desire both re-pointing and the ability to guard against it. That would be in addition to guarding against mutating the target (a very good case can be made for this feature). That means logically variables can now be defined as readonly ref readonly. While that is logically correct and meaningful it probably makes most developers go "huh?".

But if we did go with this feature I'm sure we'll spend plenty of time debating ref readonly vs. readonly ref. Hard to pass up a good naming / syntax debate 😄

@xoofx
Member
xoofx commented Oct 13, 2016 edited

@jaredpar Ah, sorry, maybe I have not been clear enough. I'm not proposing to re-point the ref (I have never had a need for this, but hey, the idea could grow on me 😄), but to disallow the variable (and the struct behind it, of course) from being re-assigned entirely.

Let me take an example for a readonly ref scenario:

struct MyStruct
{
   public readonly int X;
   public int Y; 
}

public void Process(readonly ref MyStruct val)
{
   // This would not compile
   // In this case, we also disallow the field X to be modified
   // while with a regular ref, we could modify it indirectly with the following code
   val = new MyStruct();  
   // We cannot do this
   val.X++;
   // But we can do this:
   val.Y++;
   ....
}

It typically protects the variable and the readonly fields behind it, which is a nice behavior because it allows partial immutability of a struct passed by ref. A caller passing this struct to the method can be sure that the callee will not be able to modify its readonly fields (or even its private ones).

On the other hand, ref readonly would allow passing a readonly field or variable to another method:

class MyClass
{
     public static readonly MyStruct MyField;
}

public static void Process(ref readonly MyStruct val)
{
    // We cannot do this:
    val = new MyStruct();
    // And also we cannot do this:
    val.Y++;
}

Process(ref MyClass.MyField); // It would be possible

Hope it makes more sense 😅

@whoisj
whoisj commented Oct 13, 2016

It'll be difficult to make a solid case for why we need "immutable references to mutable structs", "mutable references to immutable structs", "immutable references to immutable structs", and "mutable references to mutable structs".

Seems to me (ref struct) and (readonly ref struct) are all we need. One allows mutation, the other is immutable. This is a far simpler set of things to understand, and the lost "flexibility" closes a lot of holes for bugs to sneak in through.

IMO (readonly ref struct) should be the same as (ref readonly struct), given C#'s historical laxness about keyword order.

@xoofx
Member
xoofx commented Oct 13, 2016

It'll be difficult to make a solid case for why we need "immutable references to mutable structs", "mutable references to immutable structs", "immutable references to immutable structs", and "mutable references to mutable structs".

@whoisj, I have been abusing structs for years in C#, because they are lightweight objects, interop nicely with native code, and substantially lower the pressure on the GC. While using them a lot, I have faced many problems, not only with performance but also with their safety. Being a heavy user of structs makes me look forward to more powerful abilities (e.g. ref locals/returns... but I have so many other ideas that would probably make you roll your eyes 😋) and stronger options for safety (readonly, more control over immutability). So yes, the cases you are listing as if they were small side things (e.g. who cares about safety or immutability?) are essential for me. I'm not talking from a "nice to have" place but from a "real-world usage" place, like yours, but with a different "is all we need" world if you prefer... 😉

@benaadams
benaadams commented Oct 13, 2016 edited

@whoisj I see two variants which @xoofx covers

Pass-by-value semantics at pass-by-ref cost, which I think was the ref readonly example (good for large structs and read-only use):

public static void Process(ref readonly MyStruct val)
{
    // We cannot do this:
    val = new MyStruct();
    // And also we cannot do this:
    val.Y++;

    // However, we can do this because it creates a copy, though that introduces a by-value cost
    var newVal = val;
    newVal.Y++;
}

For by-ref use where you want to allow modifications to the original but not allow overwriting its readonly fields, which is the first (readonly ref) example (semi-mutable structs):

public void Process(readonly ref MyStruct val)
{
   // This would not compile
   // In this case, we also disallow the field X to be modified
   // while with a regular ref, we could modify it indirectly with the following code
   val = new MyStruct();  
   // We cannot do this
   val.X++;
   // But we can do this:
   val.Y++;
}

Because if you can do val = new MyStruct(); then you can overwrite its readonly fields through the .ctor.
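A minimal sketch of that last point, using only what C# already allows today. The two-argument constructor and the ReadonlyFieldDemo class are hypothetical, added purely for illustration; they are not part of the proposal. With a regular ref, the callee cannot write val.X directly, yet whole-struct assignment replaces the readonly field anyway:

```csharp
using System;

public struct MyStruct
{
    public readonly int X;
    public int Y;

    // Hypothetical constructor, added only for this illustration.
    public MyStruct(int x, int y) { X = x; Y = y; }
}

public static class ReadonlyFieldDemo
{
    // A regular ref parameter permits whole-struct assignment.
    public static void Process(ref MyStruct val)
    {
        // val.X++;  // would not compile: X is readonly
        // Assigning a whole new struct replaces X through the constructor.
        val = new MyStruct(val.X + 1, val.Y);
    }

    public static int Run()
    {
        var s = new MyStruct(1, 2);
        Process(ref s);
        return s.X; // the caller's readonly field now holds 2
    }
}
```

This is the hole that the readonly ref variant discussed above is meant to close: it would forbid the assignment in Process while still allowing val.Y++.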

@xen2
xen2 commented Oct 14, 2016

+1 to readonly ref!

When you deal with large structs and want to avoid copies (think Matrix), ref makes a lot of sense.
And of course, we want to have predefined values as static readonly (e.g. Matrix.Identity).

The problem is that we can't use any of the Matrix methods that take a ref with those static readonly values (e.g. Matrix.Multiply(ref Matrix.Identity, ref matrix2)). The only options are to make a full copy beforehand or to get rid of the readonly (which brings a lot of safety issues).
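The copy workaround can be sketched as follows. This is only an illustration under stated assumptions: the 2x2 Matrix type, its field names, the three-argument Multiply signature, and the MatrixDemo class are all hypothetical stand-ins for a real math library.

```csharp
using System;

// Hypothetical 2x2 matrix, standing in for a larger real-world struct.
public struct Matrix
{
    public float M11, M12, M21, M22;

    public static readonly Matrix Identity = new Matrix { M11 = 1f, M22 = 1f };

    // Takes its inputs by ref to avoid copying them on every call.
    public static void Multiply(ref Matrix a, ref Matrix b, out Matrix result)
    {
        result = new Matrix
        {
            M11 = a.M11 * b.M11 + a.M12 * b.M21,
            M12 = a.M11 * b.M12 + a.M12 * b.M22,
            M21 = a.M21 * b.M11 + a.M22 * b.M21,
            M22 = a.M21 * b.M12 + a.M22 * b.M22,
        };
    }
}

public static class MatrixDemo
{
    public static Matrix Run()
    {
        var m = new Matrix { M11 = 2f, M12 = 3f, M21 = 4f, M22 = 5f };

        // Passing ref Matrix.Identity directly does not compile,
        // because a static readonly field cannot be passed by ref here.
        Matrix identity = Matrix.Identity; // the full copy we would like to avoid

        Matrix result;
        Matrix.Multiply(ref identity, ref m, out result);
        return result;
    }
}
```

The local copy is what a readonly-flavored ref parameter would make unnecessary: Multiply could then accept Matrix.Identity directly while promising not to modify it.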

@Thaina
Thaina commented Oct 14, 2016 edited

Well, I remember there was an argument about readonly ref. The problem is that a readonly struct may not be able to call methods (or properties), because internally a struct's method could modify its value. So it could not call any method at all.

Even a get-only property can modify the struct.

Maybe we also need readonly functions and readonly get/set to make it compatible?

@jaredpar
Member

@xoofx

but to disallow the variable (and the struct behind of course) to be re-assigned entirely.

Gotcha. That is absolutely the intent of readonly ref. It's a way to safely take a ref to a struct that lives in a readonly location. It effectively disallows all mutations, including assignments.

@xoofx
Member
xoofx commented Oct 14, 2016

the intent of readonly ref [...] It effectively disallows all mutations, including assignments.

That's a ref readonly in my terminology. 😅 A readonly ref would allow partial/controlled mutation but not full assignment (see my example above, where val.Y++; is possible for a readonly ref). This matters when you want to make sure that a callee can change the state of the mutable struct only through its public mutable API, and never touch its private/readonly fields directly.

@jaredpar
Member

@xoofx

I feel like you're trying to draw a distinction that doesn't really exist though. There is no real difference between mutating the public and non-public / readonly portions of a struct. A struct is either mutable in its entirety or not mutable at all.

This example is clearer if you consider method calls. Take for example the following, completely legal, method:

struct S
{
  public readonly int X;
  public int Y;
  private int Z;

  public void M()
  {
    this = new S();
  }
}

In your design would a readonly ref be able to call M without introducing a copy? In order to maintain the proposed semantics of readonly ref the answer must be no. This means then that readonly ref is only a useful distinction for accessible, mutable fields of a struct. I don't think that's enough of a benefit for the extra complexity.

@benaadams
benaadams commented Oct 14, 2016 edited

@xoofx for params I could see the semi-mutable variant working as an in parameter (due to the loose keyword ordering of C# on ref and readonly)

// passed by val (or register)
void Process(MyStruct val)
// fully mutable including assignment
void Process(ref MyStruct val)
// readonly struct; no assignment, no method calls (get props allowed?)
void Process(ref readonly MyStruct val)
// must be assigned in function
void Process(out MyStruct val) 
// semi-mutable struct, no assignment, but method calls & non readonly assignment fields allowed
void Process(in MyStruct val) 

However, the in parameter wouldn't make sense for a return/local while maintaining the same semantics

// value/register struct
MyStruct val0 = val;
// fully mutable including assignment
ref MyStruct val0 = ref val;
// readonly struct; no assignment, no method calls (get props allowed?)
ref readonly MyStruct val0 = ref val;
// semi-mutable struct, no assignment, but method calls & non readonly assignment fields allowed
// Not sure what would match for local

@xoofx
Member
xoofx commented Oct 14, 2016

In your design would a readonly ref be able to call M without introducing a copy?

@jaredpar Yes. The struct itself knows its state and is the owner of the implementation details. It prevents the callee from breaking anything that is not exposed by the public API of the struct, but the implementation in the struct can choose whatever is needed. Again, the readonly ref is just saying: the ref is not assignable by the callee, not that the struct behind it is readonly. But I understand that the keyword could be misleading (though if we are introducing it for other locals/params, it feels more natural to me, but well...)

As @benaadams is suggesting, another keyword would be something like in ref or refin, basically a ref that cannot be out (assigned entirely by the callee)

@jaredpar
Member

@xoofx

The ability to assign to a struct location and call methods without a copy are equivalent operations. Adding protection for one without protection for the other is just lulling developers into a false sense of confidence about their code.

This all has to do with how this is modeled. In a struct the type of this is ref T. Hence whenever you call a method on a struct the target must be convertible to ref T. That is why it's wrong from a language correctness standpoint to allow readonly ref to call a method without a copy. It's implying there is a conversion between readonly ref T and ref T.
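The copy being discussed is already observable in today's C# with readonly fields, which may help make the argument concrete. A minimal sketch follows; the Counter struct and the CopyDemo class are hypothetical names used only for this illustration. Calling a method on a struct stored in a readonly location runs on a hidden copy, precisely because the location cannot be handed out as a writable this:

```csharp
using System;

public struct Counter
{
    public int Value;

    // Inside a struct method, 'this' is a writable reference to the target.
    public void Increment() { Value++; }
}

public static class CopyDemo
{
    public static readonly Counter Frozen = new Counter();
    public static Counter Mutable;

    public static int[] Run()
    {
        Mutable = new Counter(); // reset so repeated runs behave the same

        // Frozen lives in a readonly location, so the compiler calls
        // Increment on a temporary copy and the field is left unchanged.
        Frozen.Increment();

        // Mutable is an ordinary field, so Increment runs in place.
        Mutable.Increment();

        return new[] { Frozen.Value, Mutable.Value };
    }
}
```

Here Run yields 0 for Frozen and 1 for Mutable. A readonly ref that allowed method calls without such a copy would need the conversion from readonly ref T to ref T described above.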

@baronfel baronfel referenced this issue in baronfel/issueTestRepo Oct 14, 2016
Open

Expand support for byref to match C# 7 #22

@xoofx
Member
xoofx commented Oct 14, 2016 edited

Well, if a library A provides a struct (that can be created in a valid state only by lib A using some internal constructors) and an interface with a readonly ref method, this can guarantee to an end user implementing it that they cannot modify the struct in unexpected ways that lib A hasn't covered. It provides confidence for users of lib A, but sure, it doesn't save the developer of lib A from making mistakes internally. I can't really adhere to the idea of strict equivalence between assigning to a struct location and calling a method on it... and like the transient for stackalloc for class, it could be possible to detect struct methods that make such a "violation" and have the compiler report it...
But, seeing how controversial this idea is, we can forget it; that's a good sign that it is not a good idea after all... 😉

@baronfel baronfel referenced this issue in fsharp/fslang-suggestions Oct 20, 2016
Closed

Expand support for byref to match C# 7 #22

@OndrejPetrzilka
OndrejPetrzilka commented Oct 22, 2016 edited

I'm testing ref locals and I've run into the following limitation. I declare a ref int variable as a reference to the first element of an array. How can I change where this variable points? Say I want it to point to the last element instead; is that not possible? (See my attempts below.)

static int[] data = new int[] { 0, 1, 2, 3, 4 };
static void Main(string[] args)
{
    ref int slot = ref data[0];

    slot = data[4]; // This stores value "4" into data[0], I don't want that

    // This does not work
    //ref int slot = ref data[4]; // This would be it, except variable is already declared
    //slot = ref data[4];
    //ref slot = ref data[4];

    slot = 99; // When it works, this would overwrite last element

    foreach(var item in data)
    {
        Console.WriteLine(item);
    }
    Console.ReadKey();
}

I wanted to compare the performance of this approach in my tree-like collection implemented on top of an array. Currently, when looking for an element to add/remove, I have locals like int currentIndex, int parentIndex. I thought I would use ref Node current, ref Node parent instead, but if current cannot be re-pointed inside the while loop, that won't work.
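One possible way to approach the traversal described above without re-pointing is to keep the index as the loop variable and declare a fresh ref local inside the loop body on each iteration. The sketch below is an assumption-laden illustration, not a recommendation from the thread: the Node layout (a Value plus a Next index, with -1 marking the end) and the TraversalDemo class are hypothetical.

```csharp
using System;

// Hypothetical node stored inline in an array; Next is an index, -1 marks the end.
public struct Node
{
    public int Value;
    public int Next;
}

public static class TraversalDemo
{
    public static Node[] Run()
    {
        var nodes = new[]
        {
            new Node { Value = 10, Next = 1 },
            new Node { Value = 20, Next = 2 },
            new Node { Value = 30, Next = -1 },
        };

        int index = 0;
        while (true)
        {
            // A new ref local is declared on every iteration,
            // so no existing ref is ever re-pointed.
            ref Node current = ref nodes[index];

            if (current.Next < 0)
            {
                current.Value = 99; // writes directly into the last array element
                break;
            }

            index = current.Next; // only the index survives across iterations
        }

        return nodes;
    }
}
```

The index still has to be carried between iterations, so this does not remove the index locals entirely; it only avoids repeated indexing within one iteration.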

@axel-habermaier
Contributor

@OndrejPetrzilka: AFAIK, that's unsupported. You can't reassign references in C++ either.

@OndrejPetrzilka

@axel-habermaier: That makes sense; otherwise it would probably be much harder, if not impossible, for the compiler to detect invalid use. I'm not happy about it though. Is it possible to reassign a reference in IL?

@VSadov
Member
VSadov commented Nov 16, 2016

It is possible to assign a managed pointer in IL, but it is not possible to reset a ref local or parameter in C#. Not in C# 7.

Safety of use is indeed an issue to solve here.
