Proposal: move #160

Open
stephentoub opened this Issue Jan 29, 2015 · 12 comments

@stephentoub
Member

Background

Linear transfer of ownership is a useful concept in programming. It’s the notion of taking some variable and handing out its value while at the same time making the original variable unusable for the value. For example, consider a type meant to serve as a data structure of work to be processed; it lets you add to it one element at a time and then extract from it the chunk of work previously added to it.

public sealed class WorkQueue<T>
{
    Queue<T> m_work;

    public void Enqueue(T item)
    {
        if (m_work == null) {
            m_work = new Queue<T>();
        }
        m_work.Enqueue(item);
    }

    public bool TryExtractAll(out Queue<T> queue)
    {
        queue = m_work;
        m_work = null;
        return queue != null;
    }
}

You wouldn’t want multiple calls to ‘TryExtractAll’ to return the same data, so the method explicitly nulls out the work queue after copying it and before returning it. Failure to null it out would silently result in the same elements being processed multiple times by the consumer.

As another example, consider implementing a basic stack data structure:

public sealed class SimpleStack<T>
{
    T[] m_items = new T[4];
    int m_count; 

    public int Count => m_count;

    public void Push(T item)
    {
        if (m_count == m_items.Length) {
            T[] arr = new T[m_count * 2];
            Array.Copy(m_items, arr, m_count);
            m_items = arr;
        }
        m_items[m_count++] = item;
    }

    public T Pop()
    {
        T item = m_items[--m_count];
        m_items[m_count] = default(T);
        return item;
    }
}

Here we null out the removed element before handing it back. From the perspective of processing the data, there’s no value in nulling out the element in the underlying array, but there is from a correctness perspective: if the value isn’t zeroed out, the array will retain a reference to the data even if no one else is using it, and thus could artificially extend the lifetime of the data indefinitely (or until enough elements are added back to the collection to overwrite this slot).

Problem

Both TryExtractAll and Pop in these examples implement a linear transfer of ownership, copying some value and then nulling out the original. Given how common this is, and given the reliability bugs that can result from neglecting to null out the original, language support for the concept would be beneficial (and as we’ll see later, it’s also actually required for other scenarios).

Solution

Introduce a 'move' keyword. 'move' would provide the exact behavior being discussed here: extract some value, zero out the original, and hand back the copied value, similar to the following method:

static T Move<T>(ref T location)
{
    T value = location;
    location = default(T);
    return value;
}

and could be used as follows:

string s1 = "hello"; 
string s2 = move s1;
Debug.Assert(s1 == null);
Debug.Assert(s2 == "hello");
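Since 'move' is only a proposed keyword, the same behavior can be tried today with the Move method shown above. The following is a minimal sketch, not part of the proposal: the MoveDemo class and its Main method are illustrative names. It also shows that for a value type the source is reset to its default (0) rather than to null.

```csharp
using System;

public static class MoveDemo
{
    // Stand-in for the proposed 'move' keyword: copy the value out,
    // reset the source location to default(T), and return the copy.
    public static T Move<T>(ref T location)
    {
        T value = location;
        location = default(T);
        return value;
    }

    public static void Main()
    {
        string s1 = "hello";
        string s2 = Move(ref s1);
        Console.WriteLine(s1 == null); // True
        Console.WriteLine(s2);         // hello

        int n1 = 42;
        int n2 = Move(ref n1);
        Console.WriteLine(n1);         // 0
        Console.WriteLine(n2);         // 42
    }
}
```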

With 'move', we can now re-implement our previous examples. The code bodies shrink to the point where we can easily just use expression-body syntax to implement the members:

public bool TryExtractAll(out Queue<T> queue) => (queue = move m_work) != null;
...
public T Pop() => move m_items[--m_count];

In these examples, 'move' has not only helped to ensure proper behavior, it’s also reduced the amount of code we had to write to achieve the same functionality, and less code means less chance for error.
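For comparison, here is a sketch of the same two members written against current C#, with a Move helper method in place of the keyword. The Ownership class name is an assumption made for this illustration; everything else follows the earlier examples.

```csharp
using System;
using System.Collections.Generic;

public static class Ownership
{
    // Helper standing in for the proposed 'move' keyword.
    public static T Move<T>(ref T location)
    {
        T value = location;
        location = default(T);
        return value;
    }
}

public sealed class WorkQueue<T>
{
    Queue<T> m_work;

    public void Enqueue(T item)
    {
        if (m_work == null) {
            m_work = new Queue<T>();
        }
        m_work.Enqueue(item);
    }

    // Hands out the accumulated work and clears the field in one expression.
    public bool TryExtractAll(out Queue<T> queue) =>
        (queue = Ownership.Move(ref m_work)) != null;
}

public sealed class SimpleStack<T>
{
    T[] m_items = new T[4];
    int m_count;

    public int Count => m_count;

    public void Push(T item)
    {
        if (m_count == m_items.Length) {
            T[] arr = new T[m_count * 2];
            Array.Copy(m_items, arr, m_count);
            m_items = arr;
        }
        m_items[m_count++] = item;
    }

    // Returns the top element and clears its slot so the array no longer
    // keeps the element alive. As in the original, an empty stack is not
    // checked for here.
    public T Pop() => Ownership.Move(ref m_items[--m_count]);
}
```

Passing `ref m_items[--m_count]` evaluates the decrement once, so the helper reads and clears the same slot.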

It’s important to note that 'move' does not provide any atomicity guarantees. In other words, it doesn’t atomically extract the value and zero out the original. This means that you still need to be careful when performing an operation like this with multiple threads concurrently accessing the same data; in those cases, use methods like Interlocked.Exchange.
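When the extraction does need to be atomic, one option is to perform the exchange with Interlocked.Exchange directly. The sketch below is an assumption-laden illustration rather than part of the proposal: BatchSlot is a hypothetical type that holds at most one complete batch, so that the Queue<T> itself is never mutated by two threads at once.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical single-slot holder: a producer publishes a whole batch and
// a consumer takes it, with each hand-off done as one atomic exchange.
public sealed class BatchSlot<T>
{
    Queue<T> m_batch;

    // Atomically stores a new batch and returns whatever batch it displaced
    // (null if the slot was empty).
    public Queue<T> Publish(Queue<T> batch) =>
        Interlocked.Exchange(ref m_batch, batch);

    // Atomically takes the current batch and leaves null behind, so two
    // racing consumers can never both receive the same batch.
    public bool TryExtract(out Queue<T> batch) =>
        (batch = Interlocked.Exchange(ref m_batch, null)) != null;
}

public static class BatchSlotDemo
{
    public static void Main()
    {
        var slot = new BatchSlot<int>();
        slot.Publish(new Queue<int>(new[] { 1, 2, 3 }));

        Queue<int> batch;
        Console.WriteLine(slot.TryExtract(out batch)); // True
        Console.WriteLine(batch.Count);                // 3
        Console.WriteLine(slot.TryExtract(out batch)); // False
    }
}
```

The exchange only makes the hand-off of the reference atomic; the contents of a batch still must not be modified by several threads at the same time.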

On its own, 'move' isn't particularly valuable; after all, its functionality can be achieved using a Move method like that previously shown. Its value comes from the compiler understanding the implications of it, which enables additional features to be implemented that rely on the compiler having this knowledge: see #161.

@sharwell
Member

For types that fit in a machine word, you have this:

string s1 = "hello";
string s2 = Interlocked.Exchange(ref s1, null);

The definition of move s1 could be a straight translation to:

Interlocked.Exchange(ref s1, default({typeof s1}))

Obviously this is shorter, but I'm not (yet) sure it's worth the additional complexity.

@HaloFour

It feels weird to have a keyword dedicated to something like this. Maybe some kind of exchange operator?

string s1 = "hello";
string s2 <= s1;

I'm not proposing <= specifically, I don't even really care for it, but no character combination immediately came to mind.

Could the same feature be used for more general purpose swaps, instead of zeroing out the source?

int x = 1, y = 2;
x = swap y;
Debug.Assert(x == 2);
Debug.Assert(y == 1);
@MgSam
MgSam commented Jan 30, 2015

Agree that this seems like way too specialized a scenario for language support. It saves only a few keystrokes and is only useful in a very small set of circumstances.

The only real benefit I can see is in contexts where an expression is required. In these situations however, the ; operator would be a better solution as it allows you to accomplish what you want while being much more general purpose.

@stephentoub
Member

Regarding value, please see the last paragraph of the proposal above:

On its own, 'move' isn't particularly valuable; after all, its functionality can be achieved using a Move method like that previously shown. Its value comes from the compiler understanding the implications of it, which enables additional features to be implemented that rely on the compiler having this knowledge: see #161.

I simply separated it out from #161 since it can stand on its own.

@gafter gafter added the 1 - Planning label Feb 2, 2015
@MadsTorgersen MadsTorgersen was assigned by gafter Feb 2, 2015
@dpaoliello
Contributor

For locals, would it be better to "uninitialize" the variable instead of setting it to default(T):

var foo = move bar;
Console.WriteLine(bar.ToString()); // Compilation error: 'bar' is uninitialized
@gafter
Member
gafter commented Feb 25, 2015

@dpaoliello I think making the variable not definitely assigned is a great idea. But I think it would still have to be assigned default(T), as the variable may have been captured by a lambda or ref variable (see #118) and used elsewhere.
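The capture concern can be reproduced in current C# with a Move helper method standing in for the keyword. This is only a sketch; CaptureDemo and readBar are illustrative names. Because a lambda captures the variable itself rather than a snapshot of its value, it observes whatever the variable holds after the move, which is why the variable still needs a well-defined value.

```csharp
using System;

public static class CaptureDemo
{
    // Helper standing in for the proposed 'move' keyword.
    public static T Move<T>(ref T location)
    {
        T value = location;
        location = default(T);
        return value;
    }

    public static void Main()
    {
        string bar = "payload";

        // The lambda captures the variable 'bar', not its current value.
        Func<string> readBar = () => bar;

        string foo = Move(ref bar);

        Console.WriteLine(foo);               // payload
        Console.WriteLine(readBar() == null); // True
    }
}
```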

@paulomorgado

@gafter, I think it needs to be one way or the other. And I like @dpaoliello 's suggestion better than @stephentoub 's. And this is where the compiler can bring value over simply using an Interlocked method.

If the variable has already been captured, this might be a problem and the user must choose whether she wants the variable captured or its value moved. And it's not hard to work around it.

This:

var a = 2;

var t = Task.Run(() => { DoSomething(); DoSomethingWith(a); });

var b := a; // compiler error

Would have to be this:

var a = 2;
var aa = a;

var t = Task.Run(() => { DoSomething(); DoSomethingWith(aa); });

var b := a;

By the way, I tried out :=. Not a good idea, though, because it's already used for attribute property initializers.

@gafter
Member
gafter commented Feb 26, 2015

@paulomorgado Generally when you have two otherwise orthogonal language features, if there is a special rule for when they are used together that is a bad language design smell. So your suggestion that move() and capture are mutually exclusive smells bad to me.

This feature was carefully designed to be part of a coherent set of features including #161, for which there is some practical experience. Is that feature set still coherent with your suggested change?

@gafter gafter added 0 - Backlog and removed 1 - Planning labels Nov 20, 2015
@Unknown6656

@HaloFour: How about the syntax a <- b? <= is already used in comparison expressions 😜

int a = 5;
int b <- a;
// b has the value 5;

int a = 5, b = 2;
a swap b;
// a has the value 2, and b the value 5
// could also be achieved using the following line:
a ^= b; b ^= a; a ^= b; // the chained form a ^= b ^= a ^= b leaves a == 0 in C#
@HaloFour
HaloFour commented Jul 5, 2016

@Unknown6656

b <- a is already a legal expression, testing whether b is less than negated a, or b < -a.

@Unknown6656

@HaloFour: damn, I forgot ... maybe the token <> or <|> could be used to indicate a swap and some sort of pipe (e.g. <|) for a value move...

@sirgru
sirgru commented Dec 30, 2016

I think it would be required to have a dedicated keyword for this operation, otherwise you would not be able to write:

SomeMethod(move omod1);

without

var omod2 ~= omod1;
SomeMethod(omod2);

which is not quite the same.
It also makes sense by analogy with the new keyword: the same way we can 'new something up', we can 'move an existing value'.
If there were a shorthand, it would make most sense to me to write ~=, as in 'destroy the previous one and assign'.
