SmartRValueReferences

David Jewsbury edited this page Mar 21, 2015 · 3 revisions

#Smart use of rvalue references

Consider carefully how you use rvalue references, move operators and move constructors. Use them intelligently and avoid common problems.

##C++ is new again!

Even with all of the cool new things that C++11 brings, rvalue references stand out as a wonderful new addition to the language. This feels like the language feature that C++ has been missing for so long!

I remember the first time I heard that the working group was considering this feature. It was part of a mailing list discussion related to pointer aliasing and return-by-value compiler optimisations. But even then, it was clear that this idea was a really big change to the language.

rvalue references are not just about avoiding a few unnecessary copies. They make so many things much easier. Consider:

  • viable smart pointers
      • without rvalue references, std::unique_ptr<> would just be std::auto_ptr<>: a hacked version of move semantics without compiler support
  • more viable to pass-by-value and return-by-value
      • return-by-value problems led to compilers implementing hidden hacks to avoid inefficiencies
      • but now those hacks collapse seamlessly into the wider concept of "moving"
  • compiler-enforced memory management behaviour
      • see the page on Smart pointers in function signatures. The methods described there require move operations to work
  • fewer chances for exceptions to occur
      • move operations typically never throw exceptions, but copy operations can throw
      • so, using a move where there previously would have been a copy reduces the possibility that an exception will occur
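Two of these points can be checked directly. The following is a minimal sketch, not taken from any particular codebase: TakeOwnership and OwnershipDemo are hypothetical names invented for illustration. It shows ownership being handed over explicitly with std::unique_ptr<>, and uses type traits to confirm that moving a std::string is declared non-throwing while copying it is not.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical sink: takes ownership of the pointer it is given.
inline int TakeOwnership(std::unique_ptr<int> ptr)
{
  return *ptr;   // ptr (and the int it owns) is destroyed on return
}

inline int OwnershipDemo()
{
  std::unique_ptr<int> owner(new int(42));
  // TakeOwnership(owner);                        // would not compile: unique_ptr cannot be copied
  int value = TakeOwnership(std::move(owner));    // ownership is transferred explicitly
  assert(owner == nullptr);                       // unique_ptr is guaranteed to be empty after the move
  return value;
}

// Moving a std::string is declared noexcept; copying it may allocate, so it can throw.
static_assert(std::is_nothrow_move_constructible<std::string>::value,
              "moving a std::string cannot throw");
static_assert(!std::is_nothrow_copy_constructible<std::string>::value,
              "copying a std::string may throw");
```

The commented-out call is the "compiler-enforced" part: the only way to pass the pointer on is to say std::move() at the call site.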

##rvalue reference...?

If you haven't used move operators yet, here's a quick run-down.

An rvalue reference is the type of reference we get from std::move().

Object obj;
Object&& rvalueReference = std::move(obj);

By contrast, the old type of reference is now called an lvalue reference:

Object obj;
const Object& lvalueReference = obj;

It's called an "rvalue" because it represents the right-hand side of an assignment expression. Consider the following:

obj = Object("SomeInitialiser");

In the above, obj is the "lvalue." Object("SomeInitialiser") is the rvalue. Here, if obj has a move operator (that is, a move assignment operator), then that move operator will be invoked with an rvalue reference to the rvalue.

Here's the interesting thing... Real rvalues don't have a name. In the above example, Object("SomeInitialiser") constructs a new Object, but that object never gets a name. It's a compiler-generated temporary, without a name. So it's an rvalue.

Because it doesn't have a name, we can never refer to that temporary object again. We can't do anything to it, except destroy it.

We can convert an lvalue to an rvalue using std::move().

Object lvalueObj("SomeInitialiser");
obj = std::move(lvalueObj);

Here, we construct an Object and give it the name "lvalueObj." It has a name, so it's an lvalue. But then we use std::move() to convert it into an rvalue. In theory, during this conversion, we also revoke its name!
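To see which operator the compiler picks in each case, here is a small sketch. The Object class below is a hypothetical stand-in for the one in the snippets above (its members and the AssignmentDemo function are invented for illustration); it simply records which assignment operator ran last.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the Object class used in the text.
class Object
{
public:
  explicit Object(const char* name = "") : _name(name), _lastAssign("none") {}
  Object(const Object&) = default;
  Object(Object&&) = default;

  Object& operator=(const Object& other)   // selected when the right-hand side is an lvalue
  {
    _name = other._name;
    _lastAssign = "copy";
    return *this;
  }

  Object& operator=(Object&& other)        // selected when the right-hand side is an rvalue
  {
    _name = std::move(other._name);
    _lastAssign = "move";
    return *this;
  }

  std::string _name;
  std::string _lastAssign;
};

inline std::string AssignmentDemo()
{
  std::string log;
  Object obj;

  obj = Object("SomeInitialiser");   // unnamed temporary: an rvalue
  log += obj._lastAssign;

  Object lvalueObj("Named");
  obj = lvalueObj;                   // named object: an lvalue
  log += "," + obj._lastAssign;

  obj = std::move(lvalueObj);        // lvalue converted to an rvalue
  log += "," + obj._lastAssign;
  assert(obj._name == "Named");

  // Object&& ref = lvalueObj;       // would not compile: an rvalue reference cannot bind to an lvalue
  return log;                        // "move,copy,move"
}
```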

There are more detailed explanations elsewhere on the internet. But this now leads us to the first problem...

##Common problems

###Don't use objects after moving them

Consider:

std::string str("Some String");
AVerySmartFunction(std::move(str));
str = "New String";

What would happen in the above case?

When we use std::move, we're giving a function or object permission to destroy our object. Here, AVerySmartFunction has permission to use str as a parameter to a move operator or move constructor.

After an object has been used as a parameter to one of these, it is typically left in an unspecified state (some types do promise more: a moved-from std::unique_ptr<> will just behave like a normal empty pointer). Most move operators will move all of the members, which will "blank-out" the original object. Imagine the following class:

class Foo
{
public:
  unsigned _int;
  std::vector<float> _floats;
  Foo(Foo&& moveFrom);
};

Foo::Foo(Foo&& moveFrom)
: _int(moveFrom._int)
, _floats(std::move(moveFrom._floats))
{}

Foo originalObject;
Foo newObject(std::move(originalObject));

What happens to originalObject after the call to the move constructor? Some members (like _floats) will get blanked-out as a result of the move operation. But not all members can be moved. Members that can't be moved will usually get copied instead. In this case the member _int still retains its previous value.

This leaves originalObject in a very unusual state. It has reverted to a partially initialised state. In this state, it is safe to call the destructor. But any other operations with the object may result in undefined (or unexpected) behaviour.

This is particularly worrisome when using a compiler-generated move operator. The author of the class may never have anticipated that!

This is maybe the biggest problem with std::move. Normally we want to avoid these kinds of partially initialised states as much as possible. But the nature of std::move() means it's necessary here.
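The mixed state can be observed directly. This sketch reuses the Foo class from above, with a default constructor added so that it compiles on its own; MovedFromState and InspectMovedFrom are hypothetical helpers invented for illustration. It relies on the guarantee that a std::vector is empty after being used as the source of a move constructor; most other types promise much less.

```cpp
#include <cassert>
#include <utility>
#include <vector>

class Foo
{
public:
  unsigned _int;
  std::vector<float> _floats;
  Foo() : _int(0) {}          // added for this sketch so that Foo can be constructed
  Foo(Foo&& moveFrom);
};

Foo::Foo(Foo&& moveFrom)
: _int(moveFrom._int)                     // unsigned has no "move": this is a plain copy
, _floats(std::move(moveFrom._floats))    // the vector's storage is taken over
{}

// Hypothetical helper type: reports what was left behind in the source object.
struct MovedFromState { unsigned intValue; bool floatsEmpty; };

inline MovedFromState InspectMovedFrom()
{
  Foo originalObject;
  originalObject._int = 7;
  originalObject._floats.assign(3, 1.0f);

  Foo newObject(std::move(originalObject));
  assert(newObject._floats.size() == 3);

  // Reading originalObject here is exactly what the text advises against;
  // it is done only to show the half-blanked state.
  MovedFromState state = { originalObject._int, originalObject._floats.empty() };
  return state;
}
```

One member was blanked-out and the other kept its old value, which is the partially initialised state described above.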

####Revoke the name!

I said before that std::move() revokes the "name" of an object. The object becomes not only partially initialised, but also "unnamed."

This is how we should think about it. After an lvalue has been wrapped by std::move(), we should never refer to it again.

However, it can be difficult to catch this situation. It's easy to use std::move(), and accidentally refer to the same object again. The compiler will not generate any errors or warnings. Worse, some objects may appear to work fine if they are used after being a parameter to a move operator or move constructor.

Also, consider the above example with AVerySmartFunction(). We don't know if this function is really going to use the object as a parameter to a move operator or move constructor.

Maybe the current implementation of AVerySmartFunction() won't call any move operators or move constructors. So today, it's fine. But maybe sometime in the future, AVerySmartFunction() could be changed. Then the calling code could mysteriously (and unexpectedly) stop working.
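The underlying reason is that std::move() is only a cast: nothing is moved until a move constructor or move operator actually runs. A rough sketch of the "today" and "future" versions of such a function follows; MeasureOnly, StealContents and the two caller functions are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical "today" version: accepts an rvalue reference but only reads it.
inline std::string::size_type MeasureOnly(std::string&& text)
{
  return text.size();                    // no move constructor or move operator runs
}

// Hypothetical "future" version: same signature, but now takes the contents.
inline std::string StealContents(std::string&& text)
{
  std::string stolen(std::move(text));   // the move constructor runs here
  return stolen;
}

inline bool CallerStillHasString()
{
  std::string str("Some String");
  MeasureOnly(std::move(str));           // std::move() by itself changes nothing
  return str == "Some String";           // true, but only by luck of the implementation
}

inline std::string CallerAfterSteal()
{
  std::string str("Some String");
  std::string taken = StealContents(std::move(str));
  assert(taken == "Some String");
  // str is now in an unspecified state; the caller should not rely on its contents
  return taken;
}
```

Both callers look identical at the call site, which is why treating the name as revoked is the safer habit.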

####Exception for swap method

Every rule has an exception. Consider the following code:

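#include <utility>  // std::move

template<typename Type>
  void Swap(Type& lhs, Type& rhs) noexcept  // assumes Type's move operations never throw
{
  Type temp = std::move(lhs);   // move data out of lhs
  lhs = std::move(rhs);         // move new data into lhs
  rhs = std::move(temp);        // move new data into rhs
}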

This function will swap the contents of two objects using the move constructor and move operator. On the first line of the function, we move data out of lhs. But on the next line, we then move new data into lhs. (We then do the same thing for rhs).

This violates our "revoke the name" rule, because we are using lhs after moving data out of it. But it's necessary for this type of swap behaviour. And this is useful because it only uses the move constructor and move operator of Type and those should never throw.

So this leaves us with two valid things that can be done with an object after its contents have been moved out:

  • the object can be destroyed
  • or we can move new data into the object

It seems unlikely that we would need to do this kind of thing in any situation other than implementing a swap pattern like this. But then again, every rule has its exception.

It would be interesting to consider whether copying new data into an object should be allowed. Conceivably, the preconditions for the copy operator should be similar to those for the move operator. But why would we need to do that?
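As a usage sketch, a swap written this way also works for types that cannot be copied at all. The Swap below has the same shape as the one in the text, with noexcept written on the assumption that Type's move constructor and move operator never throw; SwapDemo is a hypothetical name invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Same shape as the Swap in the text. noexcept is an assumption here:
// it is only appropriate if Type's move operations never throw.
template<typename Type>
  void Swap(Type& lhs, Type& rhs) noexcept
{
  Type temp = std::move(lhs);
  lhs = std::move(rhs);     // move new data into the moved-from lhs
  rhs = std::move(temp);    // move new data into the moved-from rhs
}

inline int SwapDemo()
{
  std::unique_ptr<int> a(new int(1));
  std::unique_ptr<int> b(new int(2));
  Swap(a, b);               // works even though unique_ptr cannot be copied
  assert(a && b);           // both pointers own an int again
  return *a * 10 + *b;      // 21 once the values have been exchanged
}
```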

###Copy constructors and copy operators can hide move constructors and move operators

Consider:

// class FooBar, written 5 years ago
class FooBar
{
public:
  FooBar();
  FooBar(const FooBar& copyFrom);
private:
  std::vector<float> _floats;
};

// client code, written in new C++
FooBar originalObject;
FooBar newObject(std::move(originalObject));

A lot of older code will have explicitly defined copy constructors and copy operators. But what happens when we use that code with a new compiler, like the above example? The caller is hoping that newObject will be constructed with a compiler-generated move constructor.

However, the standard prevents compilers from generating move constructors when the copy constructor is explicitly defined. This means that FooBar can never be move constructed. Even when it looks like the move constructor will be called, instead we get the copy constructor.
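The silent fallback can be made visible with a counter. In this sketch the s_copies counter, the HasData method and the MoveRequestCopies function are hypothetical instrumentation added for illustration; the rest follows the FooBar class above.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Sketch of the "old" FooBar, with a copy counter added for illustration.
class FooBar
{
public:
  FooBar() : _floats(100, 1.0f) {}
  FooBar(const FooBar& copyFrom) : _floats(copyFrom._floats) { ++s_copies; }
  bool HasData() const { return _floats.size() == 100; }
  static int s_copies;      // hypothetical instrumentation, not part of the original class
private:
  std::vector<float> _floats;
};
int FooBar::s_copies = 0;

// The trait still says "move constructible", because const FooBar& also accepts rvalues.
static_assert(std::is_move_constructible<FooBar>::value,
              "construction from an rvalue compiles, via the copy constructor");

inline bool MoveRequestCopies()
{
  FooBar::s_copies = 0;
  FooBar originalObject;
  FooBar newObject(std::move(originalObject));   // compiles without complaint, but copies
  assert(newObject.HasData());
  return FooBar::s_copies == 1 && originalObject.HasData();
}
```

Note that the type trait is no help in detecting this: it only reports that the expression compiles, not which constructor is chosen.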

This can be confusing in cases like this:

std::vector<FooBar>& vector = ...;
vector.insert(i, FooBar());

Do we get move operator optimisations when vector resizes? Nope, just the copy constructor.
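The effect on vector growth can be measured with a pair of instrumented types. Everything in this sketch (Counts, CopyOnly, NothrowMovable, ForceReallocation) is hypothetical and invented for illustration. It fills a vector to its capacity, then adds one more element so that the existing elements must be relocated into new storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Counts { int copies; int moves; };

// Hypothetical type with only a user-declared copy constructor (no move is generated).
struct CopyOnly
{
  static Counts s_counts;
  CopyOnly() {}
  CopyOnly(const CopyOnly&) { ++s_counts.copies; }
};
Counts CopyOnly::s_counts = { 0, 0 };

// Hypothetical type that also declares a noexcept move constructor.
struct NothrowMovable
{
  static Counts s_counts;
  NothrowMovable() {}
  NothrowMovable(const NothrowMovable&) { ++s_counts.copies; }
  NothrowMovable(NothrowMovable&&) noexcept { ++s_counts.moves; }
};
Counts NothrowMovable::s_counts = { 0, 0 };

// Fills a vector to capacity, resets the counters, then adds one more element.
// Returns how many existing elements had to be relocated.
template<typename Type>
  std::size_t ForceReallocation()
{
  std::vector<Type> vector;
  vector.reserve(4);
  while (vector.size() < vector.capacity()) vector.emplace_back();
  const std::size_t relocated = vector.size();
  Type::s_counts = Counts{ 0, 0 };
  vector.emplace_back();    // capacity exceeded: existing elements go to new storage
  assert(vector.size() == relocated + 1);
  return relocated;
}
```

With CopyOnly, every relocated element is copied. With NothrowMovable, every relocated element is moved instead. The noexcept matters: std::vector generally prefers copying during reallocation if the move constructor might throw.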

The same thing happens with the old idiom for hiding the copy operator and copy constructor:

class FooBar
{
private:
    // prevent compiler from generating copy operations
  FooBar(const FooBar&);
  const FooBar& operator=(const FooBar&);
};

Now we can't copy FooBar... But we can't move it, either.

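Instead, use the new syntax, and ask for the move operations explicitly. Deleted copy operations still count as declared, so on their own they would suppress the compiler-generated move operations just like the old idiom does:

class FooBar
{
public:
  FooBar() = default;
  FooBar(const FooBar&) = delete;
  const FooBar& operator=(const FooBar&) = delete;
  FooBar(FooBar&&) = default;             // explicitly re-enable moving
  FooBar& operator=(FooBar&&) = default;
};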
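Type traits give a quick way to check what a class really supports. The sketch below compares two hypothetical classes invented for illustration: DeletedCopyOnly deletes its copy operations and declares nothing else, while MoveOnlyFooBar additionally defaults its move operations. The _floats member is left public only to keep the demo short.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Copy operations deleted, nothing else declared. The deleted copy constructor
// still counts as declared, so no move constructor is generated.
class DeletedCopyOnly
{
public:
  DeletedCopyOnly() = default;
  DeletedCopyOnly(const DeletedCopyOnly&) = delete;
  DeletedCopyOnly& operator=(const DeletedCopyOnly&) = delete;
};

// Copy operations deleted, move operations explicitly defaulted.
class MoveOnlyFooBar
{
public:
  MoveOnlyFooBar() = default;
  MoveOnlyFooBar(const MoveOnlyFooBar&) = delete;
  MoveOnlyFooBar& operator=(const MoveOnlyFooBar&) = delete;
  MoveOnlyFooBar(MoveOnlyFooBar&&) = default;
  MoveOnlyFooBar& operator=(MoveOnlyFooBar&&) = default;
  std::vector<float> _floats;   // public only to keep this sketch short
};

static_assert(!std::is_copy_constructible<DeletedCopyOnly>::value, "cannot be copied");
static_assert(!std::is_move_constructible<DeletedCopyOnly>::value, "cannot be moved either");
static_assert(!std::is_copy_constructible<MoveOnlyFooBar>::value, "cannot be copied");
static_assert(std::is_move_constructible<MoveOnlyFooBar>::value, "can be move constructed");
static_assert(std::is_move_assignable<MoveOnlyFooBar>::value, "can be move assigned");

inline bool MoveOnlyDemo()
{
  MoveOnlyFooBar first;
  first._floats.push_back(1.0f);
  MoveOnlyFooBar second(std::move(first));   // uses the defaulted move constructor
  assert(second._floats.size() == 1);
  return second._floats.size() == 1;
}
```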

##Conclusion

Every new language feature brings with it new dangers. It wasn't that long ago that many people refused to use the STL. They thought it was too easy to make mistakes with it.

But the STL is powerful, if used the right way. It's the same with rvalue references. Sure, if you use them the wrong way, you can just end up hurting yourself.

But if you use them right... Well, this is the language feature that C++ has been lacking for too long.