In [1]:
#include "../common.hpp"

# Types and Safety

**Goal: Write _complete_, _expressive_, and _efficient_ types**

## Exercises

**SKIP** Following cells are skipped for workshop

**Exercise:** Find a type in your project which is not equationally complete and make it so.

- Why?
    - An equationally complete type is easier to test
        - If you cannot read a property, how do you validate it?
    - Considering how to make a type equationally complete forces you to think through the properties of the type

- Considerations
    - Only properties with associated constraints (invariants) and relationships require accessor member functions
    - Providing direct data access is preferred to boilerplate _getters and setters_
    - The Objective-C naming conventions can make an API clearer
        - Reading a property is simply the name of the property, i.e. `property()`
        - Writing a property is done with `set_property()`

```cpp
v.resize(10);
auto s = v.size();

v.reserve(10);
auto c = v.capacity();
```
vs.
```cpp
v.set_size(10);
auto s = v.size();

v.set_capacity(10);
auto c = v.capacity();
```
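
As a minimal sketch of the naming convention on a user-defined type, assuming _equationally complete_ means that every property equality depends on can be read through the public interface: the type `temperature`, its `kelvin` property, and the namespace `naming_sketch` are hypothetical names chosen for illustration.

```cpp
#include <cassert>

namespace naming_sketch {

// Hypothetical type with one constrained property: a temperature in kelvin
// that must never be negative.
class temperature {
    double _kelvin = 0.0;

public:
    temperature() = default;
    explicit temperature(double kelvin) { set_kelvin(kelvin); }

    // Reading the property is simply the name of the property.
    double kelvin() const { return _kelvin; }

    // Writing the property is set_property(); the setter guards the invariant.
    void set_kelvin(double kelvin) {
        assert(kelvin >= 0.0); // precondition
        _kelvin = kelvin;
    }
};

// Equality needs no friend access: every property it depends on is readable.
inline bool operator==(const temperature& a, const temperature& b) {
    return a.kelvin() == b.kelvin();
}
inline bool operator!=(const temperature& a, const temperature& b) {
    return !(a == b);
}

} // namespace naming_sketch
```

Because equality is written only in terms of `kelvin()`, a test can validate the same property that equality compares.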

**Exercise:** `my_type` contains a bug. Find the bug. Fix it using at least two different
approaches. What are the trade-offs?

In [2]:
namespace library {

class my_type {
    struct implementation;             // forward declaration
    implementation* _remote = nullptr; // remote part
public:
    // declare the basis operations - implementation is in a .cpp file
    my_type(int x, int y);
    ~my_type();
    my_type(const my_type&);
    my_type& operator=(const my_type&);

    my_type(my_type&& a) noexcept : _remote{a._remote} { a._remote = nullptr; }
    my_type& operator=(my_type&& a) noexcept;
};

} // namespace library

In [3]:
namespace library {

struct my_type::implementation {
    int _x;
    int _y;
    //...
};

my_type::my_type(int x, int y) : _remote{new implementation{x, y}} {}
my_type::~my_type() { delete _remote; }
my_type::my_type(const my_type& a) : _remote{new implementation{*a._remote}} {}
my_type& my_type::operator=(const my_type& a) {
    *_remote = *a._remote;
    return *this;
}
my_type& my_type::operator=(my_type&& a) noexcept {
    delete _remote;
    _remote = a._remote;
    a._remote = nullptr;
    return *this;
}

} // namespace library

- What bug?

```cpp
{
    using namespace library;
    
    my_type a{10, 20};
    my_type b{12, 30};
    
    b = move(a);
    a = b;
}
```

```
input_line_9:11:6: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
    *_remote = *a._remote;
     ^~~~~~~
Interpreter Exception: 
```

```cpp
// b = move(a);

my_type& my_type::operator=(my_type&& a) noexcept {
    delete _remote;
    _remote = a._remote;
    a._remote = nullptr; // <--
    return *this;
}
```

```cpp
// a = b;

my_type& my_type::operator=(const my_type& a) {
    *_remote = *a._remote;
//   ^~~~~~~ nullptr dereference
    return *this;
}
```

- Some operations _must_ be valid on the otherwise unspecified state
    - destruction
    - copy and move assigning to the object (to establish a new value)
    - self move assignment (for self-swap)
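
The requirements above can be turned into a small check. This is only a sketch: the helper `moved_from_is_assignable` and the namespace `moved_from_sketch` are hypothetical, the type under test is assumed to be copyable, movable, and equality comparable, and self move assignment is left out.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

namespace moved_from_sketch {

// Hypothetical helper. Returns true if assigning to a moved-from object
// establishes the new value.
template <class T>
bool moved_from_is_assignable(T value, const T& other) {
    T sink1 = std::move(value); // value is now in an unspecified state
    value = other;              // copy assign to the moved-from object
    if (!(value == other)) return false;

    T sink2 = std::move(value);
    value = T(other);           // move assign to the moved-from object
    if (!(value == other)) return false;

    T sink3 = std::move(value);
    return true;                // value is destroyed here while moved-from
}

inline void example() {
    assert(moved_from_is_assignable(std::string("hello world"), std::string("other")));
    assert(moved_from_is_assignable(std::vector<int>{1, 2, 3}, std::vector<int>{4}));
}

} // namespace moved_from_sketch
```

The original `library::my_type` would crash in the copy assignment step of such a check, provided it also had an equality operator.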

## Safety

- An object which represents an entity is _fully formed_.
- An object which does not represent an entity is _partially formed_.

- Any operation which maintains the correspondence between an object and an entity it represents is _safe_
- An operation which loses the correspondence is _unsafe_

- There are different categories of safety
    - e.g. _memory safety_
        - Destroying the correspondence of unrelated objects to an entity ultimately causes the bug

- An operation is _operationally safe_ if, when the operation pre-conditions are satisfied, the operation results in objects which are fully formed
- An operation is _operationally unsafe_ if, when the operation pre-conditions are satisfied, the operation may result in an object which is not fully formed
    - From here on, when referring to a _safe_ operation we mean _operationally safe_

- As a general rule
    - Only safe operations should be public
    - Unsafe operations should be private
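
A minimal sketch of this rule, using a hypothetical type `sorted_ints` (a set of ints stored as a sorted vector) in a hypothetical namespace `access_sketch`. The invariant "sorted, no duplicates" stands in for the correspondence to the entity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

namespace access_sketch {

class sorted_ints {
    std::vector<int> _values; // invariant: sorted, no duplicates

    // Unsafe: appending may break the invariant, so it is private.
    void append_unchecked(int x) { _values.push_back(x); }

    // Restores the invariant after unsafe steps.
    void normalize() {
        std::sort(_values.begin(), _values.end());
        _values.erase(std::unique(_values.begin(), _values.end()), _values.end());
    }

public:
    // Safe: built from unsafe steps, but always leaves the object fully formed.
    void insert(int x) {
        append_unchecked(x);
        normalize();
    }

    // Relies on the invariant: binary search requires a sorted range.
    bool contains(int x) const {
        return std::binary_search(_values.begin(), _values.end(), x);
    }

    std::size_t size() const { return _values.size(); }
};

inline void example() {
    sorted_ints s;
    s.insert(3);
    s.insert(1);
    s.insert(3);
    assert(s.size() == 2);
    assert(s.contains(1));
}

} // namespace access_sketch
```

Re-sorting on every insert is deliberately simple rather than efficient; the point is only where the public/private line falls.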
    
- Moving from an object _may_ leave the object in a "valid but **unspecified**" state
    - _Unspecified_ means without a correspondence to an entity
    - Move is a public unsafe operation: it may leave the moved-from object in a partially formed state
    
- There is a trade-off between safety and efficiency
    - Not every operation can be implemented to be both safe and efficient (provably)

- There are many examples of unsafe operations with the built-in and standard library types:

In [4]:
{
    int x; // unspecified (indeterminate value; reading it is undefined behavior)
    cout << x << endl;
}

32766


In [5]:
{
    double x = 0.0/0.0; // explicitly undefined
    cout << x << endl;
}

nan


In [6]:
{
    string x = "hello world";
    string y = move(x); // unspecified
    cout << x << endl;
}




In [7]:
{
    unique_ptr<int> x = make_unique<int>(42);
    unique_ptr<int> y = move(x); // safe! x is guaranteed to be == nullptr
}

- After an unsafe operation where an object is left partially formed
    - Subsequent operations are required to restore the fully formed state prior to use
        - If the partially formed state is _explicit_ it may be used in subsequent operations, but those operations must yield explicitly undefined values for later detection and handling
        - e.g. NaN, expected, maybe-monad pattern
    - Or the object must be destroyed
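
A minimal sketch of an explicit partially formed state, using `std::optional` as one possible stand-in for the maybe-monad pattern. The functions `parse_digit`, `add`, and `half`, and the namespace `explicit_sketch`, are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

namespace explicit_sketch {

// Hypothetical parser: a non-digit yields an explicit "no value" state
// rather than a partially formed int.
inline std::optional<int> parse_digit(char c) {
    if (c < '0' || c > '9') return std::nullopt;
    return c - '0';
}

// An operation on the explicit state must propagate it for later detection.
inline std::optional<int> add(std::optional<int> a, std::optional<int> b) {
    if (!a || !b) return std::nullopt;
    return *a + *b;
}

// NaN plays the same role for double: arithmetic on NaN yields NaN.
inline double half(double x) { return x / 2.0; }

inline void example() {
    assert(add(parse_digit('4'), parse_digit('2')).value() == 6);
    assert(!add(parse_digit('4'), parse_digit('x')).has_value());
    assert(std::isnan(half(std::nan(""))));
}

} // namespace explicit_sketch
```

In both cases the undefined value survives the computation, so the caller can detect and handle it at the end instead of after every step.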

- An _implicit move_, one generated by the compiler, always occurs on an expiring value
    - This means the combined operation of `op(rv); rv.~T();` is safe
- `std::move()` is equivalent to `static_cast<T&&>()`
    - Explicit move is unsafe
    - Circumventing the type system requires additional care
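
The difference between the two kinds of move can be sketched as follows. The functions `make_greeting` and `take_and_reset` and the namespace `move_sketch` are hypothetical names; the `static_assert`s only confirm that `std::move` and the cast produce the same type.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

namespace move_sketch {

// std::move(x) only changes the value category; the result type is T&&,
// the same as the static_cast.
static_assert(std::is_same_v<decltype(std::move(std::declval<std::string&>())),
                             std::string&&>);
static_assert(std::is_same_v<decltype(static_cast<std::string&&>(std::declval<std::string&>())),
                             std::string&&>);

// Implicit move: the local is expiring, so nobody can observe its
// moved-from state.
inline std::string make_greeting() {
    std::string s = "hello world";
    return s; // implicit move (or elided)
}

// Explicit move: the caller still holds a name for the moved-from object,
// so a value is re-established before anyone can use it.
inline std::string take_and_reset(std::string& s) {
    std::string result = static_cast<std::string&&>(s); // same as std::move(s)
    s = "reset"; // restore the fully formed state
    return result;
}

inline void example() {
    std::string s = "hello world";
    assert(take_and_reset(s) == "hello world");
    assert(s == "reset");
}

} // namespace move_sketch
```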

### Fixes to copy-assignment crash

- We need to be able to assign to our partially formed value
    - Two possible options
        - Change assignment
        - Change move

In [8]:
namespace lib3 {

class my_type {
    struct implementation;             // forward declaration
    implementation* _remote = nullptr; // remote part
public:
    // declare the basis operations - implementation is in a .cpp file
    my_type(int x, int y);
    ~my_type();
    my_type(const my_type&);
    my_type& operator=(const my_type&);

    my_type(my_type&& a) noexcept : _remote{a._remote} { a._remote = nullptr; }
    my_type& operator=(my_type&& a) noexcept;
};

} // namespace lib3

In [9]:
namespace lib3 {

struct my_type::implementation {
    int _x;
    int _y;
    //...
};

my_type::my_type(int x, int y) : _remote{new implementation{x, y}} {}
my_type::~my_type() { delete _remote; }
my_type::my_type(const my_type& a) : _remote{new implementation{*a._remote}} {}
my_type& my_type::operator=(const my_type& a) {
    if (_remote) *_remote = *a._remote;
    else _remote = new implementation{*a._remote}; // <---
    return *this;
}
my_type& my_type::operator=(my_type&& a) noexcept {
    delete _remote;
    _remote = a._remote;
    a._remote = nullptr;
    return *this;
}

} // namespace lib3

In [10]:
{
    using namespace lib3;
    
    my_type a{10, 20};
    my_type b{12, 30};
    
    b = move(a);
    a = b;
}

In [11]:
namespace lib4 {

class my_type {
    struct implementation;             // forward declaration
    implementation* _remote = nullptr; // remote part
public:
    // declare the basis operations - implementation is in a .cpp file
    my_type(int x, int y);
    ~my_type();
    my_type(const my_type&);
    my_type& operator=(const my_type&);

    my_type(my_type&& a) noexcept : _remote{a._remote} { a._remote = nullptr; }
    my_type& operator=(my_type&& a) noexcept;
};

} // namespace lib4

In [12]:
namespace lib4 {

struct my_type::implementation {
    int _x;
    int _y;
    //...
};

my_type::my_type(int x, int y) : _remote{new implementation{x, y}} {}
my_type::~my_type() { delete _remote; }
my_type::my_type(const my_type& a) : _remote{new implementation{*a._remote}} {}
my_type& my_type::operator=(const my_type& a) {
    *_remote = *a._remote;
    return *this;
}
my_type& my_type::operator=(my_type&& a) noexcept {
    swap(_remote, a._remote); // <----
    return *this;
}

} // namespace lib4

In [13]:
{
    using namespace lib4;
    
    my_type a{10, 20};
    my_type b{12, 30};
    
    b = move(a);
    a = b;
}

```cpp
{
    using namespace lib4;
    
    my_type a{10, 20};
    my_type b = move(a);
    a = b;
}
```
```
input_line_17:11:6: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
    *_remote = *a._remote;
     ^~~~~~~
Interpreter Exception: 
```

### Idiomatic Approach

In [14]:
namespace lib5 {

class my_type {
    struct implementation;             // forward declaration
    implementation* _remote = nullptr; // remote part
public:
    // declare the basis operations - implementation is in a .cpp file
    my_type(int x, int y);
    ~my_type();
    my_type(const my_type&);
    my_type& operator=(const my_type&);

    my_type(my_type&& a) noexcept : _remote{a._remote} { a._remote = nullptr; }
    my_type& operator=(my_type&& a) noexcept;
};

} // namespace lib5

In [15]:
namespace lib5 {

struct my_type::implementation {
    int _x;
    int _y;
    //...
};

my_type::my_type(int x, int y) : _remote{new implementation{x, y}} {}
my_type::~my_type() { delete _remote; }
my_type::my_type(const my_type& a) : _remote{new implementation{*a._remote}} {}
my_type& my_type::operator=(const my_type& a) {
    return *this = my_type(a); // <--- copy and move
}
my_type& my_type::operator=(my_type&& a) noexcept {
    delete _remote;
    _remote = a._remote;
    a._remote = nullptr;
    return *this;
}

} // namespace lib5

In [16]:
{
    using namespace lib5;
    
    my_type a{10, 20};
    my_type b{12, 30};
    
    b = move(a);
    a = b;
}

{
    using namespace lib5;
    
    my_type a{10, 20};
    my_type b = move(a);
    a = b;
}

- The idiomatic solution can work with `unique_ptr`

In [17]:
namespace lib6 {

class my_type {
    struct implementation;
    struct deleter {
        void operator()(implementation*) const noexcept; // <---
    };
    unique_ptr<implementation, deleter> _remote;
public:
    // declare the basis operations - implementation is in a .cpp file
    my_type(int x, int y); // <---
    ~my_type() = default;
    my_type(const my_type&); // <---
    my_type& operator=(const my_type& a) { return *this = my_type(a); }

    my_type(my_type&& a) noexcept = default;
    my_type& operator=(my_type&& a) noexcept = default;
};

} // namespace lib6


In [18]:
namespace lib6 {

struct my_type::implementation {
    int _x;
    int _y;
    //...
};

my_type::my_type(int x, int y) : _remote{new implementation{x, y}} {}
my_type::my_type(const my_type& a) : _remote{new implementation{*a._remote}} {}
void my_type::deleter::operator()(implementation* p) const noexcept { delete p; }

} // namespace lib6

In [19]:
{
    using namespace lib6;
    
    my_type a{10, 20};
    my_type b{12, 30};

    b = move(a);
    a = b;
}

{
    using namespace lib6;
    
    my_type a{10, 20};
    my_type b = move(a);
    a = b;
}


### Tradeoffs

- **Copy Assignment: In situ assignment (if available) or copy construct**
- **Move Assignment: Swap**

- Performance: Faster for in situ case (saves heap allocations)
- Object Lifetime: Not precise
- Exception Safety: Basic Guarantee (not transactional)
- Implementation: Complex

- **Copy Assignment: Copy construct and move assign**
- **Move Assignment: Consume**

- Performance: Slower
- Object Lifetime: Precise
- Exception Safety: Strong Guarantee (transactional)
- Implementation: Simple

- Recommendation
    - I prefer the idiomatic, simpler approach
        - unless I have evidence of a performance issue
        - or the type is heavily used
    - Write it correct and simple first
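
The strong guarantee of the copy construct and move assign approach can be illustrated with a sketch. The type `fragile`, its `fail_next_copy` test hook, and the namespace `guarantee_sketch` are hypothetical; the hook exists only to force a copy to throw.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>
#include <vector>

namespace guarantee_sketch {

// Hypothetical type whose copy can be made to fail on demand.
class fragile {
    std::vector<int> _data;

public:
    static inline bool fail_next_copy = false; // test hook, not part of a real API

    explicit fragile(std::vector<int> d) : _data(std::move(d)) {}
    fragile(const fragile& a) : _data(a._data) {
        if (fail_next_copy) {
            fail_next_copy = false;
            throw std::runtime_error("copy failed");
        }
    }
    fragile(fragile&&) noexcept = default;
    fragile& operator=(fragile&&) noexcept = default;

    // Copy construct, then move assign.
    fragile& operator=(const fragile& a) { return *this = fragile(a); }

    const std::vector<int>& data() const { return _data; }
};

// Returns true if a failed assignment leaves the target unchanged.
inline bool failed_assignment_keeps_value() {
    fragile target{std::vector<int>{1, 2, 3}};
    const fragile source{std::vector<int>{4, 5}};
    fragile::fail_next_copy = true;
    try {
        target = source; // the copy throws before target is touched
    } catch (const std::runtime_error&) {
        return target.data() == std::vector<int>{1, 2, 3};
    }
    return false;
}

inline void example() { assert(failed_assignment_keeps_value()); }

} // namespace guarantee_sketch
```

The temporary is fully constructed before `*this` is modified, and the move assignment is `noexcept`, so the assignment either completes or has no effect.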

## Default Construction

- What should the state be of a default constructed object?
    - Should it always be fully formed?
    
- A common use case of a default constructed object is to create the object before we have a value to give to it:

In [20]:
namespace {
bool predicate() { return true; }
std::pair<std::string, std::string> get_pair() { return {"Hello", "World"}; }
}

In [21]:
{
    string s;
    if (predicate()) s = "Hello";
    else s = "World";
}

In [22]:
{
    string s1;
    string s2;
    tie(s1, s2) = get_pair();
}

- The language has facilities that make it rarely necessary to construct an object before providing a value:

In [23]:
{
    string s = predicate() ? "Hello" : "World";
}

In [24]:
{
    auto [s1, s2] = get_pair();
}

- This makes having a default constructor optional
    - But not having one can be inconvenient 

- A default constructed value is often overwritten before use
    - As such it is inefficient to allocate memory, or acquire resources, in the default constructor

- A default constructor should:
    - Be `noexcept` (one way to do this is to initialize to point to a const (or constexpr) singleton)
    - Be `constexpr`
    - Execute in time no worse than proportional to the `sizeof()` of the object
    - If the object has a meaningful _zero_ or _empty_ state it should initialize to that state
        - Otherwise it may be partially-formed

In [25]:
namespace lib7 {

class my_type {
    struct implementation;
    struct deleter {
        void operator()(implementation*) const noexcept; // <---
    };
    unique_ptr<implementation, deleter> _remote;
public:
    // declare the basis operations - implementation is in a .cpp file
    constexpr my_type() noexcept = default; // partially formed
    my_type(int x, int y); // <---
    ~my_type() = default;
    my_type(const my_type&); // <---
    my_type& operator=(const my_type& a) { return *this = my_type(a); }

    my_type(my_type&& a) noexcept = default;
    my_type& operator=(my_type&& a) noexcept = default;
};

} // namespace lib7

- Recommendation
    - Provide a default-ctor
    - Avoid using it unless it has a meaningful zero or empty value
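
For a type that does have a meaningful empty state, the guidelines can be checked at compile time. This is a sketch under assumptions: the type `int_sequence` and the namespace `default_sketch` are hypothetical, and the `static_assert` verifies only the `noexcept` guideline.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

namespace default_sketch {

// Hypothetical sequence of ints that allocates only when it has elements.
class int_sequence {
    std::unique_ptr<int[]> _data; // null when empty
    std::size_t _size = 0;

public:
    // Empty state: no allocation, constant time, cannot throw.
    constexpr int_sequence() noexcept = default;

    explicit int_sequence(std::size_t n)
        : _data(n ? new int[n]() : nullptr), _size(n) {}

    std::size_t size() const noexcept { return _size; }
    bool empty() const noexcept { return _size == 0; }
};

// Fails to compile if the default constructor could throw.
static_assert(std::is_nothrow_default_constructible_v<int_sequence>);

inline void example() {
    int_sequence s; // fully formed: represents the empty sequence
    assert(s.empty());
    int_sequence t(3);
    assert(t.size() == 3);
}

} // namespace default_sketch
```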

**Exercise:** Look at the regular operations (copy, assignment, equality, default construction) for a type in the standard library, or a commonly used type within your project. Is the implementation correct? Complete? Efficient?

### What is _mutable_?

### Polymorphism will be covered later

## Efficiency

- An operation is _efficient_ if there is no way to implement it to use fewer resources
    - time
    - space
    - energy
    
- Unless otherwise specified, we will use efficiency to mean _time efficiency_
    - But in practice, where not all three can be achieved, the trade-offs should be considered