In [1]:
#include "../common.hpp"

# Preface

- What is _better code_?

- What is _good code_?

- Goals:
    - Not prescriptive
    - Not always achievable

- Not limited to C++
- Language is a constraint

## Types

**Goal: Write _complete_, _expressive_, and _efficient_ types**

> A _type_ is a pattern for storing and modifying objects.

- In C++, `class` is a mechanism for implementing types, but it can also be used for other purposes
- We use _type_ interchangeably to mean both the abstract notion and the C++ mechanism for implementing it

> An _object_ is a representation of an entity as a value in memory.

- An object is a _physical_ entity, and as such is imbued with a set of properties
    - size
    - address

- All objects have a set of common, _basis_ operations
    - constructible
    - destructible
    - copyable<sup>1</sup>
    - equality comparable<sup>1</sup>

- <sup>1</sup>Well defined, but may be problematic to implement

> The _computational basis_ for a type is a finite set of procedures that enable the construction of any other procedure on the type

- A type which does not implement a _computational basis_ is _incomplete_
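
Some of the basis operations listed above can be checked at compile time. The following is a minimal sketch, not part of the original material: `is_equality_comparable` and `has_basis_operations` are hypothetical helpers, and `no_equality` is a made-up type that lacks `operator==`.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical detection trait: true when `a == b` compiles for two const T objects.
template <class T, class = void>
struct is_equality_comparable : std::false_type {};

template <class T>
struct is_equality_comparable<
    T, decltype(void(std::declval<const T&>() == std::declval<const T&>()))>
    : std::true_type {};

// Hypothetical helper: checks the basis operations that can be tested generically.
// "Constructible" is left out because constructor arguments depend on the type.
template <class T>
constexpr bool has_basis_operations() {
    return std::is_destructible<T>::value && std::is_copy_constructible<T>::value &&
           std::is_copy_assignable<T>::value && is_equality_comparable<T>::value;
}

// A made-up type that stores a value but offers no equality.
struct no_equality {
    int value;
};

static_assert(has_basis_operations<int>(), "int provides the basis operations");
static_assert(has_basis_operations<std::string>(), "std::string provides them too");
static_assert(!has_basis_operations<no_equality>(), "no operator== is available");
```

A check like this only confirms that the operations exist. Whether they have the right semantics is something the compiler cannot see.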

## Regular

> There is a set of procedures whose inclusion in the computational basis of a type lets us place objects in data structures and use algorithms to _copy objects_ from one data structure to another. We call types having such a basis _regular_, since their use guarantees regularity of behavior and, therefore, interoperability.

- The copy operation creates a new object, equal to, and logically disjoint from the original

\begin{align}
b & \to a \implies a = b. && \text{(copies are equal)}
\end{align}

> Two objects are _equal_ iff they represent the same entity

- From this definition we can derive the following axioms for equality:

\begin{align}
(\forall a) a & = a. && \text{(Reflexivity)} \\
(\forall a, b) a & = b \implies b = a. && \text{(Symmetry)} \\
(\forall a, b, c) a & = b \wedge b = c \implies a = c. && \text{(Transitivity)}
\end{align}

- Copies are logically disjoint

For all $op$, which modifies its operand, and $b = c$:
\begin{align}
b & \to a, op(a) \implies a \neq b \wedge b = c.  && \text{(copies are disjoint)}
\end{align}
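
The two copy axioms can be turned into a runtime check for a single value. This is a sketch under stated assumptions: `copy_axioms_hold`, `shared_handle`, and `copy_axioms_example` are hypothetical names introduced for illustration, and `modify` stands in for the $op$ in the axiom.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Checks "copies are equal" and "copies are disjoint" for one value b.
// `modify` must change the value of its operand.
template <class T, class Op>
bool copy_axioms_hold(const T& b, Op modify) {
    T c = b;                      // a second object with b = c
    T a = b;                      // b -> a
    if (!(a == b)) return false;  // copies are equal
    modify(a);                    // op(a)
    return !(a == b) && b == c;   // copies are disjoint
}

// Hypothetical counterexample: copies share the same remote int,
// so modifying one copy is visible through the other.
struct shared_handle {
    std::shared_ptr<int> value;
    friend bool operator==(const shared_handle& a, const shared_handle& b) {
        return *a.value == *b.value;
    }
};

inline void copy_axioms_example() {
    // std::vector copies are equal and logically disjoint.
    assert(copy_axioms_hold(std::vector<int>{1, 2, 3},
                            [](std::vector<int>& v) { v.push_back(4); }));
    // The shared handle fails the disjointness axiom.
    assert(!copy_axioms_hold(shared_handle{std::make_shared<int>(1)},
                             [](shared_handle& h) { ++*h.value; }));
}
```

A passing check for one value is evidence, not proof. The axioms are stated for all values and all modifying operations.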

- An _algebraic structure_ is a set of connected axioms
    - as with copy and assignment
- Algebraic structures define the basic semantics of operations

### Implementing Copy, Assignment, and Equality

- Copy-constructor is used to implement the copy operation
    - **The compiler is free to assume the semantics of the copy constructor and may elide the copy**
- To copy an object, simply copy all the _members_ or _parts_
- If not defined, the compiler will provide a member-wise copy-constructor
- The copy-constructor can be declared `= default` to ensure it is present

In [2]:
class my_type {
    // members
public:
    my_type(const my_type&) = default;
};

In [3]:
.undo

- Similarly, the compiler will provide a member-wise copy-assignment operator

In [4]:
class my_type {
    // members
public:
    my_type(const my_type&) = default;
    my_type& operator=(const my_type&) = default;
};

In [5]:
.undo

- If the representation of an object is unique, then equality can be implemented as member-wise equality
- Unfortunately, before C++20 the compiler does not provide member-wise equality (C++20 allows `operator==` to be declared `= default`)
- Use `std::tie()` as a simple mechanism to implement equality

- Do not declare `operator==()` as a member operator
- A `friend` declaration may be used to implement directly in the class definition.
    - `inline` is implied.

In [6]:
class my_type {
    int _a = 0;
    int _b = 42;
    
    auto underlying() const { return std::tie(_a, _b); }
public:
    my_type(const my_type&) = default;
    my_type& operator=(const my_type&) = default;
    
    friend bool operator==(const my_type& a, const my_type& b) {
        return a.underlying() == b.underlying();
    }
    friend bool operator!=(const my_type& a, const my_type& b) {
        return !(a == b);
    }
};

In [7]:
.undo
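
The cell above declares a copy-constructor, which suppresses the implicit default constructor, so no `my_type` object can actually be created from it. A self-contained variant that can be exercised might look like the following sketch. The names `point` and `point_example` and the two added constructors are assumptions for illustration, and the copy operations are left to the compiler.

```cpp
#include <cassert>
#include <tuple>

class point {
    int _x = 0;
    int _y = 0;

    // A tuple of references to every part; comparing tuples compares member-wise.
    auto underlying() const { return std::tie(_x, _y); }

public:
    point() = default;
    point(int x, int y) : _x{x}, _y{y} {}
    // Copy-constructor and copy-assignment are generated member-wise.

    friend bool operator==(const point& a, const point& b) {
        return a.underlying() == b.underlying();
    }
    friend bool operator!=(const point& a, const point& b) { return !(a == b); }
};

inline void point_example() {
    point a{1, 2};
    point b = a;      // copy
    assert(a == b);   // copies are equal
    b = point{3, 4};  // assignment establishes a new value
    assert(a != b);   // a is unaffected
}
```

Adding a member to `point` only requires adding it to `underlying()`, which keeps equality in step with the representation.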

#### Semantics and Complexity

- We associate semantics with operation names to ascribe meaning to software
    - Operations with the same semantics should have the same name
- The complexity of an operation is another important part of the operation semantics
    - By associating complexity with names we make code easier to reason about
- The _expected_ complexity of copy, assignment, and equality<sup>2</sup> is proportional to the area of the object
    - If these operations cannot be implemented with the expected complexity, they should be given different names


- <sup>2</sup> worst case, if equal. For most unequal objects the expected complexity is a small constant.

- Naming is language
    - Often semantics are expected from patterns of common use
    - When naming functions consider expectations and that few will read any specification

### Equationally Complete

- A type for which equality can be implemented as a non-friend (non-member) function is said to be _equationally complete_
- A type which is both equationally and computationally complete can be copied without the use of the copy-constructor or assignment operator
    - Equationally complete implies all the parts are readable
    - Computationally complete implies all the values are obtainable
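
As an illustration of both properties, here is a sketch with a hypothetical type `interval`. Equality is a plain non-member function that uses only the public interface, and the hypothetical `duplicate()` builds an equal object from the readable parts instead of calling the copy-constructor on its argument.

```cpp
#include <cassert>

class interval {
    int _first = 0;
    int _last = 0;

public:
    interval() = default;
    interval(int first, int last) : _first{first}, _last{last} {}

    // All parts are readable: this is what makes the type equationally complete.
    int first() const { return _first; }
    int last() const { return _last; }
};

// Non-friend, non-member equality written against the public interface only.
inline bool operator==(const interval& a, const interval& b) {
    return a.first() == b.first() && a.last() == b.last();
}
inline bool operator!=(const interval& a, const interval& b) { return !(a == b); }

// Every value is obtainable through the (int, int) constructor, so an equal
// object can be constructed from the parts that were read.
inline interval duplicate(const interval& a) {
    return interval{a.first(), a.last()};
}

inline void interval_example() {
    interval a{1, 5};
    interval b = duplicate(a);
    assert(a == b);
}
```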

**SKIP** Next cell is skipped for workshop

**Exercise:** Find a type in your project which is not equationally complete and make it so.

## Relationships

- Relationships are unavoidable with objects in a space
    - The address of an object is the relationship between the object and the space within which it resides
    
- For any relationship there is a predicate form
    - Dick and Jane are married (relationship)
    - Are Dick and Jane married? (predicate)

- We normally think of objects as representing _things_ or _nouns_
    - An object may also represent a _relationship_
    - The `next` pointer in a linked list represents the relationship between one element and its successor

- An object which represents a relationship is a _witness_ to the relationship
- When copying a witness, or an object in the witnessed relationship, there are three possible outcomes
    - The relationship is maintained
    - The relationship is severed
    - The witness is invalidated 
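
A small sketch of the first two outcomes, using the linked-list example from above. The names `node`, `precedes`, and `witness_example` are hypothetical. The third outcome, invalidation, is not exercised because observing it would mean reading a dangling pointer.

```cpp
#include <cassert>

// Hypothetical singly linked node; `next` is the witness to the
// relationship between this element and its successor.
struct node {
    int value;
    node* next;
};

// Predicate form of the relationship: is b the successor of a?
inline bool precedes(const node& a, const node& b) { return a.next == &b; }

inline void witness_example() {
    node tail{3, nullptr};
    node middle{2, &tail};
    node head{1, &middle};

    node copy = middle;  // copies the value and the witness

    // Maintained: the copied witness still relates the copy to tail.
    assert(precedes(copy, tail));

    // Severed: head's witness still refers to the original, not the copy.
    assert(precedes(head, middle));
    assert(!precedes(head, copy));
}
```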

### Whole-Part Relationship

- A common and useful relationship is the _whole-part_ relationship
- An object is a whole, composed of its parts
- A part is _local_ if it is stored directly in the object

In [8]:
class my_type {
    std::string _str; // local part
    int _val; // local part
    //...
};

In [9]:
.undo

- A part is _remote_ if it is stored elsewhere (such as on the heap)
    - Variable size data (polymorphic or dynamic arrays)
    - Trade-off in performance of copy vs. _move_
    - Sharing of immutable data
    - Separation of interface from implementation dependencies (PImpl)

- Remote parts are expensive
    - You can copy roughly 10K of data in the time it takes to make a small heap allocation (< 1K)
    - And 5K of data in the time it takes to make a large heap allocation
    - Each access is a potential cache miss
    - Most objects are never or rarely copied
        - We'll see why soon

- Prefer local parts when appropriate
    - There are _many_ unnecessary heap allocations in Photoshop (and most products) 
- But also be aware that techniques like PImpl can greatly improve build time and reduce header file pollution
    - In C++20, modules may make this less necessary

- Here is a common implementation of PImpl
    - We'll look at this more later

In [10]:
// my_type.hpp

namespace library {

class my_type {
    struct implementation;             // forward declaration
    implementation* _remote = nullptr; // remote part
public:
    // declare the basis operations - implementation is in a .cpp file
    my_type(int x, int y);
    ~my_type();
    my_type(const my_type&);
    my_type& operator=(const my_type&);
};

} // namespace library

In [11]:
// my_type.cpp

// #include "my_type.hpp" // first include

// other includes

namespace library {

struct my_type::implementation {
    int _x;
    int _y;
    //...
};

my_type::my_type(int x, int y) : _remote{new implementation{x, y}} {}
my_type::~my_type() { delete _remote; }
my_type::my_type(const my_type& a) : _remote{new implementation{*a._remote}} {}
my_type& my_type::operator=(const my_type& a) {
    *_remote = *a._remote;
    return *this;
}
    
} // namespace library

In [12]:
.undo 2

- A major downside of using the PImpl pattern is the amount of forwarding boilerplate that must be written.
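
To make the boilerplate concrete, here is a sketch of a hypothetical PImpl class `rectangle`, shown in one block although the two halves would normally live in a header and a `.cpp` file. Every operation appears three times: declared in the class, implemented in `implementation`, and forwarded.

```cpp
#include <cassert>

// --- header part (hypothetical rectangle.hpp) ---
class rectangle {
    struct implementation;              // forward declaration
    implementation* _remote = nullptr;  // remote part
public:
    rectangle(int width, int height);
    ~rectangle();
    rectangle(const rectangle&);
    rectangle& operator=(const rectangle&);

    // Each public operation must be declared here...
    int width() const;
    int area() const;
    void scale(int factor);
};

// --- source part (hypothetical rectangle.cpp) ---
struct rectangle::implementation {
    int _width;
    int _height;
    // ...implemented here...
    int area() const { return _width * _height; }
    void scale(int factor) { _width *= factor; _height *= factor; }
};

rectangle::rectangle(int width, int height) : _remote{new implementation{width, height}} {}
rectangle::~rectangle() { delete _remote; }
rectangle::rectangle(const rectangle& a) : _remote{new implementation{*a._remote}} {}
rectangle& rectangle::operator=(const rectangle& a) {
    *_remote = *a._remote;
    return *this;
}

// ...and forwarded here.
int rectangle::width() const { return _remote->_width; }
int rectangle::area() const { return _remote->area(); }
void rectangle::scale(int factor) { _remote->scale(factor); }

inline void rectangle_example() {
    rectangle r{2, 3};
    rectangle s = r;  // copy allocates a new remote part
    s.scale(2);
    assert(s.area() == 24);
    assert(r.area() == 6);  // copies are disjoint
}
```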

### Move
- The _move_ operation transfers the value of one object to a new or existing object
    
\begin{align}
a = b, a & \rightharpoonup c \implies c = b. && \text{(move is value preserving)}
\end{align}

- This says nothing about the moved-from value
    - In this way, move is a _weaker_ form of copy
- The expectation is that moving a value does not require additional resources, beyond the local storage, for an object
    - In this way, move is a _stronger_ form of copy
- Move is a distinct operation as part of an _efficient_ basis

> A basis is _efficient_ if and only if any procedure implemented using it is as efficient as an equivalent procedure written in terms of an alternative basis.

- In C++ we implement the move operation in terms of rvalue references.
    - An rvalue reference binds to a temporary value, or to an object explicitly cast with `std::move()`
    - Any witnesses to remote parts can be maintained without copying the remote part
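
The move axiom and the point about remote parts can be observed with standard library types. This is a sketch: `move_preserves_value` and `move_reuses_remote_part` are hypothetical helpers. The second relies on `std::vector`'s move constructor handing over its heap buffer, which is how mainstream standard library implementations behave.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Checks: a = b, a moved to c  implies  c = b  (move is value preserving).
template <class T>
bool move_preserves_value(const T& b) {
    T a = b;             // a = b
    T c = std::move(a);  // move a to c; nothing is assumed about a afterwards
    return c == b;
}

// Returns true when moving a vector transfers the heap buffer (the remote
// part) to the new object instead of allocating and copying.
inline bool move_reuses_remote_part() {
    std::vector<int> a(1000, 42);
    const int* before = a.data();       // witness to the remote part
    std::vector<int> c = std::move(a);  // no element is copied
    return c.data() == before;
}

inline void move_example() {
    assert(move_preserves_value(std::vector<int>{1, 2, 3}));
    assert(move_preserves_value(std::string("a string long enough to live on the heap")));
    assert(move_reuses_remote_part());
}
```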

In [13]:
namespace library {

class my_type {
    struct implementation;             // forward declaration
    implementation* _remote = nullptr; // remote part
public:
    // declare the basis operations - implementation is in a .cpp file
    my_type(int x, int y);
    ~my_type();
    my_type(const my_type&);
    my_type& operator=(const my_type&);

    my_type(my_type&& a) noexcept : _remote{a._remote} { a._remote = nullptr; }
    my_type& operator=(my_type&& a) noexcept;
};

} // namespace library

In [14]:
namespace library {

struct my_type::implementation {
    int _x;
    int _y;
    //...
};

my_type::my_type(int x, int y) : _remote{new implementation{x, y}} {}
my_type::~my_type() { delete _remote; }
my_type::my_type(const my_type& a) : _remote{new implementation{*a._remote}} {}
my_type& my_type::operator=(const my_type& a) {
    *_remote = *a._remote;
    return *this;
}
my_type& my_type::operator=(my_type&& a) noexcept {
    delete _remote;
    _remote = a._remote;
    a._remote = nullptr;
    return *this;
}

} // namespace library

- The C++ standard requires that a moved-from object be left in a _"valid but unspecified"_ state
    - This is a contradiction
    - Because the value is _unspecified_ the object no longer has _meaning_ and not all operations are valid
- Some operations _must_ be valid on the otherwise unspecified state
    - destruction
    - copy and move assigning to the object (to establish a new value)
    - self move assignment (for self-swap)
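
Standard library types show the first two requirements in practice. In this sketch the hypothetical function `moved_from_can_be_reassigned` never reads the unspecified value. It only assigns to the moved-from object and lets it be destroyed.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

inline bool moved_from_can_be_reassigned() {
    std::vector<std::string> a{"alpha", "beta"};
    std::vector<std::string> b = std::move(a);  // a is now valid but unspecified

    const std::vector<std::string> fresh{"gamma"};
    a = fresh;  // copy assignment establishes a new value
    if (a != fresh) return false;

    std::vector<std::string> c = std::move(a);  // a is unspecified again
    a = std::vector<std::string>{"delta"};      // move assignment establishes a new value

    return a.size() == 1 && a[0] == "delta" && b.size() == 2 && c == fresh;
}  // a, b, and c are destroyed here, whatever state they are in

inline void moved_from_example() {
    assert(moved_from_can_be_reassigned());
}
```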

**Exercise:** `my_type` contains a bug. Find the bug. Fix it using at least two different approaches. What are the trade-offs?