## 7 - Smart Pointers
##### **Author: Adam Gatt**

### Problem: How do we know when to delete a dynamic object?

We can create objects in dynamic memory (i.e. on _the heap_) using the keyword `new` and destroy them with `delete`. Objects created in this way do not follow the rules of scope like automatic (stack) objects do. They will not have their destructor called and memory freed when their originating scope block ends. Instead we must be sure to manually call `delete` to free the object when it is no longer needed.
* If we fail to `delete` the object its memory will remain allocated but unused, resulting in a memory leak.
* If we use the object after it has been deleted we cause a [use after free](https://encyclopedia.kaspersky.com/glossary/use-after-free/) error, which is a potential security vulnerability.
* If our control logic [`delete`s the object twice](https://isocpp.org/wiki/faq/freestore-mgmt#double-delete-disaster) then we enter _undefined_ behaviour and a runtime crash or worse may occur.

We conclude that each `new` must be matched with exactly one corresponding `delete`. But in practice this rule may be surprisingly difficult to enforce due to complexities in reasoning about _object lifetimes_.

* The object's creation and end-of-life points might be very far away from each other in the code, requiring careful tracking of the object to see where it goes.
* The object might flow through different control logic, resulting in multiple different areas where its end-of-life is expected and `delete` must be called. The logic must be checked to ensure that only one such call can be reached for each object.
* We must always know whether a pointer refers to a single object or to an array, because arrays must be freed with `delete[]`. Using the wrong form of delete results in undefined behaviour.
* A dynamic object might be created (and returned) by a function, showing no `new` syntax in the immediate area around where the object is first used.
* The object reference may be provided to a function call, which may "consume" the object (and thus delete it) or might not, depending on convention and how the function is expected to be used.
* The object may be provided for storage in a data structure which may itself be deleted. Some data structures may delete their held references upon destruction (e.g. a list) and some may not (e.g. a cache).
* There may be multiple references to the same dynamic object, potentially held in different threads, requiring us to be careful that only one reference is deleted and at a safe time. This often requires us to choose one reference as being responsible for object deletion.
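Even a small function shows how easily these rules can be broken. The sketch below is illustrative only: `Sample` and `processSample` are hypothetical names, and the `alive` counter is a stand-in for a leak detector. Each exit path needs its own `delete`, and removing either one would leave `Sample::alive` above zero.

```cpp
#include <iostream>

// Hypothetical type that counts how many instances are currently alive
struct Sample {
    static int alive;
    int value;
    explicit Sample(int v) : value(v) { ++alive; }
    ~Sample() { --alive; }
};
int Sample::alive = 0;

// Every exit path has to remember its own delete
bool processSample(int input) {
    Sample* sample = new Sample(input);

    if (sample->value < 0) {
        delete sample;      // forget this line and the early return leaks
        return false;
    }

    std::cout << "Processing " << sample->value << std::endl;
    delete sample;          // the normal path needs a matching delete too
    return true;
}
```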

### Object ownership

Many libraries and idioms attempt to straighten out this complexity by modelling the concept of _object ownership_. A dynamic object only ever has a single _owning reference_, with other temporary non-owning references allowed. The ownership of the object can be transferred between references, or across function boundaries, or between threads, but there is only one owner at any point in time. When the object is no longer required, the owning reference is responsible for deleting it. Non-owning references have a way to check whether the object has been deleted before attempting to make use of it.

### Smart pointers: A RAII approach to object ownership

Smart pointers are an application of the RAII technique towards memory management. We use RAII objects called _smart pointers_ to represent and track the ownership of the dynamic object. The smart pointers are then responsible for automatically deleting the object at the right time, according to the strategy used. C++ offers three types of smart pointers:

* `std::unique_ptr`

`unique_ptr` is a single unambiguous owner of the dynamic object. By using _move semantics_ (discussed more in the next notebook) we can allow for the object to be transferred to a new owner at important boundaries (e.g. function boundaries). We follow scope-based rules to ensure that the object is eventually deleted when its current owner falls out of scope.

* `std::shared_ptr`

`shared_ptr` is for situations where multiple owners are required and unavoidable. In this model, you can have multiple `shared_ptr`s to the same dynamic object, and it will be deleted when all of its `shared_ptr`s have been destroyed (i.e. at the moment when the last surviving `shared_ptr` is destroyed).

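* `std::weak_ptr`

`weak_ptr` is a weak, non-owning reference to an object that is managed by `shared_ptr`s. It does not keep the object alive, and it must be converted to a `shared_ptr` before the object can be used.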
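As a rough illustration of the shared ownership model, the sketch below records `use_count()` at three points in the life of a `shared_ptr`. `Counts` and `sharedOwnershipDemo` are names invented for this example.

```cpp
#include <memory>

// Reference counts observed at three points in a shared_ptr's life
struct Counts {
    long afterCreate;
    long afterCopy;
    long afterCopyDestroyed;
};

Counts sharedOwnershipDemo() {
    Counts counts{};
    std::shared_ptr<int> first = std::make_shared<int>(42);
    counts.afterCreate = first.use_count();          // one owner
    {
        std::shared_ptr<int> second = first;         // copying adds an owner
        counts.afterCopy = first.use_count();        // two owners
    }                                                // second is destroyed here
    counts.afterCopyDestroyed = first.use_count();   // back to one owner
    return counts;
}   // first is destroyed here: the count reaches zero and the int is deleted
```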


#### Note: Please ignore the existence of std::auto_ptr
`auto_ptr` is an earlier approach to smart pointers that didn't work out. It could not be stored safely in containers, it did not support arrays, and it predates move semantics, so it imitated a move with a "copy" that silently emptied the source. `unique_ptr` is its replacement: `auto_ptr` was deprecated in C++11 and removed from the standard in C++17.

### unique_ptr

`unique_ptr` is the C++ standard library class for a smart pointer that only has a single owner. This owner will be responsible for deleting the dynamic memory once it falls out of scope. 

In [None]:
#include <array>
#include <iostream>

class HeavyObject {
    private:
    std::array<double, 500> samplePoints;
    
    public:
    HeavyObject() {
        std::cout << "Object created using 4kb" << std::endl;
    }
    
    ~HeavyObject() {
        std::cout << "Memory has been freed up" << std::endl;
    }
};

In [None]:
#include <memory>

std::cout << "Before the scope" << std::endl;
{
    // We should use the "make_unique" function to create unique_ptrs, for reasons we will go
    // into later
    std::unique_ptr<HeavyObject> myObj = std::make_unique<HeavyObject>();
    
    // Some other stuff happens before the end of the scope, we never need to remember that a dynamic
    // object is floating around and needs to be deleted.
}
std::cout << "After the scope" << std::endl;

In the example above, _myObj_ is a unique_ptr holding a HeavyObject. Although the unique_ptr is a stack variable, the HeavyObject in its contents has been dynamically allocated and lives on the heap. When the scope ends, the unique_ptr will be destroyed and it will automatically `delete` (or `delete[]`) its contents, cleaning up the dynamic data. Thus we have nothing to remember later for when or how the dynamic data should be deleted. Assuming we aren't "moving" the contents elsewhere (see below), we simply have to ___do nothing___ to ensure the cleanup of this dynamic memory.

#### "Moving" the dynamic contents
Because a unique_ptr is the single, unambiguous owner of its data:

* You can never create a copy of it
* Instead you can "move" its contents to another unique_ptr

"Moving" a unique_ptr follows the C++11 rules for "move semantics". This is the concept of transferring (moving) the data from one object to another. When you move from one unique_ptr to another, the new unique_ptr becomes the owning smart pointer and the old unique_ptr is left empty. Most moved-from standard library objects are only promised to be in a "valid but unspecified state", but a moved-from unique_ptr is guaranteed to hold `nullptr`.

In [None]:
#include <memory>

std::unique_ptr<int> first = std::make_unique<int>(5);

In [None]:
// Copying a unique pointer is not allowed: this line intentionally fails to compile
std::unique_ptr<int> second = first;

In [None]:
#include <iostream>
#include <utility>

// But we can "move" the contents of a unique_ptr to another
std::unique_ptr<int> second = std::move(first);

// I can "de-reference" the unique_ptr with the expected * syntax
std::cout << *second << std::endl;

_Note: You must not use the contents of an "empty" unique_ptr unless you take steps to re-populate its contents (either initialise new data for it or move another unique_ptr into it)._
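The note above can be checked directly. This is a minimal sketch (the function name `moveLeavesSourceEmpty` is made up for illustration) showing that the source compares equal to `nullptr` after the move, and that it is usable again once it has been given new data.

```cpp
#include <memory>
#include <utility>

bool moveLeavesSourceEmpty() {
    std::unique_ptr<int> first = std::make_unique<int>(5);
    std::unique_ptr<int> second = std::move(first);

    // After the move, first holds nothing and must not be dereferenced
    bool sourceIsEmpty = (first == nullptr);

    // Re-populate first before using it again
    first = std::make_unique<int>(7);

    return sourceIsEmpty && *second == 5 && *first == 7;
}
```

A unique_ptr also converts to `false` when it is empty, so `if (first) { ... }` is a common guard before dereferencing.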

### Moving across function boundaries
#### Returning from a function
These "move semantics" allow us to transfer unique_ptrs in and out of different scopes. Most importantly, it allows us to transfer unique_ptrs across function call boundaries, submitting them to functions or returning them from functions.

In [None]:
std::unique_ptr<int[]> range(int size) {
    std::unique_ptr<int[]> output = std::make_unique<int[]>(size);
    
    for (int i = 0; i < size; ++i) {
        output[i] = i;
    }
    
    // No std::move is needed here. When a local variable is returned by name the compiler
    // treats it as movable (or elides the move entirely), so the contents are transferred to
    // the unique_ptr in the calling scope. Writing std::move(output) also works, but it can
    // prevent that elision.
    return output;
}

In [None]:
std::unique_ptr<int[]> countToTen = range(10);

// unique_ptrs with arrays offer access to contents with usual [] syntax
countToTen[4]

It is important to be clear about this process. _countToTen_ lives in the outer, calling scope, but its contents have been provided by the return value of the _range_ function. The dynamic array is transferred to the calling scope instead of being deleted at the end of _range_. _countToTen_ now owns the dynamic data, which will be deleted when _countToTen_ falls out of scope, unless we transfer it again with a subsequent move operation.

Also, if the unique_ptr is only needed at the very end of the function, solely to be returned (as in a factory function), then we can call make_unique directly in the return statement.

In [None]:
#include <memory>

// Character is assumed to be a class defined elsewhere with a (name, race, level) constructor
std::unique_ptr<Character> characterFactory(int level) {
    return std::make_unique<Character>("Thrag", "Orc", level);
}

#### Passing ownership into a function
What happens if we submit a unique_ptr to a function call? At the function boundary, the contents of the smart pointer will be "moved" into the unique_ptr of the function's parameters, allowing these contents to be transferred into the function.

In [None]:
class Message {
private:
    const char* sender;
    const char* contents;

public:
    Message(const char* sender, const char* contents)
      : sender(sender), contents(contents) { };
    
    const char* getContents() const {
        return contents;
    }
};

In [None]:
#include <iostream>
#include <memory>

void readAndDelete(std::unique_ptr<Message> message) {
    
    // A unique_ptr to a single object gives access to the object's members with the usual * and -> syntax
    std::cout << message->getContents() << std::endl;
}

In [None]:
std::unique_ptr<Message> greeting = std::make_unique<Message>("Adam", "Hello, World");

readAndDelete(std::move(greeting));

// NOTE: greeting is now empty. Dereferencing it with *greeting would be undefined behaviour,
// so I must not use it again unless I provide it with new data to own.
greeting == nullptr

___But note___ that this means the original, outer-scope unique_ptr is now empty. When you submit a unique_ptr to a function call you are "consuming" that unique_ptr. The called function has taken ownership of the data, and the data will be deleted when the function scope ends.
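The timing of that deletion can be made visible with a type that logs its own destruction. This is only a sketch: `Tracked`, `consume` and `consumeDemo` are hypothetical names, and the log is an ordinary vector of strings.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type that records its own destruction in an event log
struct Tracked {
    std::vector<std::string>& log;
    explicit Tracked(std::vector<std::string>& log) : log(log) {}
    ~Tracked() { log.push_back("destroyed"); }
};

// Takes the unique_ptr by value, so this function becomes the owner
void consume(std::unique_ptr<Tracked> item) {
    item->log.push_back("inside consume");
}   // item is destroyed as the call completes, deleting the Tracked object

std::vector<std::string> consumeDemo() {
    std::vector<std::string> log;
    std::unique_ptr<Tracked> item = std::make_unique<Tracked>(log);
    consume(std::move(item));
    log.push_back("after consume");
    return log;
}
```

The returned log reads "inside consume", "destroyed", "after consume": the object is gone before the caller's next statement runs.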

#### Passing to a function without giving up ownership

_What if we didn't want to do this?_ What if we want the called function to have temporary access to the unique_ptr data without taking ownership of it? Perhaps we want to continue using the unique_ptr after the function has been called.

We have two options to choose from depending on what we intend to do. Here are the recommendations from the [C++ Core Guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#f7-for-general-use-take-t-or-t-arguments-rather-than-smart-pointers), edited by Bjarne Stroustrup and Herb Sutter:

__1)__ Most of the time we want to ask for a reference to the unique_ptr's underlying type, and call that function by de-referencing the unique_ptr. This implies we are only interested in the data itself, with no intention of changing or even considering its ownership or lifetime.

_This approach can also be used with a pointer instead of a reference. In that scenario we call the unique_ptr's [get() method](https://en.cppreference.com/w/cpp/memory/unique_ptr/get) to get to its underlying pointer. We could also ask for the underlying type by-value if we want to create a copy of the underlying data._ 

In [None]:
#include <iostream>
#include <memory>

std::unique_ptr<Message> greeting = std::make_unique<Message>("Adam", "Hello, World");

In [None]:
void readWithoutTakingOwnership(const Message& message) {
    // I have a reference to the Message owned by the unique_ptr in the outer, calling scope,
    // without taking ownership of it
    std::cout << "In function: " << message.getContents() << std::endl;
}

// Dereference the unique_ptr to get its underlying data
readWithoutTakingOwnership(*greeting);

// The unique_ptr "greeting" remains the owner of the data and still has its contents intact
std::cout << "After function: " << greeting->getContents() << std::endl;

__2)__ Alternatively we can ask for a reference to the actual unique_ptr itself. This implies we are interested in having access to this ownership information, and in fact we can do all the usual operations we might expect with the unique_ptr including moving data out of it or assigning it to new data. This is the recommended convention if the function is expected to change the ownership of the provided unique_ptr, such as "re-seating" the pointer on to other data.

In [None]:
#include <iostream>

void modifyTheSmartPointer(std::unique_ptr<Message>& message) {
    // reset() assigns new data to the unique_ptr and safely deletes the previous data
    message.reset(new Message("Warren", "Goodbye cruel world"));
    std::cout << "In function: " << message->getContents() << std::endl;
}

std::unique_ptr<Message> greeting = std::make_unique<Message>("Adam", "Hello, World");
modifyTheSmartPointer(greeting);

// The unique_ptr has been "re-seated" by the function and contains new dynamic data
std::cout << "After function: " << greeting->getContents() << std::endl;

As mentioned under option 1, the same approach works with a raw pointer: we obtain the pointer held by our unique_ptr with `get()` and provide that to the function. The function may use the pointer but must never `delete` it, since the unique_ptr still owns the data.

In [None]:
void readWithoutTakingOwnership(const Message* message) {
    // I have a non-owning raw pointer, just as in traditional C++, and I must not delete it
    std::cout << "In function: " << message->getContents() << std::endl;
}

std::unique_ptr<Message> greeting = std::make_unique<Message>("Adam", "Hello, World");
readWithoutTakingOwnership(greeting.get());

// The unique_ptr "greeting" is still the owner of the data and still has its contents
std::cout << "After function: " << greeting->getContents() << std::endl;

#### unique_ptr function summary
If we are returning a unique_ptr from a function:
- The return type should be `std::unique_ptr<Data>`
- Either create the unique_ptr in the function and return it by name (the compiler moves it out automatically), or just call `make_unique` in the return statement

If we need to pass a unique_ptr to a function, the parameter should ask for:
 - `std::unique_ptr<Data>` if we intend the function to "consume" and take ownership of the data
 - `std::unique_ptr<Data>&` if we intend to make changes to the ownership of the unique_ptr object
 - `Data&` / `Data*` / `Data` if we want access to the underlying data (or a copy) without making any changes to ownership
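The three parameter forms in this summary can be put side by side. The function names below are invented for this sketch, and `std::string` stands in for `Data`.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Consumes: takes ownership, so the string is deleted when the call completes
std::size_t consumeText(std::unique_ptr<std::string> text) {
    return text->size();
}

// Changes ownership: re-seats the caller's unique_ptr on to new data
void replaceText(std::unique_ptr<std::string>& text) {
    text = std::make_unique<std::string>("replaced");
}

// Only uses the data: knows nothing about how it is owned
std::size_t measureText(const std::string& text) {
    return text.size();
}

bool parameterFormsDemo() {
    std::unique_ptr<std::string> text = std::make_unique<std::string>("hello");

    std::size_t before = measureText(*text);            // text still owns "hello"
    replaceText(text);                                   // text now owns "replaced"
    std::size_t after = consumeText(std::move(text));   // text is empty afterwards

    return before == 5 && after == 8 && text == nullptr;
}
```

Note that only the consuming call needs `std::move` at the call site, which makes the transfer of ownership visible to anyone reading the caller.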

### unique_ptrs as object data members
The introduction above presents unique_ptrs as existing solely on the stack within standard block scopes, but they can also be used as object data members for classes that need to own dynamic data. In the traditional approach, we create the dynamic data during object creation and then we must ensure that it is cleaned up in the destructor.

In [None]:
class CircularBuffer {
    private:
    char* data;
    
    public:
    CircularBuffer(size_t size) : data(new char[size]) { }
    
    ~CircularBuffer() { delete[] data; }
};

With a unique_ptr we still want to create the dynamic data in the constructor (likely still in the member initialisation list), but we do not need to remember any sort of cleanup. When the object is destroyed it will follow the usual process of calling ["the destructors for all non-static non-variant members of the class"](https://en.cppreference.com/w/cpp/language/destructor).

In [None]:
class CircularBuffer {
    private:
    std::unique_ptr<char[]> data;
    
    public:
    CircularBuffer(size_t size) : data(new char[size]) { }
};

In [None]:
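// The same class again, creating the array with make_unique instead of new.
// Note that make_unique<char[]>(size) also value-initialises the array (every char is zero),
// whereas new char[size] leaves the contents uninitialised.
class CircularBuffer {
    private:
    std::unique_ptr<char[]> data;
    
    public:
    CircularBuffer(size_t size) : data(std::make_unique<char[]>(size)) { }
};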
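A useful side effect of holding a unique_ptr member is that the enclosing class becomes move-only without any extra code, whereas copying the raw-pointer version would leave two objects that both `delete[]` the same array. The sketch below assumes a class of the same shape as CircularBuffer; the name `OwnedBuffer` and its `empty()` accessor are added purely for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>

// Same shape as the CircularBuffer above, with an accessor added for illustration
class OwnedBuffer {
    std::unique_ptr<char[]> data;

public:
    explicit OwnedBuffer(std::size_t size)
      : data(std::make_unique<char[]>(size)) { }

    bool empty() const { return data == nullptr; }
};

// The unique_ptr member disables copying and enables moving for the whole class
static_assert(!std::is_copy_constructible<OwnedBuffer>::value, "copying is disabled");
static_assert(std::is_move_constructible<OwnedBuffer>::value, "moving is allowed");

bool bufferMoveDemo() {
    OwnedBuffer first(16);
    OwnedBuffer second = std::move(first);   // ownership of the array moves to second
    return first.empty() && !second.empty();
}
```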

### Optional: Smart pointer creation, `new` vs `make_*`
In the examples above I have been using the `std::make_unique` function to create unique_ptrs. If I wanted to, there is no reason why I can't create a unique_ptr with [its standard constructor](https://en.cppreference.com/w/cpp/memory/unique_ptr/unique_ptr), passing it a pointer to the dynamic data that I create manually with the usual `new` keyword.

In [None]:
#include <memory>
#include <string>

std::unique_ptr<std::string> dynamicStringManager(new std::string("Hello, World!"));

*dynamicStringManager

However, the functions `make_shared` (C++11) and `make_unique` (C++14) were introduced in part because it was realised that there was a subtle bug that may result in a memory leak. In short, if a smart pointer was created as a temporary object (e.g. as an argument in a function call) and an exception occurred before the temporary object was constructed, there was a chance that the dynamic data would be successfully created but the owning smart pointer would never come into existence. With no smart pointer to clean it up, the dynamic data would leak.

In [None]:
func1(std::unique_ptr<A>(new A("Hello", 50)), func2())

Consider the (contrived) example above, with a function "func1" that requires a smart pointer and the results of a second function "func2". Before C++17, the standard left the order in which function arguments are evaluated unspecified, and even allowed the evaluation of different arguments to be interleaved. It was therefore possible for the dynamic object `new A()` to be created first, and then for `func2` to execute and throw an exception before the temporary unique_ptr was constructed. The temporary unique_ptr never gets created, and there is nothing responsible for deleting our dynamic object.

In the alternative syntax below, the allocation and the construction of the unique_ptr both happen inside the single `make_unique()` call, so the evaluation of another argument cannot slip in between them. If `make_unique` has already run when func2 throws an exception, the unique_ptr is firmly an existing _temporary object_ by then. When the exception occurs the stack will be unwound and the smart pointer's destructor will be called, cleaning up the data.

In [None]:
func1(std::make_unique<A>("Hello", 50), func2())

But then when C++17 was released, the standard was changed to include additional guarantees, including that each function argument is fully evaluated, side effects included, before the evaluation of another argument begins (the order among the arguments is still unspecified). This rule fixes the above example and removes the safety-based argument for using make_unique. However, if you cannot guarantee that your code will only be compiled as C++17 or later, it may be good to make a habit of using make_unique (and make_shared) to avoid being caught out.

In addition to the safety argument, there are two other arguments commonly made for using the make_* creation functions:
* You only need to write the underlying type name once, improving readability when the type name is especially long
* You can arguably avoid the need to write `new` completely, making the new/delete semantics effectively obsolete.



### shared_ptr
Reference counted
#### Creation (Manual/make_shared)
#### Create extra reference
#### Drop reference
#### weak_ptr
Non-owning reference - check for null and expect to lose at any time
    Breaks cycles of shared_ptrs
Example of object cache
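One possible shape for such an object cache is sketched below. It is an assumption-laden illustration rather than a finished design: `ObjectCache`, `get` and `isLoaded` are invented names, and the cached "objects" are just strings. The cache stores only `weak_ptr`s, so it never keeps an object alive on its own, and it uses `lock()` to find out whether an object still exists.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical cache that hands out shared ownership but holds none itself
class ObjectCache {
    std::map<std::string, std::weak_ptr<std::string>> entries;

public:
    std::shared_ptr<std::string> get(const std::string& name) {
        // lock() returns an owning shared_ptr, or an empty one if the object is gone
        if (std::shared_ptr<std::string> existing = entries[name].lock()) {
            return existing;
        }
        std::shared_ptr<std::string> created =
            std::make_shared<std::string>("data for " + name);
        entries[name] = created;   // the cache only stores a weak reference
        return created;
    }

    bool isLoaded(const std::string& name) const {
        auto found = entries.find(name);
        return found != entries.end() && !found->second.expired();
    }
};
```

Two calls to `get("grass")` return `shared_ptr`s to the same object while any caller still holds one. Once every caller has dropped its `shared_ptr` the object is deleted, and the next `get` creates a fresh one.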


### Rules for smart pointers
* Create smart pointers with their std::make_* functions
* Return smart pointers by value
* Accept unique_ptr as parameter if you intend to claim ownership
* Accept shared_ptr as parameter if you intend to add to ownership
* Accept a reference or raw pointer to the underlying data if you intend to use it but not claim ownership

### Use as data members in classes
Use in member initialisation list
Tie to object life-cycle
#### Standard pattern