## 6 - RAII (or: stack-based resource management)
##### **Author: Adam Gatt**

### The resource management pattern
A common pattern that repeatedly comes up in software development goes as follows:
1. You need to acquire a resource
2. You do something with the resource
3. You have to release the resource or perform some other clean up

This is the [resource management](https://en.wikipedia.org/wiki/Resource_management_(computing)) pattern, and it is a generalisation of a wide range of use cases, including the following:

| Acquire the resource | Activity | Release / cleanup |
| --- | --- | --- |
| Memory allocation (new) | **Using dynamic memory** | Freeing the memory (delete) |
| Opening a file | **Performing file operations** | Closing the file |
| Connecting to a database server | **Making database transactions** | Closing the connection |
| Acquiring a mutex/lock | **Performing thread-safe activity** | Releasing the mutex/lock |
| Altering a bitmap mask | **Performing drawing operations** | Restoring the previous mask |
| Establishing an SFTP connection | **Performing a file transfer** | Closing the connection |
| [Turning off system interrupts](https://energia.nu/reference/en/language/functions/interrupts/nointerrupts/) | **Performing an atomic operation** | [Re-enabling interrupts](https://energia.nu/reference/en/language/functions/interrupts/interrupts/) |

For an activity to fit this pattern, the third step must matter. The programmer must ensure that it happens, or else face consequences such as memory leaks, thread deadlocks, or an exhausted database connection pool.

We could try to solve the problem by simply making sure that every acquisition we write has a matching release operation. But as with any strategy that we enforce manually, complications get in the way:
 * A function may have unexpected early returns that require us to write multiple release statements (and check each one to determine if the resource was ever actually acquired, and check to make sure we don't release it twice).
 * An exception may be thrown that prevents the program from ever reaching our release statement (and without checked exceptions we might not realise that a statement might throw an exception).
 * The resource might only need to be acquired based on some condition, and so we must be careful to ensure that releases are executed exactly as many times as acquisitions.
 * If the function is long and messy enough we might simply forget the release statement and not notice its absence.
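
To make the first of these complications concrete, here is a minimal sketch. The names `acquireHandle()`, `releaseHandle()` and `openHandles` are hypothetical stand-ins for a real resource API (they are not from any library); the counter simply lets us observe whether a release was skipped.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical resource API: a counter standing in for something like
// open file handles or pooled database connections.
int openHandles = 0;
void acquireHandle() { ++openHandles; }
void releaseHandle() { assert(openHandles > 0); --openHandles; }

// Manual management: every exit path needs its own release statement.
// Returns true if every value was processed.
bool processManually(const int* values, std::size_t count) {
    acquireHandle();
    for (std::size_t i = 0; i < count; ++i) {
        if (values[i] < 0) {
            // Early return: the release at the bottom is never reached,
            // so the handle leaks unless we remember to duplicate it here.
            return false;
        }
    }
    releaseHandle();
    return true;
}
```

Calling `processManually()` with an array that contains a negative value leaves `openHandles` at 1 after the function returns, which is exactly the kind of leak that is easy to miss in a longer function.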

### What do other languages do?
We might notice that memory allocation isn't too much of an issue with interpreted or virtual machine languages which can employ [garbage collection](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)) to clean up old memory. Although this comes at a cost to overhead and reliability, it allows us to write entire programs in Java or Python without ever needing to manually delete an object or even consider when their deletion will occur.

But similar automated schemes are usually not available for these other resource management activities. How can a virtual machine know when to close down a database connection, for example? Even if we have some Database object that falls out of use, how do we know that deleting it will close the connection? And even if it did, how do we know if the garbage collector will be called soon enough or if the connection will remain unclosed for quite some time, using up a valuable limited resource?

And if the VM follows some other heuristic (e.g. after some period of inactivity) then how do we cope with the times when the VM gets it wrong? Do we need to check that the connection is fine every single time that we want to make a transaction, and re-establish the connection if it isn't?

Instead, these newer languages often offer language semantics to automatically handle the resource releasing for us. This often involves representing the activity's context as an object that understands how to perform its own cleanup, and then offering syntax in the language itself to trigger that cleanup operation when control leaves that context.
#### C Sharp
C# offers the [using syntax](https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/using-statement) to automatically manage objects with the [IDisposable](https://docs.microsoft.com/en-us/dotnet/api/system.idisposable?view=net-5.0) interface:

In [None]:
using (DbConnection conn = new DbConnection())
{
    // Perform database operations
}

#### Python
The Python [with statement](https://www.geeksforgeeks.org/with-statement-in-python/) works with any expression that yields a [context manager](https://book.pythontips.com/en/latest/context_managers.html#implementing-a-context-manager-as-a-class) object. It binds the value returned by the object's `__enter__()` method to a name for our own use, and calls its `__exit__()` method automatically when control flow leaves the body of the with statement:

In [None]:
for filename in config_files:
    with open(filename, 'r') as file:
        loaded_configs[filename] = list(file)

### The destructor, C++'s secret weapon
It has been said that C++'s greatest feature is [precisely determined object lifetimes](https://akrzemi1.wordpress.com/2013/07/18/cs-best-feature/) and more specifically, the destructor.
 * Garbage collected languages do not allow us this level of control, where we can reason precisely about the exact moment when an object will be deleted.
 * Even better, when our object is stack-based (not dynamic) then the program will automatically delete the object for us the moment it falls out of scope, with no explicit `delete` statement required.
 * Even betterer, the availability of the destructor allows us to write our own code that will be executed at those moments.

These advantages form the heart of C++'s answer to the resource management pattern, in the form of the hideously named [Resource Acquisition is Initialisation](https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization) (RAII) idiom. In short, the RAII approach is:
> We can manage resources by tying them to scope blocks, and we can do this by tying them to stack-based objects.

Firstly, we can trivially create objects that fulfil a "context manager" role, where they acquire a resource in the constructor and release it in the destructor. Secondly, if the object is on the stack then we can simply use scopes to manage the lifetime of that object, allowing for the "release" operation to be called automatically without the need for any special syntax like the languages above.

This provides us with a high-level abstraction for handling the resource management pattern. Instead of needing to carefully determine when and how to call low-level release functions, we simply create a context-managing "RAII object" and define the scope over which the resource should be held.

### The standard RAII object
The minimum template for a RAII object is remarkably simple:

In [None]:
class ResourceManager {
    public:
    ResourceManager() {
        // Acquire the resource
    }
    
    ~ResourceManager() {
        // Free the resource
    }
};

Perhaps we need data members to keep a reference to the resource (e.g. a file handle) or perhaps we don't (e.g. if we are simply calling global functions to turn interrupts on/off). We can also provide other methods to allow for other operations that we might need with the acquired resource. These will be application-dependent. But the bare minimum of a RAII object is:
* It is a class, that
* Obtains the resource in its constructor, and
* Releases the resource in its destructor

Initialisation and resource acquisition are inextricably tied together, and so "Resource Acquisition is Initialisation".
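
As a concrete instance of this template with a data member, here is a minimal sketch that wraps the C standard library's `fopen`/`fclose` pair. The class name `CFile` and the helper `writeGreeting()` are hypothetical names chosen for illustration. Copying is disabled here only because a copy would close the same handle twice.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper around a C stdio file handle.
class CFile {
public:
    // Acquire the resource in the constructor
    CFile(const std::string& path, const char* mode)
        : handle(std::fopen(path.c_str(), mode)) {
        if (handle == nullptr) {
            throw std::runtime_error("Could not open " + path);
        }
    }

    // Release the resource in the destructor
    ~CFile() { std::fclose(handle); }

    // A copy would call fclose() on the same handle a second time
    CFile(const CFile&) = delete;
    CFile& operator=(const CFile&) = delete;

    // An application-dependent operation on the acquired resource
    void writeLine(const std::string& text) {
        assert(handle != nullptr);  // invariant established by the constructor
        std::fputs(text.c_str(), handle);
        std::fputc('\n', handle);
    }

private:
    std::FILE* handle;
};

// The file is open only for the lifetime of the local "file" object.
void writeGreeting(const std::string& path) {
    CFile file(path, "w");
    file.writeLine("hello");
}   // ~CFile() runs here and calls fclose()
```

By the time `writeGreeting()` returns, the file has already been flushed and closed, without an explicit `fclose()` at the call site.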

To use the object we instantiate it on the stack within the scope that we need the resource. Here is an example that uses Qt's [QMutexLocker](https://doc.qt.io/qt-5/qmutexlocker.html), which is an RAII class for automatically acquiring and releasing a mutex:

In [None]:
class BackBuffer {
    private:
    QImage pixels;
    QMutex pixelMutex;
    
    ...
    
    public:
    /*
      The backbuffer image can be written to and read from by
      different threads and so we use a Mutex to synchronise
      access to our pixels object.
    */
    void drawToBackBuffer(int colIdx, int rowIdx, QRgb colour) {
        // QMutexLocker accepts a pointer to our mutex and calls lock() on it
        QMutexLocker locker{&pixelMutex};
        
        // Operations from this point onwards are thread-safe with respect to pixelMutex
        
        
        // At the end of the method "locker" will fall out of scope and its
        // destructor will be called, which will call unlock() on our mutex.
        // Nowhere did we need to call unlock() ourselves.
    }
};

Note how in this example our actual resource, the mutex, existed long-term as a data member in our BackBuffer class. But the RAII object `locker` is not stored anywhere or kept as a data member. Its purpose is to be created on the stack within the function itself, and to die when we want the mutex to be released.

### Robust cleanup for scope blocks

It is important to think about what we are doing here. It's not too important that our resource is being managed inside of an object. What's important is that if the object is stack-based then we are tying the resource to the object's scope, and doing so in a way that is automatically triggered and robust to unexpected control flows.

_It doesn't matter how we leave the scope_ - any stack object within the scope will be destroyed and have its destructor called when the scope ends. Even if an exception is thrown, all stack objects in the scope will still be destructed as the exception propagates to its handler.

It's important to note that this scope doesn't have to just be a function body. We can use braces `{ }` to declare smaller sub-scopes anywhere that we have code executing. We have full control over the scopes that will manage our resources.
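
As a small sketch of a sub-scope in action, consider the "turn off interrupts" case from the table at the start. The functions `noInterrupts()` and `interrupts()` below are desktop stand-ins that only flip a flag (an assumption made so the example can run anywhere), and `InterruptGuard`, `counter` and `incrementAtomically()` are hypothetical names.

```cpp
#include <cassert>

// Stand-ins for a platform's interrupt control functions. Here they
// only flip a flag so that the sketch can run on a desktop machine.
bool interruptsEnabled = true;
void noInterrupts() { interruptsEnabled = false; }
void interrupts()   { interruptsEnabled = true; }

// Hypothetical RAII guard with no data members at all.
class InterruptGuard {
public:
    InterruptGuard()  { noInterrupts(); }
    ~InterruptGuard() { interrupts(); }
    InterruptGuard(const InterruptGuard&) = delete;
    InterruptGuard& operator=(const InterruptGuard&) = delete;
};

int counter = 0;

int incrementAtomically() {
    int snapshot = 0;
    {
        InterruptGuard guard;        // interrupts off from here...
        assert(!interruptsEnabled);
        snapshot = ++counter;
    }                                // ...and back on here, mid-function
    assert(interruptsEnabled);

    // Any slower follow-up work can happen with interrupts enabled
    return snapshot;
}
```

The braces keep the guarded region as small as possible: interrupts are restored at the closing brace of the inner block rather than at the end of the function.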

As such, I like to think of the concept as _"stack-based resource management"_, or perhaps _"scope-based resource management"_.

In [1]:
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

class FileLogger {
    public:
    FileLogger(const std::string& filename) : myfile(filename) {
        std::cout << " [Debug] Log opened" << std::endl;
    }
    
    ~FileLogger() {
        myfile.close();
        std::cout << " [Debug] Log closed" << std::endl;
    }
    
    void log(const std::string& entry) {
        if (myfile.is_open()) {
            myfile << entry << "\n";
            std::cout << " [Debug] Logged " << entry;
        }
    }
    private:
    std::ofstream myfile;
};

In [2]:
void logMultiples() {
    std::vector<int> multiples{0, 3, 6, 9, 12, 15, 18, 21, 24, 27};
    std::cout << "Done preparing multiples" << std::endl;

    // Here is an arbitrary inner scope within the function for managing the life of our RAII object
    {
        FileLogger logger("log.txt");
        // log the calculated values

        for (int i = 0; i < 10; ++i) {
            std::stringstream ss;
            ss << "Multiple " << (i+1) << " of 3 is " << multiples[i] << std::endl;
            logger.log(ss.str());

            // Will this early return leave our file handle dangling?
            if (multiples[i] > 10) {
                return;
            }
        }
    }
    // If execution leaves this scope the RAII object is cleaned up
    
}

logMultiples();

Done preparing multiples
 [Debug] Log opened
 [Debug] Logged Multiple 1 of 3 is 0
 [Debug] Logged Multiple 2 of 3 is 3
 [Debug] Logged Multiple 3 of 3 is 6
 [Debug] Logged Multiple 4 of 3 is 9
 [Debug] Logged Multiple 5 of 3 is 12
 [Debug] Log closed


In [3]:
try {
    FileLogger logger("log.txt");

    logger.log("The last letter of the alphabet is Z\n");

    // Will this thrown exception leave our file handle dangling?
    throw std::exception();

    logger.log("The next-to-last letter of the alphabet is Y\n");
} catch (const std::exception& e) {
    std::cout << "Exception caught!" << std::endl;
}

 [Debug] Log opened
 [Debug] Logged The last letter of the alphabet is Z
 [Debug] Log closed
Exception caught!


### Performance considerations

In theory, the RAII idiom should come with [no extra runtime cost of time or memory](https://www.hackcraft.net/raii/#sect4_0) when compared to the previous approach of acquiring and releasing the resource manually. Compilers have long been able to inline simple object constructors and destructors, and generally these functions are kept very simple indeed for RAII classes.

* In the ideal scenario, the compiler can inline the RAII class entirely, producing code equivalent to the fully manual approach.
  * See this example RAII approach: https://godbolt.org/z/GcjEKc9bG
  * Compared to the original manual approach: https://godbolt.org/z/6vobK39WM

* However, care must be taken for the RAII class to be as efficient as the manual approach.
  * Simply accepting our log filename and messages as `std::string` results in a [substantial increase in size](https://godbolt.org/z/jonr5cdhb).
  * We should use references as widely as possible to prevent unnecessary copying.

The claim of zero-cost abstraction relies on optimisation. Debug binaries will often be larger and less efficient when using the RAII approach, as can be seen in the above example when run with `-O0` (no optimisations):
* RAII approach: https://godbolt.org/z/ETs98Yv1q
* Manual approach: https://godbolt.org/z/McPdze45e

Even in this scenario, though, the overhead of RAII is likely to be so slim as to be unnoticeable during typical program execution. The standard performance considerations should be applied to determine whether the code can afford this overhead. If the code under consideration is a performance-critical loop that churns through resource management calls and is often run under debug conditions, then the manual approach may be preferred. In most typical scenarios, RAII offers a higher-level abstraction for robust resource management that allows for fewer mistakes, and does so at negligible cost.

### Correctness considerations

If our RAII class exists to manage another object (such as a mutex handler managing a mutex), then we must make sure to accept the mutex by reference (or by pointer) to ensure we are managing the actual intended object and not a copy of the object.

Similarly, we should strongly consider whether we want to [delete](https://en.cppreference.com/w/cpp/language/function#Deleted_functions) the [implicitly-declared copy constructor](https://en.cppreference.com/w/cpp/language/copy_constructor) and copy assignment operator of our RAII class to prevent shallow copies from being made of it. _What does it mean to have two simultaneous managers of the same database connection?_
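
The two points above can be sketched together. `FakeMutex` is a hypothetical stand-in for a real mutex (it just records whether it is locked, so we can observe the effect), and `ScopedLock` and `lockedInsideScope()` are likewise names invented for this example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a real mutex, so that lock state is observable.
struct FakeMutex {
    bool locked = false;
    void lock()   { assert(!locked); locked = true; }
    void unlock() { assert(locked);  locked = false; }
};

class ScopedLock {
public:
    // Taken by reference: we lock the caller's mutex, not a private copy
    explicit ScopedLock(FakeMutex& m) : mutex(m) { mutex.lock(); }
    ~ScopedLock() { mutex.unlock(); }

    // A copied ScopedLock would call unlock() a second time on the same
    // mutex, so copying is disabled outright.
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    FakeMutex& mutex;
};

// The compiler now rejects accidental copies at build time
static_assert(!std::is_copy_constructible<ScopedLock>::value,
              "ScopedLock must not be copyable");
static_assert(!std::is_copy_assignable<ScopedLock>::value,
              "ScopedLock must not be copy-assignable");

// Reports the mutex state while the lock object is alive.
bool lockedInsideScope(FakeMutex& m) {
    ScopedLock lock(m);
    return m.locked;
}
```

With the copy operations deleted, a line such as `ScopedLock other = lock;` fails to compile instead of silently creating a second manager for the same mutex.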

If it is necessary to pass a RAII instance into or out of a function, or into a data structure / another thread / another object, then we should consider implementing the [move constructor](https://en.cppreference.com/w/cpp/language/move_constructor) and [move assignment operator](https://en.cppreference.com/w/cpp/language/move_assignment) to allow our sole RAII object to be "moved" around while maintaining its management of the resource.
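
A minimal sketch of such a movable RAII class follows. The resource API (`acquireResource()`, `releaseResource()`, `liveResources`) and the class `ResourceOwner` are hypothetical; the convention that an id of 0 means "owns nothing" is an assumption of this example. It relies on `std::exchange`, which requires C++14.

```cpp
#include <cassert>
#include <utility>

// Hypothetical resource API: ids start at 1, and 0 means "no resource".
int liveResources = 0;
int nextId = 1;
int acquireResource() { ++liveResources; return nextId++; }
void releaseResource(int id) { assert(id != 0); --liveResources; }

class ResourceOwner {
public:
    ResourceOwner() : id(acquireResource()) {}
    ~ResourceOwner() { reset(); }

    // Not copyable: there must only ever be one owner
    ResourceOwner(const ResourceOwner&) = delete;
    ResourceOwner& operator=(const ResourceOwner&) = delete;

    // Move constructor: take over the resource and leave the source empty
    ResourceOwner(ResourceOwner&& other) noexcept
        : id(std::exchange(other.id, 0)) {}

    // Move assignment: release what we hold, then take over the source's
    ResourceOwner& operator=(ResourceOwner&& other) noexcept {
        if (this != &other) {
            reset();
            id = std::exchange(other.id, 0);
        }
        return *this;
    }

    bool owns() const { return id != 0; }

private:
    // Release the resource if we still own one
    void reset() {
        if (id != 0) {
            releaseResource(id);
            id = 0;
        }
    }

    int id;
};

// Returning by value moves (or elides) the owner out of the function
ResourceOwner makeOwner() {
    ResourceOwner owner;
    return owner;
}
```

After `ResourceOwner b = std::move(a);` the resource is released exactly once, when `b` is destroyed; the moved-from `a` owns nothing and its destructor does no work.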