# 4 - Improvements, With Consideration
##### **Author: Adam Gatt**

## Const references (pre-11)
Despite having covered the useful `nullptr` in an earlier notebook, the concept of the _null reference_ has been described by its creator as the [Billion Dollar Mistake](https://www.youtube.com/watch?v=YYkOWzrO3xg). Null-dereferencing errors are widespread and hidden, often only discovered through unexpected run-time errors.

To generalise our language, we can consider C++ to make two types of variable references available:

* C++ pointers (`*`) are "nullable references" and can be used to hold nullptr (or NULL/0)
* C++ references (`&`) are "non-nullable references" and cannot

A validly bound reference is never null, and because a reference must be initialised where it is declared, it can never be used before initialisation. You do not need a `nullptr` check before using one. Behind the scenes references compile as if they were pointers, so they carry no efficiency cost.

As such, if we are truly serious about preventing the Billion Dollar Mistake then:
> References should be our default choice, with pointers only for when nullability is a requirement or unavoidable.
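
To make the contrast concrete, here is a minimal sketch of the same operation written against both kinds of reference. The function names `lengthViaPointer` and `lengthViaReference` are made up for illustration, and treating a missing string as having length zero is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Nullable: the function has to decide what a missing string means
std::size_t lengthViaPointer(std::string* text) {
    if (text == nullptr) {
        return 0; // Assumption for this sketch: treat "no string" as empty
    }
    return text->size();
}

// Non-nullable: there is no "missing" case left to handle
std::size_t lengthViaReference(std::string& text) {
    return text.size();
}
```

The pointer version needs a policy for the null case in every function that accepts it, while the reference version pushes that decision back to whoever owns the pointer.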

In [1]:
#include <string>
void acceptStringRef(std::string& stringIn) {
    // We can do anything we want with stringIn inside this function with
    // complete confidence we will never cause a null-dereference error
}

Of course we can always call this function with a pointer that we de-reference first.

In [2]:
std::string* myString = nullptr;
acceptStringRef(*myString)

acceptStringRef(*myString)
                 ^~~~~~~~

Interpreter Exception: 

But still the function itself remains completely safe, with the error occurring outside of it, in the `*myString` operation, closer to the site of the original pointer. The widespread use of references in function signatures helps our reasoning during debugging, allowing us to eliminate "safe" parts of the codebase and preventing run-time exceptions from appearing deep within the call stack, far from the site of the actual logic error.
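
One way to keep the failure at the boundary is to check the pointer once, where it is obtained, and hand only references to everything downstream. The sketch below assumes a hypothetical lookup `findNickname` that may return `nullptr` and a hypothetical `shout` function that takes a reference; neither comes from the notebook.

```cpp
#include <map>
#include <string>

// Hypothetical lookup: returns nullptr when the name is unknown
std::string* findNickname(std::map<std::string, std::string>& nicknames,
                          const std::string& name) {
    auto found = nicknames.find(name);
    return found == nicknames.end() ? nullptr : &found->second;
}

// Reference parameter: nothing in here can be a null-dereference
void shout(std::string& text) {
    text += "!";
}

// The single nullptr check lives at the boundary, next to the pointer's origin
bool shoutNickname(std::map<std::string, std::string>& nicknames,
                   const std::string& name) {
    std::string* nickname = findNickname(nicknames, name);
    if (nickname == nullptr) {
        return false; // Reported here, not deep inside shout()
    }
    shout(*nickname);
    return true;
}
```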

We can take this safety further with an important concept from Functional Programming, which is the idea of _immutability_. In FP, variables are "immutable" by default and this extends to input parameters in function definitions. If a function is guaranteed to never change a referenced input variable, then we can eliminate that function when debugging errors occurring with that variable outside the function. The variable had the same state both before and after the function was called, and the problem lies elsewhere.

Of course in C++ "immutable" _const_ variables are not the default and there are plenty of functions that are required to make changes to input parameters (e.g. std::fill). But the more functions that declare input references (and pointers) to be const, the easier it is to reason about the codebase when tracking down a bug.

And so, for further enforcing of correctness:
> const should be our default choice for references (and pointers), unless mutability is a requirement or unavoidable

And so we should look for all opportunities possible to convert our `type* name`s into `const type& name`s.

In [None]:
// What does "processing" involve? Might this change our cart without our
// expecting it to? Will it break if cart is a nullptr?
void processOrder(Cart* cart);

// This is much safer in comparison
void processOrder(const Cart& cart);
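
The compiler enforces this promise. The sketch below assumes a made-up `Cart` with a single `items` member (the notebook does not define one). The hypothetical `itemCount` takes a `const Cart&` and so can only read, while the hypothetical `clearCart` has to ask for a mutable reference and advertises that in its signature.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical Cart, defined only for this illustration
struct Cart {
    std::vector<std::string> items;
};

// Read-only access: the caller's cart is guaranteed to be unchanged afterwards
std::size_t itemCount(const Cart& cart) {
    // cart.items.clear(); // Would not compile: cart is const here
    return cart.items.size();
}

// Mutation is still possible, but it has to be visible in the signature
void clearCart(Cart& cart) {
    cart.items.clear();
}
```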

## Uniform Initialisation Syntax

The simple initialisation of a variable might be one of the most complicated things in C++. Firstly, there is a bewildering number of ways that a variable can actually be initialised, [highlighted by this joke](http://mikelui.io/img/c++_init_forest.gif). I don't even know what the majority of these mechanisms are.

More than this, the different methods of initialisation come with different rules and levels of strictness. [This is a very useful chart](https://josuttis.com/cpp/c++initialization.pdf) to help make the differences easy to compare. On that chart, the notable entries appear to be those shown with red braces. The `Type i{x};` row (and the related `Type i = {x};`) seem to have the advantage for pretty much all of the listed criteria. This makes it a strong choice if those criteria seem useful or important to you, such as enforcing that narrowing operations should result in errors.

In [3]:
double totalFunds = 10.0;

int x(totalFunds / 4); // Some compilers warn, but all will still allow
int y{totalFunds / 4}; // Narrowing is error

input_line_10:4:7: error: type 'double' cannot be narrowed to 'int' in initializer list [-Wc++11-narrowing]
int y{totalFunds / 4}; // Narrowing is error
      ^~~~~~~~~~~~~~
input_line_10:4:7: note: insert an explicit cast to silence this issue
int y{totalFunds / 4}; // Narrowing is error
      ^~~~~~~~~~~~~~
      static_cast<int>( )

Interpreter Error: 

The curly-braced approach is the [Uniform Initialisation Syntax](https://www.learncpp.com/cpp-tutorial/b-4-initializer-lists-and-uniform-initialization/) introduced with C++11 to make initialisation safer and more predictable. It comes with a new syntax, using braces, to avoid stepping on the toes of existing initialisation which must be kept unchanged for compatibility reasons.
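
When a narrowing conversion really is intended, the cast suggested by the compiler note in the error output makes that intent explicit while keeping the braces. A minimal sketch, with `quarterShare` being a name invented for this example:

```cpp
// Truncates towards zero on purpose; the cast documents that the loss is intended
int quarterShare(double totalFunds) {
    int share{static_cast<int>(totalFunds / 4)};
    return share;
}
```

The braces still guard against accidental narrowing elsewhere, and the `static_cast` marks the one place where it was a deliberate choice.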

The contents of the braces are known as an _initialiser list_, a powerful new tool for easily initialising complex structures. Firstly, we can use it when instantiating objects and structs, just as we would with parentheses:

In [4]:
struct Employee {
    std::string name;
    int age;
    double performanceRating;
};

// Initialiser list populates members in order of their declaration
Employee newHire{"Adam", 34, 4.0};
newHire.name

"Adam"

More powerfully, we can initialise vectors and arrays all at once. We can now represent the "list literal".

In [5]:
#include <vector>
#include <string>

std::vector<std::string> colours{"Red", "Yellow", "Orange", "White"};

// May I never initialise a vector like this again, so help me God
std::vector<std::string> birds;
birds.push_back("Magpie");
birds.push_back("Ostrich");
birds.push_back("Goose");
birds.push_back("Albatross");
birds.push_back("Hummingbird");

In [6]:
#include <array>
#include <set>
#include <utility>

// New-style array
std::array<int, 6> lotteryNumbers{4, 8, 15, 16, 23, 42};

// C-style array, dynamic-memory
int* okayTheyreTheLostNumbers = new int[6]{4, 8, 15, 16, 23, 42};

// Sets, pairs
std::set<char> typedKeys{'g', 'q', 'v', 'o', 'q', 'm', 't'};
std::pair<std::string, float> population{"South Australia", 1.7e6};
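
One detail worth noticing in the cell above: the braces hand the values to the container's own constructor, so the container's usual rules still apply. `typedKeys` lists `'q'` twice, but a `std::set` stores each value once and keeps its elements ordered. A small sketch to confirm this (the helper name `distinctKeys` is invented here):

```cpp
#include <set>
#include <string>

// Builds the same set as above and returns its contents as a string
std::string distinctKeys() {
    std::set<char> typedKeys{'g', 'q', 'v', 'o', 'q', 'm', 't'};
    // Parentheses select the iterator-range constructor of std::string
    return std::string(typedKeys.begin(), typedKeys.end()); // "gmoqtv"
}
```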

And with nested braces we can initialise complex data types:

In [7]:
#include <unordered_map>
#include <iostream>

std::unordered_map<std::string, std::string> capitals{
    {"Malta", "Valletta"},
    {"New Zealand", "Wellington"},
    {"United States", "Washington, D.C."},
    {"Australia", "Canberra"},
    {"New Caledonia", "Nouméa"}
};

std::cout << "The capital of Malta is " << capitals["Malta"] << std::endl;

The capital of Malta is Valletta


And even nested structures:

In [8]:
#include <iostream>
#include <string>
#include <list>
#include <algorithm>

struct Runtime {
    std::string name;
    std::list<std::string> languages;
    int developmentYear;
};

// Second item is itself a full initialiser list, used for initialising the second declared member
Runtime jvm{"Java Virtual Machine", {"Java", "Kotlin", "Scala", "Clojure"}, 1994};

std::cout << jvm.name << " supports at least " << jvm.languages.size() << " languages";

Java Virtual Machine supports at least 4 languages

### But a gotcha, initialiser lists make uniform initialisation not fully uniform
Initialiser lists are supported by special initialiser-list constructors, which are provided for most container types. According to the standard, when braces are used these constructors are tried before any other constructor, and the others are only considered if no initialiser-list constructor is viable. This can be a problem when trying to call regular constructors whose argument types can be converted to the container's element type, for example:
* `vector(size_type count)` for `std::vector<int>`
* `basic_string(size_type count, CharT ch, const Allocator& alloc = Allocator())` for `std::string`

In [9]:
// Prepare a vector of 100 ints according to the std::vector(int size) constructor
std::vector<int> raffleDraws{100};

// Oh, actually it treated this as a single-element initialiser list
std::cout << raffleDraws.size() << ": " << raffleDraws[0];

1: 100

In [10]:
// But this container of strings will actually create a 100-length vector, now that
// it is unable to call the initialiser-list constructor with a 1-length list of ints

std::vector<std::string> attendees{100};
std::cout << attendees.size() << ": " << attendees[0];

100: 

In [11]:
// I want to use the constructor to prefill a string of 100 '_' characters
std::string underline{100, '_'};

underline

"d_"

In [12]:
// But returning to regular bracket initialisation works
std::string underline(100, '_');

underline

"____________________________________________________________________________________________________"
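
The four cases above can be collected into one place. In the `std::string` case the braces are read as a list of two characters, and the integer 100 is the character code for `'d'` in ASCII, which explains the `"d_"` result. The sketch below wraps each form in a small helper (all names invented for illustration) so that the results can be compared directly.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Braces: the initialiser-list constructor wins, giving one element with value 100
std::size_t bracedIntVectorSize() {
    std::vector<int> draws{100};
    return draws.size();
}

// Parentheses: the size constructor is called, giving 100 zero-valued elements
std::size_t parenIntVectorSize() {
    std::vector<int> draws(100);
    return draws.size();
}

// Braces: read as the two characters {char(100), '_'}
std::string bracedUnderline() {
    return std::string{100, '_'};
}

// Parentheses: the (count, character) constructor is called
std::string parenUnderline() {
    return std::string(100, '_');
}
```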

### So should I use or avoid brace initialisation?

Scott Meyers in Effective Modern C++:

> Most developers end up choosing one kind of delimiter as a default, using the other only when they have to. Braces-by-default folks are attracted by their unrivaled breadth of applicability, their prohibition of narrowing conversions, and their immunity to C++’s most vexing parse. Such folks understand that in some cases (e.g., creation of a std::vector with a given size and initial element value), parentheses are required. On the other hand, the go-parentheses-go crowd embraces parentheses as their default argument delimiter. They’re attracted to its consistency with the C++98 syntactic tradition, its avoidance of the auto-deduced-a-std::initializer_list problem, and the knowledge that their object creation calls won’t be inadvertently waylaid by std::initializer_list constructors. They concede that sometimes only braces will do (e.g., when creating a container with particular values). There’s no consensus that either approach is better than the other, so my advice is to pick one and apply it consistently.
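
Two of the terms in that quote may be unfamiliar. The "most vexing parse" is the rule that `Widget w();` declares a function rather than creating an object, and the "auto-deduced `std::initializer_list` problem" is that `auto x = {1, 2, 3};` deduces a `std::initializer_list<int>` rather than a container. A short sketch of both, using a made-up `Widget` type and invented helper names:

```cpp
#include <initializer_list>
#include <type_traits>

// Hypothetical type for illustration
struct Widget {
    int value = 7;
};

int defaultWidgetValue() {
    // Widget w();   // Most vexing parse: declares a function named w
    Widget w{};      // Braces cannot be read as a declaration: this is an object
    return w.value;
}

bool autoDeducesInitializerList() {
    auto values = {1, 2, 3}; // Not a vector or an array
    return std::is_same<decltype(values), std::initializer_list<int>>::value;
}
```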

## auto _(in some situations)_
To quickly introduce the concept for those who are not familiar, `auto` is a keyword that tells the compiler to deduce a type automatically rather than having the programmer specify it. Note that every `auto` is still resolved at compile time, so `auto` is not a route to true dynamic typing like `var` in JavaScript.

In [13]:
#include <vector>

auto x = 5;
auto y = "Hello, World";
auto z = std::vector{1.1, 1.2, 1.3, 1.4}; // C++17: element type deduced as double
z.size()

4
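
Because deduction happens at compile time, the deduced types can themselves be checked at compile time. The sketch below uses `static_assert` with `std::is_same` to confirm what `auto` picked for the first two declarations above; if a deduction were different, the code would fail to compile. The function name `deducedTypesAreFixed` is invented for this example.

```cpp
#include <type_traits>

bool deducedTypesAreFixed() {
    auto x = 5;              // int
    auto y = "Hello, World"; // const char*, not std::string

    static_assert(std::is_same<decltype(x), int>::value,
                  "x is deduced as int");
    static_assert(std::is_same<decltype(y), const char*>::value,
                  "y is deduced as a pointer to const char");

    // x = "five"; // Would not compile: x stays an int for its whole lifetime
    (void)x; // Silence unused-variable warnings in this sketch
    (void)y;
    return true;
}
```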

`auto` might be a controversial listing for those who disagree with its hiding of variable types. The argument could be made that programmers should be conscious and aware at all times of what data they are actually manipulating, and that `auto` serves to obfuscate that. I agree with the sentiment in general, but below I will list particular situations where `auto` is both valuable and safe to use. I would encourage the use of `auto` in these situations.

### iterator types _(and other long types)_
Declaring the types of all variables involves having to play along with the sometimes-wordy results of the type system. What is the type of the iterator returned by calling `.cbegin()` on a vector of strings? Is it `iterator` or `c_iterator`? No, it is `std::vector<std::string>::const_iterator` and if you are manipulating the iterator inside of the loop you are potentially typing that out a number of times. Then, if you change `cbegin()` to `begin()` or `vector` to `list` you potentially need to make that change in several places.

I consider it legitimate to use `auto` with iterators (and other lengthy, one-time objects) as
 1. The type of the iterator is easy to determine as it is based on a container
 2. The iterator variable itself has a lifetime limited to the loop
 3. It is useful for the `auto` to reflect changes to the container
 4. The shorter code is simply cleaner and more readable

In [14]:
#include <string>
#include <vector>

std::vector<std::string> filenames;

// Compare with != rather than <, so the loop keeps working for containers
// such as std::list whose iterators do not support <
for (std::vector<std::string>::const_iterator it = filenames.cbegin(); it != filenames.cend(); ++it) {
    // Read the file "fopen(*it)" and do something with it
}


for (auto it = filenames.cbegin(); it != filenames.cend(); ++it) {
    // Read the file "fopen(*it)" and do something with it
}
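
As an illustration of point 3, the sketch below runs an `auto` iterator loop over a `std::list` instead of a `std::vector`. Only the container's declared type differs; `auto` follows the new iterator type without any further edits. The function name `totalLength` is invented for this example.

```cpp
#include <cstddef>
#include <list>
#include <string>

std::size_t totalLength(const std::list<std::string>& filenames) {
    std::size_t total = 0;
    // std::list iterators do not support '<', so the loop compares with '!='
    for (auto it = filenames.cbegin(); it != filenames.cend(); ++it) {
        total += it->size();
    }
    return total;
}
```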

In [15]:
#include <string>
#include <map>

enum class Team {RED, BLUE, NO_TEAM};

std::map<std::string, Team> players;

// The key must be const: std::map's element type is std::pair<const Key, Value>
for (const std::pair<const std::string, Team>& it : players) {
    // Do something with the player/team pair
}


for (const auto& it : players) {
    // Do something with the player/team pair    
}
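
There is also a correctness argument for `auto` here. The element type of a `std::map<K, V>` is `std::pair<const K, V>`, with a `const` key. Spelling the type by hand without that `const` still compiles, but the reference then binds to a temporary, converted copy of the element. The sketch below (helper names and the `Scores` alias are invented) detects this by comparing addresses, binding to the first element in the same way a range-based `for` binds its loop variable.

```cpp
#include <map>
#include <string>
#include <utility>

using Scores = std::map<std::string, int>;

// Hand-written type with the const on the key forgotten.
// Assumes a non-empty map.
bool handWrittenRefersToElement(const Scores& scores) {
    const std::pair<std::string, int>& p = *scores.begin(); // Binds to a temporary copy
    return &p.second == &scores.begin()->second;
}

// auto deduces std::pair<const std::string, int>, the real element type.
// Assumes a non-empty map.
bool autoRefersToElement(const Scores& scores) {
    const auto& p = *scores.begin(); // Binds to the element itself
    return &p.second == &scores.begin()->second;
}
```

With the hand-written type every iteration would copy the key string, a cost that is easy to miss in review.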

### simple function arguments (predicates, comparators, maps, etc)
Higher-order functions are functions that accept other functions as parameters. Examples include:
 * `std::sort` accepting a comparator function to tell it how to order elements
 * `std::copy_if` accepting a "predicate" function to tell it which elements to copy
 * `std::accumulate` requiring a function to indicate how each element is combined into the final result
 
Often these function arguments are supplied as short lambdas of a single, very simple statement. Considering that the lambda only really exists for that single function call and the input type is determined by the higher-order function (or a container it operates on), I consider it legitimate to use `auto` for the function argument signature rather than having to type out the full type of the container element.

_Note: The use of `auto` for lambda input parameters is only available for C++14 onwards._

Don't worry if you don't know about lambdas/closures and their uses yet. You can come back to this section after we do the notebook on lambdas if you like.

In [16]:
#include <vector>

struct Member {
    const char* name;
    bool active;
};

using MemberList = std::vector<Member>;

In [17]:
#include <algorithm>
#include <iostream>
#include <iterator>

MemberList allMembers{{"Adam", true}, {"Saxon", false}, {"Chenny", true}, {"Nenin", false}};

MemberList lapsedMembers;

// const auto& avoids copying each Member just to read one field
std::copy_if(
    allMembers.begin(), allMembers.end(), std::back_inserter(lapsedMembers),
    [](const auto& member){ return !member.active; }
);

std::for_each(lapsedMembers.begin(), lapsedMembers.end(),
    [](const auto& member){ std::cout << member.name << std::endl; });

Saxon
Nenin
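
The other two algorithms from the list above follow the same pattern. A hedged sketch, using a variant struct `NamedMember` with a `std::string` name so that names can be compared with `<` (an adjustment made for this example only), and with invented helper names `sortedByName` and `countActive`:

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Variant of Member for this sketch: std::string makes the names comparable
struct NamedMember {
    std::string name;
    bool active;
};

// std::sort with a comparator lambda: order members alphabetically by name
std::vector<NamedMember> sortedByName(std::vector<NamedMember> members) {
    std::sort(members.begin(), members.end(),
              [](const auto& left, const auto& right) { return left.name < right.name; });
    return members;
}

// std::accumulate with a combining lambda: count the active members
int countActive(const std::vector<NamedMember>& members) {
    return std::accumulate(members.begin(), members.end(), 0,
                           [](int total, const auto& member) {
                               return total + (member.active ? 1 : 0);
                           });
}
```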


### generic lambdas
The other cases involved using `auto` to basically save on typing, but this case involves adding functionality. With C++14 introducing auto for lambda parameters, it became possible to define a function that performs automatic type deduction on all input parameters and its return type. The result is a _generic lambda_ that effectively acts the same as a template with alternative syntax. _(The compiler treats them exactly the same, even down to instantiating the function once for each unique signature required by the calling code)._

I consider these to be essentially another syntax for writing generic functions for those who prefer it. For simple functions I don't see any advantage or extra information provided by the traditional `T`s.

_Note: Before C++20, full use of `auto` for generic behaviour applied only to lambdas. Traditional functions could use `auto` for return types (since C++14) but not for parameter types, and so had to use template syntax for full generic behaviour. C++20 lifted this restriction with abbreviated function templates._

In [18]:
auto add = [](auto a, auto b) {
    return a + b;
};

// Add these ints even though the add() function never specifies "int" anywhere
add(2, 5)

7

In [19]:
// Template function for comparison (both arguments must share one type T)
template <typename T>
T addTemplated(T a, T b) {
    return a + b;
}

In [20]:
// Can proceed to use add() on doubles and other primitives
add(3.2, 1.5)

4.7000000

In [21]:
// Actually I can use it on anything that provides "operator+()"
add(std::string("Hello "), std::string("World"))

"Hello World"
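
One difference between the two versions above is worth noting: each `auto` parameter is deduced independently, whereas `addTemplated` uses a single `T` for both arguments. Mixing an `int` and a `double` therefore works with the lambda but fails template deduction for `addTemplated`. The closest template equivalent needs one type parameter per argument, sketched below under the invented names `addAny` and `addTwoTypes`.

```cpp
// Generic lambda: a and b may have different types
auto addAny = [](auto a, auto b) {
    return a + b;
};

// Closest traditional equivalent: one template parameter per argument
template <typename A, typename B>
auto addTwoTypes(A a, B b) {
    return a + b;
}
```

Both accept `addAny(1, 2.5)` style calls, with the result type following the usual arithmetic conversions (here `double`).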