# Survival C/C++ - basic concepts to get things done

## Course material

The course material (including this notebook) can be found [here on Github](https://github.com/pelagos-consulting/survival_cplusplus). Clone the repository with this command.

```bash
git clone https://github.com/pelagos-consulting/survival_cplusplus.git
```

Alternatively, you can use **wget** to download the course material on the command line.

```bash
wget https://github.com/pelagos-consulting/survival_cplusplus/archive/refs/heads/main.zip
unzip -DD main.zip
cd survival_cplusplus-main/course_material/L0_Survival_C++
```

## Introduction

In order to understand software written to work with frameworks like HIP, CUDA, and OpenCL, we need to have a rudimentary grasp of the C++ language. While C++ is the language of choice for many real-world scientific applications, it can be more complex to understand than plain C, and that extra complexity may get in the way of the science. This course is designed to impart just enough "Survival C/C++" in order to get things done in scientific computing, and to better understand the code that is covered in subsequent courses. It also aims to deliver fundamental information such as how numbers are represented in binary. 

### Topics covered

* Making code understandable with comments
* Statements and code blocks
* Understanding how integers and floats are represented
* Variables and pointers
* Using functions
* Printing variables
* Math operations
* Using flow control such as **if** statements and **for** loops
* Using memory allocations as arrays
    * Static and dynamic memory allocation
    * Multi-dimensional array representations
    * Safe memory allocations with C++ container types
* Strings
* Reading and writing binary data
* Exercises
    * Simple application
    * 2D arrays and debugging with GDB
    * Matrix multiplication 

### A brief history of C++

C++ is the brainchild of Danish computer scientist [Bjarne Stroustrup](https://en.wikipedia.org/wiki/Bjarne_Stroustrup). It was first released in 1985 and was standardised in 1998. Since then, subsequent revisions have been ratified by an [ISO working group](https://en.wikipedia.org/wiki/ISO/IEC_JTC_1/SC_22) and released approximately once every three years for compiler developers to adopt.

* C++ 98
* C++ 03
* C++ 11
* C++ 14
* C++ 17 (default standard used for GNU g++ 11 and 12)
* C++ 20 (stable)
* C++ 23 (testing)

The default standard implemented by compilers is usually one of the more recent ones. For example GNU g++ compiler versions 11 and 12 use the 2017 standard by default.

### Reference material

<a id="reference"></a>

Here are a few resources that have been useful in learning C++:

* [The C++ Programming Language by Bjarne Stroustrup](https://amzn.asia/d/9DzqZea), this book is from the author of the C++ programming language.
* [cppreference.com](https://en.cppreference.com/w/), a definitive online reference for quick lookup of C and C++ syntax.
* [cplusplus.com](https://cplusplus.com/reference/), another good resource with lots of examples.

#### Example code 

Most of the code discussed in the sections below is also in the file [basic_examples.cpp](basic_examples.cpp). You can use this file as a quick reference for how things are done.

## Comments

Comments in C++ begin with two forward slashes `//`. Anything after the two forward slashes is ignored for the rest of the line.

```C++
// This is a comment
```

Another way to write a comment that covers large amounts of text is to start with a `/*` and end the comment with a `*/`. This multi-line comment is also valid in C.

```C++
/* This is a
multi-line comment
*/
```

## Statements and code blocks

A statement is a single instruction in the code, and is often written on one line. Since C++ doesn't use new lines to separate statements, every statement must have an ending. Both C and C++ use the semicolon `;` character to end statements.

```C++
// This is a statement that declares an integer with the value 2
int a = 2;
```

**Code blocks** are collections of statements enclosed by a starting brace `{`, and ending with an end brace `}`. Variables declared within code blocks do not exist outside their enclosing code block.

```C++
{ // Starting a code block
    int a = 2;
} // Ending a code block
```

It is good practice to indent the code that lies within a code block, as demonstrated above. The code for a function is also contained within a code block. In fact, all the code that runs in a C++ program is launched from a single function called `main`. Here is the source code for a complete C/C++ program. 

```C++
int main() {
    // Code for the program goes within this code block
    int a = 2;
    return 0;
}
```

There can only be one **main** function in a C/C++ program.

## Basic data types

Bits (the 1's or 0's) are the fundamental data type in computing. Everything else is represented using an arrangement of bits. Integers and floats are standardised ways of representing numbers, and they typically use multiples of 8 bits (called **bytes**) to represent information. Other data types are either reinterpretations of these basic data types (such as characters), or compound mixtures of them.

### Integers

Integers represent whole numbers. There are two main types, signed and unsigned, with varying numbers of bits to represent them. For integers of length **N**, each bit is associated with a power of 2, and the value of the integer is the dot product between the vector of bits (1's and 0's) and a vector of decreasing powers of 2 ranging from $2^{N-1}$ to $2^{0}$. The bit associated with the highest power of 2 is called the **Most Significant Bit**, and the bit associated with the lowest power of 2 is called the **Least Significant Bit**. 

<figure style="margin-left:auto; margin-right:auto; width:60%;">
    <img style="vertical-align:middle" src="../images/integers.svg">
    <figcaption style= "text-align:lower; margin:1em; float:bottom; vertical-align:bottom;">Signed and unsigned integers with N=8 bits.</figcaption>
</figure>

With signed integers the most common representation is **two's complement**. This is where the most significant (or highest value) element in the vector of powers of 2 is negative.

For example the bit sequence "10011000" for an 8 bit signed integer represents a value of 

$$-2^{7} + 2^{4} + 2^{3} = -104, $$
$$-128 + 16 + 8 = -104. $$

For an unsigned integer the same bit sequence has a value of 

$$2^{7} + 2^{4} + 2^{3} = 152, $$
$$128 + 16 + 8 = 152. $$
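The two interpretations of the bit sequence above can be checked with a short sketch. The function names `as_signed` and `as_unsigned` are made up for this illustration, and the code assumes a platform that uses two's complement for signed integers (which covers the common ones).

```cpp
#include <cstdint>
#include <cstring>

// Interpret 8 bits as an unsigned integer (no change needed)
std::uint8_t as_unsigned(std::uint8_t bits) {
    return bits;
}

// Interpret the same 8 bits as a signed integer
std::int8_t as_signed(std::uint8_t bits) {
    std::int8_t value;
    // Copy the raw byte so that the bit pattern is preserved exactly
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

The bit sequence "10011000" is `0x98` in hexadecimal, so `as_unsigned(0x98)` gives 152 and `as_signed(0x98)` gives -104, matching the hand calculation.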


Using the formulae for values, we can derive the largest and smallest values for the integer types.

| | smallest | largest | 
| :- | :- | :- |
| **signed integer** | $$-2^{N-1}$$ | $$2^{N-1}-1$$  | 
| **unsigned integer** | $$0$$ | $$2^{N}-1$$ |


It is important to mention how a programming language handles cases where a math operation forces the number to overflow or underflow its representation. For C and C++, unsigned integers wrap around in a **known** way. For example, adding 1 to an 8-bit unsigned int with a value of 255 will return 0, and subtracting 1 from 0 will equal 255. For signed integers the overflow behaviour is **undefined**. An unsigned integer is a **safer choice** when you know you don't need to represent negative integers.
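The wrap-around behaviour of unsigned integers can be sketched as follows. The helper names `increment` and `decrement` are hypothetical, and the type `std::uint8_t` comes from the `<cstdint>` header. The explicit cast is there because C++ promotes small integer types to `int` (or `unsigned int`) during arithmetic.

```cpp
#include <cstdint>

// Add 1 to an 8-bit unsigned integer; 255 wraps around to 0
std::uint8_t increment(std::uint8_t x) {
    // x is promoted for the addition, the cast brings it back to 8 bits
    return static_cast<std::uint8_t>(x + 1u);
}

// Subtract 1 from an 8-bit unsigned integer; 0 wraps around to 255
std::uint8_t decrement(std::uint8_t x) {
    return static_cast<std::uint8_t>(x - 1u);
}
```

With these helpers `increment(255)` returns 0 and `decrement(0)` returns 255.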

There are a number of data types in C that represent integers with varying numbers of bits. Not every integer type means the same number of bits on every platform.

| Nominal number of bits (N) | name of signed form | name of unsigned form | 
| :- | :- | :- |
|8|char|unsigned char| 
|16|short|unsigned short|
|32|int|unsigned int|
|32-64|NA|size_t|
|64|long|unsigned long|

Unfortunately there is a lack of certainty across C++ software development frameworks on the number of bits used for integers. It depends on the compiler and the platform used. Thankfully there are other headers you can include, such as [**\<cstdint\>**](https://www.cplusplus.com/reference/cstdint/) that provide integer types with a **standardised number** of bits such as `int8_t` or `uint32_t`.

Variable declarations for integers look like the following in [basic_examples.cpp](basic_examples.cpp).

```C++
    char a_i=1;         // Using char as an integer
    short b_i=4;        // 16 bit
    int c_i=2;          // 32 bit          
    unsigned int d_i=3u; // 32 bit
    long e_l = 5l;       // 64 bit
    unsigned long e_ul = 5ul; // 64 bit
```

When putting numbers like this into a program, the compiler infers the number as an integer and a type conversion will take place in the assignment unless you decorate the number with a `u` for unsigned, `l` for long, or a `ul` for unsigned long.

The character type `char` is just an 8-bit integer whose value is associated with a lookup table of ASCII characters. We can also create a character in C using the following form.

```C++
// Create a character
char c = 'a';
```

### IEEE754 Floating point numbers

The [IEEE754 standard](https://ieeexplore.ieee.org/document/8766229) for floating point numbers was first established in 1985, has been revised since (most recently in 2019), and is the standard used in many applications, including OpenCL implementations. The bits that make up a floating point number are laid out in three sections: a **sign bit**, a **floating point exponent**, and a **mantissa**. The sign bit occupies one bit, the floating point exponent has **NBE** bits, and the mantissa has **NBM** bits. The total number of bits for a floating point number is then NBE+NBM+1. The **floating point exponent** **(E)** is just an unsigned integer with **NBE** bits. In order to represent negative powers of 2, a **Bias** equal to $2^{\mathrm{NBE}-1}-1$ is subtracted from E to form an exponent (in the mathematical sense) with base 2. The **mantissa** is constructed in a similar way to an unsigned integer, but its value is $2^{0}=1$ plus the dot product of the mantissa bits with a vector of decreasing powers of 2 ranging from $2^{-1}$ to $2^{-\mathrm{NBM}}$. All three components combine together to form the value, as shown below.

<figure style="margin-left:auto; margin-right:auto; width:100%;">
    <img style="vertical-align:middle" src="../images/floating_point.svg">
    <figcaption style= "text-align:lower; margin:1em; float:bottom; vertical-align:bottom;">Floating point representations with differing numbers of bits.</figcaption>
</figure>

For exploration with floating point numbers have a look at a floating point explorer, like [this one](https://evanw.github.io/float-toy/) or [this one](https://float.exposed/0x65fd). From the above we see that floating point representations are a logarithmic scale with base 2. Every power of two increase in value increases the spacing between representations by a factor of 2.

#### Limits on floating point number representations

The standard reserves special meaning (such as 0, $\pm \infty$, NaN, or subnormal numbers) for cases where the bits in the **floating point exponent** or **mantissa** are either all 0 or all 1. For example:

* If the exponent and mantissa bits are **all** zero then the value is $\pm0.0$ depending on the sign bit.
* If **all** the exponent bits are 1, and **all** the mantissa bits are 0, then the value is $\pm \infty$, depending on the sign bit. 
* If **all** the exponent bits are 1 and **any** of the bits in the mantissa are 1, the value is NaN.
* If **all** the exponent bits are 0, and **any** of the mantissa bits are not zero, we are in the range of **subnormal** numbers.

Outside these special cases, the smallest value possible for **E** is $E=1$ and the largest is $E=2^{\mathrm{NBE}}-2=2 \times \mathrm{Bias}$. Within these limits, we see that a floating point number can describe values within the following (normal) range of numbers:

| | |
| :- | :- |
| **smallest absolute value** | $$ 2^{1-\mathrm{Bias}} = 2^{2-2^{(\mathrm{NBE}-1)}}$$ |
| **largest absolute value** | $$ \left (2-2^{\mathrm{-NBM}} \right) 2^{\mathrm{Bias}}=\left (2-2^{\mathrm{-NBM}} \right)2^{(2^{(\mathrm{NBE}-1)}-1)} $$ |

#### C Function to extract values for E and mantissa

The **frexp** function in Python and C returns the values $x=0.5 \times (-1)^{S} \times \left ( 1.0 + \sum^{\mathrm{NBM}-1}_{i=0} B_i 2^{i-\mathrm{NBM}} \right )$ and $y=E-\mathrm{Bias}+1 $, such that $\mathrm{value} = x 2^{y}$. This can be useful for working out $E=y+\mathrm{Bias}-1=y+\left (  2^{\mathrm{NBE}-1}-2 \right )$ and the value of the mantissa $=2x$.

```python
import math
(x,y) = math.frexp(1.0)
print(x,y)
```

```
0.5 1
```
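The same extraction can be sketched in C++ with `std::frexp` from `<cmath>`. The helper names `biased_exponent` and `mantissa_value` are made up for this example, and the code assumes a normal 32-bit float where $\mathrm{NBE}=8$ and therefore $\mathrm{Bias}=127$.

```cpp
#include <cmath>

// Floating point exponent E of a normal 32-bit float (Bias = 127)
int biased_exponent(float value) {
    int y = 0;
    std::frexp(value, &y);   // value = x * 2^y with 0.5 <= |x| < 1
    return y + 127 - 1;      // E = y + Bias - 1
}

// Value of the mantissa, which lies in the range [1, 2)
float mantissa_value(float value) {
    int y = 0;
    float x = std::frexp(value, &y);
    return 2.0f * std::fabs(x);   // mantissa = 2|x|
}
```

For the value 1.0 this gives $E=127$ and a mantissa of 1.0, which agrees with the Python output `0.5 1`. For 6.0, which is $1.5 \times 2^{2}$, it gives $E=129$ and a mantissa of 1.5.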


#### Spacing between floating point representations

For any given floating point number $f$ with floating point exponent $E$, the next represented floating point number is at a spacing of exactly $$|\Delta f|=2^{(E-\mathrm{Bias}-\mathrm{NBM})}=2^{(y-1-\mathrm{NBM})},$$ where $y$ is the value from the **frexp** function. A twofold change in spacing occurs at every power of two in absolute value.

The result of a calculation might be representable **exactly** as a floating point number, or it **might not**. The difference between the exact result and the **nearest floating point number** that can represent it is called **floating point error**. For iterative processes floating point errors can accumulate over time, which is why 64-bit floating point arithmetic is essential for accuracy in some sciences.

Furthermore, as long as $|\Delta f|$ is no greater than 1, whole numbers can be represented **exactly**. By solving $|\Delta f|=2^{(E-\mathrm{Bias}-\mathrm{NBM})}=1$ for E we get $E-\mathrm{Bias}=\mathrm{NBM}$, and the largest value with that exponent is $$N_{\mathrm{max}} = (2^{\mathrm{NBM}+1}-1).$$ Every whole number up to this absolute value is guaranteed to have an exact floating point representation. From the next power of two, $2^{\mathrm{NBM}+1}$, onwards the spacing becomes $|\Delta f|=2$ and larger.

If you include `<cmath>` you can also use the `std::nexttoward` or `std::nextafter` functions to find the next represented floating point number.
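As a minimal sketch of that idea, the function below measures the spacing directly with `std::nextafter`. The name `spacing_above` is hypothetical and the example is restricted to 32-bit floats.

```cpp
#include <cmath>
#include <limits>

// Spacing between f and the next representable float above it
float spacing_above(float f) {
    float next = std::nextafter(f, std::numeric_limits<float>::infinity());
    return next - f;
}
```

For 32-bit floats ($\mathrm{NBM}=23$) `spacing_above(1.0f)` is $2^{-23}$ and `spacing_above(2.0f)` is $2^{-22}$, as the spacing formula predicts. At $2^{24}=16777216$ the spacing is 2, so not every whole number can be represented beyond that point.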

Because floating point errors can accumulate, it is **not good practice** to have your code keep track of a variable through incrementing it by a small value. Instead it is better to use an integer to keep track of the iteration, then compute an offset. We demonstrate this with Python to iterate over a process.

```python
import numpy as np

# Starting value
t1=np.float32(0.0);
t2=np.float32(0.0);

# Can't be represented exactly
dt=np.float32(1.0/3.0)
niters=100000

# Floating point errors will accumulate
for n in range(0,niters):
    t1=t1+dt
    
# Using an integer eliminates accumulated errors.
for n in range(0,niters):
    t2 = (n+1)*dt
    
print(t1,t2, t2-t1)
```

```
33356.555 33333.33432674408 -23.22036075592041
```


Over 100,000 iterations the counter t1 has diverged from the correct result by more than 20 simply due to floating point error! This is because (1/3) can't be represented exactly by floating point numbers.
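The same experiment might look like this in C++. This is only a sketch: the function names are made up, and the exact size of the accumulated error can depend on the compiler and its floating point settings.

```cpp
#include <cmath>

// Advance time by repeatedly adding dt; floating point error accumulates
float time_by_accumulation(int niters, float dt) {
    float t = 0.0f;
    for (int n = 0; n < niters; n++) {
        t = t + dt;
    }
    return t;
}

// Compute the time from an integer counter; the result is rounded only once
float time_by_counter(int niters, float dt) {
    return static_cast<float>(niters) * dt;
}
```

Calling both functions with `niters=100000` and `dt=1.0f/3.0f` lets you compare the results against the expected value of about 33333.33. The counter-based version should stay close to it, while the accumulated version drifts away.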

#### Examples with floating point numbers

1. For 32-bit floats, the number of bits in the floating point exponent is $\mathrm{NBE}=8$ and the number of bits in the mantissa is $\mathrm{NBM=23}$, therefore $\mathrm{Bias}=2^{\mathrm{NBE}-1}-1=127$. The smallest (normal) floating point representation is $2^{1-\mathrm{Bias}}=1.1754944\times 10^{-38}$ and the largest (normal) floating point representation is $3.4028235\times10^{38}$. For values $f$ in the range $(1.0 \le f<2.0)$, where $E-\mathrm{Bias}=0$, the spacing from one floating point representation to the next is $\Delta f=2^{E-\mathrm{Bias}-\mathrm{NBM}} = 2^{-23} \approx 1.1920929 \times 10^{-7}$. In the range $(0.5 \le f<1.0)$ the spacing will be $\Delta f=2^{-24}$ and in the range $(2.0 \le f<4.0)$ the spacing will be $\Delta f=2^{-22}$.

2. Given the 16 bit sequence "0010000000001101" what is the corresponding floating point number? For a 16-bit floating point number $\mathrm{NBE}=5$ and $\mathrm{NBM}=10$. We line up the bits against bit positions and calculate the answer.

<figure style="margin-left:auto; margin-right:auto; width:70%;">
    <img style="vertical-align:middle" src="../images/float16.svg">
    <figcaption style= "text-align:lower; margin:1em; float:bottom; vertical-align:bottom;">Example 16-bit floating point number.</figcaption>
</figure>

$$ \mathrm{Bias} = 2^{\mathrm{NBE}-1}-1=15$$

$$ E = 1 \times 2^{3} = 8 $$

$$\mathrm{Value} = (-1)^{0}\left( 1 + 2^{-7} + 2^{-8} + 2^{-10} \right) \times 2^{E-\mathrm{Bias}} \approx 7.91 \times 10^{-3}$$
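The hand calculation above can be cross-checked with a small decoder. This is a sketch under stated assumptions: the function name `decode_half` is made up, the bit sequence is passed as a 16-bit unsigned integer, and only normal numbers are handled (zero, subnormal numbers, infinity, and NaN are left out).

```cpp
#include <cmath>
#include <cstdint>

// Decode a normal 16-bit float (NBE=5, NBM=10) from its bit pattern.
// Special cases (zero, subnormal, infinity, NaN) are not handled here.
double decode_half(std::uint16_t bits) {
    const int NBM = 10;
    const int bias = 15;               // 2^(NBE-1) - 1
    int sign = (bits >> 15) & 0x1;     // 1 sign bit
    int E = (bits >> NBM) & 0x1F;      // 5 exponent bits
    int M = bits & 0x3FF;              // 10 mantissa bits
    // Mantissa value is 1 plus the mantissa bits scaled by 2^(-NBM)
    double mantissa = 1.0 + std::ldexp(static_cast<double>(M), -NBM);
    double value = std::ldexp(mantissa, E - bias);
    return (sign == 1) ? -value : value;
}
```

The bit sequence "0010000000001101" is `0x200D` in hexadecimal, and `decode_half(0x200D)` returns 0.00791168212890625, which rounds to the $7.91 \times 10^{-3}$ obtained above.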

#### Floating point numbers in C

The C floating point data types for varying numbers of bits are as follows:

| Nominal number of bits (NBM+NBE+1) | name |
| :- | :- |
|16|half|
|32|float|
|64|double|
|64-128|long double|

As with integers, the lack of certainty in the number of bits used for floating point numbers can be a problem, as it depends on the implementation. Standards like OpenCL fix this problem by defining a universal set of standard datatypes.

Declaring a floating point number in C may be done as follows. See the compiled example in the file [basic_examples.cpp](basic_examples.cpp).

```C++
// half precision (16-bit), only available on some platforms and compilers
half a=2.0f16;

// single precision (32-bit)
float b=2.0f; // or 2.0f32 where supported

// double precision (64-bit)
double c=2.0; // or 2.0f64 where supported

// extended precision (64 to 128 bits, depending on the platform)
long double d=2.0l; // or 2.0f128 where supported
```

## Big-endian and little-endian number storage

Both float and integer representations are divided into discrete chunks of bytes (8 bits). The arrangement of bits within a byte is fixed, however when a data type is constructed from multiple bytes, there are two different ways of arranging the **bytes** in memory and on disk. In **big-endian** representation the bytes representing the higher powers of two (big end, or most significant) are first, and bytes representing the lower powers of two (little end, or least significant) are last. A **little-endian** representation reverses the ordering of the bytes and moves the least significant bytes to the front. In either case it is important to know that within each byte the ordering of the 8 bits is **unchanged**.

Shown in the diagram below is a 32-bit number (integer or float) represented in both the big-endian and little-endian formats: 

<figure style="margin-left:auto; margin-right:auto; width:70%;">
    <img style="vertical-align:middle" src="../images/endian.svg">
    <figcaption style= "text-align:lower; margin:1em; float:bottom; vertical-align:bottom;">A 32-bit number represented in big-endian and little-endian representations.</figcaption>
</figure>

Unfortunately, there is no universal standard for "endianness". It depends entirely on the processor architecture. The x86-64 architecture uses the little-endian representation, whereas some architectures such as ARM can be configured to use either! Endianness becomes an **issue** when reading or writing files to disk or sending data over a network. This is why a simple binary dump of values to disk is **not suitable** for sharing data. Instead, use a file format such as [HDF5](https://www.hdfgroup.org/) where the "endian" format and other explanatory metadata are included with the file.
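To find out which byte order a machine uses, one option is to look at the first byte of a known multi-byte value. The sketch below is one possible way of doing this; the function names are hypothetical, and `swap_bytes` shows how the byte order of a 32-bit integer could be reversed by hand.

```cpp
#include <cstdint>
#include <cstring>

// Return the byte stored at the lowest memory address of a 32-bit integer
std::uint8_t first_byte_in_memory(std::uint32_t value) {
    std::uint8_t bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value));
    return bytes[0];
}

// On a little-endian machine the least significant byte comes first
bool is_little_endian() {
    return first_byte_in_memory(0x01020304u) == 0x04;
}

// Reverse the byte order of a 32-bit integer
std::uint32_t swap_bytes(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}
```

Note that `swap_bytes` only moves whole bytes around; the bits within each byte keep their order, as described above.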

## Variables on the stack

Thus far, we have been declaring variables as integers and floats. Memory for variables that are declared like this is allocated at runtime in a prepared and reserved area of memory called the **stack**. These variables are valid while they are accessible, and are taken off the stack when the program executes statements outside the code block in which the variables are defined. For example, in the code below we define variables `a` and `b` in different code blocks.

```C++

// Variable
int c=0;

// First code block
{
    int a = 2;
}

// Second code block
{
    int b = 1;
}
```

During execution of the first code block a reservation of memory large enough for `a` (one integer) is created on the stack. When the first code block ends, the variable `a` goes out of scope: it can no longer be "seen" and is deallocated, so it no longer exists by the time the second code block is executed. The variable `c` is still in scope because it can be "seen" from within the first and second code blocks. In general, variables declared outside a code block may be "seen" from within it, but variables declared inside a code block cannot be "seen" from outside.

## Pointers

In C/C++ one can get the starting address (as an integer type) of any memory allocation, and this address is called a **pointer**.

The pointer type determines how the memory will be accessed, for example as a float, as an integer or another data type. We create a pointer to a particular data type by declaring a variable as usual and putting a `*` in front of the variable name to indicate that it is a pointer to a variable of that type.

```C++
int *p;  // Create a pointer to an integer called p
int a=2; // Create an integer and give it a value of 2
```

The address operator **&** gets the address of allocated memory, which can be assigned to the pointer.

```C++
p = &a;  // Get the address of the integer and assign it to p
```

Finally, the de-referencing operator `*` can access the value pointed to by a pointer.

```C++
int y = *p; // Access the value pointed to by p and assign to y
```

There is a danger here. If **p** was not assigned a valid address and we try to de-reference it, the behaviour is undefined; the result is typically a memory error and a **potential crash** of the program.

### Void pointers

There is a pointer type called `void*` that can point to any memory allocation.

```C++
void *v = &a;
```

Pointers should be set to **NULL** when they are not pointing to anything useful. That way the program can fail in a more predictable manner if the NULL pointer is used inappropriately.

```C++
    v = NULL;
```

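In C++ you can also use **nullptr** instead of **NULL**. The keyword **nullptr** is built into the language (since C++11) and needs no header, whereas **NULL** is a macro defined in headers such as **\<cstddef\>**.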
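One way to benefit from setting unused pointers to null is to check a pointer before de-referencing it. The function below is a minimal sketch of that habit; the name `set_value` is made up for this example.

```cpp
// Write a value through a pointer, but only if the pointer is not null.
// Returns true if the write took place.
bool set_value(int *p, int value) {
    if (p == nullptr) {
        return false;   // nothing to point at, refuse to de-reference
    }
    *p = value;         // de-reference p and store the value
    return true;
}
```

If `a` is an integer, then `set_value(&a, 5)` stores 5 in `a` and returns true, whereas passing a null pointer returns false without touching memory. A check like this only catches null pointers, it cannot detect a pointer that holds some other invalid address.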


## Functions

A function is some code that does a task and can be called any number of times. Functions take zero or more arguments or variables as input and return zero or one output. Here are some example functions:

```C++
void simple() { // Takes no arguments and returns no values, we use void as the return type
    int a=2;
}

int get_int() { // Takes zero arguments and returns an integer, notice we have to declare the return type
    int a=2;
    
    return a; // The return statement is necessary if the function returns something
}
```

We call these functions from within a program like this:

```C++
// Call a function that doesn't return anything
simple();

// Call a function that returns an integer
int y = get_int();
```

The limitation on one return value isn't as bad as it seems, because we can pass pointers into the function and use the dereferencing operator to set values. Notice that we have to specify the data types on the inputs and outputs.

```C++
float more_complex(float *p, float a) { // Takes two arguments, a pointer and a float
    *p = a; // dereference p and set the contents to a
    return *p + 1; // return the value pointed to by p, but add 1
}
```

Such a function would be used as follows:

```C++
float a = 1.0f;
float b = 2.0f;

// We send the address of b and the value of a to the function.
// The function will then fill b with a copy of the contents of a and return a+1
float c = more_complex(&b, a);
```

What would be the value of `c`?
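As a further sketch of using a pointer argument to get a second output from a function, consider integer division where both the quotient and the remainder are wanted. The function name `divide` is hypothetical, and the code assumes that the denominator is not zero and that the pointer is valid.

```cpp
// Integer division that returns the quotient and writes the remainder
// to the address held in the pointer argument.
// Assumes denominator is not zero and remainder is a valid pointer.
int divide(int numerator, int denominator, int *remainder) {
    *remainder = numerator % denominator;
    return numerator / denominator;
}
```

Calling `int q = divide(17, 5, &r);` with an integer `r` leaves 3 in `q` and 2 in `r`.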

## Printing values

### C-style printing

The function [**printf**](https://www.cplusplus.com/reference/cstdio/printf) from the standard library offers a C-like way to print variables to the command line. You need to include **\<cstdio\>** in the header of the program, like this:
    
```C++
#include <cstdio>
```

The **printf** function is used like this:

```C++
char a='s';
char b='t';
float x=5.0f;
float y=6.0f;

int i=2;
size_t s=3;

std::printf("%c is like %c\n", a, b); // Print a and b to the screen with their memory interpreted as characters
std::printf("%f %f\n", x, y); // Print x and y to the screen with their memory interpreted as floats
std::printf("%i %zu\n", i, s); // Print i and s to the screen with their memory interpreted as an int and a size_t
```

The first argument is the format string and it can be followed by any number of arguments. Notice the newline character `\n` in the format string to make a new line at the end of the printing. The percent characters **\%** are placeholders for variables in the arguments and describe how the arguments should be interpreted and displayed. The **std::** means that we are using the **printf** function from the **std** namespace. This extra namespace qualifier isn't necessary or available when we work with C code.
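Placeholders can also carry a field width or a number of decimal places. The sketch below uses `std::snprintf`, which works like `printf` but writes into a character buffer, so that the formatted text can be inspected. The helper names and the use of `std::string` are choices made for this example only.

```cpp
#include <cstdio>
#include <string>

// Format a value with three decimal places
std::string format_fixed(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%.3f", value);
    return std::string(buffer);
}

// Format an integer right-aligned in a field of 5 characters
std::string format_int(int value) {
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%5d", value);
    return std::string(buffer);
}
```

Here `format_fixed(3.14159)` produces the text `3.142`, and `format_int(42)` produces `42` padded on the left with three spaces.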

### C++-style printing

In C++ there is an output stream object called **std::cout** that can be used to print variables. In the header we need to include **\<iostream\>**:

```C++
#include <iostream>
```

We use the insertion operator `<<` to send variables to **std::cout**. Note that one does not have to choose a format method like we need to do with **printf**. We can also use the insertion operator multiple times to send information to **std::cout**.

```C++
// A C-string to print (example value)
const char *f = "Hello";

// Print the C-string f followed by a new line
std::cout << f << "\n";

// Print a float followed by an end of line
float x = 1.0f;
std::cout << x << std::endl;
```

The **std::endl** manipulator appends an end of line character to the output as shown above, and it also flushes the output stream. Writing the end of line character `\n` by itself skips the flush and is usually faster.

## Math and other operators in C++

Math operators in C and C++ have the following form, listed in descending order of priority. Operators that are grouped together (such as `*`, `/`, and `%`) share the same priority. A more comprehensive list can be found [here](https://en.cppreference.com/w/c/language/operator_precedence).

| Operator | Explanation |
| :- | :- |
| x++ | return x and then increment x by 1 |
| x-- | return x and then decrement x by 1 |
| ++x | increment x by 1 and return the incremented x |
| --x | decrement x by 1 and return the decremented x |
| x * y | multiply x and y |
| x / y | divide x by y |
| x % y | return the remainder when x is divided by y |
| x + y | add x and y |
| x - y | subtract y from x |
| x > y | return 1 if x is greater than y, zero otherwise |
| x < y | return 1 if x is less than y, zero otherwise |
| x >= y | return 1 if x is greater than or equal to y, zero otherwise |
| x <= y | return 1 if x is less than or equal to y, zero otherwise |
| x == y | return 1 if x is equal in value to y, zero otherwise |
| x != y | return 1 if x is not equal in value to y, zero otherwise |
| x && y | return 1 if both x AND y are not zero, return zero otherwise |
| x \|\| y | return 1 if either x OR y are not zero, return zero otherwise |

```C++
// Example use of math operators
int x=2;
int y=3;

std::printf("%d\n", x>y); // will print 0, why?
```

If we include the **\<cmath\>** header then a large number of math functions become available. For more information see [this resource](https://en.cppreference.com/w/cpp/header/cmath).

```C++
#include <cmath>
#include <cstdio>

int main() {

    // Example use of a math function
    float x=2.0f;
    float y=std::sin(x);

    std::printf("%f\n", y); // Print the sine of 2.0
}
```

When writing code, if there is any ambiguity with the order of math operations it is better to enclose those operations in parentheses and make them explicit, rather than rely on operator precedence. In the example below we know from operator precedence that `(a*b)` will take place before the subtraction of b:

```C++
float a = 2.0f;
float b = 3.0f;
float y = a*b-b;
```

It is far more readable (and eliminates the guesswork) if we use parentheses to make the precedence explicit.

```C++
float a = 2.0f;
float b = 3.0f;
float y = (a*b)-b;
```

When representing decimal numbers in code, the default type of the number is `double` unless you put an `f` (float) or `l` (long double) suffix next to it. When a number is assigned to a variable of a different type it is automatically converted to the type of that variable, however you can avoid this conversion step by using the appropriate suffix.
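The difference between a `float` and a `double` version of the same decimal number can be seen by comparing them. In the sketch below the helper name `same_value` is made up; the `float` argument is converted to `double` for the comparison.

```cpp
// Compare a float with a double; a is converted to double for the comparison
bool same_value(float a, double b) {
    return a == b;
}
```

The call `same_value(0.5f, 0.5)` returns true because 0.5 is a power of two and is represented exactly at both precisions. The call `same_value(0.1f, 0.1)` returns false, because 0.1 cannot be represented exactly and the two types round it to different nearby values.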

## Flow control with `if` statements and `for` loops

The conditions of **if** statements and **for** loops are enclosed in parentheses `()` and the code path taken is implemented in code blocks. **If** statements may optionally have one or more `else if` statements, and optionally an `else` catch-all statement. Whether or not the code block is entered depends on the result of the expression immediately after the **if**. A value of anything other than 0 means `True` and the code block is executed. A value of 0 means `False` and the code block is not executed. For example:

```C++
// Demonstrating if statements
int x=2;
int y=3;

// Using an if statement
if (x>y) {
    // This line won't be executed because x is smaller than y
    std::printf("%i is greater than %i", x, y);
}

// Using an if statement with an else
if (x > y) {
    std::printf("%i is greater than %i", x, y);
} else {
    // This statement gets executed
    std::printf("%i is less than or equal to %i", x, y);  
}

// Using an if statement with an else if and an else
if (x > y) {
    std::printf("%i is greater than %i", x, y);
} else if (x == y) {
    std::printf("%i is equal to %i", x, y);  
} else {
    // This statement gets executed
    std::printf("%i is less than %i", x, y);  
}
```

When writing `if` statements one has to be very careful to adequately handle all of the different permutations that the choices provide and not leave any gaps.

**For** loops are constructed in the following manner:

```C++
// Demonstrating a for loop
int N=12;

//   starting condition; continue condition; increment
for (int x=0; x<N; x++) {
    // Execute this code block each time we go around the loop
    std::printf("x is now %i\n", x);
}
```

For loops usually have a loop variable. In the example above the starting condition is `x=0` and we are free to create the loop variable at that time. Next is the continuing condition, here it is `(x<N)`, which means that the loop will continue while `(x<N)` evaluates to 1 (True). The increment expression is what happens to the loop variable at the end of each iteration.

Of course **for** loops may be nested, like this:

```C++
// Demonstrating a 2D for loop

// initialise loop limits
int M=3, N=2;

//   starting condition; continue condition; increment
for (int m=0; m<M; m++) {
    for (int n=0; n<N; n++) {
        // Execute this code block each time we go around the loop
        std::printf("m is %i, n is %i\n", m, n);
    }
}
```

The `printf` statement in the innermost loop is evaluated 6 times.

## Arrays and memory allocations

### Static array allocation

Sometimes we want to allocate memory for more than one element. If we know the size of the array at compile time and the array is small (i.e. less than the stack size limit determined by the OS), then we can `statically` allocate an array as follows:

```C++
// Create an array of 3 elements
int a[3];

// Setting values in the array
a[0]=0;
a[1]=1;
a[2]=2;

// C++-style printing
std::cout << "Statically allocated array a at index 0 is: " << a[0] << "\n";
// C-style printing
std::printf("%d\n", a[2]); // Using indexing operator
std::printf("%d\n", *(a+2)); // Using pointer arithmetic
```

When we do this, an array of 3 integers is allocated on the stack and the variable **a** behaves like a **pointer** to the first address of the memory allocation. Then we can use the indexing operator **[]** with the pointer to get at the elements of the allocation, as shown below:

```C++
// Print the first element in the array at index 0
std::cout << "Statically allocated array a at index 0 is: " << a[0] << "\n"; 
```

Indices in C++ start at 0 and array bounds are not checked. If we go beyond the 3rd element (at index 2) and try to access memory that is not allocated for the array, the behaviour is undefined: the OS may detect the access violation and crash the program, or the program may silently read or overwrite other data.

### Array access and pointer arithmetic

A pointer is an address. We can add or subtract an integer offset to a pointer to access neighbouring elements (of the same data type) to the pointer address. Thus the following statements are equivalent:

```C++
std::printf("%d\n", a[2]); // Using indexing operator
std::printf("%d\n", *(a+2)); // Using pointer arithmetic
```
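Pointer arithmetic also works inside a loop. As a minimal sketch, the hypothetical function below adds up the elements of an integer array by offsetting the pointer to its first element.

```cpp
#include <cstddef>

// Sum the elements of an array using pointer arithmetic
int sum_with_pointers(const int *a, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; i++) {
        total += *(a + i);   // same as a[i]
    }
    return total;
}
```

For an array holding the values 0, 1, and 2, the call `sum_with_pointers(a, 3)` returns 3. The caller is responsible for passing the correct number of elements, because the function has no way of checking the bounds of the array.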

### Dynamic array allocation with `calloc` and `free`, `new` and `delete`

Sometimes we don't know how big an array needs to be, or we know for sure that the array is too big to fit on the stack. In such cases we can **dynamically** allocate memory at runtime from a large pool of available memory called the **heap**. In C and C++ two functions that reserve memory from the heap are **[calloc](https://www.cplusplus.com/reference/cstdlib/calloc/)** and **[malloc](https://www.cplusplus.com/reference/cstdlib/malloc/)**. `malloc` is fast; however, I prefer `calloc` because, unlike `malloc`, it always initialises the memory contents with zeros. You need to include the header `<cstdlib>` to access these functions. Memory allocation with `calloc` looks like the following:

```C++
size_t N = 1024;
void *a = calloc(N, sizeof(float)); // Allocates memory for 1024 floats
```

This allocates memory for 1024 floats. The `sizeof` operator gives the size of its argument in bytes. The pointer returned by `calloc` is a void pointer, but we can use the casting operator `()` to convert it to the pointer type that we need. In this case it is `float*`.

```C++
size_t N = 1024;
float *a = (float*)calloc(N, sizeof(float)); // Allocate memory and cast to float* in one step.
```

Now we can use the indexing operator **[]** to access elements in the allocated memory relative to the position of the first address, which is **a**.

```C++
// Print the first element
std::printf("Dynamically allocated array a at index 0 is: %f\n", a[0]);
```

Memory that is allocated in this way **must be manually deallocated**. The **free** function takes a pointer to allocated memory and frees it for use again. Failure to free allocated memory when we no longer need it results in what is known as a **memory leak**, where memory stays reserved but is never released. If this leak happens in a frequently executed loop then the program can quickly consume most of the available memory on a computer before it is terminated by the OS.

```C++
free(a);
```
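The allocate, use, and free steps above can be put together as in the following sketch. The helper name `sum_of_fresh_calloc` is hypothetical. The sketch assumes the usual IEEE 754 representation of floats, in which memory filled with zero bits reads back as `0.0f`.

```C++
#include <cstdlib>
#include <cstddef>

// Hypothetical helper: allocate n floats with calloc, sum them, then free
// Returns -1.0f if the allocation failed
float sum_of_fresh_calloc(std::size_t n) {
    // Allocate zero-initialised memory and cast to float*
    float* a = (float*)std::calloc(n, sizeof(float));

    // calloc returns NULL if the memory could not be reserved
    if (a == NULL) {
        return -1.0f;
    }

    // Every element should be zero straight after calloc
    float total = 0.0f;
    for (std::size_t i = 0; i < n; i++) {
        total += a[i];
    }

    // Release the memory so that it does not leak
    std::free(a);
    return total;
}
```

Calling `sum_of_fresh_calloc(1024)` should return `0.0f`, because `calloc` zeroes the allocation.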

In C++ there are also the **new** and **delete** operators. These operate in much the same way as `malloc` and `free`, and initialisation of the allocation to zero is not assured. The **new** and **delete** operators are essential for creating more complex C++ types. Memory allocated with `new[]` must be released with `delete[]`.

```C++
// Allocate memory, the pointer a was already defined
a = new float[N];

// Set the first element, memory from new is not initialised
a[0] = 1.0f;

// Print the first element
std::printf("%f\n", a[0]); 

// Deallocate memory
delete[] a;
```

For all the examples above we could have used any other data type such as **double** or **int** instead of **float**.

### Safer dynamic allocations with C++ vectors

In order to prevent memory leaks it is much safer to use one of the many available C++ container types. A **vector** is a container type that has access to a dynamic memory allocation that **can be resized**. This dynamic allocation (and any vector elements) is **automatically destroyed** when the vector goes out of scope. In the file [basic_examples.cpp](basic_examples.cpp) we demonstrate basic usage of a vector.

```C++
#include <vector>
#include <iostream>

int main() {

    // Create a vector designed to hold ints
    std::vector<int> v;

    // Resize the vector to store 10 elements
    v.resize(10);

    // Access element 0
    std::cout << v[0] << "\n";

    // Get a pointer to the first element
    int* p = v.data();

    // Get the size
    std::cout << v.size() << "\n";
}
```

When this vector goes out of scope, the memory allocation is automatically released. Any pointers to container elements must no longer be used however, as they will point to memory that is no longer valid. While a proper discussion of C++ container types and classes is beyond the scope of this foundational course, using container types is **best practice** for production quality C++ code. See the [reference material](#reference) for further information.
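As a hedged sketch of how a vector can replace a manual allocation, the code below hands the vector's underlying memory to a C-style function through `data()`. The names `scale` and `scaled_sum` are hypothetical and do not come from [basic_examples.cpp](basic_examples.cpp).

```C++
#include <vector>
#include <cstddef>

// Hypothetical C-style function that works on a raw pointer
void scale(float* a, std::size_t n, float factor) {
    for (std::size_t i = 0; i < n; i++) {
        a[i] *= factor;
    }
}

// Hypothetical helper: make n ones, scale them, and return their sum
float scaled_sum(std::size_t n, float factor) {
    // Create a vector of n elements, each initialised to 1.0f
    std::vector<float> v(n, 1.0f);

    // Pass the underlying allocation to the C-style function
    scale(v.data(), v.size(), factor);

    // Sum the elements
    float total = 0.0f;
    for (std::size_t i = 0; i < v.size(); i++) {
        total += v[i];
    }

    // v goes out of scope on return and its memory is released,
    // so no call to free or delete is needed
    return total;
}
```

For example, `scaled_sum(4, 2.0f)` returns `8.0f`.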

### Strings as an array of characters

#### C-style strings

A C-style string is just a memory allocation of characters with a null character **'\0'** to signify the end of the string. The null character is required by many string handling functions, otherwise they can run off the end of the array and cause a memory access violation. The allocation of memory may be larger than is needed to store all the characters in the string. We can declare a C-string as follows:

```C++
// Declare a C string
char f[] = "first string"; // String whose contents may be modified
const char p[] = "second string"; // String whose contents may not be modified
const char *q = "third string"; // String whose contents may not be modified

// The values f and p are just pointers to the first address in the array of characters

// Print the string, this prints all characters up to the null character 
std::printf("%s\n", p);

// Print element 0 as a character 
std::printf("%c\n", p[0]);

// Print a float
float x = 1.0f;
std::printf("%f\n", x);

// Print a float using scientific notation
std::printf("%e\n", x);
```

When strings are declared in this way, there is no need to explicitly assign a null character to the last element. The compiler does it for you. Inserting a null character may become necessary if you have an array of characters and want to use that array as a string.
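The case where we must insert the null character ourselves can be sketched as follows. The helper `fill_cstring` is a hypothetical name; it fills a character array by hand and then terminates it so that string functions such as `std::strlen` know where the string ends.

```C++
#include <cstring>
#include <cstddef>

// Hypothetical helper: write n copies of the character c into buf
// and terminate the result so it can be used as a C-style string.
// buf must have room for at least n+1 characters.
void fill_cstring(char* buf, std::size_t n, char c) {
    for (std::size_t i = 0; i < n; i++) {
        buf[i] = c;
    }
    // Without this null character buf is not a valid C-style string
    buf[n] = '\0';
}
```

After `char buf[8]; fill_cstring(buf, 3, 'x');` the call `std::strlen(buf)` returns 3, even though the allocation holds 8 characters.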

#### C++-style strings

C++ introduces a new object for string handling that works very much like a `vector`. First, we need these includes at the top of the source file,

```C++
#include <string>
#include <iostream>
```

then we can declare objects of type **string**. Here are some common operations with C++ style strings.

```C++
// Declare a C++ string
std::string words = "Hey there!";

// String concatenation
std::string morewords = words + " Nice to see you!";

// Print the size of the string (number of characters in the string)
std::cout << morewords.size() << "\n";

// Get a C-style string (string + termination character) from C++ style string
std::cout << morewords.c_str() << "\n";

// Print the string with std::cout
std::cout << words << "\n";
```

This code can be found in [basic_examples.cpp](basic_examples.cpp).

## The compilation process

C and C++ sources need to be **compiled** into a program before they can be run. 

Compilation of C/C++ sources takes place in four stages:

1. Pre-process to remove comments and include any files brought in with the `#include` statement.
2. Compile from source code to lower level assembly code.
3. Use an assembler to convert assembly file to an **object file** containing machine code.
4. Link object files together to make a program.

The compiler takes care of the specifics in these steps. Usually when compiling C and C++ sources we can perform steps 1-3 (compile), and then step 4 (link); or we can do steps 1-4 in one go.

### Compile example code 

Here we compile the example code [basic_examples.cpp](basic_examples.cpp) using either 1-step or 2-step compilation.

#### 1-step compilation

This is where the compiler takes care of all the steps.

```bash
# Pre-process, compile, assemble, and link
g++ basic_examples.cpp -o basic_examples.exe
```

#### 2-step compilation

```bash
# Pre-process, compile, and assemble an object file
g++ -c basic_examples.cpp -o basic_examples.o

# Link the object file into an executable
g++ basic_examples.o -o basic_examples.exe
```

The two-step process is better suited to complex projects, because object files for unchanged sources can be reused instead of being recompiled.

## Multi-dimensional arrays

Now that we have a way of allocating and accessing memory in a one-dimensional array, let's see how to apply this to higher-dimensional structures like matrices and tensors. Fortunately, we don't need dedicated multi-dimensional array representations, which can slow down access to elements. All we need to do is treat the 1D memory allocation as if it were **folded** into a higher-dimensional structure. Then we can use math to step through the higher dimensions. Shown below is a memory allocation of 24 elements. Below it are two 3D arrays of size **N** = (3,4,2), constructed by regarding the memory allocation as if it were "folded" according to **row-major** and **column-major** ordering.

<figure style="margin-left:auto; margin-right:auto; width:100%;">
    <img style="vertical-align:middle" src="../images/col_major_row_major_indexing.svg">
    <figcaption style= "text-align:lower; margin:1em; float:bottom; vertical-align:bottom;">Row-major and column-major representations of a multi-dimensional array.</figcaption>
</figure>

If you look carefully at the folded representations you can see that a step of 1 along any dimension corresponds to a **stride** of a fixed number of elements in the 1D array. For example, in the arrays above a step of 1 along dimension 1 (left to right) requires a **stride of 2** elements in row-major ordering and **3** elements in column-major ordering. The stride for each dimension is encapsulated in the strides vector **S**. In this example `S = (8,2,1)` for row-major and `S = (1,3,12)` for column-major representations. For any (zero-based) coordinate vector **C** the position **P** in the underlying memory allocation is just the dot product between the coordinate vector **C** and the strides vector **S**.

$$P=C \cdot S$$

For example, take the coordinates `C=(1,3,1)`. Under row major ordering with `S=(8,2,1)` the position is:

$$P_{\mathrm{row-major}} = C \cdot S = (\textbf{1} \times 8) + (\textbf{3} \times 2) + (\textbf{1} \times 1) = 15.$$

Under column-major ordering with `S=(1,3,12)` the position is now:

$$P_{\mathrm{col-major}} = C \cdot S = (\textbf{1} \times 1) + (\textbf{3} \times 3) + (\textbf{1} \times 12) = 22.$$

This way of thinking about multi-dimensional arrays is **really powerful**, because in order to step by one in any dimension all we need to know is the stride for that dimension. It is also cheap to access elements this way, because computing a position takes only a few integer multiplications and additions.
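The dot product between coordinates and strides can be sketched in a few lines of C++. The helper name `offset3d` is hypothetical; it assumes a 3D array and takes the coordinate vector **C** and strides vector **S** as plain arrays.

```C++
#include <cstddef>

// Hypothetical helper: position P in the 1D allocation for a
// zero-based 3D coordinate C, given the strides S (P = C . S)
std::size_t offset3d(const std::size_t C[3], const std::size_t S[3]) {
    return C[0]*S[0] + C[1]*S[1] + C[2]*S[2];
}
```

With `C = {1, 3, 1}` this returns 15 for the row-major strides `{8, 2, 1}` and 22 for the column-major strides `{1, 3, 12}`, matching the worked example. An element of a float array `a` would then be read as `a[offset3d(C, S)]`.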

#### Constructing stride vectors

Constructing the strides vector in each ordering scenario is straightforward. For **row-major ordering** we fill the stride vector from right to left and place a stride of 1 in the last dimension. In this case $S_{2}=1$. Then we obtain the previous element $S_{1}$ using the pattern.

$$S_{i-1} = S_{i} N_{i}.$$

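Repeat the pattern until the stride vector is filled. For **N** = (3,4,2) this gives $S_{1} = 1 \times 2 = 2$ and then $S_{0} = 2 \times 4 = 8$, which recovers `S = (8,2,1)`.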

Filling the strides vector is much the same for **column-major ordering**, but we fill from left to right. Starting with $S_{0}=1$ the next element is filled using the pattern

$$S_{i+1} = S_{i} N_{i}.$$
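Both patterns can be sketched as short loops. The function names `row_major_strides` and `col_major_strides` are hypothetical, and the use of `std::vector` to hold **N** and **S** is a choice made for this illustration.

```C++
#include <vector>
#include <cstddef>

// Hypothetical helper: strides for row-major ordering
std::vector<std::size_t> row_major_strides(const std::vector<std::size_t>& N) {
    // Start with a stride of 1 everywhere, the last dimension keeps it
    std::vector<std::size_t> S(N.size(), 1);
    if (N.empty()) {
        return S;
    }
    // Fill from right to left using S[i-1] = S[i]*N[i]
    for (std::size_t i = N.size() - 1; i > 0; i--) {
        S[i-1] = S[i]*N[i];
    }
    return S;
}

// Hypothetical helper: strides for column-major ordering
std::vector<std::size_t> col_major_strides(const std::vector<std::size_t>& N) {
    // Start with a stride of 1 everywhere, the first dimension keeps it
    std::vector<std::size_t> S(N.size(), 1);
    // Fill from left to right using S[i+1] = S[i]*N[i]
    for (std::size_t i = 0; i + 1 < N.size(); i++) {
        S[i+1] = S[i]*N[i];
    }
    return S;
}
```

For **N** = (3,4,2) these return (8,2,1) and (1,3,12) respectively, the same strides used in the example above.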

## Rudimentary File IO

There are much better ways to write binary data to files, using self-descriptive and cross-platform formats like [HDF5](https://www.hdfgroup.org/solutions/hdf5/). Sometimes though you might be stuck and just need a quick way to get your data in and out of your program for **testing and verification purposes**. Within the C subset of the C++ language there is a simple way to read and write binary data, using the endian representation that your architecture supports. You just have to be aware that this data may not be portable to other architectures without accounting for differences in endianness, so it is **not advisable** to use raw binary dumps for actual production code.

### Opening and closing files

The **std::fopen** and **std::fclose** functions open and close files. The **std::fopen** function takes the form:

```C++
std::FILE *std::fopen(const char *fname, const char *mode)
```

where the string **fname** is the file name and **mode** determines how the file is opened. A pointer to a **std::FILE** object is returned by **std::fopen**. There are two main modes to access a file: text mode and binary mode. Linux and Windows systems differ in how they represent a new line. On Windows a new line is '\r\n', whereas on Linux it is '\n'. Opening a file in text mode means lines are read and written in an OS-specific way. Opening a file in binary mode means that everything is read or written **as is** from the file, and no special interpretation is applied to new line characters. Here are the main modes useful for working with files:

| Mode | Explanation |  
| :- | :- | 
| "w" | open a file for writing in text mode, overwrite if file exists |
| "wb" | open a file for writing in binary mode, overwrite if file exists |
| "r" | open a file for reading in text mode |
| "rb" | open a file for reading in binary mode |
| "a" | open a file for appending in text mode, create if file does not exist |
| "ab" | open a file for appending in binary mode, create if file does not exist |

Adding a plus sign **"+"** to the mode allows updates to happen to the file as well. It is kind of like a read/write modifier that allows information to flow both ways, to and from the file.

| Mode | Explanation |  
| :- | :- | 
| "w+" | open a file for writing and updating in text mode, overwrite if file exists |
| "w+b" | open a file for writing and updating in binary mode, overwrite if file exists |
| "r+" | open a file for reading and updating in text mode |
| "r+b" | open a file for reading and updating in binary mode |
| "a+" | open a file for appending and updating in text mode, create if file does not exist |
| "a+b" | open a file for appending and updating in binary mode, create if file does not exist |

This [site](https://cplusplus.com/reference/cstdio/fopen/) explains in more detail how the `+` sign changes behaviour with the file. 
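As a sketch of what the **"+"** modifier allows, the code below writes a value and reads it back through the same file pointer. The helper name `update_mode_round_trip` is hypothetical. Note that the C library requires a seek or a flush between a write and a following read on the same file.

```C++
#include <cstdio>

// Hypothetical helper: write value to a file opened with "w+b",
// then read it back through the same FILE pointer.
// Returns true if the round trip succeeded.
bool update_mode_round_trip(const char* fname, int value) {
    // Open for writing and updating in binary mode
    std::FILE* fp = std::fopen(fname, "w+b");
    if (fp == NULL) {
        return false;
    }

    // Write a single int
    bool ok = (std::fwrite(&value, sizeof(int), 1, fp) == 1);

    // Seek back to the start, this also separates the write from the read
    ok = ok && (std::fseek(fp, 0, SEEK_SET) == 0);

    // Read the int back
    int result = 0;
    ok = ok && (std::fread(&result, sizeof(int), 1, fp) == 1);

    std::fclose(fp);
    return ok && (result == value);
}
```

With a plain `"wb"` mode the read in this sketch would fail, because the file would be open for writing only.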

In the code below we open a file for writing. You will need to include **\<cstdio\>** at the top of the program.

```C++
// Open a file for writing in binary mode
std::FILE *fp = std::fopen("myfile.dat", "wb");
```

Of course, if opening the file fails then **std::fopen** returns a **NULL** pointer, so it is worth checking **fp** before using it. 
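A minimal sketch of such a check is shown here. The helper name `can_open_for_reading` is hypothetical; `std::perror` is a standard function from `<cstdio>` that prints the reason for the most recent failure.

```C++
#include <cstdio>

// Hypothetical helper: returns true if fname can be opened for reading
bool can_open_for_reading(const char* fname) {
    std::FILE* fp = std::fopen(fname, "rb");

    // A NULL pointer means the file could not be opened
    if (fp == NULL) {
        // Print the file name and the reason to standard error
        std::perror(fname);
        return false;
    }

    // The file opened successfully, close it again
    std::fclose(fp);
    return true;
}
```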

When we are finished with files they must be closed with **std::fclose**.

```C++
// Close the file
std::fclose(fp);
```

### Positioning within a file using `std::fseek` and `std::ftell`

Every open file has the notion of a "current position". This hearkens back to the days when files were read from and written to tape. We can "rewind" or "fast-forward" this position using the **std::fseek** function and report on the current position using **std::ftell**. The signatures of the two functions are as follows:

```C++
int std::fseek(std::FILE *fp, long int relative_offset, int position);
long int std::ftell(std::FILE *fp);
```

The value **relative_offset** is the offset in bytes to seek to relative to **position**, and **position** is one of **SEEK_SET** for the beginning of the file; **SEEK_CUR** for the current position; and **SEEK_END** for the end location. The function **std::ftell** just tells the absolute location (in bytes) of the current position in the file. The mode that you open the file in also has an effect on what happens to the file position after an **fseek**. For example, if the file is opened in append mode then any subsequent write moves the file position to the end of the file first.

For example, in the code below we seek to the start and end locations in the file and use that to report how big the file is.

```C++
// Open the file
std::FILE *fp = std::fopen("file.dat", "rb");

// Seek to the end of the file, 
// this position is beyond all information in the file
std::fseek(fp, 0, SEEK_END);

// Get how many bytes long the file is
long int nbytes = std::ftell(fp);

// Seek to the beginning of the file
std::fseek(fp, 0, SEEK_SET);

// Close the file
std::fclose(fp);
```
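The same steps can be wrapped in a small function that also checks for failures. This is only a sketch, and the name `file_size` is hypothetical.

```C++
#include <cstdio>

// Hypothetical helper: size of a file in bytes, or -1 on failure
long int file_size(const char* fname) {
    // Open the file for reading in binary mode
    std::FILE* fp = std::fopen(fname, "rb");
    if (fp == NULL) {
        return -1;
    }

    long int nbytes = -1;

    // fseek returns 0 on success
    if (std::fseek(fp, 0, SEEK_END) == 0) {
        // The position at the end is the number of bytes in the file
        nbytes = std::ftell(fp);
    }

    std::fclose(fp);
    return nbytes;
}
```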


### Reading and writing binary data with **fread** and **fwrite**

Once a file is open in binary mode, you can use the **std::fread** and **std::fwrite** functions to read and write data. Both functions have almost the same signature. The only difference is that **std::fwrite** takes a pointer to `const` data, because it does not modify the memory it writes from.

```C++
// Binary read
size_t std::fread(void *p, size_t element_size, size_t nelements, std::FILE *fp);
// Binary write
size_t std::fwrite(const void *p, size_t element_size, size_t nelements, std::FILE *fp);
```

In both instances **p** must be a pointer to allocated memory of at least (element_size\*nelements) bytes, and **fp** must point to a valid open file. Both functions return the number of elements that were actually read or written.
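Because the return value is a count of elements, comparing it against the number requested is a simple way to detect a short read or write. The sketch below illustrates this; the helper names `write_floats` and `read_floats` are hypothetical, and a `std::vector` is used for the buffers as a convenience.

```C++
#include <cstdio>
#include <cstddef>
#include <vector>

// Hypothetical helper: write all of data to fname as raw binary
bool write_floats(const char* fname, const std::vector<float>& data) {
    std::FILE* fp = std::fopen(fname, "wb");
    if (fp == NULL) {
        return false;
    }
    // fwrite returns the number of elements written
    std::size_t nwritten = std::fwrite(data.data(), sizeof(float), data.size(), fp);
    std::fclose(fp);
    return nwritten == data.size();
}

// Hypothetical helper: fill data (already sized) from fname
bool read_floats(const char* fname, std::vector<float>& data) {
    std::FILE* fp = std::fopen(fname, "rb");
    if (fp == NULL) {
        return false;
    }
    // fread returns the number of elements read, which is smaller
    // than requested if the file is too short
    std::size_t nread = std::fread(data.data(), sizeof(float), data.size(), fp);
    std::fclose(fp);
    return nread == data.size();
}
```

Reading into a vector that is larger than the file contents makes `read_floats` return `false`, since fewer elements are read than requested.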

### Complete example

In the file [cstyle_fileio.cpp](cstyle_fileio.cpp) is a complete example of writing 5 floats to a file and reading them back into an array.

```C++
    // Define the size of the array
    int N=5;
    const char *fname = "filename.dat";

    // Fill the array
    float* src = (float*)calloc(N, sizeof(float));
    for (int n=0; n<N; n++) {
        src[n] = (float)n;
    }

    // Open the file and write the array to it
    std::FILE *fp = std::fopen(fname, "wb");
    std::fwrite(src, sizeof(float), (size_t)N, fp);

    // free the source array and close the file
    std::fclose(fp);
    free(src);

    // Open the file for reading
    fp = std::fopen(fname, "rb");

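    // Get the number of bytes in the file
    std::fseek(fp, 0, SEEK_END); // Zero offset relative to the end
    long int nbytes = std::ftell(fp);

    // Seek back to the beginning, otherwise there is nothing left to read
    std::fseek(fp, 0, SEEK_SET);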

    // Number of elements in an array
    long int nelements = nbytes/sizeof(float);

    // Allocate and read from the file
    float *dst = (float*)calloc(nelements, sizeof(float));
    std::fread(dst, sizeof(float), nelements, fp);

    // Free the destination array and close the file
    free(dst);
    std::fclose(fp);
```

<address>
Written by Dr. Toby Potter of <a href="https://www.pelagos-consulting.com">Pelagos Consulting and Education</a> for the Pawsey Supercomputing Centre.
</address>