---
title: Procedural Abstraction
abstract: |
    The goal of procedural abstraction is code reuse, which enables programmers to efficiently solve a class of related problems without duplicating code. To make abstraction effective, programmers follow design principles such as encapsulation, which hides implementation details and exposes only necessary interfaces, and modularization, which organizes code into independent, reusable components. To achieve these principles, several techniques are employed: argument passing (by value or reference) controls how data flows into functions; function overloading enables multiple functions with the same name to handle different types or numbers of arguments; closures capture local state for flexible, thread-safe code; and recursion allows functions to call themselves so that the solution to a complex problem can be obtained by combining solutions to simpler subproblems. Together, these techniques make procedural abstraction a powerful tool for building maintainable and robust software systems.
skip_execution: true
---

In [None]:
from __init__ import *
!mkdir -p private

In [None]:
if not input('Load JupyterAI? [Y/n]').lower()=='n':
    %reload_ext jupyter_ai

## Motivation

With imperative programming, we compose different programs to solve different problems by designing the flow of control. However, for problems that are closely related, we often end up with [duplicate code](https://en.wikipedia.org/wiki/Duplicate_code), making it difficult to maintain and extend the solutions over time. [Procedural abstraction](https://en.wikipedia.org/wiki/Procedural_programming) addresses this challenge by allowing us to define a piece of code that can be [reused](https://en.wikipedia.org/wiki/Code_reuse) across different programs to solve similar problems.

To motivate the concept, recall the program for computing the GCD in [Lecture 3](../Lecture3/Iterative_Programming.ipynb#code_gcd3):


```cpp
int c;
while (a) {
    b = b % a;
    c = b;
    b = a;
    a = c;
}
if (b<0) b=-b;
```

Note that a local variable `c` is used to swap the values of `a` and `b`:

```cpp
int c;
...
    c = b;
    b = a;
    a = c;
...
```

This can be simplified into one line using the function [`std::swap`](https://en.cppreference.com/w/cpp/algorithm/swap.html) from `<utility>`:

In [None]:
%%cpp
int a=2*3*4, b=3*4*5; // input

cout << format("gcd({}, {})=", a, b);
while (a) {
    b = b % a;
    swap(a, b);   // swap the values of a and b  
}
if (b<0) b=-b;
cout << b << "\n"; // final answer

::::{note}

Much like how a driver can operate a car without knowing how it’s built, programmers can write powerful programs by reusing well-designed procedures written by others, without understanding their internal implementations.

::::

We can further simplify the program into nearly one line by [defining a function](https://en.cppreference.com/w/c/language/function_definition.html):

In [None]:
%%cpp
int gcd(int a, int b) {
    return a? gcd(b%a, a): (b>0? b: -b);  // tail recursion
}  // function body must be a compound statement, i.e., enclosed by braces {}.

To use the function, we *call/invoke* it with two integers, whose values are passed to the parameters `a` and `b` in the [function parameter scope](https://en.cppreference.com/w/cpp/language/scope.html#Function_parameter_scope):

In [None]:
%%cpp
int a=2*3*4, b=3*4*5; // input

cout << format("gcd({}, {})={}\n", a, b, gcd(a, b));  // function invocation

The function can even be used by a different programming language! Play with it using the `ipywidgets` in Python below:

In [None]:
@interact(a="2*3*4", b="3*4*5")
def print_gcd(a, b):
    print("gcd({}, {})={}".format(a:=eval(a), b:=eval(b), ROOT.gcd(a, b)))

For instance, change the input box for `a` to `2*3*4*5` to see the new GCD value instantly.

::::{seealso} Functional programming

The `gcd` function is an example of recursion: it calls itself with `gcd(b % a, a)`, effectively swapping the values and computing the modulo operation in one step. This is one of the techniques in a programming paradigm called [functional programming](https://en.wikipedia.org/wiki/Functional_programming), which, unlike imperative programming, focuses on declaring what should be done rather than how to do it. The approach reuses code to such an extreme that a function can even reuse itself!

::::
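
To see that the recursive definition and the loop compute the same thing, the following sketch places the two side by side. The names `gcd_recursive` and `gcd_iterative` are hypothetical, chosen only to avoid clashing with the `gcd` defined above, and `std::gcd` from `<numeric>` (C++17) is included as a ready-made function to compare against.

```cpp
#include <numeric>   // std::gcd (C++17), for comparison only
#include <utility>   // std::swap

// Recursive form: each call reduces the problem to a smaller pair.
int gcd_recursive(const int a, const int b) {
    return a ? gcd_recursive(b % a, a) : (b > 0 ? b : -b);
}

// Iterative form: the same reduction written as a loop with std::swap.
int gcd_iterative(int a, int b) {
    while (a) {
        b %= a;
        std::swap(a, b);
    }
    return b > 0 ? b : -b;
}
```

For example, `gcd_recursive(24, 60)` unfolds as `gcd_recursive(12, 24)` and then `gcd_recursive(0, 12)`, which returns `12`, the same value as `gcd_iterative(24, 60)` and `std::gcd(24, 60)`.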

## Code Reuse

### How to reuse code?

Recall the program for computing the inverse square root in [Lecture 3](../Lecture3/Iterative_Programming.ipynb#code_fast_inv_sqrt3):


```cpp
constexpr auto threehalfs = 1.5, rel_tol = 1e-9;
const auto x2 = x*0.5;
double x_, gap, abs_x, abs_x_;
do {
    y = y * (threehalfs - (x2 * y * y));
    x_=1/y/y; gap=(x>x_)?(x-x_):(x_-x); abs_x=(x>0?x:-x); abs_x_=(x_>0?x_:-x_);
} while (y<0 || x_!=x && gap > rel_tol*(abs_x>abs_x_?abs_x:abs_x_));
```

The code contains long declarations with similar use of conditional operations:

```cpp
...
..., gap, abs_x, abs_x_;
do {
    ...
    ... gap=(x>x_)?(x-x_):(x_-x); abs_x=(x>0?x:-x); abs_x_=(x_>0?x_:-x_);
} while (...);
```

::::{caution} [DRY (don't repeat yourself)](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)

Duplicate code is hard to maintain because an update to a piece of code needs to be repeated in all its duplicates.

::::

We can improve the program using [`std::fabs`](https://en.cppreference.com/w/cpp/numeric/math/fabs.html) from `<cmath>` and [`std::max`](https://en.cppreference.com/w/cpp/algorithm/max.html) from `<algorithm>`:

In [None]:
%%cpp
double x = 10./3;  // input

auto i = *(int64_t *) &x;
i = 0x5fe6eb50c7b537a9 - (i >> 1);
auto y = *(double *) &i;

constexpr auto threehalfs = 1.5, rel_tol=1e-9;
const auto x2 = x*0.5;
double x_;
do y = y * (threehalfs - (x2 * y * y));
while (y<0 || (x_=1/y/y)!=x && fabs(x-x_) > rel_tol*max(fabs(x),fabs(x_))); // <cmath> and <algorithm>
cout << format("rsqrt({})={}.\n", x, y); // final answer

The code is more readable and easier to maintain.
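
The stopping test itself is a natural candidate for reuse. As a minimal sketch, assuming we are free to introduce a helper with the hypothetical name `is_close`, the relative-tolerance comparison could be packaged as follows:

```cpp
#include <algorithm>  // std::max
#include <cmath>      // std::fabs

// Hypothetical helper: true if a and b are equal or agree up to rel_tol,
// measured relative to the larger of the two magnitudes.
bool is_close(const double a, const double b, const double rel_tol = 1e-9) {
    return a == b ||
           std::fabs(a - b) <= rel_tol * std::max(std::fabs(a), std::fabs(b));
}
```

With such a helper, the loop condition could be shortened to something like `while (y<0 || !is_close(x, 1/y/y));`, and the tolerance logic would live in one place only.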

Why not package the inverse square root as a function for others to use?

In [None]:
%%cpp
double rsqrt_fast(const double x) {
    auto i = *(uint64_t *) &x;
    i = 0x5fe6eb50c7b537a9 - (i >> 1);
    auto y = *(double *) &i;

    static constexpr auto rel_tol=1e-9;
    const auto x2 = x*0.5;
    double x_;
    do {
        static constexpr auto threehalfs = 1.5;
        y = y * (threehalfs - (x2 * y * y));
    } while ( // threehalfs is not in scope but rel_tol is
        y<0 || (x_=1/y/y)!=x && fabs(x-x_) > rel_tol*max(fabs(x),fabs(x_))); // <cmath>, <algorithm>
    return y;                    
}

In [None]:
@interact(x="10/3")
def print_rsqrt_fast(x):
    print("rsqrt({})={}".format(x:=eval(x), ROOT.rsqrt_fast(x)))

::::{tip} Knowledge is power! 

The more you know about the functions available, the faster you can code without reinventing the wheel.

::::

[\<cmath\>](https://en.cppreference.com/w/cpp/header/cmath.html) corresponds to a standard library header file that specifies a set of common mathematical functions or constants defined in the standard library.

In [None]:
%%cpp
atan(tan(M_PI/4))*4  // arctan, tan, and pi

In [None]:
%%cpp
exp(log(M_E))            // exponential, natural logarithm, and Euler's number e

In [None]:
%%cpp
sqrt(pow(2, 2))          // square root and power

Does it click? We can squeeze the inverse square root function into nearly one line using the [`std::sqrt`](https://en.cppreference.com/w/cpp/numeric/math/sqrt) function from `<cmath>`:

In [None]:
%%cpp
double rsqrt(const double x) {
    return 1./sqrt(x);     // <cmath>
}

In [None]:
@interact(x="10/3")
def print_rsqrt(x):
    print("rsqrt({})={}".format(x:=eval(x), ROOT.rsqrt(x)))

::::{caution}

Constants such as `M_PI` from `<cmath>` are defined as [macros](https://en.cppreference.com/w/cpp/preprocessor/replace), but they are not part of the C++ standard and are not type-safe, as a macro definition does not carry any type information. Different platforms may have their own implementations. For [Visual C++](https://learn.microsoft.com/en-us/cpp/c-runtime-library/math-constants?view=msvc-170), one would need `#define _USE_MATH_DEFINES` before `#include <cmath>` to use `M_PI`.

::::

In [None]:
%%ai
Explain briefly why the macro INFINITY is in the C++ standard but M_PI is not?

To provide portable and type-safe constants, C++20 introduced [`<numbers>`](https://en.cppreference.com/w/cpp/numeric/constants.html), which provides different constants in different types:

In [None]:
%%cpp
numbers::pi

In [None]:
%%cpp
numbers::pi_v<float>

The standard library also has other headers listed [here](https://en.cppreference.com/w/cpp/header.html) such as those we have used so far:

- [\<algorithm\>](https://en.cppreference.com/w/cpp/algorithm.html) for operating on ranges of elements,
- [\<utility\>](https://en.cppreference.com/w/cpp/utility.html) for language support and other general purposes,
- `<iostream>`, `<format>`, `<limits>`, `<cstring>`, `<cstdint>`, and `<cstdlib>`.

::::{caution} Scope `std`

To use functions from `<cmath>` in a C++ source file, remember to include the header with `#include <cmath>` and qualify the names with the namespace `std`, as in `std::sqrt`, even though many implementations also make an unqualified version of `sqrt` available for backward compatibility with C.

::::

### When to write your own code?

As a student learning programming, shouldn't you write your own code all the time?

In [None]:
%%ai
Explain briefly the benefit of reusing code from standard libraries instead of
writing your own?

Other than being short, the implementation of `rsqrt` using `sqrt` actually gives better answers, especially when the argument is very large or very close to 0.

For instance, we expect $\frac1{\sqrt{0}}=\infty$:

In [None]:
%%cpp
rsqrt(0)    // perfect!

In [None]:
%%cpp
rsqrt_fast(0)          // not large enough... Why?

On the opposite extreme, we expect $\frac1{\sqrt{\infty}}=0$:

In [None]:
%%cpp
rsqrt(1./0) // perfect!

In [None]:
%%cpp
rsqrt_fast(1./0)       // totally off... Why?

`std::sqrt` is also fast because it can be compiled to use the hardware accelerated `SQRTSD` instruction provided by the [SIMD instruction sets](https://en.wikipedia.org/wiki/X86_SIMD_instruction_listings).

The following cells generate the source code and compile it to use the `SQRTSD` instruction with the `-O2 -ffast-math` compiler options:

In [None]:
%%writefile private/rsqrt.cpp
#include <cmath>

double rsqrt(const double x) {
    return 1./std::sqrt(x);
}

In [None]:
!clang++ --stdlib=libc++ -O2 -ffast-math -S private/rsqrt.cpp -o private/rsqrt.s

::::{exercise}
:label: ex:fast-math

Locate the line that uses the hardware-accelerated square root instruction in the generated assembly code [](./private/rsqrt.s).

::::

:::::{solution} ex:fast-math
:class: dropdown

::::{code} assembly
:label: code_sqrtsd
:caption: Assembly code `rsqrt.s` showing `std::sqrt` can be compiled to use the hardware accelerated `SQRTSD` instruction provided by the [SIMD instruction sets](https://en.wikipedia.org/wiki/X86_SIMD_instruction_listings).
:linenos:
:emphasize-lines: 13

    	.file	"rsqrt.cpp"
    	.section	.rodata.cst8,"aM",@progbits,8
    	.p2align	3, 0x0                          # -- Begin function _Z11rsqrt_cmathd
    .LCPI0_0:
    	.quad	0x3ff0000000000000              # double 1
    	.text
    	.globl	_Z11rsqrt_cmathd
    	.p2align	4
    	.type	_Z11rsqrt_cmathd,@function
    _Z11rsqrt_cmathd:                       # @_Z11rsqrt_cmathd
    	.cfi_startproc
    # %bb.0:
    	sqrtsd	%xmm0, %xmm1
    	movsd	.LCPI0_0(%rip), %xmm0           # xmm0 = [1.0E+0,0.0E+0]
    	divsd	%xmm1, %xmm0
    	retq
    .Lfunc_end0:
    	.size	_Z11rsqrt_cmathd, .Lfunc_end0-_Z11rsqrt_cmathd
    	.cfi_endproc
                                            # -- End function
    	.ident	"Ubuntu clang version 20.1.7 (++20250612073421+199e02a36433-1~exp1~20250612193439.130)"
    	.section	".note.GNU-stack","",@progbits
    	.addrsig

::::

:::::

## Abstraction

### Callable

Procedural abstraction allows code reuse by hiding implementation details and exposing input parameters.
A [*callable*](https://en.cppreference.com/w/cpp/named_req/Callable) is the entity that realizes it:
> It is a piece of code that can be invoked with different arguments passed to the input parameters. 

In C++, functions are technically only one type of callable.

The following defines the simplest callable, which does nothing other than using three different kinds of brackets:

In [None]:
%%cpp
[](){}            // the simplest function

This is called a [lambda expression](https://en.cppreference.com/w/cpp/language/lambda.html), which can be invoked by appending an empty argument list `()`:

In [None]:
%%cpp
[](){}()          // the simplest function call that does nothing

Alternatively, the callable can be [defined as a function](https://en.cppreference.com/w/c/language/function_definition.html) with the name `do_nothing`:

In [None]:
%%cpp
void do_nothing() {}     // the simplest named function
do_nothing()             // the simplest function call by the function name

::::{caution} Defining functions in Jupyter notebooks

Due to [this issue](https://github.com/jupyter-xeus/xeus-cling/issues/40) with interpreting C++ functions in Jupyter notebooks, we have to define a function in a separate cell. Some return types such as `unsigned long long` do not work, and some, such as `std::function`, require the namespace to be explicitly specified.

::::

:::::{seealso} Why the name `lambda`?
:class: dropdown

The name `lambda` originates from [$\lambda$-calculus](https://en.wikipedia.org/wiki/Lambda_calculus), which is a formal system for computation introduced by Alonzo Church in the 1930s. For more details, see the following video made with Manim:

::::{card}
:header: Programming with Math | The Lambda Calculus
:footer: [open in new tab](https://www.youtube.com/embed/ViPNHMSUcog?si=KJxEIQx949OeFx9H)

:::{iframe} https://www.youtube.com/embed/ViPNHMSUcog?si=KJxEIQx949OeFx9H
:width: 100%
:::
::::

:::::

A lambda expression is also known as an anonymous function, even though it is not regarded as a function according to [`<type_traits>`](https://en.cppreference.com/w/cpp/header/type_traits.html) for meta-programming, i.e., programming programming:

In [None]:
%%cpp
!is_function_v<decltype([](){})> && is_invocable_v<decltype([](){})> // from <type_traits>

In [None]:
%%cpp
is_function_v<decltype(do_nothing)>

As shown below, an anonymous function can actually be named by assigning it to a variable. It may be better called a callable literal literally!

In [None]:
%%cpp
std::function<void()> simplest_callable = [](){};
simplest_callable()

[`std::function<void()>`](https://en.cppreference.com/w/cpp/utility/functional/function.html) from `<functional>` specifies the type for the callable:
- [`void`](https://en.cppreference.com/w/cpp/language/types.html#void) is a type with an empty set of values.
- `void()` is the signature of the callable `[](){}`:
    - The callable returns nothing, or equivalently, has output type `void`, and
    - it takes no arguments, i.e., has an empty parameter list `()`.

While we cannot declare any variables with type `void` (why?), we can declare a pointer to `void`:

In [None]:
%%cpp
int a=1;
void *addr=&a;
cout << format("a={} @ {}", a, addr);
// *addr    // fails as addr is not meant for dereferencing

To return a value, we can use a [return statement](https://en.cppreference.com/w/cpp/language/return.html) in the compound statement at the end:

In [None]:
%%cpp
[]() {
    return 0;     // return statement
}
()                // called with no arguments

In [None]:
%%cpp
int zero() { return 0; }
zero()

Parameters can be specified before the body in parentheses:

In [None]:
%%cpp
[](const int x){     // one parameter
    return x>0 ? x : -x;
}
(-1.f)               // one argument passed by copy to x

In [None]:
%%cpp
double absolute(const double x) {
    return x>0 ? x : -x;
}
absolute(-1.f)

In [None]:
%%ai
Explain briefly why it is a good practice to declare the argument `x` as
`const` if its value is not modified within the function body.

A callable can have multiple parameters/arguments separated by commas:

In [None]:
%%cpp
[](const auto a, const auto b){ // takes two arguments
    return a>b? a: b;
}
(2, 3)

Note that `auto` can be used to deduce the type of each parameter from the arguments provided in the call. A lambda declared this way is a generic lambda, and the same syntax for a named function declares an [abbreviated function template](https://en.cppreference.com/w/cpp/language/function_template.html#Abbreviated_function_template).
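
As a rough illustration of what the abbreviation stands for, the sketch below writes the same callable twice as named functions: once as an explicit function template and once in the abbreviated form. The names `larger` and `larger_abbrev` are hypothetical.

```cpp
#include <type_traits>  // std::is_same_v

// Explicit function template: A and B are deduced from the arguments.
template <class A, class B>
auto larger(const A a, const B b) {
    return a > b ? a : b;
}

// Abbreviated form (C++20): each `auto` introduces its own template parameter.
auto larger_abbrev(const auto a, const auto b) {
    return a > b ? a : b;
}

// A separate function is generated for each combination of argument types.
static_assert(std::is_same_v<decltype(larger(2, 3)), int>);
static_assert(std::is_same_v<decltype(larger(2.f, 3)), float>);
```

Both `larger(2, 3)` and `larger_abbrev(2, 3)` evaluate to `3`.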

In [None]:
%%cpp
[](const auto a, const auto b){
    return a>b? a: b;
}
(2.f, 3)  // called with float and int

Note that the resulting type is `float` instead of `int`, according to how the conditional operator resolves its type.

In [None]:
%%cpp
[](const auto a, const auto b) -> double {
    if (a>b) return a;
    return b;
}
(2.f, 3)

If multiple return statements return values of different types, the return type cannot be deduced and compilation fails unless the output type is explicitly specified, as in the code above with `-> double` or in the function definition below:

In [None]:
%%cpp
double maximum(const auto a, const auto b) {
    if (a>b) return a;
    return b;
}
maximum(2.f, 3)

::::{caution} Why can't we skip the return type?
:class: dropdown

Since C++ is statically typed, all the variable types must be deduced at compile time.

::::

::::{exercise}
:label: ex:f0

Why is it okay not to specify the output type for the following code using the conditional operator instead of the `if` statement?

```cpp
[](const auto a, const auto b){
    return a>b? a: b;
}
(2.f, 3)
```

::::

YOUR ANSWER HERE

A parameter cannot be specified as `const` if its value may be modified:

In [None]:
%%cpp
long a=2; unsigned long b=3;
[](long a, unsigned long b) { // not declared with const
    auto ans=1uL;
    while (b) {
        if (b%2) ans *= a;
        a *= a;
        b /= 2;
    }
    return ans;
} 
(a, b)

In [None]:
%%cpp
unsigned long ipow(long a, unsigned long b) {
    auto ans=1uL;
    while (b) {
        if (b%2) ans *= a;
        a *= a;
        b /= 2;
    }
    return ans;
}
ipow(a, b)

::::{exercise}
:label: ex:f1

What does the above program do?

::::

YOUR ANSWER HERE

::::{caution}

Modifying the parameters, however, does not modify the arguments *passed by copy* ([rvalue](https://en.cppreference.com/w/cpp/language/value_category.html#rvalue)) to the parameters.

::::

For instance, the global values of `a` and `b` remain unchanged even though the parameters `a` and `b` are modified in the function call.

In [None]:
%%cpp
cout << format("a={}, b={}\n", a, b);

### Overloading

Can we implement the inverse square root directly using the CPU instruction `SQRTSD` instead of relying on the compiler?

We can use the [CPU intrinsic](https://en.wikipedia.org/wiki/Intrinsic_function#C_and_C++) [`_mm_sqrt_sd`](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm_sqrt_sd&ig_expand=6272) from `<immintrin.h>` (or the smaller `<emmintrin.h>`) that maps directly to the CPU instruction `SQRTSD`:

In [None]:
%%cpp
double rsqrt_(const double x) {
    return 1. / _mm_cvtsd_f64(_mm_sqrt_sd(_mm_set_sd(x), _mm_set_sd(x)));
}

In [None]:
@interact(x="10/3")
def print_rsqrt_(x):
    print("rsqrt({})={}".format(x:=eval(x), ROOT.rsqrt_(x)))

In [None]:
%%ai
In the following code, explain briefly why it uses _mm_set_sd and _mm_cvtsd_f64?
Why _mm_sqrt_sd takes two arguments instead of one?
---
1. / _mm_cvtsd_f64(_mm_sqrt_sd(_mm_set_sd(x), _mm_set_sd(x)))

There is also a CPU intrinsic [`_mm_rsqrt_ss`](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm_rsqrt_ss&ig_expand=6272,6278,6272,6278,5647) from `<xmmintrin.h>` that maps to the instruction `RSQRTSS` for the inverse square root, but it only supports single precision and returns an approximation:

In [None]:
%%cpp
float rsqrt_(const float x) {
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
}

Here come the challenging questions:
- Should we define `rsqrt_` with single precision or double precision? The user should have a say.
- Should we define two versions of `rsqrt_`, one with single precision and one with double precision? It is already complicated to have different versions of square roots!

We should define two versions, and we already did:

In [None]:
%%cpp
rsqrt_(10.f/3)

In [None]:
%%cpp
rsqrt_(10./3)

Observe from the output data type that C++ can call the correct version of `rsqrt_` based on the argument type. We have [overloaded](https://en.cppreference.com/w/cpp/language/overload_resolution.html) the same function `rsqrt_` with two different versions, one for double precision argument, and one for single precision.
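
The same mechanism can be tried without any CPU intrinsics. The following is a minimal sketch with a hypothetical function `precision_of`, overloaded for `float` and `double`, that simply reports which version was selected:

```cpp
#include <string>

// Hypothetical overloads: same name, different parameter types.
std::string precision_of(const float)  { return "single"; }
std::string precision_of(const double) { return "double"; }
```

Here `precision_of(10.f/3)` selects the `float` version and `precision_of(10./3)` selects the `double` version. A call such as `precision_of(1)` with an `int` argument does not compile, because converting to `float` and converting to `double` are equally good matches and the call is ambiguous.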

Unlike C++, Python does not support traditional function overloading. Because Python is dynamically typed, function definitions do not include argument types, making it impossible to distinguish functions by their parameter types.[^dispatch]

[^dispatch]: However, Python does allow operator overloading. It uses type-based dispatch instead.

::::{caution}

The name of an overloaded function cannot be assigned to a variable declared with `auto`, because the compiler cannot tell which version is meant. E.g., the following code fails:

```cpp
auto rsqrt_intrin = rsqrt_;
```

Similarly, we cannot overload a function by assigning multiple lambda expressions to the same variable.

::::
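
The failure comes from type deduction rather than from overloading itself. As a sketch, using a hypothetical overloaded function `halve`, stating the function pointer type explicitly lets the compiler select exactly one version:

```cpp
// Hypothetical overloaded function used only for illustration.
float  halve(const float x)  { return x / 2; }
double halve(const double x) { return x / 2; }

// auto halve_any = halve;   // fails: auto cannot choose between the overloads

// Naming the target type selects one overload.
double (*halve_double)(double) = halve;
auto halve_float = static_cast<float (*)(float)>(halve);
```

After this, `halve_double(3.0)` and `halve_float(3.f)` each call the version with the matching precision.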

## Encapsulation

### Pass by Reference

To implement a callable that can be reused by future unseen programs, it is important to encapsulate the content properly to avoid undesirable effects contaminating other code and vice versa.

In functional programming, the function `gcd` is called a [pure function](https://en.wikipedia.org/wiki/Pure_function) because

1. its return value is identical for identical arguments; and
2. it has no side effects like modifying the global `a` and `b`.

In comparison, `std::swap` from `<utility>` is an example of an *impure function*, which has the *side effect* of changing the values of its arguments.

::::{caution}

When debugging a program that uses an impure function, the programmer may need to look into the implementation of the function as well because
1. the function may produce side effects not captured by the output of the function; and
2. the output may depend on program state other than the arguments of the function, such as a variable with static storage duration.

::::

How to implement `std::swap`?

In [None]:
%%cpp
int a=2, b=3; // input
cout << format("a={}, b={}\n", a, b);
[](auto a, auto b) {  // arguments passed by copy
    auto c=a;
    a = b;
    b = c;
} (a, b);
cout << format("a={}, b={}\n", a, b);

::::{exercise}
:label: ex:swap_int

Why does the above program fail to swap the values of the integers `a` and `b`?

::::

YOUR ANSWER HERE

To fix the issue, one way is to use [pointers](https://en.cppreference.com/w/cpp/language/pointer.html):

In [None]:
%%cpp
int a=2, b=3; // input
cout << format("a={}, b={}\n", a, b);
[](auto * const a, auto * const b) {  // references passed by value
    auto c=*a;
    *a = *b;
    *b = c;
} (&a, &b);   // pass the references by copy
cout << format("a={}, b={}\n", a, b);

The arguments `&a` and `&b`, namely the references of the global `a` and `b`, are passed by copy to the constant pointer parameters `a` and `b` respectively, which then point to the global `a` and `b`.

Another way is to use the [*lvalue reference*](https://en.cppreference.com/w/cpp/language/reference.html), which avoids the additional dereference operations:

In [None]:
%%cpp
int a=2, b=3;
cout << format("a={}, b={}\n", a, b);
[](auto &a, auto &b) {  // arguments passed by lvalue reference
    auto c=a;
    a=b;
    b=c;
}
(a, b);
cout << format("a={}, b={}\n", a, b);

Since the parameters are declared with `auto &`, the references (memory addresses) instead of the values of the global variables are passed as the references of the parameters. The parameters `a` and `b` become aliases of the global variables `a` and `b` respectively. This is called *pass by reference*, as opposed to *pass by copy*.

In comparison, Python does not support pass by copy, as the argument values are not copied to the parameters:

In [None]:
def f(a):
    print("ID of local  a:", id(a))
    
a = 1
print("ID of global a:", id(a))
f(a)

Python passes arguments to the parameters by object reference, i.e., it binds the address of the value of an argument to the corresponding parameter.

::::{caution} Is Python's pass by object reference the same as C++'s pass by reference?
:class: dropdown

No. This is because a variable in Python is just a name with no persistent location. We cannot implement `swap` in Python because the memory location is associated with the values (rvalues in C++ terms), not the variables (lvalues). Put another way, `swap(1, 2)` in C++ fails because `1` and `2` are not in memory locations suitable for users to modify.

::::

C++ also supports pass by object reference using [rvalue reference](https://en.cppreference.com/w/cpp/language/reference.html#Rvalue_references):

In [None]:
%%cpp
int &&b=1

This will be useful for moving larger objects around without an extra copy step or the need to assign them to variables first.
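
To give a feel for what moving means, here is a minimal sketch of a swap that transfers contents instead of copying them, using `std::move` from `<utility>` to turn each variable into an rvalue. The name `swap_by_move` is hypothetical, and the sketch is not meant to reproduce the standard library's implementation.

```cpp
#include <string>
#include <utility>  // std::move

// Hypothetical helper: swaps a and b by moving their contents.
template <class T>
void swap_by_move(T &a, T &b) {
    T c = std::move(a);  // c takes over a's contents
    a = std::move(b);    // a takes over b's contents
    b = std::move(c);    // b takes over what used to be in a
}
```

For a long `std::string`, each step can hand over the internal buffer rather than duplicate its characters. For an `int`, moving is the same as copying, so nothing is lost by writing the function this way.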

In [None]:
%%ai
Explain how C++ supports pass by object reference using the rvalue reference.

::::{exercise}
:label: ex:swap

While there is no issue defining `void swap_int(int *a, int *b)` and `void swap_int(int &a, int &b)` simultaneously, why should we not define `void swap_int(int a, int b)` and `void swap_int(int &a, int &b)` simultaneously?

::::

YOUR ANSWER HERE

In [None]:
%%ai
How to implement swap in <utility>, which works for different data types?
Is it as efficient as defining swap for specific data types?

The following code overloads a function called `loc` with different ways of passing the arguments:

In [None]:
%%cpp
const auto &loc(const auto &x) {
    cout << "addr of const : " << &x << '\n';
    return x;
}

In [None]:
%%cpp
auto &loc(auto &x) {
    cout << "addr of lvalue: " << &x << '\n';
    return x;
}

In [None]:
%%cpp
auto loc(auto &&x) { // do NOT write &loc instead
    cout << "addr of rvalue: " << &x << '\n';
    return x;
}

The function can help understand code like the following by printing the addresses (locating the values).

```cpp
const int c=1;
int a=c, &b=a;
b+=1;
a
```

In [None]:
%%cpp
const int &c=loc(1);
int a=loc(c), &b=loc(a);
loc(b)+=1;
a

::::{exercise}
:label: ex:rvalue_ref

Unlike the other versions of `loc`, why should the last version `auto loc(auto &&x)` not be defined as `auto &loc(auto &&x)`, i.e., to return value by reference?

::::

YOUR ANSWER HERE

### Static Variable

Recall that in the function `rsqrt_fast` defined before, constant expressions `threehalfs` and `rel_tol` are declared as [`static`](https://en.cppreference.com/w/cpp/language/storage_duration.html#Static_block_variables). 

```cpp
double rsqrt_fast(const double x) {
    ...
    static constexpr auto rel_tol=1e-9;          // initialized once only
    ...
    do {
        static constexpr auto threehalfs = 1.5;  // initialized once only
        ...
    } while (...);
    ...               
}
```

Why?

Like global variables, static variables are initialized only once and persist throughout the program's execution. However, they are not visible outside the scope where they are defined.

For example, consider the following code that counts function calls:

In [None]:
%%cpp
unsigned int increment_count() {
    static int count=0;
    return ++count;
}

Repeatedly call the function below to observe the increase in the static variable `count`, even though it is initialized to `0` within the function:

In [None]:
%%cpp
cout << increment_count();
// count  // fails as count is out of scope

In contrast, the memory for non-static local variables is typically released when they go out of scope. For instance, without the `static` qualifier, `count` is reinitialized in every function call, which is not desirable:

In [None]:
%%cpp
unsigned int increment_count_() {
    int count=0;
    return ++count;
}
increment_count_(); increment_count_(); increment_count_();
increment_count_()  // boring

::::{exercise}
:label: ex:undecimal_function

Recall the program for converting undecimal to decimal in [Lecture 3](../Lecture3/Iterative_Programming.ipynb#undecimal1):


```cpp
unsigned long value=0;
constexpr auto m1=numeric_limits<unsigned long>::max(), m2=m1/11, m3=m1-10;
for (size_t i=0, length=strlen(s); i < length; ++i) {
    value *= value<=m2 ? 11 : throw runtime_error("The value is too big.");
    switch (unsigned char c=s[i]) {
    case 'X':
        [[fallthrough]];
    case 'x':
        value<=m3 ? value+=10 : throw runtime_error("The value is too big.");
        break;
    default:
        if ((c=s[i]-'0')>9) throw runtime_error("Invalid character found.");
        value<=m1-c? value+=c : throw runtime_error("The value is too big.");
    }
}
```

Define it as a function `undecimal2decimal` that takes `s` as an argument and returns the `value`. Declare variables to be static if doing so is considered good programming practice.

::::

In [None]:
%%cpp
/*
# REPLACE THE ENTIRE COMMENT WITH YOUR CODE #
*/

Check that your function works:

In [None]:
@interact(s="1X2X3x")
def print_undecimal2decimal(s):
    print("Decimal value of the undecimal {} is {}.".format(s, ROOT.undecimal2decimal(s)))

::::{exercise}
:label: ex:local-vs-global

Why might it not be a good idea to declare a local static variable as a non-static global variable instead?

::::

YOUR ANSWER HERE

### Closure

While it is good to declare constants as `static`, the use of static variables that are not constants is discouraged as it can complicate the logic of the program and cause nasty bugs. For example, declaring `x2` as `static` in `rsqrt_fast` not only leads to incorrect results but can also result in infinite loops. (How?)

In more complicated scenarios, when the operating system needs to execute the program in multiple threads and pause or resume execution, static variables will not work properly:

::::{caution} Why are non-constant static variables discouraged even when they make a program easy to write?
:class: dropdown

Programs such as `increment_count` that use non-constant `static` variables are [non-reentrant](https://en.wikipedia.org/wiki/Reentrancy_(computing)) and not [thread-safe](https://en.wikipedia.org/wiki/Thread_safety).

::::

How to avoid using static/global variables, for instance, in `increment_count`?

```cpp
unsigned int increment_count() {
    static int count=0;
    return ++count;
}
```

To avoid using static/global variables, we can create a [closure](https://en.wikipedia.org/wiki/Closure_(computer_programming)) using the lambda expression:

In [None]:
%%cpp
std::function<int()> counter(int count=0) {
    return [count]() mutable {  // count captured by copy
        return ++count;
    };
}

- `[count]` is a [lambda capture](https://en.cppreference.com/w/cpp/language/lambda.html#Lambda_capture) that captures the value of the existing variable `count` by copying its value (not reference); and
- `mutable` is a [lambda specifier](https://en.cppreference.com/w/cpp/language/lambda.html#Explanation) that allows the body of the lambda expression to modify the objects captured by copy.
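
The difference between capturing by copy and by reference can be observed directly. In the sketch below, the two functions `by_copy_demo` and `by_reference_demo` are hypothetical and differ only in the capture:

```cpp
// Capture by copy: the lambda increments its own private copy of count.
int by_copy_demo() {
    int count = 0;
    auto increment = [count]() mutable { return ++count; };
    increment();
    increment();
    return count;   // the local variable itself is untouched
}

// Capture by reference: the lambda increments the local variable itself.
int by_reference_demo() {
    int count = 0;
    auto increment = [&count]() { return ++count; };
    increment();
    increment();
    return count;   // reflects both increments
}
```

`by_copy_demo()` returns `0` while `by_reference_demo()` returns `2`. Capturing by copy is what makes `counter` safe: its parameter `count` no longer exists once `counter` returns, so a lambda that captured it by reference would be left referring to a destroyed variable.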

Run the following cell to create multiple counters:

In [None]:
%%cpp
auto counter1=counter(), counter2=counter();

Try running the following two cells repeatedly in arbitrary order to see that the counters are incremented independently.

In [None]:
%%cpp
counter1()  // increment counter1

In [None]:
%%cpp
counter2()   // increment counter2
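
One way to picture the closure is as an object that carries its own state together with a call operator. The following hand-written sketch, with the hypothetical name `Counter`, behaves similarly to the callables returned by `counter`:

```cpp
// Hypothetical hand-written counterpart of the closure returned by counter.
struct Counter {
    int count = 0;                         // state owned by each object
    int operator()() { return ++count; }   // makes the object callable
};
```

Two objects declared as `Counter c1, c2;` keep separate counts, so calling `c1()` twice and then `c2()` once yields `1`, `2`, and `1`, just as `counter1` and `counter2` are incremented independently.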

As another example, consider the problem of computing the Fibonacci number $F_n$ of order $n\geq 0$, which is defined as

$$
\begin{align}
F_n = 
\begin{cases}
0 & \text{if } n = 0 \\
1 & \text{if } n = 1 \\
F_{n-1} + F_{n-2} & \text{if } n \geq 2.
\end{cases}
\end{align}
$$ (eq:fibonacci)

As with the `gcd` function, we can implement it easily using recursion:

In [None]:
%%cpp
unsigned long fibonacci(const unsigned long n) {
    return n>1? fibonacci(n-1) + fibonacci(n-2): n==1? 1: 0;
}
fibonacci(4)

The following prints the first 10 Fibonacci numbers:

In [None]:
%%cpp
for (auto n=0uL; n<10; n++) {
    cout << fibonacci(n) << '\n';
}

Can we define a recursion using a lambda expression?

In [None]:
%%cpp
std::function<unsigned long(const unsigned long)> fib=[&fib](const unsigned long n) {
    return n>1? fib(n-1) + fib(n-2): n==1? 1: 0;
};
fib(4)

- `[&fib]` captures `fib` by reference. This is possible because the name `fib` is already in scope within its own initializer, i.e., while we are defining it.
- `auto` would not work because the type of `fib` would have to be deduced from the initializer, but the initializer itself refers to `fib`.
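For comparison, one possible workaround that does allow `auto` is to pass the lambda to itself as an argument, so that the body never refers to the variable being defined. This is only a sketch: `fib_generic`, `impl`, and `self` are hypothetical names, and the generic lambda requires C++14 or later.

```cpp
// Recursion through a generic lambda that receives itself as `self`.
unsigned long fib_generic(const unsigned long n) {
    // The explicit return type lets the body call `self` before the
    // return type would otherwise have been deduced.
    auto impl = [](auto&& self, const unsigned long k) -> unsigned long {
        return k > 1 ? self(self, k - 1) + self(self, k - 2) : k;
    };
    return impl(impl, n);
}
```

Compared with `std::function`, this form avoids type erasure, at the cost of the slightly awkward `self(self, ...)` calls.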

::::{exercise}
:label: factorial

Define the factorial function by assigning a lambda expression to a variable named `factorial`.

::::

In [None]:
%%cpp
/*
# REPLACE THE ENTIRE COMMENT WITH YOUR CODE #
*/
factorial(10) == 10*9*8*7*6*5*4*3*2*1 // for testing

Let's run `fibonacci` for bigger orders:

In [None]:
%%cpp
fibonacci(20)

In [None]:
%%cpp
fibonacci(30)

In [None]:
%%cpp
fibonacci(40)

In [None]:
if input("You really want to wait? [y/N]").lower() == 'y':
    ROOT.fibonacci(50)       # restart the kernel to stop

::::{caution}

Unlike the [tail recursion](https://en.wikipedia.org/wiki/Tail_call) for `gcd`, the recursion for `fibonacci` is very inefficient: when the order `n` is increased by 1, the execution time increases roughly by a factor of $\phi\approx 1.618$, the golden ratio. `fibonacci(50)` is therefore expected to be over a hundred times slower than `fibonacci(40)` because $\phi^{10}\approx 123$, so expect to wait for roughly 2 minutes, depending on the machine.

::::

The following is an implementation using iteration:

In [None]:
%%cpp
unsigned long fibonacci_iteration(const unsigned long n) {
    if (n==0) return 0;
    if (n==1) return 1;
    auto _=0uL,   // F_{n-2}
         F=1uL,   // F_{n-1}
         tmp=F;
    for (auto i=n; i>1; i--) {
        tmp += _; // F_{n-1} + F_{n-2} mod uL_max
        _ = F;    // F_{n-1}
        F = tmp;  
        if (_>F) throw runtime_error("The value overflows.");
    }
    return F;     // F_n
}

In [None]:
@interact(n=(0,100))
def print_fibonacci_iteration(n):
    print("Fibonacci number of order {}: {}.".format(n, ROOT.fibonacci_iteration(n)))

::::{exercise}
:label: ex:overflow_order

How does `fibonacci_iteration` detect overflow? What is the smallest order for which the Fibonacci number overflows?

:::{hint}
:class: dropdown

The Fibonacci sequence is strictly increasing for orders larger than 1.

:::

::::

YOUR ANSWER HERE

Note that `fibonacci_iteration(50)` gives $F_{50}$ with no sweat!

In [None]:
%%cpp
fibonacci_iteration(50)

In [None]:
%%cpp
cout << format("fibonacci(90)={}\n", fibonacci_iteration(90)); // no wait/way?

`fibonacci(90)` is expected to be over 10 billion times slower than `fibonacci(40)` because $\phi^{50}\approx 2.8\times 10^{10}$, so you would have to wait for centuries to see it return the above value!

To demonstrate the issue, the following program overloads `fibonacci` to print the recursive calls and count them:

In [None]:
%%cpp
unsigned long fibonacci(const unsigned long n, const int verbosity) {
    static auto count=0uL, depth=-1uL;
    depth+=1;
    if (!depth) count=0;
    count+=1;
    if (verbosity>1) {
        for (auto i=depth; i>0; i--) cout << '|';
        cout << format("fibonacci({})\n", n);
    }
    auto value = n>1? fibonacci(n-1,verbosity)+fibonacci(n-2,verbosity): n==1?1:0;
    if (!depth && verbosity>0) cout << format("Total number of calls: {}\n", count);
    depth-=1;
    return value;
}

With verbosity level 2, the function calls and the total number of calls are printed:

In [None]:
%%cpp
fibonacci(0, 2)

In [None]:
%%cpp
fibonacci(1, 2)

In [None]:
%%cpp
fibonacci(2, 2)

In [None]:
%%cpp
fibonacci(3, 2)

In [None]:
%%cpp
fibonacci(4, 2)

The depth of a recursive call is indicated by the number of bars `|`. Observe the redundant computations of lower-order Fibonacci numbers.

To see the total number of calls for larger orders, use the verbosity level $1$ to print only the count, not the calls:

In [None]:
%%cpp
fibonacci(30, 1)

The number of calls in computing $F_n$ can be shown to be $2F_{n+1}-1$:

In [None]:
%%cpp
2*fibonacci(31)-1

It can also be shown that the Fibonacci number grows exponentially as $F_n\approx \frac{\phi^n}{\sqrt{5}}$, and so does the number of calls. This explains the exponentially long wait.
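The formula for the number of calls follows from a short induction. As a sketch, let $C_n$ (a symbol introduced here only for illustration) denote the number of calls made in computing `fibonacci(n)`. Each call with $n\geq 2$ counts itself plus the calls made by its two recursive calls:

$$
\begin{align}
C_0 &= C_1 = 1, \\
C_n &= 1 + C_{n-1} + C_{n-2} && \text{for } n \geq 2.
\end{align}
$$

Adding 1 to both sides gives $C_n + 1 = (C_{n-1} + 1) + (C_{n-2} + 1)$, so $C_n + 1$ satisfies the same recurrence as the Fibonacci numbers, with initial values $C_0 + 1 = 2 = 2F_1$ and $C_1 + 1 = 2 = 2F_2$. Hence $C_n + 1 = 2F_{n+1}$, i.e., $C_n = 2F_{n+1} - 1$.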

::::{exercise}
:label: ex:fix_fibonacci

How to make the recursion `fibonacci` efficient?

::::

YOUR ANSWER HERE

But the more important question for now is:

::::{exercise}
:label: ex:fix_fibonacci_static

How to rewrite the verbose `fibonacci` without using static/global variables?

::::

In [None]:
%%cpp
unsigned long fibonacci(const unsigned long n, const int verbosity) {
    auto count=0uL, depth=-1uL;
    /*
    # REPLACE THE ENTIRE COMMENT WITH YOUR CODE #
    */
}
fibonacci(4, 2)  // for testing

## Modularization

How to use a function you define in a file? The simplest way is to write the code that uses the function in the same file that defines it:

In [None]:
%%writefile private/rsqrt.cpp
#include <cmath>
#include <iostream>
#include <format>

inline double rsqrt(const double x) {
    return 1./std::sqrt(x);
}

using std::cout, std::format;

int main() {
    double x = 10./3;
    cout << format("rsqrt({})={}.\n", x, rsqrt(x));
    return 0;
}

In [None]:
!clang++ -fuse-ld=lld --std=c++20 --stdlib=libc++ -O2 -ffast-math private/rsqrt.cpp -o private/rsqrt && private/rsqrt

[`inline`](https://en.cppreference.com/w/c/language/inline.html) is a [function specifier](https://en.cppreference.com/w/c/language/function_specifiers.html) that hints to the compiler that it may perform optimizations such as [inlining](https://en.wikipedia.org/wiki/Inline_expansion). Inlining replaces a function call with the function's code directly at the call site, which can reduce function call overhead and improve performance, especially for small functions. The compiler is free to ignore the hint.

We can also separate the function declaration from its definition:

In [None]:
%%writefile private/rsqrt.cpp
#include <cmath>
#include <iostream>
#include <format>

using std::cout, std::format;

int main() {
    double rsqrt(const double x);   // rsqrt available in block scope, but inline not allowed
    double x = 10./3;
    cout << format("rsqrt({})={}.\n", x, rsqrt(x));
    return 0;
}

inline double rsqrt(const double x) {
    return 1./std::sqrt(x);
}

In [None]:
!clang++ -fuse-ld=lld --std=c++20 --stdlib=libc++ -O2 -ffast-math private/rsqrt.cpp -o private/rsqrt && private/rsqrt

The above attempts to encapsulate `rsqrt` within the block scope of `main`, making it easier to maintain. However, the `inline` specifier is not allowed on a function declaration in block scope, so the declaration inside `main` has to omit it. If multiple programs need to access `rsqrt`, we can instead put it in a custom header file `rsqrt.hpp` as shown below, to be included in other files that use it:

In [None]:
%%writefile private/rsqrt.hpp
#pragma once  // Non-standard pragma to ensure the header is only parsed once.
#include <cmath>

inline double rsqrt(const double x) {
    return 1./std::sqrt(x);
}

In [None]:
%%writefile private/main.cpp
#include <iostream>
#include <format>
#include "rsqrt.hpp"

using std::cout, std::format;

int main() {
    double x = 10./3;
    cout << format("rsqrt({})={}.\n", x, rsqrt(x));
    return 0;
}

In [None]:
!clang++ -fuse-ld=lld --std=c++20 --stdlib=libc++ -O2 -ffast-math private/main.cpp -o private/main && private/main

`rsqrt.hpp` is a header file included in `main.cpp` using the quote form as opposed to the angle-bracket form:

- An include in the quote form such as `#include "rsqrt.hpp"` typically tells the compiler to search the directory of the including file before the system include directories.
- An include in the angle-bracket form such as `#include <format>` tells the compiler to search only the system include directories (and any directories added with `-I`), which is appropriate for headers that are not defined in the current project.

In [None]:
%%ai
What happens if a header file is included more than once? What if there is a cyclic 
inclusion of header files, e.g., if `rsqrt.hpp` includes itself?

[`#pragma once`](https://en.cppreference.com/w/cpp/preprocessor/impl) in the first line of `rsqrt.hpp` ensures that the header is parsed only once even if it is included multiple times in a single compilation. However, since `#pragma once` is not part of the C++ standard, it may not be supported by older compilers. Whether it works also relies on the compiler's ability to detect file identity. A more conservative alternative is to use an include guard such as

```cpp
#ifndef RSQRT_HPP
#define RSQRT_HPP
...
#endif // RSQRT_HPP
```

where the macro name `RSQRT_HPP` should be chosen to be as specific as possible to avoid clashing with other macros. Once defined, it indicates that the content of `rsqrt.hpp` has already been included.
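To see what the guard does, the sketch below pastes the content of a hypothetical header `square.hpp` twice into one file, which is what the preprocessor effectively produces when a header is included twice. The header name, the macro `SQUARE_HPP`, and the function `square` are assumptions made for illustration.

```cpp
// ---- first inclusion of the hypothetical header square.hpp ----
#ifndef SQUARE_HPP
#define SQUARE_HPP
// Deliberately not inline: a second copy of this definition in the same
// translation unit would be a redefinition error.
int square(const int x) { return x * x; }
#endif // SQUARE_HPP

// ---- second inclusion of the same header ----
#ifndef SQUARE_HPP   // false now, so everything up to #endif is skipped
#define SQUARE_HPP
int square(const int x) { return x * x; }
#endif // SQUARE_HPP
```

Without the `#ifndef`/`#define`/`#endif` lines, the second copy of `square` would be rejected by the compiler as a redefinition.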

In [None]:
%%writefile private/rsqrt.hpp
#ifndef RSQRT_HPP
#define RSQRT_HPP
#include <cmath>

inline double rsqrt(const double x) {
    return 1./std::sqrt(x);
}
#endif // macro RSQRT_HPP to indicate whether the header has been included

In [None]:
!clang++ --std=c++20 --stdlib=libc++ -O2 -ffast-math private/main.cpp -o private/main && private/main

`rsqrt.hpp` is also known as a header-only library because all implementations are included in the header file. For more complex implementations that can benefit from pre-compilation, the implementations can be placed in a separate source file. Consider the following modified header file:

In [None]:
%%writefile private/rsqrt.hpp
#include <cmath>

inline double rsqrt(const double x) {
    return 1./std::sqrt(x);
}

namespace intrin {
double rsqrt(const double x);
float rsqrt(const float x);
} // namespace intrin to avoid name conflicts

The header file
- defines `inline double rsqrt(const double x)` as before, and
- declares but does not define `double rsqrt(const double x)` and `float rsqrt(const float x)`.

The declarations without the `inline` specifier are grouped under a custom [namespace](https://en.cppreference.com/w/cpp/language/namespace.html) called `intrin` to avoid a name conflict with the inline version of `rsqrt`.
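As a small self-contained sketch of how a namespace separates two functions with the same name and signature, consider the following. The names `inv_sqrt` and `alt` are hypothetical and chosen so as not to clash with the lecture's `rsqrt`:

```cpp
#include <cmath>

// Version in the global namespace.
inline double inv_sqrt(const double x) { return 1. / std::sqrt(x); }

namespace alt {
// Same name and signature, but a distinct function because it is a member
// of the namespace `alt`.
inline double inv_sqrt(const double x) { return std::pow(x, -0.5); }
} // namespace alt

// The unqualified name refers to the global version here, and the
// qualified name alt::inv_sqrt selects the other one.
double difference(const double x) { return inv_sqrt(x) - alt::inv_sqrt(x); }
```

Without the namespace, the second definition would conflict with the first, since two functions in the same scope cannot share both name and parameter types.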

To implement the overloaded `intrin::rsqrt` function in a separate source file:

In [None]:
%%writefile private/rsqrt.cpp
#include "rsqrt.hpp"     // optional but strongly preferred
#include <immintrin.h>

namespace intrin {
// inline is unnecessary for definitions not in a header file
double rsqrt(const double x) {
    return 1. / _mm_cvtsd_f64(_mm_sqrt_sd(_mm_set_sd(x), _mm_set_sd(x)));
}

float rsqrt(const float x) {
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
}
} // namespace intrin

While the source file includes `<immintrin.h>` to access the CPU intrinsics directly in the implementations of `intrin::rsqrt`, it also includes the header file `rsqrt.hpp`, which declares the functions it implements. Technically, such inclusion is optional since the implementations do not rely on the inline version of `rsqrt`. However, doing so is strongly preferred because it helps the compiler check that the definitions are consistent with the declarations.
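One way to make the compiler enforce that consistency more strictly is to define each function with a qualified name outside the namespace block. This is shown only as a sketch and is not what `rsqrt.cpp` above does. Such a definition must match a declaration already made inside the namespace. The namespace `intrin_demo` is hypothetical, the header and source parts are collapsed into one block, and `std::sqrt` stands in for the intrinsics.

```cpp
#include <cmath>

// Stand-in for the header: declarations only.
namespace intrin_demo {
double rsqrt(const double x);
float rsqrt(const float x);
} // namespace intrin_demo

// Stand-in for the source file: qualified definitions, each of which must
// match one of the declarations above.
double intrin_demo::rsqrt(const double x) { return 1. / std::sqrt(x); }
float intrin_demo::rsqrt(const float x) { return 1.f / std::sqrt(x); }

// A definition with a mismatched parameter list, such as
//   double intrin_demo::rsqrt(const int x) { ... }
// would be rejected because no such function was declared in intrin_demo.
```

By contrast, a mismatched definition written inside a `namespace` block would simply declare a new overload, and the error would surface only at link time when the declared function turns out to be undefined.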

To use both versions of `rsqrt` in a program:

In [None]:
%%writefile private/main.cpp
#include <iostream>
#include <format>
#include "rsqrt.hpp"

using std::cout, std::format;

int main() {
    double x = 10./3;
    float x_ = static_cast<float>(x);
    cout << format("rsqrt({})={}.\n", x, rsqrt(x));
    cout << format("intrin::rsqrt({})={}.\n", x, intrin::rsqrt(x));
    cout << format("intrin::rsqrt({}f)={}f.\n", x_, intrin::rsqrt(x_));
    return 0;
}

In [None]:
!clang++ -fuse-ld=lld --std=c++20 --stdlib=libc++ -O2 -ffast-math private/main.cpp private/rsqrt.cpp -o private/main && private/main

To share the library between multiple programs, we can first compile `rsqrt.cpp` into a dynamic (shared) library:

In [None]:
!clang++ -v -fuse-ld=lld -fPIC -shared --std=c++20 --stdlib=libc++ -O2 -ffast-math private/rsqrt.cpp -o private/librsqrt.so

To compile and link `main.cpp` with `librsqrt.so`:

In [None]:
!clang++ -v -fuse-ld=lld --std=c++20 --stdlib=libc++ -O2 private/main.cpp -L private -l rsqrt -o private/main_

To run the program, we need to include the folder `private` in the run-time search path for dynamic libraries. This can be done by prepending `private` to the environment variable `LD_LIBRARY_PATH`:

In [None]:
!LD_LIBRARY_PATH=private:$LD_LIBRARY_PATH private/main_

Observe that the executable `private/main_` relying on `private/librsqrt.so` is smaller in size than the standalone executable `private/main`.

In [None]:
!ls -l private/main private/main_ private/librsqrt.so

In [None]:
%%ai
Why is including the header file (e.g., rsqrt.hpp) in the implementation source file 
(e.g., rsqrt.cpp) that defines the function strongly preferred in C++?

::::{exercise}
:label: ex:addmul

Review the [demo project from Lecture 1](../Lecture1/CMakeLists.txt) and describe how the functions defined by the project are 
- declared and defined in different header/source files; and
- compiled into libraries/executables.

::::

YOUR ANSWER HERE