---
title: Elements of Computations
skip_execution: true
---

In [None]:
from __init__ import *

In [None]:
if not input('Load JupyterAI? [Y/n]').lower()=='n':
    %reload_ext jupyter_ai

## Motivations

This notebook introduces the basic ingredients for computations in C++ and explores their limitations. To provide motivation for the subject, we begin by presenting two problems below.

### GCD

Consider the problem of computing the [greatest common divisor (GCD)](https://en.wikipedia.org/wiki/Greatest_common_divisor), which is a fundamental concept in number theory.[^Galois] 

::::{prf:definition} GCD
:label: def:gcd

The GCD of two non-zero integers $a$ and $b$, denoted as $\operatorname{gcd}(a, b)$, is *the largest integer $d$ that divides both $a$ and $b$*, i.e., $d|a$ and $d|b$.


::::

[^Galois]: For instance, two numbers with GCD equal to 1 are called coprime, which is a concept used in [finite fields](https://en.wikipedia.org/wiki/Finite_field) with significant applications to different areas such as coding theory and cryptography.

The following is an implementation in C++.

1. To specify and store the non-zero integers $a$ and $b$:

In [None]:
%%cpp
int a=2*3*4, b=3*4*5; // integer variables declaration and initialization
cout << format("gcd({}, {})=?\n", a, b); // print formatted string

2. To compute the GCD of $a$ and $b$ step-by-step, repeatedly run the following program until *`a` becomes `0`, in which case the absolute value of `b` is the gcd*.
  ::::{code} cpp
  :label: code_gcd1
  :caption: Computation of the GCD of the non-zero integers `a` and `b`.
  :linenos:
  {
      b = b % a; // assignment and modulo operation
      int c = b; // variable in the block scope of the compound statement
      b = a;
      a = c;
  } // compound statement
  ::::

In [None]:
%%cpp
{
    b = b % a;
    int c = b;
    b = a;
    a = c;
    cout << format("gcd({}, {})\n", a, b); // intermediate answer
} // compound statement

::::{caution}

Want to see a crash? Run the code further after getting the GCD from the printout `gcd(0, ...)`! Press <kbd>0, 0</kbd> to restart the kernel, and then run all above cells again.

::::
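
The repeated manual runs above can be sketched as a single function. This is only a minimal sketch under stated assumptions, not the notebook's own solution: the name `gcd_sketch` is hypothetical, and the loop simply stops once `a` becomes `0` instead of running the compound statement one time too many.

```cpp
#include <cstdlib>  // std::abs

// Hypothetical helper: repeats the compound statement until `a` becomes 0.
int gcd_sketch(int a, int b) {
    while (a != 0) {
        int c = b % a;  // remainder, kept in a block-scope variable
        b = a;
        a = c;
    }
    return std::abs(b);  // the absolute value of `b` is the GCD
}
```

For example, `gcd_sketch(2*3*4, 3*4*5)` goes through the same intermediate values as the cell above and returns `12`.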

::::{exercise}
:label: ex:gcd1

How can the above program fail to give the correct GCD?

:::{hint}
:class: dropdown

The code fails for some choices of $a$ and $b$.

:::

::::

YOUR ANSWER HERE

If you can answer the above question, great! But don't worry if you cannot. We will learn some basic programming elements to solve [](#ex:gcd1) in this notebook.

::::{exercise}
:label: ex:gcd2

Explain how the program computes the GCD. Evaluate the effectiveness of the algorithm and suggest improvements.

:::{hint}
:class: dropdown

The program uses the [](https://en.wikipedia.org/wiki/Euclidean_algorithm). Compare it with [Python's implementation for the `gcd` function in C](https://github.com/python/cpython/blob/47bc10e6b3cb44658da275f3484781ef2a2b9222/Objects/longobject.c#L5726C1-L5931C2), which applies [](https://en.wikipedia.org/wiki/Lehmer%27s_GCD_algorithm). See also the [GitHub issue](https://github.com/python/cpython/issues/66676) for the implementation details.

If you would like to give the computational complexity, the worst case performance is when $a$ and $b$ are two consecutive [Fibonacci numbers](https://en.wikipedia.org/wiki/Fibonacci_sequence), such as

> `a = 8` ($F_6$) and `b = 13` ($F_7$).

$F_n$ grows exponentially as $O(\phi^n)$ where $\phi$ is the golden ratio.

:::

::::

YOUR ANSWER HERE

[](#ex:gcd2) is more challenging, as it requires quite a bit of [computational thinking](https://en.wikipedia.org/wiki/Computational_thinking) that you will develop over time, mostly outside the classroom. Dive into it with the help of AI:

In [None]:
%%ai
What is the fastest possible algorithm for computing GCD?

### Inverse Square Root

While computing the greatest common divisor (GCD) involves only integers, many computations involve rational, real, and even complex numbers. An example is to compute the *inverse/reciprocal square root*

$$
\begin{align}
\operatorname{rsqrt}(x)&:=\frac{1}{\sqrt{x}} && \text{for $x>0$},
\end{align}
$$ (eq:rsqrt)

which is an important computation for 3D graphics, e.g., in first-person shooter games such as [](https://en.wikipedia.org/wiki/Quake_III_Arena).

The following is an implementation in C++.

1. Store the real number $x$ *approximately* as a floating-point number, and compute its reciprocal $y=\frac1x$ as the initial guess of the inverse square root:

In [None]:
%%cpp
double x = 10./3;  // a floating point number as input
cout << format("rsqrt({})=?\n", x);

auto y = 1/x;      // reciprocal, type `auto`matically deduced
cout << format("Initial guess: {}\n", y);

2. To compute the desired square root of the reciprocal, repeatedly run the following until the answer remains unchanged/steady.
  ::::{code} cpp
  :label: code_inv_sqrt1
  :caption: Update rule for the inverse square root of `x`.
  :linenos:
  y = (y+1/y/x)/2;   // update the answer
  ::::

In [None]:
%%cpp
y = (y+1/y/x)/2;
cout << format("Intermediate answer: {}\n", y);

You can verify whether the answer is correct by recovering $x$ from $y$ as $\frac{1}{y^2}$:

In [None]:
%%cpp
1/y/y

The program [can be improved](https://mrober.io/papers/rsqrt.pdf) by:

1. A better initial guess
  ::::{code} cpp
  :label: code_fast_inv_sqrt1_guess
  :caption: Initial guess for the fast inverse square root of `x`.
  :linenos:
   auto i = *reinterpret_cast<int64_t *>(&x);   // copy x to i as an integer
   i = 0x5fe6eb50c7b537a9 - (i >> 1);           // ???
   auto y = *reinterpret_cast<double *>(&i);    // copy i to y as a double
  ::::

In [None]:
%%cpp
double x = 10./3;  // input
cout << format("rsqrt({})=?\n", x);

auto i = *reinterpret_cast<int64_t *>(&x);
i = 0x5fe6eb50c7b537a9 - (i >> 1);
auto y = *reinterpret_cast<double *>(&i);
cout << format("Initial guess: {}\n", y);  // initial guess

2. An alternative update rule
  ::::{code} cpp
  :label: code_fast_inv_sqrt1_update
  :caption: Update rule for the fast inverse square root of `x`.
  :linenos:
  y = y * (1.5 - (x*0.5 * y * y));
  ::::

In [None]:
%%cpp
y = y * (1.5 - (x/2 * y * y));
cout << format("Intermediate answer: {}\n", y);

::::{exercise}
:label: ex:inv_sqrt1

How can the above program fail to give the correct inverse square root?

:::{hint}
:class: dropdown

The code fails for some choices of $x$.

:::

::::

YOUR ANSWER HERE

::::::{exercise}
:label: ex:inv_sqrt2

Explain how the programs compute the inverse square root.

:::::{hint}
:class: dropdown

With $y:=\frac{1}{\sqrt{x}}$, we can obtain an equation in $y$ whose root is the desired inverse square root:
\begin{align}
f(y):=y^2 - \frac1x
\end{align}

The question is how to compute the root of $f(y)$. To understand the improved initial guess, see a related video below:

::::{card}
:header: The Fast Inverse Square Root -- 0x5f3759df explained!!
:footer: [open in new tab](https://www.youtube.com/embed/NCuf2tjUsAY?si=oaqOOD64Z6kxCS_m)

:::{iframe} https://www.youtube.com/embed/NCuf2tjUsAY?si=oaqOOD64Z6kxCS_m
:width: 100%
:::
::::

:::::

::::::

YOUR ANSWER HERE

Answering both [](#ex:inv_sqrt1) and [](#ex:inv_sqrt2) requires a very good understanding of how real numbers are represented in computers in addition to the numerical methods involved:

In [None]:
%%ai
List different methods of computing the square root using basic arithmetic 
operations.

In the sequel, we will tackle the easier task of learning the elements of C++ involved in carrying out the above computations.

## Integer

### Literal

How to specify an integer value in C++?

One way to specify a value is to use a *literal*, which is a fixed value directly embedded in the source code. A literal can be written in different ways:

In [None]:
%%cpp
15 // in decimal

In [None]:
%%cpp
0b1111 // in binary

In [None]:
%%cpp
017 // in octal

In [None]:
%%cpp
0xF // in hexadecimal

All of the above have the same value of type `int`!

In comparison, Python uses the same syntax except for the octal format:

In [None]:
15, 0b1111, 0o17, 0xF

C++ also supports [different types of integer literals of different sizes](https://en.cppreference.com/w/cpp/language/types.html), e.g., 3 billion is represented as type `long`:

In [None]:
%%cpp
3000000000

$10^{19}$ is represented as an unsigned integer type:

In [None]:
%%cpp
10000000000000000000 // with a warning

We can use the [appropriate suffixes](https://en.cppreference.com/w/cpp/language/integer_literal.html) to specify the desired type:

In [None]:
%%cpp
10000000000000000000uLL // no warning message

In [None]:
%%cpp
1L // not the default type `int`.

The size of different types can be obtained by the [`sizeof` operator](https://en.cppreference.com/w/cpp/language/sizeof.html):

In [None]:
%%cpp
sizeof(short)  // in the unit of byte (8 bits)

In [None]:
%%cpp
sizeof(int)

In [None]:
%%cpp
sizeof(long)

In [None]:
%%cpp
sizeof(long long)

Let's calculate the number of values of type `int`:

In [None]:
%%cpp
1uLL << sizeof(int)*8

::::{caution} Shouldn't we use `cout <<` instead to print the number of values?
:class: dropdown

The number of values of type `int` is computed using the [bitwise shift operator `<<`](https://en.cppreference.com/w/cpp/language/operator_arithmetic.html#Bitwise_shift_operators), which is not the usual [stream insertion operator](https://en.cppreference.com/w/cpp/io/basic_ostream/operator_ltlt2.html) used to print to the standard output/error `cout`/`cerr`.[^bitwise]

::::

[^bitwise]: Bitwise operators perform operations on the bits of the binary representation of integers. You may explore other bitwise operators to learn about them:
    1. Bitwise Shift: `<<`, `>>`
    2. Bitwise AND: `&`
    3. Bitwise XOR: `^`
    4. Bitwise OR: `|`

::::{caution} What happens if you change `1uLL` to `1`?
:class: dropdown

The computed value will be wrong. You may see a warning such as

```
warning: shift count >= width of type [-Wshift-count-overflow]
```

because `int` is not large enough to represent the number of values of type `int`! Can you explain what the computed value actually is? Try running the code in `xeus-cpp` kernel as a comparison.

::::

We can also obtain the exact range of a number type using the [`std::numeric_limits`](https://en.cppreference.com/w/cpp/types/numeric_limits.html) class template from the header `<limits>`:

In [None]:
%%cpp
cout << format("Range of int: {{{} ... {}}}\n", 
               numeric_limits<int>::min(),  
               numeric_limits<int>::max());

The number of values of type `int` can alternatively be computed as follows:

In [None]:
%%cpp
1uLL + numeric_limits<int>::max() - numeric_limits<int>::min()

You might have noticed that the actual size of `int` is larger than the minimum size required by the C++ standard. Let's ask AI about it:

In [None]:
%%ai
Why int in C++ is 32 bits instead of 16 bits required by the C++ standard?

Similarly, the size of `long` can be 32 bits or 64 bits depending on the computer/compiler used. To ensure a definite range across different implementations, fixed-width integer types are available from [`<cstdint>`](https://en.cppreference.com/w/cpp/types/integer.html).

In [None]:
%%cpp
1uL - numeric_limits<int16_t>::min() + numeric_limits<int16_t>::max() == 1uL << 16

For instance, [](#code_fast_inv_sqrt1_guess) uses `int64_t` to ensure that the integer type has `64` bits, which matches the size of the type of `x`, to be explained in the section [](#Floating-Point-Number).
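
The counting trick used above can be wrapped into a small template to compare fixed-width types. This is a minimal sketch: `count_values` is a hypothetical name, and the result is only meaningful for integer types narrower than 64 bits, since the count for a 64-bit type does not fit in `unsigned long long` and wraps around to `0`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: number of distinct values of an integer type T
// narrower than 64 bits, computed as 1 + max - min in unsigned arithmetic.
template <typename T>
constexpr unsigned long long count_values() {
    return 1uLL + std::numeric_limits<T>::max() - std::numeric_limits<T>::min();
}

// Checked at compile time: a 16-bit integer type has 2^16 values.
static_assert(count_values<std::int16_t>() == (1uLL << 16),
              "int16_t should have 2^16 values");
```

Starting the sum with `1uLL` forces the whole computation into `unsigned long long`, which is why the intermediate results do not overflow.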

::::{exercise}
:label: ex:limits

What if `1` is used instead of `1uLL` in the above calculation?
    
::::

YOUR ANSWER HERE

### Variable

Knowing how to specify an integer, we need a way to store it and retrieve it to perform more complicated computations.

::::{caution}

C++ is [statically typed](https://en.wikipedia.org/wiki/Type_system#Static_typing), requiring explicit declaration of variable types and making programmers aware of their memory requirements. When done properly, C++ programs can be very fast. Unfortunately, writing good C++ programs can be quite demanding. Mistakes can lead to security risks that are hard to detect or fix.

::::

The following declares integer variables with the default [initialization](https://en.cppreference.com/w/cpp/language/initialization.html) of value `0` for variables with static storage duration such as [global variables](https://en.cppreference.com/w/cpp/language/scope.html).

In [None]:
%%cpp
int a;
short b, c;
unsigned long d, e, f;
cout << format("{} {} {} {} {} {}\n", a, b, c, d, e, f);

`int a;` is a [declaration statement](https://en.cppreference.com/w/cpp/language/statements.html#Declaration_statements) (ended with a semicolon) that declares an integer variable named `a`.

A variable name must be a valid [identifier](https://en.cppreference.com/w/cpp/language/identifiers.html). In particular, similar to Python, it should start with a letter or an underscore, and should not be one of the [keywords](https://en.cppreference.com/w/cpp/keywords.html). For instance, the following declarations fail:

```cpp
int 1a;
int this;
```

Variables without static storage duration such as the following variables defined in a [block scope](https://en.cppreference.com/w/cpp/language/scope.html#Block_scope) are not initialized:

In [None]:
%%cpp
{
    int a;
    short b, c;
    unsigned long d, e, f;
    cout << format("{} {} {} {} {} {}\n", a, b, c, d, e, f);
}

To initialize the variables explicitly, the following uses the [*copy initialization*](https://en.cppreference.com/w/cpp/language/copy_initialization.html):

In [None]:
%%cpp
auto a = 1;
short b = a+1, c = b+1;
auto d = 1uL;
cout << format("{} {} {} {}\n", a, b, c, d);

In the above code, `auto` is a [placeholder type specifier](https://en.cppreference.com/w/cpp/language/auto) that allows the compiler to automatically deduce the data type based on the initializer:

In [None]:
%%cpp
a

In [None]:
%%cpp
d

Note that a variable is available immediately after the [*locus/point* of declaration](https://en.cppreference.com/w/cpp/language/scope.html#Point_of_declaration). It is really **immediately**:[^recursion]

[^recursion]: This will be useful when defining a [recursion](https://en.wikipedia.org/wiki/Recursion) using [lambda expression](https://en.cppreference.com/w/cpp/language/lambda.html).

In [None]:
%%cpp
int one=one+1; // `auto` would not work obviously.
one

In [None]:
%%ai
Explain briefly whether `one` in the following code is initialized as `one+1`, 
`1`, or `0`.
---
int one=one+1;

Copy initialization may involve an extra copy step, which can be costly for composite data consisting of many values. The extra copy step can be avoided using [*direct initialization*](https://en.cppreference.com/w/cpp/language/direct_initialization.html):

In [None]:
%%cpp
int one(one+1);
one

In [None]:
%%ai
Is `int one(one+1);` faster than `int one=one+1;`?

Let's play with the copy and direct initialization some more:

In [None]:
%%cpp
unsigned short d=-1;  // runs without warning
d

In [None]:
%%cpp
short b(100000);    // runs with warning
b

::::{caution}

Neither copy nor direct initialization prevents *narrowing*, i.e., the process of converting a value from a larger data type to a smaller one, potentially resulting in the loss of information or precision.

::::

To catch bugs caused by narrowing, C++11 introduced [*list initialization*](https://en.cppreference.com/w/cpp/language/list_initialization.html):

In [None]:
%%cpp
int one{one+1};
one

This initialization not only avoids the extra copy step associated with copy initialization, but also prevents information loss due to narrowing. In particular, the following fails as narrowing is checked by list initialization:

```cpp
short b{100000};
unsigned long d{-1};
```

### Operator

The values of variables can be modified using [assignment operators](https://en.cppreference.com/w/cpp/language/operator_assignment.html):

In [None]:
%%cpp
unsigned long d, e, f;
d = e = f = 1;
cout << format("{} {} {}\n", d, e, f);

`d = e = f = 1` behaves like chained assignment in Python:

In [None]:
d = e = f = 1
print(d, e, f)

Similar to the augmented assignment operators in Python, C++ also has compound assignment operators such as `+=`, `-=`, `*=`, `/=`, `%=`, `&=`, `|=`, `^=`, `<<=`, `>>=`:

In [None]:
%%cpp
f += e += d += 1;
cout << format("{} {} {}\n", d, e, f);

However, `f += e += d += 1` is not valid syntax in Python. How does assignment work in C++?

1. In C++, assignment operators are right associative and so the evaluation is equivalent to
    ```cpp
    f += (e += (d += 1))
    ```
    
    In comparison, assignment operators are non-associative in Python.

2. In C++, an assignment operation has a value equal to the assigned value, so 
    - `(d += 1)` evaluates to `d+1`;
    - `(e += d+1)` evaluates to `e+d+1`; and
    - `(f += e+d+1)` evaluates to `f+e+d+1`.
   
   In comparison, an assignment is usually a statement in Python that does not have a value.[^walrus]

[^walrus]: The exception is the assignment expression using the walrus operator `:=`.
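
The two rules above can be checked with a minimal sketch that writes out the grouping explicitly. The function name `chained_add` is hypothetical, and the parentheses are redundant: they only make the right associativity visible.

```cpp
#include <array>

// Hypothetical helper: applies the chained compound assignment with the
// grouping written out, and returns the final values of d, e, and f.
std::array<unsigned long, 3> chained_add(unsigned long d, unsigned long e,
                                         unsigned long f) {
    f += (e += (d += 1));  // same as f += e += d += 1
    return {{d, e, f}};
}
```

Starting from `d = e = f = 1`, the innermost assignment gives `d = 2`, which is then added to `e` to give `3`, which in turn is added to `f` to give `4`.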

Note that assignment and initialization are different operations. For instance, a variable can be declared to be a [constant](https://en.cppreference.com/w/cpp/language/cv.html) using `const`:

In [None]:
%%cpp
const int one=one+1; // cannot write `const int one+=1;`
// one += 1;   // fails
cout << one;

A constant cannot be modified after initialization, and the compiler may place it in read-only memory. While the default initialization and copy initialization in `const int one=one+1;` are okay, the compound assignment operation `one+=1` is not.

::::{caution} Precedence and Associativity

An expression often involves many operators, so it is important to learn the precedence and associativity of [the list of operators](https://en.cppreference.com/w/cpp/language/operator_precedence.html) to understand the code. If you get to write the code instead, you can always use parentheses to specify the desired order.

::::

Note that different programming languages may have different meanings or implementations for the same operator.

While C++ has no exponentiation operator `**`, unlike Python, it has the postfix/prefix increment/decrement operators `++`/`--`, which are not available in Python:

In [None]:
%%cpp
int x = 0;
int y = x++; // increments x after evaluation
int z = --x; // decrements x before evaluation
x == y && y == z && z == 0  // all zero?

The last expression utilizes the comparison operator `==` (not `=`) to check *equality* and combines these checks using the logical *AND* operator `&&`. In general, comparison operators have higher precedence than logical operators and so they are evaluated first, i.e.,

```cpp
(x == y) && (y == z) && (z == 0)
```

In Python, you can achieve the same effect using a chained comparison:

In [None]:
ROOT.x == ROOT.y == ROOT.z == 0

::::{exercise}
:label: ex:all-zero

Why does the following C++ code return false even when `x`, `y`, and `z` are all zeros?

::::

In [None]:
%%cpp
int x = y = z = 0;
x == y == z

::::{solution} ex:all-zero
:class: dropdown

The chaining in C++ works differently than the chained comparison in Python.

- `x == y == z` evaluates to `true == z` since `==` is left-associative and `x == y` evaluates to `true`; and
- `true == z` evaluates to `false` since `true` is converted to `1` while `z` is `0`.

::::
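
The two readings can be compared directly in a minimal sketch. The function names below are hypothetical: `chained_like_cpp` spells out what `x == y == z` does in C++, whereas `all_equal` expresses what the Python chained comparison means.

```cpp
// Hypothetical helper: what `x == y == z` computes in C++.
bool chained_like_cpp(int x, int y, int z) {
    bool first = (x == y);  // evaluated first because == is left-associative
    return first == z;      // `first` is converted to 0 or 1, then compared with z
}

// Hypothetical helper: the meaning of the chained comparison in Python.
bool all_equal(int x, int y, int z) {
    return x == y && y == z;
}
```

The two functions disagree whenever `z` differs from the `0`/`1` value of the first comparison in the "wrong" direction: all zeros give `false` for the first function, while `x = 5`, `y = 5`, `z = 1` gives `true`.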

[](#code_gcd1) uses the [modulo operation](https://en.wikipedia.org/wiki/Modulo) `a % b` to give the remainder of `a` divided by `b`. However, the behavior can be confusing for negative operands, i.e., when `a` or `b` is negative. [](#code_gcd1) can give different results if written in a different programming language or run on different hardware because the modulo operator may be implemented differently.

For C++:

In [None]:
%%cpp
int r0=  5 %  3;
int r1= -5 %  3;
int r2=  5 % -3;
int r3= -5 % -3;
cout << format("{}, {}, {}, {}\n", r0, r1, r2, r3);

For Python:

In [None]:
print("{}, {}, {}, {}\n".format(5 % 3, -5 % 3, 5 % -3, -5 % -3))

::::{prf:definition} Modulo
:label: def:modulo

$a \bmod b$ gives the remainder $r$ that satisfies for some integer $q \in \mathbb{Z}$, called quotient, that

$$
\begin{align}
a &= b\cdot q + r\\
\lvert r\rvert&<\lvert b\rvert.
\end{align}
$$ (eq:modulo)

The condition above does *not* determine $r$ (and $q$) uniquely unless the sign of $r$ is also specified.

::::

::::{exercise}
:label: ex:modulo

How is the modulo operation implemented differently in C++ and Python?

::::

::::{solution} ex:modulo
:class: dropdown

The sign of $r$ follows that of the dividend (divisor) for C++ (Python). E.g., for C++,

\begin{alignat}{3}
-5 &= &3 &\cdot \overbrace{(-1)}^{q} +&\overbrace{(- 2)}^{r}&\\
5 &= &-3 &\cdot (-1) + &2&
\end{alignat}

::::
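
As an illustration of the two sign conventions, the following minimal sketch reproduces Python's result on top of the C++ operator. The function name `floor_mod` is hypothetical, and the sketch assumes `b` is non-zero.

```cpp
// Hypothetical helper: remainder whose sign follows the divisor b (as in Python),
// built on the C++ operator % whose result follows the sign of the dividend a.
int floor_mod(int a, int b) {
    int r = a % b;                       // C++ convention
    if (r != 0 && ((r < 0) != (b < 0)))  // r and b have different signs
        r += b;                          // shift r so that its sign follows b
    return r;
}
```

Adding `b` to the remainder corresponds to decreasing the quotient $q$ in [](#eq:modulo) by one, so the equation $a = b\cdot q + r$ still holds after the adjustment.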

The quotient $q$ in [](#eq:modulo) can be obtained by the division operation `/`.

In [None]:
%%cpp
int q0=  5 /  3;
int q1= -5 /  3;
int q2=  5 / -3;
int q3= -5 / -3;
cout << format("{}, {}, {}, {}\n", q0, q1, q2, q3);

In Python, `//` is used instead for integer division. Similar to `%`, the implementation for `//` is also slightly different from `/` in C++.

In [None]:
print("{}, {}, {}, {}\n".format(5 // 3, -5 // 3, 5 // -3, -5 // -3))

In [None]:
%%ai
Explain in one line how integer division is implemented differently in 
C++ versus Python.

## Character

How to represent a character?

A character literal is a character delimited by *single* quotes.

In [None]:
%%cpp
'f'

Each value of type [`char`](https://en.cppreference.com/w/cpp/language/types.html#Character_types) is represented by 1 byte.

In [None]:
%%cpp
sizeof(char)

[`char`](https://en.cppreference.com/w/cpp/language/types.html#Character_types) is actually an integer type, e.g., we can initialize variables of type `char` with integer values as follows.

In [None]:
%%cpp
char a = 65, b(66), c {67};
cout << format("{} {} {}", a, b, c);

We can also perform arithmetic operations on characters like what we can do on integers:

In [None]:
%%cpp
a - b * 2

`char` can also be signed or unsigned:

In [None]:
%%cpp
numeric_limits<char>::min()           // sign bit (left-most bit) equals 1

In [None]:
%%cpp
numeric_limits<unsigned char>::min()  // 0

A character is represented by an integer according to the [ASCII code](https://en.cppreference.com/w/cpp/language/ascii.html). The following converts between `int` and `char` using [static type casting](https://en.cppreference.com/w/cpp/language/static_cast.html):

In [None]:
%%cpp
cout << format("The ASCII code of {} is {}.\n", static_cast<char>(65), static_cast<int>('A'));

::::{caution} C-style cast

The [C-style cast](https://en.cppreference.com/w/c/language/cast.html) `(int) 'A'` also works, but it is not preferred because it lacks compile-time type safety checks.
For example, [](#code_fast_inv_sqrt1_guess) for computing the initial guess of the [fast inverse square root algorithm](https://en.wikipedia.org/wiki/Fast_inverse_square_root#Overview_of_the_code) can use the C-style cast like

```cpp
auto i = *(int64_t *)(&x);
...
auto y = *(double *)(&i);
```

The first line converts the address `&x` of a `double` to an address of `int64_t`, so the dereferenced value can be assigned as an integer to `i`. In other words, while the binary sequences stored in the memory locations of `x` and `i` are the same, they represent numbers of very different types. Such pointer conversions are rare and are often the result of mistakenly typing `*`, which can lead to memory corruption. Therefore, `static_cast` incorporates type safety checks to raise an error for such conversions, helping to identify issues more easily. If such a conversion is intended, however, programmers can use `reinterpret_cast` instead, as shown in [this code](#code_fast_inv_sqrt1_guess).

::::

In [None]:
%%ai
What is static about static_cast?

Be careful that not every character can be printed. E.g., the last character in the ASCII code is <kbd>DEL</kbd> (delete), which cannot be printed.

In [None]:
%%cpp
cout << format("The ASCII code of {} is {}.\n", static_cast<char>(127), 127);

::::{exercise}
:label: ex:ascii

Explain why the following static type casting fails:

```cpp
static_cast<char>(128)
```

::::

YOUR ANSWER HERE

The first character in the ASCII code is <kbd>NUL</kbd> (null), which cannot be printed either.

In [None]:
%%cpp
cout << format("The ASCII code of {} is {}.\n", static_cast<char>(0), 0);

Note that the printout is missing the last few characters, namely, `is 0.`. Why?

In [None]:
%%ai
Why the following C++ code only prints "The ASCII code of"?
---
cout << format("The ASCII code of \{\} is \{\}.\n", static_cast<char>(0), 0);

## Floating Point Number

### Declaration

Computations on real numbers are essential for many applications such as simulations, modeling, computer graphics, and machine learning. To manipulate real numbers, a simple idea is to approximate a real number by a rational number, which can then be represented by two integers, namely, the numerator and denominator. For instance, 

$$\pi\approx \frac{22}{7}.$$

In [None]:
%%cpp
22/7

That is not quite what we expected! While you may know how to fix the above issue, it highlights the need for a more convenient representation for numerical computations: [floating-point arithmetic](https://en.wikipedia.org/wiki/IEEE_754).[^FPU]

[^FPU]: Modern CPUs and GPUs are equipped with dedicated hardware called [Floating Point Units (FPUs)](https://en.wikipedia.org/wiki/Floating-point_unit) specifically designed to handle floating-point arithmetic efficiently.

In [None]:
%%cpp
22/7.  // the point is not the period

::::{exercise}
:label: ex:op_overloading

Explain why `22/7` and `22/7.` produce different results.

::::

YOUR ANSWER HERE

In C++, the two most commonly used floating-point data types are `double` and `float` (a third type, `long double`, also exists).

In [None]:
%%cpp
constexpr auto PI = 0.314e1  // scientific notation

The qualifier [`constexpr`](https://en.cppreference.com/w/cpp/language/constexpr.html) declares a constant expression whose value is known at compile time and will not be modified later. In comparison, the qualifier `const` declares a runtime constant whose value need not be determined at compile time.[^constexpr] For instance, a number randomly drawn with `std::rand()` from `<cstdlib>` can be defined as a constant, but not a constant expression:

[^constexpr]: Constant expressions potentially allow the compiler to further optimize the code for faster execution.

In [None]:
%%cpp
const auto a=rand();
a

The following will fail because the initializer is randomly drawn at runtime:

```cpp
constexpr auto a=rand();
```
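
The distinction can be made concrete with `static_assert`, which only accepts values known at compile time. This is a minimal sketch with hypothetical names (`PI_SKETCH`, `runtime_constant`), written as ordinary C++ rather than as notebook cells.

```cpp
#include <cstdlib>  // std::rand

constexpr double PI_SKETCH = 0.314e1;  // value known at compile time
static_assert(PI_SKETCH > 3.0, "a constexpr value can be checked at compile time");

// Hypothetical helper: returns a value that is constant only after it is drawn.
int runtime_constant() {
    const int a = std::rand();     // cannot be modified, but only known at run time
    // static_assert(a >= 0, "");  // would not compile: `a` is not a constant expression
    return a;
}
```

The commented-out line is the kind of use that `const` alone does not permit when the initializer is computed at run time.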

To enter a `float`, add the suffix `f` to the floating point number:

In [None]:
%%cpp
constexpr auto PI = 3.14f  // single precision instead of the default double precision

### Precision

The benefit of `float` is that it occupies less memory than `double`:

In [None]:
%%cpp
sizeof(double)

In [None]:
%%cpp
sizeof(float)

However, the smaller memory footprint comes at the cost of a lower precision. For instance, consider the [mass-energy equivalence](https://en.wikipedia.org/wiki/Mass%E2%80%93energy_equivalence):

In [None]:
%%cpp
constexpr float c = 2.99792458e8f;  // the speed of light
float m = 5, E = m*c*c;        // the mass-energy equivalence

Of course, $E=mc^2$ as verified below:

In [None]:
%%cpp
(E == m*c*c)

But $\frac{E}{c^2} \neq m$ somehow:

In [None]:
%%cpp
E/(c*c) == m

Changing the order of operations makes it work somehow:

In [None]:
%%cpp
E/c/c == m

Using `double` instead of `float` also works:

In [None]:
%%cpp
constexpr double c = 2.99792458e8;  // double instead of float
double m = 5, E = m*c*c;        // double instead of float
E == m*c*c && E/(c*c) == m

You might think that the issue has to do with very large/small numbers. The following shows that even a number close to 1, and with just one decimal place cannot be accurately represented in floating point:

In [None]:
%%cpp
cout << fixed << setprecision(20) << 1.1 << '\n';  // for double

::::{seealso} How does the above code print up to 20 decimal places?
:class: dropdown

- [`std::fixed`](https://en.cppreference.com/w/cpp/io/manip/fixed.html): This manipulator sets the output format to fixed-point notation.
- [`std::setprecision(20)`](https://en.cppreference.com/w/cpp/io/manip/setprecision.html) from [`<iomanip>`](https://en.cppreference.com/w/cpp/header/iomanip.html): This sets the number of digits after the decimal point to 20.

::::

In [None]:
%%cpp
cout << format("{:.20f}\n", 1.1f);  // for float

::::{seealso} How does the above code format the floating point number?
:class: dropdown

The second piece of code uses a [format specifier](https://en.cppreference.com/w/cpp/utility/format/spec.html) `{:.20f}`, which is a little bit cryptic but very convenient.

::::

Floating-point numbers have *limited precision*:

- Single precision is accurate typically to 6-9 decimal digits.
- Double precision is accurate typically to 15-17 decimal digits.

The precision error can accumulate differently for different operations executed in different orders.
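
Because of this, floating-point results are often compared up to a tolerance rather than with `==`. The following is a minimal sketch of one common approach, a relative comparison; the function name `nearly_equal` and the default tolerance of 8 times the machine epsilon are assumptions chosen for illustration, not a universal recommendation.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical helper: true if a and b agree up to a relative tolerance.
bool nearly_equal(float a, float b,
                  float rel_tol = 8 * std::numeric_limits<float>::epsilon()) {
    float scale = std::max(std::fabs(a), std::fabs(b));  // magnitude of the operands
    return std::fabs(a - b) <= rel_tol * scale;          // error relative to the magnitude
}
```

With such a comparison, `E/(c*c)` and `m` from the mass-energy example are treated as equal even though they may differ in the last few bits.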

The limits of `float` (and similarly `double`) can be obtained as follows from [`numeric_limits`](https://en.cppreference.com/w/cpp/types/numeric_limits.html#Member_functions):

In [None]:
%%cpp
cout << format("Smallest positive normal value: {}\n", std::numeric_limits<float>::min());
cout << format("Lowest finite value: {}\n", std::numeric_limits<float>::lowest());
cout << format("Maximum value: {}\n", std::numeric_limits<float>::max());
cout << format("Epsilon (difference between 1.0 and the next representable float): {}\n", std::numeric_limits<float>::epsilon());
cout << format("Round error: {}\n", std::numeric_limits<float>::round_error());
cout << format("Denormalized minimum value: {}\n", std::numeric_limits<float>::denorm_min());

In [None]:
%%cpp
numeric_limits<float>::min()           // the smallest positive normal value

In [None]:
%%cpp
numeric_limits<float>::lowest()        // the lowest finite value

In [None]:
%%cpp
numeric_limits<float>::max()           // the largest finite value

In [None]:
%%cpp
numeric_limits<float>::epsilon()       // the gap from 1.0 to the next value

In [None]:
%%cpp
numeric_limits<float>::round_error()   // the maximum rounding error

There are also some special values defined according to the IEEE 754 standard:

In [None]:
%%cpp
numeric_limits<float>::infinity()      // the positive infinity value

In [None]:
%%cpp
numeric_limits<float>::quiet_NaN()     // a quiet NaN value

In [None]:
%%cpp
numeric_limits<float>::signaling_NaN() // a signaling NaN value

In [None]:
%%cpp
numeric_limits<float>::denorm_min()   // the smallest positive subnormal value

Python's `float` is different from C++'s `float` because it has double precision:

In [None]:
import sys
sys.float_info

To understand the precision issue of floating point numbers, play with a simulator:

- [IEEE 754 Floating Point Converter](https://www.h-schmidt.net/FloatConverter/IEEE754.html)
- [Float Toy](https://evanw.github.io/float-toy/)

The following is an IEEE 754 simulator written in Python:

In [None]:
@interact(x=FloatLogSlider(
    value=1,        # Initial value of the slider
    base=2,         # Base of the logarithm (e.g., 10 for base-10 log)
    min=-1023-52,
    max=1023,
    step=1,
    description='x' # Label for the slider
))
def double2binary(x):
    # Convert the double to its binary representation
    binary = f"{unpack('>Q', pack('>d', x))[0]:064b}"
    
    # Extract sign, exponent, and mantissa
    sign = binary[0]
    exponent = binary[1:12]
    mantissa = binary[12:]
    
    # Convert exponent and mantissa to decimal
    sign_val = int(sign, 2)
    exponent_val = int(exponent, 2)
    mantissa_val = int(mantissa, 2)/2**52
    
    # Create color-coded HTML output 
    html_output = (
        f"Binary: "
        f"<span style='color:red;'>{sign}</span>"
        f"<span style='color:green;'>{exponent}</span>"
        f"<span style='color:blue;'>{mantissa}</span><br>"
    ) 
    html_output += (
        f"$(-1)^{{\\color{{red}}{sign_val}}}\\times "
        f"2^{{{{\\color{{green}}{exponent_val}}}-1023}}\\times "
        f"(1+{{\\color{{blue}}\\text{{{mantissa_val}}}}})$"
    ) if exponent_val < 2047 else (
        r"NaN" if mantissa_val > 0 else (
            f"${('', '-')[sign_val]} \\infty$"
        )
    )

    display(HTML(html_output))

In [None]:
double2binary(float('inf'))

In [None]:
double2binary(-float('inf'))

In [None]:
double2binary(float('nan'))

::::{exercise}
:label: ex:max_double

Explain why the following expressions evaluate to `true`.

::::

In [None]:
%%cpp
const double m=1e16;
m - 1 == m

In [None]:
%%cpp
const double m=1e100;
m*m*m*m == m*m*m*m*m*m*m*m*m*m*m*m*m*m*m*m*m*m*m*m

YOUR ANSWER HERE

::::{exercise}
:label: ex:nan

Explain why the mass of an atom is not equal to itself.

::::

In [None]:
%%cpp
constexpr float mass_of_universe = 1.45e53;
constexpr float num_of_atoms = 1e80;
const float mass_of_atom = mass_of_universe/num_of_atoms;
(mass_of_atom == mass_of_atom)

YOUR ANSWER HERE

::::{note}

`mass_of_atom` is declared with the [`const` type qualifier](https://en.cppreference.com/w/c/language/const.html) instead of `constexpr` because its value is not known at compile time even if it is expected to be a constant.

::::

## Scope

The access of a variable is restricted to its [scope](https://en.cppreference.com/w/cpp/language/scope.html). For instance, in [](#code_gcd1), `a` and `b` can be accessed anywhere since they have global scope, but `c` can only be accessed within the [compound statement](https://en.cppreference.com/w/cpp/language/statements.html#Compound_statements) enclosed by the braces `{ ... }`, which creates a [block scope](https://en.cppreference.com/w/cpp/language/scope.html#Block_scope).

```cpp
{
  ...
  int c = b;
  ... // c visible here
} // c is out of scope
```

::::{caution} Redeclarations of a variable

You might wonder why we use a compound statement. This is because, even though the `cling` interpreter allows redeclarations of a variable in separate runs, C++ does not allow redeclaring a variable within the same scope. The code inside the block needs to be executed repeatedly to produce the final result.

::::

C++ follows [lexical scoping](https://en.wikipedia.org/wiki/Scope_(computer_science)) to access variables defined in the closest enclosing scope. To understand how this works, consider the following example:

In [None]:
%%cpp
// global scope
int a;
{ // block scope level 1
    { // block scope level 2
        int a;
        { // block scope level 3
            cout << "Level 3: a=" << a << '\n';
        }       
    }
    cout << "Level 1: a=" << a << '\n';
}

- The first `cout` in level 3 accesses `a` defined in level 2, which shadows the variable `a` in the global scope. Note that local variables are not initialized to `0` by default.
- The second `cout` in level 1 accesses `a` defined in the global scope, which is initialized to `0` by default. `a` defined in level 2 is out of the scope of level 1.
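
The lookup rule can also be sketched with initialized variables, so that the result does not depend on uninitialized values. The names `level`, `innermost_lookup`, and `global_lookup` are hypothetical. The sketch also uses the scope resolution operator `::`, which is not covered above, to reach the global variable while it is shadowed.

```cpp
int level = 0;  // hypothetical global variable

int innermost_lookup() {
    int level = 2;        // shadows the global `level`
    {
        return level;     // the closest enclosing declaration wins: 2
    }
}

int global_lookup() {
    int level = 2;        // still shadows the global inside this function
    (void)level;          // silences unused-variable warnings
    return ::level;       // `::` names the global explicitly: 0
}
```

Calling `innermost_lookup()` gives `2`, while `global_lookup()` gives `0`, even though both functions declare a local `level`.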

In C++, a variable is not merely a name; it is a named container whose size is determined by its type. In comparison, Python is [dynamically typed](https://en.wikipedia.org/wiki/Type_system#DYNAMIC). Instead of a memory location, a variable in Python can be considered simply as a name of an object. In particular, the memory locations of different variables can be the same:

In [None]:
a = b = 1
print(f"a={a} @ {id(a):#x}")
print(f"b={b} @ {id(b):#x}")

The above uses `id`, which in CPython returns the memory location of its argument. The assignment above creates aliases: `a` and `b` are different names that refer to the same memory location.

::::{tip}

To learn more about a Python function, we can use the contextual help by placing the cursor over a function name and 
- click the menu item `Help`$\to$`Show Contextual Help` or
- press the short-cut key <kbd>Shift + Tab</kbd>.

::::

For C++, different variables *normally* have different memory locations even if they have the same value.

In [None]:
%%cpp
int a=1, b=a;
cout << format("a={} @ ", a) << &a << '\n';
cout << format("b={} @ ", b) << &b << '\n';

The above code uses the [address-of operator `&`](https://en.cppreference.com/w/cpp/language/operator_member_access.html#Built-in_address-of_operator), which returns the address of type `int *`:

In [None]:
%%cpp
&a

Variables with the same name also have different memory locations:

In [None]:
%%cpp
int a=1;
{
    int a=++a;
    cout << format("a={} @ {:p}\n", a, static_cast<void*>(&a));
}
cout << format("a={} @ {:p}\n", a, static_cast<void*>(&a));

The above code uses `static_cast<void*>` to convert `&a` from type `int*` to type `void*` so it can be formatted as an address with the format specifier `{:p}`.

::::{caution} Shouldn't the code print `a=2 @ ...` first?
:class: dropdown

`++a` actually uses the already declared `a` in the block scope, whose value is uninitialized and therefore may not be `1`.

::::

We can store the address using a variable known as a [pointer](https://en.cppreference.com/w/cpp/language/pointer.html):

In [None]:
%%cpp
int* p=&a;
cout << format("a={}\n", *p);

`*p` above uses the [indirection/dereference operator `*`](https://en.cppreference.com/w/cpp/language/operator_member_access.html#Built-in_indirection_operator) to access the value that `p` points to. Indeed, since it is far more common to operate on `*p` instead of `p`, the declaration of multiple pointers requires specifying `*` for each pointer:

In [None]:
%%cpp
int *p=&a, *q=&b;
cout << format("a={}, b={}\n", *p, *q);

The default initialization for pointers with static storage duration is `nullptr` or `0`, referred to as the null pointer. The value indicates that the pointer does not point to any object, so dereferencing it leads to an error:

In [None]:
%%cpp
int *p, *q=0, *r=nullptr;  // global p is initialized to nullptr by default
p==q && q==r               // nullptr has an integer value 0
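
Since a null pointer must not be dereferenced, code that receives a pointer often checks it first. The following is one possible sketch of that check. The helper `value_or` and the two demo functions are hypothetical names introduced only for illustration.

```cpp
// Hypothetical helper: reads through `p` only if it points to an object.
int value_or(const int* p, int fallback) {
    return (p != nullptr) ? *p : fallback;
}

int demo_valid_pointer() {
    int x = 7;
    return value_or(&x, -1);   // p points to x: 7
}

int demo_null_pointer() {
    int* p = nullptr;
    return value_or(p, -1);    // p is null, so the fallback is used: -1
}
```

Note that the check only guards against null pointers. It cannot detect an uninitialized pointer that holds an arbitrary address.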

::::{caution}

Using an uninitialized pointer in non-static storage duration is unsafe. For instance:

```cpp
{
    int *p;
    cout << *p;  // 👨🏻‍🏫 ❌ Undefined behavior
    *p = 1;      // 👨🏻‍🏫 ❌ Dangerous: p points to an arbitrary memory location
}

```

::::

In [None]:
%%cpp
{
    int *p;
    cout << *p;
    // 😈: If you never try, you'll never know.
    // *p = 1;
}

The use of pointers can get rather complicated, as can be seen in [](#code_fast_inv_sqrt1_guess). The idea can be used to write an IEEE 754 simulator in C++:

In [None]:
%%cpp
double x=1.;
auto i = reinterpret_cast<int64_t *>(&x);
cout << format("{:064b}", *i);

The above prints the binary representation of the integer `*i`, which is also the binary representation of the floating point number `x`.
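
Going one step further, the bit pattern can be split into the sign, exponent, and mantissa fields, mirroring what the Python simulator does. The sketch below is only one possible approach. It assumes that `double` is the 64-bit IEEE 754 format, and the names `DoubleFields` and `split_double` are hypothetical. It copies the bytes with `std::memcpy` instead of `reinterpret_cast`, because reading a `double` through an `int64_t*` is technically undefined behavior under the strict aliasing rules, even though it works on common compilers.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes a 64-bit double");

// Hypothetical container for the three IEEE 754 fields of a double.
struct DoubleFields {
    unsigned sign;           // 1 bit
    unsigned exponent;       // 11 bits, biased by 1023
    std::uint64_t mantissa;  // 52 bits
};

DoubleFields split_double(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // copy the raw bytes of x
    DoubleFields f;
    f.sign = static_cast<unsigned>(bits >> 63);
    f.exponent = static_cast<unsigned>((bits >> 52) & 0x7FF);
    f.mantissa = bits & ((std::uint64_t{1} << 52) - 1);
    return f;
}
```

For example, `split_double(1.0)` has a biased exponent of `1023` and a zero mantissa, and `split_double(-2.0)` has the sign bit set with a biased exponent of `1024`.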

The effect of aliasing can also be achieved using pointers.

In [None]:
%%cpp
int a=1, *b=&a;
cout << format(" a={} @ {:p}\n", a++, static_cast<void*>(&a));
cout << format("*b={} @ {:p}\n", *b, static_cast<void*>(b));

Aliasing can also be achieved in C++ using an [*lvalue (locator value) reference*](https://en.cppreference.com/w/cpp/language/reference.html) such as `int &`:

In [None]:
%%cpp
int a=1, &b=a;
cout << format("a={} @ {:p}\n", a++, static_cast<void*>(&a));
cout << format("b={} @ {:p}\n", b, static_cast<void*>(&b));

The declaration `int &b=a;` binds the address of `a` to `b`, so both `a` and `b` share the same memory location. The behavior of aliasing in Python is still very different:

In [None]:
b = a = 1  # a and b have the same memory location
a = a + 1  # a incremented to 2
b          # b is also 2? a and b have the same memory location right?

::::{caution} Why is `b` not equal to `2`?
:class: dropdown

`b` is still `1` because it is not an alias of `a`, but rather, an alias of the object/integer `1`. 
- `a = a + 1` in Python rebinds the name `a` to a new value `a + 1`, without changing the value `1` that `b` points to.
- `a++` in C++ increments the value `a`, which is also the value of `b`.

::::
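
On the C++ side, the difference between an alias and a copy can be sketched as follows. The function names are hypothetical and chosen only for illustration.

```cpp
// b is a reference: another name for the same object as a.
int read_alias_after_increment() {
    int a = 1;
    int& b = a;
    a++;
    return b;   // 2, since b and a are the same object
}

// b is a separate variable initialized with a copy of a's value.
int read_copy_after_increment() {
    int a = 1;
    int b = a;
    a++;
    return b;   // 1, since b has its own memory location
}
```

The copy version behaves like the Python example in that `b` keeps the value `1`, although the underlying mechanism is different.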

In C++, an integer such as `1` also has an associated memory location, but it is temporary and optimized heavily by the compiler in a way not suitable for users to modify. `int &b=1;` fails because, otherwise, the location of `1` would be exposed to the users to modify. We say that `1` is not an lvalue, but rather, an rvalue, or more specifically, a [prvalue](https://en.cppreference.com/w/cpp/language/value_category.html#prvalue).

It is okay, however, if `b` is declared as a constant:

In [None]:
%%cpp
const int &b=1

The compiler can safely extend the lifetime of the prvalue as long as `b` is in scope, without concern about modifications or interference with optimizations for temporary objects. Such a constant reference will be useful when passing larger objects around without the extra copy step as in the copy assignment.
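
As a rough illustration of passing a larger object without copying, the sketch below compares addresses. The type `Big` and all function names are hypothetical. Functions are not introduced formally at this point, so treat this as a preview rather than a definitive pattern.

```cpp
struct Big { int data[1000]; };  // hypothetical large object

// The parameter is a constant reference: no copy is made.
bool by_const_ref_is_same_object(const Big& param, const Big* original) {
    return &param == original;
}

// The parameter is passed by value: it is a separate copy.
bool by_value_is_same_object(Big param, const Big* original) {
    return &param == original;
}

bool demo_const_ref() {
    Big big{};
    return by_const_ref_is_same_object(big, &big);  // true
}

bool demo_by_value() {
    Big big{};
    return by_value_is_same_object(big, &big);      // false
}
```

With the constant reference, the function sees the caller's object itself but cannot modify it. With pass-by-value, all 1000 integers are copied into a new object.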

::::{exercise}
:label: ex:pre_vs_post

Why does `++a=1` work but `(a++)=1` fail?

::::

::::{solution} ex:pre_vs_post
:class: dropdown

This is because `++a` returns an lvalue reference to `a`, but `(a++)` returns an rvalue, namely, a temporary copy of the value of `a` before the increment.

::::
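
The value categories of the two increment operators can also be checked at compile time. The following is a sketch that relies on `decltype`, `std::is_same`, and `std::declval` from the standard library, none of which are covered above. The function name `assign_through_preincrement` is hypothetical.

```cpp
#include <type_traits>
#include <utility>

// decltype of an lvalue expression of type int is int&.
static_assert(std::is_same<decltype(++std::declval<int&>()), int&>::value,
              "pre-increment yields an lvalue");
// decltype of a prvalue expression of type int is int.
static_assert(std::is_same<decltype(std::declval<int&>()++), int>::value,
              "post-increment yields a prvalue");

int assign_through_preincrement() {
    int a = 5;
    int& r = ++a;  // r is another name for a, which now holds 6
    r = 1;         // assigning through r changes a
    return a;      // 1
}
```

Binding `int& r = a++;` instead would fail to compile, for the same reason that `int &b=1;` fails.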

In [None]:
%%ai
Explain the value categories of C++ with concrete examples.

::::{exercise}
:label: ex:uninit

Since different variables occupy different memory locations, can we modify [](#code_gcd1) as follows to avoid overwriting the original `a` and `b`?

```cpp
{
    int a=a, b=b;
    ...
}
```

Why or why not?

::::

YOUR ANSWER HERE

As a hint, try running the following program to check whether the local variables are actually clones of the global variables:

In [None]:
%%cpp
int a=2*3*4, b=3*4*5;
{
    int a=a, b=b;
    cout << format("a={}\nb={}\n", a, b);  // local clone of global a and b?
}
cout << format("a={}\nb={}\n", a, b);  // global a and b not overwritten

## String

How to represent a piece of text, which consists of a sequence of characters?

A string literal is delimited by double quotes:

In [None]:
%%cpp
"15"

Note that the data type is not `string` but `const char[3]`: a constant array of 3 `char` values. Such an array of characters is referred to as a *C string*, even though there isn't actually a distinct `cstring` data type in C.

Why are there 3 characters instead of 2? It's almost as if there's a hidden element at play. We can inspect the elements of the array using a [member access operator like `[]`](https://en.cppreference.com/w/cpp/language/operator_member_access.html), but what secrets might it unveil?

In [None]:
%%cpp
"15"[0] // [0] picks out the first character at index 0

In [None]:
%%cpp
"15"[1] // the second character

Lo and behold 🪄:

In [None]:
%%cpp
"15"[2] // the third character

The last character is the *null character*, which can also be entered as `'\0'`.

In [None]:
%%cpp
'\0'

C strings are *null-terminated* to ensure they are [(uniquely) decodable](https://en.m.wikipedia.org/wiki/Variable-length_code#Uniquely_decodable_codes). In other words, the null character signals the end of a string without needing to keep track of its length. What a clever mechanism!
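
To see how the terminator alone is enough to find the end of a string, here is a minimal sketch of a length function that scans for the null character. The name `count_until_null` is hypothetical. In practice, the standard library already provides `std::strlen` for this purpose.

```cpp
#include <cstddef>

// Counts the characters before the first null character.
// Assumes s points to a null-terminated array of char.
std::size_t count_until_null(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}
```

For instance, `count_until_null("15")` gives `2`: the terminator itself is not counted.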

Can you print a string containing `\0` in the middle?

In [None]:
%%cpp
cout << "321\0 123";

In [None]:
%%cpp
cout << "321\0123";

::::{exercise}
:label: ex:0_in_middle

Explain what gets printed above. 

::::

YOUR ANSWER HERE

::::{exercise}
:label: ex:index_OOB

Why does the following code even run in C++?

:::{hint}

Try running the code in Python.

:::

::::

In [None]:
%%cpp
"15"[3]

YOUR ANSWER HERE

The length of a C string can be returned using [`std::strlen`](https://en.cppreference.com/w/cpp/string/byte/strlen.html) from [`<cstring>`](https://en.cppreference.com/w/cpp/header/cstring.html):

In [None]:
%%cpp
strlen("15")

Note that the return type is `size_t`, which is an unsigned integer type. Failure to understand the implication could lead to a logical error like the following:

In [None]:
%%cpp
strlen("Ava") - strlen("Betty") > 0

In [None]:
%%ai
In C++, should `strlen("Ava") - strlen("Betty") > 0` return false since 
"Ava" is shorter than "Betty"?
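
One way to sidestep the issue is to compare the two lengths directly, or to convert them to a signed type before subtracting. The sketch below shows both options under the assumption that the lengths fit in `std::ptrdiff_t`. The function names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Compares the lengths without subtracting unsigned values.
bool is_longer(const char* x, const char* y) {
    return std::strlen(x) > std::strlen(y);
}

// Converts to a signed type first, so the difference can be negative.
std::ptrdiff_t length_difference(const char* x, const char* y) {
    return static_cast<std::ptrdiff_t>(std::strlen(x))
         - static_cast<std::ptrdiff_t>(std::strlen(y));
}
```

With these helpers, `is_longer("Ava", "Betty")` is `false` and `length_difference("Ava", "Betty")` is `-2`.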

::::{exercise}
:label: ex:unsigned

What is the maximum possible length for a C string?

::::

YOUR ANSWER HERE

::::{exercise}
:label: ex:max_length

What is the maximum possible length for a C string?

::::

YOUR ANSWER HERE

In [None]:
%%ai
Since the length of a C string is recorded in its type, why do we need to 
null-terminate it? Isn't it a waste of memory to store the null character?

Another frustrating gotcha is that the length of a string is **not** the same as the number of characters or symbols in the string! Consider the length of ["思"](https://translate.google.com/?sl=yue&tl=en&text=%E6%80%9D&op=translate):

In [None]:
%%cpp
strlen("思")

Why is the length 3 instead of 1?

In [None]:
%%cpp
"思"[0]

In [None]:
%%cpp
"思"[1]

In [None]:
%%cpp
"思"[2]

Finally:

In [None]:
%%cpp
"思"[3]

The three characters before the null terminator are actually the three bytes that correspond to the [UTF-8 encoding of "思"](https://www.compart.com/en/unicode/U+601D). UTF-8 is a [variable-length encoding](https://en.wikipedia.org/wiki/Variable-length_code) designed to represent [Unicode characters](https://en.wikipedia.org/wiki/Unicode), including those that go beyond the ASCII character set.

For example, the emoji below requires more bytes to encode:

In [None]:
%%cpp
"😎"

A Unicode symbol can also be encoded differently using [UTF-16](https://en.wikipedia.org/wiki/UTF-16):

In [None]:
%%cpp
u"思"

In [None]:
%%cpp
u"思"[0]

In [None]:
%%cpp
u"思"[1]

In [None]:
%%cpp
"\u601D" // escape sequence to enter a symbol by its Unicode code point U+601D

In [None]:
%%ai
In C++, how to return the number of symbols in a C string, given that the
symbols may be encoded using a variable-length code.
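
For UTF-8 specifically, one common approach is to count every byte that is not a continuation byte, i.e., not of the form `10xxxxxx` in binary. The sketch below assumes the input is valid, null-terminated UTF-8 and counts Unicode code points, which may still differ from what a reader perceives as one symbol (for example, an emoji built from several code points). The name `count_code_points` is hypothetical.

```cpp
#include <cstddef>

// Counts the Unicode code points in a null-terminated UTF-8 string
// by skipping continuation bytes of the form 10xxxxxx.
// Assumes the input is valid UTF-8.
std::size_t count_code_points(const char* s) {
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        // Convert to unsigned char so that the bit mask works as intended
        // even if char is signed.
        if ((static_cast<unsigned char>(*s) & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

For the three-byte sequence `"\xE6\x80\x9D"`, which is the UTF-8 encoding of U+601D, the function gives `1` even though `strlen` gives `3`.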

There are [other types of string literals](https://en.cppreference.com/w/cpp/language/string_literal.html) in C++, but their support varies. For example, `strlen(u"1")` or `cout << u8"1";` will fail because these functions and operators do not handle UTF-16 or UTF-8 encoded strings directly. The exception is raw string literals, which ensure [WYSIWYG](https://en.wikipedia.org/wiki/WYSIWYG) (to some extent):

In [None]:
%%cpp
cout << R"(
  ____  ____   ____   _____  _   ___  
 / ___|/ ___| |___ \ |___ / / | / _ \
| |    \___ \   __) |  |_ \ | || | | |
| |___  ___) | / __/  ___) || || |_| |
 \____||____/ |_____||____/ |_| \___/ 😎
)";

What does it mean to be a constant array?

Consider the following code:

In [None]:
%%cpp
auto s="15";
s = "35";       // works
// s[0] = '1';  // fails

::::{caution} Why is it okay to change all the characters as in `s="35";` but not the first as in `s[0]='1';`?
:class: dropdown

`s` is indeed not a constant. It is a pointer that points to characters in read-only memory. 

- `s="35"` is allowed since it changes the value of the pointer to point to a new array of characters.
- `s[0]='1'` is not allowed since it attempts to rewrite a character in read-only memory.

::::
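
If a modifiable string is needed, one option is to initialize a `char` array from the literal, which copies the characters (including the null terminator) into writable memory. The following sketch uses a hypothetical function name and `std::strcmp` from `<cstring>` to check the result.

```cpp
#include <cstring>

// Copies the literal into a writable array and changes its first character.
bool overwrite_first_char() {
    char copy[] = "15";   // array of 3 char initialized from the literal
    copy[0] = '3';        // allowed: copy is not const
    return std::strcmp(copy, "35") == 0;   // true: strcmp returns 0 for equal strings
}
```

Unlike the pointer `s` above, the array `copy` cannot be reassigned to another literal, but its elements can be changed individually.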

::::{exercise} 
:label: ex:auto_ref

Why does the following code work even though `auto &b=1;` fails? Can we further assign `s` to another string as in `s="35";`?

::::

In [None]:
%%cpp
auto &s="15";

YOUR ANSWER HERE