# Chapter 1. Basics

## Python numerical types

### Decimal type

For applications that require exact arithmetic on decimal numbers, use the `Decimal` type from the `decimal` module in the Python Standard Library:

In [1]:
from decimal import Decimal
num1 = Decimal('1.1')
num2 = Decimal('1.563')
assert Decimal('2.663') == num1 + num2

Certain numbers such as `0.1` cannot be represented exactly using a finite sum of powers of 2. `0.1` has a binary expansion `0.000110011...`, which does not terminate. Any floating-point representation of this number will therefore carry a small error.

In [2]:
assert 2.663 != 1.1 + 1.563
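
The representation error can be made visible by converting a float directly to `Decimal`, which preserves the exact binary value that the float stores. The following is a minimal sketch; the variable names are illustrative only.

```python
from decimal import Decimal

# Constructing from a float captures the exact value of the nearest binary
# float to 0.1, which is slightly different from the decimal string '0.1'.
from_float = Decimal(0.1)
from_string = Decimal('0.1')
assert from_float != from_string

# The small errors surface in ordinary float arithmetic...
float_sum_matches = (0.1 + 0.2 == 0.3)
assert not float_sum_matches

# ...but not when the operands are built from decimal strings.
decimal_sum_matches = (Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))
assert decimal_sum_matches
```

This is why the `Decimal` values in this section are constructed from strings rather than from float literals.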

The `decimal` module also provides a `Context` object, which allows fine-grained control over the precision, display, and attributes of `Decimal` objects:

In [3]:
assert Decimal('1.4641') == num1**4

from decimal import localcontext
with localcontext() as ctx:
    ctx.prec = 3
    assert Decimal('1.46') == num1**4

> When we set the precision to `3`, rather than the default `28`, we see that the fourth power of 1.1 is rounded to three significant figures.

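The `localcontext` function returns a context manager that makes a copy of the current context active for the duration of the `with` block. The copy can be freely modified inside the block, and the previous context is restored when the block exits.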
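
As a rough check of this behaviour, the sketch below records the active precision before, inside, and after a `with` block. It relies on `decimal.getcontext`, which returns the currently active context; the variable names are illustrative.

```python
from decimal import Decimal, getcontext, localcontext

prec_before = getcontext().prec        # whatever precision is currently active

with localcontext() as ctx:
    ctx.prec = 3                       # only affects the temporary context
    prec_inside = getcontext().prec
    rounded = Decimal('1.1') ** 4      # computed with 3 significant figures

prec_after = getcontext().prec         # previous context is active again

assert prec_inside == 3
assert rounded == Decimal('1.46')
assert prec_after == prec_before
```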

### Fraction type

The `Fraction` type from the `fractions` module in the Python Standard Library simply stores two integers (the numerator and the denominator), and arithmetic is performed using the basic rules for arithmetic of fractions.

In [4]:
from fractions import Fraction
fr1 = Fraction(1, 3)
fr2 = Fraction(1, 7)
assert Fraction(10, 21) == fr1 + fr2
assert Fraction(4, 21) == fr1 - fr2
assert Fraction(1, 21) == fr1 * fr2
assert Fraction(7, 3) == fr1 / fr2
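
To illustrate the rules mentioned above, the following sketch applies the addition rule $\frac{a}{b} + \frac{c}{d} = \frac{ad + cb}{bd}$ by hand and compares the result with `Fraction` arithmetic. The names `a`, `b`, `c`, `d`, and `by_hand` are chosen for this example only.

```python
from fractions import Fraction

# Fractions are reduced to lowest terms when they are constructed.
half = Fraction(6, 12)
assert (half.numerator, half.denominator) == (1, 2)

# The addition rule a/b + c/d = (a*d + c*b) / (b*d), applied by hand.
a, b = 1, 3
c, d = 1, 7
by_hand = Fraction(a * d + c * b, b * d)
assert by_hand == Fraction(1, 3) + Fraction(1, 7)

# A decimal string is converted exactly, so ten tenths make exactly one.
tenth = Fraction('0.1')
assert tenth * 10 == 1
```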

## Basic mathematical functions

The `math` module in the Python Standard Library provides all of the standard mathematical functions, along with common constants and some utility functions:

In [5]:
import math
assert 2.0 == math.sqrt(4)
assert math.isclose(math.sin(math.pi/4), math.cos(math.pi/4))
assert math.isclose(1.0, math.tan(math.pi/4))

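The `log` function in the `math` module computes logarithms. With one argument it returns the natural logarithm; an optional second argument specifies the base.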

$$
\begin{aligned}
y = \log_b{x} &\iff x = b^y\\
y = \ln{(x)} &\iff x = e^y
\end{aligned}
$$

Here, the constant $e$ is the base of the natural logarithm, sometimes known as [Napier's constant](https://mathworld.wolfram.com/e.html). The constant can be accessed using `math.e`.

In [6]:
assert math.log(10) == math.log(10, math.e)
assert math.log(100, 10) == 2.0
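
The two-argument form corresponds to the change-of-base identity $\log_b{x} = \ln{x} / \ln{b}$. The sketch below compares that identity with the dedicated functions `math.log10` and `math.log2`, which are not used elsewhere in this chapter. Results are compared with `math.isclose` because a quotient of two rounded logarithms may differ from the exact value in the last bit.

```python
import math

x = 1000.0

# Change of base: log_10(x) = ln(x) / ln(10).
via_change_of_base = math.log(x) / math.log(10)

# Dedicated functions for the common bases 10 and 2.
via_log10 = math.log10(x)
via_log2 = math.log2(8)

assert math.isclose(via_change_of_base, 3.0)
assert math.isclose(via_log10, 3.0)
assert math.isclose(via_log2, 3.0)
```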

In addition, the `math` module contains various number-theoretic and combinatorial functions.

The `comb(n, k)` function returns the number of ways to choose `k` items from a collection of `n` without repeats if order is not important. This number is sometimes written $\displaystyle {^n}C_k = \binom{n}{k} = \frac{n!}{k!(n-k)!}$.

In [7]:
assert 10 == math.comb(5, 2)

The `factorial(n)` function returns the factorial
$n! = n(n-1)(n-2)\cdots 1$.

In [8]:
assert 120 == math.factorial(5)
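
The formula for $\binom{n}{k}$ given earlier can be checked directly against `math.comb`. The sketch below also computes the number of ordered selections, $n!/(n-k)!$, and compares it with `math.perm` (available in Python 3.8 and later); the variable names are illustrative.

```python
import math

n, k = 5, 2

# Combinations from the factorial formula n! / (k! (n-k)!).
comb_by_formula = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
assert comb_by_formula == math.comb(n, k)

# Ordered selections (permutations): n! / (n-k)!.
perm_by_formula = math.factorial(n) // math.factorial(n - k)
assert perm_by_formula == math.perm(n, k)
```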

There are also a number of functions for working with floating-point numbers. Because `0.1` carries a small error, naive left-to-right summation accumulates rounding error, so the built-in `sum` function is not guaranteed to return the correctly rounded sum of the values in an iterable. `math.fsum` avoids this loss of precision by tracking multiple intermediate partial sums.

In [9]:
nums = [0.1]*10    # a list containing 0.1 ten times
assert math.isclose(1.0, sum(nums))
assert 1.0 == math.fsum(nums)
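
The accumulated error can be seen by writing the naive summation as an explicit loop. The loop is written out by hand here because newer Python releases may use a more accurate algorithm inside `sum` for floats; the name `running_total` is illustrative.

```python
import math

nums = [0.1] * 10

# Naive left-to-right accumulation: each addition rounds the running total.
running_total = 0.0
for value in nums:
    running_total += value

# The rounding errors do not cancel, so the result is not exactly 1.0 ...
assert running_total != 1.0
assert math.isclose(running_total, 1.0)

# ... whereas fsum returns the correctly rounded sum.
assert math.fsum(nums) == 1.0
```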

The `math` module is a good choice if you need to apply a function to a relatively small collection of numbers. If you want to apply these functions to a large collection of data simultaneously, it is better to use their equivalents from the NumPy package, which are more efficient for working with arrays.
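
As a small illustration of the difference in usage (not a benchmark), the sketch below applies a square root to each element of a list with `math.sqrt`, and to a whole array at once with `numpy.sqrt`. The sample values are arbitrary.

```python
import math
import numpy as np

values = [0.0, 1.0, 4.0, 9.0]

# math.sqrt accepts one number at a time, so a Python-level loop is needed.
roots_math = [math.sqrt(v) for v in values]

# numpy.sqrt is applied to the whole array in a single call.
roots_numpy = np.sqrt(np.array(values))

assert np.allclose(roots_numpy, roots_math)
```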

## NumPy arrays

In [10]:
import numpy as np

The basic type provided by the NumPy library is the `ndarray` type (henceforth referred to as a NumPy array).

- The NumPy array type is a Python wrapper around an underlying C array structure. The array operations are implemented in C and optimized for performance.
- NumPy arrays must consist of homogeneous data: every element has the same type (which may be a pointer to an arbitrary Python object).
- Under the hood, a NumPy array of any shape is a buffer containing the raw data as a flat (one-dimensional) array, and a collection of additional metadata that specifies details such as the type of the elements.
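
The metadata described in the list above can be inspected through attributes of the array. The following sketch uses an explicitly 64-bit integer array so that the byte counts are predictable; the array contents are arbitrary.

```python
import numpy as np

arr = np.arange(6, dtype=np.int64).reshape(2, 3)

# Metadata stored alongside the flat data buffer.
assert arr.dtype == np.int64       # type of every element
assert arr.shape == (2, 3)         # length along each dimension
assert arr.itemsize == 8           # bytes per element
assert arr.strides == (24, 8)      # bytes to step along each dimension

# The underlying data viewed as a flat, one-dimensional array.
flat = arr.ravel()
assert flat.tolist() == [0, 1, 2, 3, 4, 5]

# An object array is still homogeneous: every element is a pointer
# to some Python object.
mixed = np.array([1, 'two', None], dtype=object)
assert mixed.dtype == object
```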

NumPy provides a number of [universal functions](https://numpy.org/doc/stable/user/basics.ufuncs.html).

- A universal function (or ufunc for short) is a function that operates on ndarrays in an element-by-element fashion, supporting array broadcasting, type casting, and several other standard features.
- That is, a ufunc is a “vectorized” wrapper for a function that takes a fixed number of specific inputs and produces a fixed number of specific outputs.
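
For instance, `numpy.add` is a ufunc that takes two inputs and produces one output. The sketch below checks this through the `nin` and `nout` attributes and shows element-wise operation together with type casting.

```python
import numpy as np

# numpy.add is an instance of numpy.ufunc with two inputs and one output.
is_ufunc = isinstance(np.add, np.ufunc)
assert is_ufunc
assert (np.add.nin, np.add.nout) == (2, 1)

# Element-by-element operation on array-like inputs.
summed = np.add([1, 2, 3], [10, 20, 30])
assert summed.tolist() == [11, 22, 33]

# Type casting: an integer array plus a Python float gives a float64 result.
shifted = np.add(np.array([1, 2, 3]), 0.5)
assert shifted.dtype == np.float64
```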

*Broadcasting*: Each universal function takes array inputs and produces array outputs by performing the core function element-wise on the inputs (where an element is generally a scalar, but can be a vector or higher-order sub-array for generalized ufuncs). When the inputs have different but compatible shapes, they are first broadcast to a common shape.

In [11]:
assert np.all([0,1,2,3] == np.arange(4))

In [12]:
assert np.all([1,1,1,1] == np.ones((1, 4)))

In [13]:
assert np.all([-1,0,1,2] == np.arange(4) - np.ones((1, 4)))
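
Broadcasting also works when both inputs need to be stretched. In the sketch below, a column of shape (3, 1) and a row of shape (4,) are combined into a result of shape (3, 4); the names `col`, `row`, and `table` are illustrative.

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1): [[0], [1], [2]]
row = np.arange(4)                 # shape (4,):   [0, 1, 2, 3]

# Each input is stretched along its length-1 (or missing) dimension.
table = col * 10 + row

assert table.shape == (3, 4)
assert np.all(table == [
    [ 0,  1,  2,  3],
    [10, 11, 12, 13],
    [20, 21, 22, 23]
])
```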

### Array creation

There are six general mechanisms for [creating arrays](https://numpy.org/doc/stable/user/basics.creation.html):

1. conversion from other Python structures (e.g., lists and tuples)
   - NumPy arrays can be defined using Python sequences such as lists `[...]` and tuples `(...)`
   - use the `numpy.array` function to define a new array
   - the `dtype` of elements in the array can be specified explicitly
2. intrinsic NumPy array creation functions (e.g., `arange`, `ones`, `zeros`, etc.)
3. replicating, joining, or mutating existing arrays
4. reading arrays from disk, either from standard or custom formats
   - this is the most common case of large array creation
   - the [details](https://numpy.org/doc/stable/user/how-to-io.html#how-to-io) depend greatly on the format of data on disk
5. creating arrays from raw bytes through the use of strings or buffers
   - if the file has a relatively simple format, one can write a simple I/O library and use the NumPy `fromfile()` function and `.tofile()` method to read and write NumPy arrays directly (mind your byte order though!)
6. use of special library functions (e.g., `random`)
   - many Python libraries, including SciPy, Pandas, and OpenCV, use NumPy ndarrays as the common format for data exchange
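
Several of the mechanisms listed above can be sketched in a few lines. Reading from disk is omitted because it depends on a file being present; the byte string and the seed `0` are arbitrary choices for this example.

```python
import numpy as np

# Conversion from a Python list, with an explicit dtype.
from_list = np.array([1, 2, 3], dtype=np.float64)
assert from_list.dtype == np.float64

# Intrinsic creation function.
intrinsic = np.zeros(3)

# Joining existing arrays.
joined = np.concatenate([from_list, intrinsic])
assert joined.tolist() == [1.0, 2.0, 3.0, 0.0, 0.0, 0.0]

# Creation from raw bytes held in a buffer.
from_bytes = np.frombuffer(b'\x01\x02\x03', dtype=np.uint8)
assert from_bytes.tolist() == [1, 2, 3]

# Special library function: random values from a seeded generator.
from_random = np.random.default_rng(0).random(3)
assert from_random.shape == (3,)
```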

### Intrinsic NumPy array creation functions

These functions can be split into roughly three categories:

- 1D array creation
- 2D array creation
- *n*-dimension array creation

#### 1D array creation

The 1D array creation functions, e.g. `numpy.linspace` and `numpy.arange`, generally need at least two inputs, `start` and `stop`.

`numpy.arange` creates arrays with regularly incrementing values, starting from `start` and continuing up to, but not including, `stop`.
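
A short sketch of `numpy.arange` with and without an explicit step follows; the particular bounds are arbitrary.

```python
import numpy as np

# start=2, stop=10 (excluded), step=2
evens = np.arange(2, 10, 2)
assert np.all(evens == [2, 4, 6, 8])

# With a single argument, start defaults to 0 and step to 1.
first_five = np.arange(5)
assert np.all(first_five == [0, 1, 2, 3, 4])
```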

`numpy.linspace` creates arrays with a specified number of elements, spaced equally between the specified beginning and end values, both inclusive.

In [14]:
assert np.all([ 0., 16., 32., 48., 64.] == np.linspace(0, 64, 5))
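
The spacing chosen by `numpy.linspace` can be retrieved with the `retstep` argument, and the end value can be excluded with `endpoint=False`. The sketch below uses the same range as the cell above.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive samples.
samples, step = np.linspace(0, 64, 5, retstep=True)
assert np.isclose(step, 16.0)

# endpoint=False leaves out the stop value, as numpy.arange does.
open_ended = np.linspace(0, 64, 4, endpoint=False)
assert np.allclose(open_ended, np.arange(0, 64, 16))
```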

#### 2D array creation

The 2D array creation functions, e.g. `numpy.eye`, `numpy.diag`, and `numpy.vander`, define properties of special matrices represented as 2D arrays.

`numpy.eye(n)` defines the $n \times n$ *identity matrix*: the elements where $i=j$ (row index and column index are equal) are 1 and the rest are 0.

In [15]:
assert np.all(np.eye(3) == [
    [1,0,0],
    [0,1,0],
    [0,0,1]
])

`numpy.diag` can either define a square 2D array with given values along the diagonal and 0 elsewhere, also known as a *diagonal matrix*, or, if given a 2D array, return a 1D array containing only its diagonal elements.

In [16]:
assert np.all(np.diag([1,2,3]) == [
    [1,0,0],
    [0,2,0],
    [0,0,3]
])
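
The sketch below shows `numpy.diag` used in the other direction, extracting the diagonal of a 2D array, together with `numpy.vander`, which builds a Vandermonde matrix whose columns are decreasing powers of the input by default. The input values are arbitrary.

```python
import numpy as np

# Given a 2D array, numpy.diag returns its diagonal as a 1D array.
square = np.arange(9).reshape(3, 3)
assert np.all(np.diag(square) == [0, 4, 8])

# Vandermonde matrix with 3 columns: x**2, x**1, x**0 for each input x.
vandermonde = np.vander([1, 2, 3], 3)
assert np.all(vandermonde == [
    [1, 1, 1],
    [4, 2, 1],
    [9, 3, 1]
])
```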

#### *n*-dimension array creation

The ndarray creation functions, e.g. `numpy.ones`, `numpy.zeros`, and `random`, define arrays based upon the desired shape. They can create arrays with any number of dimensions by specifying the length along each dimension in a tuple or list.

`numpy.zeros` will create an array filled with 0 values with the specified shape.

In [17]:
assert np.all(np.zeros((2,3,4)) == [
    [[0.,0.,0.,0.],[0.,0.,0.,0.],[0.,0.,0.,0.]],
    [[0.,0.,0.,0.],[0.,0.,0.,0.],[0.,0.,0.,0.]]
])   # the default type is float64

`numpy.ones` will create an array filled with 1 values. It is identical to `numpy.zeros` in all other respects.

The `random` method of the generator returned by `default_rng` creates an array filled with random values in the half-open interval [0, 1). It is included in the `numpy.random` module. Below, two arrays are created with shapes (2,3) and (2,3,4), respectively. The seed is set to 42 so you can reproduce these pseudorandom numbers:

```python
from numpy.random import default_rng
arr1 = default_rng(42).random((2,3))      # 2D array of shape (2,3)
arr2 = default_rng(42).random((2,3,4))    # 3D array of shape (2,3,4)
```
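
Reproducibility can be checked by seeding two generators with the same value and comparing their output, as in this minimal sketch (the names `first` and `second` are illustrative):

```python
from numpy.random import default_rng

# Two independent generators created with the same seed ...
first = default_rng(42).random((2, 3))
second = default_rng(42).random((2, 3))

# ... produce identical arrays of the requested shape,
assert first.shape == (2, 3)
assert (first == second).all()

# and every value lies in the half-open interval [0, 1).
assert ((first >= 0) & (first < 1)).all()
```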

### Higher dimensional arrays

Passing a flat list to `np.array` creates a one-dimensional array; passing a list of lists, where each member of an inner list is a number, creates a two-dimensional array.

In [18]:
vec = np.array([1,2])
assert np.all(vec.shape == (2,))

In [19]:
mat = np.array([[1,2],[3,4]])
assert np.all(mat.shape == (2,2))

Recall that under the hood, a NumPy array of any shape is a buffer containing the raw data as a flat (one-dimensional) array, and a collection of additional metadata that specifies details. One piece of this metadata is the `shape`, which specifies the length of the NumPy array along each dimension.

An array can be reshaped with little cost by simply changing the associated metadata. This is done using the `numpy.ndarray.reshape` method on the NumPy array:

In [20]:
mat2 = mat.reshape((4,))
# reshape returns a new array object; mat itself keeps its shape
assert np.all(mat2.shape == (4,))
assert np.all(mat2 == [1,2,3,4])
assert np.all(mat.shape == (2,2))
assert np.all(mat == [[1,2],[3,4]])

Note that the total number of elements must remain unchanged. The matrix `mat` originally has shape (2, 2) with a total of 4 elements, and the reshaped array `mat2` is one-dimensional with shape (4,), which again has a total of 4 elements. Attempting to reshape when there is a mismatch in the total number of elements will result in a `ValueError`.
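
The sketch below triggers this error deliberately and also shows that one dimension may be given as `-1`, in which case NumPy infers its length from the total number of elements. The flag `reshape_failed` is introduced only for this example.

```python
import numpy as np

mat = np.array([[1, 2], [3, 4]])

# 4 elements cannot be arranged into a shape with 3 elements.
try:
    mat.reshape((3,))
    reshape_failed = False
except ValueError:
    reshape_failed = True
assert reshape_failed

# -1 asks NumPy to infer the length along that dimension.
column = mat.reshape((-1, 1))
assert column.shape == (4, 1)
```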

Note that the `numpy.ndarray.reshape` method returns a new view object when possible and a copy otherwise; see [the `order` and `copy` arguments](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html). In the above code, `mat2` is a reshaped view of `mat`, so a change made through `mat` is visible in `mat2`:

In [21]:
mat[0,1] = 10
assert mat2[1] == 10
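
Whether two arrays share the same data can be checked with `numpy.shares_memory`. In the sketch below, reshaping the original array gives a view, while reshaping its transpose forces a copy, because the elements can no longer be read from the buffer with a single constant stride.

```python
import numpy as np

mat = np.array([[1, 2], [3, 4]])

# Reshaping a contiguous array only changes metadata: a view.
view = mat.reshape((4,))
assert np.shares_memory(mat, view)

# Reshaping the transpose has to rearrange the data: a copy.
copied = mat.T.reshape((4,))
assert copied.tolist() == [1, 3, 2, 4]
assert not np.shares_memory(mat, copied)
```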

## Matrices

NumPy arrays also serve as matrices, which are fundamental in mathematics and computational programming. A matrix is simply a two-dimensional array.

A matrix with $m$ rows and $n$ columns is usually described as an $m \times n$ matrix. A matrix that has the same number of rows as columns is said to be a *square matrix*.

In [22]:
A = np.arange(15).reshape(3,5)
assert np.all(A == [
    [ 0, 1, 2, 3, 4],
    [ 5, 6, 7, 8, 9],
    [10,11,12,13,14]
])
assert A.ndim == 2                  # two-dimensional
assert np.all(A.shape == (3,5))     # 3 rows, 5 columns

The *identity matrix* (of size $n$) is the $n \times n$ matrix where the $(i,i)$-th entry is 1, and the $(i,j)$-th entry is zero for $i \neq j$.

In [23]:
I3 = np.eye(3)
assert np.all(I3 == [
    [1,0,0],
    [0,1,0],
    [0,0,1]
])

### Basic methods

The *transpose* of a matrix is the matrix obtained by interchanging its rows and columns. NumPy provides two equivalent ways to compute it:

- the `numpy.ndarray.transpose` method
- the `T` property of the `numpy.ndarray` object

In [24]:
a = A.transpose()
assert np.all(a.shape == (5,3))    # 5 rows, 3 columns
assert np.all(a == [
    [0,5,10],
    [1,6,11],
    [2,7,12],
    [3,8,13],
    [4,9,14]
])
assert np.all(a == A.T)
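
Like `reshape`, transposition only changes metadata, so the result is a view of the same data. The sketch below checks this and confirms that transposing twice gives back the original matrix; the value `50` is arbitrary.

```python
import numpy as np

A = np.arange(15).reshape(3, 5)
At = A.T

# The transpose shares its data buffer with the original matrix.
assert At.shape == (5, 3)
assert np.shares_memory(A, At)

# Transposing twice gives back the original matrix.
assert np.array_equal(At.T, A)

# Writing to entry (0, 1) of the transpose changes entry (1, 0) of A.
At[0, 1] = 50
assert A[1, 0] == 50
```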

The *trace* of a square matrix is the sum of its diagonal elements. For the $3 \times 3$ matrix $B$ defined in the cell below:

$$
\begin{aligned}
\text{trace}(B) &= \sum_{i=1}^{n} b_{i,i} \\
&= 0 + 4 + 8 = 12
\end{aligned}
$$

In [25]:
B = np.arange(9).reshape(3,3)
assert np.all(B == [
    [0,1,2],
    [3,4,5],
    [6,7,8]
])
assert B.trace() == 12
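
The definition can be checked by summing the diagonal explicitly with `numpy.diag`. The `trace` method also accepts an `offset` argument that selects a diagonal above or below the main one, as this sketch shows.

```python
import numpy as np

B = np.arange(9).reshape(3, 3)

# Sum of the main diagonal: 0 + 4 + 8.
diagonal_sum = np.diag(B).sum()
assert diagonal_sum == B.trace() == 12

# offset=1 sums the diagonal just above the main one: 1 + 5.
upper_trace = B.trace(offset=1)
assert upper_trace == 6
```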

### Matrix multiplication