# Introduction to NumPy

Computational Finance with Python

[Alet Roux](https://www.york.ac.uk/maths/staff/alet-roux/) ([Department
of Mathematics](https://maths.york.ac.uk), University of York)

Click on the following to open this file in Google Colab:

<figure>
<a
href="https://colab.research.google.com/github/aletroux/comp-finance-python/blob/main/demonstrations/D02_Numpy_slides.ipynb"><img
src="https://colab.research.google.com/assets/colab-badge.svg"
alt="Open In Colab" /></a>
<figcaption>Open In Colab</figcaption>
</figure>

# NumPy

-   [NumPy](https://numpy.org/) is an important foundational package for
    numerical computing in Python.
-   Some features:
    -   Powerful multidimensional array object.
    -   Mathematical functions for fast operation on arrays.
    -   Linear algebra, random number generation, Fourier transform,
        etc.
    -   Ability to connect with C, C++ and FORTRAN libraries for
        performance computing.
-   We will focus on one- and two-dimensional arrays. Arrays work
    similarly in higher dimensions.

# NumPy performance

-   NumPy is designed for efficiency on large amounts of data:
    -   Data stored internally in a contiguous block of memory.
    -   Can perform complex computations on arrays without the need for
        Python `for` loops, which can be slow for large sequences.

In [10]:
my_list = list(range(1_000_000))
%timeit my_list2 = [x * 2 for x in my_list]

52 ms ± 6.55 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [11]:
import numpy as np
my_arr = np.arange(1_000_000)
%timeit my_arr2 = my_arr * 2

2.03 ms ± 317 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

-   NumPy-based algorithms are generally 10 to 100 times faster than
    list-based Python algorithms.

# Using NumPy

-   Standard NumPy convention is to use `import numpy as np`.
-   Then NumPy functions and variables are prefixed by `np.`, for
    example `np.exp`, `np.pi`, `np.array`.
-   There are other possibilities. For example, `from numpy import *`
    means we don’t need the `np` prefix.
-   However, the `numpy` namespace is very big and contains a number of
    functions whose names conflict with built-in Python functions (like
    `min` and `max`) and functions defined in other libraries. Thus the
    standard NumPy convention is strongly recommended.
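
-   As a small sketch of why such name conflicts matter (the values are
    made up for illustration): the built-in `sum` and `np.sum` share a
    name but interpret their second positional argument differently, so
    silently replacing one with the other can change results.

``` python
import numpy as np

values = [1, 2, 3]

# Built-in sum: the second argument is a start value added to the total.
builtin_total = sum(values, 10)

# np.sum: the second positional argument is the axis to sum over.
numpy_total = np.sum(values, 0)

print("built-in sum(values, 10) =", builtin_total)  # 16
print("np.sum(values, 0) =", numpy_total)           # 6
```

-   With the `np.` prefix it is always clear which of the two functions
    is being called.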

# NumPy arrays

-   Like lists, but:
    -   Length cannot be changed after creation.
    -   Elements must all be of the same type.
-   The easiest way to create an array is the `array` function. This
    function accepts all sequence-like types such as `list` and `tuple`.
-   Example:

In [12]:
list1 = [1, 3.4, 8]
array1 = np.array(list1)
print(array1)

[1.  3.4 8. ]

-   Nested lists become multi-dimensional arrays:

In [13]:
list2 = [[1, 3], [2, 4]]
array2 = np.array(list2)
print(array2)

[[1 3]
 [2 4]]
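
-   A minimal sketch of the two restrictions above (variable names are
    illustrative): mixed input is converted to one common type, and
    "appending" produces a new array rather than changing the length of
    the existing one.

``` python
import numpy as np

# Integers and floats together: everything is stored as float64.
mixed = np.array([1, 3.4, 8])
print("dtype =", mixed.dtype)

# np.append returns a new, longer array; the original is unchanged.
longer = np.append(mixed, 2.0)
print("original size =", mixed.size)
print("new size =", longer.size)
```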

## Array properties

-   Number of dimensions/axes:

In [14]:
array3 = np.array([[1, 2, 3], [4, 5, 6]])
print("array3 =\n", array3)
print("Number of dimensions =", array3.ndim)

array3 =
 [[1 2 3]
 [4 5 6]]
Number of dimensions = 2

-   Shape inferred from data:

In [15]:
print("Shape =", array3.shape)

Shape = (2, 3)

-   Data type inferred from data:

In [16]:
print("Data type=", array3.dtype)

Data type= int64

-   Many other data types are possible, such as `'float64'`.
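
-   As a sketch, the data type can also be chosen explicitly instead of
    being inferred, either at creation with the `dtype` argument or
    afterwards with the `astype` method (variable names are
    illustrative):

``` python
import numpy as np

int_array = np.array([[1, 2, 3], [4, 5, 6]])

# Request floating point storage when the array is created.
float_array = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float64)

# Convert an existing array; astype returns a new array.
converted = int_array.astype(np.float64)

print("inferred:", int_array.dtype)
print("explicit:", float_array.dtype)
print("converted:", converted.dtype)
```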

## Creating new arrays

-   There are a number of different ways to create new arrays, for
    example:

In [17]:
array4 = np.zeros(4)
print('array4 =\n', array4)

array5 = np.ones((2, 3, 3))
print('array5 =\n', array5)

array6 = np.ones_like(array3)
print('array6 =\n', array6)

array4 =
 [0. 0. 0. 0.]
array5 =
 [[[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]]
array6 =
 [[1 1 1]
 [1 1 1]]

## Important array creation functions

| Function | Description |
|--------------------|----------------------------------------------------|
| `array`, `asarray` | Convert input to NumPy array. `array` copies data by default, `asarray` does not copy data if input is already a NumPy array. |
| `arange` | Like built-in `range` but creates NumPy array. |
| `linspace` | Creates floating point array of evenly spaced values over an interval, specified by the number of points rather than the step size. |
| `ones`, `ones_like` | Produces array of all 1s with given shape and data type. |
| `zeros`, `zeros_like` | Produces array of all 0s with given shape and data type. |
| `empty`, `empty_like` | Allocates memory, but does not populate with values. |
| `full`, `full_like` | Creates array with given shape and data type with all values set to given *fill value*. |
| `eye`, `identity` | Square identity matrix (1s on diagonal, 0s elsewhere). |

-   More details in [NumPy
    Reference](https://numpy.org/doc/stable/reference/routines.array-creation.html)
    (NumPy Developers (2023))
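
-   A short sketch of some of these functions (all values are
    illustrative). Note that `arange` is specified by a step size and
    excludes the end point, whereas `linspace` is specified by the
    number of points and includes the end point by default.

``` python
import numpy as np

steps = np.arange(0, 1, 0.25)    # step size 0.25, end point 1 excluded
points = np.linspace(0, 1, 5)    # 5 points, end point 1 included
sevens = np.full((2, 2), 7)      # 2x2 array filled with the value 7
identity = np.eye(2)             # 2x2 identity matrix

print("steps =", steps)
print("points =", points)

# asarray returns the same object if the input is already an array,
# whereas array makes a copy by default.
original = np.array([1, 2, 3])
same_object = np.asarray(original) is original
new_object = np.array(original) is original
print("asarray returns the same object:", same_object)
print("array returns the same object:", new_object)
```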

## Array functions

-   Arrays have many built-in methods for calculating useful quantities.
-   Examples:

In [18]:
array = np.linspace(0, np.pi, 3)
print("array =", array)
print("min =", array.min())
print("max =", array.max())
print("sum =", array.sum())
print("standard deviation =", array.std())

array = [0.         1.57079633 3.14159265]
min = 0.0
max = 3.141592653589793
sum = 4.71238898038469
standard deviation = 1.282549830161864
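
-   For two-dimensional arrays, these methods also accept an `axis`
    argument. A minimal sketch (the array is illustrative): `axis=0`
    aggregates down the rows, giving one value per column, and `axis=1`
    aggregates across the columns, giving one value per row.

``` python
import numpy as np

array2d = np.array([[1, 2, 3], [4, 5, 6]])

total = array2d.sum()               # all elements: 21
column_sums = array2d.sum(axis=0)   # one value per column: [5 7 9]
row_means = array2d.mean(axis=1)    # one value per row: [2. 5.]

print("total =", total)
print("column sums =", column_sums)
print("row means =", row_means)
```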

## Array arithmetic

-   Arrays allow batch operations without writing `for` loops. This is
    called **vectorisation**.
-   Examples:

In [19]:
array = np.array([[1, 2, 3], [4, 5, 6]])
print("array + 2 =\n", array + 2)
print("1/array =\n", 1 / array)

array + 2 =
 [[3 4 5]
 [6 7 8]]
1/array =
 [[1.         0.5        0.33333333]
 [0.25       0.2        0.16666667]]

-   Arithmetic between arrays of equal size applies the operation
    element-wise:

In [20]:
print("array - array =\n", array - array)
print("array * array =\n", array * array)

array - array =
 [[0 0 0]
 [0 0 0]]
array * array =
 [[ 1  4  9]
 [16 25 36]]
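
-   As a finance-flavoured sketch of vectorisation, the present value
    of a stream of cash flows can be computed without a loop. The cash
    flows, payment times and interest rate below are made-up values, and
    continuous compounding is an assumption of this example.

``` python
import numpy as np

cash_flows = np.array([5.0, 5.0, 105.0])   # hypothetical payments
times = np.array([1.0, 2.0, 3.0])          # payment times in years
rate = 0.05                                # assumed continuously compounded rate

# One discount factor per payment, computed in a single batch operation.
discount_factors = np.exp(-rate * times)

# Element-wise product followed by a sum gives the present value.
present_value = (cash_flows * discount_factors).sum()

print("discount factors =", discount_factors)
print("present value =", present_value)
```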

## Array comparisons

-   Comparisons between arrays of the same size yield Boolean arrays:

In [21]:
array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[3, 2, 1], [6, 5, 4]])
print("array1 > array2?\n", array1 > array2)
print("array1 <= array2?\n", array1 <= array2)
print("array1 == array2?\n", array1 == array2)

array1 > array2?
 [[False False  True]
 [False False  True]]
array1 <= array2?
 [[ True  True False]
 [ True  True False]]
array1 == array2?
 [[False  True False]
 [False  True False]]
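
-   Boolean arrays can be summarised with the `any`, `all` and `sum`
    methods. A minimal sketch, reusing the two arrays from the example
    above:

``` python
import numpy as np

array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[3, 2, 1], [6, 5, 4]])

equal = array1 == array2

any_equal = equal.any()    # is at least one pair of elements equal?
all_equal = equal.all()    # are all pairs of elements equal?
count_equal = equal.sum()  # True counts as 1, False as 0

print("any equal?", any_equal)
print("all equal?", all_equal)
print("number of equal elements =", count_equal)

# np.array_equal compares shape and all elements in one call.
print("arrays equal?", np.array_equal(array1, array2))
```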

## Indexing and slicing in one dimension

-   One-dimensional arrays behave similarly to Python lists when it
    comes to indexing and slicing.
-   Slices are *views* on the original array, not *copies*: changing a
    slice changes the original array.
-   Example:

In [22]:
array = np.arange(5)
print("array =\n", array)
array_slice = array[2:4]
print("array_slice =\n", array_slice)
array_slice[:] = 12
print("array after setting slice equal to 12 =\n", array)

array =
 [0 1 2 3 4]
array_slice =
 [2 3]
array after setting slice equal to 12 =
 [ 0  1 12 12  4]
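
-   If an independent copy is needed, the `copy` method can be used. A
    minimal sketch (variable names are illustrative):

``` python
import numpy as np

array = np.arange(5)
view = array[2:4]                # shares memory with array
slice_copy = array[2:4].copy()   # independent copy of the data

# Changing the copy leaves the original array untouched.
slice_copy[:] = 12
print("array after changing the copy =", array)

shares_view = np.shares_memory(array, view)
shares_copy = np.shares_memory(array, slice_copy)
print("view shares memory with array:", shares_view)
print("copy shares memory with array:", shares_copy)
```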

## Indexing and slicing in two dimensions

-   It is helpful to think of the first dimension as *rows* and the
    second dimension as *columns*:

In [23]:
array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("first row =", array[0])

first row = [1 2 3]

-   Individual elements can be accessed in different ways:

In [24]:
print("first row, third item is ", array[0][2])
print("first row, third item is ", array[0, 2])

first row, third item is  3
first row, third item is  3

-   Slicing in two dimensions works intuitively:

In [25]:
print("first two rows =\n", array[:2])
print("top right corner =\n", array[:2, 1:])
print("right column =\n", array[:, -1])

first two rows =
 [[1 2 3]
 [4 5 6]]
top right corner =
 [[2 3]
 [5 6]]
right column =
 [3 6 9]

## Fancy indexing

-   Use Boolean arrays to select elements:

In [26]:
array = np.arange(6)
print("array =",array)
boolean_array = np.array([False, True, False, True, False, True])
print("boolean_array =",boolean_array)
print("selected items =", array[boolean_array])

print("even items =", array[array % 2 == 0])
print("odd items =", array[~(array % 2 == 0)])

array = [0 1 2 3 4 5]
boolean_array = [False  True False  True False  True]
selected items = [1 3 5]
even items = [0 2 4]
odd items = [1 3 5]

-   Use arrays of integers to select items:

In [27]:
print("first two items in reverse order =", array[[1,0]])

first two items in reverse order = [1 0]
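
-   A sketch of three further points about this kind of indexing
    (variable names are illustrative): conditions can be combined with
    `&` and `|`, selecting with an index array gives a copy rather than
    a view, and assigning through a Boolean array changes the original.

``` python
import numpy as np

array = np.arange(6)

# Combine conditions with & (and) or | (or); the parentheses are required.
positive_even = array[(array % 2 == 0) & (array > 0)]
print("positive even items =", positive_even)

# Selecting with an index array produces a copy, unlike slicing.
selected = array[[1, 0]]
selected[:] = 99
print("array after changing the selection =", array)

# Assigning through a Boolean array modifies the original array.
array[array > 3] = 0
print("array after setting items above 3 to zero =", array)
```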

## Flattening and reshaping arrays

-   Elements of a multi-dimensional array can be arranged linearly:

In [28]:
array1 = np.array([[1, 2], [3, 4], [5, 6]])
print("array1 =\n", array1)
print("flattened array1, row-wise =", array1.flatten())
print("flattened array1, column-wise =", array1.flatten(order = 'F'))

array1 =
 [[1 2]
 [3 4]
 [5 6]]
flattened array1, row-wise = [1 2 3 4 5 6]
flattened array1, column-wise = [1 3 5 2 4 6]

-   An array can be reshaped into any array of compatible size:

In [29]:
array2 = np.arange(6)
print("array2 =", array2)
print("2x3 array, row-wise =\n", array2.reshape(2,3))
print("2x3 array, column-wise =\n", array2.reshape(2,3, order='F'))

array2 = [0 1 2 3 4 5]
2x3 array, row-wise =
 [[0 1 2]
 [3 4 5]]
2x3 array, column-wise =
 [[0 2 4]
 [1 3 5]]
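
-   Two details worth knowing, shown here as a minimal sketch (variable
    names are illustrative): one dimension passed to `reshape` may be
    given as `-1`, in which case it is inferred from the size of the
    array, and `reshape` returns a view where possible, whereas
    `flatten` always returns a copy.

``` python
import numpy as np

array = np.arange(6)

# -1 asks NumPy to work out the number of columns: 6 / 2 = 3.
two_rows = array.reshape(2, -1)
print("shape =", two_rows.shape)

flat = two_rows.flatten()   # independent copy

# two_rows is a view here, so this also changes array.
two_rows[0, 0] = 100
# flat is a copy, so this changes neither two_rows nor array.
flat[1] = -1

print("array =", array)
```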

## Further properties

-   NumPy arrays have many other useful properties, which will be
    covered later in the module.

-   The NumPy User Guide (NumPy Developers (2022)) explains the
    important NumPy features, starting from the absolute basics.

-   The NumPy API Reference (NumPy Developers (2023)) provides full
    details on all functions, modules and objects.

# Universal functions

-   A **universal function** performs fast element-wise operations on
    data in ndarrays.
-   Example of unary universal function:

In [30]:
array = np.arange(6)
print("array =\n", array)
print("Square root of array =\n", np.sqrt(array))

array =
 [0 1 2 3 4 5]
Square root of array =
 [0.         1.         1.41421356 1.73205081 2.         2.23606798]

-   Examples of binary universal functions:

In [31]:
fixed_array = np.full(6, 3)
print("fixed_array =\n", fixed_array)
print("Minimum of two arrays =\n", np.minimum(array, fixed_array))
print("Maximum of two arrays =\n", np.maximum(array, fixed_array))

fixed_array =
 [3 3 3 3 3 3]
Minimum of two arrays =
 [0 1 2 3 3 3]
Maximum of two arrays =
 [3 3 3 3 4 5]
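
-   Binary universal functions also accept a scalar in place of one of
    the arrays, so the constant array is not strictly needed. A minimal
    sketch; the names `prices` and `strike` and the call-option style
    payoff are illustrative assumptions.

``` python
import numpy as np

array = np.arange(6)

# Same result as np.minimum(array, np.full(6, 3)).
capped = np.minimum(array, 3)
print("capped at 3 =", capped)

# Element-wise payoff max(price - strike, 0), without a loop.
prices = np.arange(6)
strike = 2.5   # hypothetical strike price
payoff = np.maximum(prices - strike, 0)
print("payoff =", payoff)
```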

## Some universal functions

| Function | Description |
|--------------------------|----------------------------------------------|
| `abs` | Absolute value. |
| `sqrt` | Square root. |
| `square` | Raise to the power 2. |
| `exp` | Exponential ($e^x$). |
| `log`, `log10`, `log2` | Natural logarithm (base $e$), log base 10 and log base 2. |
| `cos`, `cosh`, `sin`, `sinh`, `tan`, `tanh` | Regular and hyperbolic trigonometric functions. |
| `arccos`, `arccosh`, `arcsin`, `arcsinh`, `arctan`, `arctanh` | Inverse regular and hyperbolic trigonometric functions. |

-   There are many others; see [NumPy
    Reference](https://numpy.org/doc/stable/reference/ufuncs.html)
    (NumPy Developers (2023))

## Universal functions and scalars

-   Universal functions also work on scalars, but they can be much
    slower than the equivalent scalar functions.

In [32]:
%timeit np.sqrt(2.5)

678 ns ± 20.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [33]:
import math
%timeit math.sqrt(2.5)

32.8 ns ± 0.866 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

-   The `math` module contains many commonly used mathematical
    functions. The Python Software Foundation (2024) provides details
    with examples.
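
-   The trade-off works in the other direction too. As a sketch
    (variable names are illustrative): functions from `math` are fast on
    scalars but do not accept arrays with more than one element.

``` python
import math
import numpy as np

values = np.array([1.0, 4.0, 9.0])

# The universal function handles the whole array at once.
roots = np.sqrt(values)
print("np.sqrt on an array =", roots)

# math.sqrt expects a single number and raises TypeError for this array.
try:
    math.sqrt(values)
    math_accepts_arrays = True
except TypeError:
    math_accepts_arrays = False
print("math.sqrt accepts this array:", math_accepts_arrays)

# On a scalar, math.sqrt returns a plain Python float.
scalar_root = math.sqrt(2.5)
print("math.sqrt(2.5) =", scalar_root)
```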

## Creating universal functions

-   It is possible to turn any Python function into a universal
    function.
-   Syntax:

``` python
<universal function> = np.frompyfunc(<function>, <number of inputs>, <number of outputs>)
```

-   Example:

In [34]:
my_func = lambda x : max(x - 2.5, 0)

my_func_univ = np.frompyfunc(my_func, 1, 1)

array = np.arange(6)
print("array =", array)
print("applying universal function =", my_func_univ(array))
print("applying function to individual elements =", [my_func(item) for item in array])

array = [0 1 2 3 4 5]
applying universal function = [0 0 0 0.5 1.5 2.5]
applying function to individual elements = [0, 0, 0, np.float64(0.5), np.float64(1.5), np.float64(2.5)]
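
-   Functions created by `np.frompyfunc` return arrays of Python
    objects, which is why the output above mixes `0` and `0.5`. A
    minimal sketch of converting the result to a numerical array with
    `astype` (the name `payoff` is illustrative):

``` python
import numpy as np

payoff = np.frompyfunc(lambda x: max(x - 2.5, 0), 1, 1)

result = payoff(np.arange(6))
print("dtype of result =", result.dtype)

# Convert the object array to a floating point array.
as_float = result.astype(np.float64)
print("converted =", as_float)
print("dtype after conversion =", as_float.dtype)
```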

## Vectorized functions

-   Sometimes we want to apply a Python function to a NumPy array just
    once.
-   Syntax:

``` python
<function that accepts arrays> = np.vectorize(<function>, otypes=[np.float64])
```

-   Example:

In [35]:
def my_func(x, K):
    return max(x - K, 0.0)

K = 2.5

array = np.arange(6, dtype=np.float64)
print("array =", array)
print("applying vectorized function =", np.vectorize(my_func)(array,K))
print("applying function to individual elements =", [my_func(item,K) for item in array])

array = [0. 1. 2. 3. 4. 5.]
applying vectorized function = [0.  0.  0.  0.5 1.5 2.5]
applying function to individual elements = [0.0, 0.0, 0.0, np.float64(0.5), np.float64(1.5), np.float64(2.5)]

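-   Unless `otypes` is given, `np.vectorize` infers the output type
    from the result for the first element, whereas `np.frompyfunc`
    always returns arrays of Python objects. Specifying
    `otypes=[np.float64]` guarantees floating point output.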
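
-   A sketch of what can go wrong without `otypes` (the function
    `payoff` is illustrative): here the result for the first element is
    the integer `0`, so an integer output type is inferred and the
    fractional parts of later results are lost.

``` python
import numpy as np

def payoff(x):
    # Returns the integer 0 whenever x - 2.5 is negative.
    return max(x - 2.5, 0)

array = np.arange(6)

# Output type inferred from payoff(0), which is an integer.
inferred = np.vectorize(payoff)(array)
print("without otypes =", inferred, inferred.dtype)

# Output type fixed in advance as floating point.
explicit = np.vectorize(payoff, otypes=[np.float64])(array)
print("with otypes =", explicit, explicit.dtype)
```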

# Further reading

-   The slides are based on Sections 4.1 and 4.3 of McKinney (2022).
-   Hilpisch (2019) covers similar material in Chapter 4.
-   Lynch (2018) gives many examples with mathematical applications.

## References

Hilpisch, Yves. 2019. *Python for Finance: Mastering Data-Driven
Finance*. 2nd ed. O’Reilly.

Lynch, Stephen. 2018. “Python for A-Level Mathematics and Beyond.”
<https://drstephenlynch.github.io/webpages/Python_for_A_Level_Mathematics_and_Beyond.html>.

McKinney, Wes. 2022. *Python for Data Analysis: Data Wrangling with
Pandas, NumPy & Jupyter*. 3rd edition. O’Reilly.
<https://wesmckinney.com/book/>.

NumPy Developers. 2022. “NumPy User Guide.”
<https://numpy.org/doc/stable/user/index.html>.

———. 2023. “NumPy Reference.”
<https://numpy.org/doc/stable/reference/index.html>.

The Python Software Foundation. 2024. “math: Mathematical Functions.”
<https://docs.python.org/3/library/math.html>.