# NumPy and SciPy for Monte Carlo simulation

Computational Finance with Python

[Alet Roux](https://www.york.ac.uk/maths/staff/alet-roux/) ([Department
of Mathematics](https://maths.york.ac.uk), University of York)

Click on the following to open this file in Google Colab:

<figure>
<a
href="https://colab.research.google.com/github/aletroux/comp-finance-python/blob/main/demonstrations/D03_NumPy_SciPy_Monte_Carlo_slides.ipynb"><img
src="https://colab.research.google.com/assets/colab-badge.svg"
alt="Open In Colab" /></a>
<figcaption>Open In Colab</figcaption>
</figure>

# Random numbers and random variables

-   NumPy provides extensive support for generating samples of random
    numbers in its `random` module (NumPy Developers (2022c)).

In [1]:
import numpy as np

-   SciPy has many features for working with random variables and their
    distributions in its `stats` module (The SciPy Community (2024b)).
-   It is possible to import `scipy.stats` directly, as below:

In [2]:
import scipy.stats as sstats

-   However, when only using one or two distributions it is common to
    just import the relevant distribution, for example:

In [3]:
from scipy.stats import norm

## Generating univariate samples

### Random number generator

-   NumPy random number generation relies on a random number generator.
-   Initialize a generator as follows:

``` python
<variable name> = np.random.default_rng(seed = <positive integer>)
```

-   The optional parameter `seed` is used to initialize the random
    number generator. Providing a seed means that the random numbers
    generated will be the same in different runs, which can be useful
    for repeating results and debugging code.

-   Example:

In [4]:
rng = np.random.default_rng (seed = 2314234234)
rng.random(size = (2,2))

array([[0.42584778, 0.12466016],
       [0.09799234, 0.44945449]])
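
-   A minimal sketch of the reproducibility point above: two generators
    created with the same seed produce identical draws. The seed value
    `12345` and the variable names are arbitrary choices made for this
    illustration.

```python
import numpy as np

# two generators initialised with the same (arbitrary) seed
rng_a = np.random.default_rng(seed=12345)
rng_b = np.random.default_rng(seed=12345)

draws_a = rng_a.random(size=5)
draws_b = rng_b.random(size=5)

# identical seeds give identical streams of random numbers
same_seed_equal = np.array_equal(draws_a, draws_b)
print("Same seed, same draws:", same_seed_equal)

# a generator created without a seed is initialised from fresh entropy,
# so its output will (almost surely) differ from run to run
rng_c = np.random.default_rng()
print("Unseeded draws:", rng_c.random(size=5))
```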

------------------------------------------------------------------------

### Random sample generation

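-   Many distributions are supported. They are available as methods of
    the random number generator (e.g. `rng.standard_normal`). Functions
    of the same name can also be called directly on the module (e.g.
    `np.random.standard_normal`), but these belong to NumPy's legacy
    interface and use a global random state rather than `rng`.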
-   General format of arguments:

``` python
<random number generator>.<name of distribution>(<parameters of distribution>, <size specification>)
```

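-   The size specification determines the number of samples created, and
    the shape of the NumPy array that is created. The two calls in each
    of the following pairs are equivalent (the printed values differ
    only because every call draws new random numbers):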

In [5]:
print(rng.random(size = 4))
print(rng.random(4))

[0.0546029  0.56548454 0.90199861 0.11261566]
[0.30568593 0.05365493 0.86361708 0.4924869 ]

In [6]:
print(rng.random(size = (2,3)))
print(rng.random((2,3)))

[[0.75398128 0.48221345 0.85442236]
 [0.04747989 0.49641986 0.05229798]]
[[0.24510733 0.22275409 0.95129101]
 [0.40892199 0.83551723 0.36251253]]

------------------------------------------------------------------------

### Distributions supported by NumPy

-   Consult the documentation (NumPy Developers (2022b)) for an
    extensive list. Examples:

| Distribution       | Usage                           |
|--------------------|---------------------------------|
| $U(0,1)$           | `random()`                      |
| $U(a,b)$           | `uniform(`$a$`,`$b$`)`          |
| $N(0,1)$           | `standard_normal()`             |
| $N(\mu,\sigma^2)$  | `normal(`$\mu$`,`$\sigma$`)`    |
| $LN(\mu,\sigma^2)$ | `lognormal(`$\mu$`,`$\sigma$`)` |
| $\exp(\lambda)$    | `exponential(`$1/\lambda$`)`    |

-   Other functionality includes shuffling, permutations, etc.
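
-   As a rough check of the parameter conventions in the table, the
    sketch below draws large samples and compares sample statistics with
    the theoretical values. The seed, sample size and parameter values
    are arbitrary choices for this illustration. It also shows
    `permutation` and `shuffle`.

```python
import numpy as np

rng = np.random.default_rng(seed=2024)  # arbitrary seed
n = 100_000

# U(a, b) with a = 2, b = 5: theoretical mean (a + b) / 2 = 3.5
u = rng.uniform(2, 5, size=n)

# N(mu, sigma^2) with mu = 1, sigma = 2: the second argument is the
# standard deviation sigma, not the variance
x = rng.normal(1, 2, size=n)

# exp(lambda) with lambda = 4: NumPy expects the scale 1 / lambda
lam = 4
e = rng.exponential(1 / lam, size=n)

print("Uniform sample mean:    ", u.mean())           # close to 3.5
print("Normal sample mean, std:", x.mean(), x.std())  # close to 1 and 2
print("Exponential sample mean:", e.mean())           # close to 0.25

# permutation returns a shuffled copy; shuffle reorders the array in place
v = np.arange(5)
p = rng.permutation(v)
rng.shuffle(v)
print(p, v)
```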

## Random variables and their distributions

-   Usage if we know the parameters of the distribution (i.e. it is
    *frozen*):

``` python
from scipy.stats import <distribution>
<variable name> = <distribution>(<parameters>)
...
<variable name>.<method of distribution>(<method arguments>)
```

-   Usage if the parameters of the distribution are unknown /
    changeable:

``` python
from scipy.stats import <distribution>
<distribution>.<method of distribution>(<method arguments>, <distribution parameters>)
```
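
-   A minimal sketch of the two usage patterns above, using the normal
    distribution. The parameter values and the evaluation point are
    arbitrary; both calls evaluate the same cumulative distribution
    function.

```python
from scipy.stats import norm

mu, sigma = 1.0, 2.0   # arbitrary illustrative parameters

# frozen: fix the parameters once, then call methods without them
rv = norm(loc=mu, scale=sigma)
frozen_value = rv.cdf(3.0)

# not frozen: pass the parameters with every method call
direct_value = norm.cdf(3.0, loc=mu, scale=sigma)

print("Frozen:    ", frozen_value)
print("Not frozen:", direct_value)
```

-   Freezing is convenient when several methods are called with the same
    parameters; the second form is convenient when the parameters change
    from call to call.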

------------------------------------------------------------------------

### Distributions supported by SciPy

-   Consult the documentation (The SciPy Community (2024b)) for an
    extensive list. Examples:

| Distribution       | Usage                                 |
|--------------------|---------------------------------------|
| $U(a,b)$           | `uniform(loc=`$a$`,scale=`$b-a$`)`    |
| $N(0,1)$           | `norm()`                              |
| $N(\mu,\sigma^2)$  | `norm(loc=`$\mu$`,scale=`$\sigma$`)`  |
| $LN(\mu,\sigma^2)$ | `lognorm(`$\sigma$`,scale=`$e^\mu$`)` |
| $\exp(\lambda)$    | `expon(scale=`$1/\lambda$`)`          |
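
-   The `loc`/`scale` conventions in the table are easy to get wrong, so
    the sketch below checks three of them against known formulas. The
    parameter values are arbitrary choices for this illustration.

```python
import math
from scipy.stats import uniform, lognorm, expon

# U(a, b) with a = 2, b = 5: loc is the left endpoint, scale is the width b - a
a, b = 2.0, 5.0
U = uniform(loc=a, scale=b - a)
print("Uniform cdf at a and b:", U.cdf(a), U.cdf(b))
print("Uniform mean:", U.mean(), "vs", (a + b) / 2)

# LN(mu, sigma^2): shape parameter sigma, scale e^mu
mu, sigma = 0.5, 0.8
LN = lognorm(sigma, scale=math.exp(mu))
print("Lognormal mean:", LN.mean(), "vs", math.exp(mu + sigma**2 / 2))

# exp(lambda): scale 1 / lambda, so the mean is 1 / lambda
lam = 4.0
E = expon(scale=1 / lam)
print("Exponential mean:", E.mean(), "vs", 1 / lam)
```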

------------------------------------------------------------------------

### Distribution methods

| Method                | Description                                                                     |
|---------------------|---------------------------------------------------|
| `pdf` or `pmf`        | Probability density function (continuous) or mass function (discrete)           |
| `cdf`                 | Cumulative distribution function                                                |
| `stats(moments='mv')` | Statistics: mean (`m`), variance (`v`), skew (`s`) and/or excess kurtosis (`k`) |
| `ppf`                 | Percent point function (percentiles)                                            |
| `median` and `mean`   | Median and mean                                                                 |
| `var` and `std`       | Variance and standard deviation                                                 |
| `interval`            | Confidence interval with equal areas around the median                          |
| `expect`              | Expected value of a function (of one argument) with respect to the distribution |

-   For more, see documentation (The SciPy Community (2024b))
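
-   A short sketch of three of the methods in the table, applied to the
    standard normal distribution (chosen only because its values are
    familiar). The probability levels are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rv = norm(loc=0, scale=1)   # standard normal

# percent point function: the inverse of the cdf
q975 = rv.ppf(0.975)

# central interval containing 95% of the probability mass
lower, upper = rv.interval(0.95)

# expected value of a function: E[X^2] equals 1 for N(0,1)
second_moment = rv.expect(lambda x: x**2)

print("ppf(0.975):", q975)
print("95% interval:", lower, upper)
print("E[X^2]:", second_moment)
print("ppf inverts cdf:", np.isclose(rv.cdf(q975), 0.975))
```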

## Example: lognormal distribution

In [7]:
import math

# prepare for plotting
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize = (12,7))

# model parameters
mu = 1
sigma = 1

# generate sample and produce histogram
n = 1000
sample = rng.lognormal(mu, sigma, n)
ax.hist(sample, bins='auto', density=True, align='mid', rwidth=0.8, label="Relative histogram")

# generate and plot probability density function
from scipy.stats import lognorm
rv = lognorm(sigma, scale = math.exp(mu))
x = np.linspace(min(sample), max(sample), 10000)
ax.plot(x, rv.pdf(x), color="red", label="Density function")

# plot grid and legend
ax.xaxis.grid(True)
ax.yaxis.grid(True)
ax.legend()

------------------------------------------------------------------------

In [10]:
mu = 1
sigma = 1
rv = lognorm(sigma, scale = math.exp(mu))

mean, var, skew, kurt = rv.stats(moments='mvsk')
print("Mean:", mean)
print("Variance:", var)
print("Skew:", skew)
print("Kurtosis:", kurt)

Mean: 4.4816890703380645
Variance: 34.51261310995656
Skew: 6.184877138632554
Kurtosis: 110.9363921763115

In [11]:
print ("Median:", rv.median())

Median: 2.718281828459045

-   Cumulative distribution function:

In [12]:
print ("Evaluating cdf at mean:", rv.cdf (mean))

Evaluating cdf at mean: 0.6914624612740131
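
-   The values above can be cross-checked against the closed-form
    expressions for the $LN(\mu,\sigma^2)$ distribution: mean
    $e^{\mu+\sigma^2/2}$, variance $(e^{\sigma^2}-1)e^{2\mu+\sigma^2}$
    and median $e^\mu$. Since $P(X\le x) = \Phi((\ln x-\mu)/\sigma)$,
    the cdf at the mean equals $\Phi(\sigma/2)$. A minimal sketch (the
    names ending in `_formula` are introduced here for illustration):

```python
import math
from scipy.stats import lognorm, norm

mu, sigma = 1, 1
rv = lognorm(sigma, scale=math.exp(mu))

# closed-form expressions for LN(mu, sigma^2)
mean_formula = math.exp(mu + sigma**2 / 2)
var_formula = (math.exp(sigma**2) - 1) * math.exp(2 * mu + sigma**2)
median_formula = math.exp(mu)

# cdf at the mean: Phi((ln(mean) - mu) / sigma) = Phi(sigma / 2)
cdf_at_mean_formula = norm.cdf(sigma / 2)

print("Mean:       ", mean_formula, "vs", rv.mean())
print("Variance:   ", var_formula, "vs", rv.var())
print("Median:     ", median_formula, "vs", rv.median())
print("cdf at mean:", cdf_at_mean_formula, "vs", rv.cdf(rv.mean()))
```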

# Matrices

-   Matrices (two-dimensional arrays) play an important role in
    numerical computations.
-   Much of the detail has already been covered.

## Matrix transposition

-   Transpose:

In [13]:
X = np.array([[1, 2, 3], [4, 5, 6]])
print("X =\n", X)
print("X transposed =\n", X.T)

X =
 [[1 2 3]
 [4 5 6]]
X transposed =
 [[1 4]
 [2 5]
 [3 6]]

-   It is sometimes easier to work with row than column vectors: simply
    transpose vectors as and when needed.
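
-   One point to watch: transposing a one-dimensional array has no
    effect, so a vector must be stored as a two-dimensional array before
    `.T` turns a row into a column. A small sketch (the variable names
    are illustrative):

```python
import numpy as np

v = np.array([1, 2, 3])    # one-dimensional array, shape (3,)
print(v.T.shape)           # still (3,): .T does nothing to 1-d arrays

row = v.reshape(1, -1)     # explicit row vector, shape (1, 3)
col = row.T                # column vector, shape (3, 1)
print(row.shape, col.shape)
```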

## Matrix multiplication

-   Matrices can be multiplied by using `@` or `dot`:

In [14]:
X = np.array([[1, 2, 3], [4, 5, 6]])
Y = np.array([[1, 2], [3, 4], [5, 6]])
print("X @ Y =\n", X @ Y)
print("X.dot(Y) =\n", X.dot(Y))
print("np.dot(X,Y) =\n", np.dot(X,Y))

X @ Y =
 [[22 28]
 [49 64]]
X.dot(Y) =
 [[22 28]
 [49 64]]
np.dot(X,Y) =
 [[22 28]
 [49 64]]

-   This shouldn’t be confused with `*` which is element-by-element
    multiplication:

In [15]:
X * X

array([[ 1,  4,  9],
       [16, 25, 36]])

## Diagonal matrices

-   Diagonal matrices are useful in many areas. The function
    `numpy.diag` can be used to create diagonal matrices.
-   Usage:

``` python
np.diag(<array with diagonal entries>, <diagonal index>)
```

-   Example: create a $4\times 4$ matrix with 1 on the main diagonal,
    0.5 on the diagonal above it and -0.5 on the diagonal below it.

In [16]:
d = np.ones(3)
A = np.eye(4) + 0.5*np.diag(d,1) - 0.5*np.diag(d,-1)
print("A =\n", A)

A =
 [[ 1.   0.5  0.   0. ]
 [-0.5  1.   0.5  0. ]
 [ 0.  -0.5  1.   0.5]
 [ 0.   0.  -0.5  1. ]]

-   Applying `numpy.diag` to a two-dimensional array gives the diagonal:

In [17]:
np.diag(A)

array([1., 1., 1., 1.])

------------------------------------------------------------------------

### Example: conversion between covariance and correlation matrix

-   The relationship between the correlation $\rho_{12}$ and covariance
    $\sigma_{12}$ of two random variables with variances $\sigma_1^2$ and
    $\sigma_2^2$ respectively, is
    $$ \rho_{12} = \frac{\sigma_{12}}{\sigma_1\sigma_2}. $$ In matrix
    form, if $\Sigma$ is the covariance matrix, $R$ the correlation
    matrix and $D$ the diagonal matrix of standard deviations, then
    $R = D^{-1}\Sigma D^{-1}$ and $\Sigma = DRD$. This conversion can be
    done in NumPy by multiplying with diagonal matrices.

-   Let $\Sigma = \begin{bmatrix} 5 & 3 \\ 3 & 12 \end{bmatrix}$ be a
    covariance matrix.

------------------------------------------------------------------------

#### Numerical example

-   Variance and standard deviation:

In [18]:
Sigma = np.array([[ 5, 3], [3, 12]])
var = np.diag(Sigma)
print("Variances:", var)

std = np.sqrt(var)
print("Standard deviations:", std)

Variances: [ 5 12]
Standard deviations: [2.23606798 3.46410162]

-   Convert covariance to correlation:

In [19]:
mult = np.diag(1/std)
Corr = mult @ Sigma @ mult
print ("Correlation matrix:\n", Corr)

Correlation matrix:
 [[1.         0.38729833]
 [0.38729833 1.        ]]

-   Convert correlation to covariance:

In [20]:
mult = np.diag(std)
M = mult @ Corr @ mult
print ("Covariance matrix:\n", M)

Covariance matrix:
 [[ 5.  3.]
 [ 3. 12.]]
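
-   An alternative sketch of the same conversion, using element-by-element
    division by the outer product of the standard deviations instead of
    diagonal matrices. This is offered as an equivalent formulation, not
    as the method used above; the names `Corr_diag` and `Sigma_back` are
    introduced for illustration.

```python
import numpy as np

Sigma = np.array([[5, 3], [3, 12]])
std = np.sqrt(np.diag(Sigma))

# entry (i, j) of np.outer(std, std) is std[i] * std[j]
Corr = Sigma / np.outer(std, std)
Sigma_back = Corr * np.outer(std, std)

# same result as multiplying by diagonal matrices on both sides
Corr_diag = np.diag(1 / std) @ Sigma @ np.diag(1 / std)

print("Correlation matrix:\n", Corr)
print("Agrees with diagonal-matrix version:", np.allclose(Corr, Corr_diag))
print("Round trip recovers Sigma:", np.allclose(Sigma_back, Sigma))
```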

## Cumulative sums

-   Useful for cumulative addition over rows and columns of matrices.
-   Usage:

``` python
np.cumsum(<array or matrix>, axis = <axis along which to add>)
```

-   Cumulative sum down each column (`axis=0`, adding over the rows):

In [21]:
B = np.array([[1,2,3], [4,5,6]])
print("B = \n",B)
np.cumsum(B,axis=0)

B = 
 [[1 2 3]
 [4 5 6]]

array([[1, 2, 3],
       [5, 7, 9]])

-   Cumulative sum across each row (`axis=1`, adding over the columns):

In [22]:
np.cumsum(B,axis=1)

array([[ 1,  3,  6],
       [ 4,  9, 15]])

-   `np.cumprod` can be used similarly for cumulative products.
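
-   In simulation, cumulative sums turn random increments into paths. The
    sketch below builds three short random walks with `np.cumsum` and
    uses `np.cumprod` to compound a set of per-period returns. The seed,
    array sizes and return values are hypothetical, chosen only for
    illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)   # arbitrary seed

# each row holds the increments of one path: 3 paths, 5 steps
increments = rng.standard_normal((3, 5))

# running sum across each row (axis=1) gives the random walk positions
paths = np.cumsum(increments, axis=1)

# cumulative product: value of 1 unit invested under per-period returns
returns = np.array([0.10, -0.05, 0.02])   # hypothetical returns
growth = np.cumprod(1 + returns)

print("Paths:\n", paths)
print("Growth:", growth)
```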

# Linear algebra

-   NumPy has an extensive library for matrix and linear algebra
    operations (NumPy Developers (2024)).
-   SciPy also has a library with similar (but not identical)
    functionality (The SciPy Community (2024a)).
-   These slides cover the basics of solving linear equations and the
    Cholesky decomposition (with an application), which is needed for
    our work, but this is only a very small part of the available
    functionality. Consult the documentation for further information.
-   Include the Linear Algebra module of NumPy in your code as follows:

In [23]:
import numpy.linalg as npla

## Solving linear equations

-   Use the determinant of a matrix to verify that it is non-singular
    (can be inverted):

In [24]:
A = np.array([[1, 2, 3], [1, 1, 1], [3, -2, -1]])
print(npla.det(A))

-6.0000000000000036

-   Invert a matrix:

In [25]:
print("Inverse of A:\n",npla.inv(A))

Inverse of A:
 [[-0.16666667  0.66666667  0.16666667]
 [-0.66666667  1.66666667 -0.33333333]
 [ 0.83333333 -1.33333333  0.16666667]]

-   Solve a system $Ax=b$ of linear equations:

In [26]:
b = np.array([[2, 11, 18]])

print("Solution x using inverse of A =", (npla.inv(A) @ b.T).T)

x = npla.solve(A, b.T)
print("Solution x without using inverse of A =",x.T)

Solution x using inverse of A = [[ 10.  11. -10.]]
Solution x without using inverse of A = [[ 10.  11. -10.]]
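
-   A caution on the determinant test: in floating-point arithmetic the
    size of the determinant depends on the scaling of the matrix, so a
    tiny determinant need not indicate a nearly singular matrix. The
    condition number (`npla.cond`) is a commonly used alternative
    diagnostic. The sketch below uses a deliberately simple matrix chosen
    for this illustration.

```python
import numpy as np
import numpy.linalg as npla

# 0.1 times the 20 x 20 identity matrix: trivially invertible
A = 0.1 * np.eye(20)

print("det(A)  =", npla.det(A))    # about 1e-20, yet A is far from singular
print("cond(A) =", npla.cond(A))   # 1: as well conditioned as possible

# after solving, the residual can be checked directly
b = np.ones(20)
x = npla.solve(A, b)
print("Residual small:", np.allclose(A @ x, b))
```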

## Cholesky decomposition

-   Factorizes a symmetric positive definite matrix $A$ (such as a
    covariance matrix) into the product of a lower triangular matrix $L$
    and its transpose, such that $A=LL^T$.
-   Example:

In [27]:
A = np.array([[15, 12, 7], [12, 20, 10], [7, 10, 15]])
L = npla.cholesky(A)
print("Cholesky factor L =\n",L)
print("A = \n", L @ L.T)

Cholesky factor L =
 [[3.87298335 0.         0.        ]
 [3.09838668 3.2249031  0.        ]
 [1.80739223 1.36438208 3.14194126]]
A = 
 [[15. 12.  7.]
 [12. 20. 10.]
 [ 7. 10. 15.]]
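
-   If the matrix is not positive definite, `npla.cholesky` raises a
    `LinAlgError`. The sketch below uses this as a simple test; the
    matrix `M` and the flag `is_positive_definite` are introduced for
    illustration.

```python
import numpy as np
import numpy.linalg as npla

# symmetric but not positive definite: its eigenvalues are 3 and -1
M = np.array([[1.0, 2.0], [2.0, 1.0]])

try:
    npla.cholesky(M)
    is_positive_definite = True
except npla.LinAlgError:
    is_positive_definite = False

print("Eigenvalues:", npla.eigvalsh(M))
print("Positive definite:", is_positive_definite)
```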

------------------------------------------------------------------------

### Method of least squares

-   Method to solve problems of the form
    $$\text{minimize } \lVert Ax - b \rVert \text{ for }x\in\mathbb{R}^m,$$
    where $A$ is an $n\times m$ matrix with $n>m$ and
    $b\in\mathbb{R}^n$. It is known that the
    solution $x^\ast$ is the solution of the *Gaussian normal equation*
    $$A^TA x = A^T b.$$
-   If $A$ has full column rank, then the matrix $A^TA$ is symmetric and
    positive definite, and thus has a
    Cholesky decomposition $LL^T = A^TA$. The Gaussian normal equation
    can then be solved efficiently as follows:
    1.  Solve the system $L z = A^T b$.
    2.  Solve the system $L^T x = z$.

------------------------------------------------------------------------

### Example: Least squares

In [28]:
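A = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([-1, 2, -3])

# Cholesky factor of A^T A, then solve L z = A^T b followed by L^T x = z
# (b is one-dimensional, so no transpose is needed)
L = npla.cholesky(A.T @ A)
z = npla.solve(L, A.T @ b)
x = npla.solve(L.T, z)

print ("L =\n", L)
print ("z =", z)
print ("x =", x)
# residual of the normal equation, rounded to 10 decimal places
print ("A^TAx - A^Tb =", np.round(A.T @ A @ x - A.T @ b, 10))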

L =
 [[5.91607978 0.        ]
 [7.43735744 0.82807867]]
z = [-1.69030851  0.69006556]
x = [-1.33333333  0.83333333]
A^TAx - A^Tb = [0. 0.]
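
-   A variant of the example above, offered as a sketch: SciPy's
    `solve_triangular` exploits the triangular structure of $L$ in the
    two solves, and NumPy's built-in `lstsq` gives a reference solution
    to compare against. The name `x_ref` is introduced for illustration.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

A = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([-1, 2, -3])

# lower triangular factor L with L @ L.T = A.T @ A
L = cholesky(A.T @ A, lower=True)

# forward substitution for L z = A^T b, then back substitution for L^T x = z
z = solve_triangular(L, A.T @ b, lower=True)
x = solve_triangular(L.T, z, lower=False)

# reference solution from NumPy's least squares routine
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)

print("x (Cholesky) =", x)
print("x (lstsq)    =", x_ref)
```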

# Multivariate normal random samples

-   The Cholesky decomposition can be used to generate multivariate
    normal random variables with specified mean vector and covariance
    matrix.
-   NumPy also provides direct support for generating multivariate
    normal random variables (NumPy Developers (2022a)). Usage:

``` python
<random number generator>.multivariate_normal(<mean vector>, <covariance matrix>, <number of samples>)
```

-   Example: Simulate a 2-dimensional normal random variable with mean
    $\mu = (4, 5)$ and covariance matrix $\Sigma = \begin{bmatrix}
    1.1 & -0.5 \\
    -0.5 & 0.9
    \end{bmatrix}$.

## Using the Cholesky decomposition

In [29]:
# prepare for plotting
fig, ax = plt.subplots(figsize = (12,7))

# model parameters
mu = np.array([4, 5])
Sigma = np.array([[1.1, -0.5], [-0.5, 0.9]])

# Cholesky decomposition of Sigma
L = npla.cholesky(Sigma)

# generate sample and convert to N(mu,Sigma) distribution
# each column of this array is a 2-d sample
n = 1000
sample = L @ rng.standard_normal((2, n))
for k in range(2):
    sample[k] += mu[k]
ax.scatter(sample[0], sample[1])
                             
# plot grid and legend
ax.xaxis.grid(True)
ax.yaxis.grid(True)

------------------------------------------------------------------------

## Using NumPy

In [31]:
# prepare for plotting
fig, ax = plt.subplots(figsize = (12,7))

# model parameters
mu = np.array([4, 5])
Sigma = np.array([[1.1, -0.5], [-0.5, 0.9]])

# generate N(mu,Sigma) sample
# each row of this array is a 2-d sample
n = 1000
sample = rng.multivariate_normal(mu, Sigma, n)
ax.scatter(sample[:,0], sample[:,1])
                             
# plot grid and legend
ax.xaxis.grid(True)
ax.yaxis.grid(True)
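
-   Note the different layout: here each row is a sample, so the array
    has shape `(n, 2)`. The sketch below checks the sample mean and
    covariance, passing `rowvar=False` to `np.cov` because the variables
    are in the columns. The optional argument `method="cholesky"` is
    included only to mirror the previous section and can be omitted; the
    seed and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # arbitrary seed

mu = np.array([4, 5])
Sigma = np.array([[1.1, -0.5], [-0.5, 0.9]])

# each row is one 2-d sample, so the result has shape (n, 2)
n = 200_000
sample = rng.multivariate_normal(mu, Sigma, n, method="cholesky")

print("Shape:", sample.shape)
print("Sample mean:", sample.mean(axis=0))
# rowvar=False tells np.cov that the columns (not the rows) are the variables
print("Sample covariance:\n", np.cov(sample, rowvar=False))
```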

------------------------------------------------------------------------

# Further reading

-   McKinney (2022) covers a number of linear algebra features of NumPy
    in Section 4.6. This is combined with random number generation in
    Section 4.7 to simulate Brownian motion.
-   Smith (2022) covers many of the SciPy features in Section 4.4, which
    also covers sparse matrices.

## References

McKinney, Wes. 2022. *Python for Data Analysis: Data Wrangling with
Pandas, NumPy & Jupyter*. 3rd edition. O’Reilly.
<https://wesmckinney.com/book/>.

NumPy Developers. 2022a. “Numpy.random.generator.multivariate_normal.”
<https://numpy.org/doc/stable/reference/random/generated/numpy.random.Generator.multivariate_normal.html>.

———. 2022b. “Random Generator.”
<https://numpy.org/doc/stable/reference/random/generator.html>.

———. 2022c. “Random Sampling (Numpy.random).”
<https://numpy.org/doc/stable/reference/random/index.html>.

———. 2024. “Linear Algebra (Numpy.linalg).”
<https://numpy.org/doc/stable/reference/routines.linalg.html#module-numpy.linalg>.

Smith, Einar. 2022. *Introduction to the Tools of Scientific Computing*.
Second edition. Texts in Computational Science and Engineering 25. Cham,
Switzerland: Springer.
<https://yorsearch.york.ac.uk/permalink/f/7htm32/TN_cdi_askewsholts_vlebooks_9783031169724>.

The SciPy Community. 2024a. “Linear Algebra (Scipy.linalg).”
<https://docs.scipy.org/doc/scipy/reference/linalg.html>.

———. 2024b. “Statistical Functions (Scipy.stats).”
<https://docs.scipy.org/doc/scipy/reference/stats.html>.