# NumPy Examples

Import package and set random seed for reproducibility

In [1]:
import numpy as np
np.random.seed(0)

Create random arrays and check important attributes

In [2]:
x1 = np.random.choice(12, size=12)
x2 = np.random.choice(12, size=(4, 3))
x3 = np.random.choice(12, size=(3, 2, 2))
print("x1: dtype={0}, itemsize={1}, ndim={2}, shape={3}".format(x1.dtype, x1.itemsize, x1.ndim, x1.shape))
print("x2: dtype={0}, itemsize={1}, ndim={2}, shape={3}".format(x2.dtype, x2.itemsize, x2.ndim, x2.shape))
print("x3: dtype={0}, itemsize={1}, ndim={2}, shape={3}".format(x3.dtype, x3.itemsize, x3.ndim, x3.shape))

x1: dtype=int32, itemsize=4, ndim=1, shape=(12,)
x2: dtype=int32, itemsize=4, ndim=2, shape=(4, 3)
x3: dtype=int32, itemsize=4, ndim=3, shape=(3, 2, 2)


## Indexing using ranges returns views 

In [4]:
x = x2.copy()             # assignment is also by reference; make a copy to avoid modifying x2
print("original x:")
print(x)
y = x[:, 1]  # y is a view of x, so changing y also changes x
print("y is a view of x:")
print(y)
y[-1] = -99
print("current x:")
print(x)

original x:
[[ 6  8  8]
 [10  1  6]
 [ 7  7  8]
 [ 1  5  9]]
y is a view of x:
[8 1 7 5]
current x:
[[  6   8   8]
 [ 10   1   6]
 [  7   7   8]
 [  1 -99   9]]
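
Not every kind of indexing returns a view. As a minimal sketch (the array `a` and the variable names are illustrative, not from the cell above), `np.shares_memory` can be used to check whether two arrays overlap in memory: basic slicing yields a view, whereas indexing with a list of integers yields a copy.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)

col_view = a[:, 1]               # basic slicing: a view onto a's memory
col_copy = a[[0, 1, 2, 3], 1]    # integer-array ("fancy") indexing: a copy

shares_view = np.shares_memory(a, col_view)
shares_copy = np.shares_memory(a, col_copy)
print(shares_view, shares_copy)

col_copy[-1] = -1                # modifies only the copy
untouched = a[3, 1]              # still the original value
col_view[-1] = -99               # writes through to a[3, 1]
print(untouched, a[3, 1])
```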


## Reshaping returns views

In [5]:
x = x2.copy()
print("original x:")
print(x)
y = x.reshape((6, 2))
print("y is reshaped x:")
y[-1, -1] = -99
print(y)
print("current x:")
print(x)

original x:
[[ 6  8  8]
 [10  1  6]
 [ 7  7  8]
 [ 1  5  9]]
y is reshaped x:
[[  6   8]
 [  8  10]
 [  1   6]
 [  7   7]
 [  8   1]
 [  5 -99]]
current x:
[[  6   8   8]
 [ 10   1   6]
 [  7   7   8]
 [  1   5 -99]]
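
A caveat worth checking: `reshape` returns a view only when the requested shape can be expressed over the existing memory layout, and otherwise it silently returns a copy. The sketch below (array names are illustrative) contrasts a contiguous array with its transpose.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

r1 = a.reshape(3, 2)     # contiguous source: reshape can return a view
r2 = a.T.reshape(6)      # transposed (non-contiguous) source: reshape must copy

view_case = np.shares_memory(a, r1)
copy_case = np.shares_memory(a, r2)
print(view_case, copy_case)

r2[0] = -99              # modifies the copy only
first_elem = a[0, 0]     # unchanged
print(first_elem)
```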


Transposition is a form of reshaping (reversal of axes).
1D arrays have no orientation, i.e. they are neither row nor column vectors, so transposing them has no effect.

In [6]:
print(x1)  # x1 is neither a row nor a column vector
print(x1.T)

[ 5  0  3 11  3  7  9  3  5  2  4  7]
[ 5  0  3 11  3  7  9  3  5  2  4  7]


To convert a 1D array to a row/column vector (1xn or nx1 array), use reshaping or np.newaxis

In [7]:
print("row vectors")
print(x1.reshape((1, x1.size)))     # row vector
print(x1[np.newaxis, :])            # row vector
print("column vectors")
print(x1.reshape((x1.size, 1)))     # column vector
print(x1[:, np.newaxis])            # column vector

row vectors
[[ 5  0  3 11  3  7  9  3  5  2  4  7]]
[[ 5  0  3 11  3  7  9  3  5  2  4  7]]
column vectors
[[ 5]
 [ 0]
 [ 3]
 [11]
 [ 3]
 [ 7]
 [ 9]
 [ 3]
 [ 5]
 [ 2]
 [ 4]
 [ 7]]
[[ 5]
 [ 0]
 [ 3]
 [11]
 [ 3]
 [ 7]
 [ 9]
 [ 3]
 [ 5]
 [ 2]
 [ 4]
 [ 7]]


## Aggregation
When the aggregation axis is set, the shape of the output is the original shape with the aggregation axis removed

In [8]:
print("original matrix")
print(x2)
print("matrix sum (axis=None)")
print(x2.sum())
print("matrix sum (axis=0)")
print(x2.sum(axis=0))  # (4, 3) --> (3,)
print("matrix sum (axis=1)")
print(x2.sum(axis=1))  # (4, 3) --> (4,)

original matrix
[[ 6  8  8]
 [10  1  6]
 [ 7  7  8]
 [ 1  5  9]]
matrix sum (axis=None)
76
matrix sum (axis=0)
[24 21 31]
matrix sum (axis=1)
[22 17 22 15]
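
If the aggregated axis should be kept with length 1 rather than removed, the `keepdims=True` argument does that, which is convenient when the result is later combined with the original array. A small sketch with an illustrative array `m`:

```python
import numpy as np

m = np.arange(12).reshape(4, 3)

col_sums = m.sum(axis=0)                      # shape (3,)
row_sums = m.sum(axis=1)                      # shape (4,)
row_sums_2d = m.sum(axis=1, keepdims=True)    # shape (4, 1)
print(col_sums.shape, row_sums.shape, row_sums_2d.shape)

# The (4, 1) result lines up with the rows of m, so each row
# can be divided by its own sum.
row_fractions = m / row_sums_2d
print(row_fractions.sum(axis=1))
```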


## Broadcasting
The process of stretching the shape of an array to make it match the shape of another. 
This allows operators ('+', '-', '\*', '/') and ufuncs to work elementwise.

### Broadcasting rules
1. If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
2. If the shape of the two arrays does not match in a given dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.
3. If in any dimension the sizes disagree and neither is equal to 1, a ValueError is raised.
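
The rules above can be tried out on shapes alone, without allocating arrays. This sketch assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available; the shapes are chosen only for illustration.

```python
import numpy as np

# Rules 1 and 2: (3,) is padded to (1, 3), then stretched to (8, 3).
s1 = np.broadcast_shapes((8, 3), (3,))
# Rule 2 applied to both operands: (4, 1) and (1, 5) both stretch to (4, 5).
s2 = np.broadcast_shapes((4, 1), (1, 5))
print(s1, s2)

# Rule 3: (8,) is padded to (1, 8); 3 and 8 disagree and neither is 1.
try:
    np.broadcast_shapes((8, 3), (8,))
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```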


In [9]:
# centering an array
xorig = np.random.random((8, 3))   # shape (8, 3)
xmean = xorig.mean(axis=0)         # column mean shape (3,)
xcent = xorig - xmean              # rule 1: pad xmean to shape (1, 3); rule 2: stretch xmean to shape (8, 3)
print("xorig: shape={0}".format(xorig.shape))
print("xmean: shape={0}".format(xmean.shape))
print("xcent: shape={0}".format(xcent.shape))
print("sanity check: column sums of xcent")
print(xcent.sum(axis=0))

xorig: shape=(8, 3)
xmean: shape=(3,)
xcent: shape=(8, 3)
sanity check: column sums of xcent
[1.11022302e-16 3.33066907e-16 1.11022302e-16]


## Vector-Matrix operations
A vector is a 1D array, or a (nx1) or (1xn) 2D array  
A matrix is an mxn 2D array  
A tensor of rank r is an rD array with shape (n_0, n_1, ...n_{r-1})

### Inner products

In [10]:
x = np.random.random(3)                  # a vector without orientation
y = x.reshape((1, 3))                    # a row vector
z = np.random.random(3).reshape((3, 1))  # a column vector

# inner product of two vectors
print("using np.inner")
print(np.inner(x, z.T))
print(np.inner(z.T, x))
print(np.inner(z.T, y))
print(np.inner(y.T, z)) # surprise: both operands have shape (3, 1), so the result is a (3, 3) outer product
print("using np.dot")
print(np.dot(x, z))
print(np.dot(z.T, x))
print(np.dot(y, z))  
print(np.dot(z.T, y.T))


using np.inner
[0.28837911]
[0.28837911]
[[0.28837911]]
[[0.0765184  0.11995946 0.09227411]
 [0.04689188 0.07351336 0.05654727]
 [0.11472469 0.17985623 0.13834735]]
using np.dot
[0.28837911]
[0.28837911]
[[0.28837911]]
[[0.28837911]]
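
The difference comes from which axes are contracted: `np.inner` sums over the last axis of both operands, while `np.dot` sums over the last axis of the first operand and the second-to-last axis of the second. A sketch with two illustrative column vectors `a` and `b`:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 1))    # column vector
b = rng.random((3, 1))    # column vector

inner_ab = np.inner(a, b)    # contracts the length-1 last axes: shape (3, 3)
dot_ab = np.dot(a.T, b)      # (1, 3) with (3, 1): shape (1, 1)
print(inner_ab.shape, dot_ab.shape)

# For 2D inputs, np.inner(a, b) is the same as a @ b.T.
same_as_outer = np.allclose(inner_ab, a @ b.T)
print(same_as_outer)
```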


A headache-free way to do it is to use the "@" operator or np.matmul (python>=3.5)

In [11]:
print("using @")
print(x @ z)
print(z.T @ x)
print(y @ z)
print(z.T @ y.T) 

using @
[0.28837911]
[0.28837911]
[[0.28837911]]
[[0.28837911]]


### Vector-matrix multiplication

In [12]:
x = np.random.random(3)                  # a vector without orientation
z = x[:, np.newaxis]                     # a column vector
A = np.random.random((3, 3))
print("A@x:")
print(A @ x)
print("A@z:")
print(A @ z)
print("xA@x:")
print(x @ A @ x)
print("zA@z:")
print(z.T @ A @ z)

A@x:
[0.27898715 0.51903678 0.20493014]
A@z:
[[0.27898715]
 [0.51903678]
 [0.20493014]]
xA@x:
0.3715137538073113
zA@z:
[[0.37151375]]


### How the @ operator (numpy.matmul) works

A @ B

First it checks/broadcasts the two arrays as follows:
* If A is 1D, it is promoted to a matrix by *prepending* a 1 to its dimensions (row vector). After matrix multiplication the prepended 1 is removed.
* If B is 1D, it is promoted to a matrix by *appending* a 1 to its dimensions (column vector). After matrix multiplication the appended 1 is removed.
* If either A or B is ND, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.

Then it does a sum-product between the two arrays over the last dimension of A and the second-to-last dimension of B.

Corollaries:
1. Multiplication with a scalar is not allowed. Use the * operator instead.
2. If A and B are 1D arrays, it computes the dot-product of two vectors A^T dot B
3. If A is 1D and B is 2D, it computes the vector-matrix product A^T dot B
4. If A is 2D and B is 1D, it computes matrix-vector product A dot B
5. If A and B are 2D matrices, the standard matrix product is computed
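
The promotion and stacking rules can be checked by looking at result shapes. The arrays below are illustrative placeholders filled with ones; the scalar case catches both `ValueError` and `TypeError` as a precaution, since the exact exception type is assumed here rather than taken from the text above.

```python
import numpy as np

v = np.ones(3)               # 1D vector
M = np.ones((3, 3))          # 2D matrix
stack = np.ones((4, 2, 3))   # a stack of four 2x3 matrices

shape_vv = np.shape(v @ v)       # 1D @ 1D: scalar
shape_vM = (v @ M).shape         # 1D @ 2D: prepended 1 is removed
shape_Mv = (M @ v).shape         # 2D @ 1D: appended 1 is removed
shape_sM = (stack @ M).shape     # M is broadcast across the stack
shape_sv = (stack @ v).shape     # v is treated as a column vector per matrix
print(shape_vv, shape_vM, shape_Mv, shape_sM, shape_sv)

# Corollary 1: a scalar operand is rejected.
try:
    M @ 2.0
    scalar_rejected = False
except (ValueError, TypeError):
    scalar_rejected = True
print(scalar_rejected)
```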

## An example from finance

We have an equity portfolio of $N$ assets. 
For each asset $i$, the portfolio contains a position $x_i$ in million dollars.  
Positive (negative) position means the portfolio is long (short).   
The return volatilities $\sigma_i$ and the return correlation matrix $\rho$ are given (only the upper triangular part).  
Compute the volatility of the portfolio returns.

In [70]:
x = np.array([1.5, -0.8, 6.5, 4.7, -1.4])
sigma = np.array([0.2, 0.18, 0.12, 0.25, 0.21]) 
rho = np.array([[1.0, 0.4, 0.6, 0.7, 0.4], 
                [np.nan, 1.0, 0.5, 0.7, 0.5],     # np.nan (the np.NaN alias was removed in NumPy 2.0)
                [np.nan, np.nan, 1.0, 0.4, 0.5],
                [np.nan, np.nan, np.nan, 1.0, 0.6],
                [np.nan, np.nan, np.nan, np.nan, 1.0]])

First compute the symmetric correlation matrix

In [71]:
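cormat = np.triu(rho) + np.triu(rho).T   # np.triu zeroes the lower triangle; adding the transpose mirrors the upper part
np.fill_diagonal(cormat, 1.0)            # the diagonal was added twice, so reset it to 1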
print('Correlation matrix:')
print(cormat)

Correlation matrix:
[[1.  0.4 0.6 0.7 0.4]
 [0.4 1.  0.5 0.7 0.5]
 [0.6 0.5 1.  0.4 0.5]
 [0.7 0.7 0.4 1.  0.6]
 [0.4 0.5 0.5 0.6 1. ]]


Then compute the $N \times N$ covariance matrix among the assets.  
$C_{i,j} =  \rho_{i,j} \sigma_i \sigma_j$

In [72]:
covmat = sigma[:, np.newaxis] * cormat * sigma[np.newaxis, :]
print('Covariance matrix:')
print(covmat)

Covariance matrix:
[[0.04   0.0144 0.0144 0.035  0.0168]
 [0.0144 0.0324 0.0108 0.0315 0.0189]
 [0.0144 0.0108 0.0144 0.012  0.0126]
 [0.035  0.0315 0.012  0.0625 0.0315]
 [0.0168 0.0189 0.0126 0.0315 0.0441]]


The portfolio variance is the quadratic form $x^T \cdot C \cdot x$ divided by the square of the gross notional $A = \sum_i |x_i|$

In [74]:
A = np.sum(np.abs(x))
pvar = x @ covmat @ x
pvar /= A**2
sigma_p = np.sqrt(pvar)
print("Portfolio volatility: {0:4.2f}%".format(100* sigma_p))

Portfolio volatility: 10.90%
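
Dividing the quadratic form by `A**2` is equivalent to first converting the positions to weights $w_i = x_i / A$ and then computing $w^T \cdot C \cdot w$. The self-contained sketch below repeats the calculation that way, using the data from the cells above; the names `upper` and `w`, and the `np.einsum` cross-check, are additions for illustration only.

```python
import numpy as np

x = np.array([1.5, -0.8, 6.5, 4.7, -1.4])
sigma = np.array([0.2, 0.18, 0.12, 0.25, 0.21])
# Upper triangle of the correlation matrix, with zeros below the diagonal.
upper = np.array([[1.0, 0.4, 0.6, 0.7, 0.4],
                  [0.0, 1.0, 0.5, 0.7, 0.5],
                  [0.0, 0.0, 1.0, 0.4, 0.5],
                  [0.0, 0.0, 0.0, 1.0, 0.6],
                  [0.0, 0.0, 0.0, 0.0, 1.0]])
cormat = upper + upper.T - np.eye(5)       # symmetric, unit diagonal
covmat = np.outer(sigma, sigma) * cormat   # C_ij = rho_ij * sigma_i * sigma_j

w = x / np.abs(x).sum()                    # weights relative to gross notional
sigma_p = np.sqrt(w @ covmat @ w)
# Cross-check: the same quadratic form written as an explicit double sum.
sigma_p_check = np.sqrt(np.einsum("i,ij,j->", w, covmat, w))
print("Portfolio volatility: {0:4.2f}%".format(100 * sigma_p))
```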
