# Foundations of Computational Economics #14

by Fedor Iskhakov, ANU

<img src="_static/img/dag3logo.png" style="width:256px;">

## Vectors and matrices (NumPy)

<img src="_static/img/lecture.png" style="width:64px;">

<img src="_static/img/youtube.png" style="width:65px;">

[https://youtu.be/QLp3PEziRJE](https://youtu.be/QLp3PEziRJE)

Description: NumPy arrays, their data types and differences from native Python, operations on arrays, solving systems of linear equations.

### Scientific stack in Python

Collection of modules (libraries) used in scientific numerical computations:

- **``NumPy``** is a widely used scientific computing package that implements fast array processing (vectorization)  
- **``SciPy``** is a collection of functions that perform common scientific operations (optimization, root finding, interpolation, numerical integration, etc.)  
- **``Pandas``** is a data manipulation package with special data types and methods  
- **``Numba``** is a just-in-time (JIT) compiler for a subset of Python and NumPy functions  
- **``Matplotlib``** serves for making figures and plots  

### NumPy library

<img src="_static/img/numpy_logo.png" style="width:512px;">

- **Vectorization in Python**  
- **NumPy** is a widely used scientific computing package that brings fast array processing to Python  
- Reference guide: [https://docs.scipy.org/doc/numpy/reference/](https://docs.scipy.org/doc/numpy/reference/)  
- Runs fast compiled code written in C & Fortran under the hood  

#### Importing modules in Python

- `import LIBRARY as ref`, then call library functions as `ref.function`  
- `from LIBRARY import function` or `from LIBRARY import *`, then call library functions directly  
- keeping the conventional alias (such as `np` for NumPy) is a good idea for making your code understood by others!  
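
A minimal sketch of the two import styles listed above. The variable names `root_np` and `root_math` are illustrative only and not part of the lecture code.

```python
import numpy as np            # conventional alias for NumPy
from math import sqrt, pi     # import selected names directly

root_np = np.sqrt(16.0)       # called through the alias
root_math = sqrt(16.0)        # called directly, without a prefix
print(root_np, root_math, pi)
```

The aliased form makes it obvious which library a function comes from, which matters when several libraries provide functions with the same name (here both `math` and NumPy have `sqrt`).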

In [None]:
# import libraries
import numpy as np

#### Power of NumPy

In [None]:
N = 1000000
data_list = list(range(N)) # Native Python list
t1 = %timeit -n10 -r10 -o mean1 = sum(data_list)/N

import numpy as np
data_array = np.array(range(N)) #NumPy array
t2 = %timeit -n10 -r10 -o mean2 = data_array.mean()

print('NumPy is on average %2.3f times faster' % (t1.average/t2.average))

SI orders of magnitude (for reading the units reported by `%timeit`): [https://en.wikipedia.org/wiki/Micro-](https://en.wikipedia.org/wiki/Micro-)

#### Scientific computing: more advanced treatment of numbers

Inherited from the low-level C implementation

- int8, uint8 (signed and unsigned)  
- int16, uint16  
- int32, uint32  
- float16  
- float32  
- float64  


Full list of types:
[https://numpy.org/doc/stable/user/basics.types.html](https://numpy.org/doc/stable/user/basics.types.html)
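
The representable range of each type can be inspected with `np.iinfo` (integers) and `np.finfo` (floats). The loop below is a small sketch; the variable `int8_range` is introduced only for illustration.

```python
import numpy as np

# integer types: smallest and largest representable values
for t in ('int8', 'uint8', 'int16', 'uint16', 'int32', 'uint32'):
    info = np.iinfo(t)
    print('%-7s min=%d max=%d' % (t, info.min, info.max))

# floating point types: largest value and machine epsilon
for t in ('float16', 'float32', 'float64'):
    info = np.finfo(t)
    print('%-7s max=%.3e eps=%.3e' % (t, info.max, info.eps))

int8_range = (np.iinfo('int8').min, np.iinfo('int8').max)
```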

#### Array initialization with type

In [None]:
import sys
x = np.array([-1,0,1.4],dtype='bool')
y = np.array([-1,0,1.4],dtype='int16')
z = np.array([-1,0,1.4],dtype='float64')
print('x %s, takes %d bytes' % (type(x[0]),sys.getsizeof(x)))
print('y %s, takes %d bytes' % (type(y[0]),sys.getsizeof(y)))
print('z %s, takes %d bytes' % (type(z[0]),sys.getsizeof(z)))

NumPy arrays hold data of **the same type**
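
A short sketch of what the single-type rule implies in practice: mixed inputs are converted to a common type on creation, and values assigned later are cast to the type of the array. The arrays `a`, `b`, `c` here are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])               # all integers: an integer dtype is chosen
b = np.array([1, 2.5, 3])             # one float: every element becomes float64
c = np.array([1, 2], dtype='int16')
c[0] = 7.9                            # the assigned float is truncated to fit int16
print(a.dtype, b.dtype, c)
```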

#### Integer overflow

In [None]:
x = np.array([0,0,0],dtype='uint8')
x[0] = 255
x[1] = x[0] + 1
x[2] = x[1] + 1
print(x)

#### Infinity and not-a-number

In [None]:
# Inf and NaN
# np.seterr(all=None, divide='ignore', over=None, under=None, invalid='ignore')
x = np.array([-1,0,1,10],dtype='float64')
print( x / 0 )
# y = 10 / 0 # core Python

In [None]:
# Comparing nans and infs
a = (np.inf == np.inf)
b = (np.nan == np.nan)  # nan is never equal to anything, including itself!
c = np.isnan(np.nan)
print (a, b, c)
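
Because `nan` cannot be found with `==`, missing values are usually located with `np.isnan` and then either filtered out with a mask or skipped with the nan-aware reductions. A minimal sketch with illustrative variable names:

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])
mask = np.isnan(x)               # True where the value is nan
plain_mean = np.mean(x)          # nan propagates through ordinary reductions
robust_mean = np.nanmean(x)      # nan-aware mean ignores the missing value
filtered = x[~mask]              # keep only the valid entries
print(mask, plain_mean, robust_mean, filtered)
```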

#### What is inside array?

First, import the module `obj_explore.py`. This is not trivial because Python (and hence a Jupyter notebook) only imports modules from directories on its search path, `sys.path`

In [None]:
# add path to the modules directory
import sys
sys.path.insert(1, './_static/include/')
# import the obj_explore() function
from obj_explore import *

In [None]:
a = np.array([1,2,3,4,5],dtype='float64')
# a = np.arange(5,dtype='uint8') + 1
print([type(a_element) for a_element in a])
a

In [None]:
obj_explore(a,'public')
# help(a)

#### Memory footprint

In [None]:
import sys
def memory_usage(var,grow,steps=10):
    """Returns data on memory usage when var is grown using supplied function for given number of steps"""
    d={"x":[],"y":[],"v":[]} # dictionary for x, y data, and values
    for i in range(steps):
        var=grow(var) # next value
        d["v"].append(var)
        d["x"].append(i+1)
        d["y"].append(sys.getsizeof(var))
    return d

In [None]:
x = [0,] # Python list
grow = lambda x: [0,]*len(x)*2
d1 = memory_usage(x,grow,steps=10)
x = np.array([0])
grow = lambda x: np.array([0,]*x.size*2,dtype='int8')
d2 = memory_usage(x,grow,steps=10)

import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(d1["x"],d1["y"],label='Python')
plt.plot(d2["x"],d2["y"],label='Numpy')
plt.axis([min(d1["x"]),max(d1["x"]),0,max(d1["y"])+1])
plt.ylabel('size in memory, bytes')
plt.xlabel('steps of variable update')
plt.legend()
plt.show()
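
`sys.getsizeof` includes the overhead of the Python object itself. As a complementary sketch, the size of the raw data buffer of an array can be read from its `itemsize` and `nbytes` attributes (the arrays `a8` and `a64` are illustrative):

```python
import numpy as np

a8 = np.zeros(1000, dtype='int8')
a64 = np.zeros(1000, dtype='float64')
print(a8.itemsize, a8.nbytes)     # 1 1000
print(a64.itemsize, a64.nbytes)   # 8 8000
```

Choosing the smallest type that can hold the data reduces the memory footprint proportionally, at the cost of a narrower range of representable values.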

#### Creating arrays

- From lists  
- Using functions for standard cases  

In [None]:
a = np.array([1,3,5.0,7])
print(a)

In [None]:
a = np.empty(25,'int8')  # not initialized !
b = np.zeros(5)          # initialized with zeros
c = np.ones(5)
d = np.arange(10)
e = np.linspace(2, 3, 11) # 11 evenly spaced points from 2 to 3, both ends included
print(a,b,c,d,e,sep='\n\n')

Note that an uninitialized array may contain garbage (whatever happened to be in that memory)

#### Matrices and multidimensional arrays

In [None]:
a = np.eye(5) # identity matrix
b = np.ones((2,3))
print(a)
print(b)

In [None]:
b=b+2
c = np.asmatrix(b) # matrix !
print(b)
print(type(b))
print(c)
print(type(c))

In [None]:
print(b)
print(b*b) # element by element
print(c*c) # matrix multiplication: raises ValueError, (2,3) by (2,3) is not conformable

In [None]:
# c*c
print(c*c.transpose())

#### Indexing into arrays

Several types of indexes:

- scalar index x[0] (getitem)  
- slicing like strings x[1::2]  
- numerical indexing  
- masks  

In [None]:
z = np.linspace(0, 2, 15)
z = np.reshape(z,[5,3])
print(z)

In [None]:
print(z,end='\n\n')
print( z[1]      )   # scalar index: returns row array
print( z[1,0]    )   # scalar index: returns number
print( z[-1:]    )   # slicing: returns ?
print( z[1:3,1]  )   # slicing + scalar index
print( z[1:3,1:] )   # slicing
print( z[:,1]    )   # slicing to get the column

In [None]:
# Assigning elements of an array
z[1,0] = -1
z[2] = [4,5,7]  # assign whole row from a list
z[:,0] = np.array([4.2,5.2,6.2,7.2,8.2]) # assign column from nparray
z[:2,1]=9.3
z[3][1]=-2 # note double bracket indexing
print(z)

In [None]:
z = np.linspace(0, 2, 12)
z = np.reshape(z,[4,3])
print(z,end='\n\n')

print( z[[0,2,2],[0,1,0]]    )   # numerical (element by element) indexing
print( z[z>1.0]              )   # boolean indexing (masking)
mask = np.logical_and(z>1.0,z<1.75)
print(mask,end='\n\n')
print( z[mask]               )   # boolean indexing (masking)

#### Operation broadcasting

<img src="_static/img/broadcasting.png" style="width:800px;">

In [None]:
a = np.array([1,2,3])
b = np.array([4,5,6])
print(a + 10)
print(a + b)

In [None]:
x = np.arange(3) + 5
y = np.ones((3,3))
print(x,y,x+y,sep='\n\n')

In [None]:
x = np.arange(5).reshape((5,1))  # or
x = np.arange(5)[:,np.newaxis]
print(x,end='\n\n')
print(x.transpose(),end='\n\n')
print(x + x.transpose())

In [None]:
x = np.arange(12).reshape((3,4))
y = np.arange(4)
print(x,y,x+y,sep='\n\n')

##### Broadcasting works with:

- addition (`+`)  
- subtraction (`-`)  
- multiplication (`*`)  
- division (`/`)  
- integer division (`//`)  
- power (`**`)  
- all *universal functions*  
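
The broadcasting rule compares shapes starting from the last dimension: two dimensions are compatible when they are equal or when one of them is 1. The sketch below uses this to center a matrix by columns and by rows without loops; all variable names are illustrative.

```python
import numpy as np

x = np.arange(12, dtype='float64').reshape((4, 3))   # shape (4, 3)

col_means = x.mean(axis=0)                  # shape (3,)
centered = x - col_means                    # (4,3) - (3,)  -> (4,3)

row_means = x.mean(axis=1, keepdims=True)   # shape (4, 1)
row_centered = x - row_means                # (4,3) - (4,1) -> (4,3)

print(centered.mean(axis=0))       # column means are now zero
print(row_centered.mean(axis=1))   # row means are now zero
```

Without `keepdims=True` the row means would have shape `(4,)`, which is not compatible with `(4, 3)`, so the subtraction would fail.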

#### ufunc

Functions provided by NumPy which support vectorization and broadcasting

- Act on array element-wise  
- Efficient implementation with low level code  


[https://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs](https://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs)
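
Besides acting element-wise, binary ufuncs carry methods such as `reduce`, `accumulate` and `outer`. A small sketch with illustrative variable names:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
print(type(np.add))                 # ufuncs are objects of type numpy.ufunc

total = np.add.reduce(a)            # 10, same as a.sum()
running = np.add.accumulate(a)      # [ 1  3  6 10], same as a.cumsum()
table = np.multiply.outer(a, a)     # 4 by 4 multiplication table
print(total, running, table, sep='\n')
```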

In [None]:
import math
N = 10000
data_list = list(range(N)) # Native Python list
t1 = %timeit -n10 -r10 -o sin1 = [math.sin(x) for x in data_list]

import numpy as np
data_array = np.array(range(N)) #NumPy array
t2 = %timeit -n10 -r10 -o sin2 = np.sin(data_array)

print('NumPy is on average %2.3f times faster' % (t1.average/t2.average))

#### Reduction operations

Functions that reduce an array along one or more axes and return an array of
smaller dimension (or a scalar): **sum**, **min**, **max**, **mean**, **all**, **any**

[https://numpy.org/doc/stable/reference/routines.math.html](https://numpy.org/doc/stable/reference/routines.math.html)

In [None]:
x = np.arange(12).reshape(4,3)
print(x)
print(np.sum(x))
print(np.sum(x, axis=1))
print(np.maximum.reduce(x,axis=1,keepdims=True))

In [None]:
x = np.arange(24).reshape(2,4,3)
print(x)
y = np.min(x,axis=2)
# y = np.mean(x,axis=(1,2))
print(y.shape)
print(y)

#### References and mutability

NumPy avoids copying array data whenever possible

- principle: a view (no copy) is returned when the result can be addressed with simple pointer arithmetic on the original memory  
- slices generally return views, not copies  
- numerical indexing and masks generally return copies  
- **.flags** to check  
- **.copy** to make a true copy  

In [None]:
x = np.arange(12).reshape(4,3) # 2-dim array
print(x)
y = x[:,1:2]
print(y)
print(y.flags)
y[0] = 999
print(x)

In [None]:
y = x[[0,1],[0,2]]
print(y)
print(y.flags)
y[0]=-100
print(x)

[https://numpy.org/doc/stable/reference/generated/numpy.ndarray.flags.html](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.flags.html)
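
As a complement to `.flags`, the function `np.shares_memory` reports directly whether two arrays overlap in memory. The sketch below repeats the slicing and indexing patterns from the cells above with illustrative variable names.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)
view = x[:, 1:2]              # slice: a view on the same memory
fancy = x[[0, 1], [0, 2]]     # numerical indexing: a new array
safe = x[:, 1:2].copy()       # explicit copy of a slice

print(np.shares_memory(x, view))    # True
print(np.shares_memory(x, fancy))   # False
print(np.shares_memory(x, safe))    # False

view[0] = 999                 # writes through to x
safe[1] = -1                  # leaves x untouched
print(x)
```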

### Linear algebra with NumPy

Submodule `numpy.linalg`

- matrix decompositions  
- eigenvalues  
- determinant, rank  
- matrix inverse  
- linear systems of equations  


[https://numpy.org/doc/stable/reference/routines.linalg.html](https://numpy.org/doc/stable/reference/routines.linalg.html)
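
A brief sketch of several of these routines on a small symmetric matrix `M`, chosen here only for illustration:

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])

det = np.linalg.det(M)                 # determinant, 2*3 - 1*1 = 5
rank = np.linalg.matrix_rank(M)        # rank
Minv = np.linalg.inv(M)                # inverse
eigvals, eigvecs = np.linalg.eig(M)    # eigenvalues and eigenvectors (in columns)

print(det, rank, Minv, eigvals, sep='\n')
print(np.allclose(M @ Minv, np.eye(2)))             # inverse check
print(np.allclose(M @ eigvecs, eigvecs * eigvals))  # M v = lambda v, column by column
```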

In [None]:
import numpy.linalg as linalg
obj_explore(linalg,'public methods')

#### Matrix operations

$$
A=
\begin{pmatrix}
1 & 2 & 0 & 5 \\
4 & -2 & 1 & 1 \\
0 & 0 & -2 & 7 \\
3 & 1 & 4 & 0 \\
\end{pmatrix}
$$

In [None]:
A = np.array([[1,2,0,5],[4,-2,1,1],[0,0,-2,7],[3,1,4,0]])
print(A)

In [None]:
print(A,end='\n\n')
# print( A.transpose() )
# print( np.linalg.matrix_rank(A) )
# print( np.linalg.inv(A) )
# print( np.linalg.det(A) )
B = A[:2,:]
# print(B,end='\n\n')
# print( B * B ) # element by element
# print( B @ B ) # matrix multiplication
# print( B @ B.T )

#### Linear systems of equations

$$
\begin{eqnarray*}
1x_1+2x_2+5x_4&=&5\\
4x_1-2x_2+x_3+x_4&=&5\\
-2x_3+7x_4&=&0\\
3x_1+x_2+4x_3&=&-3\\
\end{eqnarray*}
$$

In matrix notation $ Ax=b $ where

$$
A=
\begin{pmatrix}
1 & 2 & 0 & 5 \\
4 & -2 & 1 & 1 \\
0 & 0 & -2 & 7 \\
3 & 1 & 4 & 0 \\
\end{pmatrix},\;\;
b=\begin{pmatrix}
5\\
5\\
0\\
-3
\end{pmatrix},\;\;
x=\begin{pmatrix}
x_1\\
x_2\\
x_3\\
x_4
\end{pmatrix}
$$

In [None]:
b = np.array([5,5,0,-3])
x=np.linalg.solve(A, b)
print('Solution is %r' % x)
print('Check: max|Ax-b| = %1.5e' % np.max(np.abs(A@x-b)))  # absolute value, so negative errors are not missed

#### Overdetermined linear systems of equations

$$
\begin{eqnarray*}
1x_1+2x_2&=&5\\
4x_1-2x_2+x_3&=&5\\
-2x_3&=&0\\
3x_1+x_2+4x_3&=&-3\\
\end{eqnarray*}
$$

In [None]:
A = np.array([[1,2,0],[4,-2,1],[0,0,-2],[3,1,4]])
# lstsq returns a tuple: (solution, residuals, rank, singular values)
x,*_=np.linalg.lstsq(A, b, rcond=None) # ignore all outputs except the first
print(A)
print('Solution is %r' % x)
# the system is overdetermined, so the residual is generally not zero
print('Check: max|Ax-b| = %1.5e' % np.max(np.abs(A@x-b)))
# help(np.linalg.lstsq)
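
The least squares solution minimizes $ \|Ax-b\|^2 $ and, when $ A $ has full column rank, satisfies the normal equations $ A^TAx = A^Tb $. The sketch below checks this for the system above; it redefines `A` and `b` to be self-contained, and the names `x_lstsq`, `x_normal` and `residual` are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 0], [4, -2, 1], [0, 0, -2], [3, 1, 4]], dtype='float64')
b = np.array([5, 5, 0, -3], dtype='float64')

x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)   # least squares solver
x_normal = np.linalg.solve(A.T @ A, A.T @ b)      # normal equations

residual = A @ x_lstsq - b
print(x_lstsq, x_normal, sep='\n')
print(A.T @ residual)   # residual is orthogonal to the columns of A (zeros up to rounding)
```

In practice `lstsq` is preferable to solving the normal equations directly, because forming $ A^TA $ squares the condition number of the problem.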

#### Market equilibrium in a linear model

- Prices $ p $, quantities $ q $  
- $ n $ goods in the economy  
- Linear demand $ D(p) = A p + d $, where $ A $ is n by n, and
  $ p $ is n by 1  
- Linear supply $ S(p) = B p + s $, where $ B $ is n by n, and
  $ p $ is n by 1  
- Market clearing prices: $ D(p)=S(p) $  
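
Substituting the linear demand and supply into the market clearing condition gives a linear system in prices:

$$
D(p)=S(p)
\;\;\Longleftrightarrow\;\;
Ap+d=Bp+s
\;\;\Longleftrightarrow\;\;
(A-B)\,p=s-d
$$

Assuming $ A-B $ is invertible, the equilibrium prices are $ p^{\star}=(A-B)^{-1}(s-d) $ and the equilibrium quantities are $ q^{\star}=Ap^{\star}+d $.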

In [None]:
def random_matrix(n,positive=True,maxeigen=10):
    '''Generates a random square positive/negative semi-definite matrix'''
    e=np.random.uniform(0,maxeigen,n) # random non-negative diagonal entries
    r=np.random.uniform(0,1,n*n).reshape(n,n) # random mixing matrix (not a true rotation)
    e = e if positive else -e
    A = np.diag(e)  # diagonal matrix with entries e
    return r @ A @ r.T  # positive/negative semi-definite
n = 3  # number of products
A = random_matrix(n,positive=False)  # demand
d = np.array([100,]*n)
B = random_matrix(n)  # supply
s = np.zeros(n)
p = np.linalg.solve(A-B, s-d)  # solve for equilibrium prices
q = A @ p + d # equilibrium quantities
print('Demand is given by Ap+d:\nA=%r\nd=%r' % (A,d))
print('Supply is given by Bp+s:\nB=%r\ns=%r' % (B,s))
print('Equilibrium prices are     p = {}'.format(p))
print('Equilibrium quantities are q = {}'.format(q))

### Further learning resources

- Reference manual for NumPy
  [https://numpy.org/doc/stable/reference/](https://numpy.org/doc/stable/reference/)  
- 📖 Kevin Sheppard “Introduction to Python for Econometrics, Statistics
  and Data Analysis.” *Chapters: 4, 6, 7, 8*  
- SciPy 2017 Tutorial on NumPy (2.5h)
  [https://www.youtube.com/watch?v=lKcwuPnSHIQ&ab_channel=Enthought](https://www.youtube.com/watch?v=lKcwuPnSHIQ&ab_channel=Enthought)  
- Essence of linear algebra playlist by 3Blue1Brown
  [https://www.youtube.com/watch?v=fNk_zzaMoSs&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab](https://www.youtube.com/watch?v=fNk_zzaMoSs&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)  