# Week 4 Practice

## Numerical Computing with 'NumPy'

### Arrays of Data

A list object containing numbers:

In [1]:
v = [0.5, 0.75, 1.0, 1.5, 2.0]
v

[0.5, 0.75, 1.0, 1.5, 2.0]

A list object containing list objects, which results in a matrix of numbers:

In [2]:
m = [v, v, v]
m

[[0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0]]

In [3]:
m[1]

[0.5, 0.75, 1.0, 1.5, 2.0]

In [4]:
m[1][0]

0.5

Now change the value of the first element of the v object and see what happens to the m object:

In [5]:
v[0] = 'Python'
m

[['Python', 0.75, 1.0, 1.5, 2.0],
 ['Python', 0.75, 1.0, 1.5, 2.0],
 ['Python', 0.75, 1.0, 1.5, 2.0]]
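The change shows up in every row because m holds three references to the same list object v, not three independent copies. A minimal sketch of one way to avoid this, assuming `copy.deepcopy` from the standard library is acceptable here (the names `v_new` and `m_new` are hypothetical and not part of the notebook):

```python
from copy import deepcopy

v_new = [0.5, 0.75, 1.0, 1.5, 2.0]
# Each row is an independent copy instead of a reference to v_new.
m_new = [deepcopy(v_new) for _ in range(3)]

v_new[0] = 'Python'
print(v_new)   # the original list is changed
print(m_new)   # the copies are not affected
```

For a flat list of numbers, `list(v_new)` or `v_new.copy()` would likely do as well; `deepcopy` matters once the elements are themselves mutable.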

### The Python array Class

In [6]:
import array
v = [0.5, 0.75, 1.0, 1.5, 2.0]
a = array.array('f', v)
a

array('f', [0.5, 0.75, 1.0, 1.5, 2.0])

In [7]:
a.append(0.5)
a

array('f', [0.5, 0.75, 1.0, 1.5, 2.0, 0.5])

In [8]:
a.extend([5.0, 6.75])
a

array('f', [0.5, 0.75, 1.0, 1.5, 2.0, 0.5, 5.0, 6.75])

In [9]:
2 * a

array('f', [0.5, 0.75, 1.0, 1.5, 2.0, 0.5, 5.0, 6.75, 0.5, 0.75, 1.0, 1.5, 2.0, 0.5, 5.0, 6.75])

Trying to append an object of a different data type than the one specified raises a TypeError:

In [10]:
a.append('string')

TypeError: must be real number, not str
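The type restriction comes from the typecode given at construction time. The sketch below, which uses only the standard library `array` module (the name `arr` is hypothetical), shows how the error might be handled and that an int is still accepted for the `'f'` typecode because it can be converted to a float:

```python
import array

arr = array.array('f', [0.5, 0.75, 1.0])

try:
    arr.append('string')      # not a real number, so it is rejected
except TypeError as err:
    print('rejected:', err)

arr.append(2)                 # an int is accepted and stored as a float
print(arr.typecode, arr.tolist())
```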

In [None]:
a.tolist()

### Regular NumPy Arrays

#### The Basics

In [11]:
import numpy as np
a = np.array([0, 0.5, 1.0, 1.5, 2.0])
a

array([0. , 0.5, 1. , 1.5, 2. ])

In [12]:
type(a)

numpy.ndarray

In [13]:
b = np.array(['a', 'b', 'c'])
b

array(['a', 'b', 'c'], dtype='<U1')

In [14]:
type(b)

numpy.ndarray

In [15]:
c = np.arange(2, 20, 2)
c

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])

In [16]:
d = np.arange(8, dtype=float)    # The np.float alias was deprecated in NumPy 1.20 and removed in NumPy 1.24; the built-in float is equivalent.
d

array([0., 1., 2., 3., 4., 5., 6., 7.])

In [17]:
d[5:]

array([5., 6., 7.])

In [18]:
d[:2]

array([0., 1.])

A major feature of the ndarray class is the multitude of built-in methods. For instance:

In [19]:
d.sum()

28.0

In [20]:
d.std()

2.29128784747792

In [21]:
d.cumsum()

array([ 0.,  1.,  3.,  6., 10., 15., 21., 28.])

Another major feature is the (vectorized) mathematical operations defined on ndarray objects:

In [22]:
l = [0., 0.5, 1.5, 3., 5.]
2 * l

[0.0, 0.5, 1.5, 3.0, 5.0, 0.0, 0.5, 1.5, 3.0, 5.0]

In [23]:
2 * d

array([ 0.,  2.,  4.,  6.,  8., 10., 12., 14.])

In [24]:
d ** 2

array([ 0.,  1.,  4.,  9., 16., 25., 36., 49.])

In [25]:
2 ** d

array([  1.,   2.,   4.,   8.,  16.,  32.,  64., 128.])

In [26]:
d ** d

array([1.00000e+00, 1.00000e+00, 4.00000e+00, 2.70000e+01, 2.56000e+02,
       3.12500e+03, 4.66560e+04, 8.23543e+05])
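To make the contrast with plain lists explicit, here is a small sketch of the element-wise scaling written once as a Python loop over a list and once as a vectorized ndarray operation. The names `values`, `doubled_list`, `arr`, and `doubled_arr` are illustrative only:

```python
import numpy as np

values = [0., 0.5, 1.5, 3., 5.]            # plain Python list
doubled_list = [2 * x for x in values]      # explicit loop over the elements

arr = np.array(values)
doubled_arr = 2 * arr                       # element-wise, no Python-level loop

print(doubled_list)
print(doubled_arr)
```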

Universal functions are another important feature of the NumPy package. They are “universal” in the sense that they generally operate on ndarray objects as well as on basic Python data types. However, when applying a universal function to, say, a Python float object, be aware that it performs worse than the equivalent function in the math module:

In [27]:
np.exp(d)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03])

In [28]:
np.sqrt(d)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131])

In [29]:
np.sqrt(2.5)

1.5811388300841898

In [30]:
import math
math.sqrt(2.5)

1.5811388300841898

In [31]:
math.sqrt(d)

TypeError: only size-1 arrays can be converted to Python scalars
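The performance difference on scalars can be checked with the standard library `timeit` module. This is a rough sketch, not a benchmark: the call count of 100,000 is an arbitrary choice and the absolute numbers depend on the machine and the NumPy version.

```python
import math
import timeit

import numpy as np

# Time 100,000 calls of each function on the same Python float.
t_np = timeit.timeit(lambda: np.sqrt(2.5), number=100_000)
t_math = timeit.timeit(lambda: math.sqrt(2.5), number=100_000)

print(f'np.sqrt:   {t_np:.4f} s')
print(f'math.sqrt: {t_math:.4f} s')
```

On typical setups the `math` version comes out ahead for single numbers, while the NumPy version pays off once whole arrays are processed in one call.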

In [32]:
e = np.array([d, d * 2])
e

array([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.],
       [ 0.,  2.,  4.,  6.,  8., 10., 12., 14.]])

In [33]:
e[0]

array([0., 1., 2., 3., 4., 5., 6., 7.])

In [34]:
e[0,2]

2.0

In [35]:
e[:, 1]

array([1., 2.])

In [36]:
e.sum()   # Calculates the sum of all values.

84.0

In [37]:
e.sum(axis=0)    # Calculates the sum along the first axis; i.e., column-wise.

array([ 0.,  3.,  6.,  9., 12., 15., 18., 21.])

In [38]:
e.sum(axis=1)    # Calculates the sum along the second axis; i.e., row-wise.

array([28., 56.])
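One way to read the `axis` argument is that it names the axis that gets collapsed by the sum. The sketch below rebuilds `e` so that it runs on its own and prints the resulting shapes; `keepdims=True` is an optional parameter of `sum()` that keeps the collapsed axis with length 1.

```python
import numpy as np

d = np.arange(8, dtype=float)
e = np.array([d, d * 2])                      # shape (2, 8)

col_sums = e.sum(axis=0)                      # collapses axis 0 -> shape (8,)
row_sums = e.sum(axis=1)                      # collapses axis 1 -> shape (2,)
row_sums_2d = e.sum(axis=1, keepdims=True)    # keeps the axis   -> shape (2, 1)

print(e.shape, col_sums.shape, row_sums.shape, row_sums_2d.shape)
```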

In [39]:
f = np.zeros((2, 3), dtype='i')
f

array([[0, 0, 0],
       [0, 0, 0]], dtype=int32)

In [40]:
g = np.ones((2, 3), dtype='i')
g

array([[1, 1, 1],
       [1, 1, 1]], dtype=int32)

In [41]:
h = np.linspace(5, 15, 12)
h
# Creates a one-dimensional ndarray object with evenly spaced numbers; the parameters used are start, stop, and num (number of elements).

array([ 5.        ,  5.90909091,  6.81818182,  7.72727273,  8.63636364,
        9.54545455, 10.45454545, 11.36363636, 12.27272727, 13.18181818,
       14.09090909, 15.        ])
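The spacing in the output above follows from (stop - start) / (num - 1), because both end points are included. A short sketch that asks `np.linspace` for the step via its `retstep` parameter and contrasts it with `np.arange`, which takes a step instead of a count (the name `k` is illustrative):

```python
import numpy as np

h, step = np.linspace(5, 15, 12, retstep=True)   # retstep also returns the spacing
print(step)                                      # (15 - 5) / (12 - 1)

# np.arange takes a step size and excludes the stop value.
k = np.arange(5, 15, 2.5)
print(k)
```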

In [42]:
h.size

12

In [43]:
h.ndim    # The number of dimensions.

1

In [44]:
h.dtype   # The dtype of the elements.

dtype('float64')

#### Reshaping and Resizing

In [45]:
c.shape

(9,)

In [46]:
np.shape(c)

(9,)

In [47]:
cr = c.reshape((3, 3))
cr

array([[ 2,  4,  6],
       [ 8, 10, 12],
       [14, 16, 18]])

In [48]:
cr.T

array([[ 2,  8, 14],
       [ 4, 10, 16],
       [ 6, 12, 18]])

In [49]:
cr.transpose()

array([[ 2,  8, 14],
       [ 4, 10, 16],
       [ 6, 12, 18]])

In [50]:
np.resize(c, (3, 1))

array([[2],
       [4],
       [6]])

In [51]:
np.resize(c, (1, 5))

array([[ 2,  4,  6,  8, 10]])

In [52]:
np.resize(c, (2, 4))

array([[ 2,  4,  6,  8],
       [10, 12, 14, 16]])

In [53]:
crr = np.resize(c, (4, 2))
crr

array([[ 2,  4],
       [ 6,  8],
       [10, 12],
       [14, 16]])
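The calls above show `np.resize` dropping elements when the target shape is smaller than the input. When the target is larger, it repeats the input instead, whereas `reshape` refuses any shape whose size differs. A minimal sketch, with `bigger` as an illustrative name:

```python
import numpy as np

c = np.arange(2, 20, 2)             # 9 elements

bigger = np.resize(c, (2, 6))       # needs 12 elements, so c is repeated
print(bigger)

try:
    c.reshape((2, 6))               # reshape never adds or drops elements
except ValueError as err:
    print('reshape failed:', err)
```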

In [54]:
np.hstack((crr, 2 * crr))

array([[ 2,  4,  4,  8],
       [ 6,  8, 12, 16],
       [10, 12, 20, 24],
       [14, 16, 28, 32]])

In [55]:
np.vstack((crr, 0.5 * crr))

array([[ 2.,  4.],
       [ 6.,  8.],
       [10., 12.],
       [14., 16.],
       [ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.],
       [ 7.,  8.]])

In [56]:
crr.flatten()

array([ 2,  4,  6,  8, 10, 12, 14, 16])
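`flatten()` always returns a copy. The related `ravel()` method returns a view where the memory layout allows it, which can matter for large arrays. The sketch below rebuilds `crr` so it is self-contained; `flat` and `view` are illustrative names.

```python
import numpy as np

crr = np.resize(np.arange(2, 20, 2), (4, 2))

flat = crr.flatten()        # always a copy
flat[0] = -1
print(crr[0, 0])            # unchanged by the write to flat

view = crr.ravel()          # a view when the memory layout allows it
print(np.shares_memory(view, crr))
```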

#### Boolean Arrays

In [57]:
crr > 5

array([[False, False],
       [ True,  True],
       [ True,  True],
       [ True,  True]])

In [58]:
(crr == 8).astype(int)

array([[0, 0],
       [0, 1],
       [0, 0],
       [0, 0]])

In [59]:
(crr > 4) & (crr <= 12)

array([[False, False],
       [ True,  True],
       [ True,  True],
       [False, False]])

In [60]:
crr[crr > 8]

array([10, 12, 14, 16])

In [61]:
crr[(crr > 4) & (crr <= 12)]

array([ 6,  8, 10, 12])

In [62]:
crr[(crr < 4) | (crr >= 12)]

array([ 2, 12, 14, 16])

In [63]:
np.where(crr % 2 == 0, 'even', 'odd')

array([['even', 'even'],
       ['even', 'even'],
       ['even', 'even'],
       ['even', 'even']], dtype='<U4')
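Boolean arrays can also be used for counting and for assignment. The following sketch rebuilds `crr` and shows three common patterns; the names `mask` and `capped` are illustrative.

```python
import numpy as np

crr = np.resize(np.arange(2, 20, 2), (4, 2))

mask = crr > 8
print(mask.sum())                  # True counts as 1, so this counts the matches
print(np.where(mask, crr, 0))      # keep matching values, replace the rest with 0

capped = crr.copy()
capped[capped > 12] = 12           # assignment through a Boolean mask
print(capped)
```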

### Vectorization of Code

#### Basic Vectorization

As demonstrated in the previous section, simple mathematical operations — such as calculating the sum of all elements — can be implemented on ndarray objects directly (via methods or universal functions).

In [64]:
r = np.arange(12).reshape((4, 3))
s = np.arange(12).reshape((4, 3)) * 0.5

In [65]:
r

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [66]:
s

array([[0. , 0.5, 1. ],
       [1.5, 2. , 2.5],
       [3. , 3.5, 4. ],
       [4.5, 5. , 5.5]])

In [67]:
r + s

array([[ 0. ,  1.5,  3. ],
       [ 4.5,  6. ,  7.5],
       [ 9. , 10.5, 12. ],
       [13.5, 15. , 16.5]])

NumPy also supports what is called broadcasting. This allows you to combine objects of different shape within a single operation.

In [68]:
r + 3

array([[ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [69]:
2 * r

array([[ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16],
       [18, 20, 22]])

In [70]:
2 * r + 3

array([[ 3,  5,  7],
       [ 9, 11, 13],
       [15, 17, 19],
       [21, 23, 25]])

In [71]:
r.shape

(4, 3)
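Broadcasting is not limited to scalars. Shapes are compared from the last axis backwards, and two axes are compatible if they are equal or one of them has length 1. A small sketch with illustrative arrays `row` and `col`:

```python
import numpy as np

r = np.arange(12).reshape((4, 3))

row = np.array([0, 10, 20])                   # shape (3,): matches the last axis of r
col = np.array([[0], [100], [200], [300]])    # shape (4, 1)

print(r + row)    # row is added to every row of r
print(r + col)    # col is added to every column of r

try:
    r + np.arange(4)    # shape (4,) does not match the last axis (length 3)
except ValueError as err:
    print('cannot broadcast:', err)
```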

In [72]:
def f(x):
    return 3 * x + 5

In [73]:
f(0.5)

6.5

In [74]:
f(r)

array([[ 5,  8, 11],
       [14, 17, 20],
       [23, 26, 29],
       [32, 35, 38]])
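The function above works on ndarray objects because it uses only arithmetic operators. A function built on the `math` module does not, while the corresponding universal function does. A sketch with two hypothetical functions, `g_math` and `g_np`:

```python
import math

import numpy as np

r = np.arange(12).reshape((4, 3))

def g_math(x):
    return math.sin(x) + 1    # math functions expect a single number

def g_np(x):
    return np.sin(x) + 1      # universal function, applied element-wise

try:
    g_math(r)
except TypeError as err:
    print('math version failed:', err)

print(g_np(r).shape)
```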

#### Memory Layout

To illustrate the potential importance of the memory layout of arrays in science and finance, consider the following construction of multidimensional ndarray objects:

In [75]:
x = np.random.standard_normal((1000000, 5))
y = 2 * x + 3
C = np.array((x, y), order='C')    # This creates a three-dimensional ndarray object of shape (2, 1000000, 5) with C order (row-major).
F = np.array((x, y), order='F')    # This creates the same three-dimensional ndarray object with F order (column-major).
C[:2].round(2)

array([[[-0.49,  0.39,  1.46,  0.28,  1.5 ],
        [ 0.06,  0.41,  1.07,  0.14, -1.47],
        [-2.74, -0.21, -1.09, -0.46, -0.31],
        ...,
        [ 0.48, -1.46,  0.15,  0.65, -2.1 ],
        [-0.88, -0.74,  0.65, -0.64, -0.61],
        [ 0.5 , -0.16, -1.04,  0.58,  0.86]],

       [[ 2.02,  3.79,  5.93,  3.56,  6.01],
        [ 3.12,  3.82,  5.14,  3.28,  0.07],
        [-2.48,  2.58,  0.83,  2.08,  2.38],
        ...,
        [ 3.96,  0.07,  3.29,  4.29, -1.19],
        [ 1.25,  1.53,  4.3 ,  1.71,  1.77],
        [ 4.  ,  2.68,  0.93,  4.17,  4.73]]])
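The two objects hold the same values and differ only in how those values are laid out in memory. One way to see this is through the `flags` and `strides` attributes. The sketch uses a smaller array than the notebook (1,000 rows, an arbitrary choice) so that it runs quickly:

```python
import numpy as np

x = np.random.standard_normal((1000, 5))    # smaller than in the notebook
y = 2 * x + 3
C = np.array((x, y), order='C')
F = np.array((x, y), order='F')

print(C.shape, F.shape)                      # identical shapes
print(C.flags['C_CONTIGUOUS'], F.flags['F_CONTIGUOUS'])
print(C.strides)    # bytes per step along each axis; the last axis is contiguous
print(F.strides)    # here the first axis is contiguous
print(np.array_equal(C, F))                  # same values, different layout
```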

Let’s look at some fundamental examples and use cases for both types of ndarray objects and consider the speed with which they are executed given the different memory layouts:

In [76]:
%timeit C.sum()
# %timeit times the execution of a Python statement or expression. C order means that operating row-wise on the array will be slightly quicker.


12 ms ± 2.27 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [77]:
%timeit F.sum()
# F order means that operating column-wise on the array will be slightly quicker.


12.9 ms ± 1.14 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
