# Lecture 07 - Numerical Computing (NumPy)

## Overview

`NumPy` (Numerical Python) is a fundamental library for scientific computing in Python. It provides support for arrays, matrices, and many mathematical functions that operate on these data structures. 

In this notebook, you'll learn about:
- Creating and manipulating `NumPy` arrays
- Basic mathematical operations with `NumPy`
- Statistical and financial calculations using `NumPy`

**Why Use NumPy?**

`NumPy` offers significant advantages when working with numerical data:
- **Performance:** Vectorized `NumPy` operations run in compiled code and are typically much faster than equivalent loops over Python lists.
- **Functionality:** It provides a wide range of mathematical functions and operations.
- **Convenience:** `NumPy` arrays are more convenient to work with, especially for large datasets.
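
The performance point can be checked with a rough timing sketch. The array size and repeat count below are arbitrary choices for illustration, and the measured times will vary by machine and NumPy version.

```python
import timeit

import numpy as np

n = 100_000  # arbitrary size chosen for illustration
values_list = list(range(n))
values_array = np.arange(n, dtype=np.int64)

# Square every element: list comprehension vs. one vectorized expression
list_time = timeit.timeit(lambda: [x * x for x in values_list], number=10)
array_time = timeit.timeit(lambda: values_array * values_array, number=10)

print(f"list:  {list_time:.4f} s")
print(f"numpy: {array_time:.4f} s")
```

Both versions compute the same values; only the time taken differs.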

## 1. Arrays of data with Python lists

Previous notebooks covered Python's general-purpose data structures, especially lists, which are flexible but can be inefficient in memory use and speed. Scientific and financial applications that need high-performance operations on specialized data structures rely on arrays instead. An array organizes elements of the same data type, typically numbers such as real values, in rows and columns. A one-dimensional array represents a vector, while multi-dimensional arrays form matrices and cubes; these structures are essential in fields like linear algebra. The `NumPy` library, with its `ndarray` class, provides specialized functionality for handling arrays efficiently.

**Dimensionality**

In [12]:
# one-dimensional list (vector)
v = [0.5, 0.75, 1.0, 1.5, 2.0]

# two-dimensional list (matrix): a list of lists
m = [v, v, v]

**Indexing**

In [17]:
m[1]

[0.5, 0.75, 1.0, 1.5, 2.0]

In [16]:
m[1][0]

0.5

**Deeper**: n-dimensions, deep copies

In [18]:
v1 = [0.5, 1.5]
v2 = [1, 2]
m = [v1, v2]
c = [m, m]
c

[[[0.5, 1.5], [1, 2]], [[0.5, 1.5], [1, 2]]]

In [20]:
v = [0.5, 0.75, 1.0, 1.5, 2.0]
m = [v, v, v]
v[0] = 'Python'
m

[['Python', 0.75, 1.0, 1.5, 2.0],
 ['Python', 0.75, 1.0, 1.5, 2.0],
 ['Python', 0.75, 1.0, 1.5, 2.0]]

In [21]:
from copy import deepcopy
v = [0.5, 0.75, 1.0, 1.5, 2.0]
m = 3 * [deepcopy(v), ]
m

[[0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0]]

In [22]:
v[0] = 'Python'
m

[[0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0]]

In [23]:
v

['Python', 0.75, 1.0, 1.5, 2.0]
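
One subtlety is worth a small sketch: `3 * [deepcopy(v), ]` makes the rows independent of `v`, but it calls `deepcopy` only once, so the three rows still refer to the same copied list. The variable names below (`row`, `shared`, `independent`) are illustrative and not part of the notebook.

```python
from copy import deepcopy

row = [0.5, 0.75, 1.0]

# Repeating a one-element list repeats the reference, not the row itself
shared = 3 * [deepcopy(row)]
print(shared[0] is shared[1])            # True: one object, listed three times

# A comprehension calls deepcopy once per row
independent = [deepcopy(row) for _ in range(3)]
print(independent[0] is independent[1])  # False: three separate lists

shared[0][0] = 'Python'        # shows up in every row of `shared`
independent[0][0] = 'Python'   # changes only the first row
print(shared)
print(independent)
```

If the rows need to be modified independently of each other, the comprehension form is the safer choice.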

## 2. Regular NumPy arrays

Composing array structures from `list` objects is possible but not very convenient, since the `list` class is designed for broader, general-purpose use. A dedicated class for array-type structures is far more useful. `numpy.ndarray` is such a class, built specifically to handle n-dimensional arrays both conveniently and efficiently, i.e., with high performance.

### 2.1 The basics

In [25]:
import numpy as np
a = np.array([0, 0.5, 1.0, 1.5, 2.0])
a

array([0. , 0.5, 1. , 1.5, 2. ])

In [26]:
type(a)

numpy.ndarray

In [27]:
a = np.array(['a', 'b', 'c'])
a

array(['a', 'b', 'c'], dtype='<U1')

**Range**

In [28]:
a = np.arange(2, 20, 2)
a

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])

In [29]:
a = np.arange(8, dtype=float)  # the `np.float` alias was deprecated in NumPy 1.20 and removed in 1.24
a

array([0., 1., 2., 3., 4., 5., 6., 7.])

**Indexing**

In [30]:
a[5:]

array([5., 6., 7.])

In [31]:
a[:2]

array([0., 1.])

**Built-in methods**

In [32]:
a.sum()

28.0

In [33]:
a.std()

2.29128784747792

In [34]:
a.cumsum()

array([ 0.,  1.,  3.,  6., 10., 15., 21., 28.])

**(Vectorized) Math operations**

In [35]:
# with python list
l = [0., 0.5, 1.5, 3., 5.]
2 * l

[0.0, 0.5, 1.5, 3.0, 5.0, 0.0, 0.5, 1.5, 3.0, 5.0]

In [37]:
# with NumPy
2 * a

array([ 0.,  2.,  4.,  6.,  8., 10., 12., 14.])

In [38]:
a ** 2

array([ 0.,  1.,  4.,  9., 16., 25., 36., 49.])

In [39]:
a ** a

array([1.00000e+00, 1.00000e+00, 4.00000e+00, 2.70000e+01, 2.56000e+02,
       3.12500e+03, 4.66560e+04, 8.23543e+05])

**Universal functions**

These functions are called “universal” because they operate on `ndarray` objects as well as on basic Python data types.

In [41]:
np.exp(a)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03])

In [42]:
np.sqrt(a)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131])

In [43]:
np.sqrt(2.5)

1.5811388300841898
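
As a point of contrast, here is a minimal sketch comparing `np.sqrt` with the standard library's `math.sqrt`, which accepts only scalars. The array `x` is an illustrative name chosen so as not to overwrite `a`.

```python
import math

import numpy as np

x = np.arange(4, dtype=float)

print(np.sqrt(x))      # element-wise on an ndarray
print(np.sqrt(2.5))    # also accepts a plain Python float

try:
    math.sqrt(x)       # math functions expect a single scalar value
    scalar_only = False
except TypeError as exc:
    scalar_only = True
    print("math.sqrt failed:", exc)
```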

### 2.2 Multiple dimensions

In [44]:
b = np.array([a, a * 2])
b

array([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.],
       [ 0.,  2.,  4.,  6.,  8., 10., 12., 14.]])

In [45]:
b[0]

array([0., 1., 2., 3., 4., 5., 6., 7.])

In [46]:
b[0,2]

2.0

In [47]:
b[:,1]

array([1., 2.])

In [48]:
b.sum()

84.0

In [51]:
# column-wise operation: sum over axis 0 gives one total per column
b.sum(axis = 0)

array([ 0.,  3.,  6.,  9., 12., 15., 18., 21.])

In [52]:
# row-wise operation: sum over axis 1 gives one total per row
b.sum(axis = 1)

array([28., 56.])

**Initializing**

For initialization functions, one can provide the following parameters:
- `shape`: Either an `int`, a sequence of `int` objects, or a reference to another `ndarray`
- `dtype` (*optional*): A `dtype`—these are NumPy-specific data types for `ndarray` objects
- `order` (*optional*): The order in which to store elements in memory: C for C-like (i.e., row-wise) or F for Fortran-like (i.e., column-wise)

|dtype | Description|
| --- | ---- |
| ? | Boolean| 
| i | Signed integer |
| u | Unsigned integer |
| f | Floating point |
| c | Complex floating point |
| m | timedelta |
| M | datetime |
| O | Object |
| U | Unicode |
| V | Raw data (void)|
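
The codes in the table can be inspected with `np.dtype`. The sketch below uses explicitly sized variants (for example `'i4'` for a 4-byte signed integer); the specific sizes are arbitrary choices for illustration. Note that the `kind` attribute reports `'b'` for Boolean, while `'?'` is the type character used when constructing it.

```python
import numpy as np

# Type codes from the table, with explicit sizes or units where applicable
codes = ['?', 'i4', 'u2', 'f8', 'c16', 'm8[D]', 'M8[D]', 'O', 'U5', 'V4']

for code in codes:
    dtype_obj = np.dtype(code)
    print(f"{code:>6} -> {dtype_obj.name:<16} "
          f"kind={dtype_obj.kind!r} itemsize={dtype_obj.itemsize}")
```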

In [53]:
c = np.zeros((2, 3), dtype='i', order='C')
c

array([[0, 0, 0],
       [0, 0, 0]], dtype=int32)

In [54]:
c = np.ones((2, 3, 4), dtype='i', order='C')
c

array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int32)

In [61]:
d = np.zeros_like(c, dtype='f', order='C')
d

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]], dtype=float32)

In [64]:
d = np.ones_like(c, dtype='f', order='C')
d

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]], dtype=float32)

In [57]:
e = np.empty((2, 3, 2))
e

array([[[0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.]]])

In [58]:
f = np.empty_like(c)
f

array([[[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]],

       [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]], dtype=int32)
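
The zeros in the two outputs above are incidental: `np.empty` and `np.empty_like` allocate memory without initializing it, so the contents are arbitrary and should be overwritten before use. A minimal sketch, with illustrative names and an arbitrary fill value:

```python
import numpy as np

buf = np.empty((2, 3))   # allocated but not initialized; contents are arbitrary
buf.fill(1.5)            # assign every element before reading from it
print(buf)

# np.full allocates and fills in a single call
filled = np.full((2, 3), 1.5)
print(np.array_equal(buf, filled))
```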

In [59]:
np.eye(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

g = np.linspace(5, 15, 12)
g
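
`np.linspace(start, stop, num)` returns `num` evenly spaced samples and includes both endpoints by default. The sketch below uses the optional `retstep=True` argument to expose the spacing as well; `points` and `step` are illustrative names.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive samples
points, step = np.linspace(5, 15, 12, retstep=True)

print(points[0], points[-1])   # both endpoints are included by default
print(step)                    # (15 - 5) / (12 - 1)
print(len(points))
```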

**Metainformation**

In [65]:
g.size

12

In [66]:
g.itemsize

8

In [67]:
g.ndim

1

In [68]:
g.shape

(12,)

In [69]:
g.dtype

dtype('float64')

In [70]:
g.nbytes

96

**Reshaping and resizing**

While reshaping in general just provides another view on the same data, resizing in general creates a new (temporary) object. During a reshaping operation, the total number of elements in the ndarray object is unchanged. During a resizing operation, this number changes—it either decreases (“down-sizing”) or increases (“up-sizing”).

In [72]:
g = np.arange(15)
g

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [73]:
g.shape

(15,)

In [74]:
np.shape(g)

(15,)

In [75]:
g.reshape((3,5))

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [76]:
h = g.reshape((5,3))
h

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [77]:
h.T

array([[ 0,  3,  6,  9, 12],
       [ 1,  4,  7, 10, 13],
       [ 2,  5,  8, 11, 14]])

In [78]:
h.transpose()

array([[ 0,  3,  6,  9, 12],
       [ 1,  4,  7, 10, 13],
       [ 2,  5,  8, 11, 14]])

In [79]:
np.resize(g,(3,1))

array([[0],
       [1],
       [2]])

In [80]:
np.resize(g, (1, 5))

array([[0, 1, 2, 3, 4]])

In [81]:
np.resize(g, (2, 5))

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [84]:
n = np.resize(g, (5, 4))
n

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14,  0],
       [ 1,  2,  3,  4]])
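
The view-versus-copy distinction can be checked with `np.shares_memory`. This is a small sketch using separate names (`base`, `view`, `resized`) so that `g` and `h` stay untouched.

```python
import numpy as np

base = np.arange(6)

view = base.reshape((2, 3))        # same buffer, different shape
resized = np.resize(base, (2, 3))  # new array with its own buffer

print(np.shares_memory(base, view))      # True
print(np.shares_memory(base, resized))   # False

view[0, 0] = 99                    # writes through to `base`
print(base)
print(resized)                     # unaffected by the write
```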

**Stacking**

In [85]:
h

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [87]:
# horizontal stacking
np.hstack((h,2*h))

array([[ 0,  1,  2,  0,  2,  4],
       [ 3,  4,  5,  6,  8, 10],
       [ 6,  7,  8, 12, 14, 16],
       [ 9, 10, 11, 18, 20, 22],
       [12, 13, 14, 24, 26, 28]])

In [89]:
# vertical stacking
np.vstack((h, 0.5 * h))

array([[ 0. ,  1. ,  2. ],
       [ 3. ,  4. ,  5. ],
       [ 6. ,  7. ,  8. ],
       [ 9. , 10. , 11. ],
       [12. , 13. , 14. ],
       [ 0. ,  0.5,  1. ],
       [ 1.5,  2. ,  2.5],
       [ 3. ,  3.5,  4. ],
       [ 4.5,  5. ,  5.5],
       [ 6. ,  6.5,  7. ]])

**Flattening**

In [90]:
h

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [91]:
h.flatten()

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [92]:
for i in h.flat:
    print(i, end=',')

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,

### 2.3 Boolean arrays

Comparison and logical operations work element-wise on `ndarray` objects, just as they work on single values of standard Python data types. The result is an array of Boolean values with the same shape.

In [93]:
h

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [94]:
h > 8

array([[False, False, False],
       [False, False, False],
       [False, False, False],
       [ True,  True,  True],
       [ True,  True,  True]])

In [95]:
h <= 7

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True, False],
       [False, False, False],
       [False, False, False]])

In [96]:
h == 5

array([[False, False, False],
       [False, False,  True],
       [False, False, False],
       [False, False, False],
       [False, False, False]])

In [97]:
(h == 5).astype(int)

array([[0, 0, 0],
       [0, 0, 1],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

In [98]:
(h > 4 ) & (h <= 12)

array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True],
       [ True,  True,  True],
       [ True, False, False]])

**Boolean indexing**

In [99]:
h[h > 8]

array([ 9, 10, 11, 12, 13, 14])

In [100]:
h[(h > 4) & (h <= 12)]

array([ 5,  6,  7,  8,  9, 10, 11, 12])

In [101]:
h[(h < 4) | (h >= 12)]

array([ 0,  1,  2,  3, 12, 13, 14])

In [102]:
# use the np.where() function
np.where(h > 7, 1, 0)

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 1],
       [1, 1, 1],
       [1, 1, 1]])

In [103]:
np.where(h % 2 == 0, 'even', 'odd')

array([['even', 'odd', 'even'],
       ['odd', 'even', 'odd'],
       ['even', 'odd', 'even'],
       ['odd', 'even', 'odd'],
       ['even', 'odd', 'even']], dtype='<U4')

In [104]:
np.where(h <= 7, h * 2, h / 2)

array([[ 0. ,  2. ,  4. ],
       [ 6. ,  8. , 10. ],
       [12. , 14. ,  4. ],
       [ 4.5,  5. ,  5.5],
       [ 6. ,  6.5,  7. ]])
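
Boolean arrays can also be aggregated directly, since `True` counts as 1 and `False` as 0. A short sketch, using an illustrative array `grid` with the same values as `h`:

```python
import numpy as np

grid = np.arange(15).reshape((5, 3))
mask = grid > 8

print(mask.sum())          # number of True entries
print(mask.any(axis=1))    # per row: is any element greater than 8?
print(mask.all(axis=1))    # per row: are all elements greater than 8?
print(grid[mask].mean())   # mean of the selected elements
```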

## 3. Structured NumPy arrays

Structured `ndarray` objects allow a different `dtype` per column.

Their construction resembles initializing a table in a database (e.g., SQL): one specifies column names and column data types, optionally with additional information (e.g., the maximum number of characters per `str` object).

In [120]:
dt = np.dtype([('Name', 'S10'), ('Age', 'i'),
               ('Height', 'f'), ('Children/Pets', 'i', 2)])
dt

dtype([('Name', 'S10'), ('Age', '<i4'), ('Height', '<f4'), ('Children/Pets', '<i4', (2,))])

In [121]:
# alternative
dt = np.dtype({'names': ['Name', 'Age', 'Height', 'Children/Pets'],
               'formats': 'O int float int,int'.split()})
dt

dtype([('Name', 'O'), ('Age', '<i8'), ('Height', '<f8'), ('Children/Pets', [('f0', '<i8'), ('f1', '<i8')])])

In [122]:
s = np.array([('Smith', 45, 1.83, (0, 1)),
              ('Jones', 53, 1.72, (2, 2))], dtype=dt)
s

array([('Smith', 45, 1.83, (0, 1)), ('Jones', 53, 1.72, (2, 2))],
      dtype=[('Name', 'O'), ('Age', '<i8'), ('Height', '<f8'), ('Children/Pets', [('f0', '<i8'), ('f1', '<i8')])])

In [123]:
type(s)

numpy.ndarray

**Indexing**

Individual columns can now be accessed by name and rows by index.

In [124]:
s['Name']

array(['Smith', 'Jones'], dtype=object)

In [125]:
s['Height'].mean()

1.775

In [113]:
s[0]

('Smith', 45, 1.83, (0, 1))

In [126]:
s[1]['Age']

53
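
Field access combines naturally with Boolean indexing. The following is a sketch on a simplified, hypothetical record layout (`person_dt`, with an extra made-up row), not the exact `dt` defined above.

```python
import numpy as np

# Hypothetical records; 'U10' stores strings of up to 10 characters
person_dt = np.dtype([('Name', 'U10'), ('Age', 'i4'), ('Height', 'f8')])
people = np.array([('Smith', 45, 1.83),
                   ('Jones', 53, 1.72),
                   ('Brown', 38, 1.65)], dtype=person_dt)

over_40 = people[people['Age'] > 40]   # Boolean mask built from one field
print(over_40['Name'])
print(over_40['Height'].mean())

# Indexing with a list of field names selects several columns at once
print(people[['Name', 'Age']])
```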

## 4. Vectorization of code

Vectorization is a strategy for writing more compact code that may also execute faster. The fundamental idea is to conduct an operation on, or apply a function to, a complex object “at once” rather than by looping over its individual elements. `NumPy` has vectorization built into its core.
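
To make the idea concrete, here is a minimal sketch that computes the same result twice: once with explicit loops and once with a single vectorized expression. The names `r_demo`, `looped`, and `vectorized` are illustrative.

```python
import numpy as np

r_demo = np.arange(12).reshape((4, 3))

# Explicit loops over every element
looped = np.zeros_like(r_demo)
for i in range(r_demo.shape[0]):
    for j in range(r_demo.shape[1]):
        looped[i, j] = 2 * r_demo[i, j] + 3

# The same computation as a single vectorized expression
vectorized = 2 * r_demo + 3

print(np.array_equal(looped, vectorized))
```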

In [127]:
np.random.seed(100)
r = np.arange(12).reshape((4, 3))
s = np.arange(12).reshape((4, 3)) * 0.5

In [128]:
r

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [129]:
s

array([[0. , 0.5, 1. ],
       [1.5, 2. , 2.5],
       [3. , 3.5, 4. ],
       [4.5, 5. , 5.5]])

In [130]:
r + s

array([[ 0. ,  1.5,  3. ],
       [ 4.5,  6. ,  7.5],
       [ 9. , 10.5, 12. ],
       [13.5, 15. , 16.5]])

In [131]:
r + 3

array([[ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [132]:
2 * r

array([[ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16],
       [18, 20, 22]])

In [133]:
2 * r + 3

array([[ 3,  5,  7],
       [ 9, 11, 13],
       [15, 17, 19],
       [21, 23, 25]])

**Different shapes**

In [134]:
r.shape

(4, 3)

In [142]:
s = np.arange(0, 12, 4)
s

array([0, 4, 8])

In [143]:
r + s

array([[ 0,  5, 10],
       [ 3,  8, 13],
       [ 6, 11, 16],
       [ 9, 14, 19]])

In [144]:
s = np.arange(0, 12, 3)

In [145]:
r + s

ValueError: operands could not be broadcast together with shapes (4,3) (4,) 

In [146]:
r.transpose() + s

array([[ 0,  6, 12, 18],
       [ 1,  7, 13, 19],
       [ 2,  8, 14, 20]])
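
The error above follows from NumPy's broadcasting rule: shapes are compared from the last axis backwards, and each pair of lengths must be equal or one of them must be 1. Instead of transposing, a length-1 axis can be added to the one-dimensional operand. A sketch with illustrative names:

```python
import numpy as np

r_demo = np.arange(12).reshape((4, 3))   # shape (4, 3)
s_demo = np.arange(0, 12, 3)             # shape (4,)

# (4, 3) vs (4,) fails because the trailing lengths 3 and 4 differ.
# Adding a trailing axis of length 1 turns s_demo into a column of
# shape (4, 1), which is then stretched across the 3 columns of r_demo.
column = s_demo[:, np.newaxis]
result = r_demo + column

print(column.shape)
print(result)
```

The result equals the transpose of `r.transpose() + s`, but keeps the original `(4, 3)` layout.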

**Applying functions**

In [151]:
def f(x):
    return 3 * x + 5

In [152]:
f(0.5)

6.5

In [153]:
f(r)

array([[ 5,  8, 11],
       [14, 17, 20],
       [23, 26, 29],
       [32, 35, 38]])

---

# Simple Banking examples

## 2. Creating and Manipulating NumPy Arrays

### 2.1 Creating Arrays

NumPy arrays can be created from Python lists or with built-in functions like `arange`, `linspace`, and `zeros`.


In [2]:
# Example: Creating NumPy arrays
balance_array = np.array([1000, 1500, 1200, 1300])
print("Balance Array:", balance_array)

# Creating an array with a range of values
range_array = np.arange(1, 11)
print("Range Array:", range_array)

# Creating an array of zeros
zero_array = np.zeros(5)
print("Zero Array:", zero_array)

Balance Array: [1000 1500 1200 1300]
Range Array: [ 1  2  3  4  5  6  7  8  9 10]
Zero Array: [0. 0. 0. 0. 0.]


### 2.2 Reshaping and Indexing Arrays

You can reshape arrays and access specific elements or slices of arrays.

In [3]:
# Example: Reshaping and indexing arrays
reshaped_array = np.arange(12).reshape(3, 4)
print("Reshaped Array:\n", reshaped_array)

# Indexing elements
print("Element at position [1, 2]:", reshaped_array[1, 2])

# Slicing arrays
print("First row:", reshaped_array[0, :])
print("First column:", reshaped_array[:, 0])

Reshaped Array:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Element at position [1, 2]: 6
First row: [0 1 2 3]
First column: [0 4 8]


## 3. Basic Mathematical Operations with NumPy

NumPy makes it easy to perform element-wise operations, as well as more complex mathematical operations.

### 3.1 Element-wise Operations

You can perform operations like addition, subtraction, multiplication, and division on arrays.

In [4]:
# Example: Element-wise operations
balance = np.array([1000, 1500, 1200, 1300])
interest_rate = 0.05
interest = balance * interest_rate
print("Interest earned on each balance:", interest)

Interest earned on each balance: [50. 75. 60. 65.]


### 3.2 Aggregation Functions

NumPy provides functions like `sum`, `mean`, `std`, and `max` to perform aggregation on arrays.

In [5]:
# Example: Aggregation functions
print("Total balance:", np.sum(balance))
print("Average balance:", np.mean(balance))
print("Maximum balance:", np.max(balance))
print("Standard deviation of balance:", np.std(balance))

Total balance: 5000
Average balance: 1250.0
Maximum balance: 1500
Standard deviation of balance: 180.27756377319946


## 4. Statistical and Financial Calculations Using NumPy

### 4.1 Statistical Analysis

You can use NumPy to perform statistical analysis on financial data, such as calculating the mean, median, and standard deviation of returns.

In [6]:
# Example: Calculating statistical measures
returns = np.array([0.02, 0.03, -0.01, 0.05, 0.04])
print("Mean return:", np.mean(returns))
print("Median return:", np.median(returns))
print("Standard deviation of returns:", np.std(returns))

Mean return: 0.026000000000000002
Median return: 0.03
Standard deviation of returns: 0.020591260281974003
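
Note that `np.std` computes the population standard deviation by default (`ddof=0`). When the returns are a sample from a longer history, the sample estimate with `ddof=1` is often preferred; which one is appropriate depends on the analysis. A short sketch:

```python
import numpy as np

returns = np.array([0.02, 0.03, -0.01, 0.05, 0.04])

population_std = np.std(returns)        # default ddof=0: divide by N
sample_std = np.std(returns, ddof=1)    # ddof=1: divide by N - 1

print("Population std:", population_std)
print("Sample std:", sample_std)
```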


### 4.2 Financial Calculations

NumPy can be used to perform financial calculations like compound interest, present value, and future value.

In [7]:
# Example: Compound interest calculation
principal = 1000
rate = 0.05
time = np.array([1, 2, 3, 4, 5])
future_value = principal * np.power(1 + rate, time)
print("Future value over 5 years:", future_value)

Future value over 5 years: [1050.        1102.5       1157.625     1215.50625   1276.2815625]
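
Present value works the same way in reverse: each future amount is divided by the growth factor instead of multiplied. The sketch below assumes a hypothetical payment of 1000 at the end of each year and reuses the 5% rate from the example above.

```python
import numpy as np

rate = 0.05
time = np.array([1, 2, 3, 4, 5])

# Hypothetical payment of 1000 received at the end of each year
payments = np.full(time.shape, 1000.0)

discount_factors = 1 / np.power(1 + rate, time)
present_values = payments * discount_factors

print("Discount factors:", discount_factors)
print("Present value of each payment:", present_values)
print("Total present value:", present_values.sum())
```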
