<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Introduction" data-toc-modified-id="Introduction-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Introduction</a></span></li><li><span><a href="#What-is-numpy?" data-toc-modified-id="What-is-numpy?-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>What is <code>numpy</code>?</a></span></li><li><span><a href="#Regular-vs.-record" data-toc-modified-id="Regular-vs.-record-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Regular vs. record</a></span></li><li><span><a href="#Imports" data-toc-modified-id="Imports-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Imports</a></span></li><li><span><a href="#Python-Array-Class-not-np.array" data-toc-modified-id="Python-Array-Class-not-np.array-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Python Array Class not <code>np.array</code></a></span></li><li><span><a href="#Regular-Numpy-Arrays" data-toc-modified-id="Regular-Numpy-Arrays-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Regular Numpy Arrays</a></span><ul class="toc-item"><li><span><a href="#The-Basics" data-toc-modified-id="The-Basics-6.1"><span class="toc-item-num">6.1&nbsp;&nbsp;</span>The Basics</a></span></li><li><span><a href="#Multiple-Dimensions" data-toc-modified-id="Multiple-Dimensions-6.2"><span class="toc-item-num">6.2&nbsp;&nbsp;</span>Multiple Dimensions</a></span></li><li><span><a href="#Meta-Information" data-toc-modified-id="Meta-Information-6.3"><span class="toc-item-num">6.3&nbsp;&nbsp;</span>Meta-Information</a></span></li><li><span><a href="#Reshaping,-Resizing,-Stacking,-Flattening" data-toc-modified-id="Reshaping,-Resizing,-Stacking,-Flattening-6.4"><span class="toc-item-num">6.4&nbsp;&nbsp;</span>Reshaping, Resizing, Stacking, Flattening</a></span></li><li><span><a href="#Boolean-Arrays" data-toc-modified-id="Boolean-Arrays-6.5"><span class="toc-item-num">6.5&nbsp;&nbsp;</span>Boolean Arrays</a></span></li><li><span><a href="#Speed-Comparison" 
data-toc-modified-id="Speed-Comparison-6.6"><span class="toc-item-num">6.6&nbsp;&nbsp;</span>Speed Comparison</a></span></li></ul></li><li><span><a href="#References" data-toc-modified-id="References-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>References</a></span></li><li><span><a href="#Clean-up" data-toc-modified-id="Clean-up-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Clean-up</a></span></li><li><span><a href="#Requirements" data-toc-modified-id="Requirements-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>Requirements</a></span></li></ul></div>

# Introduction
<hr style = "border:2px solid black" ></hr>

<div class="alert alert-warning">
<font color=black>

**What?** Regular NumPy Arrays

</font>
</div>

# What is `numpy`?
<hr style = "border:2px solid black" ></hr>

<div class="alert alert-info">
<font color=black>

- `NumPy` provides a multidimensional array object that:
    - stores homogeneous data (regular arrays)
    - stores heterogeneous data (structured, or record, arrays)
    - supports vectorized code

</font>
</div>
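
As a minimal sketch of what vectorization means in the list above, the following compares an explicit Python loop with a single array expression. The names `values`, `scaled_loop` and `scaled_vec` are illustrative only and are not used elsewhere in this notebook.

```python
import numpy as np

values = [0.5, 1.0, 1.5, 2.0]  # illustrative data

# Pure Python: an explicit loop over the elements
scaled_loop = [2 * x + 1 for x in values]

# NumPy: one expression applied to the whole array at once
arr = np.array(values)
scaled_vec = 2 * arr + 1

print(scaled_loop)
print(scaled_vec)
```

Both variants compute the same numbers; the vectorized form moves the loop out of Python and into NumPy's compiled code.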

# Regular vs. record
<hr style = "border:2px solid black" ></hr>

<div class="alert alert-info">
<font color=black>
    
- “Regular NumPy Arrays”: the regular NumPy `ndarray` class, the workhorse in almost all data-intensive Python use cases involving numerical data.
- “Structured NumPy Arrays”: structured (or record) `ndarray` objects for handling tabular data with columns of different types.


</font>
</div>
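
To make the distinction concrete, here is a small sketch of a structured array. The field names (`name`, `age`, `height`) and the sample rows are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical field layout: one dtype per column
dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('height', 'f8')])

people = np.array([('Smith', 45, 1.83), ('Jones', 53, 1.72)], dtype=dt)

ages = people['age']   # column access by field name
first = people[0]      # row access by position

print(ages)
print(first['name'], first['height'])
```

A regular array, by contrast, has a single `dtype` shared by every element.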

![image.png](attachment:image.png)

# Imports
<hr style = "border:2px solid black" ></hr>

In [1]:
import array
import numpy as np  
import random
import math
import sys

# Python Array Class not `np.array`
<hr style = "border:2px solid black" ></hr>

<div class="alert alert-info">
<font color=black>

- The `array` module from the Python standard library defines an object type that compactly represents an array of basic values: characters, integers, floating point numbers. Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time by a single-character type code.
- Composing array structures with `list` objects works, to a degree. It is not really convenient, though, because the `list` class was not built with this specific goal in mind; its scope is much broader and more general. The `array` class is a bit more specialized and provides some useful features for working with arrays of data. However, a truly specialized class would be more beneficial for handling array-type structures.
    
</font>
</div>
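
The type constraint described above can be sketched as follows. The variable names are illustrative; the snippet only assumes the standard-library `array` module.

```python
import array

nums = array.array('f', [0.5, 0.75, 1.0])  # 'f' is the type code for C floats

print(nums.typecode, nums.itemsize)

error = None
try:
    nums.append('string')  # a str does not fit an 'f' array
except TypeError as exc:
    error = exc
    print('rejected:', exc)

print(len(nums))  # the array is unchanged after the failed append
```

A plain `list` would accept the string without complaint, which is exactly the difference the type code enforces.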

In [2]:
v = [0.5, 0.75, 1.0, 1.5, 2.0]

In [3]:
a = array.array('f', v)  
a

array('f', [0.5, 0.75, 1.0, 1.5, 2.0])

In [4]:
a.append(0.5)  
a

array('f', [0.5, 0.75, 1.0, 1.5, 2.0, 0.5])

In [5]:
a.extend([5.0, 6.75])  
a

array('f', [0.5, 0.75, 1.0, 1.5, 2.0, 0.5, 5.0, 6.75])

In [6]:
2 * a  

array('f', [0.5, 0.75, 1.0, 1.5, 2.0, 0.5, 5.0, 6.75, 0.5, 0.75, 1.0, 1.5, 2.0, 0.5, 5.0, 6.75])

In [7]:
# Uncommenting the next line raises a TypeError: an 'f' array accepts only numbers
# a.append('string')  

In [8]:
a.tolist()  

[0.5, 0.75, 1.0, 1.5, 2.0, 0.5, 5.0, 6.75]

In [9]:
f = open('array.apy', 'wb')  
a.tofile(f)  
f.close()  

In [10]:
# Same write as in the previous cell; the with statement closes the file automatically
with open('array.apy', 'wb') as f:  
    a.tofile(f)  

In [11]:
!ls -n arr*  

-rw-r--r--  1 501  20  32 26 Sep 18:05 array.apy


In [12]:
b = array.array('f')  

In [13]:
with open('array.apy', 'rb') as f:  
    b.fromfile(f, 5)  

In [14]:
b  

array('f', [0.5, 0.75, 1.0, 1.5, 2.0])

In [15]:
b = array.array('d')  # deliberately the wrong type code: the file holds 'f' values

In [16]:
with open('array.apy', 'rb') as f:
    b.fromfile(f, 2)  

In [17]:
b  

array('d', [0.0004882813645963324, 0.12500002956949174])
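
The odd numbers above come from reading the same bytes with a different type code. The sketch below reproduces the effect in memory, without a file, and assumes the usual sizes of 4 bytes for `'f'` and 8 bytes for `'d'`; the variable names are illustrative.

```python
import array

floats = array.array('f', [0.5, 0.75, 1.0, 1.5])  # four single-precision values
raw = floats.tobytes()                             # their raw bytes

doubles = array.array('d')
doubles.frombytes(raw)   # the same bytes, reinterpreted as double-precision values

print(len(raw), floats.itemsize, doubles.itemsize)
print(doubles)

# Reading the bytes back with the matching type code recovers the data
restored = array.array('f')
restored.frombytes(raw)
print(restored)
```

Each double consumes the bytes of two adjacent floats, so the values are meaningless even though no error is raised.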

# Regular Numpy Arrays
<hr style = "border:2px solid black" ></hr>

## The Basics

In [18]:
a = np.array([0, 0.5, 1.0, 1.5, 2.0])  
a

array([0. , 0.5, 1. , 1.5, 2. ])

In [19]:
type(a)  

numpy.ndarray

In [20]:
a = np.array(['a', 'b', 'c'])  
a

array(['a', 'b', 'c'], dtype='<U1')

In [21]:
a = np.arange(2, 20, 2)  
a

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])

In [22]:
a = np.arange(8, dtype=float)  
a

array([0., 1., 2., 3., 4., 5., 6., 7.])

In [23]:
a[5:]  

array([5., 6., 7.])

In [24]:
a[:2]  

array([0., 1.])

In [25]:
a.sum()  

28.0

In [26]:
a.std()  

2.29128784747792

In [27]:
a.cumsum()  

array([ 0.,  1.,  3.,  6., 10., 15., 21., 28.])

In [28]:
l = [0., 0.5, 1.5, 3., 5.]
2 * l  

[0.0, 0.5, 1.5, 3.0, 5.0, 0.0, 0.5, 1.5, 3.0, 5.0]

In [29]:
a

array([0., 1., 2., 3., 4., 5., 6., 7.])

In [30]:
2 * a  

array([ 0.,  2.,  4.,  6.,  8., 10., 12., 14.])

In [31]:
a ** 2  

array([ 0.,  1.,  4.,  9., 16., 25., 36., 49.])

In [32]:
2 ** a  

array([  1.,   2.,   4.,   8.,  16.,  32.,  64., 128.])

In [33]:
a ** a  

array([1.00000e+00, 1.00000e+00, 4.00000e+00, 2.70000e+01, 2.56000e+02,
       3.12500e+03, 4.66560e+04, 8.23543e+05])

In [34]:
np.exp(a)  

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03])

In [35]:
np.sqrt(a)  

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131])

In [36]:
np.sqrt(2.5)  

1.5811388300841898

In [38]:
math.sqrt(2.5)  

1.5811388300841898

In [39]:
# Uncommenting the next line raises a TypeError: math.sqrt() expects a scalar, not an array
# math.sqrt(a)  

In [40]:
%timeit np.sqrt(2.5)  

860 ns ± 3.09 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [41]:
%timeit math.sqrt(2.5)  

115 ns ± 1.13 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
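
The two timings above show `math.sqrt` beating `np.sqrt` on a single scalar. The picture typically changes once many values are processed at once. The following sketch, with an arbitrary array size of 10,000 and the standard-library `timeit` module in place of the `%timeit` magic, is one way to check this on your own machine; actual numbers will vary.

```python
import math
import timeit

import numpy as np

values = np.arange(10_000, dtype=float)  # arbitrary illustrative size

# Element-wise loop with math.sqrt on plain Python floats
loop_result = [math.sqrt(x) for x in values.tolist()]

# One vectorized call on the whole array
vec_result = np.sqrt(values)

t_loop = timeit.timeit(lambda: [math.sqrt(x) for x in values.tolist()], number=20)
t_vec = timeit.timeit(lambda: np.sqrt(values), number=20)
print(f'loop: {t_loop:.4f} s, vectorized: {t_vec:.4f} s')
```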


## Multiple Dimensions

In [42]:
b = np.array([a, a * 2])  
b

array([[ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.],
       [ 0.,  2.,  4.,  6.,  8., 10., 12., 14.]])

In [43]:
b[0]  

array([0., 1., 2., 3., 4., 5., 6., 7.])

In [44]:
b[0, 2]  

2.0

In [45]:
b[:, 1]  

array([1., 2.])

In [46]:
b.sum()  

84.0

In [47]:
b.sum(axis=0)  

array([ 0.,  3.,  6.,  9., 12., 15., 18., 21.])

In [48]:
b.sum(axis=1)  

array([28., 56.])

In [49]:
c = np.zeros((2, 3), dtype='i', order='C')  
c

array([[0, 0, 0],
       [0, 0, 0]], dtype=int32)

In [50]:
c = np.ones((2, 3, 4), dtype='i', order='C')  
c

array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int32)

In [51]:
d = np.zeros_like(c, dtype=np.float16, order='C')  
d

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]], dtype=float16)

In [52]:
d = np.ones_like(c, dtype=np.float16, order='C')  
d

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]], dtype=float16)

In [53]:
e = np.empty((2, 3, 2))  
e

array([[[0.00000000e+000, 1.00937611e-320],
        [0.00000000e+000, 0.00000000e+000],
        [0.00000000e+000, 1.33664410e+160]],

       [[5.27870891e-091, 1.95103635e+160],
        [5.54403864e+169, 8.23876842e-067],
        [3.99910963e+252, 4.06198887e-317]]])

In [54]:
f = np.empty_like(c)  
f

array([[[         0,          0,       2043,          0],
        [         0,          0,          0,          0],
        [         0,          0, 1852990827, 1630432357]],

       [[1630757170,  758199397, 1650669669, 1630942253],
        [ 809053537, 1663918438, 1633956451,  842413616],
        [ 778330416, 1952543859,    8221557,          0]]], dtype=int32)

In [55]:
np.eye(5)  

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [56]:
g = np.linspace(5, 15, 12) 
g

array([ 5.        ,  5.90909091,  6.81818182,  7.72727273,  8.63636364,
        9.54545455, 10.45454545, 11.36363636, 12.27272727, 13.18181818,
       14.09090909, 15.        ])

## Meta-Information

In [57]:
g.size  

12

In [58]:
g.itemsize  

8

In [59]:
g.ndim  

1

In [60]:
g.shape  

(12,)

In [61]:
g.dtype  

dtype('float64')

In [62]:
g.nbytes  

96

## Reshaping, Resizing, Stacking, Flattening

In [63]:
g = np.arange(15)

In [64]:
g

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [65]:
g.shape  

(15,)

In [66]:
np.shape(g) 

(15,)

In [67]:
g.reshape((3, 5))  

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [68]:
h = g.reshape((5, 3))  
h

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [69]:
h.T  

array([[ 0,  3,  6,  9, 12],
       [ 1,  4,  7, 10, 13],
       [ 2,  5,  8, 11, 14]])

In [70]:
h.transpose()  

array([[ 0,  3,  6,  9, 12],
       [ 1,  4,  7, 10, 13],
       [ 2,  5,  8, 11, 14]])

In [71]:
g

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [72]:
np.resize(g, (3, 1))  

array([[0],
       [1],
       [2]])

In [73]:
np.resize(g, (1, 5))  

array([[0, 1, 2, 3, 4]])

In [74]:
np.resize(g, (2, 5))  

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [75]:
n = np.resize(g, (5, 4))  
n

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14,  0],
       [ 1,  2,  3,  4]])
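
The `np.resize` calls above truncate or repeat the data as needed, whereas `reshape` insists that the number of elements stays the same. A minimal sketch of the difference, using a separate illustrative array `base` so that `g` is left untouched:

```python
import numpy as np

base = np.arange(15)

reshape_error = None
try:
    base.reshape((5, 4))        # 20 elements requested, only 15 available
except ValueError as exc:
    reshape_error = exc
    print('reshape failed:', exc)

bigger = np.resize(base, (5, 4))  # new array, data repeated to fill 20 slots

view = base.reshape((3, 5))       # same element count: a view on base here
view[0, 0] = 99                   # the write is visible through base as well

print(base[0], bigger[0, 0])
```

The resized array is independent of `base`, while the reshaped one shares its memory in this case.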

In [76]:
h

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [77]:
np.hstack((h, 2 * h))  

array([[ 0,  1,  2,  0,  2,  4],
       [ 3,  4,  5,  6,  8, 10],
       [ 6,  7,  8, 12, 14, 16],
       [ 9, 10, 11, 18, 20, 22],
       [12, 13, 14, 24, 26, 28]])

In [78]:
np.vstack((h, 0.5 * h))  

array([[ 0. ,  1. ,  2. ],
       [ 3. ,  4. ,  5. ],
       [ 6. ,  7. ,  8. ],
       [ 9. , 10. , 11. ],
       [12. , 13. , 14. ],
       [ 0. ,  0.5,  1. ],
       [ 1.5,  2. ,  2.5],
       [ 3. ,  3.5,  4. ],
       [ 4.5,  5. ,  5.5],
       [ 6. ,  6.5,  7. ]])

In [79]:
h

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [80]:
h.flatten()  

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [81]:
h.flatten(order='C')  

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [82]:
h.flatten(order='F')  

array([ 0,  3,  6,  9, 12,  1,  4,  7, 10, 13,  2,  5,  8, 11, 14])

In [83]:
for i in h.flat:  
    print(i, end=',')

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,

In [84]:
for i in h.ravel(order='C'):  
    print(i, end=',')

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,

In [85]:
for i in h.ravel(order='F'):  
    print(i, end=',')

0,3,6,9,12,1,4,7,10,13,2,5,8,11,14,
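
`flatten` and `ravel` produce the same values above, but they differ in whether the data is copied. The sketch below checks this with `np.shares_memory` on an illustrative array `grid` that has the same contents as `h`.

```python
import numpy as np

grid = np.arange(15).reshape((5, 3))

flat_copy = grid.flatten()  # always a new array
flat_view = grid.ravel()    # a view when the memory layout allows it

print(np.shares_memory(grid, flat_copy))  # False
print(np.shares_memory(grid, flat_view))  # True

flat_view[0] = -1           # modifies grid as well
print(grid[0, 0], flat_copy[0])
```

Use `flatten` when the result must be independent of the source and `ravel` when avoiding a copy matters.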

## Boolean Arrays

In [86]:
h

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [87]:
h > 8  

array([[False, False, False],
       [False, False, False],
       [False, False, False],
       [ True,  True,  True],
       [ True,  True,  True]])

In [88]:
h <= 7  

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True, False],
       [False, False, False],
       [False, False, False]])

In [89]:
h == 5  

array([[False, False, False],
       [False, False,  True],
       [False, False, False],
       [False, False, False],
       [False, False, False]])

In [90]:
(h == 5).astype(int)  

array([[0, 0, 0],
       [0, 0, 1],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

In [91]:
(h > 4) & (h <= 12)  

array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True],
       [ True,  True,  True],
       [ True, False, False]])

In [92]:
h[h > 8]  

array([ 9, 10, 11, 12, 13, 14])

In [93]:
h[(h > 4) & (h <= 12)]  

array([ 5,  6,  7,  8,  9, 10, 11, 12])

In [94]:
h[(h < 4) | (h >= 12)]  

array([ 0,  1,  2,  3, 12, 13, 14])

In [95]:
np.where(h > 7, 1, 0)  

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 1],
       [1, 1, 1],
       [1, 1, 1]])

In [96]:
np.where(h % 2 == 0, 'even', 'odd')  

array([['even', 'odd', 'even'],
       ['odd', 'even', 'odd'],
       ['even', 'odd', 'even'],
       ['odd', 'even', 'odd'],
       ['even', 'odd', 'even']], dtype='<U4')

In [97]:
np.where(h <= 7, h * 2, h / 2)  

array([[ 0. ,  2. ,  4. ],
       [ 6. ,  8. , 10. ],
       [12. , 14. ,  4. ],
       [ 4.5,  5. ,  5.5],
       [ 6. ,  6.5,  7. ]])
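
The conditions above are combined with `&` and `|` rather than Python's `and` and `or`. The following sketch, on an illustrative array `vals` with the same contents as `h`, shows why: the keywords need a single truth value, which a multi-element array cannot provide.

```python
import numpy as np

vals = np.arange(15).reshape((5, 3))

mask = (vals > 4) & (vals <= 12)  # element-wise AND, one Boolean per element

raised = False
try:
    (vals > 4) and (vals <= 12)   # `and` asks for the truth value of a whole array
except ValueError as exc:
    raised = True
    print('not allowed:', exc)

print(mask.sum())                 # number of elements satisfying both conditions
```

The parentheses around each comparison are needed because `&` binds more tightly than `>` and `<=`.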

## Speed Comparison

In [98]:
I = 5000

In [99]:
%time mat = [[random.gauss(0, 1) for j in range(I)] \
             for i in range(I)]  

CPU times: user 19.1 s, sys: 333 ms, total: 19.4 s
Wall time: 19.4 s


In [100]:
mat[0][:5]  

[0.4937391210474275,
 -1.2763473243378034,
 1.4703660628036392,
 -1.3438688707561324,
 -0.7861543402977659]

In [101]:
%time sum([sum(l) for l in mat])  

CPU times: user 137 ms, sys: 2.08 ms, total: 139 ms
Wall time: 138 ms


-453.7365327496022

In [102]:
sum([sys.getsizeof(l) for l in mat])  

209400000

In [103]:
%time mat = np.random.standard_normal((I, I))  

CPU times: user 1.22 s, sys: 149 ms, total: 1.37 s
Wall time: 1.37 s


In [104]:
%time mat.sum()  

CPU times: user 35.1 ms, sys: 1.43 ms, total: 36.6 ms
Wall time: 34.8 ms


-962.0224046296323

In [105]:
mat.nbytes  

200000000

In [106]:
sys.getsizeof(mat)  

200000120
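
Note that `sys.getsizeof` applied to a list reports only the list container, that is, its references, not the `float` objects it points to. The sketch below separates the two on a small, illustrative 100 x 100 example; exact byte counts depend on the Python build, so only the array size is a fixed number here.

```python
import sys

import numpy as np

n = 100  # small illustrative size
rows = [[float(i * n + j) for j in range(n)] for i in range(n)]

# Size of the row containers (references only)
list_bytes = sum(sys.getsizeof(row) for row in rows)
# Size of the float objects the rows refer to
float_bytes = sum(sys.getsizeof(x) for row in rows for x in row)

arr = np.array(rows)

print(list_bytes, float_bytes, arr.nbytes)
```

The `ndarray` stores the raw 8-byte values contiguously, so the list-based figure above understates the true memory footprint of the nested lists.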

# References
<hr style = "border:2px solid black" ></hr>

<div class="alert alert-warning">
<font color=black>

- https://github.com/yhilpisch/py4fi2nd/blob/master/code/ch04/04_numpy.ipynb
- Hilpisch, Yves. Python for finance: mastering data-driven finance. O'Reilly Media, 2018.

</font>
</div>

# Clean-up
<hr style = "border:2px solid black" ></hr>

In [108]:
!rm array.apy

# Requirements
<hr style = "border:2px solid black" ></hr>

In [107]:
%load_ext watermark
%watermark -v -iv

Python implementation: CPython
Python version       : 3.10.4
IPython version      : 8.3.0

autopep8: 1.7.0
numpy   : 1.21.6
sys     : 3.10.4 (main, Mar 31 2022, 03:38:35) [Clang 12.0.0 ]
json    : 2.0.9

