
**Computation with NumPy**

Python's built-in array-based objects have limited features, which makes them less performance-oriented than the more specialized NumPy arrays. This module illustrates some powerful features of NumPy that make its ndarray class highly useful in quantitative finance.

**1. Basic Operation**


In [1]:
import numpy as np
a = np.array([1, 2, 3, 4, 5, 6])
a
new_a = np.arange(2, 20, 2)
new_a

a[:2]

# The sum of all elements in the array.
a.sum()

# The standard deviation of the elements.
a.std()

# The cumulative sum of all elements (starting at index 0)
a.cumsum()

array([ 1,  3,  6, 10, 15, 21])

**2. NumPy Vectorized Operation**

Most importantly, **ndarray** objects support vectorized mathematical operations, and NumPy's universal functions operate on whole arrays with high performance.

In [3]:
np.exp(a)
np.sqrt(a)
np.sqrt(2.5)
# math.sqrt(2.5) returns the same value as np.sqrt(2.5), but it cannot be applied to an ndarray object directly.

import math
# math.sqrt(a)  # raises TypeError because a holds more than one element

# The %timeit magic command times the same operation in both packages to compare performance.
%timeit np.sqrt(8)
%timeit math.sqrt(8)

The slowest run took 26.67 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.19 µs per loop
The slowest run took 26.49 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 77.2 ns per loop
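
The scalar timings above favor `math.sqrt`; the benefit of a universal function appears when it is applied to a whole array. The following is a minimal sketch, assuming a hypothetical array `x` of one million values, that only checks that both approaches agree:

```python
import math
import numpy as np

# Hypothetical test data: one million evenly spaced values.
x = np.linspace(1.0, 100.0, 1_000_000)

# Vectorized: a single call processes the whole array.
vec = np.sqrt(x)

# Pure Python: math.sqrt accepts only scalars, so an explicit loop is needed.
loop = np.array([math.sqrt(v) for v in x])

# Both approaches give the same values (up to floating-point rounding).
same = np.allclose(vec, loop)
print(same)
```

In a notebook, `%timeit np.sqrt(x)` and `%timeit [math.sqrt(v) for v in x]` can be used to compare the two; the exact timings depend on the machine.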


**3. Multi-Dimensional Operations**

In [4]:
v = np.array([a, a*2])

# Indexing the second row
v[1]

# Indexing the second column.
v[:, 1]

# Calculate the sum of all values
v.sum()

# Calculate the sum along the first axis
v.sum(axis=0)

# Calculate the sum along the second axis.
v.sum(axis=1)

array([21, 42])
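
The `axis` argument names the dimension that is collapsed by the sum. A short sketch using the same `a` and `v` as above:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
v = np.array([a, a * 2])      # shape (2, 6)

col_sums = v.sum(axis=0)      # collapses the rows: one sum per column
row_sums = v.sum(axis=1)      # collapses the columns: one sum per row

print(col_sums.shape, row_sums.shape)
print(col_sums)
print(row_sums)
```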

To use ndarrays, we usually set up the arrays first, then populate them with data points later.

In [5]:
# Populating ndarrays.

# Create an ndarray prepopulated with zeros
b = np.zeros((2, 3), dtype='i', order='C')
# 'f16' requests a 16-byte float, which is not available on every platform
c = np.zeros_like(b, dtype='f16', order='C')
c

# Create an ndarray object without initializing its entries (the values depend on the bits present in memory)
d = np.empty((2,3,2))
e = np.empty_like(c)
e

# Create a square matrix as an ndarray object with the diagonal populated by ones
f = np.eye(5)
f

# Create a one-dimensional ndarray object with evenly spaced numbers;
# parameters: start, end, and num (number of elements)
g = np.linspace(8, 16, 20)
g

array([ 8.        ,  8.42105263,  8.84210526,  9.26315789,  9.68421053,
       10.10526316, 10.52631579, 10.94736842, 11.36842105, 11.78947368,
       12.21052632, 12.63157895, 13.05263158, 13.47368421, 13.89473684,
       14.31578947, 14.73684211, 15.15789474, 15.57894737, 16.        ])
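
As a sketch of the set-up-then-populate pattern, the example below preallocates an array with `np.zeros` and fills it in a loop. The `prices` values are hypothetical and chosen only for illustration:

```python
import numpy as np

# Hypothetical example data: five daily closing prices.
prices = [100.0, 101.5, 99.8, 102.3, 103.1]

# Set up the container first ...
returns = np.zeros(len(prices) - 1)

# ... then populate it with simple returns, one element at a time.
for i in range(1, len(prices)):
    returns[i - 1] = prices[i] / prices[i - 1] - 1

print(returns)
```

For a calculation this simple, a vectorized expression such as `np.diff(prices) / np.array(prices[:-1])` gives the same values without a loop; preallocation is mostly useful when each value depends on earlier results.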

In [6]:
# Useful attributes

# The number of elements
g.size 

# The number of bytes used to represent one element
g.itemsize

# The number of dimensions
g.ndim

# The shape of the ndarray object.
g.shape

(20,)
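
These attributes are related: the memory used by the array's data equals the number of elements times the bytes per element. A brief check, using the `nbytes` attribute in addition to those shown above:

```python
import numpy as np

g = np.linspace(8, 16, 20)

# Total size of the data buffer: elements times bytes per element.
total_bytes = g.size * g.itemsize

print(g.dtype, total_bytes, g.nbytes)
```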

**4. Reshaping and Resizing**

Although the size of an ndarray object is fixed once it is created, there are means to reshape and resize it. Reshaping usually provides another view on the same data, while resizing generally creates a new (temporary) object to work with.

In [7]:
h = np.arange(16)
h.shape

np.shape(h)

h.reshape((2, 8))
k = h.reshape((8,2))
k

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15]])
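
The difference between a view and a new object can be checked with `np.shares_memory`. A minimal sketch (the names `k` and `m` are used only for illustration):

```python
import numpy as np

h = np.arange(16)
k = h.reshape((8, 2))

# reshape returns a view here: k uses the same memory as h.
print(np.shares_memory(h, k))

# Writing through the view therefore changes the original array as well.
k[0, 0] = 99
print(h[0])

# np.resize returns a new array that does not share memory with h.
m = np.resize(h, (4, 4))
print(np.shares_memory(h, m))
```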

Note that reshaping leaves the total number of elements in the ndarray object unchanged. A resizing operation, by contrast, changes the number of elements: it decreases with down-sizing and increases with up-sizing.

In [8]:
# two dimensions, down-sizing
np.resize(k, (1,5))

# two dimensions, up-sizing.
np.resize(k, (5,4))

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [ 0,  1,  2,  3]])

**Stacking** is another special operation that combines two ndarray objects horizontally or vertically. Note that the size of the connecting dimension must be the same for both ndarrays.

In [9]:
h
# horizontal stacking
np.hstack((h, 2*h))

# vertical stacking
np.vstack((h, 0.5*h))

array([[ 0. ,  1. ,  2. ,  3. ,  4. ,  5. ,  6. ,  7. ,  8. ,  9. , 10. ,
        11. , 12. , 13. , 14. , 15. ],
       [ 0. ,  0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5,  5. ,
         5.5,  6. ,  6.5,  7. ,  7.5]])
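
To make the shape requirement concrete, the sketch below stacks one-dimensional and two-dimensional arrays and then tries a combination whose shapes do not match. The array `k` is introduced here for illustration:

```python
import numpy as np

h = np.arange(16)              # shape (16,)
k = h.reshape((8, 2))          # shape (8, 2)

print(np.hstack((h, 2 * h)).shape)    # 1-D arrays are joined end to end
print(np.vstack((h, 0.5 * h)).shape)  # each 1-D array becomes one row
print(np.hstack((k, k)).shape)        # 2-D: columns are appended
print(np.vstack((k, k)).shape)        # 2-D: rows are appended

# Arrays whose connecting dimension differs cannot be stacked.
try:
    np.vstack((h, k))
    stack_failed = False
except ValueError as err:
    stack_failed = True
    print("ValueError:", err)
```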

**Flattening** converts a multidimensional ndarray object into a one-dimensional one, either row by row (C order) or column by column (F order).

In [10]:
h

# Default flatten order is C (h is one-dimensional, so both orders give the same result here)
h.flatten()
h.flatten(order='F')

# The flat attribute gives a flat iterator
for i in h.flat:
  print(i, end=',')

# Alternative flatten method, ravel()
for i in h.ravel(order='F'):
  print(i, end=',')

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,
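
Because `h` is one-dimensional, both orders print the same sequence above. The difference becomes visible with a two-dimensional array; a short sketch, assuming the reshaped array `k` from the previous section:

```python
import numpy as np

k = np.arange(16).reshape((8, 2))

c_order = k.flatten()            # row by row
f_order = k.flatten(order='F')   # column by column

print(c_order)
print(f_order)

# flatten always returns a copy, while ravel returns a view when it can.
print(np.shares_memory(k, k.flatten()), np.shares_memory(k, k.ravel()))
```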

**5. Boolean Arrays**

Evaluating a condition on an ndarray object yields, by default, an ndarray object of dtype bool.

In [11]:
h
# Is the value greater than ...?
h > 8
# Is the value smaller than or equal to ...?
h <= 6

# Present True and False as integer values
(h==5).astype(int)

# Filtered arrays containing only the values that satisfy the conditions
h[(h>6)&(h<=16)]
h[(h<4)|(h>=16)]

array([0, 1, 2, 3])
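
A boolean array can also be stored and reused as a mask. The sketch below (the name `mask` is illustrative) counts the matching elements and uses the mask for selection. The parentheses around each comparison are needed because `&` binds more tightly than the comparison operators:

```python
import numpy as np

h = np.arange(16)
mask = (h > 6) & (h <= 16)

print(mask.dtype)               # bool
print(mask.sum())               # True counts as 1, so this is the number of matches
print(mask.any(), mask.all())   # at least one match / every element matches
print(h[mask])
```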

The **np.where** function defines actions that depend on whether a condition is met for each element of an ndarray object. Note that applying it creates a new ndarray object of the same shape as the original.

In [12]:
# Set each element of the new array to 'even' or 'odd', depending on
# whether the original value is even or odd.
np.where(h % 2 ==0, 'even', 'odd')

# Set each element to double the value if the condition is true, half the value if false.
np.where(h <= 8, h*2, h/2)

array([ 0. ,  2. ,  4. ,  6. ,  8. , 10. , 12. , 14. , 16. ,  4.5,  5. ,
        5.5,  6. ,  6.5,  7. ,  7.5])
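
The same calls work on multidimensional arrays, and the result keeps the shape of the input. A small sketch on a hypothetical 3x3 array `r`:

```python
import numpy as np

r = np.arange(9).reshape((3, 3))

labels = np.where(r % 2 == 0, 'even', 'odd')
scaled = np.where(r <= 4, r * 2, r / 2)

print(labels.shape, scaled.shape)
print(scaled.dtype)   # float, because r / 2 produces floats
print(scaled)
```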

**6. Vectorization**

Vectorization is a strategy for writing more compact code that executes faster. As a basic example, two ndarray objects can be added element-wise in a single expression. **Broadcasting** is also supported, which combines objects of different shapes within a single line of code.

In [16]:
np.random.seed(520)
r = np.arange(9).reshape((3,3))
s = np.arange(9).reshape((3,3))* 0.5

# element-wise addition
r+s

# scalar addition
r+6

# scalar multiplication
2*r

# linear transformation
2*r+6

# matrix multiplication
r@s

array([[ 7.5,  9. , 10.5],
       [21. , 27. , 33. ],
       [34.5, 45. , 55.5]])
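
The scalar operations above are the simplest case of broadcasting. The sketch below, with hypothetical arrays `row` and `col`, shows how arrays of different shapes are combined:

```python
import numpy as np

r = np.arange(9).reshape((3, 3))

row = np.array([10, 20, 30])             # shape (3,)
col = np.array([[100], [200], [300]])    # shape (3, 1)

print(r + row)             # row is added to every row of r
print(r + col)             # col is added to every column of r
print((row + col).shape)   # (3,) and (3, 1) broadcast to (3, 3)
```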

An important property of an ndarray is its memory layout. The optional `order` parameter specifies which elements of the array are stored next to each other in memory: the elements of a row ('C' order) or the elements of a column ('F' order). This minor difference can have a large impact on performance for large arrays and performance-critical algorithms.

In absolute terms, summing over a C-ordered ndarray object is faster, both over rows and over columns. In relative terms, a C-ordered ndarray sums over rows faster than over columns, while an F-ordered ndarray sums over columns faster than over rows.
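
A rough way to inspect the layout and measure these sums is sketched below. The array size and the repetition count are arbitrary assumptions, and the timings vary with hardware and NumPy version, so no particular result is implied:

```python
import timeit
import numpy as np

# Hypothetical test data: a 1000 x 1000 array of standard normal values.
x = np.random.standard_normal((1000, 1000))

C = np.array(x, order='C')   # elements of a row are contiguous in memory
F = np.array(x, order='F')   # elements of a column are contiguous in memory

print(C.flags['C_CONTIGUOUS'], F.flags['F_CONTIGUOUS'])
print(C.strides, F.strides)  # bytes to step along each axis
print(np.array_equal(C, F))  # same values, different layout

# Time the sum along each axis for both layouts.
repeats = 20
for name, arr in (('C', C), ('F', F)):
    for axis in (0, 1):
        t = timeit.timeit(lambda: arr.sum(axis=axis), number=repeats)
        print(f"{name}-order, axis={axis}: {t / repeats * 1e6:.1f} microseconds per call")
```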