
In [1]:
import numpy as np

- By default, array operations are element-by-element
- Arrays must match in all dimensions in order to be compatible
- If we multiply arrays with non-matching shapes we get an error

The exception is broadcasting: if one array has length 1 along a dimension, its data is repeated along that dimension to match the other array.

In [2]:
x = np.array([1, 2]).reshape(1, 2)
x

array([[1, 2]])

In [3]:
y = np.array([3, 4, 5]).reshape(3, 1)
y

array([[3],
       [4],
       [5]])

In [4]:
result = x + y
result.shape

(3, 2)

In [5]:
result

array([[4, 5],
       [5, 6],
       [6, 7]])
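To make the repetition explicit, the following sketch stretches both operands to the common shape with `np.broadcast_to` before adding them, and shows that shapes with no length-1 dimension to stretch raise an error. The names `x_full`, `y_full`, `explicit`, and `mismatch_raised` are illustrative only and do not appear in the notebook.

```python
import numpy as np

x = np.array([1, 2]).reshape(1, 2)
y = np.array([3, 4, 5]).reshape(3, 1)

# Stretch both operands to the common shape explicitly.
# broadcast_to returns read-only views, so no data is copied.
x_full = np.broadcast_to(x, (3, 2))
y_full = np.broadcast_to(y, (3, 2))
explicit = x_full + y_full  # same values as x + y

# Shapes (3,) and (2,) differ and neither has length 1, so this fails.
try:
    np.arange(3) + np.arange(2)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

In everyday code the plain `x + y` is preferable; `np.broadcast_to` is mainly useful for checking what shape NumPy will produce.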

Broadcasting is very powerful, and NumPy allows indexing with np.newaxis to temporarily create new length-1 dimensions on the fly.
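As a small sketch of that idea, two 1-D arrays can be combined into a 2-D table without calling `reshape`, by inserting a length-1 axis at indexing time. The names `row`, `col`, and `table` are illustrative only.

```python
import numpy as np

row = np.array([1, 2])     # shape (2,)
col = np.array([3, 4, 5])  # shape (3,)

# row[np.newaxis, :] has shape (1, 2); col[:, np.newaxis] has shape (3, 1).
# Broadcasting stretches both to (3, 2).
table = row[np.newaxis, :] + col[:, np.newaxis]

# np.newaxis is simply an alias for None, so row[None, :] is equivalent.
same_table = row[None, :] + col[:, None]
```

The original arrays keep their 1-D shapes; the extra axes exist only in the indexed views.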

Dot Products using broadcasting

In [6]:
a = np.arange(9).reshape(3, 3)
a

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [7]:
b = np.arange(3, 12).reshape(3, 3)
b

array([[ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [8]:
a[:, :, np.newaxis].shape

(3, 3, 1)

In [9]:
b[np.newaxis, :, :].shape

(1, 3, 3)

In [10]:
a[:, :, np.newaxis] * b[np.newaxis, :, :]

array([[[ 0,  0,  0],
        [ 6,  7,  8],
        [18, 20, 22]],

       [[ 9, 12, 15],
        [24, 28, 32],
        [45, 50, 55]],

       [[18, 24, 30],
        [42, 49, 56],
        [72, 80, 88]]])

In [12]:
(a[:, :, np.newaxis] * b[np.newaxis, :, :]).sum(1)

array([[ 24,  27,  30],
       [ 78,  90, 102],
       [132, 153, 174]])

In [11]:
# the same result, using the matrix multiplication operator
a @ b

array([[ 24,  27,  30],
       [ 78,  90, 102],
       [132, 153, 174]])
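The square example hides which axis is being summed. A sketch with non-square shapes (chosen here purely for illustration) makes the index pattern visible: element `[i, j, k]` of the broadcast product is `a[i, j] * b[j, k]`, and summing over the shared middle axis `j` gives the matrix product. `np.einsum` expresses the same contraction by name.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # shape (2, 3)
b = np.arange(12).reshape(3, 4)  # shape (3, 4)

# (2, 3, 1) * (1, 3, 4) broadcasts to (2, 3, 4)
products = a[:, :, np.newaxis] * b[np.newaxis, :, :]

# Sum over the shared axis (axis 1) to get the (2, 4) matrix product.
via_broadcast = products.sum(axis=1)

# The same contraction written with index labels.
via_einsum = np.einsum("ij,jk->ik", a, b)
```

The broadcast version builds the full three-dimensional intermediate array, so `a @ b` is the better choice for real matrix products. The broadcast form is mostly useful for understanding the mechanics.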

Record arrays

In [13]:
x = np.arange(50).reshape([10, 5])
record_x = x.view(
    dtype={"names": ["col1", "col2", "another", "more", "last"], "formats": [int] * 5}
)
record_x

array([[( 0,  1,  2,  3,  4)],
       [( 5,  6,  7,  8,  9)],
       [(10, 11, 12, 13, 14)],
       [(15, 16, 17, 18, 19)],
       [(20, 21, 22, 23, 24)],
       [(25, 26, 27, 28, 29)],
       [(30, 31, 32, 33, 34)],
       [(35, 36, 37, 38, 39)],
       [(40, 41, 42, 43, 44)],
       [(45, 46, 47, 48, 49)]],
      dtype=[('col1', '<i8'), ('col2', '<i8'), ('another', '<i8'), ('more', '<i8'), ('last', '<i8')])

In [14]:
record_x["col1"]

array([[ 0],
       [ 5],
       [10],
       [15],
       [20],
       [25],
       [30],
       [35],
       [40],
       [45]])
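Note that the view above keeps a trailing length-1 axis, so each field comes back with shape `(10, 1)`. A minimal sketch, assuming a flat column is wanted, flattens the structured view first. It also shows that a structured array can be viewed as an `np.recarray` to allow attribute access to fields. The names `structured`, `col1`, and `rec` are illustrative only.

```python
import numpy as np

x = np.arange(50).reshape(10, 5)
names = ["col1", "col2", "another", "more", "last"]

# Use x.dtype for every field so the five fields exactly cover one row.
row_dtype = np.dtype({"names": names, "formats": [x.dtype] * 5})

# Viewing gives shape (10, 1); reshape(-1) drops the trailing axis.
structured = x.view(row_dtype).reshape(-1)
col1 = structured["col1"]  # shape (10,)

# A recarray view exposes the fields as attributes.
rec = structured.view(np.recarray)
more = rec.more
```

Both `structured` and `rec` are views, so they still share memory with `x`.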

Logical arrays, masking, and selection
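A minimal sketch of what this heading refers to, using made-up data: a comparison produces a boolean (logical) array, which can then be used as a mask to select or assign elements. The names `mask`, `evens`, `middle`, and `clipped` are illustrative only.

```python
import numpy as np

x = np.arange(10)

# A comparison yields a boolean array of the same shape.
mask = x % 2 == 0

# Indexing with a boolean mask selects the elements where it is True.
evens = x[mask]

# Combine conditions with & and |, with parentheses around each comparison.
middle = x[(x > 2) & (x < 7)]

# Masks also work on the left-hand side of an assignment.
clipped = x.copy()
clipped[clipped > 5] = 0
```

Unlike basic slicing, selection with a boolean mask returns a copy rather than a view.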

NumPy memory

In [15]:
x = np.arange(5)
y = x[:]

In [16]:
y[2] = 0
x

array([0, 1, 0, 3, 4])

Basic slicing returns a view on the same data, which is why modifying y also changed x. We must use np.copy to force separate memory; otherwise NumPy tries its hardest to make slices views on the data.
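The difference can be checked directly with `np.shares_memory`. This is a small sketch, with illustrative variable names, contrasting a slice, an explicit copy, and integer-array ("fancy") indexing.

```python
import numpy as np

x = np.arange(5)

view = x[:]            # basic slice: a view on x's data
copy = np.copy(x)      # explicit copy: separate memory
fancy = x[[0, 1, 2]]   # integer-array indexing also returns a copy

view_shares = np.shares_memory(x, view)    # True
copy_shares = np.shares_memory(x, copy)    # False
fancy_shares = np.shares_memory(x, fancy)  # False

# Writing to the copy leaves x untouched.
copy[2] = 0
```

`x.copy()` is equivalent to `np.copy(x)` for plain arrays.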