In [2]:
import numpy as np

#### View vs Copy

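`view` creates a new array object that shares the original's data buffer (a shallow copy), while `copy` creates a new array with its own copy of the data (a deep copy).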

> It is possible to access the array differently by just changing certain metadata like `stride` and `dtype` without changing the data buffer. This creates a new way of looking at the data and these new arrays are called views. The data buffer remains the same, so any changes made to a view are reflected in the original array.

In [2]:
og_arr = np.array([5, 9, 15, 21])

In [3]:
cp9 = og_arr.copy()
cp9[0] = 6
print(cp9)
print(og_arr)

[ 6  9 15 21]
[ 5  9 15 21]


In [4]:
cp0 = og_arr.view()
cp0[0] = 6
print(cp0)
print(og_arr)

[ 6  9 15 21]
[ 6  9 15 21]


So `og_arr` changes as well.
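
One way to check whether an array is a view or a copy, rather than mutating it and looking for side effects, is sketched below. It relies on the `base` attribute and on `np.shares_memory`; the variable names `cp` and `vw` are illustrative only.

```python
import numpy as np

og_arr = np.array([5, 9, 15, 21])
cp = og_arr.copy()   # owns a fresh data buffer
vw = og_arr.view()   # new array object, same data buffer

# `base` is None for an array that owns its data, and refers to the
# owning array for a view.
print(cp.base is None)     # True
print(vw.base is og_arr)   # True

# `np.shares_memory` reports whether two arrays overlap in memory.
print(np.shares_memory(og_arr, cp))   # False
print(np.shares_memory(og_arr, vw))   # True
```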

In [10]:
arr_4d = np.random.rand(2, 3, 4, 3)
arr_4d

array([[[[0.67092056, 0.15784597, 0.48386653],
         [0.97369318, 0.01764104, 0.96542136],
         [0.57248924, 0.58069273, 0.5702775 ],
         [0.19733103, 0.82451263, 0.88177775]],

        [[0.25188003, 0.61163378, 0.88181282],
         [0.66347581, 0.77692248, 0.8859976 ],
         [0.11204033, 0.75380498, 0.32292482],
         [0.10561936, 0.22941674, 0.79360653]],

        [[0.29671113, 0.3188477 , 0.66119149],
         [0.8032989 , 0.46201995, 0.23830548],
         [0.05419379, 0.8908145 , 0.80178266],
         [0.57748712, 0.65020347, 0.96352654]]],


       [[[0.70034815, 0.5108154 , 0.46969253],
         [0.21982049, 0.00326995, 0.87876271],
         [0.53291755, 0.3340263 , 0.14396213],
         [0.29663552, 0.08380586, 0.04986982]],

        [[0.03948369, 0.72862269, 0.24061073],
         [0.0609643 , 0.93493742, 0.30343186],
         [0.11036118, 0.92624684, 0.55723774],
         [0.01755226, 0.92319468, 0.39958595]],

        [[0.00576494, 0.97577704, 0.07238125],
 

#### Slicing

NumPy slicing creates a `view`, unlike slicing built-in Python sequences such as strings, tuples and lists, which creates a `copy`.
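
The difference can be seen side by side in a small sketch. The names `py_list` and `np_arr` are made up for this illustration.

```python
import numpy as np

py_list = [5, 9, 15, 21]
np_arr = np.array(py_list)

list_slice = py_list[1:3]   # slicing a list builds a new list
arr_slice = np_arr[1:3]     # slicing an ndarray builds a view

list_slice[0] = -1
arr_slice[0] = -1

print(py_list)   # [5, 9, 15, 21]  (unchanged)
print(np_arr)    # [ 5 -1 15 21]   (changed through the view)
```

If an independent array is needed, `np_arr[1:3].copy()` avoids writing back into the original.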

In [25]:
arr_4d[0, 2, 1] = [6, 6, 6]

In [26]:
print(arr_4d[0][2])
print("\n=====================================\n")
print(arr_4d[0, 2])

[[0.29671113 0.3188477  0.66119149]
 [6.         6.         6.        ]
 [0.05419379 0.8908145  0.80178266]
 [0.57748712 0.65020347 0.96352654]]


[[0.29671113 0.3188477  0.66119149]
 [6.         6.         6.        ]
 [0.05419379 0.8908145  0.80178266]
 [0.57748712 0.65020347 0.96352654]]


Note that `arr_4d[0, 2]` and `arr_4d[0][2]` select the same elements, but the second form is less efficient: `arr_4d[0]` first creates a temporary array, which is then indexed by 2.
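
A short check of this equivalence, assuming a freshly generated array (the seed and the name `arr` are arbitrary). Note that `==` on arrays compares element-wise and returns a boolean array, so `np.array_equal` is used for a single yes/no answer.

```python
import numpy as np

rng = np.random.default_rng(0)    # seed chosen arbitrarily
arr = rng.random((2, 3, 4, 3))

a = arr[0, 2]    # one indexing operation
b = arr[0][2]    # two: arr[0] builds an intermediate view, then [2]

print(np.array_equal(a, b))     # True
print(np.shares_memory(a, b))   # True: both are views of arr
```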

#### Expanding dims with [`np.newaxis`](https://numpy.org/doc/stable/reference/constants.html#numpy.newaxis)

In [27]:
arr_5d_1 = arr_4d[:, :, :, :, np.newaxis]
arr_5d_1

array([[[[[6.70920562e-01],
          [1.57845973e-01],
          [4.83866530e-01]],

         [[6.00000000e+00],
          [6.00000000e+00],
          [6.00000000e+00]],

         [[5.72489243e-01],
          [5.80692733e-01],
          [5.70277504e-01]],

         [[1.97331027e-01],
          [8.24512630e-01],
          [8.81777750e-01]]],


        [[[2.51880027e-01],
          [6.11633785e-01],
          [8.81812818e-01]],

         [[6.63475810e-01],
          [7.76922481e-01],
          [8.85997605e-01]],

         [[1.12040332e-01],
          [7.53804976e-01],
          [3.22924817e-01]],

         [[1.05619365e-01],
          [2.29416744e-01],
          [7.93606534e-01]]],


        [[[2.96711128e-01],
          [3.18847699e-01],
          [6.61191490e-01]],

         [[6.00000000e+00],
          [6.00000000e+00],
          [6.00000000e+00]],

         [[5.41937886e-02],
          [8.90814505e-01],
          [8.01782660e-01]],

         [[5.77487116e-01],
          [6.50203472e

In [28]:
arr_5d_2 = arr_4d[:, :, :, np.newaxis, :]
arr_5d_2

array([[[[[6.70920562e-01, 1.57845973e-01, 4.83866530e-01]],

         [[6.00000000e+00, 6.00000000e+00, 6.00000000e+00]],

         [[5.72489243e-01, 5.80692733e-01, 5.70277504e-01]],

         [[1.97331027e-01, 8.24512630e-01, 8.81777750e-01]]],


        [[[2.51880027e-01, 6.11633785e-01, 8.81812818e-01]],

         [[6.63475810e-01, 7.76922481e-01, 8.85997605e-01]],

         [[1.12040332e-01, 7.53804976e-01, 3.22924817e-01]],

         [[1.05619365e-01, 2.29416744e-01, 7.93606534e-01]]],


        [[[2.96711128e-01, 3.18847699e-01, 6.61191490e-01]],

         [[6.00000000e+00, 6.00000000e+00, 6.00000000e+00]],

         [[5.41937886e-02, 8.90814505e-01, 8.01782660e-01]],

         [[5.77487116e-01, 6.50203472e-01, 9.63526538e-01]]]],



       [[[[7.00348151e-01, 5.10815401e-01, 4.69692526e-01]],

         [[2.19820493e-01, 3.26994514e-03, 8.78762711e-01]],

         [[5.32917553e-01, 3.34026299e-01, 1.43962129e-01]],

         [[2.96635516e-01, 8.38058618e-02, 4.98698178e-02]]],
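
Since the printed arrays are long, comparing shapes may make the effect of `np.newaxis` easier to see. The sketch below uses a zero-filled stand-in with the same shape as `arr_4d`.

```python
import numpy as np

arr_4d = np.zeros((2, 3, 4, 3))   # stand-in with the same shape

arr_5d_1 = arr_4d[:, :, :, :, np.newaxis]
arr_5d_2 = arr_4d[:, :, :, np.newaxis, :]

print(arr_5d_1.shape)   # (2, 3, 4, 3, 1)
print(arr_5d_2.shape)   # (2, 3, 4, 1, 3)

# np.newaxis is an alias for None; np.expand_dims inserts an axis too.
print(np.newaxis is None)                      # True
print(np.expand_dims(arr_4d, axis=3).shape)    # (2, 3, 4, 1, 3)
```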


#### Using [`Ellipsis`](https://docs.python.org/dev/library/constants.html#Ellipsis)

In [29]:
arr_5d_2[:, :, :, :, 1]

array([[[[1.57845973e-01],
         [6.00000000e+00],
         [5.80692733e-01],
         [8.24512630e-01]],

        [[6.11633785e-01],
         [7.76922481e-01],
         [7.53804976e-01],
         [2.29416744e-01]],

        [[3.18847699e-01],
         [6.00000000e+00],
         [8.90814505e-01],
         [6.50203472e-01]]],


       [[[5.10815401e-01],
         [3.26994514e-03],
         [3.34026299e-01],
         [8.38058618e-02]],

        [[7.28622694e-01],
         [9.34937419e-01],
         [9.26246837e-01],
         [9.23194684e-01]],

        [[9.75777038e-01],
         [4.73614918e-01],
         [8.45803407e-01],
         [2.15316700e-01]]]])

In [22]:
arr_5d_2[..., 1]

array([[[[0.15784597],
         [0.01764104],
         [0.58069273],
         [0.82451263]],

        [[0.61163378],
         [0.77692248],
         [0.75380498],
         [0.22941674]],

        [[0.3188477 ],
         [0.46201995],
         [0.8908145 ],
         [0.65020347]]],


       [[[0.5108154 ],
         [0.00326995],
         [0.3340263 ],
         [0.08380586]],

        [[0.72862269],
         [0.93493742],
         [0.92624684],
         [0.92319468]],

        [[0.97577704],
         [0.47361492],
         [0.84580341],
         [0.2153167 ]]]])
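
The two cells above index the same elements. Judging by the cell counters, the `arr_5d_2[..., 1]` cell (In [22]) was probably run before the assignment in In [25], which would explain why its printed values differ from In [29]. A minimal sketch on a stand-in array of the same shape suggests the two forms agree when run back to back:

```python
import numpy as np

# Stand-in with the same shape as arr_5d_2; the values are arbitrary.
arr = np.arange(2 * 3 * 4 * 1 * 3).reshape(2, 3, 4, 1, 3)

explicit = arr[:, :, :, :, 1]
short = arr[..., 1]           # `...` expands to as many `:` as needed

print(short.shape)                        # (2, 3, 4, 1)
print(np.array_equal(explicit, short))    # True

# The ellipsis can also sit between explicit indices.
print(arr[0, ..., 1].shape)               # (3, 4, 1)
```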

### Advanced indexing

Advanced indexing is triggered when the selection object, `obj`, is one of:

 - a non-tuple sequence object

 - an ndarray (of data type integer or bool)

 - a tuple with at least one sequence object or ndarray (of data type integer or bool)

There are two types of advanced indexing: integer and Boolean.
 
Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).
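
The copy-versus-view contrast can be checked with `np.shares_memory`. This is a small sketch; the names `basic` and `fancy` are illustrative.

```python
import numpy as np

x = np.arange(10)

basic = x[2:5]          # basic slicing -> view
fancy = x[[2, 3, 4]]    # integer-array indexing -> copy

print(np.shares_memory(x, basic))   # True
print(np.shares_memory(x, fancy))   # False

fancy[0] = 99
print(x[2])             # 2: x is unaffected by writes to the copy

# Assigning *through* an advanced index still writes into x itself.
x[[2, 3, 4]] = 0
print(x)                # [0 1 0 0 0 5 6 7 8 9]
```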

#### Integer indexing

Each integer array represents a number of indices into that dimension.

In [30]:
x = np.arange(10, 1, -1)
x

array([10,  9,  8,  7,  6,  5,  4,  3,  2])

In [33]:
x[(1, 2, -1)]

IndexError: too many indices for array: array is 1-dimensional, but 3 were indexed

In [34]:
x[(1, 2, -1),]

array([9, 8, 2])

In [35]:
x[np.array([1, 2, -1])]

array([9, 8, 2])
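
The `IndexError` in In [33] arises because a bare tuple is read as one index per dimension, so `x[(1, 2, -1)]` means the same as `x[1, 2, -1]`. A list, an ndarray, or a 1-tuple wrapping a sequence is instead treated as an advanced index on the first axis. A brief sketch:

```python
import numpy as np

x = np.arange(10, 1, -1)

try:
    x[(1, 2, -1)]             # same as x[1, 2, -1]: three axes requested
except IndexError as exc:
    print("IndexError:", exc)

print(x[[1, 2, -1]])          # [9 8 2]  list -> advanced index on axis 0
print(x[(1, 2, -1),])         # [9 8 2]  1-tuple holding a sequence
```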

In [40]:
y = np.arange(35).reshape(5, 7)
y

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34]])

In [37]:
y[np.array([0, 2, 4])]

array([[ 0,  1,  2,  3,  4,  5,  6],
       [14, 15, 16, 17, 18, 19, 20],
       [28, 29, 30, 31, 32, 33, 34]])

In [38]:
y[np.array([0, 2, 4]), np.array([0, 1, 2])]

array([ 0, 15, 30])

In [43]:
y[np.array([0, 2, 4]), np.array([0, 1])]

IndexError: shape mismatch: indexing arrays could not be broadcast together with shapes (3,) (2,) 
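
The index arrays are broadcast against each other, which is why shapes `(3,)` and `(2,)` fail. If the intent is to pick every combination of the given rows and columns, one option is to make the row index a column vector, or to use `np.ix_`. The names `rows`, `cols` and `block` below are illustrative.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
rows = np.array([0, 2, 4])
cols = np.array([0, 1])

# rows[:, np.newaxis] has shape (3, 1); it broadcasts with cols (2,) to (3, 2).
block = y[rows[:, np.newaxis], cols]
print(block)
# [[ 0  1]
#  [14 15]
#  [28 29]]

# np.ix_ builds an equivalent pair of broadcastable index arrays.
print(np.array_equal(block, y[np.ix_(rows, cols)]))   # True
```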

#### Boolean indexing

In [57]:
z = np.arange(35)
np.random.shuffle(z)
z = z.reshape(5, 7)
z

array([[26, 11, 30, 34, 24,  0,  1],
       [ 7, 22, 13, 28,  6, 19,  3],
       [18, 23,  5, 14, 17, 25,  8],
       [16, 31, 33,  9, 15, 10, 12],
       [32, 27, 21, 20,  4, 29,  2]])

In [58]:
z > 20

array([[ True, False,  True,  True,  True, False, False],
       [False,  True, False,  True, False, False, False],
       [False,  True, False, False, False,  True, False],
       [False,  True,  True, False, False, False, False],
       [ True,  True,  True, False, False,  True, False]])

In [63]:
# All rows with sum greater than 120
rowsum = z.sum(axis=1)
z[rowsum > 120, :]

array([[26, 11, 30, 34, 24,  0,  1],
       [16, 31, 33,  9, 15, 10, 12],
       [32, 27, 21, 20,  4, 29,  2]])

In [64]:
x = np.arange(30).reshape(2, 3, 5)
x

array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])

In [69]:
b = np.array([[True, True, False]])
x[b]

IndexError: boolean index did not match indexed array along dimension 0; dimension is 2 but corresponding boolean dimension is 1

In [76]:
b = np.array([[True, True, False], [False, True, True]])
x[b]

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29]])

In [77]:
b

array([[ True,  True, False],
       [False,  True,  True]])

In [78]:
b.shape, x.shape

((2, 3), (2, 3, 5))

In [73]:
b = np.array([[True, True, False], [False, True, True], [False, True, True]])
x[b]

IndexError: boolean index did not match indexed array along dimension 0; dimension is 2 but corresponding boolean dimension is 3

In [74]:
b = np.array([[True, True], [False, True]])
x[b]

IndexError: boolean index did not match indexed array along dimension 1; dimension is 3 but corresponding boolean dimension is 2

For trailing dimensions not covered by the boolean array, `:` is implied.

In [90]:
# Selecting/filtering from axis 0

b = np.array([True, False])
x[b] # Equivalent to x[b, :, :]

array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]]])

In [92]:
# Selecting/filtering over axes 0 and 1 with a 2-D mask

b = np.array([[ True, True, False ],
              [ False, True, True ]])
x[b, :] # Equivalent to x[b]

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29]])

In [88]:
# Selecting/filtering over all three axes with a 3-D mask

b = np.array([[[ True,  False,  False,  False,  False ],
               [ False,  False,  True,  True,  True ],
               [ True,  True,  False,  False,  True ]],

              [[ True,  True,  False,  False, True ],
               [False,  False,  True,  True,  True ],
               [True,  False,  True,  True,  False ]]])
x[b]

array([ 0,  7,  8,  9, 10, 11, 14, 15, 16, 19, 22, 23, 24, 25, 27, 28])
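
The three cases above follow one pattern: a mask with `k` dimensions consumes the first `k` axes of `x` and replaces them with a single axis whose length is the number of `True` entries. The sketch below checks the resulting shapes; `mask_3d` uses an even-number test as a stand-in for the hand-written mask.

```python
import numpy as np

x = np.arange(30).reshape(2, 3, 5)

mask_1d = np.array([True, False])            # covers axis 0
mask_2d = np.array([[True, True, False],
                    [False, True, True]])    # covers axes 0 and 1
mask_3d = x % 2 == 0                         # covers all three axes

print(x[mask_1d].shape)   # (1, 3, 5)
print(x[mask_2d].shape)   # (4, 5)
print(x[mask_3d].shape)   # (15,)

# A boolean index behaves like the integer index given by mask.nonzero().
print(np.array_equal(x[mask_2d], x[mask_2d.nonzero()]))   # True
```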

### Broadcasting

> The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is *broadcast* across the larger array so that they have compatible shapes. Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python.

When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when

 - they are equal, or

 - one of them is 1

Arrays do not need to have the same number of dimensions: missing leading dimensions are treated as having size 1.
 

**Example:**

```
A      (4d array):  8 x 1 x 6 x 1
B      (3d array):      7 x 1 x 5
Result (4d array):  8 x 7 x 6 x 5
```


 - Line up the shapes by their trailing axes and compare each pair of sizes to check that they are compatible.

 - When either of the dimensions compared is 1, the other is used. In other words, dimensions with size 1 are stretched or *copied* to match the other.
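
The shape example above can be verified directly. This sketch assumes NumPy 1.20 or later for `np.broadcast_shapes`; on older versions the shape of `A + B` alone shows the same thing.

```python
import numpy as np

A = np.ones((8, 1, 6, 1))
B = np.ones((7, 1, 5))

print((A + B).shape)                            # (8, 7, 6, 5)
print(np.broadcast_shapes(A.shape, B.shape))    # (8, 7, 6, 5)

# Trailing sizes 3 and 4 differ and neither is 1, so this fails.
try:
    np.ones((2, 3)) + np.ones((2, 4))
except ValueError as exc:
    print("ValueError:", exc)
```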
 
Broadcasting has applications in image processing, for example scaling each colour channel of an RGB image by a different factor.
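
As one possible illustration of the image-processing case, the sketch below scales each colour channel of a tiny RGB image. The image size, pixel values and scale factors are all invented for the example.

```python
import numpy as np

# Hypothetical 4x4 RGB image with every pixel set to 0.5.
image = np.full((4, 4, 3), 0.5)
scale = np.array([1.0, 0.5, 2.0])   # one factor per colour channel

# image:  4 x 4 x 3
# scale:          3
# result: 4 x 4 x 3
scaled = image * scale

print(scaled.shape)             # (4, 4, 3)
print(scaled[0, 0].tolist())    # [0.5, 0.25, 1.0]
```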

In [6]:
x = np.array([[1], [2], [3]])
y = np.array([4, 5, 6])
b = np.broadcast(x, y)
b

<numpy.broadcast at 0x232770cb420>
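
The `np.broadcast` object only describes the broadcast and does not compute anything. A short sketch of what can be read from it, alongside the result of actually adding the arrays:

```python
import numpy as np

x = np.array([[1], [2], [3]])   # shape (3, 1)
y = np.array([4, 5, 6])         # shape (3,)
b = np.broadcast(x, y)

print(b.shape)   # (3, 3)
print(b.nd)      # 2

# Iterating yields the element pairs an element-wise operation would see.
pairs = [(int(u), int(v)) for u, v in b]
print(pairs[:3])   # [(1, 4), (1, 5), (1, 6)]

print(x + y)
# [[5 6 7]
#  [6 7 8]
#  [7 8 9]]
```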