# Advanced NumPy

## `numpy` internals

Under the hood, NumPy is implemented in C rather than Python. A NumPy array essentially stores a pointer to its data (`char *data`), plus the metadata needed to interpret it:
1) NumPy stores the data flat, as a one-dimensional block of memory.
2) It has an indexing scheme on top of that.
3) It also knows how to step through the memory to get values.
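
A minimal sketch of how these three points fit together, using a small hypothetical array `demo` (it is not one of the arrays used in the cells below). The helper `element_at` is made up for illustration: it turns an index into a byte offset with the strides and reads the value from the flat data.

```python
import numpy as np

# Hypothetical 3x4 example array, used only for this illustration.
demo = np.arange(12, dtype=np.int64).reshape(3, 4)

# 1) The data is stored flat; ravel() of a C-contiguous array is a view of it.
flat = demo.ravel()

def element_at(a, flat_data, i, j):
    """Look up a[i, j] by stepping through the flat data with strides."""
    # 2) + 3) index -> byte offset -> position in the flat data
    byte_offset = i * a.strides[0] + j * a.strides[1]
    return flat_data[byte_offset // a.itemsize]

value = element_at(demo, flat, 2, 1)
print(value, demo[2, 1])  # 9 9
```

This only works as written because `demo` is C-contiguous and starts at the beginning of its buffer; views need an extra start offset, which is discussed further down.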

In [1]:
import numpy as np
np.random.seed(2374)

In [11]:
arr = np.random.randint(10, size=(8,8), dtype= np.int64)

In [12]:
arr

array([[3, 5, 1, 7, 6, 4, 0, 9],
       [6, 8, 0, 1, 2, 6, 1, 7],
       [9, 8, 9, 3, 9, 9, 4, 1],
       [2, 0, 6, 0, 1, 7, 2, 7],
       [3, 4, 7, 0, 9, 0, 6, 6],
       [0, 9, 7, 0, 3, 8, 7, 8],
       [5, 6, 8, 8, 4, 4, 1, 6],
       [8, 2, 8, 8, 8, 2, 6, 3]], dtype=int64)

Information about array elements:

In [13]:
arr.itemsize, arr.dtype

(8, dtype('int64'))

How do we step through the array's memory? Using the `strides` attribute:

In [15]:
arr.strides  # For int64, each element takes 8 bytes. To get from arr[0, 0] to arr[1, 0],
             # we must skip all 8 elements of the first row: 8 * 8 = 64 bytes.
             # To move one step along axis 1 (arr[0, 0] to arr[0, 1]), we move just 8 bytes.

(64, 8)

I.e., `arr[0, 1]` is 8 bytes away from `arr[0, 0]` (one step along axis `1`), while `arr[1, 0]` is 64 bytes away from `arr[0, 0]` (one step along axis `0`).

In [86]:
arr.strides[0] == arr.shape[1] * arr.itemsize #elements in the row, times bytes per element.

True
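
The same relation generalizes to any number of dimensions. The sketch below uses a hypothetical helper, `c_contiguous_strides`, to rebuild the strides of a freshly allocated (C-contiguous) array from its shape and item size; it does not apply to views with gaps.

```python
import numpy as np

def c_contiguous_strides(shape, itemsize):
    """Strides (in bytes) of a C-contiguous array with the given shape."""
    strides = []
    step = itemsize
    # Walk the axes from last to first: the last axis is the densest one.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

demo_2d = np.zeros((8, 8), dtype=np.int64)
demo_3d = np.zeros((2, 3, 4), dtype=np.int64)

print(c_contiguous_strides(demo_2d.shape, demo_2d.itemsize))  # (64, 8)
print(c_contiguous_strides(demo_3d.shape, demo_3d.itemsize))  # (96, 32, 8)
```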

But what about views?

In [20]:
arr_view = arr[::2, 1:] #we take a slice of our original array, which is just a view

In [21]:
arr

array([[3, 5, 1, 7, 6, 4, 0, 9],
       [6, 8, 0, 1, 2, 6, 1, 7],
       [9, 8, 9, 3, 9, 9, 4, 1],
       [2, 0, 6, 0, 1, 7, 2, 7],
       [3, 4, 7, 0, 9, 0, 6, 6],
       [0, 9, 7, 0, 3, 8, 7, 8],
       [5, 6, 8, 8, 4, 4, 1, 6],
       [8, 2, 8, 8, 8, 2, 6, 3]], dtype=int64)

In [22]:
arr_view

array([[5, 1, 7, 6, 4, 0, 9],
       [8, 9, 3, 9, 9, 4, 1],
       [4, 7, 0, 9, 0, 6, 6],
       [6, 8, 8, 4, 4, 1, 6]], dtype=int64)

Information about underlying array structure:

In [90]:
arr.flags

  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

In [91]:
arr_view.flags
# C_CONTIGUOUS means the elements are laid out back to back in memory, in row-major order. For arr_view this is False, because the slice leaves gaps between rows.
# F_CONTIGUOUS ...
# OWNDATA tells us whether the array owns its memory. It is False for arr_view, which borrows the memory of arr.


  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

Views always have a base array:

In [23]:
arr_view.base

array([[3, 5, 1, 7, 6, 4, 0, 9],
       [6, 8, 0, 1, 2, 6, 1, 7],
       [9, 8, 9, 3, 9, 9, 4, 1],
       [2, 0, 6, 0, 1, 7, 2, 7],
       [3, 4, 7, 0, 9, 0, 6, 6],
       [0, 9, 7, 0, 3, 8, 7, 8],
       [5, 6, 8, 8, 4, 4, 1, 6],
       [8, 2, 8, 8, 8, 2, 6, 3]], dtype=int64)

In [24]:
arr.base  # no base (None), because this array owns its data

In [25]:
arr_view.base is arr

True

`strides` are provided with respect to the **underlying data** (which is the same between original array `arr` and view array `arr_view`!):

In [26]:
arr_view.strides
# The stride along axis 0 has doubled, because the view skips every other row.

(128, 8)

In [27]:
arr_view.shape

(4, 7)

Since the view is not contiguous (we are skipping rows), this relation no longer holds:

In [28]:
arr_view.strides[0] == arr_view.shape[1] * arr_view.itemsize

False

Also, the view does not start at byte 0 of the data, but 8 bytes into it:

In [29]:
np.byte_bounds(arr_view)[0] - np.byte_bounds(arr)[0]  # byte_bounds returns the (start, end) addresses of the array's memory
# This cell computes the difference between the address of the first element of arr_view and that of arr.
# The result is 8: in the original array the two elements are one position apart, and each position takes 8 bytes.
# Note: in NumPy 2.0 and later this function is available as np.lib.array_utils.byte_bounds.

8
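
Both the view's strides and its starting offset can be derived from the slice itself. The sketch below uses a stand-in array `base_arr` with the same shape and dtype as `arr` (the values are different) and reads the start addresses from `__array_interface__`.

```python
import numpy as np

base_arr = np.arange(64, dtype=np.int64).reshape(8, 8)  # stand-in for `arr`
view = base_arr[::2, 1:]

# Resolve the slices to concrete (start, stop, step) triples.
row_start, _, row_step = slice(None, None, 2).indices(base_arr.shape[0])
col_start, _, col_step = slice(1, None).indices(base_arr.shape[1])

# A slice step multiplies the stride; a slice start shifts the first byte.
expected_strides = (base_arr.strides[0] * row_step, base_arr.strides[1] * col_step)
expected_offset = base_arr.strides[0] * row_start + base_arr.strides[1] * col_start

# __array_interface__["data"][0] is the address of the first element.
actual_offset = (view.__array_interface__["data"][0]
                 - base_arr.__array_interface__["data"][0])

print(expected_strides, view.strides)  # (128, 8) (128, 8)
print(expected_offset, actual_offset)  # 8 8
```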

In [30]:
np.byte_bounds(arr_view)  # returns the starting and ending addresses of the array

(2131996699096, 2131996699536)

In [31]:
np.byte_bounds(arr_view)[1] - np.byte_bounds(arr)[1]

-64

The takeaway here is that the data of a NumPy array is one-dimensional, and the array "strides" through that single, flat chunk of memory. It is presented to us as a multi-dimensional array, but the underlying storage is one-dimensional.

In [37]:
arr

array([[3, 5, 1, 7, 6, 4, 0, 9],
       [6, 8, 0, 1, 2, 6, 1, 7],
       [9, 8, 9, 3, 9, 9, 4, 1],
       [2, 0, 6, 0, 1, 7, 2, 7],
       [3, 4, 7, 0, 9, 0, 6, 6],
       [0, 9, 7, 0, 3, 8, 7, 8],
       [5, 6, 8, 8, 4, 4, 1, 6],
       [8, 2, 8, 8, 8, 2, 6, 3]], dtype=int64)

In [38]:
arr_view

array([[5, 1, 7, 6, 4, 0, 9],
       [8, 9, 3, 9, 9, 4, 1],
       [4, 7, 0, 9, 0, 6, 6],
       [6, 8, 8, 4, 4, 1, 6]], dtype=int64)

In [39]:
arr_view.strides

(128, 8)

If the array is in reality one-dimensional, then a transpose is just a different indexing scheme on top of it: we only need to jump through the memory differently, and nothing else actually changes.

In [34]:
arr.T

array([[3, 6, 9, 2, 3, 0, 5, 8],
       [5, 8, 8, 0, 4, 9, 6, 2],
       [1, 0, 9, 6, 7, 7, 8, 8],
       [7, 1, 3, 0, 0, 0, 8, 8],
       [6, 2, 9, 1, 9, 3, 4, 8],
       [4, 6, 9, 7, 0, 8, 4, 2],
       [0, 1, 4, 2, 6, 7, 1, 6],
       [9, 7, 1, 7, 6, 8, 6, 3]], dtype=int64)

In [40]:
arr.T.strides  # The strides of the transposed array are exactly swapped. Since the data is one-dimensional under the hood,
               # a transpose does not move the data itself, only the way in which we step through it.

(8, 64)

The transpose of a view reports similarly swapped strides. Is it a view as well?

In [41]:
arr_view.T.strides  # The answer is yes.

(8, 128)

In [42]:
arr_view.T[::2, 1:].base is arr

True
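
One way to convince ourselves that a transpose shares memory with the original is to write through it. A small sketch with a hypothetical array `demo`:

```python
import numpy as np

demo = np.arange(6, dtype=np.int64).reshape(2, 3)
demo_t = demo.T

print(demo.strides, demo_t.strides)    # (24, 8) (8, 24)
print(np.shares_memory(demo, demo_t))  # True

demo_t[2, 0] = 100   # write through the transposed view ...
print(demo[0, 2])    # ... and read it back from the original: 100
```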

### Creating views manually

In [108]:
arr

array([[5, 7, 6, 8, 2, 9, 2, 0],
       [3, 1, 5, 8, 7, 5, 3, 1],
       [8, 6, 9, 3, 3, 5, 1, 2],
       [0, 7, 5, 7, 1, 2, 8, 1],
       [4, 3, 9, 5, 2, 4, 5, 2],
       [8, 4, 8, 7, 8, 9, 2, 2],
       [9, 8, 7, 4, 7, 3, 8, 2],
       [0, 3, 6, 6, 4, 0, 9, 4]], dtype=int64)

Take the memory associated with `arr` (equivalent to a pointer):

In [109]:
arr.data

<memory at 0x000001A6F4CFAD40>

Create a new array, pointing to the same memory:

In [44]:
arr.shape

(8, 8)

In [46]:
np.ndarray(buffer=arr.data, shape=arr.shape, dtype=arr.dtype)  # buffer points to the existing memory
# Here we reuse the shape and dtype of the original array; the data itself is not copied.

array([[3, 5, 1, 7, 6, 4, 0, 9],
       [6, 8, 0, 1, 2, 6, 1, 7],
       [9, 8, 9, 3, 9, 9, 4, 1],
       [2, 0, 6, 0, 1, 7, 2, 7],
       [3, 4, 7, 0, 9, 0, 6, 6],
       [0, 9, 7, 0, 3, 8, 7, 8],
       [5, 6, 8, 8, 4, 4, 1, 6],
       [8, 2, 8, 8, 8, 2, 6, 3]], dtype=int64)

Create a new *view* pointing to the same memory:

In [48]:
arr[::2]

array([[3, 5, 1, 7, 6, 4, 0, 9],
       [9, 8, 9, 3, 9, 9, 4, 1],
       [3, 4, 7, 0, 9, 0, 6, 6],
       [5, 6, 8, 8, 4, 4, 1, 6]], dtype=int64)

In [47]:
np.ndarray(buffer=arr.data, shape=(4, 8), dtype=arr.dtype, strides=(128, 8))  # the same view as arr[::2], built by passing the strides explicitly

array([[3, 5, 1, 7, 6, 4, 0, 9],
       [9, 8, 9, 3, 9, 9, 4, 1],
       [3, 4, 7, 0, 9, 0, 6, 6],
       [5, 6, 8, 8, 4, 4, 1, 6]], dtype=int64)

In [112]:
np.ndarray(buffer=arr.data, shape=(4, 8), dtype=arr.dtype, strides=(128, 8)).base is arr

True
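
NumPy also ships a helper for this kind of manual view, `numpy.lib.stride_tricks.as_strided`. The sketch below rebuilds the "every other row" view on a stand-in array `demo`. Note that `as_strided` does not check that the requested shape and strides stay inside the buffer, so passing `writeable=False` is a sensible precaution.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

demo = np.arange(64, dtype=np.int64).reshape(8, 8)  # stand-in for `arr`

# Same shape and strides as in the np.ndarray(buffer=...) call above.
every_other_row = as_strided(demo, shape=(4, 8), strides=(128, 8), writeable=False)

print(np.array_equal(every_other_row, demo[::2]))  # True
print(np.shares_memory(every_other_row, demo))     # True
```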

## Cache effects

How does all of this translate into things we actually need to know? Accessing data is often much more costly than the calculations themselves. We want the data to be as close to the processor as possible, which is why CPUs have caches: so we don't have to go all the way to main memory for every access.

In [49]:
#If we want to sum all elements of a large array:
large_arr = np.random.randint(100, size=(1000000,))

In [50]:
#another array which is 4 times larger
STEP = 4
larger_arr = np.random.randint(100, size=(1000000*STEP,))

In [51]:
larger_arr.shape, large_arr.shape

((4000000,), (1000000,))

In [52]:
%timeit -n 100 -r 3 large_arr.sum()

835 µs ± 112 µs per loop (mean ± std. dev. of 3 runs, 100 loops each)


In [53]:
%timeit -n 100 -r 3 larger_arr[::STEP].sum()  # takes every 4th element of the larger array, so the number of additions is exactly the same

1.94 ms ± 201 µs per loop (mean ± std. dev. of 3 runs, 100 loops each)


We see that the second one is slower. The takeaway is that, even though both perform the same number of arithmetic operations, the second one takes much longer because stepping through the data requires more effort.
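
Outside IPython, the same comparison can be sketched with the standard `timeit` module. The array names and sizes below are assumptions chosen to keep the run short, and the timings depend on the machine, so no particular ratio is claimed here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
STEP = 4
N = 200_000

dense = rng.integers(100, size=N, dtype=np.int64)
strided = rng.integers(100, size=N * STEP, dtype=np.int64)[::STEP]  # view with gaps

# Same number of elements, but different distances between neighbours.
print(dense.size, strided.size)        # 200000 200000
print(dense.strides, strided.strides)  # (8,) (32,)

t_dense = min(timeit.repeat(dense.sum, number=20, repeat=3))
t_strided = min(timeit.repeat(strided.sum, number=20, repeat=3))
print(f"dense: {t_dense:.5f} s, strided: {t_strided:.5f} s")
```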

When working with small data, this isn't very relevant. When working with big data, however, preprocessing the data so that it can be accessed efficiently is essential.

In [54]:
del large_arr, larger_arr

In [55]:
large_arr = np.random.randint(100, size=(5, 10000000))

In [57]:
large_arr.nbytes // (1024*1024)

190

In [146]:
large_arr

array([[74, 71, 64, ..., 83, 70, 54],
       [97, 81,  4, ..., 55, 11, 65],
       [34, 38, 55, ..., 48,  4, 88],
       [24, 43, 92, ..., 96, 49, 86],
       [81, 88, 37, ..., 83,  4, 98]])

In [147]:
large_arr.T

array([[74, 97, 34, 24, 81],
       [71, 81, 38, 43, 88],
       [64,  4, 55, 92, 37],
       ...,
       [83, 55, 48, 96, 83],
       [70, 11,  4, 49,  4],
       [54, 65, 88, 86, 98]])

In [148]:
large_arr.T.flags  # the transpose is F-contiguous, not C-contiguous like the original

  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

In [149]:
large_arr.flags

  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

In [125]:
%timeit -n 50 -r 3 large_arr.sum(axis=1).sum(axis=0)

59 ms ± 6.09 ms per loop (mean ± std. dev. of 3 runs, 50 loops each)


In [126]:
%timeit -n 50 -r 3 large_arr.sum(axis=0).sum()  # reducing along axis 0 combines elements that are a whole row apart in memory, so there are more jumps

123 ms ± 11.5 ms per loop (mean ± std. dev. of 3 runs, 50 loops each)


In [127]:
%timeit -n 50 -r 3 large_arr.T.sum(axis=0)  # about the same as the first one: it is equivalent to large_arr.sum(axis=1), so it walks along the same contiguous rows

51.9 ms ± 9.64 ms per loop (mean ± std. dev. of 3 runs, 50 loops each)


In [150]:
large_arr.T.base is large_arr

True
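
If a transposed array is going to be traversed many times in row-major order, one option is to pay for a single copy with `np.ascontiguousarray`. A sketch on a small hypothetical array `demo`:

```python
import numpy as np

demo = np.zeros((5, 1000), dtype=np.int64)

demo_t = demo.T                          # view: only the strides are swapped
demo_t_c = np.ascontiguousarray(demo_t)  # copy: data rearranged to row-major order

print(demo_t.flags.c_contiguous, demo_t.flags.f_contiguous)  # False True
print(demo_t_c.flags.c_contiguous)                           # True
print(demo_t.strides, demo_t_c.strides)                      # (8, 8000) (40, 8)
print(np.shares_memory(demo, demo_t_c))                      # False
```

Whether the copy pays off depends on how often the data is reused afterwards.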

## Memory allocations in computations

How long does it take to create a copy?

In [129]:
%timeit -n 20 -r 3 large_arr.copy() #we can see that this can take a little bit of time.

129 ms ± 12.9 ms per loop (mean ± std. dev. of 3 runs, 20 loops each)


Operations create new arrays as well:

In [130]:
%timeit -n 20 -r 3 large_arr + 1 #Since operations create new arrays, in place operations can be helpful

179 ms ± 2.12 ms per loop (mean ± std. dev. of 3 runs, 20 loops each)


`np.add` and `+` do more or less the same:

In [131]:
%timeit -n 20 -r 3 np.add(large_arr, 1)

166 ms ± 4.28 ms per loop (mean ± std. dev. of 3 runs, 20 loops each)


But in-place operations are faster (no allocations):

In [132]:
%timeit -n 100 -r 3 np.add(large_arr, 1, out=large_arr)

70 ms ± 7.07 ms per loop (mean ± std. dev. of 3 runs, 100 loops each)


In [133]:
large_arr

array([[321, 377, 345, ..., 306, 349, 393],
       [339, 363, 371, ..., 344, 373, 391],
       [390, 337, 397, ..., 307, 342, 380],
       [371, 371, 338, ..., 337, 307, 350],
       [337, 318, 392, ..., 306, 388, 366]])

In [134]:
np.add(large_arr, 1, out=large_arr)

array([[322, 378, 346, ..., 307, 350, 394],
       [340, 364, 372, ..., 345, 374, 392],
       [391, 338, 398, ..., 308, 343, 381],
       [372, 372, 339, ..., 338, 308, 351],
       [338, 319, 393, ..., 307, 389, 367]])
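
To check that `out=` really reuses the existing memory, we can compare object identity and memory overlap. A small sketch with a hypothetical array `demo`:

```python
import numpy as np

demo = np.arange(5, dtype=np.int64)

new = demo + 1                    # allocates a new array
same = np.add(demo, 1, out=demo)  # writes into demo's existing memory
demo += 1                         # augmented assignment is also in place

print(same is demo)                # True
print(np.shares_memory(new, demo)) # False
print(new.tolist())                # [1, 2, 3, 4, 5]
print(demo.tolist())               # [2, 3, 4, 5, 6]
```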

Important: basic indexing (slicing) always creates a view, but boolean and fancy indexing always create a copy, because the selected elements do not follow a regular stride pattern. Making a copy always costs extra memory and time, because new memory has to be allocated and filled.
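
The difference can be checked with `np.shares_memory`. A sketch on a hypothetical array `demo`:

```python
import numpy as np

demo = np.arange(10, dtype=np.int64)

sliced = demo[2:8:2]          # basic slicing -> view
fancy = demo[[2, 4, 6]]       # fancy (integer array) indexing -> copy
masked = demo[demo % 2 == 0]  # boolean indexing -> copy

print(np.shares_memory(demo, sliced))  # True
print(np.shares_memory(demo, fancy))   # False
print(np.shares_memory(demo, masked))  # False

sliced[0] = -1
print(demo[2])  # -1: writing to the view changes the original
fancy[1] = -5
print(demo[4])  # 4: writing to the copy does not
```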

# Broadcasting

How can we operate on arrays of different shapes? Should we reshape them first to a common shape?

In [58]:
arr_2d = np.random.randint(10, size=(10, 3))
arr_1d_1 = np.random.randint(10, size=(3, ))
arr_1d_2 = np.random.randint(10, size=(10, ))

In [59]:
arr_2d

array([[1, 9, 6],
       [4, 0, 0],
       [1, 9, 9],
       [7, 4, 9],
       [3, 6, 5],
       [7, 2, 6],
       [3, 6, 8],
       [7, 5, 0],
       [8, 7, 7],
       [9, 0, 3]])

In [60]:
arr_1d_1

array([6, 6, 8])

In [61]:
arr_1d_2

array([2, 5, 6, 3, 8, 7, 3, 4, 9, 4])

In [62]:
arr_2d, arr_1d_1

(array([[1, 9, 6],
        [4, 0, 0],
        [1, 9, 9],
        [7, 4, 9],
        [3, 6, 5],
        [7, 2, 6],
        [3, 6, 8],
        [7, 5, 0],
        [8, 7, 7],
        [9, 0, 3]]),
 array([6, 6, 8]))

Can we add the two?

In [63]:
arr_2d + arr_1d_1

array([[ 7, 15, 14],
       [10,  6,  8],
       [ 7, 15, 17],
       [13, 10, 17],
       [ 9, 12, 13],
       [13,  8, 14],
       [ 9, 12, 16],
       [13, 11,  8],
       [14, 13, 15],
       [15,  6, 11]])

Seemingly we can.
But what was really added to `arr_2d`?

In [64]:
(arr_2d + arr_1d_1) - arr_2d

array([[6, 6, 8],
       [6, 6, 8],
       [6, 6, 8],
       [6, 6, 8],
       [6, 6, 8],
       [6, 6, 8],
       [6, 6, 8],
       [6, 6, 8],
       [6, 6, 8],
       [6, 6, 8]])

What we see here is that, through broadcasting, NumPy essentially turned `arr_1d_1` into an array with the same dimensions as `arr_2d` (as the output above shows). Using that, it then does the arithmetic.
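
The expanded array can be made visible with `np.broadcast_to`. The sketch below uses a stand-in `row` with the same values as `arr_1d_1`; the result is a read-only view whose stride along the new axis is 0, so no data is actually replicated.

```python
import numpy as np

row = np.array([6, 6, 8], dtype=np.int64)  # stand-in for `arr_1d_1`
expanded = np.broadcast_to(row, (10, 3))

print(expanded.shape)                  # (10, 3)
print(expanded.strides)                # (0, 8)
print(np.shares_memory(row, expanded)) # True
print(expanded.flags.writeable)        # False
```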

Can we do the same with `arr_1d_2`?

In [65]:
arr_2d + arr_1d_2

ValueError: operands could not be broadcast together with shapes (10,3) (10,) 

We cannot. We need to change the shape of `arr_1d_2` first:

In [66]:
arr_2d + arr_1d_2.reshape((10,1))  # now arr_1d_2[i] is added to every element of row i

array([[ 3, 11,  8],
       [ 9,  5,  5],
       [ 7, 15, 15],
       [10,  7, 12],
       [11, 14, 13],
       [14,  9, 13],
       [ 6,  9, 11],
       [11,  9,  4],
       [17, 16, 16],
       [13,  4,  7]])

Alternatively, we can do:

In [67]:
np.expand_dims(arr_1d_2, axis=1)

array([[2],
       [5],
       [6],
       [3],
       [8],
       [7],
       [3],
       [4],
       [9],
       [4]])

In [68]:
arr_2d + np.expand_dims(arr_1d_2, axis=1)

array([[ 3, 11,  8],
       [ 9,  5,  5],
       [ 7, 15, 15],
       [10,  7, 12],
       [11, 14, 13],
       [14,  9, 13],
       [ 6,  9, 11],
       [11,  9,  4],
       [17, 16, 16],
       [13,  4,  7]])

It seems `arr_1d_2` was "replicated" in the same way as `arr_1d_1`, but along a different axis:

In [163]:
(arr_2d + np.expand_dims(arr_1d_2, axis=1)) - arr_2d

array([[1, 1, 1],
       [7, 7, 7],
       [4, 4, 4],
       [7, 7, 7],
       [4, 4, 4],
       [7, 7, 7],
       [7, 7, 7],
       [5, 5, 5],
       [4, 4, 4],
       [4, 4, 4]])
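
A third, equivalent spelling is indexing with `np.newaxis`. The sketch below uses a short hypothetical array `col` to show that all three variants produce the same column-shaped view of the data.

```python
import numpy as np

col = np.array([2, 5, 6], dtype=np.int64)

by_reshape = col.reshape((3, 1))
by_expand = np.expand_dims(col, axis=1)
by_newaxis = col[:, np.newaxis]

print(by_reshape.shape, by_expand.shape, by_newaxis.shape)  # (3, 1) (3, 1) (3, 1)
print(np.array_equal(by_reshape, by_newaxis))               # True
print(np.shares_memory(col, by_newaxis))                    # True
```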

To reveal the pattern, let's try a `3D` array:

In [165]:
arr_3d = np.random.randint(10, size=(7, 10, 3))

In [167]:
arr_1d_1.shape, arr_3d.shape

((3,), (7, 10, 3))

In [None]:
arr_3d

In [168]:
arr_3d + arr_1d_1 # we can add these two together

array([[[ 3,  7,  8],
        [10,  7, 13],
        [ 2,  7,  9],
        [ 5,  8, 11],
        [ 4, 10, 14],
        [ 4, 15, 10],
        [ 3, 15, 12],
        [ 7, 16, 14],
        [ 8, 10,  5],
        [ 6, 11,  5]],

       [[ 4,  7, 13],
        [ 9, 14, 14],
        [11, 10,  6],
        [ 5, 15,  8],
        [ 3,  9, 14],
        [ 4,  7,  9],
        [ 2,  8, 14],
        [ 8, 12, 12],
        [11, 16, 10],
        [10, 15,  9]],

       [[ 3,  8,  7],
        [ 4, 16, 13],
        [ 6, 14,  9],
        [ 7, 14, 12],
        [ 5, 11,  5],
        [ 4, 11, 13],
        [11, 13, 11],
        [ 4,  7, 14],
        [ 8, 10, 14],
        [11, 10, 14]],

       [[ 2,  7,  5],
        [ 6,  8, 11],
        [ 2, 13,  7],
        [11, 16,  6],
        [ 5, 16, 12],
        [ 8, 16, 12],
        [ 2, 16, 14],
        [ 4, 15, 11],
        [ 6, 11, 13],
        [ 3, 12,  5]],

       [[ 4, 13,  5],
        [ 5,  8, 11],
        [11, 15, 12],
        [11, 13, 12],
        [10,  9,  9],
  

In [169]:
(arr_3d + arr_1d_1) - arr_3d  # the subtraction confirms that arr_1d_1 was broadcast over every row of every 2D slice

array([[[2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5]],

       [[2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5]],

       [[2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5]],

       [[2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5]],

       [[2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5],
        [2, 7, 5]],

       [[2, 7, 5],
        [2, 7, 5],
  

In [170]:
arr_3d.shape, arr_1d_1.shape

((7, 10, 3), (3,))

Can we do the same with `arr_1d_2`?

In [171]:
arr_3d + arr_1d_2

ValueError: operands could not be broadcast together with shapes (7,10,3) (10,) 

In [172]:
arr_3d.shape, arr_1d_2.shape, np.expand_dims(arr_1d_2, axis=1).shape

((7, 10, 3), (10,), (10, 1))

In [173]:
(arr_3d + np.expand_dims(arr_1d_2, axis=1)) - arr_3d

array([[[1, 1, 1],
        [7, 7, 7],
        [4, 4, 4],
        [7, 7, 7],
        [4, 4, 4],
        [7, 7, 7],
        [7, 7, 7],
        [5, 5, 5],
        [4, 4, 4],
        [4, 4, 4]],

       [[1, 1, 1],
        [7, 7, 7],
        [4, 4, 4],
        [7, 7, 7],
        [4, 4, 4],
        [7, 7, 7],
        [7, 7, 7],
        [5, 5, 5],
        [4, 4, 4],
        [4, 4, 4]],

       [[1, 1, 1],
        [7, 7, 7],
        [4, 4, 4],
        [7, 7, 7],
        [4, 4, 4],
        [7, 7, 7],
        [7, 7, 7],
        [5, 5, 5],
        [4, 4, 4],
        [4, 4, 4]],

       [[1, 1, 1],
        [7, 7, 7],
        [4, 4, 4],
        [7, 7, 7],
        [4, 4, 4],
        [7, 7, 7],
        [7, 7, 7],
        [5, 5, 5],
        [4, 4, 4],
        [4, 4, 4]],

       [[1, 1, 1],
        [7, 7, 7],
        [4, 4, 4],
        [7, 7, 7],
        [4, 4, 4],
        [7, 7, 7],
        [7, 7, 7],
        [5, 5, 5],
        [4, 4, 4],
        [4, 4, 4]],

       [[1, 1, 1],
        [7, 7, 7],
  

Broadcasting rules:
    
- All input arrays with `ndim` smaller than the input array of largest `ndim`, have 1’s **prepended** to their shapes.
- The size in each dimension of the output shape is the **maximum** of all the input sizes in that dimension.
- An input can be used in the calculation if its size in a particular dimension either **matches** the output size in that dimension, or **is exactly 1**.
- If an input has a dimension of size 1 in its shape, the first data entry in that dimension will be used for all calculations along that dimension. In other words, the stepping machinery of a `ufunc` will simply not step along that dimension (stride will be 0 for that dimension).
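
The first three rules can be written down as a short function. This is a sketch, not NumPy's implementation: `broadcast_shape` is a hypothetical helper, and instead of taking the maximum literally it collects the sizes that differ from 1, which gives the same answer for non-empty dimensions.

```python
import numpy as np

def broadcast_shape(*shapes):
    """Compute the broadcast output shape, or raise ValueError."""
    ndim = max(len(shape) for shape in shapes)
    # Rule 1: prepend 1s to the shorter shapes.
    padded = [(1,) * (ndim - len(shape)) + tuple(shape) for shape in shapes]
    result = []
    for sizes in zip(*padded):
        # Rules 2 and 3: every size must be 1 or equal to the output size.
        non_unit = {size for size in sizes if size != 1}
        if len(non_unit) > 1:
            raise ValueError(f"shapes {shapes} cannot be broadcast together")
        result.append(non_unit.pop() if non_unit else 1)
    return tuple(result)

print(broadcast_shape((10, 3), (3,)))        # (10, 3)
print(broadcast_shape((7, 10, 3), (10, 1)))  # (7, 10, 3)

# Cross-check against what NumPy actually produces.
print((np.zeros((7, 10, 3)) + np.zeros((10, 1))).shape)  # (7, 10, 3)
```

Calling `broadcast_shape((10, 3), (10,))` raises `ValueError`, matching the failed addition of `arr_2d` and `arr_1d_2` above.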

### How broadcasting really works

What happens when we add a unit dimension somewhere?

In [None]:
arr_1d_1[np.newaxis, :]

In [None]:
arr_1d_1[np.newaxis, :].strides

`strides[0]` is `0`, which means we can use dimension `0` of `arr_1d_1[np.newaxis, :]` in any (underlying, C) loop with any number of iterations. Let's emulate this in pure Python:

In [None]:
arr_1d_1_bc = arr_1d_1[np.newaxis, :]
arr_1d_1_bc

In [None]:
arr_2d

In [None]:
for i in range(arr_2d.shape[0]):

    print(f"Adding elements of row {i}")

    for j in range(arr_2d.shape[1]):
        arr_2d_address = arr_2d.strides[1] * j + arr_2d.strides[0] * i
        arr_1d_address = arr_1d_1_bc.strides[1] * j + arr_1d_1_bc.strides[0] * i

        print(f"\tarr_2d address: {arr_2d_address}")
        print(f"\tarr_1d_1_bc address: {arr_1d_address}")
    print("-" * 80)
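
The loop above only prints byte offsets. As a sketch of the next step, the same offsets can be used to actually perform the addition in pure Python. The arrays `a` and `b` below are hypothetical stand-ins, and `np.broadcast_to` is used to obtain a view whose stride along axis 0 is guaranteed to be 0.

```python
import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)
b = np.array([10, 20, 30], dtype=np.int64)
b_bc = np.broadcast_to(b, a.shape)  # strides (0, 8): never steps along axis 0

# Flat views of the underlying data of both operands.
a_flat, b_flat = a.ravel(), b.ravel()
out = np.empty(a.shape, dtype=np.int64)

for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        # Byte offsets relative to the start of each operand's data.
        a_offset = a.strides[0] * i + a.strides[1] * j
        b_offset = b_bc.strides[0] * i + b_bc.strides[1] * j
        out[i, j] = a_flat[a_offset // a.itemsize] + b_flat[b_offset // b.itemsize]

print(np.array_equal(out, a + b))  # True
```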