# Numpy 3

## **Content**

- **Matrix Multiplication**
  - `np.dot`
  - `@` operator
  - `np.matmul`
- **Shallow vs Deep Copy**
  - `view()`
  - `copy()`
  - `deepcopy()`
- **Array Splitting**
  - `split()`
  - `hsplit()`
  - `vsplit()`
- **Array Stacking**
  - `hstack()`
  - `vstack()`
  - `concatenate()`


---

In [1]:
import numpy as np

## **Element-Wise Multiplication**

Element-wise multiplication in NumPy involves multiplying corresponding elements of two arrays with the same shape to produce a new array where each element is the product of the corresponding elements from the input arrays.

In [2]:
a = np.arange(1, 6)
a

array([1, 2, 3, 4, 5])

In [3]:
a * 5

array([ 5, 10, 15, 20, 25])

In [4]:
b = np.arange(6, 11)
b

array([ 6,  7,  8,  9, 10])

In [5]:
a * b

array([ 6, 14, 24, 36, 50])

Both arrays should have the same shape (or shapes that NumPy can broadcast to a common shape); otherwise the operation fails.

In [6]:
c = np.array([1, 2, 3])

In [7]:
a * c

ValueError: operands could not be broadcast together with shapes (5,) (3,) 

In [23]:
d = np.arange(12).reshape(3, 4)
e = np.arange(13, 25).reshape(3, 4)

In [24]:
print(d)
print(e)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[13 14 15 16]
 [17 18 19 20]
 [21 22 23 24]]


In [25]:
d * e

array([[  0,  14,  30,  48],
       [ 68,  90, 114, 140],
       [168, 198, 230, 264]])

**Takeaway:**

- Array * Number -> WORKS
- Array * Array (same shape) -> WORKS
- Array * Array (incompatible shapes, e.g. `(5,)` and `(3,)`) -> DOES NOT WORK
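
The last takeaway applies to shapes that NumPy cannot reconcile. As a small sketch (the arrays `row` and `col` are illustrative and not part of the notebook above), arrays of different shapes can still be multiplied element-wise when NumPy's broadcasting rules line them up:

```python
import numpy as np

d = np.arange(12).reshape(3, 4)        # shape (3, 4)
row = np.array([1, 10, 100, 1000])     # shape (4,): one factor per column
col = np.array([[1], [2], [3]])        # shape (3, 1): one factor per row

scaled_cols = d * row   # `row` is stretched across all 3 rows
scaled_rows = d * col   # `col` is stretched across all 4 columns

print(scaled_cols)
print(scaled_rows)
```

The `(5,)` and `(3,)` pair from the earlier cell fails because neither length is 1 and the lengths differ, so there is no way to stretch one array to match the other.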

---

## **Matrix Multiplication**

**Rule:** The number of columns of the first matrix must equal the number of rows of the second matrix.

- (A,B) @ (B,C) -> (A,C)
- (3,4) @ (4,3) -> (3,3)

Visual Demo: https://www.geogebra.org/m/ETHXK756
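
To make the rule concrete, here is a minimal sketch of what matrix multiplication computes, written with explicit loops. The helper name `matmul_naive` is hypothetical and exists only for illustration; in practice `np.dot`, `np.matmul`, or `@` should be used.

```python
import numpy as np


def matmul_naive(x, y):
    """Multiply two 2-D arrays with explicit loops (illustrative only)."""
    n, k = x.shape
    k2, m = y.shape
    if k != k2:
        raise ValueError(f"shapes {x.shape} and {y.shape} are not aligned")
    out = np.zeros((n, m), dtype=np.result_type(x, y))
    for i in range(n):
        for j in range(m):
            # Row i of x times column j of y, summed over the shared dimension.
            out[i, j] = np.sum(x[i, :] * y[:, j])
    return out


a = np.arange(1, 13).reshape(3, 4)
c = np.arange(2, 14).reshape(4, 3)

result = matmul_naive(a, c)
print(result)
print(np.array_equal(result, a @ c))
```

Each output element pairs one row of the first matrix with one column of the second, which is why the inner dimensions have to match.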

In [26]:
a = np.arange(1,13).reshape((3,4))
c = np.arange(2,14).reshape((4,3))

In [27]:
a.shape, c.shape

((3, 4), (4, 3))

##### `a` is of shape (3,4) and `c` is of shape (4,3). The output will be of shape (3,3).

In [28]:
# Using np.dot
np.dot(a,c)

array([[ 80,  90, 100],
       [184, 210, 236],
       [288, 330, 372]])

In [29]:
# Using np.matmul
np.matmul(a,c)

array([[ 80,  90, 100],
       [184, 210, 236],
       [288, 330, 372]])

In [30]:
# Using @ operator
a@c

array([[ 80,  90, 100],
       [184, 210, 236],
       [288, 330, 372]])

---

In [31]:
a@5

ValueError: matmul: Input operand 1 does not have enough dimensions (has 0, gufunc core with signature (n?,k),(k,m?)->(n?,m?) requires 1)

In [32]:
np.matmul(a, 5)

ValueError: matmul: Input operand 1 does not have enough dimensions (has 0, gufunc core with signature (n?,k),(k,m?)->(n?,m?) requires 1)

In [33]:
np.dot(a, 5)

array([[ 5, 10, 15, 20],
       [25, 30, 35, 40],
       [45, 50, 55, 60]])

**Important:**

- `dot()` supports multiplying an array by a scalar, which `matmul()` (and the `@` operator) does not.
- Multiplying a vector by a vector works with `matmul()`, but multiplying a vector by a scalar does not.
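
The two points above can be checked side by side. This is a small sketch; the vector `v` and the flag `matmul_scalar_ok` are illustrative names, not part of the notebook.

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)
v = np.array([1, 2, 3, 4])

# np.dot accepts a scalar and simply scales every element.
scaled = np.dot(a, 2)

# np.matmul rejects scalars, so the call raises an exception.
try:
    np.matmul(a, 2)
    matmul_scalar_ok = True
except (ValueError, TypeError):
    matmul_scalar_ok = False

# Vector with vector gives the inner product; matrix with vector gives a vector.
inner = np.matmul(v, v)
mat_vec = a @ v

print(scaled)
print(matmul_scalar_ok)
print(inner, mat_vec)
```

For plain scaling, `a * 2` is the more common spelling than `np.dot(a, 2)`.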

---

## **Views vs Copies (Shallow vs Deep Copy)**

- NumPy **manages memory very efficiently**,
- which makes it really **useful when dealing with large datasets**.

**But how does it manage memory so efficiently?**

- Let's create some arrays to understand what happens when using NumPy.

When we copy a NumPy array, sometimes we want changes in one array to be reflected in the other, and sometimes not.

Whether memory is shared determines how your code behaves, which matters for debugging, memory optimization, and performance.

### Shallow Copy using `view()`

#### `.view()`

- Returns a view of the original array.
- Any changes made to the new array will be reflected in the original array.

Documentation: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.view.html



In [34]:
import numpy as np

a = np.array([10, 20, 30, 40])
b = a.view()  # Shallow copy

- b is a new object
- But it still shares the same data buffer as a

In [35]:
b[0] = 100
print("a:", a)  # a[0] changes to 100
print("b:", b)

a: [100  20  30  40]
b: [100  20  30  40]


Changes in `b` are reflected in `a` because both share the same memory.

To check if memory is shared:

In [36]:
np.shares_memory(a, b)  # Output: True

True

`view()` is useful when you want a new view of the same data (maybe a different shape), but not a full copy.
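
As a sketch of the "different shape" case: `reshape()` returns a view whenever the memory layout allows it, which is the case for a freshly created contiguous array like the one below. The name `grid` is illustrative.

```python
import numpy as np

a = np.arange(6)          # one-dimensional data: 0..5
grid = a.reshape(2, 3)    # same buffer, seen as 2 rows x 3 columns

grid[0, 0] = 100          # write through the reshaped view

print(a)
print(grid)
print(np.shares_memory(a, grid))
```

The write through `grid` shows up in `a` because no data was copied; only the shape metadata differs.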

---

#### `.copy()`

- Returns a copy of the array.
- Changes made in new array are not reflected in the original array.

Documentation (`.copy()`): https://numpy.org/doc/stable/reference/generated/numpy.ndarray.copy.html#numpy.ndarray.copy

Documentation: (`np.copy()`): https://numpy.org/doc/stable/reference/generated/numpy.copy.html

In [37]:
a = np.array([10, 20, 30, 40])
b = a.copy()  # Independent copy with its own data buffer

`b` is a completely new array with its own memory.

In [38]:
b[0] = 999
print("a:", a)  # a is unchanged
print("b:", b)

a: [10 20 30 40]
b: [999  20  30  40]


Check memory sharing:

In [39]:
np.shares_memory(a, b)  # Output: False

False

`copy()` is safer when you want to modify a copy without affecting the original array.
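
Copies also appear without an explicit `copy()` call. As a brief sketch (the variable names are illustrative): basic slicing returns a view, while indexing with a list of positions or a boolean mask returns a new array.

```python
import numpy as np

a = np.array([10, 20, 30, 40])

sliced = a[1:3]       # basic slicing: a view on the same buffer
picked = a[[1, 2]]    # indexing with a list of positions: a copy
masked = a[a > 15]    # boolean mask: a copy

picked[0] = -1        # does not touch `a`
sliced[0] = 999       # changes a[1]

print(a)
print(np.shares_memory(a, sliced))
print(np.shares_memory(a, picked), np.shares_memory(a, masked))
```

When in doubt, `np.shares_memory()` settles whether a result is a view or a copy.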

---

### What are object arrays?

- Object arrays are arrays whose elements can be any Python objects (numbers, strings, lists, and so on).

Documentation: https://numpy.org/devdocs/reference/arrays.scalars.html#numpy.object

In [40]:
arr = np.array([1, 'm', [1,2,3]], dtype = 'object')
arr

array([1, 'm', list([1, 2, 3])], dtype=object)

There is an exception to `.copy()`:
- **`.copy()` behaves as a shallow copy on a `dtype='object'` array**.
- It copies the references stored in the array, not the Python objects they point to.

#### But arrays are supposed to hold homogeneous data. How is it storing data of various types?

Remember that everything is an object in Python.

Just like a Python list,
- The data actually **stored** in object arrays are **references to Python objects**, not the objects themselves.

Hence, their elements need not be of the same Python type.

**Since every element in the array is a reference to a Python object, the dtype is `object`.**

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/065/263/original/img.png?1708017404" width="700" height="100">

Let's make a copy of object array and check whether it returns a shallow copy or deep copy.

In [41]:
copy_arr = arr.copy()

In [42]:
copy_arr

array([1, 'm', list([1, 2, 3])], dtype=object)

Now, let's try to modify the list elements in `copy_arr`.

In [12]:
copy_arr[2][0] = 999

In [13]:
copy_arr

array([1, 'm', list([999, 2, 3])], dtype=object)

Let's see if it changed the original array as well.

In [14]:
arr

array([1, 'm', list([999, 2, 3])], dtype=object)

It did change the original array.

Hence, **`.copy()` returns a shallow copy when the array has `dtype=object`.**

Any change to a mutable element (such as the nested list) will be reflected in the original array as well.

So, how do we create a deep copy then?

We can do so using the `copy.deepcopy()` function.

---

#### `copy.deepcopy()`

- Returns a deep copy of an array, including the Python objects it references.

Documentation: https://docs.python.org/3/library/copy.html#copy.deepcopy

In [15]:
import copy

In [16]:
arr = np.array([1, 'm', [1,2,3]], dtype = 'object')
arr

array([1, 'm', list([1, 2, 3])], dtype=object)

Let's make a copy using `deepcopy()`.

In [17]:
arr_copy = copy.deepcopy(arr)  # use a new name so the `copy` module is not shadowed
arr_copy

array([1, 'm', list([1, 2, 3])], dtype=object)

Let's modify the list inside the copied array.

In [18]:
arr_copy[2][0] = 999
arr_copy

array([1, 'm', list([999, 2, 3])], dtype=object)

In [19]:
arr

array([1, 'm', list([1, 2, 3])], dtype=object)

Notice that the changes in the copied array were not reflected in the original array.

`copy.deepcopy()` **returns a deep copy of an array.**
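
The difference between the two kinds of copy can also be checked directly with identity comparisons. This is a minimal sketch; the names `shallow`, `deep`, and the two flags are illustrative.

```python
import copy

import numpy as np

arr = np.array([1, 'm', [1, 2, 3]], dtype=object)

shallow = arr.copy()          # copies the references only
deep = copy.deepcopy(arr)     # copies the referenced objects too

# `is` tells us whether both arrays point at the very same list object.
same_list_shallow = shallow[2] is arr[2]
same_list_deep = deep[2] is arr[2]

# Rebinding a slot (rather than mutating the list) only affects the copy.
shallow[0] = 42

print(same_list_shallow, same_list_deep)
print(arr)
```

So even a shallow copy has its own slots; what it shares with the original are the mutable objects those slots point to.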

---

## **Splitting**

In addition to reshaping and selecting subarrays, it is often necessary to split arrays into smaller arrays or merge arrays into bigger arrays.

#### `np.split()`

- Splits an array into multiple sub-arrays as views.

**It takes an argument `indices_or_sections`.**

- If `indices_or_sections` is an **integer, n**, the array will be **divided into n equal arrays along axis**.

- If such a split is not possible, an error is raised.

- If `indices_or_sections` is a **1-D array of sorted integers**, the entries indicate **where along axis the array is split**.

- If an index **exceeds the dimension of the array along axis**, an **empty sub-array is returned** correspondingly.

In [20]:
x = np.arange(9)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8])

In [21]:
np.split(x, 3)

[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]

**IMPORTANT REQUISITE**

- Number of elements in the array should be divisible by number of sections.

In [22]:
b = np.arange(10)
np.split(b, 3)

ValueError: array split does not result in an equal division

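Dropping the last element leaves 9 elements, which can be split into 3 equal parts.

In [43]:
np.split(b[0:-1], 3)

[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]

Passing a tuple of indices splits at those positions instead (here, before index 2 and before index 3).

In [48]:
b, np.split(b, (2, 3))

(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
 [array([0, 1]), array([2]), array([3, 4, 5, 6, 7, 8, 9])])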
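
Two details about splitting are worth a quick sketch. First, when an equal division is not required, `np.array_split` (a related NumPy function that this notebook does not otherwise cover) accepts a section count that does not divide the length evenly. Second, the pieces returned by `np.split` are views, so writing to a piece changes the original array. Variable names here are illustrative.

```python
import numpy as np

# Uneven split: 10 elements into 3 pieces without raising an error.
b = np.arange(10)
parts = np.array_split(b, 3)
sizes = [len(p) for p in parts]
print(parts)
print(sizes)

# The sub-arrays from np.split share memory with the source array.
x = np.arange(9)
first, second, third = np.split(x, 3)
first[0] = -1
print(x)
print(np.shares_memory(x, first))
```

If the pieces need to be modified independently of the source, call `.copy()` on them.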

---

#### `np.hsplit()`

- Splits an array into multiple sub-arrays **horizontally (column-wise)**.

In [49]:
x = np.arange(16.0).reshape(4, 4)
x

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.]])

Think of it this way:

  - There are 2 axes to a 2D array
    1. **1st axis - Vertical axis**
    2. **2nd axis - Horizontal axis**

**Along which axis are we splitting the array?**

- The split we want happens across the **2nd axis (Horizontal axis)**
- That is why we use `hsplit()`

**So, try to think in terms of "whether the operation is happening along vertical axis or horizontal axis".**

- We are splitting the horizontal axis in this case.

In [52]:
np.hsplit(x, 2), np.hsplit(x, (2, 3))

([array([[ 0.,  1.],
         [ 4.,  5.],
         [ 8.,  9.],
         [12., 13.]]),
  array([[ 2.,  3.],
         [ 6.,  7.],
         [10., 11.],
         [14., 15.]])],
 [array([[ 0.,  1.],
         [ 4.,  5.],
         [ 8.,  9.],
         [12., 13.]]),
  array([[ 2.],
         [ 6.],
         [10.],
         [14.]]),
  array([[ 3.],
         [ 7.],
         [11.],
         [15.]])])

In [53]:
np.hsplit(x, np.array([3, 6]))

[array([[ 0.,  1.,  2.],
        [ 4.,  5.,  6.],
        [ 8.,  9., 10.],
        [12., 13., 14.]]),
 array([[ 3.],
        [ 7.],
        [11.],
        [15.]]),
 array([], shape=(4, 0), dtype=float64)]

---

#### `np.vsplit()`

- Splits an array into multiple sub-arrays **vertically (row-wise)**.

In [54]:
x = np.arange(16.0).reshape(4, 4)
x

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.]])

**Now, along which axis are we splitting the array?**

- The split we want happens across the **1st axis (Vertical axis)**
- That is why we use `vsplit()`

**Again, always try to think in terms of "whether the operation is happening along vertical axis or horizontal axis".**

- We are splitting the vertical axis in this case.

In [55]:
np.vsplit(x, 2)

[array([[0., 1., 2., 3.],
        [4., 5., 6., 7.]]),
 array([[ 8.,  9., 10., 11.],
        [12., 13., 14., 15.]])]

In [56]:
np.vsplit(x, np.array([3]))

[array([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]]),
 array([[12., 13., 14., 15.]])]
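
For a 2D array, the "which axis" way of thinking maps directly onto the `axis` argument of `np.split`. The sketch below (with illustrative variable names) shows that `hsplit` and `vsplit` give the same pieces as `np.split` with `axis=1` and `axis=0`.

```python
import numpy as np

x = np.arange(16.0).reshape(4, 4)

h_parts = np.hsplit(x, 2)             # split the columns
h_same = np.split(x, 2, axis=1)       # same thing, axis given explicitly

v_parts = np.vsplit(x, 2)             # split the rows
v_same = np.split(x, 2, axis=0)       # same thing, axis given explicitly

print(all(np.array_equal(p, q) for p, q in zip(h_parts, h_same)))
print(all(np.array_equal(p, q) for p, q in zip(v_parts, v_same)))
print(h_parts[0].shape, v_parts[0].shape)
```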

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/054/735/original/hvsp1.png?1698041133" width="600" height="400">

---

## **Stacking**

In [57]:
a = np.arange(1, 5)
b = np.arange(2, 6)
c = np.arange(3, 7)

#### `np.vstack()`

- Stacks a list of arrays **vertically (along axis 0 or 1st axis)**.
- For **example**, **given a list of row vectors, appends the rows to form a matrix**.

In [60]:
np.vstack((a,b,c))

array([[1, 2, 3, 4],
       [2, 3, 4, 5],
       [3, 4, 5, 6]])

In [58]:
np.vstack([b, c, a])

array([[2, 3, 4, 5],
       [3, 4, 5, 6],
       [1, 2, 3, 4]])

In [62]:
a = np.arange(1, 5)
b = np.arange(2, 4)
c = np.arange(3, 10)

In [63]:
np.vstack([b, c, a])

ValueError: all the input array dimensions except for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 2 and the array at index 1 has size 7

---

#### `np.hstack()`

- Stacks a list of arrays **horizontally (along axis 1 or 2nd axis)**.

In [64]:
a = np.arange(5).reshape(5, 1)
a

array([[0],
       [1],
       [2],
       [3],
       [4]])

In [65]:
b = np.arange(15).reshape(5, 3)
b

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [66]:
np.hstack([a, b])

array([[ 0,  0,  1,  2],
       [ 1,  3,  4,  5],
       [ 2,  6,  7,  8],
       [ 3,  9, 10, 11],
       [ 4, 12, 13, 14]])
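
One caveat, shown as a small sketch with illustrative names: for 1D inputs there is no second axis, so `np.hstack()` joins them end to end into a longer 1D array. To place 1D arrays side by side as columns, they can be reshaped into column vectors first, or passed to `np.column_stack` (a related NumPy function not covered above).

```python
import numpy as np

a = np.arange(1, 5)   # [1 2 3 4]
b = np.arange(2, 6)   # [2 3 4 5]

flat = np.hstack([a, b])                                # 1D result of length 8
cols = np.hstack([a.reshape(-1, 1), b.reshape(-1, 1)])  # shape (4, 2)
also_cols = np.column_stack([a, b])                     # same as `cols`

print(flat)
print(cols)
print(np.array_equal(cols, also_cols))
```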

---

#### `np.concatenate()`

- Can perform both vstack and hstack
- Creates a new array by appending arrays after each other, along a given axis.

It provides functionality similar to `vstack()` and `hstack()`, but takes a **keyword argument `axis`** that specifies the **axis along which the arrays are to be concatenated**.

All input arrays passed to `concatenate()` must have the same number of dimensions as the output array.

In [61]:
a = np.array([1,2,3])
a


array([1, 2, 3])

In [67]:
b = np.array([[1,2,3], [4,5,6]])
b

array([[1, 2, 3],
       [4, 5, 6]])

In [73]:
np.concatenate([a, b], axis=0)

ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 2 dimension(s)

**`concatenate()` can only work if both `a` and `b` have the same number of dimensions**

In [74]:
a = np.array([[1,2,3]])
b = np.array([[1,2,3], [4,5,6]])

In [75]:
np.concatenate([a, b], axis = 0) # axis=0 -> vstack

array([[1, 2, 3],
       [1, 2, 3],
       [4, 5, 6]])
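
Rather than retyping the array with an extra pair of brackets, an existing 1D array can be promoted to 2D before concatenating. A minimal sketch, with `a_row` as an illustrative name:

```python
import numpy as np

a = np.array([1, 2, 3])                 # shape (3,)
b = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)

a_row = a[np.newaxis, :]                # shape (1, 3): a single-row matrix
stacked = np.concatenate([a_row, b], axis=0)

# np.vstack performs the same promotion of 1D inputs on its own.
same = np.vstack([a, b])

print(stacked)
print(np.array_equal(stacked, same))
```

`a.reshape(1, -1)` would work equally well in place of `a[np.newaxis, :]`.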

In [76]:
a = np.arange(6).reshape(3, 2)
b = np.arange(9).reshape(3, 3)

In [77]:
np.concatenate([a, b], axis = 1) # axis=1 -> hstack

array([[0, 1, 0, 1, 2],
       [2, 3, 3, 4, 5],
       [4, 5, 6, 7, 8]])

In [78]:
a = np.array([[1,2], [3,4]])
b = np.array([[5,6,7,8]])

In [79]:
np.concatenate([a, b], axis = None) # axis=None -> flattens the inputs, then joins them into a 1D array

array([1, 2, 3, 4, 5, 6, 7, 8])

---

### Extra-reading material

- [Image Manipulation](https://colab.research.google.com/drive/1SkyA5iF7UTDR8VFhCWEy525XlcaJ8YdI#scrollTo=VBD8uhb9M63e)