# Numpy 4

---

## **Content**

- **Shallow vs Deep Copy**
  - `view()`
  - `copy()`
  - `deepcopy()`
- **Array Splitting**
  - `split()`
  - `hsplit()`
  - `vsplit()`
- **Array Stacking**
  - `hstack()`
  - `vstack()`
  - `concatenate()`
- **Post Read** - Image Manipulation

---

## **Views vs Copies (Shallow vs Deep Copy)**

- Numpy **manages memory very efficiently**, which makes it really **useful when dealing with large datasets**.

**But how does it manage memory so efficiently?**

- Let's create some arrays to understand what happens when we use Numpy.

In [None]:
import numpy as np

In [None]:
# We'll create a np array

a = np.arange(4)
a

array([0, 1, 2, 3])

In [None]:
# Reshape array `a` and store in `b`

b = a.reshape(2, 2)
b

array([[0, 1],
       [2, 3]])

Now we will make some changes to our original array `a`.

In [None]:
a[0] = 100
a

array([100,   1,   2,   3])

**What will be values if we print array `b`?**

In [None]:

b

array([[100,   1],
       [  2,   3]])

- Array **`b` got automatically updated**

**This is an example of numpy using `Shallow Copy` of data.**

**What happens here?**

- Numpy **re-uses data** as much as possible **instead of duplicating** it.
- This helps numpy to be efficient.

When we created `b=a.reshape(2,2)`

- Numpy **did NOT make a copy of `a` to store in `b`**, as we can clearly see.
- It is **using the same data as in `a`**.
- It **just looks different (reshaped)** in `b`.
- That is why **any change in `a` is automatically reflected in `b`**.
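One way to confirm that a reshaped array reuses the original data is the `.base` attribute, which for a view refers to the array that owns the memory. The sketch below is only an illustration; the names `original` and `reshaped` are made up for this example.

```python
import numpy as np

original = np.arange(4)
reshaped = original.reshape(2, 2)

# For a view, `.base` refers to the array that owns the data.
is_view = reshaped.base is original

# Writing through the reshaped array also changes the original.
reshaped[1, 1] = -1

print(is_view)    # True
print(original)   # [ 0  1  2 -1]
```

The dependency works in both directions: a change made through either array is visible through the other.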

---

**Now, let's see an example where Numpy will create a `Deep Copy` of data.**

In [None]:
a = np.arange(4)
a

array([0, 1, 2, 3])

In [None]:
# Create `c`

c = a + 2
c

array([2, 3, 4, 5])

In [None]:
# We make changes in `a`

a[0] = 100
a

array([100,   1,   2,   3])

In [None]:
c

array([2, 3, 4, 5])

In [None]:
np.shares_memory(a, c) # Deep Copy

False

As we can see, `c` was not affected by the change in `a`.

- `a + 2` is an arithmetic operation that **computes new values**.
- The result cannot be expressed as just another way of looking at the data already stored in `a`.
- So, Numpy **had to create a separate array for `c`**, i.e., **`c` holds its own copy of the data (deep copy)**.

#### Conclusion:

- Numpy is able to **use the same data** for operations that only change how the data is viewed, like **reshape** $\rightarrow$ **Shallow Copy**.
- It creates a **copy of data** for operations that **compute new values** $\rightarrow$ **Deep Copy**.
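The two behaviours can be put side by side in one small sketch. It additionally uses the in-place operator `+=`, which is not covered in the cells above: it writes the result back into the existing buffer of `a` instead of allocating a new array.

```python
import numpy as np

a = np.arange(4)
b = a.reshape(2, 2)   # shares data with `a`
c = a + 2             # new array with its own data

a += 10               # in-place update of a's buffer

print(b)              # [[10 11] [12 13]] : follows `a`
print(c)              # [2 3 4 5]         : unaffected
print(np.shares_memory(a, b), np.shares_memory(a, c))  # True False
```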


---

**Is there a way to check whether two arrays are sharing memory or not?**

- Yes, `np.shares_memory()` function

In [None]:
a= np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
b = a[::2]
b

array([0, 2, 4, 6, 8])

In [None]:
np.shares_memory(a,b)

True

**Notice that Slicing creates shallow copies.**

In [None]:
a[0] = 1000

In [None]:
b

array([1000,    2,    4,    6,    8])

---

In [None]:
a = np.arange(6)
a

array([0, 1, 2, 3, 4, 5])

In [None]:
b = a[a % 1 == 0]
b

array([0, 1, 2, 3, 4, 5])

In [None]:
b[0] = 10

In [None]:
a[0]

0

In [None]:
np.shares_memory(a,b)

False

---

**Note:**
- Shallow Copy $\rightarrow$ Reshaping, Slicing...
- Deep Copy $\rightarrow$ Arithmetic Operations, Masking...
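As a quick check of this note, the sketch below compares three ways of selecting elements. Indexing with a list of integers (often called fancy indexing) is an addition that is not shown in the cells above; like masking, it returns a copy.

```python
import numpy as np

a = np.arange(10)

sliced = a[2:8:2]        # basic slicing
fancy = a[[2, 4, 6]]     # indexing with a list of positions
masked = a[a % 2 == 0]   # boolean masking

results = {
    "slice": np.shares_memory(a, sliced),
    "fancy": np.shares_memory(a, fancy),
    "mask": np.shares_memory(a, masked),
}
print(results)  # {'slice': True, 'fancy': False, 'mask': False}
```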

In [None]:
a = np.arange(10)

In [None]:
a_shallow_copy = a.view()
# Creates a shallow copy of `a`

In [None]:
np.shares_memory(a_shallow_copy, a)

True

In [None]:
a_deep_copy = a.copy()
# Creates a deep copy of `a`

In [None]:
np.shares_memory(a_deep_copy, a)


False
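It can help to contrast both methods with plain assignment, which creates neither a view nor a copy but only a second name for the same array object. A minimal sketch, with variable names chosen only for illustration:

```python
import numpy as np

a = np.arange(5)

same_object = a      # plain assignment: another name for `a`
a_view = a.view()    # new array object that shares a's data
a_copy = a.copy()    # new array object with its own data

a[0] = 99

print(same_object is a, a_view is a, a_copy is a)  # True False False
print(same_object[0], a_view[0], a_copy[0])        # 99 99 0
```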

---

#### `.view()`

- Returns a view of the original array.
- Any changes made in the new array will be reflected in the original array.

Documentation: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.view.html



In [None]:
arr = np.arange(10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
view_arr = arr.view()
view_arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Let's modify the content of `view_arr` and check whether it modified the original array as well.

In [None]:
view_arr[4] = 420
view_arr

array([  0,   1,   2,   3, 420,   5,   6,   7,   8,   9])

In [None]:
arr

array([  0,   1,   2,   3, 420,   5,   6,   7,   8,   9])

In [None]:
np.shares_memory(arr, view_arr)

True

Notice that changes in view array are reflected in original array.

---

#### `.copy()`

- Returns a copy of the array.
- Changes made in new array are not reflected in the original array.

Documentation (`.copy()`): https://numpy.org/doc/stable/reference/generated/numpy.ndarray.copy.html#numpy.ndarray.copy

Documentation (`np.copy()`): https://numpy.org/doc/stable/reference/generated/numpy.copy.html

In [None]:
arr = np.arange(10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
copy_arr = arr.copy()
copy_arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Let's modify the content of `copy_arr` and check whether it modified the original array as well.

In [None]:
copy_arr[3] = 45
copy_arr

array([ 0,  1,  2, 45,  4,  5,  6,  7,  8,  9])

In [None]:
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
np.shares_memory(arr, copy_arr)

False

Notice that the contents of the original array were not modified when we changed the copy.


---

### What are object arrays?

- Object arrays are arrays whose elements can be any Python object.

Documentation: https://numpy.org/devdocs/reference/arrays.scalars.html#numpy.object

In [None]:
arr = np.array([1, 'm', [1,2,3]], dtype = 'object')
arr

array([1, 'm', list([1, 2, 3])], dtype=object)

There is an exception to `.copy()`:
- **`.copy()` behaves as a shallow copy when the array has `dtype='object'`**.
- It copies the references stored in the array, not the Python objects they point to.

#### But arrays are supposed to hold homogeneous data. How is it storing data of various types?

Remember that everything is an object in Python.

Just like Python list,
- The data actually **stored** in object arrays are **references to Python objects**, not the objects themselves.

Hence, their elements need not be of the same Python type.

**As every element in array is an object, therefore the dtype=object.**

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/065/263/original/img.png?1708017404" width="700" height="100">

Let's make a copy of object array and check whether it returns a shallow copy or deep copy.

In [None]:
copy_arr = arr.copy()

In [None]:
copy_arr

array([1, 'm', list([1, 2, 3])], dtype=object)

Now, let's try to modify the list elements in `copy_arr`.

In [None]:
copy_arr[2][0] = 999

In [None]:
copy_arr

array([1, 'm', list([999, 2, 3])], dtype=object)

Let's see if it changed the original array as well.

In [None]:
arr

array([1, 'm', list([999, 2, 3])], dtype=object)

It did change the original array.

Hence, **`.copy()` returns a shallow copy of an object array: the references are copied, not the objects they point to.**

Any in-place change to a 2nd level element (such as the nested list) will be reflected in the original array as well.
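The distinction between the reference slots and the objects they point to can be made explicit with an identity check. This is a small sketch reusing the same example array:

```python
import numpy as np

arr = np.array([1, 'm', [1, 2, 3]], dtype=object)
copy_arr = arr.copy()

# The two arrays have separate slots: rebinding one leaves `arr` untouched.
copy_arr[0] = 42

# But the slots at index 2 still point to the very same list object.
same_list = copy_arr[2] is arr[2]
copy_arr[2].append(4)   # mutating the shared list is visible through both

print(arr[0])      # 1
print(same_list)   # True
print(arr[2])      # [1, 2, 3, 4]
```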

So, how do we create a deep copy then?

We can do so using the `copy.deepcopy()` function from Python's built-in `copy` module.

#### `copy.deepcopy()`

- Returns a deep copy of an array, including the objects its elements refer to.

Documentation: https://docs.python.org/3/library/copy.html#copy.deepcopy

In [None]:
import copy

In [None]:
arr = np.array([1, 'm', [1,2,3]], dtype = 'object')
arr

array([1, 'm', list([1, 2, 3])], dtype=object)

Let's make a copy using `deepcopy()`.

In [None]:
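# Use a name other than `copy` so that the `copy` module is not overwritten
deep_copy_arr = copy.deepcopy(arr)
deep_copy_arr

array([1, 'm', list([1, 2, 3])], dtype=object)

Let's modify the list inside the copied array.

In [None]:
deep_copy_arr[2][0] = 999
deep_copy_arr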

array([1, 'm', list([999, 2, 3])], dtype=object)

In [None]:
arr

array([1, 'm', list([1, 2, 3])], dtype=object)

Notice that the changes in the copied array were not reflected in the original array.

`copy.deepcopy()` **returns deep copy of an array.**

---

## **Splitting**

In addition to reshaping and selecting subarrays, it is often necessary to split arrays into smaller arrays or merge arrays into bigger arrays.

#### `np.split()`

- Splits an array into multiple sub-arrays as views.

**It takes an argument `indices_or_sections`.**

- If `indices_or_sections` is an **integer, n**, the array will be **divided into n equal arrays along axis**.

- If such a split is not possible, an error is raised.

- If `indices_or_sections` is a **1-D array of sorted integers**, the entries indicate **where along axis the array is split**.

- If an index **exceeds the dimension of the array along axis**, an **empty sub-array is returned** correspondingly.
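The sketch below illustrates two of the points above on a small array: the sub-arrays are views, and an index beyond the end of the axis produces an empty sub-array. The variable names are illustrative.

```python
import numpy as np

x = np.arange(6)

# Integer argument: three equal sections.
parts = np.split(x, 3)

# The sub-arrays are views, so writing to one changes `x`.
parts[0][0] = -1

# An index past the end of the axis yields an empty trailing sub-array.
pieces = np.split(x, [4, 10])

print(x)                             # [-1  1  2  3  4  5]
print([p.tolist() for p in pieces])  # [[-1, 1, 2, 3], [4, 5], []]
```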

In [None]:
x = np.arange(9)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8])

In [None]:
np.split(x, 3)

[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]

**IMPORTANT REQUISITE**

- Number of elements in the array should be divisible by number of sections.

In [None]:
b = np.arange(10)
np.split(b, 3)

ValueError: array split does not result in an equal division

In [None]:
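# Drop the last element so that the remaining 9 elements divide evenly into 3 sections
np.split(b[0:-1], 3)

[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]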
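If dropping elements is not acceptable, one alternative that these notes do not otherwise cover is `np.array_split()`, which accepts a number of sections that does not divide the length evenly. A minimal sketch:

```python
import numpy as np

b = np.arange(10)

# np.split(b, 3) raises ValueError because 10 is not divisible by 3.
# np.array_split allows sections of unequal size instead.
parts = np.array_split(b, 3)
sizes = [len(p) for p in parts]

print(sizes)                        # [4, 3, 3]
print([p.tolist() for p in parts])  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```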

In [None]:
# Splitting on the basis of exact indices

c = np.arange(16)
np.split(c, [3, 5, 6])

[array([0, 1, 2]),
 array([3, 4]),
 array([5]),
 array([ 6,  7,  8,  9, 10, 11, 12, 13, 14, 15])]
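One way to read a list of indices is as the boundaries of consecutive slices. The sketch below checks that reading against manually written slices; `manual` is just an illustrative name.

```python
import numpy as np

c = np.arange(16)

parts = np.split(c, [3, 5, 6])

# The indices [3, 5, 6] act as slice boundaries.
manual = [c[:3], c[3:5], c[5:6], c[6:]]

same = all(np.array_equal(p, m) for p, m in zip(parts, manual))
print(len(parts), same)  # 4 True
```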

---

#### `np.hsplit()`

- Splits an array into multiple sub-arrays **horizontally (column-wise)**.

In [None]:
x = np.arange(16.0).reshape(4, 4)
x

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.]])

Think of it this way:

  - There are 2 axes in a 2D array
    1. **1st axis - Vertical axis**
    2. **2nd axis - Horizontal axis**

**Along which axis are we splitting the array?**

- The split we want happens across the **2nd axis (Horizontal axis)**
- That is why we use `hsplit()`

**So, try to think in terms of "whether the operation is happening along vertical axis or horizontal axis".**

- We are splitting the horizontal axis in this case.

In [None]:
np.hsplit(x, 2)

[array([[ 0.,  1.],
        [ 4.,  5.],
        [ 8.,  9.],
        [12., 13.]]),
 array([[ 2.,  3.],
        [ 6.,  7.],
        [10., 11.],
        [14., 15.]])]

In [None]:
np.hsplit(x, np.array([3, 6]))

[array([[ 0.,  1.,  2.],
        [ 4.,  5.,  6.],
        [ 8.,  9., 10.],
        [12., 13., 14.]]),
 array([[ 3.],
        [ 7.],
        [11.],
        [15.]]),
 array([], shape=(4, 0), dtype=float64)]
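For a 2D array, `np.hsplit()` gives the same result as `np.split()` with `axis=1`. The following sketch checks this on the same 4x4 array:

```python
import numpy as np

x = np.arange(16.0).reshape(4, 4)

left, right = np.hsplit(x, 2)
left_alt, right_alt = np.split(x, 2, axis=1)

same = np.array_equal(left, left_alt) and np.array_equal(right, right_alt)
print(same, left.shape, right.shape)  # True (4, 2) (4, 2)
```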

---

#### `np.vsplit()`

- Splits an array into multiple sub-arrays **vertically (row-wise)**.

In [None]:
x = np.arange(16.0).reshape(4, 4)
x

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.]])

**Now, along which axis are we splitting the array?**

- The split we want happens across the **1st axis (Vertical axis)**
- That is why we use `vsplit()`

**Again, always try to think in terms of "whether the operation is happening along vertical axis or horizontal axis".**

- We are splitting the vertical axis in this case.

In [None]:
np.vsplit(x, 2)

[array([[0., 1., 2., 3.],
        [4., 5., 6., 7.]]),
 array([[ 8.,  9., 10., 11.],
        [12., 13., 14., 15.]])]

In [None]:
np.vsplit(x, np.array([3]))

[array([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]]),
 array([[12., 13., 14., 15.]])]
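Similarly, `np.vsplit()` corresponds to `np.split()` with `axis=0`. Because it splits along the vertical axis, it needs an array with at least 2 dimensions; the sketch below also shows that a 1D input is rejected. The flag name `vsplit_1d_ok` is made up for this example.

```python
import numpy as np

x = np.arange(16.0).reshape(4, 4)

top, bottom = np.vsplit(x, 2)
top_alt, bottom_alt = np.split(x, 2, axis=0)
same = np.array_equal(top, top_alt) and np.array_equal(bottom, bottom_alt)

# A 1D array has no vertical axis to split along.
flat = np.arange(6)
try:
    np.vsplit(flat, 2)
    vsplit_1d_ok = True
except ValueError:
    vsplit_1d_ok = False

print(same, vsplit_1d_ok)  # True False
```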

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/054/735/original/hvsp1.png?1698041133" width="600" height="400">

---

## **Stacking**

In [None]:
a = np.arange(1, 5)
b = np.arange(2, 6)
c = np.arange(3, 7)

#### `np.vstack()`

- Stacks a list of arrays **vertically (along axis 0 or 1st axis)**.
- For **example**, **given a list of row vectors, appends the rows to form a matrix**.

In [None]:
np.vstack([b, c, a])

array([[2, 3, 4, 5],
       [3, 4, 5, 6],
       [1, 2, 3, 4]])
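To see what happens to the shapes, the sketch below stacks two 1D arrays and compares the result with an explicit concatenation of single-row 2D arrays. It also checks that the stacked result is a new array rather than a view of its inputs.

```python
import numpy as np

a = np.arange(1, 5)
b = np.arange(2, 6)

stacked = np.vstack([a, b])

# Each 1D array of shape (4,) is treated as one row of shape (1, 4).
rows = np.concatenate([a.reshape(1, -1), b.reshape(1, -1)], axis=0)

print(a.shape, stacked.shape)         # (4,) (2, 4)
print(np.array_equal(stacked, rows))  # True
print(np.shares_memory(stacked, a))   # False
```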

In [None]:
a = np.arange(1, 5)
b = np.arange(2, 4)
c = np.arange(3, 10)

In [None]:
np.vstack([b, c, a])

ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 2 and the array at index 1 has size 7

---

#### `np.hstack()`

- Stacks a list of arrays **horizontally (along axis 1 or 2nd axis)**.

In [None]:
a = np.arange(5).reshape(5, 1)
a

array([[0],
       [1],
       [2],
       [3],
       [4]])

In [None]:
b = np.arange(15).reshape(5, 3)
b

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [None]:
np.hstack([a, b])

array([[ 0,  0,  1,  2],
       [ 1,  3,  4,  5],
       [ 2,  6,  7,  8],
       [ 3,  9, 10, 11],
       [ 4, 12, 13, 14]])
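One detail worth knowing: 1D arrays have only one axis, so `np.hstack()` joins them end to end instead of placing them side by side as columns. A small sketch comparing 1D inputs with column-shaped inputs:

```python
import numpy as np

a = np.arange(3)
b = np.arange(3, 6)

# 1D inputs are joined end to end.
flat = np.hstack([a, b])

# Column-shaped inputs of shape (3, 1) are placed side by side.
cols = np.hstack([a.reshape(3, 1), b.reshape(3, 1)])

print(flat)        # [0 1 2 3 4 5]
print(cols.shape)  # (3, 2)
```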

---

#### `np.concatenate()`

- Can perform both vstack and hstack
- Creates a new array by appending arrays after each other, along a given axis.

Provides similar functionality, but it takes a **keyword argument `axis`** that specifies the **axis along which the arrays are to be concatenated**.

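All input arrays to `concatenate()` must have the same number of dimensions, and their shapes must match along every axis except the one being concatenated (unless `axis=None`, which flattens the inputs first).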

In [None]:
a = np.array([1,2,3])
a


array([1, 2, 3])

In [None]:
b = np.array([[1,2,3], [4,5,6]])
b

array([[1, 2, 3],
       [4, 5, 6]])

In [None]:
np.concatenate([a, b], axis=0)

ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 2 dimension(s)

**`concatenate()` can only work if both `a` and `b` have the same number of dimensions**

In [None]:
a = np.array([[1,2,3]])
b = np.array([[1,2,3], [4,5,6]])

In [None]:
np.concatenate([a, b], axis = 0) # axis=0 -> vstack

array([[1, 2, 3],
       [1, 2, 3],
       [4, 5, 6]])
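Instead of retyping the array with an extra pair of brackets, an existing 1D array can be promoted to 2D before concatenating. The sketch below uses `np.newaxis`, which is not introduced in the cells above, as one possible way to do this.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1, 2, 3], [4, 5, 6]])

# Add a leading axis: shape (3,) becomes (1, 3).
a_row = a[np.newaxis, :]

result = np.concatenate([a_row, b], axis=0)

print(a_row.shape)  # (1, 3)
print(result)
```

`a.reshape(1, -1)` would produce the same single-row array.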

In [None]:
a = np.arange(6).reshape(3, 2)
b = np.arange(9).reshape(3, 3)

In [None]:
np.concatenate([a, b], axis = 1) # axis=1 -> hstack

array([[0, 1, 0, 1, 2],
       [2, 3, 3, 4, 5],
       [4, 5, 6, 7, 8]])

In [None]:
a = np.array([[1,2], [3,4]])
b = np.array([[5,6,7,8]])

In [None]:
np.concatenate([a, b], axis = None)

# axis=None joins and converts to 1D

array([1, 2, 3, 4, 5, 6, 7, 8])
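With `axis=None` the shapes of the inputs do not need to be compatible, because each array is flattened first. The sketch below compares the result with flattening manually via `ravel()`; `manual` is an illustrative name.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6, 7, 8]])     # shape (1, 4)

joined = np.concatenate([a, b], axis=None)

# Equivalent: flatten each input, then join the 1D results.
manual = np.concatenate([a.ravel(), b.ravel()])

print(joined.shape)                   # (8,)
print(np.array_equal(joined, manual)) # True
```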

---

**Question:** What will be the output of this?

```
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b), axis=0)
```

In [None]:
a = np.array([[1, 2], [3, 4]])
a

array([[1, 2],
       [3, 4]])

In [None]:
b = np.array([[5, 6]])
b

array([[5, 6]])

In [None]:
np.concatenate((a, b), axis=0)

array([[1, 2],
       [3, 4],
       [5, 6]])

**How did it work?**

- The dimensions of `a` are $2\times2$

**What are the dimensions of `b`?**

- 1-D array ?? - **NO**
- Look carefully!!
- **`b` is a 2-D array of dimensions $1\times2$**

**`axis = 0` $\rightarrow$ It's a vertical axis**

- So, **changes will happen along vertical axis**
- So, **`b` gets concatenated below `a`**

---

**Question:** What will be the result of this concatenation operation?

```
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b.T), axis=1)
```

In [None]:
a = np.array([[1, 2], [3, 4]])
a

array([[1, 2],
       [3, 4]])

In [None]:
b = np.array([[5, 6]])
b

array([[5, 6]])

In [None]:
np.concatenate((a, b.T), axis=1)

array([[1, 2, 5],
       [3, 4, 6]])

**What happened here?**

- **Dimensions of `a`** are again $2\times2$
- **Dimensions of `b`** are again $1\times2$
- So, **Dimensions of `b.T`** will be $2\times1$

**`axis = 1`** $\rightarrow$ It's a horizontal axis

- So, **changes will happen along horizontal axis**
- So, **`b.T` gets concatenated horizontally to `a`**
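Checking the shapes makes the rule behind both questions visible: every axis other than the concatenation axis has to match. The sketch below also shows that skipping the transpose fails along `axis=1`; the flag `untransposed_ok` is made up for this example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6]])           # shape (1, 2)

below = np.concatenate((a, b), axis=0)     # column counts match (2 and 2)
beside = np.concatenate((a, b.T), axis=1)  # row counts match (2 and 2)

# Without the transpose, the row counts differ (2 vs 1), so axis=1 fails.
try:
    np.concatenate((a, b), axis=1)
    untransposed_ok = True
except ValueError:
    untransposed_ok = False

print(below.shape, beside.shape, untransposed_ok)  # (3, 2) (2, 3) False
```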

---

### Extra-reading material

- [Image Manipulation](https://colab.research.google.com/drive/1SkyA5iF7UTDR8VFhCWEy525XlcaJ8YdI#scrollTo=VBD8uhb9M63e)