# Copying

QUOTE: Views are an important NumPy concept! NumPy functions, as well as operations like indexing and slicing, will return views whenever possible. This saves memory and is faster (no copy of the data has to be made). However it’s important to be aware of this - modifying data in a view also modifies the original array!

In [5]:
import numpy as np

## Not Copy

Assigning a NumPy array to a new variable does not make a copy. Both names refer to the same array object, which is why `b is a` is `True` and the two `id` values match.

In [41]:
a = np.arange(10).reshape(5, 2)
b = a
print(b is a)
print(id(a))
print(id(b))

True
1297068598544
1297068598544
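
A minimal sketch of the practical consequence: because `b` is just another name for the same object, a write through `b` is visible through `a`. The value `-1` is an arbitrary choice for illustration.

```python
import numpy as np

a = np.arange(10).reshape(5, 2)
b = a            # plain assignment: no new array is created

b[0, 0] = -1     # write through the second name
print(a[0, 0])   # the change shows up through the first name
print(b is a)    # both names still refer to one object
```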


## View or Shallow Copy

Methods such as `view` and `reshape` (when the memory layout allows it), and slicing (e.g., `a[:10]`), create a new array object that shares its underlying data with the original. Changing a value through the view also changes it in the original, and vice versa.

In [78]:
a = np.arange(10).reshape(5, 2)
b = a.view()
c = a.reshape(-1)
d = a[:3, :1]

In [79]:
a, b, c, d

(array([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]]),
 array([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]]),
 array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
 array([[0],
        [2],
        [4]]))

In [81]:
b.shape

(5, 2)

In [82]:
b[2:, 1:] = b[2:, 1:] * 100
c[2] = 200
d[0, 0] = 100
a, b, c, d

(array([[100,   1],
        [200,   3],
        [  4, 500],
        [  6, 700],
        [  8, 900]]),
 array([[100,   1],
        [200,   3],
        [  4, 500],
        [  6, 700],
        [  8, 900]]),
 array([100,   1, 200,   3,   4, 500,   6, 700,   8, 900]),
 array([[100],
        [200],
        [  4]]))
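
One way to check whether two arrays share data, rather than inferring it from side effects, is `np.shares_memory`. The sketch below rebuilds the arrays used above and also shows a case where `reshape` cannot return a view: flattening a transposed array requires a copy. The name `flat_from_transpose` is introduced here only for illustration.

```python
import numpy as np

a = np.arange(10).reshape(5, 2)
b = a.view()
c = a.reshape(-1)
d = a[:3, :1]

# Each of these is a distinct Python object that shares a's data buffer.
for name, arr in [("b", b), ("c", c), ("d", d)]:
    print(name, "is a:", arr is a, "| shares memory:", np.shares_memory(a, arr))

# a.T is not C-contiguous, so flattening it in C order has to copy the data.
flat_from_transpose = a.T.reshape(-1)
print("transpose reshape shares memory:", np.shares_memory(a, flat_from_transpose))
```

The first three lines report `is a: False` with `shares memory: True`; the last line reports `False`, so writes to `flat_from_transpose` would not reach `a`.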

## Deep Copy

The `copy` method creates a new array with its own data buffer, so changes to the copy do not affect the original.

In [83]:
a = np.arange(10)
b = a.copy()
b is a

False

In [85]:
b[0] = 100
print(b)
print(a)

[100   1   2   3   4   5   6   7   8   9]
[0 1 2 3 4 5 6 7 8 9]


QUOTE: Sometimes `copy` should be called after slicing if the original array is not required anymore. 

For example, suppose `a` is a huge intermediate result and the final result `b` contains only a small fraction of `a`. In that case, make a deep copy when constructing `b` by slicing:

In [89]:
a = np.arange(10000)
b = a[:100].copy()
del a

# If b = a[:100] were used instead, b would keep a reference to a's data,
# so the whole buffer would stay in memory even after del a is executed.
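
The difference can be inspected through the `base` attribute, as in the rough sketch below. The names `big`, `view_slice`, and `copy_slice` are hypothetical and used only for this illustration.

```python
import numpy as np

big = np.arange(10000)

view_slice = big[:100]          # view: holds a reference to big's buffer
copy_slice = big[:100].copy()   # independent array with only 100 elements

print(view_slice.base is big)   # the view points back to the large array
print(copy_slice.base is None)  # the copy owns its data

# Bytes kept alive by each result (exact numbers depend on the integer dtype).
print(view_slice.base.nbytes, copy_slice.nbytes)
```

Both checks print `True`. The view keeps a buffer 100 times larger than the copy alive, which is the reason for calling `copy` before discarding the large array.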

# Reference

- https://numpy.org/doc/stable/user/quickstart.html#copies-and-views
- https://numpy.org/doc/stable/user/absolute_beginners.html#how-to-create-an-array-from-existing-data