# Views and Copies

In our previous reading, we talked about how we could not just look at subsets of vectors, but also store those subsets in a new variable. For example, we could make a short vector with 4 entries, then store the middle two entries in a new vector `my_subset`:

In [5]:
import numpy as np
my_vector = np.array([1, 2, 3, 4])
my_vector

array([1, 2, 3, 4])

In [6]:
my_subset = my_vector[1:3]
my_subset

array([2, 3])

Now, at the time, we drew a picture that looked like this:

![subset to new var](../week_2/img/vector_subsetting1.png)
![subset to new var](../week_2/img/vector_subsetting2.png)
![subset to new var](../week_2/img/vector_subsetting3.png)

And that was *close* to the truth about what was going on, but it wasn't quite the *full* truth. 

The reality is that when we create a subset in numpy and assign it to a new variable, what is *actually* happening is not that the variable is being assigned a copy of the value of the subset, but rather the variable is being assigned a *reference* to the subset, something that looks more like this:

[new image with referring arrow]

When numpy creates a reference to a subset of an existing array, that reference is called a *view*: it's not a copy of the data in the original array, but an easy way of referring back to the original array. In other words, it provides a *view* onto a subset of the original array.

Why is this distinction important? It's important because it means that both variables, `my_vector` and `my_subset`, are actually referencing the same data, and so changes made through one variable will propagate to the other.

For example, if I change the first entry in `my_subset`, it will of course change what I see with `my_subset`:

In [7]:
my_subset[0] = -99
my_subset

array([-99,   3])

But since the first entry in `my_subset` is just a reference to the second entry in `my_vector`, the change I made to `my_subset` will also propagate to `my_vector`:

In [8]:
my_vector

array([  1, -99,   3,   4])
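If you'd rather not rely on editing an entry to find out whether two variables share data, numpy can tell you directly. The sketch below uses `np.shares_memory` and the `.base` attribute, both part of numpy, on the same vector as above; the variable names `shares` and `base_is_original` are just illustrative.

```python
import numpy as np

my_vector = np.array([1, 2, 3, 4])
my_subset = my_vector[1:3]  # a plain slice, so this is a view

# True when the two arrays overlap in memory
shares = np.shares_memory(my_vector, my_subset)

# A view keeps a reference to the array that owns its data in `.base`
base_is_original = my_subset.base is my_vector

print(shares, base_is_original)  # True True
```

Both checks leave the data untouched, so they are safe to run on arrays you care about.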

### Why? Why Would Numpy Do This?

The short answer, as with most things in numpy, is speed. Copying the data contained in a subset of a vector takes time (and memory), so making views the default behavior for slices makes numpy faster.
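To get a rough feel for the difference, here is a minimal timing sketch using Python's built-in `timeit` module. The array size and repeat count are arbitrary choices for illustration, and the exact numbers will vary from machine to machine, so treat the result as a ballpark rather than a benchmark.

```python
import timeit

import numpy as np

big_vector = np.arange(1_000_000)

# Time 100 plain slices (views): no data is copied
view_seconds = timeit.timeit(lambda: big_vector[1:-1], number=100)

# Time 100 slices that are each followed by an explicit copy
copy_seconds = timeit.timeit(lambda: big_vector[1:-1].copy(), number=100)

print(f"view: {view_seconds:.6f}s, copy: {copy_seconds:.6f}s")
```

On a typical machine the view version should be faster by a wide margin, because slicing only records where the subset lives rather than duplicating nearly a million entries.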

## When do you get a view, and when do you get a copy?

OK, now the *really* annoying thing: when do I get a view, and when do I get a copy?

Generally speaking: 

- **you get a view if you do a plain, basic slice of an array,** and 
- **the view remains a view if you modify it using basic indexing (i.e. you use `[]` on the left side of the assignment operator).** 

Outside of those two behaviors, you will usually get a copy. 

So, for example, this slice will get you a view:

In [None]:
my_array = np.array([1, 2, 3])
my_slice = my_array[1:3]
my_slice[0] = -1
my_array

array([ 1, -1,  3])

But if you use "fancy indexing" (where you pass a list when making your slice), you will NOT get a view:

In [None]:
my_array = np.array([1, 2, 3])
my_slice = my_array[[1,2]]
my_slice[0] = -1
my_array

array([1, 2, 3])
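The same difference can be confirmed without modifying anything. This short sketch compares a basic slice and a fancy-indexed selection of the same array using `np.shares_memory`; the names `basic_slice` and `fancy_slice` are only for illustration.

```python
import numpy as np

my_array = np.array([1, 2, 3])

basic_slice = my_array[1:3]     # start:stop slice -> view
fancy_slice = my_array[[1, 2]]  # list of positions -> copy

print(np.shares_memory(my_array, basic_slice))  # True
print(np.shares_memory(my_array, fancy_slice))  # False
```

Both selections contain the same values, which is exactly why this is easy to get wrong: only the way you asked for them differs.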

Similarly, if you edit using basic indexing (like we did above), those edits will propagate from the slice back to the original array (or the other way around). 

But if you assign the result of an operation back to the variable name itself, without `[]` on the left side, the name is pointed at a brand new array (a copy), so changes won't propagate:

In [None]:
my_array = np.array([1, 2, 3])
my_slice = my_array[1:3]
my_slice = my_slice * 2
my_slice

array([4, 6])

In [None]:
my_array

array([1, 2, 3])
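What happened here is that `my_slice * 2` built a new array, and the `=` then pointed the name `my_slice` at that new array; the original view was never modified. The sketch below makes that visible by keeping a second name, `original_view` (a name introduced just for this illustration), attached to the view before the reassignment.

```python
import numpy as np

my_array = np.array([1, 2, 3])
my_slice = my_array[1:3]
original_view = my_slice  # a second name for the same view

my_slice = my_slice * 2   # builds a new array, then rebinds the name

print(my_slice is original_view)                   # False
print(np.shares_memory(my_array, my_slice))        # False
print(np.shares_memory(my_array, original_view))   # True
```

The view still exists and still refers to `my_array`; it's just that `my_slice` no longer refers to it.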

If you want to do a full-array manipulation and preserve your view, always use square brackets on the left side of the assignment operator (`=`):

In [None]:
my_array = np.array([1, 2, 3])
my_slice = my_array[1:3]
my_slice[:] = my_slice * 2
my_slice

array([4, 6])

In [None]:
my_array

array([1, 4, 6])
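One related case worth knowing about: numpy's in-place operators (such as `*=`) also modify the existing data rather than building a new array, so they preserve the view as well. This is a small sketch of that behavior, not something covered above.

```python
import numpy as np

my_array = np.array([1, 2, 3])
my_slice = my_array[1:3]

my_slice *= 2  # in-place multiply: writes into the data the view points at

print(my_slice)  # [4 6]
print(my_array)  # [1 4 6]
```

Using `my_slice[:] = ...` is arguably the more explicit of the two, since the square brackets make it obvious that you are writing into existing data.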

## Making a Copy

Of course, this type of propagating behavior is not always desirable. If you want to pull a subset of a vector (or array) that is a full copy and not a view, you can just use the `.copy()` method:

In [15]:
my_vector = np.array([1, 2, 3, 4])
my_subset = my_vector[1:3].copy()
my_subset

array([2, 3])

In [16]:
my_subset[0] = -99
my_subset

array([-99,   3])

In [17]:
my_vector

array([1, 2, 3, 4])
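If you ever lose track of whether a variable holds a view or a copy, the `.base` attribute and `np.shares_memory` can settle it. In this sketch, `my_view` and `my_copy` are illustrative names for the two kinds of subset.

```python
import numpy as np

my_vector = np.array([1, 2, 3, 4])

my_view = my_vector[1:3]         # view onto my_vector
my_copy = my_vector[1:3].copy()  # independent copy

print(my_view.base is my_vector)              # True: the view points back at my_vector
print(my_copy.base is None)                   # True: the copy owns its own data
print(np.shares_memory(my_vector, my_copy))   # False
```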