# Q: What's the difference between a copy and a view?

# Answer:
* We have some object x. 
* y = some expression of x. 

*y is a view of x if changing y changes x.* Otherwise, *y is a copy of x*.
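
The definition above can be checked by experiment. Here is a minimal sketch, assuming pandas; the helper `changes_original` and the function `set_first_cell` are hypothetical names made up for this illustration and are not part of pandas. The two cases shown (an explicit copy, and the object itself) should behave the same in any version.

```python
import pandas as pd

def changes_original(x, derive, mutate):
    """Hypothetical helper: True if changing y = derive(x) also changes x."""
    before = x.copy()      # independent snapshot of x for comparison
    y = derive(x)          # y = some expression of x
    mutate(y)              # change y in place
    return not x.equals(before)

def set_first_cell(y):
    y.iloc[0, 0] = -1      # overwrite the top-left value of y

x = pd.DataFrame({'a': [1, 2], 'b': [3, 4]}, index=['d', 'e'])

# An explicit copy: changing it leaves x alone.
copy_result = changes_original(x, lambda df: df.copy(), set_first_cell)

# The object itself: changing it is changing x.
same_result = changes_original(x, lambda df: df, set_first_cell)

print(copy_result, same_result)
```

Passing other expressions as `derive` (for example a `.loc` slice) is a way to see how your installed version treats them.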

# An object can also be part-view: 
* y = some expression of x. 
* Add something (e.g. a new column) to y. 
* Original columns of y are views of x. 
* New column is not. 

# What happened. 
* In the version of pandas that I used to develop the notes, `x[column][row]` was always a copy.
* Unknown to me, the version on https://jupyterhub.cs.tufts.edu allowed most expressions of the form `x[column][row]` to be views.
* In my version, only `x.loc[row, column]` resulted in a view. The Debian release I was using had not kept up to date.

* The takeaway: later versions of the software allow more expressions to be views. 
* `x['column label']['row label'] = 1`  # change the value to 1. This is a view expression. 
* `y = x['column label']`               # y is a view of one column of x.
* `y['row label'] = 1`                  # so this changes x as well.
* For example, say I have columns 'a', 'b', 'c' and rows 'd', 'e', 'f', 'g'.
* `y = x.loc['d':'e', 'b':'c']`  # a view of x
* `z = y.loc['d':'e', 'b']`  # a view of y is also a view of x
* `y.loc['d', 'b'] = 1.0`  # setting part of y (a view) to 1.0 also sets the same cell in x.
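
The label-slice bullets above can be run as a cell. This is a minimal sketch, not a statement of what every version does: whether `x` changes depends on the installed pandas version, so the cell prints the result instead of assuming it. Float data is an assumption made here so that `1.0` fits the column type, and `x_changed` is a name introduced for illustration. Some versions may also print a warning on the assignment.

```python
import pandas as pd

x = pd.DataFrame(data={'a': [1.0, 2.0, 3.0, 4.0],
                       'b': [5.0, 6.0, 7.0, 8.0],
                       'c': [9.0, 10.0, 11.0, 12.0]},
                 index=['d', 'e', 'f', 'g'])

y = x.loc['d':'e', 'b':'c']   # label slices on both axes
y.loc['d', 'b'] = 1.0         # set one value through y

# True means y acted as a view of x in this pandas version.
x_changed = bool(x.loc['d', 'b'] == 1.0)
print('x changed through y:', x_changed)
```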

* You can make an explicit copy. 

In [12]:
import pandas as pd
x = pd.DataFrame(data={'a': [1,2,3,4], 'b': [5,6,7,8], 'c': [9,10,11,12]}, index=['d','e','f', 'g'])
x

,a,b,c
d,1,5,9
e,2,6,10
f,3,7,11
g,4,8,12


In [13]:
c = x.copy()
c

,a,b,c
d,1,5,9
e,2,6,10
f,3,7,11
g,4,8,12


In [14]:
c['a']['f'] = 100
c

,a,b,c
d,1,5,9
e,2,6,10
f,100,7,11
g,4,8,12


In [15]:
x

,a,b,c
d,1,5,9
e,2,6,10
f,3,7,11
g,4,8,12


# When to use a copy or a view
1. Copies are used when you want to make multiple versions of a thing. 
2. Views make it simpler to edit a single thing. The basic idea of a view: eliminate 'for' loops in editing data. 
3. In the version of anaconda3 you're using, most things are views, and copies must be generated explicitly. 
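
To illustrate point 2, here is a minimal sketch comparing a cell-by-cell loop with a single slice assignment on the same data. The names `looped` and `sliced` are introduced only for this comparison.

```python
import pandas as pd

data = {'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8], 'c': [9, 10, 11, 12]}
index = ['d', 'e', 'f', 'g']

# Loop version: set each cell of the block one at a time.
looped = pd.DataFrame(data=data, index=index)
for row in ['e', 'f']:
    for col in ['a', 'b']:
        looped.loc[row, col] = 200

# Slice version: one assignment covers the whole block.
sliced = pd.DataFrame(data=data, index=index)
sliced.loc['e':'f', 'a':'b'] = 200

print(looped.equals(sliced))
```

Both frames end up with the same contents; the slice form just says it in one line.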

In [17]:
x.loc['e':'f', 'a':'b'] = 200
x

,a,b,c
d,1,5,9
e,200,200,10
f,200,200,11
g,4,8,12


# The key issue
The above works only because the left-hand side (LHS) of the assignment is a view of `x`.
If it were a copy, the original would not change.

In [23]:
x.a == 200

d    False
e     True
f     True
g    False
Name: a, dtype: bool

In [19]:
v = x[x.a == 200]  # conditions based upon row value do not result in views. 
v

,a,b,c
e,200,200,10
f,200,200,11


In [24]:
v['a']['e'] = 1000
v

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  exec(code_obj, self.user_global_ns, self.user_ns)


,a,b,c
e,1000,200,10
f,200,200,11


In [25]:
x

,a,b,c
d,1,5,9
e,200,200,10
f,200,200,11
g,4,8,12


# Right now, in the current version
(subject to change when the implementors are clever enough:)
* Expressions of row/column labels result in views. 
* Expressions involving column *values* result in copies. 
(I would not be at all surprised if -- in the next version -- these become views instead.) 

In [27]:
x.loc[x.a == 200, 'a'] = 1000
x

,a,b,c
d,1,5,9
e,1000,200,10
f,1000,200,11
g,4,8,12


In [35]:
y = x.loc[x.a == 1000, 'a':'b']
type(y)


pandas.core.frame.DataFrame

In [37]:
y.loc[:,:] = 2000
y

,a,b
e,2000,2000
f,2000,2000


In [38]:
x

,a,b,c
d,1,5,9
e,1000,200,10
f,1000,200,11
g,4,8,12


In [40]:
s = x['a']
type(s)

pandas.core.series.Series

In [43]:
s.loc['e']

1000

# A series as accessed above is a view. 
* If you get a series by `s = x['column']`, then changing `s` changes `x`.

In [44]:
s['e'] = 5000
x

,a,b,c
d,1,5,9
e,5000,200,10
f,1000,200,11
g,4,8,12
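
If you want a column you can edit without touching `x`, the explicit-copy idea applies to a Series too. A minimal sketch using `Series.copy()` on a fresh, smaller frame (the two-column data here is chosen only for illustration):

```python
import pandas as pd

x = pd.DataFrame(data={'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8]},
                 index=['d', 'e', 'f', 'g'])

s = x['a'].copy()   # independent copy of column 'a'
s['e'] = 5000       # changes only s

print(s['e'], x.loc['e', 'a'])
```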


# The real lesson
* Be aware that most of the time, when you create a derived object that is a subset of the original, you get a view.
* Changing the view changes the original. 
* To be sure you have a copy, type `y = x.copy()`.
* You'll want to do that when trying different versions of an experiment. 
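
As a sketch of that last point, one way to keep several experiment versions apart is to give each its own copy. The names `experiments`, `trial`, and `scale`, and the scaling step itself, are made up for illustration; any change to a trial stays in that trial.

```python
import pandas as pd

x = pd.DataFrame(data={'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8]},
                 index=['d', 'e', 'f', 'g'])

# One independent copy per experiment; x itself is never modified.
experiments = {}
for scale in [10, 100]:
    trial = x.copy()
    trial['a'] = trial['a'] * scale   # hypothetical experiment step
    experiments[scale] = trial

print(experiments[10].loc['d', 'a'], experiments[100].loc['d', 'a'], x.loc['d', 'a'])
```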