## View vs copy

In [1]:
x = [1, 2, 3]

In [2]:
y = x 

In [3]:
y[0] = 42

In [5]:
y

[42, 2, 3]

In [6]:
x

[42, 2, 3]

In Python, assigning an existing object to a new variable does not create a new object: it only binds another name to the same object in memory.

In [7]:
id(x)

139706435138416

In [8]:
id(y)

139706435138416

In [9]:
id(x) == id(y)

True
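
The identity comparison above is usually written with the `is` operator. A minimal sketch, assuming a hypothetical helper `set_first` introduced only for illustration, showing that the same aliasing applies when a list is passed to a function:

```python
def set_first(items, value):
    """Mutate the list in place; the caller's object is the one modified."""
    items[0] = value


x = [1, 2, 3]
y = x                  # y is another name for the same list object

same_object = y is x   # equivalent to id(x) == id(y)

set_first(y, 42)       # the parameter `items` is a third name for that list
print(same_object, x)  # True [42, 2, 3]
```

No copy is made at any point here: `x`, `y` and `items` all refer to a single list.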

### How to copy an object?

In [10]:
%reset -f 

In [11]:
x = [1, 2, 3]

In [12]:
y = x.copy()

In [13]:
y[0] = 42

In [14]:
x

[1, 2, 3]

In [15]:
y

[42, 2, 3]
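
For a flat list, `x.copy()` is one of several equivalent ways to get a new list object. The sketch below compares the common ones (the dictionary `copies` is only an illustration device) and checks that modifying each copy leaves `x` untouched:

```python
import copy

x = [1, 2, 3]

# Four ways to build a new list holding the same elements.
copies = {
    "x.copy()": x.copy(),
    "list(x)": list(x),
    "x[:]": x[:],
    "copy.copy(x)": copy.copy(x),
}

for name, y in copies.items():
    y[0] = 42                     # modify the copy only
    print(name, "is x:", y is x)  # each copy is a distinct object

print(x)                          # the original list is unchanged
```

All four are shallow copies: the list itself is new, but its elements are not duplicated.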

### An example that concerns us

In [16]:
import copy
import pandas as pd
import numpy as np

In [17]:
df = pd.DataFrame({"A": np.random.randint(0,100, 6)})

In [18]:
df

|   | A  |
|---|----|
| 0 | 15 |
| 1 | 84 |
| 2 | 56 |
| 3 | 46 |
| 4 | 27 |
| 5 | 98 |


In [19]:
df2 = df.loc[0:3]

In [20]:
df2

|   | A  |
|---|----|
| 0 | 15 |
| 1 | 84 |
| 2 | 56 |
| 3 | 46 |


In [21]:
df2.loc[1] = 666

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  iloc._setitem_with_indexer(indexer, value, self.name)


In [22]:
df2

|   | A   |
|---|-----|
| 0 | 15  |
| 1 | 666 |
| 2 | 56  |
| 3 | 46  |


In [23]:
df

|   | A   |
|---|-----|
| 0 | 15  |
| 1 | 666 |
| 2 | 56  |
| 3 | 46  |
| 4 | 27  |
| 5 | 98  |


In [24]:
id(df)

139705451186576

In [25]:
id(df2)

139705447850384

⚠️ Even though `df` and `df2` are not the same object, `df2` is a subset of `df`, and any modification of `df2` **may** also modify `df`.

In [26]:
%reset -f

In [27]:
import copy
import pandas as pd
import numpy as np

In [28]:
df = pd.DataFrame({"A": np.random.randint(0,100,6)})

In [30]:
df2 = df.loc[0:3].copy()

In [31]:
df2.loc[1] = 666

In [32]:
df2

|   | A   |
|---|-----|
| 0 | 58  |
| 1 | 666 |
| 2 | 67  |
| 3 | 14  |


In [33]:
df

|   | A  |
|---|----|
| 0 | 58 |
| 1 | 63 |
| 2 | 67 |
| 3 | 14 |
| 4 | 94 |
| 5 | 56 |


### If your object is complex/nested and you want to copy all of its elements and sub-elements

In [None]:
import copy

In [None]:
df2 = copy.deepcopy(df.loc[0:3])
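
To see why a deep copy matters for nested objects, here is a minimal sketch using a plain dictionary that holds a list (the names `nested`, `shallow` and `deep` are illustrative). A shallow copy duplicates only the outer container, so the inner list stays shared:

```python
import copy

nested = {"values": [1, 2, 3]}

shallow = copy.copy(nested)    # new dict, same inner list
deep = copy.deepcopy(nested)   # new dict and new inner list

shallow["values"][0] = 42      # also visible through `nested`
deep["values"][1] = 666        # only visible through `deep`

print(nested)   # {'values': [42, 2, 3]}
print(deep)     # {'values': [1, 666, 3]}
```
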

Will pandas return a view or a copy?  
Will modifying a view also modify the original DataFrame?  
Hard to tell...

In [None]:
%reset -f

In [37]:
import copy
import pandas as pd
import numpy as np

In [38]:
df = pd.DataFrame({"A": np.random.randint(0,100,6)})
df2 = df.loc[0:3]
df2.loc[1] = 666
df

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  iloc._setitem_with_indexer(indexer, value, self.name)


|   | A   |
|---|-----|
| 0 | 3   |
| 1 | 666 |
| 2 | 71  |
| 3 | 31  |
| 4 | 31  |
| 5 | 51  |


In [36]:
df = pd.DataFrame({"A": np.random.randint(0,100,6)})
df2 = df.loc[0:3]
df2.loc[1] = "coucou"
df

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_column(loc, value, pi)


|   | A  |
|---|----|
| 0 | 68 |
| 1 | 60 |
| 2 | 34 |
| 3 | 66 |
| 4 | 93 |
| 5 | 8  |
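
In the two runs above, the same indexing pattern modified `df` in one case and not in the other. One way to probe what is going on is to check whether two DataFrames currently share the same underlying buffer, using `np.shares_memory`. This is only a sketch: `shares_column_memory` is a hypothetical helper, it assumes a NumPy-backed numeric column, and the result for the plain slice may depend on the pandas version. Sharing memory also does not guarantee that a write will propagate, since recent pandas versions with Copy-on-Write copy the data at write time.

```python
import numpy as np
import pandas as pd


def shares_column_memory(left, right, column="A"):
    """Hypothetical helper: True if both frames' column buffers overlap in memory."""
    return np.shares_memory(left[column].to_numpy(), right[column].to_numpy())


df = pd.DataFrame({"A": np.arange(6)})

sliced = df.loc[0:3]          # may be backed by df's memory
copied = df.loc[0:3].copy()   # explicit copy: its own buffer

print("slice shares memory with df:", shares_column_memory(df, sliced))
print("copy shares memory with df: ", shares_column_memory(df, copied))
```

Whatever the first line prints, the explicit `.copy()` is the option whose outcome does not depend on pandas internals.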
