## View vs copy

In [1]:
x = [1, 2, 3]

In [2]:
y = x

In [3]:
y[0] = 42

In [4]:
x

[42, 2, 3]

In Python, assigning an existing object to a new variable does not create a new object. The new name is simply a second reference (a pointer) to the original object in memory, so a change made through one name is visible through the other.

In [5]:
id(x)

139790586716064

In [6]:
id(y)

139790586716064

In [7]:
id(x) == id(y)

True
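The identity check above is more commonly written with the `is` operator. The following is a minimal sketch reusing `x` and `y`; the third list `z` is a name introduced here only for contrast, to show the difference between identity and equality.

```python
x = [1, 2, 3]
y = x            # second name bound to the same list object
z = [1, 2, 3]    # a distinct list that happens to hold equal values

same_object = x is y          # identity: equivalent to id(x) == id(y)
equal_values = x == z         # equality: compares contents only
distinct_object = x is not z  # equal contents, but two separate objects

print(same_object, equal_values, distinct_object)  # True True True
```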

### How do you copy an object?

In [8]:
%reset -f 

In [10]:
x = [1, 2, 3]

In [11]:
y = x.copy()

In [12]:
y[0] = 42

In [13]:
x

[1, 2, 3]
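`x.copy()` is one of several equivalent ways to get a shallow copy of a list. As a small sketch (the `copies` list is a name made up for this illustration), each of the following produces a new list object, so mutating it leaves `x` untouched:

```python
import copy

x = [1, 2, 3]

# Four equivalent ways to make a shallow copy of a list
copies = [x.copy(), list(x), x[:], copy.copy(x)]

for c in copies:
    c[0] = 42   # mutate each copy in turn

print(x)        # the original list is unchanged: [1, 2, 3]
```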

### An example that concerns us: pandas

In [14]:
import copy
import pandas as pd
import numpy as np

In [15]:
df = pd.DataFrame({"A": np.random.randint(0,100, 6)})

In [16]:
df

Unnamed: 0,A
0,46
1,53
2,32
3,38
4,37
5,49


In [17]:
df2 = df.loc[0:3]

In [18]:
df2.loc[1] = 666

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  iloc._setitem_with_indexer(indexer, value, self.name)


In [19]:
df2

Unnamed: 0,A
0,46
1,666
2,32
3,38


In [20]:
df

Unnamed: 0,A
0,46
1,666
2,32
3,38
4,37
5,49


In [21]:
id(df)

139789311868752

In [22]:
id(df2)

139790516857424

⚠️ Even though `df` and `df2` are not the same object (their `id` values differ), `df2` is a subset of `df`, and any modification of `df2` **may** also modify `df`.
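Since `id()` cannot reveal whether two DataFrames share data, one possible check is to compare the underlying NumPy buffers with `np.shares_memory`. This is only a sketch: the names `df_slice` and `df_copy` are hypothetical, and the result for the slice depends on the pandas version and on the column dtypes, so no output is asserted for it here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.arange(6)})

df_slice = df.loc[0:3]          # may or may not share data with df
df_copy = df.loc[0:3].copy()    # explicit, independent copy

# id() differs in both cases, so it says nothing about shared data
print(id(df) == id(df_slice), id(df) == id(df_copy))

# np.shares_memory inspects the underlying buffers instead
slice_shares = np.shares_memory(df["A"].to_numpy(), df_slice["A"].to_numpy())
copy_shares = np.shares_memory(df["A"].to_numpy(), df_copy["A"].to_numpy())

print(slice_shares)  # version- and dtype-dependent
print(copy_shares)   # the explicit copy does not share memory
```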

In [23]:
%reset -f

In [24]:
import copy
import pandas as pd
import numpy as np

In [25]:
df = pd.DataFrame({"A": np.random.randint(0,100,6)})

In [26]:
df2 = df.loc[0:3].copy()

In [27]:
df2.loc[1] = 666

In [28]:
df2

Unnamed: 0,A
0,38
1,666
2,28
3,87


In [29]:
df

Unnamed: 0,A
0,38
1,86
2,28
3,87
4,22
5,99


### If your object is complex or deeply nested and you want to copy all of its elements and sub-elements

In [30]:
import copy

In [31]:
df2 = copy.deepcopy(df.loc[0:3])
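The difference between a shallow and a deep copy is easiest to see on a nested list. In this sketch, `nested`, `shallow`, and `deep` are names chosen for the illustration:

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = copy.copy(nested)     # new outer list, same inner lists
deep = copy.deepcopy(nested)    # new outer list and new inner lists

shallow[0][0] = 42              # writes through to nested[0]
deep[1][1] = 99                 # stays local to the deep copy

print(nested)  # [[42, 2], [3, 4]]
print(deep)    # [[1, 2], [3, 99]]
```

Note that, according to the pandas documentation, `DataFrame.copy(deep=True)` copies the data and the index but does not recursively copy Python objects stored inside an `object` column.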

Will pandas return a view or a copy?  
Will modifying a view change the original DataFrame?  
Hard to tell...

In [32]:
%reset -f

In [33]:
import copy
import pandas as pd
import numpy as np

In [34]:
df = pd.DataFrame({"A": np.random.randint(0,100,6)})
df2 = df.loc[0:3]
df2.loc[1] = 666
df

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  iloc._setitem_with_indexer(indexer, value, self.name)


Unnamed: 0,A
0,26
1,666
2,95
3,7
4,42
5,23


In [35]:
df = pd.DataFrame({"A": np.random.randint(0,100,6)})
df2 = df.loc[0:3]
df2.loc[1] = "coucou"
df

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_column(loc, value, pi)


Unnamed: 0,A
0,25
1,53
2,92
3,4
4,61
5,2
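The two cells above behave differently: in the first, the write to `df2` reached `df`; in the second, it did not (presumably because assigning a string forces pandas to build a new column with a different dtype). One way to remove the ambiguity is to state the intent explicitly. A minimal sketch, where `subset` is a name introduced for the illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.arange(6)})

# Intent 1: change the original, so index the original directly
df.loc[1, "A"] = 666

# Intent 2: work on an independent subset, so copy explicitly
subset = df.loc[0:3].copy()
subset.loc[2, "A"] = -1   # does not touch df

print(df)
print(subset)
```

Written this way, neither assignment goes through an intermediate slice of `df`, so there is no view-or-copy question left to answer.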
