In [None]:
import numpy as np
import pandas as pd

# Pandas Views vs Copies: Understanding the `SettingWithCopyWarning` Message

## The Warning Explained

The warning is there to tell users that they need to be explicit about what they are doing, because Pandas cannot work it out for them!

Generally, if Pandas can't tell whether you are performing a `set` operation on a copy or on a view, it emits the warning.

After the warning, the `set` may or may not actually work as expected, because it may have occurred on a copy when a view was intended, or the other way around.

Theoretically, if you always know what you are doing, you could turn this warning off.

You could also follow different patterns to avoid the warning and provide more clarity in your code.

This [article explains](https://www.dataquest.io/blog/settingwithcopywarning/) it very well.

You can get another perspective from [this article](https://www.practicaldatascience.org/html/views_and_copies_in_pandas.html) and in [this video](https://www.youtube.com/watch?v=4R4WsDJ-KVc).

### Some Rules to Consider

Here are the rules ([see this Stack Overflow question](https://stackoverflow.com/questions/23296282/what-rules-does-pandas-use-to-generate-a-view-vs-a-copy)):

- All operations generate a copy.
- If `inplace=True` is provided, the operation modifies the object in place; only some operations support this.
- An indexer that sets, e.g. `.loc`/`.iloc`/`.iat`/`.at`, sets in place.
- An indexer that gets on a single-dtyped object is almost always a view. Depending on the memory layout it may not be, which is why this is not reliable. This is mainly for efficiency. (The example in the linked question uses `.query`, which always returns a copy because it is evaluated by numexpr.)
- An indexer that gets on a multiple-dtyped object is always a copy.
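The first two rules can be sketched with a small example. The frame `df_demo` and its column names are made up for illustration, and `sort_values` is used only because it is one of the methods that accepts `inplace=True`.

```python
import pandas as pd

# Hypothetical frame used only for this illustration.
df_demo = pd.DataFrame({"a": [3, 1, 2], "b": [30, 10, 20]})

# A regular operation returns a new object and leaves the original alone.
sorted_df = df_demo.sort_values("a")
print(sorted_df is df_demo)     # False: a different object
print(df_demo["a"].tolist())    # [3, 1, 2]: original order is unchanged

# With inplace=True the same method reorders df_demo itself.
df_demo.sort_values("a", inplace=True)
print(df_demo["a"].tolist())    # [1, 2, 3]
```

Assigning the result to a new name, as in the first call, keeps it obvious which object holds the changed data.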

### View and Copy aren't consistent, it depends on the memory layout...

An indexer-get operation on a multi-dtyped object will always return a copy. However, mainly for efficiency, an indexer-get operation on a single-dtyped object almost always returns a view. The caveat is that this depends on the memory layout of the object and is not guaranteed.
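One way to inspect this yourself is `np.shares_memory`, which reports whether two NumPy arrays overlap in memory. The sketch below assumes a hypothetical single-dtype frame `df_num`. Whether the column selection shares memory with the frame depends on the Pandas version and the memory layout, so that result is printed without an expected value. An explicit `.copy()` does not share memory with the frame.

```python
import numpy as np
import pandas as pd

# Hypothetical single-dtype frame used only for this illustration.
df_num = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

selected = df_num["x"]          # indexer get: view or copy, not guaranteed
explicit = df_num["x"].copy()   # explicit, independent copy

# Compare each Series' buffer against the frame's underlying array.
parent_values = df_num.to_numpy()
selection_shares = np.shares_memory(parent_values, selected.to_numpy())
copy_shares = np.shares_memory(parent_values, explicit.to_numpy())

print("selection shares memory with parent:", selection_shares)
print("explicit copy shares memory with parent:", copy_shares)  # False
```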

So generally speaking, it can be challenging as a developer to know what is going on, so it's safer not to execute a `set` on an indexed object.

### Some Other Good Advice

From [this article](https://www.dataquest.io/blog/settingwithcopywarning/):

> The trick is to learn to identify chained indexing and avoid it at all costs. If you want to change the original, use a single assignment operation. If you want a copy, make sure you force pandas to do just that. This will save time and make your code water-tight.
> Also note that even though the SettingWithCopyWarning will only occur when you are setting, it’s best to avoid chained indexing for gets too. Chained operations are slower and will cause problems if you decide to add assignment operations later on.

The complexity underlying the SettingWithCopyWarning is one of the few rough edges in the pandas library. Its roots are very deeply embedded in the library and should not be ignored. In Jeff Reback’s own words there “are no cases that I am aware [of] that you should actually ignore this warning. … If you do certain types of indexing it will never work, others it will work. You are really playing with fire.”

Fortunately, addressing the warning only requires you to identify chained assignment and fix it. If there’s just one thing to take away from all this, it’s that.
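As a sketch of what identifying and fixing chained assignment can look like, the example below uses a hypothetical frame `df_chain`. The chained form appears only as a comment because its effect depends on the Pandas version; the single `.loc` call and the explicit `.copy()` are the two patterns recommended in the advice above.

```python
import pandas as pd

# Hypothetical frame used only for this illustration.
df_chain = pd.DataFrame({
    "col1": [0, 1, 2, 3, 4],
    "col2": ["a", "b", "c", "d", "e"],
})

# Chained assignment: two indexing steps, so the set may land on a
# temporary copy and be lost. Avoid this form.
# df_chain[df_chain["col1"] > 2]["col2"] = "z"

# Single assignment: one .loc call selects rows and column together,
# so the set is applied to df_chain itself.
df_chain.loc[df_chain["col1"] > 2, "col2"] = "z"
print(df_chain["col2"].tolist())   # ['a', 'b', 'c', 'z', 'z']

# If an independent subset is what you want, copy it explicitly first.
subset = df_chain[df_chain["col1"] > 2].copy()
subset["col2"] = "y"
print(df_chain["col2"].tolist())   # still ['a', 'b', 'c', 'z', 'z']
```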

## Which Slices Return Views?

### Column from DataFrame -> Series

In [None]:
dict_table = {
    "col1": [0,1,2,3,4], 
    "col2": ["a", "b", "c", "d", "e"],
    "col3": [True, False, True, True, False]
}
df_1 = pd.DataFrame(dict_table)

In [None]:
df_1

In [None]:
col2 = df_1["col2"]

In [None]:
# Is this a view or a copy?
col2[0] = "z"

In [None]:
col2[0]

In [None]:
df_1["col2"][0]

So as you can see, slicing out a column in this manner provides a view, so modifying the Series also modifies the same cell back in the original DataFrame. If you as the developer understood that you received a view, then this would all work as expected. Pandas wants to make sure you know what you are doing, so it warns you.

## The Better Way to Do This

Because you can't dependably know whether what was returned is a copy or a view, it's best to avoid assigning to it altogether, because you won't know whether the assignment will or won't affect the root DataFrame.

Therefore, if your intention is to update the copy, make a copy and be explicit.

If you want to change the original, do so using `.loc`.

### Making a Change to a Copy

In [None]:
dict_table = {
    "col1": [0,1,2,3,4], 
    "col2": ["a", "b", "c", "d", "e"],
    "col3": [True, False, True, True, False]
}
df_2 = pd.DataFrame(dict_table)

In [None]:
col2 = df_2["col2"].copy()

In [None]:
col2[0] = "z"

In [None]:
col2

In [None]:
df_2

### Making a Change to the Original

In [None]:
dict_table = {
    "col1": [0,1,2,3,4], 
    "col2": ["a", "b", "c", "d", "e"],
    "col3": [True, False, True, True, False]
}
df_3 = pd.DataFrame(dict_table)

In [None]:
df_3.loc[0, "col2"] = "z"

In [None]:
df_3