In [None]:
# Last amended: 3rd July, 2020
# My folder: OneDrive/python/
# Ref: https://stackoverflow.com/a/61240274   ==> Check whether a particular dataframe is a view or a copy
# Ref: https://stackoverflow.com/a/39102097
# Ref: https://stackoverflow.com/a/53954986

## How to set values in a pandas DataFrame: the rule
In a pandas DataFrame, the values in one or more cells can be changed by filtering the DataFrame and assigning to the selected cells. There is one recommended way to do this: filter the DataFrame <b><i>with <u>.loc</u> AND avoid chained operations.</i></b><br><br>
If you do not follow this advice, the assignment may or may not take effect. It takes effect if the filtering step returns a view of the original DataFrame: the values are set within the view and are therefore reflected in the original. If the filtering step instead returns a copy, the changes are made to the copy and the original DataFrame is left unchanged. Using .loc AND avoiding chained operations ensures that values are always set on the original DataFrame.<br><br>
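
To make the view/copy distinction concrete, here is a minimal sketch using a plain NumPy array rather than a DataFrame (an assumption made for simplicity, since NumPy's rules are fixed while pandas' are not). In NumPy, basic slicing returns a view and boolean-mask indexing returns a copy; the array and variable names are illustrative only.

```python
import numpy as np

arr = np.arange(6)            # array([0, 1, 2, 3, 4, 5])

# Basic slicing returns a view: it shares memory with `arr`.
view = arr[1:4]
# Boolean-mask indexing returns a copy: it owns separate memory.
copy = arr[arr > 2]

view_shares = np.shares_memory(arr, view)
copy_shares = np.shares_memory(arr, copy)

view[0] = 100                 # writes through to arr[1]
copy[0] = -1                  # changes only the copy; arr[3] is untouched

print(view_shares, copy_shares)
print(arr)
```

Writing to the view changes the original array, while writing to the copy does not. A chained assignment on a DataFrame can fall on either side of this line.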
But when does filtering create a view and when does it create a copy? Short answer: it is hard to predict. The pandas documentation says this:<br>

  >  Outside of simple cases, it’s very hard to predict whether it will return a view or a copy (it depends on the memory layout of the array, about which pandas makes no guarantees)

<br>What does this mean in practice? Consider the following.<br><br>
Suppose a DataFrame has a 'Gender' column with the values "Male" and "Female", and you want to map them to 0 and 1. Using filtering operations, here are four ways one might think of:<br>

`df.Gender[df["Gender"] == "Male"]          = 0      # Not correct. It is chained and does not use .loc
df[df["Gender"] == "Male"]["Gender"]        = 0      # Not correct, for the same reasons
df.loc[df["Gender"] == "Male"]["Gender"]    = 0      # Not correct. Uses .loc but is still chained
df.loc[df["Gender"] == "Male", "Gender"]    = 0      # Correct way to set values`

<br>The example below illustrates all four cases.<br>



In [52]:
# 1.0 Call libraries
import pandas as pd
import numpy as np

In [59]:
# 1.1 Show outputs from multiple commands in a cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [61]:
# 2.0 Create a DataFrame and two copies
df = pd.DataFrame(
                  np.random.choice(10,(6,3)),
                  columns = ["A", "B", "C"]
                 )
# 2.1 Add one categorical column
df["Gender"] = ["Male", "Female"] * 3

# 2.2 Our data and two deep copies of dataframe
df
df1 = df.copy()
df2 = df.copy()

Unnamed: 0,A,B,C,Gender
0,6,9,8,Male
1,6,0,9,Female
2,2,8,7,Male
3,0,4,8,Female
4,7,3,1,Male
5,2,7,1,Female


In [55]:
# 3.0 A chained value-setting operation.
#     You get SettingWithCopyWarning.
#     Here the values happen to be set in df1,
#     but that outcome is not guaranteed
df1.Gender[df1["Gender"] == "Male"] = 0
df1

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0,A,B,C,Gender
0,8,8,0,0
1,6,2,8,Female
2,0,6,4,0
3,8,1,6,Female
4,0,7,4,0
5,0,1,6,Female


In [65]:
# 3.1 Another chained value-setting operation
#     You get SettingWithCopyWarning
#     Note that the values get set on a copy
#     and not on the original data
df2[df2["Gender"] == "Male"]["Gender"]
df2[df2["Gender"] == "Male"]["Gender"] = 0
df2

0    Male
2    Male
4    Male
Name: Gender, dtype: object

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  


Unnamed: 0,A,B,C,Gender
0,6,9,8,Male
1,6,0,9,Female
2,2,8,7,Male
3,0,4,8,Female
4,7,3,1,Male
5,2,7,1,Female
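
Why the assignment above is lost can be seen by splitting the chain into its two steps. The sketch below uses a small made-up DataFrame (not the data above) and suppresses warnings so that it behaves the same whether or not the installed pandas version emits SettingWithCopyWarning; `tmp` is an illustrative name.

```python
import warnings

import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4],
    "Gender": ["Male", "Female", "Male", "Female"],
})

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # Step 1: filtering with a boolean mask builds a new DataFrame
    tmp = df[df["Gender"] == "Male"]
    # Step 2: the assignment acts on that new DataFrame, not on df
    tmp["Gender"] = 0

print(tmp)   # Gender is 0 in the filtered rows
print(df)    # original is unchanged
```

In the chained form the intermediate object has no name, so the modified copy is discarded as soon as the statement finishes.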


In [66]:
# 3.2 Another chained value-setting operation, though it uses .loc
#     You get SettingWithCopyWarning
#     Note that the values get set on a copy
#     and not on the original data
df2.loc[df2["Gender"] == "Male"]["Gender"]
df2.loc[df2["Gender"] == "Male"]["Gender"] = 0
df2   # Same as original

0    Male
2    Male
4    Male
Name: Gender, dtype: object

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  


Unnamed: 0,A,B,C,Gender
0,6,9,8,Male
1,6,0,9,Female
2,2,8,7,Male
3,0,4,8,Female
4,7,3,1,Male
5,2,7,1,Female
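
The difference between `.loc[rows]["col"] = value` and `.loc[rows, "col"] = value` comes down to which Python methods are called. The sketch below does not use pandas at all: `Recorder` is a hypothetical stand-in class that only logs the indexing calls it receives, to show how Python dispatches the two forms.

```python
class Recorder:
    """Hypothetical stand-in for an indexer; it only logs the calls it receives."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __getitem__(self, key):
        self.log.append((self.name, "getitem", key))
        # Return a new object, the way a filter may return a copy
        return Recorder(self.name + "'", self.log)

    def __setitem__(self, key, value):
        self.log.append((self.name, "setitem", key))


log = []
obj = Recorder("obj", log)

obj["rows"]["col"] = 0    # chained: getitem on obj, then setitem on the temporary
obj["rows", "col"] = 0    # single call: setitem on obj itself

for entry in log:
    print(entry)
```

In the chained form the `setitem` call lands on the temporary object (`obj'`); in the single-call form it lands on `obj` directly, which is why pandas can guarantee that the second form modifies the original.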


In [58]:
# 3.3 Correct way to set values, as per pandas advice
#     No warnings appear
df2.loc[df2["Gender"] == "Male", "Gender"] = 0 
df2

Unnamed: 0,A,B,C,Gender
0,8,8,0,0
1,6,2,8,Female
2,0,6,4,0
3,8,1,6,Female
4,0,7,4,0
5,0,1,6,Female
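
The cell above sets only "Male" to 0. A minimal sketch of completing the mapping with the same `.loc` pattern follows. It uses a small made-up DataFrame; the name `is_male` and the explicit `dtype=object` are assumptions of this sketch, the latter chosen so that integers can be written into the column regardless of how the installed pandas version stores strings.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [6, 6, 2, 0],
    "Gender": pd.Series(["Male", "Female", "Male", "Female"], dtype=object),
})

# Compute the mask once, before any values are overwritten
is_male = df["Gender"] == "Male"

# One .loc call per value: row selector and column label in the same brackets
df.loc[is_male, "Gender"] = 0
df.loc[~is_male, "Gender"] = 1

# The column now holds only integers, so it can be converted
df["Gender"] = df["Gender"].astype(int)
print(df)
```

Computing the mask before the first assignment matters: once "Male" has been replaced by 0, a fresh comparison against "Male" would no longer identify those rows.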


In [None]:
############## Done ##############