# How to filter a data frame without warnings
## When filtering, we often go on to add a column to, or modify, the "new" data frame, which is not really new. This triggers a 'SettingWithCopyWarning'.

In [1]:
import pandas as pd

## A simple data frame to play with

In [2]:
df = pd.DataFrame({'a': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
                   'b': [10, 10, 10, 10, 50, -50, 0, 0, 0, 0],
                   'c': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd']})

In [3]:
df

index,a,b,c
0,1,10,a
1,1,10,a
2,2,10,a
3,2,10,b
4,3,50,b
5,3,-50,b
6,4,0,c
7,4,0,c
8,5,0,d
9,5,0,d


## First try...

In [4]:
df_naive = df[df.a > 3]

In [5]:
df_naive['d'] = 'text'

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


In [6]:
df_naive

index,a,b,c,d
6,4,0,c,text
7,4,0,c,text
8,5,0,d,text
9,5,0,d,text


## Maybe we should use .loc?

In [7]:
df_loc = df.loc[df.a > 3]

In [8]:
df_loc['d'] = 'text'

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


## What about using .loc to create the new column as well?

In [9]:
df_double_loc = df.loc[df.a > 3]

In [10]:
df_double_loc.loc[:, 'd'] = 'text'

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[key] = _infer_fill_value(value)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s
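
The hint in the warning, `.loc[row_indexer,col_indexer] = value`, refers to a single `.loc` assignment made on the original frame rather than on an already filtered result. Below is a minimal sketch of that reading. It rebuilds the same frame so that it runs on its own, and `df_inplace` is a name introduced here for illustration only.

```python
import pandas as pd

# Same data as the frame defined at the top of this notebook.
df = pd.DataFrame({'a': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
                   'b': [10, 10, 10, 10, 50, -50, 0, 0, 0, 0],
                   'c': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd']})

# Work on an explicit copy so the original `df` stays untouched.
df_inplace = df.copy()

# One .loc call with both a row indexer (the boolean mask) and a column label.
# Rows where the mask is False get a missing value in the new column 'd'.
df_inplace.loc[df_inplace['a'] > 3, 'd'] = 'text'

print(df_inplace)
```

Note that this keeps all ten rows, so it answers a slightly different question than filtering: it fits when the goal is to label rows in place rather than to extract them into a separate frame.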


## Let's have a look at what's inside our indexer

In [11]:
df.a > 3

0    False
1    False
2    False
3    False
4    False
5    False
6     True
7     True
8     True
9     True
Name: a, dtype: bool

## So it is not an explicit index, but a Series of boolean values
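
To make the difference concrete, the sketch below compares the boolean mask with the index labels it selects. The names `mask`, `labels` and `same_labels` are introduced here for illustration only.

```python
import pandas as pd

# Same data as the frame defined at the top of this notebook.
df = pd.DataFrame({'a': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
                   'b': [10, 10, 10, 10, 50, -50, 0, 0, 0, 0],
                   'c': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd']})

mask = df['a'] > 3        # boolean Series with one entry per row of df
labels = df[mask].index   # Index holding only the labels of the matching rows

print(mask.dtype)             # bool
print(len(mask), len(labels)) # 10 4
print(labels.tolist())        # [6, 7, 8, 9]

# The same labels can be taken from the index directly,
# without building the intermediate filtered frame.
same_labels = df.index[mask.to_numpy()]
print(same_labels.equals(labels))  # True
```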

## A way that avoids the warning: select rows by their explicit index labels:

In [12]:
# Select rows by the explicit index labels of the matches, not by the boolean mask
df_proper = df.loc[df[df.a > 3].index]

In [13]:
df_proper['d'] = 'text'

In [14]:
df_proper

index,a,b,c,d
6,4,0,c,text
7,4,0,c,text
8,5,0,d,text
9,5,0,d,text
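
Whether a given assignment triggers the warning can also be checked programmatically rather than by eye. The sketch below records warnings with the standard-library `warnings` module. It assumes the name of the warning class contains `SettingWithCopy`, and the result may depend on the installed pandas version, so no expected count is stated. `caught` and `copy_warnings` are names introduced here.

```python
import warnings

import pandas as pd

# Same data as the frame defined at the top of this notebook.
df = pd.DataFrame({'a': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
                   'b': [10, 10, 10, 10, 50, -50, 0, 0, 0, 0],
                   'c': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd']})

with warnings.catch_warnings(record=True) as caught:
    # Record every warning instead of letting filters hide repeats.
    warnings.simplefilter('always')
    df_proper = df.loc[df[df.a > 3].index]
    df_proper['d'] = 'text'

# Keep only warnings whose class name mentions SettingWithCopy (assumed naming).
copy_warnings = [w for w in caught if 'SettingWithCopy' in w.category.__name__]
print(len(copy_warnings))

# The original frame is not modified by the assignment on df_proper.
print('d' in df.columns)  # False
```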


## Or, more readably:

In [15]:
chosen_indexes = df[df.a > 3].index
df_proper_pretty = df.loc[chosen_indexes]

In [16]:
df_proper_pretty['d'] = 'text'

In [17]:
df_proper_pretty

index,a,b,c,d
6,4,0,c,text
7,4,0,c,text
8,5,0,d,text
9,5,0,d,text
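
A commonly used alternative, which is not part of the approach above, is to ask for an independent copy explicitly with `DataFrame.copy()` right after filtering. The following is a minimal sketch under that assumption; `df_copied` is a name introduced here for illustration.

```python
import pandas as pd

# Same data as the frame defined at the top of this notebook.
df = pd.DataFrame({'a': [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
                   'b': [10, 10, 10, 10, 50, -50, 0, 0, 0, 0],
                   'c': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'd']})

# Filter with the boolean mask, then make the result an independent frame.
df_copied = df[df['a'] > 3].copy()

# Adding a column now clearly targets the copy, not the original frame.
df_copied['d'] = 'text'

print(df_copied)
print('d' in df.columns)  # False: the original frame has no column 'd'
```

The explicit copy states the intent directly: the filtered frame is meant to live on its own, and later changes to it should not be expected to reach `df`.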
