In [1]:
from __future__ import division # using Python 2.7
import numpy as np
import pandas as pd 

## Read Data into Pandas DataFrame

In [2]:
f='/Users/KimiZ/GRBs2/analysis/LAT/bn080916009/BXA/GBMwLAT/grbm/grbm_-01-_L_.txt'
colnames = ['prob', 'cstat', 'alpha', 'beta', 'enaught', 'norm']

# Read a table of fixed-width formatted lines into DataFrame
data = pd.read_fwf(f, names=colnames)
data.head()

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,774.965035,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,774.847617,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,774.783027,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,774.752918,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,774.567371,-1.174649,-2.390003,880.59139,-1.775047


To make it easy to retrieve the original dataframe if we make undesirable changes to it, we make a copy.
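
As a rough sketch of why `.copy()` matters here: a plain assignment only creates a second name for the same dataframe, while `.copy()` creates an independent one. The dataframe and its values below are invented for illustration.

```python
import pandas as pd

original = pd.DataFrame({'cstat': [774.9, 337.8, 311.3]})  # invented values

alias = original            # plain assignment: a second name for the SAME object
backup = original.copy()    # .copy(): an independent dataframe

original.loc[0, 'cstat'] = -1.0   # an "undesirable change"

print(alias.loc[0, 'cstat'])    # follows the change
print(backup.loc[0, 'cstat'])   # still holds the old value

original = backup.copy()        # restore the original contents
```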

In [3]:
data.describe()

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
count,10296.0,10296.0,10296.0,10296.0,10296.0,10296.0
mean,9.71251e-05,395.004846,-1.060994,-2.235769,765.699156,-1.776226
std,0.000129185,114.797866,0.083314,0.086807,545.741127,0.047967
min,1.4648040000000001e-99,311.341958,-1.332401,-2.740877,133.833608,-1.925408
25%,3.8809249999999996e-26,316.824998,-1.099635,-2.250655,501.571732,-1.799328
50%,1.980437e-07,337.816511,-1.044211,-2.207984,567.740581,-1.766239
75%,0.0002322696,430.402958,-1.016682,-2.190644,763.720345,-1.750132
max,0.0004849745,774.965035,-0.557594,-2.022935,4542.674944,-1.466031


### Let's replace all cstat values above the mean value of 395 with `nan` using `numpy.nan`

In [4]:
data[data.cstat > data.cstat.mean()].cstat = np.nan

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self[name] = value


**We get the following Warning:**

```
/Users/KimiZ/anaconda/lib/python2.7/site-packages/pandas/core/generic.py:2773: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self[name] = value
  
```

**In a minute we will discuss what this Warning means. But for now, let's check out our DataFrame**

In [5]:
data.head()

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,774.965035,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,774.847617,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,774.783027,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,774.752918,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,774.567371,-1.174649,-2.390003,880.59139,-1.775047


**NOTHING CHANGED!!**

In [6]:
# HERE IS ANOTHER EXAMPLE OF THE SAME:
data[data.cstat > data.cstat.mean()]['cstat'] = np.nan

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  


In [7]:
data.head()

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,774.965035,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,774.847617,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,774.783027,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,774.752918,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,774.567371,-1.174649,-2.390003,880.59139,-1.775047


**AGAIN, NOTHING CHANGED!!**

## What is happening?

Pandas is uncertain whether you are trying to assign values to a view or a copy of the dataframe, so it is warning you. 


<div class="alert alert-block alert-danger">
/Users/KimiZ/anaconda/lib/python2.7/site-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: <br></br> 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
<br></br><br></br>
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
</div>

## What exactly does this mean?

You could be setting values on a temporary object (which, in our case, we are!) that will be thrown out immediately afterward. That is exactly what `SettingWithCopy` is warning us about!
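
A minimal sketch of the fix the warning suggests, assuming a small stand-in dataframe (`toy` and its values are invented for illustration, not the real data): the row selection and the assignment happen in one `.loc` call, so no temporary object is involved.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real table; values are invented for illustration.
toy = pd.DataFrame({'cstat': [774.9, 600.0, 337.8, 311.3],
                    'alpha': [-1.09, -1.25, -1.04, -1.02]})

# Build the boolean mask first, then select rows and column and assign
# in a single .loc call, so pandas writes into `toy` itself.
mask = toy['cstat'] > toy['cstat'].mean()
toy.loc[mask, 'cstat'] = np.nan

n_replaced = int(toy['cstat'].isna().sum())
print(toy)
```

Here the mean is 506.0, so the first two rows are replaced and the other two are left alone.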


## How to understand what is happening.

Say we have a dataframe, `dfmi`:

In [8]:
dfmi = pd.DataFrame([list('abcd'),
                      list('efgh'),
                      list('ijkl'),
                      list('mnop')],
                     columns=pd.MultiIndex.from_product([['one', 'two'],
                                                         ['first', 'second']]))

dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,b,c,d
1,e,f,g,h
2,i,j,k,l
3,m,n,o,p


If we want to assign new values to the column `('one', 'second')`, **the proper way** to do this would be to use:
```python
dfmi.loc[:, ('one', 'second')] = value
```
Which under the hood is the same as:
```python
dfmi.loc.__setitem__((slice(None), ('one', 'second')), value)
```
This is a single `__setitem__` call on the `dfmi.loc` indexer, so pandas knows it is meant to change the original dataframe.

**The improper way** to do this, which is what we did, is:
```python
dfmi['one']['second'] = value
```

Which under the hood is the same as:
```python
dfmi.__getitem__('one').__setitem__('second', value)
```

Notice the `__getitem__` is used first, to retrieve the object. This is where the problems crop up. 

Outside of simple cases, it’s very hard to predict whether this `__getitem__` will return a view or a copy (it depends on the memory layout of the array, about which pandas makes no guarantees), and therefore whether the `__setitem__` will modify `dfmi` or a temporary object that gets thrown out immediately afterward. That’s what `SettingWithCopy` is warning you about!
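
The call order described above can be made visible with a small pure-Python sketch. `Tracer` is a hypothetical class written only for this illustration (it is not part of pandas); it records which special methods Python calls and on which object.

```python
class Tracer(object):
    """Hypothetical stand-in for a dataframe that logs the special
    methods Python calls on it."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __getitem__(self, key):
        self.log.append('%s.__getitem__(%r)' % (self.name, key))
        # Hand back a new object, as pandas may do when it returns a copy.
        return Tracer('temporary', self.log)

    def __setitem__(self, key, value):
        self.log.append('%s.__setitem__(%r, %r)' % (self.name, key, value))


calls = []
frame = Tracer('frame', calls)

frame['one']['second'] = 'z'    # chained: __getitem__, then __setitem__ on the result
frame['one', 'second'] = 'z'    # one key, one __setitem__ on the original object

for line in calls:
    print(line)
```

In the chained form the assignment lands on whatever `__getitem__` returned, not on `frame`. The single-key form plays the role of the `.loc` spelling: one `__setitem__` call that receives the full row and column selection.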



## Why am I mentioning this?
To save you from many headaches that I, and many others, have suffered from ignoring this warning. Why did I ignore the warning? Because the output was what I desired, *at the time*. 
When you run your code in automated scripts, these warnings are easy to miss. If you continue to use this poor pandas practice, it will **certainly cause major problems** for you at some point!




__For more information on this topic, see:__
- Documentation on indexing and selection: [Returning a view versus a copy](http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-view-versus-copy)
- Stack Overflow: [What is the point of views in pandas if it is undefined whether an indexing operation returns a view or a copy?](http://stackoverflow.com/questions/34884536/what-is-the-point-of-views-in-pandas-if-it-is-undefined-whether-an-indexing-oper)

# Examples, with the format that is used under the hood:

In [9]:
DFMI = dfmi.copy()  # make a copy for easy fix.

In [10]:
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,b,c,d
1,e,f,g,h
2,i,j,k,l
3,m,n,o,p


## The correct way:

In [11]:
dfmi.loc[:, ('one', 'second')] = 'z'
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,z,c,d
1,e,z,g,h
2,i,z,k,l
3,m,z,o,p


In [12]:
dfmi.loc.__setitem__((slice(None), ('one', 'second')), 'm')
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,m,c,d
1,e,m,g,h
2,i,m,k,l
3,m,m,o,p


In [13]:
dfmi.loc.__setitem__((slice(None), ('one', 'second')), 3.14159) # floats work as well!
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,3.14159,c,d
1,e,3.14159,g,h
2,i,3.14159,k,l
3,m,3.14159,o,p


In [14]:
dfmi.loc.__setitem__((slice(0), ('one', 'second')), 'b') 
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,b,c,d
1,e,3.14159,g,h
2,i,3.14159,k,l
3,m,3.14159,o,p


In [15]:
dfmi.loc.__setitem__((slice(3), ('one', 'second')), 'joy') 
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,joy,c,d
1,e,joy,g,h
2,i,joy,k,l
3,m,joy,o,p
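
A note on the `slice(0)` and `slice(3)` calls above: `.loc` slices by label and includes the end label, which is why `slice(3)` reached row 3. The sketch below compares this with position-based `.iloc` on an invented four-row dataframe.

```python
import pandas as pd

df = pd.DataFrame({'x': list('abcd')})   # default integer labels 0..3

by_label = df.loc[slice(3), 'x']      # same as df.loc[:3, 'x']; label 3 is included
by_position = df.iloc[slice(3), 0]    # same as df.iloc[:3, 0]; position 3 is excluded

print(len(by_label), len(by_position))
```

With default integer labels, `.loc[:3]` selects four rows and `.iloc[:3]` selects three.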


In [16]:
dfmi.loc.__setitem__((3, ('one', 'second')), 'dog') 
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,joy,c,d
1,e,joy,g,h
2,i,joy,k,l
3,m,dog,o,p


In [17]:
dfmi.loc.__setitem__((2, ('one', 'second')), 'cat') 
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,joy,c,d
1,e,joy,g,h
2,i,cat,k,l
3,m,dog,o,p


In [18]:
dfmi.loc.__setitem__((1, ('one', 'second')), np.nan) 
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,joy,c,d
1,e,,g,h
2,i,cat,k,l
3,m,dog,o,p


In [19]:
dfmi.loc.__setitem__((0, ('one', 'second')), 3.14159) 
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,3.14159,c,d
1,e,,g,h
2,i,cat,k,l
3,m,dog,o,p



## The wrong way:

In [20]:
dfmi = DFMI.copy()

In [21]:
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,b,c,d
1,e,f,g,h
2,i,j,k,l
3,m,n,o,p


In [22]:
dfmi.__getitem__('one').__setitem__('first', 'car')  # works
dfmi

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,car,b,c,d
1,car,f,g,h
2,car,j,k,l
3,car,n,o,p


**So, it actually worked! But now watch!**

In [23]:
dfmi.__getitem__('one').__setitem__('first', 3.14159) # does not work
dfmi

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,car,b,c,d
1,car,f,g,h
2,car,j,k,l
3,car,n,o,p


**This way did not work! It seems that if it's the same data type, it'll change the original object. Otherwise, it writes to a temporary object and then throws it away.**

In [24]:
dfmi.__getitem__('one').__setitem__('first', 3.14159)
dfmi

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,car,b,c,d
1,car,f,g,h
2,car,j,k,l
3,car,n,o,p


In [25]:
print(dfmi.__getitem__('one'))
print('--'*20)
print(dfmi.__getitem__('one').__getitem__('first'))
print('--'*20)
print(dfmi.__getitem__('one').__getitem__('first').__getitem__(0))

  first second
0   car      b
1   car      f
2   car      j
3   car      n
----------------------------------------
0    car
1    car
2    car
3    car
Name: first, dtype: object
----------------------------------------
car


In [26]:
# same dtypes work for the full column and single items.

dfmi.__getitem__('one').__getitem__('first').__setitem__(0, 'kim')  # works
dfmi

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  exec(code_obj, self.user_global_ns, self.user_ns)


Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,kim,b,c,d
1,car,f,g,h
2,car,j,k,l
3,car,n,o,p


In [27]:
# different dtypes work when you select only a SINGLE item. 

dfmi.__getitem__('one').__getitem__('first').__setitem__(1, 3.14159) # works
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,kim,b,c,d
1,3.14159,f,g,h
2,car,j,k,l
3,car,n,o,p


**If you select the EXACT item in the pandas dataframe, like we did above, this method will replace the value with whatever you pass it. However, it won't work for an entire column of values.**

In [28]:
dfmi = DFMI.copy()
dfmi

Unnamed: 0_level_0,one,one,two,two
Unnamed: 0_level_1,first,second,first,second
0,a,b,c,d
1,e,f,g,h
2,i,j,k,l
3,m,n,o,p


In [29]:
type(dfmi.one) # dataframe itself

pandas.core.frame.DataFrame

In [30]:
dfmi.one

Unnamed: 0,first,second
0,a,b
1,e,f
2,i,j
3,m,n


In [31]:
type(dfmi.one.first) # instance method

instancemethod

In [32]:
dfmi.one.first

<bound method DataFrame.first of   first second
0     a      b
1     e      f
2     i      j
3     m      n>

In [33]:
type(dfmi.one['first']) # series, so a column

pandas.core.series.Series

In [34]:
dfmi.one['first']

0    a
1    e
2    i
3    m
Name: first, dtype: object

In [35]:
type(dfmi['one']['first']) # series, so a column

pandas.core.series.Series

In [36]:
dfmi['one']['first']  # same as dfmi.one['first']

0    a
1    e
2    i
3    m
Name: first, dtype: object
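
The attribute-versus-bracket difference seen above is a name clash: when a column label is also the name of a DataFrame method, attribute access returns the method. A small sketch with an invented dataframe whose column is deliberately named `count`:

```python
import pandas as pd

df = pd.DataFrame({'count': [1, 2, 3], 'alpha': [0.1, 0.2, 0.3]})

attr = df.count       # the DataFrame.count method, not the column
col = df['count']     # the column, as a Series

print(callable(attr), isinstance(col, pd.Series))

# 'alpha' does not clash with any method name, so both spellings agree.
print(df.alpha.equals(df['alpha']))
```

Bracket indexing always looks up the column, so it is the safer spelling when column names are not under your control.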

## Back to my dataframe

In [37]:
DAT = data.copy()

In [38]:
data.head()

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,774.965035,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,774.847617,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,774.783027,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,774.752918,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,774.567371,-1.174649,-2.390003,880.59139,-1.775047


In [39]:
data.__getitem__('cstat').head()

0    774.965035
1    774.847617
2    774.783027
3    774.752918
4    774.567371
Name: cstat, dtype: float64

In [40]:
data.__getitem__('cstat').__setitem__(slice(0,3), 'nan')  # works
data.head()

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,774.753,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,774.567,-1.174649,-2.390003,880.59139,-1.775047


In [41]:
data.__getitem__('cstat').__setitem__(0, 'k') # works
data.head()

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,k,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,774.753,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,774.567,-1.174649,-2.390003,880.59139,-1.775047


In [42]:
data.__getitem__('cstat').__setitem__(slice(None), np.nan) # works
data.head()

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,,-1.174649,-2.390003,880.59139,-1.775047


In [43]:
data.__getitem__('cstat').__setitem__([0,2,4], 3.14195) # works
data.head(10)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,3.14195,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,3.14195,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,3.14195,-1.174649,-2.390003,880.59139,-1.775047
5,1.994388e-99,,-1.136761,-2.095912,365.431802,-1.734409
6,2.1687080000000002e-99,,-1.146912,-2.245102,1460.999367,-1.89897
7,2.5126040000000004e-99,,-1.240119,-2.485094,1459.503585,-1.838669
8,2.581055e-99,,-1.177917,-2.627564,2259.052181,-1.892149
9,2.6644350000000003e-99,,-1.312911,-2.46113,3234.960797,-1.922764


In [44]:
data.__setitem__('cstat', 99) # works
data.head()

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,99,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,99,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,99,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,99,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,99,-1.174649,-2.390003,880.59139,-1.775047


In [45]:
data.__getitem__('cstat').__setitem__(0, np.nan) 
data.head()

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._setitem_with_indexer(indexer, value)


Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,99.0,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,99.0,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,99.0,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,99.0,-1.174649,-2.390003,880.59139,-1.775047


In [46]:
data.__getitem__('cstat').__setitem__(slice(2,4), 'kim') # works
data.head()

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,99,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,kim,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,kim,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,99,-1.174649,-2.390003,880.59139,-1.775047


In [47]:
data.__getitem__('cstat').__setitem__(slice(None), 'kim') # works
data.head()

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,kim,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,kim,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,kim,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,kim,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,kim,-1.174649,-2.390003,880.59139,-1.775047


In [48]:
data = DAT.copy()

#data.cstat = np.nan
data.head()

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,774.965035,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,774.847617,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,774.783027,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,774.752918,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,774.567371,-1.174649,-2.390003,880.59139,-1.775047


In [49]:
data[data.cstat > 400.0]['cstat'] = np.nan
data.head()

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,774.965035,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,774.847617,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,774.783027,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,774.752918,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,774.567371,-1.174649,-2.390003,880.59139,-1.775047


In [50]:
data[data.cstat > 400.0]['cstat'].head()

0    774.965035
1    774.847617
2    774.783027
3    774.752918
4    774.567371
Name: cstat, dtype: float64

In [51]:
type(data[data.cstat > 400.0]['cstat'])

pandas.core.series.Series

In [52]:
data[data.cstat > 400.0]['cstat'].shape

(3067,)

In [53]:
data.shape

(10296, 6)

The following works when a list of numbers is passed to `__setitem__`. 
```python
data.__getitem__('cstat').__setitem__([2,5,7], np.nan) 
```
This means we can pass the index labels of the rows that satisfy a condition.

In [54]:
data = DAT.copy()

# condition
indices = data[data.cstat > 400.0]['cstat'].index

data.__getitem__('cstat').__setitem__(indices, np.nan) # works
data.head(10)

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
0,1.4648040000000001e-99,,-1.094847,-2.334929,607.719256,-1.72491
1,1.5514360000000002e-99,,-1.251852,-2.282142,1739.260013,-1.814916
2,1.6003560000000001e-99,,-1.222781,-2.69232,2722.137812,-1.883588
3,1.622601e-99,,-1.328367,-2.237835,1735.107763,-1.88535
4,1.7781150000000002e-99,,-1.174649,-2.390003,880.59139,-1.775047
5,1.994388e-99,,-1.136761,-2.095912,365.431802,-1.734409
6,2.1687080000000002e-99,,-1.146912,-2.245102,1460.999367,-1.89897
7,2.5126040000000004e-99,,-1.240119,-2.485094,1459.503585,-1.838669
8,2.581055e-99,,-1.177917,-2.627564,2259.052181,-1.892149
9,2.6644350000000003e-99,,-1.312911,-2.46113,3234.960797,-1.922764


In [55]:
data.tail(10) # worked

Unnamed: 0,prob,cstat,alpha,beta,enaught,norm
10286,0.000226,312.867029,-1.04514,-2.195407,549.164918,-1.764438
10287,0.000261,312.584521,-1.030965,-2.205793,537.323704,-1.761381
10288,0.000211,313.00715,-1.04636,-2.207474,559.954113,-1.761374
10289,0.000314,312.212658,-1.016767,-2.195301,493.162455,-1.748218
10290,0.000227,312.863434,-1.006208,-2.188166,475.251252,-1.742081
10291,0.000332,312.097174,-1.022911,-2.199779,520.533772,-1.752404
10292,0.00025,312.66636,-1.020065,-2.190011,488.601911,-1.74757
10293,0.000231,312.827938,-1.022652,-2.199277,504.114026,-1.748053
10294,0.000302,312.288025,-1.044113,-2.20178,565.135973,-1.764887
10295,0.000316,312.198532,-1.031711,-2.194697,529.200351,-1.754961
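
The same replacement can also be written without chained indexing. The sketch below uses an invented `toy` dataframe and shows two options: passing the index labels to `.loc`, or using `Series.mask`, which replaces values where the condition is true (with NaN by default).

```python
import numpy as np
import pandas as pd

# Invented values standing in for the real cstat column.
toy = pd.DataFrame({'cstat': [774.9, 430.4, 337.8, 311.3]})

# Index labels of the rows that satisfy the condition.
indices = toy.index[toy['cstat'] > 400.0]

# Option 1: hand the labels and the column name to .loc in one call.
via_loc = toy.copy()
via_loc.loc[indices, 'cstat'] = np.nan

# Option 2: Series.mask replaces values where the condition is True.
via_mask = toy.copy()
via_mask['cstat'] = via_mask['cstat'].mask(via_mask['cstat'] > 400.0)

print(via_loc['cstat'].isna().tolist())
```

Both options modify the dataframe they are called on directly, so neither should trigger `SettingWithCopyWarning`.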
