In [1]:
from pandas import Series, DataFrame
import pandas as pd
import numpy as np

# Essential Functionality


In this section, I’ll walk you through the fundamental mechanics of interacting with
the data contained in a Series or DataFrame. Upcoming chapters will delve more deeply
into data analysis and manipulation topics using pandas. This book is not intended to
serve as exhaustive documentation for the pandas library; I instead focus on the most
important features, leaving the less common (that is, more esoteric) things for you to
explore on your own.

## Reindexing

An important method on pandas objects is reindex, which creates a new object
with the data conformed to a new index. Consider a simple example:

In [2]:
obj = Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])

Calling reindex on this Series rearranges the data according to the new index, introducing
missing values if any index values were not already present:

In [3]:
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])

In [4]:
obj2

a   -5.3
b    7.2
c    3.6
d    4.5
e    NaN
dtype: float64

In [5]:
obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0)

a   -5.3
b    7.2
c    3.6
d    4.5
e    0.0
dtype: float64

For ordered data like time series, it may be desirable to do some interpolation or filling
of values when reindexing. The method option allows us to do this, using a method such
as ffill, which forward-fills the values:

In [6]:
obj3 = Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])

In [7]:
obj3.reindex(range(6), method='ffill')

0      blue
1      blue
2    purple
3    purple
4    yellow
5    yellow
dtype: object

Table 5-4 lists available method options. At this time, interpolation more sophisticated
than forward- and backfilling would need to be applied after the fact.

Table 5-4. reindex method (interpolation) options

| Argument | Description |
|---|---|
| ffill or pad | Fill (or carry) values forward |
| bfill or backfill | Fill (or carry) values backward |
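
As a quick check of the two options in Table 5-4, the sketch below reindexes the same Series with each fill method. The names forward and backward are illustrative only, and a reasonably recent pandas is assumed.

```python
import pandas as pd

obj3 = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])

# Forward fill: each new label takes the last known value before it.
forward = obj3.reindex(range(6), method='ffill')

# Backward fill: each new label takes the next known value after it.
# Label 5 has nothing after it, so it stays missing.
backward = obj3.reindex(range(6), method='bfill')

print(forward)
print(backward)
```

Backward filling leaves the trailing label empty because there is no later observation to carry back.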

With DataFrame, reindex can alter either the (row) index, columns, or both. When
passed just a sequence, the rows are reindexed in the result:

In [8]:
frame = DataFrame(np.arange(9).reshape((3,3)), index=['a', 'c', 'd'], columns=['Ohio', 'Texas', 'California'])

In [9]:
frame

Unnamed: 0,Ohio,Texas,California
a,0,1,2
c,3,4,5
d,6,7,8


In [10]:
frame2 = frame.reindex(['a', 'b', 'c', 'd'])

In [11]:
frame2

Unnamed: 0,Ohio,Texas,California
a,0.0,1.0,2.0
b,,,
c,3.0,4.0,5.0
d,6.0,7.0,8.0


The columns can be reindexed using the columns keyword:

In [12]:
states = ['Texas', 'Utah', 'California']

In [13]:
frame.reindex(columns=states)

Unnamed: 0,Texas,Utah,California
a,1,,2
c,4,,5
d,7,,8


Both can be reindexed in one shot, though interpolation will only apply row-wise (axis
0):

In [14]:
frame.reindex(index=['a', 'b', 'c', 'd'], method='ffill', columns=states)

Unnamed: 0,Texas,Utah,California
a,1,,2
b,1,,2
c,4,,5
d,7,,8


As you’ll see soon, reindexing can be done more succinctly by label-indexing with ix:

In [15]:
frame.ix[['a', 'b', 'c', 'd'], states]

Unnamed: 0,Texas,Utah,California
a,1.0,,2.0
b,,,
c,4.0,,5.0
d,7.0,,8.0
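
On pandas installations where ix is unavailable (an assumption about the reader's environment), the same result can be sketched with reindex alone by passing both index and columns. The variable name result is illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape((3, 3)),
                     index=['a', 'c', 'd'],
                     columns=['Ohio', 'Texas', 'California'])
states = ['Texas', 'Utah', 'California']

# Conform rows and columns in one call; labels that did not exist
# before ('b' and 'Utah') are filled with NaN.
result = frame.reindex(index=['a', 'b', 'c', 'd'], columns=states)

print(result)
```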


Table 5-5. reindex function arguments

| Argument | Description |
|---|---|
| index | New sequence to use as index. Can be an Index instance or any other sequence-like Python data structure. An Index will be used exactly as is without any copying. |
| method | Interpolation (fill) method; see Table 5-4 for options. |
| fill_value | Substitute value to use when introducing missing data by reindexing. |
| limit | When forward- or backfilling, maximum size gap to fill. |
| level | Match simple Index on level of MultiIndex, otherwise select subset of |
| copy | If True (the default), always copy the underlying data, even if the new index is equivalent to the old index; if False, do not copy the data in that case. |
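
To make the limit argument from Table 5-5 concrete, here is a minimal sketch with a deliberately sparse Series. The data and the names sparse, unlimited, and limited are made up for illustration.

```python
import pandas as pd

sparse = pd.Series([1.0, 2.0], index=[0, 4])

# Without a limit, every new label after a known value is filled forward.
unlimited = sparse.reindex(range(6), method='ffill')

# With limit=1, only the first new label after each known value is filled;
# labels 2 and 3 are left as NaN.
limited = sparse.reindex(range(6), method='ffill', limit=1)

print(unlimited)
print(limited)
```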

## Dropping entries from an axis

Dropping one or more entries from an axis is easy if you have an index array or list
without those entries. As that can require a bit of munging and set logic, the drop
method will return a new object with the indicated value or values deleted from an axis:

In [16]:
obj = Series(np.arange(5.), index=['a', 'b', 'c', 'd', 'e'])

In [17]:
new_obj = obj.drop('c')

In [18]:
new_obj

a    0.0
b    1.0
d    3.0
e    4.0
dtype: float64

In [19]:
obj.drop(['d', 'c'])

a    0.0
b    1.0
e    4.0
dtype: float64

With DataFrame, index values can be deleted from either axis:

In [20]:
data = DataFrame(np.arange(16).reshape((4, 4)),
                 index=['Ohio', 'Colorado', 'Utah', 'New York'],
                 columns=['one', 'two', 'three', 'four'])

In [21]:
data

Unnamed: 0,one,two,three,four
Ohio,0,1,2,3
Colorado,4,5,6,7
Utah,8,9,10,11
New York,12,13,14,15


In [22]:
data.drop(['Colorado', 'Ohio'])

Unnamed: 0,one,two,three,four
Utah,8,9,10,11
New York,12,13,14,15


In [23]:
data.drop('two', axis=1)

Unnamed: 0,one,three,four
Ohio,0,2,3
Colorado,4,6,7
Utah,8,10,11
New York,12,14,15


In [24]:
data.drop(['two', 'four'], axis=1)

Unnamed: 0,one,three
Ohio,0,2
Colorado,4,6
Utah,8,10
New York,12,14
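
The following sketch illustrates two points from the drop examples above: drop returns a new object and leaves the original untouched, and rows and columns can be dropped together. It assumes a pandas version recent enough to accept the index= and columns= keywords on drop; trimmed is an illustrative name.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# Drop two rows and one column in a single call, using the
# index= and columns= keywords instead of axis=.
trimmed = data.drop(index=['Colorado', 'Ohio'], columns=['two'])

# The original DataFrame still has all of its labels.
print(trimmed)
print(data.shape)
```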


## Indexing, selection, and filtering

Series indexing (obj[...]) works analogously to NumPy array indexing, except you can
use the Series’s index values instead of only integers. Here are some examples of this:

In [25]:
obj = Series(np.arange(4.), index=['a', 'b', 'c', 'd'])

In [26]:
obj

a    0.0
b    1.0
c    2.0
d    3.0
dtype: float64

In [27]:
obj['b']

1.0

In [28]:
obj[1]

1.0

In [29]:
obj[2:4]

c    2.0
d    3.0
dtype: float64

In [30]:
obj[['b', 'a', 'd']]

b    1.0
a    0.0
d    3.0
dtype: float64

In [31]:
obj[[1, 3]]

b    1.0
d    3.0
dtype: float64

In [32]:
obj[obj < 2]

a    0.0
b    1.0
dtype: float64

Slicing with labels behaves differently than normal Python slicing in that the endpoint
is inclusive:

In [33]:
obj['b':'c']

b    1.0
c    2.0
dtype: float64
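
To see the endpoint difference side by side, this small sketch slices the same Series once by label and once by position. It uses the explicit loc and iloc indexers, which is a choice made here for clarity rather than something the examples above require.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4.), index=['a', 'b', 'c', 'd'])

# Label slice: both 'b' and 'c' are included.
by_label = obj.loc['b':'c']

# Position slice: standard Python semantics, the stop position is excluded.
by_position = obj.iloc[1:2]

print(by_label)
print(by_position)
```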

Setting using these methods works just as you would expect:

In [34]:
obj['b':'c'] = 5

In [35]:
obj

a    0.0
b    5.0
c    5.0
d    3.0
dtype: float64

As you’ve seen above, indexing into a DataFrame retrieves one or more columns,
either with a single value or a sequence:

In [36]:
data = DataFrame(np.arange(16).reshape((4, 4)),
                 index=['Ohio', 'Colorado', 'Utah', 'New York'],
                 columns=['one', 'two', 'three', 'four'])

In [37]:
data

Unnamed: 0,one,two,three,four
Ohio,0,1,2,3
Colorado,4,5,6,7
Utah,8,9,10,11
New York,12,13,14,15


In [38]:
data['two']

Ohio         1
Colorado     5
Utah         9
New York    13
Name: two, dtype: int64

In [39]:
data[['three', 'one']]

Unnamed: 0,three,one
Ohio,2,0
Colorado,6,4
Utah,10,8
New York,14,12


Indexing like this has a few special cases. The first is selecting rows by slicing or
with a boolean array:

In [40]:
data[:2]

Unnamed: 0,one,two,three,four
Ohio,0,1,2,3
Colorado,4,5,6,7


In [41]:
data[data['three'] > 5]

Unnamed: 0,one,two,three,four
Colorado,4,5,6,7
Utah,8,9,10,11
New York,12,13,14,15


This might seem inconsistent to some readers, but this syntax arose out of practicality
and nothing more. Another use case is in indexing with a boolean DataFrame, such as
one produced by a scalar comparison:

In [42]:
data < 5

Unnamed: 0,one,two,three,four
Ohio,True,True,True,True
Colorado,True,False,False,False
Utah,False,False,False,False
New York,False,False,False,False


In [43]:
data[data < 5] = 0

In [44]:
data

Unnamed: 0,one,two,three,four
Ohio,0,0,0,0
Colorado,0,5,6,7
Utah,8,9,10,11
New York,12,13,14,15


This is intended to make DataFrame syntactically more like an ndarray in this case.
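
The parallel with ndarray can be checked directly. This sketch applies the same boolean-mask assignment to a NumPy array and to a DataFrame built from a copy of it, then compares the results; arr and same are illustrative names.

```python
import numpy as np
import pandas as pd

arr = np.arange(16).reshape((4, 4))
data = pd.DataFrame(arr.copy(),  # copy so the two objects are independent
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# Identical syntax on both objects: set every value below 5 to 0.
arr[arr < 5] = 0
data[data < 5] = 0

# The DataFrame's values now match the array element for element.
same = bool((data.to_numpy() == arr).all())
print(same)
```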

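For label-indexing on the rows of a DataFrame, I introduce the special indexing field ix. It
enables you to select a subset of the rows and columns from a DataFrame with NumPy-like
notation plus axis labels. As I mentioned earlier, this is also a less verbose way to
do reindexing (note that ix was deprecated in pandas 0.20 and removed in 1.0, where loc
and iloc handle label-based and position-based selection, respectively):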

In [45]:
data.ix['Colorado', ['two', 'three']]

two      5
three    6
Name: Colorado, dtype: int64

In [46]:
data.ix[['Colorado', 'Utah'], [3, 0, 1]]

Unnamed: 0,four,one,two
Colorado,7,0,5
Utah,11,8,9


In [47]:
data.ix[2]

one       8
two       9
three    10
four     11
Name: Utah, dtype: int64

In [48]:
data.ix[:'Utah', 'two']

Ohio        0
Colorado    5
Utah        9
Name: two, dtype: int64

In [49]:
data.ix[data.three > 5, :3]

Unnamed: 0,one,two,three
Colorado,0,5,6
Utah,8,9,10
New York,12,13,14
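
For readers on a pandas version that lacks ix, the selections above can be sketched with loc (labels) and iloc (integer positions). This example rebuilds data with its original values, so the numbers differ slightly from the outputs above, which reflect the earlier assignment of 0 to values below 5. All result names are illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape((4, 4)),
                    index=['Ohio', 'Colorado', 'Utah', 'New York'],
                    columns=['one', 'two', 'three', 'four'])

# One row by label, two columns by label.
row_subset = data.loc['Colorado', ['two', 'three']]

# Rows by label, then columns by integer position.
mixed = data.loc[['Colorado', 'Utah']].iloc[:, [3, 0, 1]]

# A single row by integer position.
third_row = data.iloc[2]

# Label slice on the rows (endpoint included), one column.
up_to_utah = data.loc[:'Utah', 'two']

# Boolean row filter, then the first three columns by position.
filtered = data.loc[data['three'] > 5].iloc[:, :3]

print(mixed)
print(filtered)
```

Splitting a mixed label/position selection into a loc step followed by an iloc step keeps each indexer unambiguous.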


So there are many ways to select and rearrange the data contained in a pandas object.
For DataFrame, there is a short summary of many of them in Table 5-6. You have a
number of additional options when working with hierarchical indexes as you’ll later
see.

NOTE:
When designing pandas, I felt that having to type frame[:, col] to select
a column was too verbose (and error-prone), since column selection is
one of the most common operations. Thus I made the design trade-off
to push all of the rich label-indexing into ix.

http://pandas.pydata.org/pandas-docs/stable/indexing.html

Table 5-6. Indexing options with DataFrame

| Type | Notes |
|---|---|
| obj[val] | Select single column or sequence of columns from the DataFrame. Special case conveniences: boolean array (filter rows), slice (slice rows), or boolean DataFrame (set values based on some criterion). |
| obj.ix[val] | Selects single row or subset of rows from the DataFrame. |
| obj.ix[:, val] | Selects single column or subset of columns. |
| obj.ix[val1, val2] | Select both rows and columns. |
| reindex method | Conform one or more axes to new indexes. |
| xs method | Select single row or column as a Series by label. |
| icol, irow methods | Select single column or row, respectively, as a Series by integer location. |
| get_value, set_value methods | Select single value by row and column label. |

## Arithmetic and data alignment

One of the most important pandas features is the behavior of arithmetic between objects
with different indexes. When adding together objects, if any index pairs are not
the same, the respective index in the result will be the union of the index pairs. Let’s
look at a simple example:

In [50]:
s1 = Series([7.3, -2.5, 3.4, 1.5], index=['a', 'c', 'd', 'e'])

In [51]:
s2 = Series([-2.1, 3.6, -1.5, 4, 3.1], index=['a', 'c', 'e', 'f', 'g'])

In [52]:
s1 

a    7.3
c   -2.5
d    3.4
e    1.5
dtype: float64

In [53]:
s2

a   -2.1
c    3.6
e   -1.5
f    4.0
g    3.1
dtype: float64

Adding these together yields:

In [54]:
s1 + s2

a    5.2
c    1.1
d    NaN
e    0.0
f    NaN
g    NaN
dtype: float64

The internal data alignment introduces NA values at the labels that don’t overlap.
Missing values propagate in arithmetic computations.

In the case of DataFrame, alignment is performed on both the rows and the columns:

In [55]:
df1 = DataFrame(np.arange(9.).reshape((3, 3)), columns=list('bcd'), index=['Ohio', 'Texas', 'Colorado'])

In [56]:
df2 = DataFrame(np.arange(12.).reshape((4, 3)), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])

In [57]:
df1

Unnamed: 0,b,c,d
Ohio,0.0,1.0,2.0
Texas,3.0,4.0,5.0
Colorado,6.0,7.0,8.0


In [58]:
df2

Unnamed: 0,b,d,e
Utah,0.0,1.0,2.0
Ohio,3.0,4.0,5.0
Texas,6.0,7.0,8.0
Oregon,9.0,10.0,11.0


Adding these together returns a DataFrame whose index and columns are the unions
of the ones in each DataFrame:

In [59]:
df1 + df2

Unnamed: 0,b,c,d,e
Colorado,,,,
Ohio,3.0,,6.0,
Oregon,,,,
Texas,9.0,,12.0,
Utah,,,,
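
The union behavior can be verified programmatically. In this sketch, the result's labels are the sorted union of the inputs' labels, and only cells whose row and column appear in both inputs hold a number. The names total, overlap_rows, and overlap_cols are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(9.).reshape((3, 3)), columns=list('bcd'),
                   index=['Ohio', 'Texas', 'Colorado'])
df2 = pd.DataFrame(np.arange(12.).reshape((4, 3)), columns=list('bde'),
                   index=['Utah', 'Ohio', 'Texas', 'Oregon'])

total = df1 + df2

# Labels shared by both inputs; only these cells can be non-missing.
overlap_rows = df1.index.intersection(df2.index)
overlap_cols = df1.columns.intersection(df2.columns)

print(total)
print(sorted(overlap_rows), sorted(overlap_cols))
```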


### Arithmetic methods with fill values

In arithmetic operations between differently-indexed objects, you might want to fill
with a special value, like 0, when an axis label is found in one object but not the other:

In [60]:
df1 = DataFrame(np.arange(12.).reshape((3, 4)), columns=list('abcd'))

In [61]:
df2 = DataFrame(np.arange(20.).reshape((4, 5)), columns=list('abcde'))

In [62]:
df1

Unnamed: 0,a,b,c,d
0,0.0,1.0,2.0,3.0
1,4.0,5.0,6.0,7.0
2,8.0,9.0,10.0,11.0


In [63]:
df2

Unnamed: 0,a,b,c,d,e
0,0.0,1.0,2.0,3.0,4.0
1,5.0,6.0,7.0,8.0,9.0
2,10.0,11.0,12.0,13.0,14.0
3,15.0,16.0,17.0,18.0,19.0


Adding these together results in NA values in the locations that don’t overlap:

In [64]:
df1 + df2

Unnamed: 0,a,b,c,d,e
0,0.0,2.0,4.0,6.0,
1,9.0,11.0,13.0,15.0,
2,18.0,20.0,22.0,24.0,
3,,,,,


Using the add method on df1, I pass df2 and an argument to fill_value:

In [65]:
df1.add(df2, fill_value=0)

Unnamed: 0,a,b,c,d,e
0,0.0,2.0,4.0,6.0,4.0
1,9.0,11.0,13.0,15.0,9.0
2,18.0,20.0,22.0,24.0,14.0
3,15.0,16.0,17.0,18.0,19.0


Relatedly, when reindexing a Series or DataFrame, you can also specify a different fill
value:

In [66]:
df1.reindex(columns=df2.columns, fill_value=0)

Unnamed: 0,a,b,c,d,e
0,0.0,1.0,2.0,3.0,0
1,4.0,5.0,6.0,7.0,0
2,8.0,9.0,10.0,11.0,0


Table 5-7. Flexible arithmetic methods

| Method | Description |
|---|---|
| add | Method for addition (+) |
| sub | Method for subtraction (-) |
| div | Method for division (/) |
| mul | Method for multiplication (*) |
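
The methods in Table 5-7 work on Series as well. As a small sketch, the two Series from the alignment example are combined once with the + operator and once with add and fill_value=0, so that a label present in only one Series is treated as 0 in the other. The names plain and filled are illustrative.

```python
import pandas as pd

s1 = pd.Series([7.3, -2.5, 3.4, 1.5], index=['a', 'c', 'd', 'e'])
s2 = pd.Series([-2.1, 3.6, -1.5, 4, 3.1], index=['a', 'c', 'e', 'f', 'g'])

# Operator form: labels found in only one Series become NaN.
plain = s1 + s2

# Method form: a missing side is replaced by 0 before adding.
filled = s1.add(s2, fill_value=0)

print(plain)
print(filled)
```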

### Operations between DataFrame and Series

As with NumPy arrays, arithmetic between DataFrame and Series is well-defined. First,
as a motivating example, consider the difference between a 2D array and one of its rows:

In [67]:
arr = np.arange(12.).reshape((3, 4))

In [69]:
arr

array([[  0.,   1.,   2.,   3.],
       [  4.,   5.,   6.,   7.],
       [  8.,   9.,  10.,  11.]])

In [70]:
arr[0]

array([ 0.,  1.,  2.,  3.])

In [71]:
arr - arr[0]

array([[ 0.,  0.,  0.,  0.],
       [ 4.,  4.,  4.,  4.],
       [ 8.,  8.,  8.,  8.]])

This is referred to as broadcasting and is explained in more detail in Chapter 12. Operations
between a DataFrame and a Series are similar:

In [73]:
frame = DataFrame(np.arange(12.).reshape((4, 3)), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])

In [74]:
series = frame.ix[0]

In [75]:
frame

Unnamed: 0,b,d,e
Utah,0.0,1.0,2.0
Ohio,3.0,4.0,5.0
Texas,6.0,7.0,8.0
Oregon,9.0,10.0,11.0


In [76]:
series

b    0.0
d    1.0
e    2.0
Name: Utah, dtype: float64

By default, arithmetic between DataFrame and Series matches the index of the Series
on the DataFrame's columns, broadcasting down the rows:

In [77]:
frame - series

Unnamed: 0,b,d,e
Utah,0.0,0.0,0.0
Ohio,3.0,3.0,3.0
Texas,6.0,6.0,6.0
Oregon,9.0,9.0,9.0


If an index value is not found in either the DataFrame’s columns or the Series’s index,
the objects will be reindexed to form the union:

In [78]:
series2 = Series(range(3), index=['b', 'e', 'f'])

In [79]:
frame + series2

Unnamed: 0,b,d,e,f
Utah,0.0,,3.0,
Ohio,3.0,,6.0,
Texas,6.0,,9.0,
Oregon,9.0,,12.0,


If you want to instead broadcast over the columns, matching on the rows, you have to
use one of the arithmetic methods. For example:

In [80]:
series3 = frame['d']

In [81]:
frame

Unnamed: 0,b,d,e
Utah,0.0,1.0,2.0
Ohio,3.0,4.0,5.0
Texas,6.0,7.0,8.0
Oregon,9.0,10.0,11.0


In [82]:
series3

Utah       1.0
Ohio       4.0
Texas      7.0
Oregon    10.0
Name: d, dtype: float64

In [83]:
frame.sub(series3, axis=0)

Unnamed: 0,b,d,e
Utah,-1.0,0.0,1.0
Ohio,-1.0,0.0,1.0
Texas,-1.0,0.0,1.0
Oregon,-1.0,0.0,1.0


The axis number that you pass is the axis to match on. In this case we mean to match
on the DataFrame’s row index and broadcast across the columns.
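
To illustrate why the axis argument matters, this sketch compares three computations: the default operator (which matches the Series index against the column labels), sub with axis matched on the rows, and the equivalent explicit NumPy broadcast. The names default_result, by_rows, and manual are illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12.).reshape((4, 3)), columns=list('bde'),
                     index=['Utah', 'Ohio', 'Texas', 'Oregon'])
series3 = frame['d']

# Default: the Series index (state names) is matched against the columns
# (b, d, e). Nothing lines up, so every cell is NaN.
default_result = frame - series3

# Match on the row index instead; axis='index' is the same as axis=0.
by_rows = frame.sub(series3, axis='index')

# The same computation written as a NumPy broadcast over a column vector.
manual = frame.to_numpy() - series3.to_numpy()[:, np.newaxis]

print(by_rows)
```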

## Function application and mapping

NumPy ufuncs (element-wise array methods) work fine with pandas objects:

In [93]:
frame = DataFrame(np.random.randn(4, 3), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])

In [94]:
frame

Unnamed: 0,b,d,e
Utah,-2.612556,-1.177334,1.055565
Ohio,-0.019528,-0.886689,1.20314
Texas,-0.869449,0.045419,0.38903
Oregon,-2.118287,1.007656,-0.847625


In [95]:
np.abs(frame)

Unnamed: 0,b,d,e
Utah,2.612556,1.177334,1.055565
Ohio,0.019528,0.886689,1.20314
Texas,0.869449,0.045419,0.38903
Oregon,2.118287,1.007656,0.847625


Another frequent operation is applying a function on 1D arrays to each column or row.
DataFrame’s apply method does exactly this:

In [98]:
f = lambda x: x.max() - x.min()

In [99]:
frame.apply(f)

b    2.593028
d    2.184991
e    2.050765
dtype: float64

In [100]:
frame.apply(f, axis=1)

Utah      3.668121
Ohio      2.089830
Texas     1.258478
Oregon    3.125944
dtype: float64

Many of the most common array statistics (like sum and mean) are DataFrame methods,
so using apply is not necessary.
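
For instance, the range computation above can be written with the built-in max and min methods instead of apply. The sketch below uses a seeded random generator (an addition made here so the run is repeatable) and checks that both routes agree.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only to make the example repeatable
frame = pd.DataFrame(rng.standard_normal((4, 3)), columns=list('bde'),
                     index=['Utah', 'Ohio', 'Texas', 'Oregon'])

# Column-wise range via apply ...
via_apply = frame.apply(lambda x: x.max() - x.min())
# ... and via the built-in reductions.
via_methods = frame.max() - frame.min()

# The same comparison across each row (axis=1).
rows_apply = frame.apply(lambda x: x.max() - x.min(), axis=1)
rows_methods = frame.max(axis=1) - frame.min(axis=1)

print(via_methods)
print(rows_methods)
```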

The function passed to apply need not return a scalar value; it can also return a Series
with multiple values:

In [103]:
def f(x):
    return Series([x.min(), x.max()], index=['min', 'max'])

In [104]:
frame.apply(f)

Unnamed: 0,b,d,e
min,-2.612556,-1.177334,-0.847625
max,-0.019528,1.007656,1.20314


Element-wise Python functions can be used, too. Suppose you wanted to compute a
formatted string from each floating point value in frame. You can do this with applymap:

In [105]:
format = lambda x: '%.2f' % x

In [106]:
frame.applymap(format)

Unnamed: 0,b,d,e
Utah,-2.61,-1.18,1.06
Ohio,-0.02,-0.89,1.20
Texas,-0.87,0.05,0.39
Oregon,-2.12,1.01,-0.85


The reason for the name applymap is that Series has a map method for applying an
element-wise function:

In [107]:
frame['e'].map(format)

Utah       1.06
Ohio       1.20
Texas      0.39
Oregon    -0.85
Name: e, dtype: object
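
Because Series.map works element by element, the same formatting can be sketched for a whole DataFrame by mapping each column in turn. This sidesteps the applymap name, which newer pandas releases deprecate in favor of DataFrame.map. The helper two_decimals and the two-row sample are illustrative, with values taken from the frame shown above.

```python
import pandas as pd

frame = pd.DataFrame({'b': [-2.612556, -0.019528],
                      'e': [1.055565, 1.20314]},
                     index=['Utah', 'Ohio'])

def two_decimals(x):
    """Format a single float with two digits after the decimal point."""
    return '%.2f' % x

# apply hands each column (a Series) to the lambda; map formats each element.
formatted = frame.apply(lambda col: col.map(two_decimals))

print(formatted)
```

The result holds strings, so trailing zeros such as the one in 1.20 are preserved.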

## Sorting and ranking

Sorting a data set by some criterion is another important built-in operation. To sort
lexicographically by row or column index, use the sort_index method, which returns
a new, sorted object:

In [108]:
obj = Series(range(4), index=['d', 'a', 'b', 'c'])

In [109]:
obj.sort_index()

a    1
b    2
c    3
d    0
dtype: int64

With a DataFrame, you can sort by index on either axis:

In [111]:
frame = DataFrame(np.arange(8).reshape((2,4)), index=['three', 'one'], columns=['d', 'a', 'b', 'c'])

In [113]:
frame

Unnamed: 0,d,a,b,c
three,0,1,2,3
one,4,5,6,7


In [114]:
frame.sort_index()

Unnamed: 0,d,a,b,c
one,4,5,6,7
three,0,1,2,3


In [115]:
frame.sort_index(axis=1)

Unnamed: 0,a,b,c,d
three,1,2,3,0
one,5,6,7,4


The data is sorted in ascending order by default, but can be sorted in descending order,
too:

In [116]:
frame.sort_index(axis=1, ascending=False)

Unnamed: 0,d,c,b,a
three,0,3,2,1
one,4,7,6,5


To sort a Series by its values, use its sort_values method:

In [117]:
obj = Series([4, 7, -3, 2])

In [120]:
obj.sort_values()

2   -3
3    2
0    4
1    7
dtype: int64

Any missing values are sorted to the end of the Series by default:

In [121]:
obj = Series([4, np.nan, 7, np.nan, -3, 2])

In [122]:
obj.sort_values()

4   -3.0
5    2.0
0    4.0
2    7.0
1    NaN
3    NaN
dtype: float64
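
Two options of sort_values are worth a quick sketch here: ascending=False reverses the order of the valid values, and na_position='first' moves the missing values to the front. The names descending and nan_first are illustrative.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, np.nan, 7, np.nan, -3, 2])

# Largest first; missing values still go to the end.
descending = obj.sort_values(ascending=False)

# Ascending order, but with missing values placed first.
nan_first = obj.sort_values(na_position='first')

print(descending)
print(nan_first)
```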

On DataFrame, you may want to sort by the values in one or more columns. To do so,
pass one or more column names to the by option:

In [124]:
frame = DataFrame({'b': [4, 7, -3, 2], 'a': [0, 1, 0, 1]})

In [125]:
frame

Unnamed: 0,a,b
0,0,4
1,1,7
2,0,-3
3,1,2


In [127]:
frame.sort_values(by='b')

Unnamed: 0,a,b
2,0,-3
3,1,2
0,0,4
1,1,7


To sort by multiple columns, pass a list of names:

In [129]:
frame.sort_values(by=['a', 'b'])

Unnamed: 0,a,b
2,0,-3
0,0,4
3,1,2
1,1,7
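
When sorting by several columns, the direction can also be set per column by passing a list to ascending. In this sketch, column a is sorted ascending and ties are broken by b in descending order; ordered is an illustrative name.

```python
import pandas as pd

frame = pd.DataFrame({'b': [4, 7, -3, 2], 'a': [0, 1, 0, 1]})

# One flag per column in `by`: a ascending, then b descending within each a.
ordered = frame.sort_values(by=['a', 'b'], ascending=[True, False])

print(ordered)
```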


Ranking is closely related to sorting, assigning ranks from one through the number of
valid data points in an array. It is similar to the indirect sort indices produced by
numpy.argsort, except that ties are broken according to a rule. The rank methods for
Series and DataFrame are the place to look; by default rank breaks ties by assigning
each group the mean rank:

In [131]:
obj = Series([7, -5, 7, 4, 2, 0, 4])

In [133]:
obj

0    7
1   -5
2    7
3    4
4    2
5    0
6    4
dtype: int64

In [134]:
obj.rank()

0    6.5
1    1.0
2    6.5
3    4.5
4    3.0
5    2.0
6    4.5
dtype: float64

Ranks can also be assigned according to the order they’re observed in the data:

In [135]:
obj.rank(method='first')

0    6.0
1    1.0
2    7.0
3    4.0
4    3.0
5    2.0
6    5.0
dtype: float64

Naturally, you can rank in descending order, too:

In [136]:
obj.rank(ascending=False, method='max')

0    2.0
1    7.0
2    2.0
3    4.0
4    5.0
5    6.0
6    4.0
dtype: float64

See Table 5-8 for a list of tie-breaking methods available. DataFrame can compute ranks
over the rows or the columns:

In [137]:
frame = DataFrame({'b': [4.3, 7, -3, 2], 'a': [0, 1, 0, 1], 'c': [-2, 5, 8, -2.5]})

In [138]:
frame

Unnamed: 0,a,b,c
0,0,4.3,-2.0
1,1,7.0,5.0
2,0,-3.0,8.0
3,1,2.0,-2.5


In [139]:
frame.rank(axis=1)

Unnamed: 0,a,b,c
0,2.0,3.0,1.0
1,1.0,3.0,2.0
2,2.0,1.0,3.0
3,2.0,3.0,1.0


Table 5-8. Tie-breaking methods with rank

| Method | Description |
|---|---|
| 'average' | Default: assign the average rank to each entry in the equal group. |
| 'min' | Use the minimum rank for the whole group. |
| 'max' | Use the maximum rank for the whole group. |
| 'first' | Assign ranks in the order the values appear in the data. |
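
To compare the methods in Table 5-8 directly, this sketch ranks the Series from the earlier example with each method and collects the results as columns of one DataFrame. The name ranks is illustrative.

```python
import pandas as pd

obj = pd.Series([7, -5, 7, 4, 2, 0, 4])

# One column per tie-breaking method, all computed on the same data.
methods = ['average', 'min', 'max', 'first']
ranks = pd.DataFrame({m: obj.rank(method=m) for m in methods})

print(ranks)
```

The methods differ only on the tied values (the two 7s and the two 4s); every other row is identical across columns.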

### Axis indexes with duplicate values


Up until now, all of the examples I’ve shown you have had unique axis labels (index
values). While many pandas functions (like reindex) require that the labels be unique,
it’s not mandatory. Let’s consider a small Series with duplicate index values:

In [140]:
obj = Series(range(5), index=['a', 'a', 'b', 'b', 'c'])

In [141]:
obj

a    0
a    1
b    2
b    3
c    4
dtype: int64

The index’s is_unique property can tell you whether its values are unique or not:

In [142]:
obj.index.is_unique

False

Data selection is one of the main things that behaves differently with duplicates. Indexing
a label with multiple entries returns a Series, while a label with a single entry returns
a scalar value:

In [143]:
obj['a']

a    0
a    1
dtype: int64

In [144]:
obj['c']

4

The same logic extends to indexing rows in a DataFrame:

In [145]:
df = DataFrame(np.random.randn(4,3), index=['a', 'a', 'b', 'b'])

In [146]:
df

Unnamed: 0,0,1,2
a,-1.46947,-1.211022,0.654323
a,0.467773,-0.406585,-0.333752
b,-0.222183,0.616704,-1.200082
b,-1.722163,1.292316,-1.014751


In [147]:
df.ix['b']

Unnamed: 0,0,1,2
b,-0.222183,0.616704,-1.200082
b,-1.722163,1.292316,-1.014751
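
The points in this section can be sketched together: checking is_unique, seeing the return type change with the number of matches, and observing that reindex refuses an index with duplicate labels. The flag reindex_failed is an illustrative name, and catching ValueError reflects the error type raised by the pandas versions this was written against.

```python
import pandas as pd

obj = pd.Series(range(5), index=['a', 'a', 'b', 'b', 'c'])

# A repeated label returns a Series; a unique label returns a scalar.
repeated = obj.loc['a']
single = obj.loc['c']

# reindex needs unique labels on the existing index.
try:
    obj.reindex(['a', 'b', 'c'])
    reindex_failed = False
except ValueError:
    reindex_failed = True

print(obj.index.is_unique, type(repeated).__name__, single, reindex_failed)
```

Code that must handle both cases can test obj.index.is_unique up front rather than inspecting the type of each selection.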
