## Remap entries of a Series

In [2]:
import pandas as pd
import numpy as np

Remap according to a dictionary

In [4]:
s = pd.Series(['cat', 'dog', 1, 2])
d = {"cat": 0, "dog": 1, 1: 2, 2: 3}

s.map(d)

0    0
1    1
2    2
3    3
dtype: int64
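One caveat worth illustrating: entries with no matching key in the dictionary are mapped to `NaN`. The sketch below uses a hypothetical partial dictionary `partial` (not part of the original example) and shows one way to fall back to the original value.

```python
import pandas as pd

s = pd.Series(['cat', 'dog', 1, 2])
partial = {"cat": 0, "dog": 1}  # hypothetical: no entries for 1 and 2

# Entries without a matching key become NaN
mapped = s.map(partial)

# Fall back to the original value wherever the mapping found nothing
kept = s.map(partial).fillna(s)
```

If every value is expected to have a key, checking `mapped.isna().any()` after the call is a cheap way to catch gaps in the dictionary.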

Remap according to a function

In [5]:
f = lambda x: str(x) + "_new"
s.map(f)

0    cat_new
1    dog_new
2      1_new
3      2_new
dtype: object

## Replace according to cases

Two cases, with `np.where`

In [6]:
s = pd.Series(np.arange(10))
np.where(s % 2 == 0, 0, 1)

array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

More than two cases, with `np.select`

In [8]:
np.select([s < 2, s < 4, s < 8], [0, 1, 2])

array([0, 0, 1, 1, 2, 2, 2, 2, 0, 0])
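`np.select` checks the conditions in order and uses the first one that holds; entries that match no condition get `default`, which is 0 when omitted. That is why 8 and 9 map to 0 in the output above. A small sketch with an explicit `default` (the value -1 is chosen here only for illustration):

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(10))

# Conditions are evaluated in order; the first True one wins.
# Entries matching none of them receive `default`.
result = np.select([s < 2, s < 4, s < 8], [0, 1, 2], default=-1)
```

Choosing a `default` that cannot collide with a real choice makes unmatched entries easy to spot.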

## Indexing with callables

In [12]:
df = pd.DataFrame(np.random.randn(10, 3), columns=["a", "b", "c"])

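Passing a callable to `.loc` means we do not have to create intermediate variables just to index, so a method chain can keep going. The callable receives the DataFrame as it exists at that point in the chain. For example, we can filter on the result of a groupby mean.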

In [19]:
# Indexing with a callable
(df.groupby("a")
    .mean()
    .reset_index()
    .loc[lambda f: f["a"] > 0, :]
)

          a         b         c
3  0.045347 -0.460512  0.405638
4  0.056177 -0.504621  0.948509
5  0.429672  -0.58279 -0.124533
6  0.550559  0.975535  -1.22243
7  0.755109  -0.76885   0.79349
8  0.887669 -0.582231 -0.707347
9  0.894173  -0.97516 -0.698036


In [21]:
# Indexing without a callable
tmp = (df.groupby("a")
    .mean()
    .reset_index()
)
tmp.loc[tmp["a"] > 0, :]

          a         b         c
3  0.045347 -0.460512  0.405638
4  0.056177 -0.504621  0.948509
5  0.429672  -0.58279 -0.124533
6  0.550559  0.975535  -1.22243
7  0.755109  -0.76885   0.79349
8  0.887669 -0.582231 -0.707347
9  0.894173  -0.97516 -0.698036
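Because the data above is random, here is a small deterministic sketch of the same idea. The column values and the `double_a` column are made up for illustration; the point is that the callable can refer to a column created one step earlier in the chain.

```python
import pandas as pd

# Hypothetical data, fixed so the result is reproducible
df = pd.DataFrame({"a": [-1.0, 0.5, 2.0], "b": [10, 20, 30]})

# The callable receives the DataFrame produced by the previous step,
# so it can filter on `double_a`, which does not exist on `df` itself.
out = (df.assign(double_a=lambda f: f["a"] * 2)
         .loc[lambda f: f["double_a"] > 0, ["a", "double_a"]])
```

Without the callable, the intermediate result of `assign` would have to be stored in a variable before it could be used in the boolean mask.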


## MultiIndex indexing

Using `pd.IndexSlice` to select all the entries from a level of a `MultiIndex`
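A minimal sketch of this, assuming a small two-level index built for illustration (the level names `outer` and `inner` and the `val` column are hypothetical):

```python
import pandas as pd

idx = pd.MultiIndex.from_product([["x", "y"], [1, 2, 3]],
                                 names=["outer", "inner"])
df = pd.DataFrame({"val": range(6)}, index=idx)

ix = pd.IndexSlice

# Every label of level "outer", restricted to inner labels 2 and 3
sub = df.loc[ix[:, [2, 3]], :]

# Every label of level "inner" under the outer label "y"
sub_y = df.loc[ix["y", :], :]
```

Slicing like this generally expects the index to be sorted; an index built with `from_product` from sorted inputs already is.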

reindex

Selecting all entries under the ith label in level 0 of a `MultiIndex`
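One possible way to do this, assuming "the ith label" means position `i` in `index.levels[0]` (the sorted unique labels of that level). The index and the `val` column below are made up for illustration.

```python
import pandas as pd

idx = pd.MultiIndex.from_product([["x", "y"], [1, 2]])
df = pd.DataFrame({"val": [10, 20, 30, 40]}, index=idx)

i = 1                                # position of the wanted label in level 0
label = df.index.levels[0][i]        # "y"

# .loc drops level 0 from the result
rows = df.loc[label]

# .xs with drop_level=False keeps both levels in the result
rows_keep = df.xs(label, level=0, drop_level=False)
```

Note that `levels` can still list labels that no longer occur after filtering; calling `df.index.remove_unused_levels()` first avoids selecting a label with no rows.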

## Functions to help method chaining

In [24]:
df = pd.DataFrame({'temp_c': [17.0, 25.0]},
                  index=['Portland', 'Berkeley'])
df

          temp_c
Portland    17.0
Berkeley    25.0


### `DataFrame.assign`

In [25]:
df.assign(temp_f=lambda x: x['temp_c'] * 9 / 5 + 32,
          temp_k=lambda x: (x['temp_f'] +  459.67) * 5 / 9)

          temp_c  temp_f  temp_k
Portland    17.0    62.6  290.15
Berkeley    25.0    77.0  298.15
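Note that `temp_k` refers to `temp_f`, a column created earlier in the same call; this relies on the keyword arguments being applied in order. A short sketch with the same data, which also shows that `assign` returns a new DataFrame and leaves the original untouched:

```python
import pandas as pd

df = pd.DataFrame({'temp_c': [17.0, 25.0]},
                  index=['Portland', 'Berkeley'])

# Later keyword arguments can use columns created by earlier ones
out = df.assign(temp_f=lambda x: x['temp_c'] * 9 / 5 + 32,
                temp_k=lambda x: (x['temp_f'] + 459.67) * 5 / 9)

# `df` still has only its original column; the new ones live on `out`
original_columns = list(df.columns)
new_columns = list(out.columns)
```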


### `DataFrame.pipe`

In [27]:
def f(x, T):
    return T * x

df.pipe(f, T=3)

          temp_c
Portland    51.0
Berkeley    75.0
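The benefit of `pipe` shows up once several functions are applied in sequence: the chain reads top to bottom instead of inside out. A sketch with two hypothetical helpers, `scale` and `shift`, which are not part of the original example:

```python
import pandas as pd

df = pd.DataFrame({'temp_c': [17.0, 25.0]},
                  index=['Portland', 'Berkeley'])

def scale(x, T):        # hypothetical helper
    return T * x

def shift(x, offset):   # hypothetical helper
    return x + offset

# Nested calls read inside out ...
nested = shift(scale(df, T=3), offset=1)

# ... while pipe keeps the steps in the order they are applied
piped = df.pipe(scale, T=3).pipe(shift, offset=1)
```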


### `DataFrame.rename`

In [31]:
df.rename(columns={"temp_c": "new_name"})

          new_name
Portland      17.0
Berkeley      25.0


In [30]:
df.rename(lambda s: s.upper(), axis="columns")

          TEMP_C
Portland    17.0
Berkeley    25.0


### `DataFrame.query`

In [32]:
T = 20
df.query("temp_c > @T")

          temp_c
Berkeley    25.0
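For comparison, the `query` call above selects the same rows as an ordinary boolean mask; the `@` prefix is what lets the expression refer to the Python variable `T`. A minimal sketch with the same data:

```python
import pandas as pd

df = pd.DataFrame({'temp_c': [17.0, 25.0]},
                  index=['Portland', 'Berkeley'])
T = 20

# String expression; @T looks up the variable T in the calling scope
via_query = df.query("temp_c > @T")

# Equivalent boolean-mask indexing
via_mask = df[df["temp_c"] > T]
```

The `query` form needs no reference to the DataFrame's own name, which is why it fits well in the middle of a method chain.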


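Conditions can be combined with `and`. Here no row can satisfy both conditions, so the result is empty.

In [33]: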
T = 20
df.query("temp_c > @T and temp_c < @T")

Empty DataFrame
Columns: [temp_c]
Index: []

