---
title: "Advanced Data Manipulation (apply, map, applymap)"
author: "Mohammed Adil Siraju"
date: "2025-09-21"
categories: [pandas, dataframe, transformation]
description: "Overview of `apply`, `map`, and `applymap` for advanced DataFrame/Series transformations."
---

This notebook demonstrates advanced element-wise and row/column-wise transformations in pandas using `apply`, `map`, and `applymap`.


## Introduction

Pandas provides flexible methods to transform data:
- `Series.map(arg)`: elementwise mapping for a Series; `arg` can be a function, a dict, or another Series.
- `DataFrame.apply(func, axis=...)`: apply a function to each column or row (as Series).
- `DataFrame.applymap(func)`: elementwise operation across the entire DataFrame (deprecated since pandas 2.1 in favor of `DataFrame.map`).

We'll illustrate each with short examples and best-practice notes.

In [1]:
# Import libraries and create sample DataFrame
import pandas as pd

data = {
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
}

df = pd.DataFrame(data)
df

|   | A | B  |
|---|---|----|
| 0 | 1 | 10 |
| 1 | 2 | 20 |
| 2 | 3 | 30 |
| 3 | 4 | 40 |
| 4 | 5 | 50 |


## Using `apply`

`DataFrame.apply` calls a function once per column (the default, `axis=0`) or once per row (`axis=1`). The function receives a Series and should return either a scalar, which produces an aggregation, or a Series, which produces a transformation.

Use `apply` when your operation needs to work on an entire row/column at once (e.g., compute a statistic or combine multiple columns).

In [2]:
# Example: multiply each column (Series) by 2 using apply
# Note: apply receives a Series (column) by default, so multiplying the Series scales all values in that column
df_apply = df.apply(lambda col: col * 2)
df_apply

|   | A  | B   |
|---|----|-----|
| 0 | 2  | 20  |
| 1 | 4  | 40  |
| 2 | 6  | 60  |
| 3 | 8  | 80  |
| 4 | 10 | 100 |
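
The example above returns a Series from each call (a transformation). As a complementary sketch, the snippet below shows the aggregation case, where the function returns a scalar, along both axes. The variable names `col_range` and `row_sum` are illustrative and not part of the original notebook.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]})

# Column-wise (axis=0, the default): each call receives one column as a Series
# and returns a scalar, so the result is a Series indexed by column name.
col_range = df.apply(lambda col: col.max() - col.min())

# Row-wise (axis=1): each call receives one row as a Series,
# so the result is a Series indexed like the DataFrame's rows.
row_sum = df.apply(lambda row: row.sum(), axis=1)

print(col_range)
print(row_sum)
```

Both results could also be computed without `apply` (for example `df.max() - df.min()` and `df.sum(axis=1)`); the lambdas are used here only to make the per-column and per-row calls visible.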


## Using `map` (Series)

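`Series.map` applies an elementwise transformation to a Series. It accepts a function, a dict, or another Series used as a lookup table; with a plain dict or a Series, values that have no match become `NaN`. In pandas versions before 2.1, `map` is not available on `DataFrame` (use `applymap` for elementwise operations there); from pandas 2.1 onward, `DataFrame.map` is the renamed `applymap`.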

In [3]:
# Series example using map
series_data = pd.Series([1, 2, 3, 4, 5])
mapped_data = series_data.map(lambda x: x ** 2)
mapped_data

0     1
1     4
2     9
3    16
4    25
dtype: int64

In [4]:
# original series
series_data

0    1
1    2
2    3
3    4
4    5
dtype: int64
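
Besides a function, `map` also accepts a dict as a lookup table. The sketch below assumes a hypothetical `labels` dict that deliberately omits some keys, to show what happens to unmapped values; the names `grades`, `labels`, and `mapped_filled` are illustrative only.

```python
import pandas as pd

grades = pd.Series([1, 2, 3, 4, 5])

# Hypothetical lookup table; 4 and 5 are deliberately left out.
labels = {1: 'low', 2: 'low', 3: 'mid'}

# Values missing from the dict become NaN rather than raising an error.
mapped = grades.map(labels)

# One way to supply a default for the unmapped values.
mapped_filled = mapped.fillna('unknown')

print(mapped)
print(mapped_filled)
```

Checking for `NaN` after a dict-based `map` is a cheap way to catch keys that were missing from the lookup table.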

## Using `applymap` (elementwise on DataFrame)

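`DataFrame.applymap` applies a function to each element of the DataFrame, one cell at a time. Use it for elementwise transforms that have no vectorized equivalent; for simple arithmetic such as cubing, `df ** 3` gives the same result without a Python-level call per cell. For column/row-wise operations, prefer `apply`. Note that `applymap` is deprecated as of pandas 2.1 in favor of `DataFrame.map`, so recent versions emit a `FutureWarning` when it is called.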

In [5]:
# show the DataFrame
df

|   | A | B  |
|---|---|----|
| 0 | 1 | 10 |
| 1 | 2 | 20 |
| 2 | 3 | 30 |
| 3 | 4 | 40 |
| 4 | 5 | 50 |


In [6]:
# elementwise cube using applymap
df_applymap = df.applymap(lambda x: x ** 3)
df_applymap


|   | A   | B      |
|---|-----|--------|
| 0 | 1   | 1000   |
| 1 | 8   | 8000   |
| 2 | 27  | 27000  |
| 3 | 64  | 64000  |
| 4 | 125 | 125000 |
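
Because `applymap` was renamed to `DataFrame.map` in pandas 2.1, code that must run on both older and newer versions may need a small guard. The following is one possible sketch, not the only way to handle this; the helper `cube` and the names `cubed` and `vectorized` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]})


def cube(x):
    """Cube a single scalar value."""
    return x ** 3


# DataFrame.map exists from pandas 2.1 onward; fall back to applymap on older versions.
if hasattr(pd.DataFrame, 'map'):
    cubed = df.map(cube)
else:
    cubed = df.applymap(cube)

# For plain arithmetic, a vectorized expression produces the same values.
vectorized = df ** 3

print(cubed)
```

The elementwise method is mainly useful when the function cannot be expressed with vectorized operators, such as custom string formatting per cell.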


## Creating new columns with `apply` (row-wise)

When you need to compute a value using multiple columns, use `apply` with `axis=1`. For better performance, prefer vectorized operations when possible (see Best Practices below).

In [7]:
# create column 'C' as product of A and B using apply row-wise
df['C'] = df.apply(lambda row: row['A'] * row['B'], axis=1)
df

|   | A | B  | C   |
|---|---|----|-----|
| 0 | 1 | 10 | 10  |
| 1 | 2 | 20 | 40  |
| 2 | 3 | 30 | 90  |
| 3 | 4 | 40 | 160 |
| 4 | 5 | 50 | 250 |
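
For comparison, the same column can be computed without a row-wise `apply`. This is a minimal sketch on a fresh copy of the sample data; `c_apply` and `c_vectorized` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]})

# Row-wise apply: one Python-level function call per row.
c_apply = df.apply(lambda row: row['A'] * row['B'], axis=1)

# Vectorized: a single operation on the two whole columns.
c_vectorized = df['A'] * df['B']

print(c_apply.tolist())
print(c_vectorized.tolist())
```

Row-wise `apply` remains a reasonable choice when the per-row logic genuinely cannot be expressed with column operations.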


## Best Practices

- Prefer pandas vectorized operations (e.g., `df['A'] * df['B']`) over `apply` when possible; they are faster and clearer.
- Use `map` for Series-to-Series elementwise mappings or label replacements.
- Use `applymap` (or `DataFrame.map` in pandas 2.1 and later) only when you need a uniform elementwise transform across the entire DataFrame that has no vectorized form.
- When using `apply` with `axis=1`, consider `np.where`, `pd.Series.where`, or vectorized arithmetic to improve performance.
- Keep functions simple and avoid expensive Python-level loops inside `apply`/`map` for large DataFrames.
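
As an illustration of the `np.where` suggestion, the sketch below replaces a conditional row-wise `apply` with a vectorized equivalent. The threshold of 50 and the labels `'big'`/`'small'` are arbitrary choices made for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]})

# Row-wise version: a Python-level conditional evaluated once per row.
size_apply = df.apply(
    lambda row: 'big' if row['A'] * row['B'] > 50 else 'small', axis=1
)

# Vectorized version: build the boolean mask once, then select labels with np.where.
mask = df['A'] * df['B'] > 50
size_where = pd.Series(np.where(mask, 'big', 'small'), index=df.index)

print(size_where.tolist())
```

`np.where` returns a NumPy array, so wrapping it in a Series with `index=df.index` keeps the result aligned with the original rows.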

## Summary & Further Reading

This notebook covered the differences between `map`, `apply`, and `applymap` and showed practical examples. For more, see the pandas documentation: https://pandas.pydata.org/pandas-docs/stable/reference/index.html

Further exercises: try rewriting the `df['C']` calculation using a fully vectorized expression and compare timing with `%timeit`.
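
Outside a notebook, where the `%timeit` magic is unavailable, the standard-library `timeit` module can serve the same purpose. The following is a rough sketch of one way to set up the comparison; the frame size of 10,000 rows, the random data, and the repeat count are arbitrary assumptions, and actual timings depend on the machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger frame so the difference is measurable; the size is arbitrary.
rng = np.random.default_rng(0)
big = pd.DataFrame({
    'A': rng.integers(1, 100, size=10_000),
    'B': rng.integers(1, 100, size=10_000),
})


def with_apply():
    """Row-wise product using apply with axis=1."""
    return big.apply(lambda row: row['A'] * row['B'], axis=1)


def vectorized():
    """Product of the two columns as a single vectorized operation."""
    return big['A'] * big['B']


# Confirm both approaches agree before comparing their speed.
results_match = with_apply().tolist() == vectorized().tolist()

# Total wall-clock time for three runs of each approach.
t_apply = timeit.timeit(with_apply, number=3)
t_vec = timeit.timeit(vectorized, number=3)

print(f"results match: {results_match}")
print(f"apply: {t_apply:.4f}s, vectorized: {t_vec:.4f}s")
```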