PERF: Series.apply is slower on single element dict compared with multi elements dict #56942

@Alexia-I

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this issue exists on the latest version of pandas.

  • I have confirmed this issue exists on the main branch of pandas.

Reproducible Example

Hello, I noticed that calling apply() on a Series built from a single-element dictionary takes slightly longer than calling it on a Series built from a multi-element dictionary. Is this expected, or is it caused by something in how I am measuring it?

import random
import string
import time

import pandas as pd

# Create dictionary inputs
single_pair_dict = {'a': 42}
# Create a dictionary containing 10 key-value pairs
multi_pair_dict = {letter: random.randint(1, 100) for letter in string.ascii_lowercase[:10]}

# Self-defined apply function
def complex_operation(x):
    return x * x - x + 42

n = 10000
# Time Series construction plus apply() together, n times for each input
start_single = time.perf_counter()
for _ in range(n):
    pd.Series(single_pair_dict).apply(complex_operation)
time_now_single = time.perf_counter() - start_single

start_multi = time.perf_counter()
for _ in range(n):
    pd.Series(multi_pair_dict).apply(complex_operation)
time_now_multi = time.perf_counter() - start_multi

# Print the results
print(f"Time for apply() on Series with a single key-value pair: {time_now_single} seconds")
print(f"Time for apply() on Series with multiple key-value pairs: {time_now_multi} seconds")

Output:

Time for apply() on Series with single key-value pair: 1.1293659210205078 seconds
Time for apply() on Series with multiple key-value pairs: 1.0656580924987793 seconds
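
The gap reported above is roughly 6 µs per iteration (about 6% of the total), and the loop times Series construction together with apply(), so it may be worth checking whether the difference survives a more controlled measurement. The following is a minimal sketch, not a definitive benchmark: the helper name bench_apply, its number and repeat parameters, and the fixed dictionary values (used instead of random ones so runs are comparable) are assumptions introduced here for illustration.

```python
import timeit

import pandas as pd


def complex_operation(x):
    return x * x - x + 42


def bench_apply(data, number=200, repeat=5):
    """Time Series construction and apply() separately.

    Returns the best per-call time in seconds for each step, taking the
    minimum over `repeat` runs to reduce the influence of system noise.
    """
    series = pd.Series(data)  # built once, reused for the apply-only timing
    construct = timeit.repeat(lambda: pd.Series(data), number=number, repeat=repeat)
    apply_only = timeit.repeat(
        lambda: series.apply(complex_operation), number=number, repeat=repeat
    )
    return {
        "construct": min(construct) / number,
        "apply": min(apply_only) / number,
    }


# Hypothetical inputs: one pair versus ten pairs with fixed values.
single = bench_apply({"a": 42})
multi = bench_apply({letter: i for i, letter in enumerate("abcdefghij", start=1)})

for label, result in (("single", single), ("multi", multi)):
    print(
        f"{label}: construct {result['construct'] * 1e6:.1f} us, "
        f"apply {result['apply'] * 1e6:.1f} us per call"
    )
```

If the per-call apply times for the two inputs overlap across repeated runs, the original difference is more likely measurement noise than a property of single-element input.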

Installed Versions

INSTALLED VERSIONS

commit : d4c8d82
python : 3.9.18.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-88-generic
Version : #98~20.04.1-Ubuntu SMP Mon Oct 9 16:43:45 UTC 2023
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.2.0rc0
numpy : 1.26.3
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.2.2
pip : 23.3.1
Cython : 3.0.8
pytest : None
hypothesis : None
...
zstandard : None
tzdata : 2023.4
qtpy : None
pyqt5 : None

Prior Performance

No response

Metadata

    Labels

    Apply: Apply, Aggregate, Transform, Map
    Needs Info: Clarification about behavior needed to assess issue
    Performance: Memory or execution speed performance