Potential regression induced by commit 924f246 #58285

rhshadrach · 2024-04-17T02:43:16Z

PR #57915 may have induced a performance regression. If it was a necessary behavior change, this may have been expected and everything is okay.

Please check the links below. If any ASVs are parameterized, the combinations of parameters that a regression has been detected for appear as subbullets.

Subsequent benchmarks may have skipped some commits. The link below lists the commits that are between the two benchmark runs where the regression was identified.

Commit Range

cc @NickCrews

NickCrews · 2024-04-17T04:11:02Z

Repr'ing like this shouldn't be in the hot path for any application, so a 30% slowdown that amounts to only a few milliseconds doesn't seem like something we should care about at all.

rhshadrach · 2024-04-17T21:20:30Z

No disagreement on the severity of the regression, but the nature of it surprises me. In the time_to_html_mixed benchmark, I think the only objects that hit the lines changed in the linked PR are None, int, str, or dict, so the code paths taken are the same. I'm seeing isinstance(..., dict) vs isinstance(..., Mapping) take 20ns vs 80ns on my machine, a difference of about 60ns per call, so for a 4ms slowdown these lines would need to be hit roughly 60-70k times.
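For anyone wanting to reproduce the per-call timing and the back-of-the-envelope estimate, a minimal sketch follows. The helper names `per_call_ns` and `calls_needed` are made up for illustration and are not part of pandas or the ASV suite, and absolute timings will vary by machine and Python version.

```python
import timeit
from collections.abc import Mapping


def per_call_ns(stmt, namespace, number=200_000, repeat=5):
    """Best-of-`repeat` time for one execution of `stmt`, in nanoseconds."""
    best = min(timeit.repeat(stmt, globals=namespace, number=number, repeat=repeat))
    return best / number * 1e9


def calls_needed(slowdown_s, per_call_delta_ns):
    """How many calls it takes for a per-call delta to add up to `slowdown_s`."""
    return slowdown_s / (per_call_delta_ns * 1e-9)


def main():
    namespace = {"Mapping": Mapping, "obj": {"a": 1}}
    t_dict = per_call_ns("isinstance(obj, dict)", namespace)
    t_mapping = per_call_ns("isinstance(obj, Mapping)", namespace)
    print(f"isinstance(obj, dict):    {t_dict:.0f} ns")
    print(f"isinstance(obj, Mapping): {t_mapping:.0f} ns")
    # Using the figures quoted in the comment above (20ns vs 80ns, 4ms slowdown).
    print(f"calls needed for 4 ms: {calls_needed(4e-3, 80 - 20):,.0f}")


if __name__ == "__main__":
    main()
```

With the 20ns and 80ns figures from the comment, the estimate works out to about 67k calls, the same order of magnitude as the number quoted above.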

NickCrews · 2024-04-18T19:09:53Z

Hmm, yeah, that is weird. I haven't looked at the benchmark in detail. Perhaps it is because isinstance(..., dict) is able to short-circuit with some optimization or C-only code, but the Mapping version actually requires going out to the Python world and checking if the Python object contains a .keys() method or something?
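A small sketch for poking at that hypothesis. It only shows which metaclass handles each check and how dict relates to Mapping; it does not measure anything. `DuckMapping` is a hypothetical class invented for this example.

```python
from abc import ABCMeta
from collections.abc import Mapping


class DuckMapping:
    """Hypothetical class: mapping-like methods, but never registered as a Mapping."""

    def keys(self):
        return []

    def __getitem__(self, key):
        raise KeyError(key)

    def __len__(self):
        return 0

    def __iter__(self):
        return iter(())


# dict is a plain type, while Mapping is an ABC, so isinstance(obj, Mapping)
# is routed through ABCMeta.__instancecheck__.
dict_metaclass = type(dict)
mapping_metaclass = type(Mapping)

# dict is a *registered* (virtual) subclass of Mapping, not a real one.
dict_inherits_mapping = Mapping in dict.__mro__
dict_is_mapping = isinstance({}, Mapping)

# Having .keys() and friends is not enough on its own to pass the check.
duck_is_mapping = isinstance(DuckMapping(), Mapping)

print(dict_metaclass, mapping_metaclass)
print(dict_inherits_mapping, dict_is_mapping, duck_is_mapping)
```

So the Mapping check appears to rely on ABC registration and its cache rather than on looking for a .keys() method, though it still takes a different route than the check against a plain type like dict.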

rhshadrach · 2024-04-18T20:55:04Z

Right - I'm not too surprised dict is highly optimized here - it's a data structure that is used all over CPython itself. Rather, the difference between isinstance(..., dict) and isinstance(..., Mapping) does not seem sufficient to explain the regression that showed up in the ASVs. I'll dig into this a little more this weekend.

rhshadrach added the Output-Formatting, Performance, and Regression labels Apr 17, 2024

rhshadrach added this to the 3.0 milestone Apr 17, 2024
