[proposal] Allow "precompiled" perf-trampolines to largely mitigate the cost of enabling perf-trampolines #109587

czardoz · 2023-09-19T23:21:11Z

Feature or enhancement

Proposal:

The perf trampoline feature introduced in #96143 is incredibly useful to gain visibility into the cost of individual Python functions.

Currently, there are a few aspects of the trampoline that make it costly to enable on live server-side applications:

1. Each call to a new function results in a disk IO operation (to write to the /tmp/perf-<pid>.map file).
2. For a forked multiprocess model:
   a. We disable and re-enable the trampoline on fork, which means the same work must be done in child processes.
   b. We use more memory, because we do not take advantage of copy-on-write.

On a fairly large Python server (Instagram), we have observed a 1.5% overall CPU regression, mainly stemming from the repeated file writes.
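To make the per-function write concrete, here is a minimal sketch of formatting one perf map entry. It assumes the general perf map convention of one line per symbol (hex start address, hex size, symbol name). The `py::<qualname>:<filename>` symbol shape and the helper name `format_perf_map_entry` are assumptions for illustration, not CPython's actual code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format one perf map line of the form
 * "<start hex> <size hex> py::<qualname>:<filename>\n".
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the buffer is too small or formatting fails. */
int format_perf_map_entry(char *buf, size_t buf_len,
                          uintptr_t start, unsigned int size,
                          const char *qualname, const char *filename)
{
    assert(buf != NULL && qualname != NULL && filename != NULL);
    int n = snprintf(buf, buf_len, "%" PRIxPTR " %x py::%s:%s\n",
                     start, size, qualname, filename);
    if (n < 0 || (size_t)n >= buf_len) {
        return -1;  /* output was truncated or snprintf failed */
    }
    return n;
}
```

One such line is appended for every newly seen function, which is why a process that keeps encountering new code objects keeps touching the file.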

To address these costs and make it practical to keep the perf trampoline always enabled, we could allow extension modules to initialize trampolines eagerly, after the application is "warmed up" (the definition of "warmed up" is specific to the application).

Essentially, this would involve introducing two C-API functions:

//  Creates a new trampoline by doing essentially what `py_trampoline_evaluator` currently does.
//  1. Call `compile_trampoline()`
//  2. Register it by calling `trampoline_api.write_state()`
//  3. Call `_PyCode_SetExtra`
int PyUnstable_PerfTrampoline_CompileCode(PyCodeObject *co);

// This flag will be used by _PyPerfTrampoline_AfterFork_Child to
// decide whether to re-initialize trampolines in child processes. If enabled,
// we would copy the parent process perf-map file, else we would use
// the current behavior.
static Py_ssize_t persist_after_fork = 0;

int PyUnstable_PerfTrampoline_PersistAfterFork(int enable) {
    persist_after_fork = enable;
    // The original sketch had no return statement; 0 is assumed to mean success.
    return 0;
}
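The fork-time decision described in the comment above can be sketched outside of CPython as a plain file copy. This is a standalone illustration under stated assumptions: `copy_perf_map` and `after_fork_child` are hypothetical names, the map directory is a parameter (rather than a fixed /tmp) so the sketch is easy to try, and the real `_PyPerfTrampoline_AfterFork_Child` may behave differently.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy the parent's perf map to the child's path.
 * Returns 0 on success, -1 on any IO error. */
int copy_perf_map(const char *parent_path, const char *child_path)
{
    assert(parent_path != NULL && child_path != NULL);
    FILE *in = fopen(parent_path, "rb");
    if (in == NULL) {
        return -1;
    }
    FILE *out = fopen(child_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    char buf[4096];
    size_t n;
    int status = 0;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;
            break;
        }
    }
    if (ferror(in)) {
        status = -1;
    }
    fclose(in);
    if (fclose(out) != 0) {
        status = -1;
    }
    return status;
}

/* Hypothetical model of the child-side hook.
 * Returns 1 if the parent's map was carried over, 0 if persistence is
 * disabled (the caller would fall back to the current re-initialization
 * behavior), and -1 on error. */
int after_fork_child(int persist_after_fork, const char *dir,
                     long parent_pid, long child_pid)
{
    if (!persist_after_fork) {
        return 0;
    }
    char parent_path[256];
    char child_path[256];
    int a = snprintf(parent_path, sizeof parent_path,
                     "%s/perf-%ld.map", dir, parent_pid);
    int b = snprintf(child_path, sizeof child_path,
                     "%s/perf-%ld.map", dir, child_pid);
    if (a < 0 || (size_t)a >= sizeof parent_path ||
        b < 0 || (size_t)b >= sizeof child_path) {
        return -1;
    }
    return copy_perf_map(parent_path, child_path) == 0 ? 1 : -1;
}
```

With persistence enabled, the child starts with the symbol entries the parent already wrote, so trampolines compiled before the fork stay resolvable without being recompiled or rewritten in each child.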

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Linked PRs

gh-109587: Allow "precompiled" perf-trampolines to largely mitigate the cost of enabling perf-trampolines #109666

carljm · 2023-09-20T15:08:01Z

cc @pablogsal @gpshead

czardoz added the type-feature A feature request or enhancement label Sep 19, 2023

gsallam mentioned this issue Sep 21, 2023

gh-109587: Allow "precompiled" perf-trampolines to largely mitigate the cost of enabling perf-trampolines #109666

Merged

pablogsal pushed a commit that referenced this issue Oct 27, 2023

gh-109587: Allow "precompiled" perf-trampolines to largely mitigate t…

21f068d

…he cost of enabling perf-trampolines (#109666)

pablogsal closed this as completed Oct 27, 2023

aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024

pythongh-109587: Allow "precompiled" perf-trampolines to largely miti…

ce85899

…gate the cost of enabling perf-trampolines (python#109666)
