Compiling tiny traces wastes lots of memory #116017

Open
brandtbucher opened this issue Feb 28, 2024 · 4 comments
Labels: 3.13 (bugs and security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage)

Comments

@brandtbucher
Member

brandtbucher commented Feb 28, 2024

The JIT has a minimum allocation granularity of 1 page, with code and data on separate pages. Reducing this minimum would mean having pages of memory shared between executors, which is much harder to manage (and isn't thread-safe).

So the minimum amount of memory owned by an executor is 2 pages. This equals 8KB of memory on most platforms (the notable exception being newer Macs which have 16KB pages, and thus a whopping 32KB minimum allocation per trace). And all of that memory is paged in, since there's at least some memory being used on every page.
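The arithmetic above can be sketched as follows. This is a minimal illustration, not the JIT's actual allocator: the function name `executor_footprint` and the byte counts passed to it are hypothetical, and the only assumptions taken from the issue are that code and data live on separate pages and that each region is rounded up to a whole number of pages.

```python
def executor_footprint(code_bytes: int, data_bytes: int, page_size: int = 4096) -> int:
    """Bytes owned by one executor when code and data are each rounded
    up to whole pages (hypothetical helper, for illustration only)."""

    def round_up_to_pages(n: int) -> int:
        # Ceiling division, with a floor of one page per region.
        pages = max(1, -(-n // page_size))
        return pages * page_size

    return round_up_to_pages(code_bytes) + round_up_to_pages(data_bytes)


# A tiny trace (the byte counts are made up) still owns two full pages.
small_4k = executor_footprint(code_bytes=40, data_bytes=16, page_size=4096)
small_16k = executor_footprint(code_bytes=40, data_bytes=16, page_size=16384)
print(small_4k, small_16k)
```

With 4KB pages the tiny trace costs 8KB, and with 16KB pages it costs 32KB, regardless of how few bytes it actually uses.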

We should tune the trace collection machinery to find a sensible minimum trace length for tier two. We might also consider compiling fewer traces.

A separate, but hairier, issue is what to do about the array of cold exit executors. We have 512 of them, and they're one instruction each, so at 2 pages apiece they add up to 4MB of mostly empty memory (16MB on Macs with 16KB pages). For reference, if optimally packed (i.e. compiled as a single trace of length 512), they would take up about 200KB.
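As a rough check of those figures, the sketch below recomputes the cold exit totals from the numbers given in this issue (512 executors, 2 pages each, about 200KB if packed). The constant and function names are illustrative, and the packed size is the approximate figure quoted above, not a measurement.

```python
KIB = 1024
MIB = 1024 * KIB

COLD_EXITS = 512             # number of cold exit executors (from the issue)
PAGES_PER_EXECUTOR = 2       # one code page plus one data page
PACKED_TOTAL = 200 * KIB     # approximate packed size quoted in the issue


def cold_exit_total(page_size: int) -> int:
    """Total bytes owned by the cold exit array at a given page size."""
    return COLD_EXITS * PAGES_PER_EXECUTOR * page_size


def overhead_ratio(page_size: int) -> float:
    """How many times larger the current layout is than the packed estimate."""
    return cold_exit_total(page_size) / PACKED_TOTAL


for label, page_size in (("4KB pages", 4 * KIB), ("16KB pages", 16 * KIB)):
    total_mib = cold_exit_total(page_size) // MIB
    print(f"{label}: {total_mib}MB total, about {overhead_ratio(page_size):.0f}x packed")
```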

@mdboom has been working on measuring the memory overhead of tier two on our benchmark suite. Early results show that:

(Keep in mind that most of the benchmarks have a tiny memory footprint to begin with, so moderate increases can manifest as large percentages.)

CC: @gvanrossum, @markshannon

@brandtbucher added the performance (Performance or resource usage), interpreter-core (Objects, Python, Grammar, and Parser dirs), and 3.13 (bugs and security fixes) labels on Feb 28, 2024
@brandtbucher
Member Author

We may also want to consider a scheme for garbage-collecting cold traces. We already have a linked list of executors and a way of invalidating/removing them, so it probably wouldn't be too big of a project.
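One possible shape for such a scheme is sketched below. Everything here is hypothetical: the `Executor` class, the `runs_since_sweep` counter, and the `sweep_cold` function are stand-ins for illustration, the issue does not say how coldness would be measured, and a plain Python list is used in place of the interpreter's linked list of executors.

```python
from dataclasses import dataclass


@dataclass
class Executor:
    name: str
    runs_since_sweep: int = 0   # hypothetical per-executor counter
    valid: bool = True

    def run(self) -> None:
        # Stand-in for executing the compiled trace once.
        self.runs_since_sweep += 1


def sweep_cold(executors: list, min_runs: int = 1) -> list:
    """Invalidate executors that ran fewer than min_runs times since the
    previous sweep, and return the survivors with their counters reset."""
    survivors = []
    for ex in executors:
        if ex.runs_since_sweep < min_runs:
            ex.valid = False    # stand-in for invalidating it and freeing its pages
        else:
            ex.runs_since_sweep = 0
            survivors.append(ex)
    return survivors


hot, cold = Executor("hot"), Executor("cold")
hot.run()
remaining = sweep_cold([hot, cold])
print([ex.name for ex in remaining])
```

Resetting the counter on each sweep means an executor has to keep running to stay alive, so a trace that was hot once but has since gone cold is eventually reclaimed.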

@gvanrossum
Member

Yeah, for a hot executor the memory waste feels like less of an issue than for a cold executor. Without measuring what's hot, we could at least have separate min-trace-size limits for executors that have a loop in them and for those that don't. Mark probably has more interesting things to say. Meeting agenda for tomorrow?
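A minimal sketch of that idea follows. The threshold values and the names `MIN_TRACE_LEN_LOOP`, `MIN_TRACE_LEN_NO_LOOP`, and `should_compile` are hypothetical and not tuned against anything. The sketch also assumes that a trace containing a loop is more likely to be hot and so gets the lower limit, which is an inference from the comment rather than something it states.

```python
MIN_TRACE_LEN_LOOP = 4        # hypothetical limit for traces that contain a loop
MIN_TRACE_LEN_NO_LOOP = 16    # hypothetical limit for straight-line traces


def should_compile(trace_length: int, has_loop: bool) -> bool:
    """Decide whether a collected trace is long enough to be worth an executor."""
    limit = MIN_TRACE_LEN_LOOP if has_loop else MIN_TRACE_LEN_NO_LOOP
    return trace_length >= limit


# The same short trace is accepted only when it contains a loop.
print(should_compile(8, has_loop=True), should_compile(8, has_loop=False))
```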

@brandtbucher
Member Author

Action item for me here is to investigate a more efficient allocation scheme (with multiple executors per page).
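To illustrate what packing several executors into one page could look like, here is a toy bump allocator. It is only a sketch of the bookkeeping: the `PagePool` class is hypothetical, the 400-byte block size is an assumption chosen for the example, and the hard parts raised earlier in this issue (freeing, page permissions, thread safety) are deliberately left out.

```python
class PagePool:
    """Toy bump allocator that places several blocks on each page."""

    def __init__(self, page_size: int = 4096, align: int = 16) -> None:
        self.page_size = page_size
        self.align = align
        self.pages = 0               # pages mapped so far
        self.offset = page_size      # forces a fresh page on the first alloc

    def alloc(self, size: int) -> tuple:
        """Return (page_index, offset) for a block of `size` bytes."""
        # Round the request up to the alignment boundary.
        size = -(-size // self.align) * self.align
        if size > self.page_size:
            raise ValueError("block larger than a page is not handled in this sketch")
        if self.offset + size > self.page_size:
            # Current page is full: start a new one.
            self.pages += 1
            self.offset = 0
        start = self.offset
        self.offset += size
        return self.pages - 1, start


pool = PagePool()
spots = [pool.alloc(400) for _ in range(512)]   # 512 small blocks
print(pool.pages, "pages used instead of", 512)
```

Under these assumptions ten blocks fit on each 4KB page, so the 512 blocks share a few dozen pages instead of each owning its own.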

@savannahostrowski
Contributor

I'll take a look at this ahead of 3.14!

3 participants