Closed
Labels: Meta (Stuff about the project, incl. management)
The Shannon Plan
(Cf. The Seldon Plan from Asimov's Foundation trilogy. :-)
The key goal here is to speed up the bytecode interpreter through speculative specialization (#26). Optimistically, we are aiming for a speed-up of a factor of 2 for Python 3.11 (to be released Oct 2022).
The key successive steps towards this goal are:
- Generating code for the interpreter main loop and switch (Generating code for the ceval loop/switch #5)
- Quickening interpreter (Quickening interpreter #27)
- Adaptive, specializing interpreter (Adaptive, specializing interpreter #28)
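To make the last two steps concrete, here is a toy sketch of quickening plus adaptive specialization for a single "add" instruction. It is not how CPython implements this; the names `Instr`, `WARMUP`, `ADD` and `ADD_INT` are hypothetical and chosen only for illustration. The idea it tries to show: a generic instruction counts its executions, rewrites itself to a specialized form once it looks "hot" and type-stable, and falls back to the generic form when the specialized form's guard fails.

```python
from dataclasses import dataclass

# Hypothetical threshold: how many generic executions before we try to specialize.
WARMUP = 2


@dataclass
class Instr:
    op: str           # "ADD" (generic) or "ADD_INT" (specialized), both made-up names
    counter: int = 0  # executions seen by the generic form since the last reset


def execute(instr, left, right):
    """Run one binary-add instruction, specializing or de-optimizing it in place."""
    if instr.op == "ADD_INT":
        # Guard: the specialized form is only valid for exact ints.
        if type(left) is int and type(right) is int:
            return int.__add__(left, right)  # fast path, no generic dispatch
        # Guard failed: de-optimize back to the generic instruction.
        instr.op = "ADD"
        instr.counter = 0

    # Generic path: count executions and specialize once warmed up on ints.
    instr.counter += 1
    if instr.counter >= WARMUP and type(left) is int and type(right) is int:
        instr.op = "ADD_INT"
    return left + right


instr = Instr("ADD")
for a, b in [(1, 2), (3, 4), (5, 6), ("a", "b")]:
    result = execute(instr, a, b)
    print(f"{a!r} + {b!r} = {result!r}; instruction is now {instr.op}")
```

In this sketch the instruction becomes `ADD_INT` after the second int addition and reverts to `ADD` when it sees strings. A real interpreter would do the rewrite on the bytecode itself and keep any per-instruction data in an inline cache, but the count / specialize / guard / de-optimize cycle is the part that matters here.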
It will be hard to parallelize the steps towards the Shannon Plan because each step builds on the previous one. However, there are many other ideas that we can try concurrently with the Shannon Plan. If successful, those ideas will produce additional, largely independent speed-ups. Below I am listing the key ones.
Other Parallel Workstreams
- Benchmarking and profiling (Benchmarking #10, Profiling #12, Incorporate azure-sdk / azure-cli into pyperformance suite #18, (maybe) Add Tooling and Automation for Benchmarking/Profiling #21, [profiling] Add profiling instrumentation centered on the eval loop #25, [profiling] Review the existing DXPAIRS instrumentation #29). This can easily take up an engineer full-time: write and update tools that fit our purpose, manage the hardware, watch for regressions and address them.
- Work on interpreter startup time (Improve startup time. #32). This is logically completely separate, but addresses the needs of certain classes of users (e.g. the Azure CLI would benefit).
- Design and implement "lazy import". This is frequently requested and would also benefit startup time.
- Move the frame stack out of the heap (Move the data and control (block) stacks on the thread and remove frame objects. #31). This can improve cache locality and speed up calls.
- Avoid adding to the C stack for Python-to-Python calls. This was contemplated for PEP 651. While that PEP was rejected, we could revive the idea if we can prove it has a positive performance benefit.
- Tagged integers or tagged numbers (Pointer tagging aka tagged numbers #7). This is quite a complex idea, with consequences for ABI and API compatibility, but worth exploring since the potential is quite high.
- Hidden classes (Hidden classes #6). Another idea with serious ABI/API consequences that's worth exploring. Attributes would end up at fixed offsets from the instance pointer. This ties in with the Shannon Plan (since the fixed offsets allow more specializations) but doesn't depend on it (the fixed offsets would work like `__slots__`), and neither is it assumed by the Shannon Plan's performance goals.
- GC improvements. The existing GC is already highly tuned so there's no low-hanging fruit, but the current GC implementation is 1-2 decades old and can conceivably be improved, with some effort. (One possible idea would be to run GC in a separate thread, Move GC to a separate thread #58.)
- Move the `__dict__` used for instance attributes to a fixed offset relative to the start of the object. This would speed up attribute access by removing two indirect accesses. For ABI compatibility the offset might have to be negative.
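For the "lazy import" item in the list above, the standard library already offers an opt-in approximation via `importlib.util.LazyLoader`, which may help when thinking about what a built-in design would need to cover. The sketch below follows the recipe from the `importlib` documentation; the helper name `lazy_import` is just a convenience wrapper, and `colorsys` is an arbitrary stdlib module picked for the demo. This is not the proposed feature itself, only an existing mechanism with similar intent.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code only runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot lazily import {name!r}")
    # Wrap the real loader so that execution is deferred.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up the lazy module; does not run its body yet
    return module


colorsys = lazy_import("colorsys")   # cheap: the module body has not run
hue = colorsys.rgb_to_hsv(1, 0, 0)   # first attribute access triggers the real import
print(hue)
```

The startup-time benefit comes from moving the cost of executing a module from import time to first use, so modules that are never touched are never executed. The trade-off is that import errors and side effects also move to the point of first use.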