Creation of superblocks #558
Comments
We need a way to create the above superblock. Creation of the superblock involves abstract interpretation over the (hopefully specialized) bytecodes.
The initial values are:
The algorithm is something like:

    while (on_target > on_target_threshold) {
        inst = read_instruction(code, offset);
        switch (inst.opcode) {
            case NOP:
                /* no micro-ops to add */
                /* code object is unchanged */
                offset += 1;
                on_target = on_target * 1.0;
                break;
            /* etc, etc... */
        }
    }

This would be incredibly tedious and error-prone to write by hand, so we need to extend the interpreter generator. With sufficient metadata we can generate the above loop.
We will need to transform some micro-ops. Non-local jumps like calls, yields, etc. should be explicit in the bytecode. We may need to understand things like Local jumps (jumps and branches) need explicit handling. Jumps become no-ops (as all instructions need to update Guards remain as guards (the behavior differs, but that is dealt with by the tier2 interpreter/compiler).
An incremental way of doing this using the code generator might be to add a default case to the switch that bails out of the loop when an unsupported opcode is encountered, and then to extend bytecodes.c and/or the generator so that more and more opcodes are supported. Some metadata can be recovered by scanning the C code for an instruction (@iritkatriel is starting to do this in python/cpython#105482 already); presumably there's also metadata that will require us to mark certain instructions explicitly.
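A minimal sketch of that incremental approach, written in Python rather than generated C. The `UOP_TABLE` mapping, the `project_trace` function, and the choice of opcodes in the table are hypothetical illustrations, not part of the real generator. Any opcode missing from the table plays the role of the `default` case and ends the trace.

```python
# Hypothetical table: bytecode opcode name -> micro-ops it expands to.
# Supporting one more opcode means adding one more entry here.
UOP_TABLE = {
    "NOP": [],  # no micro-ops to add
    "LOAD_FAST": ["LOAD_FAST"],
    "STORE_FAST": ["STORE_FAST"],
    "POP_TOP": ["POP_TOP"],
}


def project_trace(instructions, start=0):
    """Translate instructions into micro-ops until an unsupported opcode.

    `instructions` is a list of (opcode_name, oparg) pairs. Returns the
    list of (uop_name, oparg) pairs and the offset where projection stopped.
    """
    trace = []
    offset = start
    while offset < len(instructions):
        opcode, oparg = instructions[offset]
        uops = UOP_TABLE.get(opcode)
        if uops is None:
            # The "default" case: unsupported opcode, bail out of the loop.
            break
        trace.extend((uop, oparg) for uop in uops)
        offset += 1
    return trace, offset


code = [("LOAD_FAST", 0), ("NOP", 0), ("STORE_FAST", 1), ("CALL", 1), ("POP_TOP", 0)]
trace, stopped_at = project_trace(code)
print(trace)       # [('LOAD_FAST', 0), ('STORE_FAST', 1)]
print(stopped_at)  # 3
```

Projection stops at `CALL` because it has no table entry in this sketch; adding an entry for it would let the same input produce a longer trace without touching the loop.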
- Tweak uops debugging output
- Fix the bug from gh-106290
- Rename `SET_IP` to `SAVE_IP` (per faster-cpython/ideas#558)
- Add a `SAVE_IP` uop at the start of the trace (ditto)
- Allow `unbound_local_error`; this gives us uops for `LOAD_FAST_CHECK`, `LOAD_CLOSURE`, and `DELETE_FAST`
- Longer traces
- Support `STORE_FAST_LOAD_FAST`, `STORE_FAST_STORE_FAST`
- Add deps on pycore_uops.h to Makefile(.pre.in)
A superblock will be a linear sequence of "micro ops", that may have multiple side exits, but will only have one entry point.
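One way to picture that shape is the sketch below. The `MicroOp` and `Superblock` classes, their field names, and the `GUARD_TYPE` micro-op name are all assumptions made for illustration; they are not the actual CPython data structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MicroOp:
    name: str
    oparg: int = 0
    # Bytecode offset to resume at in the base interpreter if this
    # micro-op exits the superblock; None means it never exits.
    exit_target: Optional[int] = None


@dataclass
class Superblock:
    entry_offset: int  # the single entry point
    uops: List[MicroOp] = field(default_factory=list)  # linear sequence

    def side_exits(self):
        """Return the bytecode offsets this superblock can exit to."""
        return [u.exit_target for u in self.uops if u.exit_target is not None]


sb = Superblock(entry_offset=10)
sb.uops.append(MicroOp("LOAD_FAST", 0))
sb.uops.append(MicroOp("GUARD_TYPE", 0, exit_target=12))  # hypothetical guard
sb.uops.append(MicroOp("STORE_FAST", 1))
print(sb.side_exits())  # [12]
```

There is no list of entry points to maintain: execution always starts at the first micro-op, and only the exits need bookkeeping.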
The start of the superblock will be determined by "hotspots" in the code. The tricky bit is creating the rest of the superblock.
For non-branching code, extending the superblock is easy enough.
For branches we either need the base interpreter to record which way the branch goes, or to make an estimate based on static analysis and the value being tested, if it is available.
For non-local jumps, like calls and returns, we need to rely on the information recorded by the specializing adaptive interpreter.
We should end the superblock when either the estimated likelihood of execution staying in the superblock drops below a threshold, say 40%, or when we cannot estimate that likelihood.
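The stopping rule can be sketched as follows, assuming each instruction comes with an estimated probability that execution continues straight on to the next one. The `extend_superblock` function, the sample probabilities, and the use of `None` for "no estimate available" are hypothetical; only the 40% threshold comes from the text above.

```python
def extend_superblock(steps, threshold=0.4):
    """Return (number of steps included, final likelihood estimate).

    `steps` is a list of (opcode_name, probability) pairs. `probability`
    is the estimated chance that execution stays in the superblock after
    this step, or None when no estimate is available.
    """
    on_target = 1.0
    length = 0
    for name, probability in steps:
        if on_target <= threshold:
            break  # likelihood of staying on the trace is already too low
        if probability is None:
            break  # cannot estimate the likelihood: end the superblock
        on_target *= probability
        length += 1
    return length, on_target


steps = [
    ("LOAD_FAST", 1.0),          # non-branching: always stays on the trace
    ("POP_JUMP_IF_FALSE", 0.9),  # branch that usually goes one way
    ("CALL", 0.5),
    ("POP_JUMP_IF_TRUE", 0.6),
    ("STORE_FAST", 1.0),
]
length, on_target = extend_superblock(steps)
print(length)  # 4
```

After the fourth step the running estimate is about 0.27, which is below the threshold, so `STORE_FAST` is left out. In this sketch the instruction that pushes the estimate under the threshold is still included; stopping one instruction earlier is an equally plausible choice.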
An example:
The bytecode for the above snippet is:
and the bytecode for `typing.cast` is:
Starting at the `LOAD_GLOBAL 0 (typing)` the resulting superblock might look something like:
All the `SAVE_IP` instructions exist to make sure that the `frame->prev_instr` field gets updated correctly. Don't worry, optimization should remove almost all of them.