@JeffersGlass
Contributor

@JeffersGlass JeffersGlass commented Feb 8, 2024

Adds the ability to track pairs, triples, and longer sequences of UOps to the pystats statistics.

Currently, this only counts these statistics for the non-JIT tier 2 interpreter. I intend to include tracking in the JIT in a follow-up PR.
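For readers unfamiliar with the idea, counting "pairs, triples, and longer sequences" amounts to sliding a fixed-size window over the stream of executed UOps. The following is a minimal Python sketch of that idea only, not the C implementation in this PR; `count_uop_sequences` and the sample trace are hypothetical names made up for illustration.

```python
from collections import Counter

def count_uop_sequences(trace, length=2):
    """Count every run of `length` consecutive uops in `trace`.

    Illustrative only: the PR gathers these counts inside the
    tier 2 interpreter in C, not from a Python list.
    """
    counts = Counter()
    # Slide a window of `length` uops across the trace.
    for start in range(len(trace) - length + 1):
        counts[tuple(trace[start:start + length])] += 1
    return counts

# Hypothetical trace built from uop names that appear in this PR.
trace = [
    "_SET_IP", "LOAD_NAME", "_CHECK_VALIDITY",
    "_SET_IP", "LOAD_NAME", "_CHECK_VALIDITY",
]
pairs = count_uop_sequences(trace, 2)
triples = count_uop_sequences(trace, 3)
```

Here `pairs` maps each adjacent pair to its number of occurrences, and `triples` does the same for runs of three.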

To try it out, run:

./configure --enable-pystats
make
mkdir /tmp/py_stats
PYTHON_UOPS=1 ./python -X pystats -c $'x = 0\nfor i in range(1000):\n x+=i'

The output file at /tmp/py_stats/???.txt will contain a new section with UOp Sequence counts:

UOp sequence count[BUILD_TUPLE,_CHECK_VALIDITY]: 2
UOp sequence count[LIST_APPEND,_CHECK_VALIDITY]: 13
UOp sequence count[LOAD_DEREF,_SET_IP]: 2
UOp sequence count[LOAD_FAST,LOAD_FAST]: 7
...
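If you want to inspect these lines without summarize_stats.py, a small parser is enough. This is a sketch that assumes the line format is exactly as shown above (`UOp sequence count[NAME,NAME,...]: N`); the format may change during review, and `parse_sequence_counts` is a hypothetical helper, not part of this PR.

```python
import re
from collections import Counter

# Assumed format, taken from the sample output in this description.
LINE_RE = re.compile(r"^UOp sequence count\[(?P<names>[^\]]+)\]: (?P<count>\d+)$")

def parse_sequence_counts(lines):
    """Return a Counter mapping a tuple of uop names to its total count."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines for other statistics are skipped
            names = tuple(match["names"].split(","))
            counts[names] += int(match["count"])
    return counts

sample = [
    "UOp sequence count[BUILD_TUPLE,_CHECK_VALIDITY]: 2",
    "UOp sequence count[LIST_APPEND,_CHECK_VALIDITY]: 13",
    "UOp sequence count[LOAD_DEREF,_SET_IP]: 2",
    "UOp sequence count[LOAD_FAST,LOAD_FAST]: 7",
    "...",
]
parsed = parse_sequence_counts(sample)
```

Counts are summed rather than overwritten so that lines from several stats files can be fed through the same function.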

Tools/scripts/summarize_stats.py has also been expanded to encompass these new stats, with a new section:

Counts for top 100 UOp Sequences of Length 2
Sequence Count Self Cumulative
_CHECK_VALIDITY,_SET_IP 3,932 19.0% 19.0%
LOAD_NAME,_CHECK_VALIDITY 1,966 9.5% 28.6%
_SET_IP,LOAD_NAME 1,966 9.5% 38.1%
_ITER_CHECK_RANGE,_SET_IP 984 4.8% 42.9%
_GUARD_NOT_EXHAUSTED_RANGE,_ITER_CHECK_RANGE 984 4.8% 47.6%
STORE_NAME,_SET_IP 983 4.8% 52.4%
STORE_NAME,_CHECK_VALIDITY 983 4.8% 57.1%
_SET_IP,STORE_NAME 983 4.8% 61.9%
_SET_IP,_BINARY_OP_ADD_INT 983 4.8% 66.7%
_SET_IP,_ITER_NEXT_RANGE 983 4.8% 71.4%
_SET_IP,_JUMP_TO_TOP 983 4.8% 76.2%
_GUARD_BOTH_INT,_CHECK_VALIDITY 983 4.8% 81.0%
_BINARY_OP_ADD_INT,_GUARD_BOTH_INT 983 4.8% 85.7%
_ITER_NEXT_RANGE,_GUARD_NOT_EXHAUSTED_RANGE 983 4.8% 90.5%
_JUMP_TO_TOP,_CHECK_VALIDITY 983 4.8% 95.2%
_CHECK_VALIDITY,STORE_NAME 983 4.8% 100.0%
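To make the Self and Cumulative columns concrete: as I read the table, Self is each sequence's share of all counted sequences of that length, and Cumulative is the running total of Self down the table. The sketch below recomputes both from the counts above; `summarize` is a hypothetical helper, not the code in summarize_stats.py.

```python
def summarize(counts, top=100):
    """Return (sequence, count, self %, cumulative %) rows, largest first."""
    total = sum(counts.values())
    rows = []
    running = 0
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    for sequence, count in ordered[:top]:
        running += count
        rows.append((sequence, count, f"{count / total:.1%}", f"{running / total:.1%}"))
    return rows

# Counts copied from the length-2 table above.
counts = {
    "_CHECK_VALIDITY,_SET_IP": 3932,
    "LOAD_NAME,_CHECK_VALIDITY": 1966,
    "_SET_IP,LOAD_NAME": 1966,
    "_ITER_CHECK_RANGE,_SET_IP": 984,
    "_GUARD_NOT_EXHAUSTED_RANGE,_ITER_CHECK_RANGE": 984,
    "STORE_NAME,_SET_IP": 983,
    "STORE_NAME,_CHECK_VALIDITY": 983,
    "_SET_IP,STORE_NAME": 983,
    "_SET_IP,_BINARY_OP_ADD_INT": 983,
    "_SET_IP,_ITER_NEXT_RANGE": 983,
    "_SET_IP,_JUMP_TO_TOP": 983,
    "_GUARD_BOTH_INT,_CHECK_VALIDITY": 983,
    "_BINARY_OP_ADD_INT,_GUARD_BOTH_INT": 983,
    "_ITER_NEXT_RANGE,_GUARD_NOT_EXHAUSTED_RANGE": 983,
    "_JUMP_TO_TOP,_CHECK_VALIDITY": 983,
    "_CHECK_VALIDITY,STORE_NAME": 983,
}
rows = summarize(counts)
```

The cumulative figure is computed from the running count rather than by adding rounded percentages, which avoids drift from rounding.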

Finally, a new environment variable, PYTHONSTATS_UOPDEPTH, can be set to an integer greater than 2 to record sequences longer than pairs (the default depth is 2). For example:

PYTHONSTATS_UOPDEPTH=4 PYTHON_UOPS=1 ./python -X pystats -c $'x = 0\nfor i in range(1000):\n x+=i'

Yields:

Counts for top 100 UOp Sequences of Length 2
Sequence Count Self Cumulative
_CHECK_VALIDITY,_SET_IP 3,932 19.0% 19.0%
LOAD_NAME,_CHECK_VALIDITY 1,966 9.5% 28.6%
_SET_IP,LOAD_NAME 1,966 9.5% 38.1%
_ITER_CHECK_RANGE,_SET_IP 984 4.8% 42.9%
_GUARD_NOT_EXHAUSTED_RANGE,_ITER_CHECK_RANGE 984 4.8% 47.6%
STORE_NAME,_SET_IP 983 4.8% 52.4%
STORE_NAME,_CHECK_VALIDITY 983 4.8% 57.1%
_SET_IP,STORE_NAME 983 4.8% 61.9%
_SET_IP,_BINARY_OP_ADD_INT 983 4.8% 66.7%
_SET_IP,_ITER_NEXT_RANGE 983 4.8% 71.4%
_SET_IP,_JUMP_TO_TOP 983 4.8% 76.2%
_GUARD_BOTH_INT,_CHECK_VALIDITY 983 4.8% 81.0%
_BINARY_OP_ADD_INT,_GUARD_BOTH_INT 983 4.8% 85.7%
_ITER_NEXT_RANGE,_GUARD_NOT_EXHAUSTED_RANGE 983 4.8% 90.5%
_JUMP_TO_TOP,_CHECK_VALIDITY 983 4.8% 95.2%
_CHECK_VALIDITY,STORE_NAME 983 4.8% 100.0%

Counts for top 100 UOp Sequences of Length 3
Sequence Count Self Cumulative
LOAD_NAME,_CHECK_VALIDITY,_SET_IP 1,966 9.5% 9.5%
_SET_IP,LOAD_NAME,_CHECK_VALIDITY 1,966 9.5% 19.0%
_CHECK_VALIDITY,_SET_IP,LOAD_NAME 1,966 9.5% 28.6%
_GUARD_NOT_EXHAUSTED_RANGE,_ITER_CHECK_RANGE,_SET_IP 984 4.8% 33.3%
STORE_NAME,_SET_IP,_BINARY_OP_ADD_INT 983 4.8% 38.1%
STORE_NAME,_CHECK_VALIDITY,_SET_IP 983 4.8% 42.9%
_SET_IP,STORE_NAME,_CHECK_VALIDITY 983 4.8% 47.6%
_SET_IP,_BINARY_OP_ADD_INT,_GUARD_BOTH_INT 983 4.8% 52.4%
_SET_IP,_ITER_NEXT_RANGE,_GUARD_NOT_EXHAUSTED_RANGE 983 4.8% 57.1%
_SET_IP,_JUMP_TO_TOP,_CHECK_VALIDITY 983 4.8% 61.9%
_GUARD_BOTH_INT,_CHECK_VALIDITY,_SET_IP 983 4.8% 66.7%
_BINARY_OP_ADD_INT,_GUARD_BOTH_INT,_CHECK_VALIDITY 983 4.8% 71.4%
_ITER_CHECK_RANGE,_SET_IP,_JUMP_TO_TOP 983 4.8% 76.2%
_ITER_NEXT_RANGE,_GUARD_NOT_EXHAUSTED_RANGE,_ITER_CHECK_RANGE 983 4.8% 80.9%
_JUMP_TO_TOP,_CHECK_VALIDITY,STORE_NAME 983 4.8% 85.7%
_CHECK_VALIDITY,STORE_NAME,_SET_IP 983 4.8% 90.5%
_CHECK_VALIDITY,_SET_IP,STORE_NAME 983 4.8% 95.2%
_CHECK_VALIDITY,_SET_IP,_ITER_NEXT_RANGE 983 4.8% 100.0%

Counts for top 100 UOp Sequences of Length 4
Sequence Count Self Cumulative
_SET_IP,LOAD_NAME,_CHECK_VALIDITY,_SET_IP 1,966 9.5% 9.5%
_CHECK_VALIDITY,_SET_IP,LOAD_NAME,_CHECK_VALIDITY 1,966 9.5% 19.0%
LOAD_NAME,_CHECK_VALIDITY,_SET_IP,LOAD_NAME 983 4.8% 23.8%
LOAD_NAME,_CHECK_VALIDITY,_SET_IP,STORE_NAME 983 4.8% 28.6%
STORE_NAME,_SET_IP,_BINARY_OP_ADD_INT,_GUARD_BOTH_INT 983 4.8% 33.3%
STORE_NAME,_CHECK_VALIDITY,_SET_IP,_ITER_NEXT_RANGE 983 4.8% 38.1%
_SET_IP,STORE_NAME,_CHECK_VALIDITY,_SET_IP 983 4.8% 42.9%
_SET_IP,_BINARY_OP_ADD_INT,_GUARD_BOTH_INT,_CHECK_VALIDITY 983 4.8% 47.6%
_SET_IP,_ITER_NEXT_RANGE,_GUARD_NOT_EXHAUSTED_RANGE,_ITER_CHECK_RANGE 983 4.8% 52.4%
_SET_IP,_JUMP_TO_TOP,_CHECK_VALIDITY,STORE_NAME 983 4.8% 57.1%
_GUARD_BOTH_INT,_CHECK_VALIDITY,_SET_IP,LOAD_NAME 983 4.8% 61.9%
_BINARY_OP_ADD_INT,_GUARD_BOTH_INT,_CHECK_VALIDITY,_SET_IP 983 4.8% 66.7%
_ITER_CHECK_RANGE,_SET_IP,_JUMP_TO_TOP,_CHECK_VALIDITY 983 4.8% 71.4%
_GUARD_NOT_EXHAUSTED_RANGE,_ITER_CHECK_RANGE,_SET_IP,_JUMP_TO_TOP 983 4.8% 76.2%
_ITER_NEXT_RANGE,_GUARD_NOT_EXHAUSTED_RANGE,_ITER_CHECK_RANGE,_SET_IP 983 4.8% 80.9%
_JUMP_TO_TOP,_CHECK_VALIDITY,STORE_NAME,_SET_IP 983 4.8% 85.7%
_CHECK_VALIDITY,STORE_NAME,_SET_IP,_BINARY_OP_ADD_INT 983 4.8% 90.5%
_CHECK_VALIDITY,_SET_IP,STORE_NAME,_CHECK_VALIDITY 983 4.8% 95.2%
_CHECK_VALIDITY,_SET_IP,_ITER_NEXT_RANGE,_GUARD_NOT_EXHAUSTED_RANGE 983 4.8% 100.0%

EDIT: I see this has automatically requested a review from a huge number of people. This is my first time submitting a PR to CPython - please do forgive me if I've goofed this process up somehow.


📚 Documentation preview 📚: https://cpython-previews--115181.org.readthedocs.build/

Make optimization_stats.opcode an array of pointers to UOpStats objects, instead of a member array.
Add a *next_stats[512] field to the UOpStats struct.
Add _init_pystats function to initialize these new member structs.
Call _init_pystats inside _PyConfig_Write.
Add print_uop_sequence function to specialize.c to output stored chains of uops.
Sets the maximum depth of UOp sequences to track. Defaults to 2.
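The commit notes above describe what is effectively a prefix tree: each UOpStats entry holds a table of pointers to the stats for whichever uop comes next. A rough Python analogue, offered only as a sketch of the data structure, might look like the following. `UOpStatsNode`, `record`, and `sequences` are hypothetical names, and a dict stands in for the fixed 512-entry pointer array in the C struct.

```python
class UOpStatsNode:
    """Python stand-in for a UOpStats struct with a next_stats table."""

    def __init__(self):
        self.count = 0
        self.next_stats = {}  # uop name -> UOpStatsNode (C: array of 512 pointers)

def record(root, trace, max_depth=2):
    """Record every run of up to `max_depth` consecutive uops in `trace`."""
    for start in range(len(trace)):
        node = root
        for uop in trace[start:start + max_depth]:
            # Create the child lazily, then bump its execution count.
            node = node.next_stats.setdefault(uop, UOpStatsNode())
            node.count += 1

def sequences(node, prefix=()):
    """Yield (sequence, count) for every chain stored under `node`."""
    for uop, child in node.next_stats.items():
        path = prefix + (uop,)
        yield path, child.count
        yield from sequences(child, path)

root = UOpStatsNode()
record(root, ["A", "B", "A", "B"], max_depth=2)
stored = dict(sequences(root))
```

Walking the tree to depth N gives the counts for sequences of length N, which is why a single depth setting controls how long the reported sequences can be.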
@JeffersGlass
Contributor Author

Thank you both! @erlend-aasland your fix recipe worked a treat - I will be more careful about merging from main/pushing in the future.

Member

@markshannon markshannon left a comment

A couple of minor issues, but looks good in general.

@bedevere-app

bedevere-app bot commented Feb 15, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@JeffersGlass
Contributor Author

Thank you for the corrections. I have made the requested changes; please review again.

@JeffersGlass JeffersGlass changed the title from "gh-115178: Add Counts of UOp Sequences to pystats" to "gh-115178: Add Counts of UOp Pairs to pystats" on Feb 19, 2024
@Fidget-Spinner
Member

@markshannon I'm going to merge this soon.

@JeffersGlass could you please fix the merge conflicts? Thanks!

@JeffersGlass
Contributor Author

JeffersGlass commented Mar 2, 2024

Thanks @Fidget-Spinner - I've caught up with main and tests are passing.

@JeffersGlass
Contributor Author

Actually @Fidget-Spinner before you merge, I’ve seen something I missed in this morning’s merge from main, let me take care of that later today.

@Fidget-Spinner
Member

OK, just ping me again when you feel it's ready, I will take a look then!

@JeffersGlass
Contributor Author

@Fidget-Spinner I've made the updates I wanted to, to fix detection of the different kinds of opcode pairs and give them good labels in the output. I've also just rebased on top of main; I think this is ready for review.

@markshannon markshannon self-requested a review April 16, 2024 13:25
Member

@markshannon markshannon left a comment

Thanks @JeffersGlass for doing this.
This is going to be very useful for selecting micro-op pairs for the JIT.

@markshannon markshannon merged commit acf69e0 into python:main Apr 16, 2024
diegorusso pushed a commit to diegorusso/cpython that referenced this pull request Apr 17, 2024
JeffersGlass added a commit to JeffersGlass/ideas that referenced this pull request May 21, 2024
This follows up [115181](python/cpython#115181), and adds a note about the presence of Uop pair counts in pystats.
gvanrossum pushed a commit to faster-cpython/ideas that referenced this pull request May 21, 2024
This follows up [115181](python/cpython#115181),
and adds a note about the presence of Uop pair counts in pystats.

7 participants