gh-119692: Add Total UOp 'cost' to PyStats Output #119693

JeffersGlass · 2024-05-28T21:57:46Z

This PR adds an additional table to the output from summarize_stats.py: a table of (# of times a UOp was executed) * (length of that UOp in machine code), sorted by this value. This makes it clear how much time* is being spent in each UOp, as opposed to just which ones are most frequently executed.

*Machine instruction count is a rough proxy for time, but a really easy one to calculate.

The new table looks like this:

Total Machine Instruction Counts per UOp

Name	Product	Self	Cumulative	Count	Length (Machine Instructions)
_COLD_EXIT	741,336	14.4%	14.4%	1,173	632
_TIER2_RESUME_CHECK	436,914	8.5%	22.8%	2,511	174
_STORE_FAST_0	389,712	7.6%	30.4%	2,118	184
_BINARY_OP_ADD_INT	327,339	6.3%	36.7%	983	333
_START_EXECUTOR	231,442	4.5%	41.2%	1,193	194
_LOAD_FAST_0	225,144	4.4%	45.6%	1,416	159
...	...	...	...	...	...
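As a rough illustration of how the table above is derived, here is a minimal sketch (not the code in this PR). The function name `uop_cost_rows` and the two input dicts are hypothetical. It assumes per-UOp execution counts and per-UOp lengths are already available as plain dictionaries, multiplies them, sorts by the product, and computes the Self and Cumulative percentages.

```python
from typing import NamedTuple


class UopCostRow(NamedTuple):
    name: str
    count: int
    size: int
    total: int            # count * size
    self_pct: float       # this UOp's share of the grand total
    cumulative_pct: float  # running share, in sorted order


def uop_cost_rows(counts: dict[str, int], sizes: dict[str, int]) -> list[UopCostRow]:
    """Hypothetical helper: build rows sorted by count * size, largest first."""
    # Only UOps that have both a count and a known size can be costed.
    totals = {name: counts[name] * sizes[name] for name in counts if name in sizes}
    grand_total = sum(totals.values())
    rows = []
    running = 0
    for name, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        running += total
        rows.append(
            UopCostRow(
                name,
                counts[name],
                sizes[name],
                total,
                100 * total / grand_total if grand_total else 0.0,
                100 * running / grand_total if grand_total else 0.0,
            )
        )
    return rows


# Counts and lengths copied from the first three rows of the table above.
# The percentages printed here differ from the table because only three
# UOps are included in this sample.
counts = {"_COLD_EXIT": 1173, "_TIER2_RESUME_CHECK": 2511, "_STORE_FAST_0": 2118}
sizes = {"_COLD_EXIT": 632, "_TIER2_RESUME_CHECK": 174, "_STORE_FAST_0": 184}
rows = uop_cost_rows(counts, sizes)
for row in rows:
    print(f"{row.name}\t{row.total:,}\t{row.self_pct:.1f}%\t{row.cumulative_pct:.1f}%")
```

The products for these three UOps match the "Product" column above; everything else about the function is an assumption made for illustration.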

Closes #119692. Tagging @brandtbucher as the requester of this feature, and @mdboom for pystats visibility.

Issue: Add "Total Machine Code Cost" of UOps to PyStats #119692

mdboom

Thanks. This LGTM, but let's let @brandtbucher confirm the stats themselves make sense for what he needs.

brandtbucher

Thanks for doing this. I think it will be really useful!

A few notes:

- I don't love searching for and parsing jit_stencils.h like this... it's pretty fragile (the JIT is changing quite a bit right now, and out-of-tree builds are a thing unfortunately). A more robust solution would be to dump the code size as part of the stats themselves in the interpreter, since we have access to the stencil_groups array in the C code. Then we could just parse it out of the stats dump normally.
- No need to repeat the "Self" and "Cumulative" values in the new table.
- I'd rename "length" to "size" (minor, but we use "length" to mean other things in the stats summary).
- Remove all mention of "machine instructions", since what we're really measuring is size (in bytes).
- I'd replace "Product" with something like "Total Size" or "Total Cost" in the table, and move it after the "Count" and "Size" columns (which makes it a bit clearer to me that it is derived from them).

JeffersGlass · 2024-05-31T18:19:39Z

Thanks for the feedback @brandtbucher! I've reworked things so that the code size (and data size) of each stencil are dumped as part of the stats, and the summarize script picks them up from there. I've also renamed and re-ordered the table fields to clarify what is being shown.

I've left the Self and Cumulative columns for now - they're the percentage of all the bytes that were jitted by the current UOp (and the running total of the same), if that makes sense, so they're not just repeating the values from an earlier table. But I'm happy to remove or re-label them if that doesn't actually seem useful.

The new table with some sample data looks like:

Total Bytes Executed per JIT'ed UOp

Name	Count	Stencil Size (Bytes)	Total Size	Self (Total Size)	Cumulative (Total Size)
_COLD_EXIT	23,808	447	10,642,176	30.9%	30.9%
_STORE_NAME	23,800	259	6,164,200	17.9%	48.7%
_START_EXECUTOR	23,808	170	4,047,360	11.7%	60.5%
_EXIT_TRACE	23,808	151	3,595,008	10.4%	70.9%
_ITER_NEXT_RANGE	23,800	86	2,046,800	5.9%	76.8%
_ITER_CHECK_RANGE	23,808	82	1,952,256	5.7%	82.5%
_CHECK_VALIDITY	23,800	76	1,808,800	5.2%	87.7%
_GUARD_NOT_EXHAUSTED_RANGE	23,808	72	1,714,176	5.0%	92.7%
_TIER2_RESUME_CHECK	23,808	66	1,571,328	4.6%	97.2%
_SET_IP	23,800	40	952,000	2.8%	100.0%

** Updated - see below **
Right now, there's a bit of a kludge in load_raw_data so that the stencil lengths don't get summed. The lines containing the info about the code stencils look like uops[_MATCH_KEYS].code_size : 74, and this snippet makes sure they're just recorded, not summed.

# Data about JIT stencils isn't cumulative
if "code_size" in key or "data_size" in key:
    stats[key.strip()] = int(value)
else:
    stats[key.strip()] += int(value)

~~I can see breaking this data into a new prefix (uops[_MATCH_KEYS].data.XXX maybe, and looking for data in the key?) and reworking how its loaded, if that seems cleaner?~~

JeffersGlass · 2024-05-31T18:49:36Z

I made a small format change - keys that have metadata in them should simply be set across the input files (instead of summed). So the dumped stencil-length data looks like:

uops[_CONVERT_VALUE].metadata.code_size : 227
uops[_CONVERT_VALUE].metadata.data_size : 280
uops[_COPY].metadata.code_size : 137
uops[_COPY].metadata.data_size : 216
uops[_COPY_FREE_VARS].metadata.code_size : 396
uops[_COPY_FREE_VARS].metadata.data_size : 480
...

I think ideally, there would be some checking that these values are consistent across all the stats files, but currently it'll just use the last value it finds. A bit of a kludge still, but since I would guess it's rare to have stats files hanging around from multiple builds with different jit stencils, perhaps this is fine for now?
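The consistency check mentioned above could look something like the following sketch. This is not the code in this PR: `merge_stats_lines` is a hypothetical helper, the `execution_count` key in the usage example is only illustrative, and matching on the substring `.metadata.` is an assumption based on the key format shown above. Ordinary keys are summed across files, while metadata keys are set once and must agree in every file.

```python
from collections import Counter
from typing import Iterable


def merge_stats_lines(files: Iterable[Iterable[str]]) -> Counter:
    """Hypothetical helper: merge 'key : value' lines from several stats files."""
    stats: Counter = Counter()
    for lines in files:
        for line in lines:
            # Split on the last colon so the key may contain brackets and dots.
            key, sep, value = line.rpartition(":")
            if not sep:
                continue  # not a 'key : value' line
            key = key.strip()
            try:
                number = int(value)
            except ValueError:
                continue  # skip lines whose value is not an integer
            if ".metadata." in key:
                # Metadata (e.g. stencil sizes) is not cumulative: record it
                # once and require every later file to report the same value.
                if key in stats and stats[key] != number:
                    raise ValueError(
                        f"inconsistent metadata for {key}: {stats[key]} vs {number}"
                    )
                stats[key] = number
            else:
                stats[key] += number
    return stats


# Two in-memory "files"; the execution_count key is illustrative only.
file_a = ["uops[_COPY].execution_count : 10", "uops[_COPY].metadata.code_size : 137"]
file_b = ["uops[_COPY].execution_count : 5", "uops[_COPY].metadata.code_size : 137"]
merged = merge_stats_lines([file_a, file_b])
print(merged["uops[_COPY].execution_count"])    # summed across files
print(merged["uops[_COPY].metadata.code_size"])  # set, not summed
```

Raising on a mismatch is one possible policy; warning and keeping the last value, as the current behavior does, would be a smaller change.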

JeffersGlass added 5 commits May 28, 2024 17:43

Add total uop 'cost' to pystats output

c12f9ba

Rename jit_data -> jit_stencils_data

e8dafa8

Add comment, clarify variable names

abbde18

Rename typing -> JitStencilLengthData

7db8fb6

Merge branch 'main' into uop-instruction-count-cost

f36ce34

bedevere-app bot added the awaiting review label May 28, 2024

bedevere-app bot mentioned this pull request May 28, 2024

Add "Total Machine Code Cost" of UOps to PyStats #119692

Open

mdboom approved these changes May 29, 2024

View reviewed changes

bedevere-app bot added awaiting core review and removed awaiting review labels May 29, 2024

brandtbucher reviewed May 30, 2024

View reviewed changes

JeffersGlass added 6 commits May 31, 2024 12:32

Working code_size export to stats

21baab0

Use optimization stats instead of global stats

603810b

Update summarize_stats to use new data

cf5ca23

Fix error when running Tier 2 w/o JIT

71a0c07

Apply black

f35e68a

Stencil lengths are not cummulative

734257b

Tag metadata with dotted key

d3612cf

brandtbucher added the topic-JIT label May 31, 2024

JeffersGlass mentioned this pull request Jun 21, 2024

SuperInstructions Implementation JeffersGlass/cpython#1

Open

19 tasks
