k0kubun
Member

@k0kubun k0kubun commented Jul 13, 2023

This PR makes ratio_in_yjit available on any YJIT-enabled build. To minimize overhead when neither --yjit nor --yjit-stats is used, we increment a global variable without branching or locking. The counter could be inaccurate when multiple Ractors are running.
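As a rough illustration of the approach described above (not the code from this PR), the sketch below keeps a global counter that the interpreter bumps with a plain add, and derives a percentage from it. The names `vm_insn_count_sketch`, `count_vm_insn`, and `ratio_in_yjit_pct` are hypothetical, and the formula (JIT-executed instructions over all instructions) is an assumption about what the ratio means rather than the exact definition YJIT uses.

```c
#include <stdint.h>

/* Hypothetical global: number of instructions executed by the interpreter.
   Not the actual variable from the PR. */
static uint64_t vm_insn_count_sketch = 0;

/* Called once per interpreted instruction.
   A plain increment: no branch on "are stats enabled?", no lock. */
static inline void count_vm_insn(void) {
    vm_insn_count_sketch++;
}

/* Assumed definition: share of instructions that ran in JIT code, in percent.
   yjit_insns is assumed to be tracked separately by the JIT. */
static double ratio_in_yjit_pct(uint64_t yjit_insns, uint64_t vm_insns) {
    uint64_t total = yjit_insns + vm_insns;
    if (total == 0) {
        return 0.0; /* avoid dividing by zero before anything has run */
    }
    return 100.0 * (double)yjit_insns / (double)total;
}
```

Because the increment is unconditional, the interpreter loop pays for one add per instruction and nothing else; the cost of computing the ratio is deferred to whoever reads the stats.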

Performance

interpreter

before: ruby 3.3.0dev (2023-07-13T18:48:14Z master 62ecf78b87) [x86_64-linux]
after: ruby 3.3.0dev (2023-07-13T20:22:04Z yjit-reg-counter 6c13fcc476) [x86_64-linux]

----------  -----------  ----------  ----------  ----------  -------------  ------------
bench       before (ms)  stddev (%)  after (ms)  stddev (%)  after 1st itr  before/after
railsbench  1912.2       0.6         1913.4      0.6         1.00           1.00
----------  -----------  ----------  ----------  ----------  -------------  ------------

--yjit

before: ruby 3.3.0dev (2023-07-13T18:48:14Z master 62ecf78b87) +YJIT [x86_64-linux]
after: ruby 3.3.0dev (2023-07-13T20:22:04Z yjit-reg-counter 6c13fcc476) +YJIT [x86_64-linux]

----------  -----------  ----------  ----------  ----------  -------------  ------------
bench       before (ms)  stddev (%)  after (ms)  stddev (%)  after 1st itr  before/after
railsbench  1237.2       1.1         1239.3      1.1         1.00           1.00
----------  -----------  ----------  ----------  ----------  -------------  ------------

--yjit-stats

before: ruby 3.3.0dev (2023-07-13T18:48:14Z master 62ecf78b87) +YJIT [x86_64-linux]
after: ruby 3.3.0dev (2023-07-13T20:22:04Z yjit-reg-counter 6c13fcc476) +YJIT [x86_64-linux]

----------  -----------  ----------  ----------  ----------  -------------  ------------
bench       before (ms)  stddev (%)  after (ms)  stddev (%)  after 1st itr  before/after
railsbench  1326.9       1.0         1317.0      1.0         1.00           1.01
----------  -----------  ----------  ----------  ----------  -------------  ------------

This PR doesn't seem to have a significant performance impact on railsbench.

@k0kubun k0kubun force-pushed the yjit-reg-counter branch 4 times, most recently from 72b69da to 9771422 Compare July 13, 2023 03:38
@k0kubun k0kubun marked this pull request as ready for review July 13, 2023 03:54
@matzbot matzbot requested a review from a team July 13, 2023 03:54
@maximecb
Contributor

So the good news is it looks like you managed to make the overhead negligible, but that comes at the expense of some accuracy. Are we sure that the loss in accuracy will always be small?

Looking at the PR, I think maybe things could be simplified a bit? It looks like there are a few functions exported on both sides. Ideally, we should try to do as much of the vm_insn_count accounting as possible on the C side.

I'm also wondering if you tried to make it so that the C interpreter loop just does an atomic increment on a vm_insn_count global variable? I assume you've tried this and it was slower?

@k0kubun
Member Author

k0kubun commented Jul 13, 2023

> I'm also wondering if you tried to make it so that the C interpreter loop just does an atomic increment on a vm_insn_count global variable? I assume you've tried this and it was slower?

I assumed it wouldn't be faster than incrementing a local variable, so I didn't try it. I tried it now, and here's the result.

Using lock addq does slow down the interpreter on railsbench by 2%. However, using addq without locking does not slow down the interpreter on railsbench. Given that it could be inaccurate only when you run multiple Ractors, I think it's a better type of inaccuracy for us? I'll update the PR like that after lunch.
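To make the tradeoff concrete, here is a minimal sketch of why an unlocked increment can undercount when two writers overlap. It replays one unlucky interleaving by hand on a single thread, so it is deterministic; the helper names are hypothetical and this is not code from the PR. The second function uses a C11 atomic read-modify-write, which on x86-64 compilers typically turn into a lock-prefixed instruction.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Plain counter: an increment is really a load, an add, and a store. */
typedef struct { uint64_t value; } plain_counter;

static uint64_t plain_load(const plain_counter *c) { return c->value; }
static void plain_store(plain_counter *c, uint64_t v) { c->value = v; }

/* Replays, step by step, two writers whose increments overlap.
   Both read the old value before either one stores. */
static uint64_t simulate_lost_update(void) {
    plain_counter c = { 0 };
    uint64_t a = plain_load(&c);  /* writer A reads 0 */
    uint64_t b = plain_load(&c);  /* writer B reads 0 before A has stored */
    plain_store(&c, a + 1);       /* A writes 1 */
    plain_store(&c, b + 1);       /* B also writes 1, so A's increment is lost */
    return plain_load(&c);
}

/* Atomic read-modify-write: there is no window between the load and the store,
   so both increments are always counted. */
static uint64_t simulate_atomic_update(void) {
    atomic_uint_fast64_t c = 0;
    atomic_fetch_add_explicit(&c, 1, memory_order_relaxed);  /* writer A */
    atomic_fetch_add_explicit(&c, 1, memory_order_relaxed);  /* writer B */
    return (uint64_t)atomic_load_explicit(&c, memory_order_relaxed);
}
```

With a single writer the overlapping interleaving cannot occur, which matches the observation above that the unlocked counter could be inaccurate only when multiple Ractors run.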

@maximecb
Contributor

> However, using addq without locking does not slow down the interpreter on railsbench. Given that it could be inaccurate only when you run multiple Ractors, I think it's a better type of inaccuracy for us? I'll update the PR like that after lunch.

Sounds like it could be a pretty good tradeoff 👍 🙂

@k0kubun k0kubun force-pushed the yjit-reg-counter branch from 9771422 to 6148884 Compare July 13, 2023 20:22
@k0kubun
Member Author

k0kubun commented Jul 13, 2023

I simplified the implementation in 6cf33c8 and updated the benchmark results in the PR description. The current version seems much nicer 🙂

@k0kubun k0kubun force-pushed the yjit-reg-counter branch from 8d1220e to 6cf33c8 Compare July 13, 2023 20:47
Contributor

@maximecb maximecb left a comment

Nice! It's awesome that we can now have all the stats including percentage in YJIT on all builds! 😎

@maximecb maximecb merged commit d814722 into ruby:master Jul 13, 2023
@k0kubun k0kubun deleted the yjit-reg-counter branch July 13, 2023 22:29