YJIT: Optimize local variables when EP == BP #10487

Merged (1 commit) on Apr 17, 2024

@k0kubun k0kubun commented Apr 8, 2024

This PR optimizes local variable access in getlocal and setlocal from REG_CFP->ep[offset] (2 instructions) to REG_SP[offset] (1 instruction) when EP == BP. When EP escapes, the block gets invalidated.

Because local variable access is now based off of SP, this will allow us to handle local variables using Opnd::Stack and allocate registers for them in a future PR.
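The invalidation rule can be pictured with a small model. This is only a sketch of the idea, not YJIT's implementation: `Block`, `NoEpEscapeInvariant`, `assume` and `on_ep_escape` are hypothetical names. A block compiled under the assumption that EP has not escaped is recorded against its ISEQ, and every recorded block is invalidated if that ISEQ's EP later escapes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False keeps identity hashing, so blocks can be stored in a set
class Block:
    """Hypothetical stand-in for a compiled JIT block."""
    name: str
    valid: bool = True


class NoEpEscapeInvariant:
    """Hypothetical registry of blocks that assume 'EP has not escaped'."""

    def __init__(self):
        self._blocks_by_iseq = defaultdict(set)
        self._escaped = set()

    def assume(self, iseq, block):
        """Return True if `block` may use the SP-relative fast path."""
        if iseq in self._escaped:
            return False  # caller should emit the EP-relative load instead
        self._blocks_by_iseq[iseq].add(block)
        return True

    def on_ep_escape(self, iseq):
        """Mark `iseq` as escaped and invalidate every block that relied on it."""
        self._escaped.add(iseq)
        for block in self._blocks_by_iseq.pop(iseq, set()):
            block.valid = False


invariants = NoEpEscapeInvariant()
fast_block = Block("foo@0000")
used_fast_path = invariants.assume("foo", fast_block)
invariants.on_ep_escape("foo")
```

After `on_ep_escape("foo")`, `fast_block.valid` is false and later calls to `assume("foo", ...)` return false. The per-block bookkeeping in a registry like this is the kind of storage that makes invariants cost some memory.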

Generated code

Before

  # Insn: 0000 getlocal_WC_0 (stack_size: 0)
  0x59e90a49a058: mov rax, qword ptr [r13 + 0x20]
  # reg_temps: 00000000 -> 00000001
  0x59e90a49a05c: mov rsi, qword ptr [rax - 0x18]

After

  # Insn: 0000 getlocal_WC_0 (stack_size: 0)
  # reg_temps: 00000000 -> 00000001
  0x6271819a1058: mov rsi, qword ptr [rbx - 0x20]
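
Both listings appear to read the same stack slot. As a rough check of the arithmetic, the sketch below assumes 8-byte slots and that, at stack_size 0, SP sits exactly one slot above EP. That relationship is inferred from the offsets shown rather than stated in the listings, and the function names are illustrative only.

```python
SLOT = 8  # bytes per stack slot on x86_64 (assumed)


def ep_disp(local_idx):
    """Byte displacement from EP for a local `local_idx` slots below EP."""
    return -local_idx * SLOT


def sp_disp(local_idx):
    """Byte displacement from SP, assuming SP == EP + 1 slot at stack_size 0."""
    return -(local_idx + 1) * SLOT


before = ep_disp(3)  # [rax - 0x18], after rax was loaded from [r13 + 0x20]
after = sp_disp(3)   # [rbx - 0x20], a single load with no intermediate register
```

Under these assumptions `before` is `-0x18` and `after` is `-0x20`, matching the displacements in the two listings.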

Benchmark

This seems to improve the performance of the following benchmarks.

before: ruby 3.4.0dev (2024-04-17T02:45:52Z master 0b630d6441) +YJIT [x86_64-linux]
after: ruby 3.4.0dev (2024-04-17T05:46:52Z yjit-ep-bp 3b316da2e3) +YJIT [x86_64-linux]

--------------  -----------  ----------  ----------  ----------  -------------  ------------
bench           before (ms)  stddev (%)  after (ms)  stddev (%)  after 1st itr  before/after
erubi-rails     1219.7       0.1         1160.5      0.1         1.04           1.05
liquid-compile  74.1         0.5         73.6        0.8         1.00           1.01
liquid-render   94.8         0.3         93.0        0.6         1.01           1.02
psych-load      1785.5       0.1         1743.8      0.1         1.02           1.02
protoboeuf      59.0         0.6         58.2        0.8         1.01           1.01
--------------  -----------  ----------  ----------  ----------  -------------  ------------
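
The before/after column is the ratio of the two timings, so values above 1.00 mean the new build is faster. A minimal sketch that recomputes it from the millisecond columns above (the variable and function names are mine):

```python
# (before_ms, after_ms) pairs copied from the table above
results_ms = {
    "erubi-rails": (1219.7, 1160.5),
    "liquid-compile": (74.1, 73.6),
    "liquid-render": (94.8, 93.0),
    "psych-load": (1785.5, 1743.8),
    "protoboeuf": (59.0, 58.2),
}


def speedup(before_ms, after_ms):
    """Ratio of old time to new time; > 1.0 means the new build is faster."""
    return before_ms / after_ms


ratios = {name: round(speedup(b, a), 2) for name, (b, a) in results_ms.items()}
```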

@k0kubun k0kubun force-pushed the yjit-ep-bp branch 4 times, most recently from 913e817 to fd54744 Compare April 17, 2024 05:40
@k0kubun k0kubun marked this pull request as ready for review April 17, 2024 06:46
@matzbot matzbot requested a review from a team April 17, 2024 06:47

@maximecb maximecb left a comment

Hi @k0kubun ! This looks good. Nice work.

For completeness, could you measure memory usage on lobsters? Just asking because of the extra invariants we need to take care of; it might be a useful data point.

k0kubun commented Apr 17, 2024

For completeness, could you measure memory usage on lobsters?

Sure.

With 25 iterations on lobsters, inline_code_size is reduced by 0.7%, and yjit_alloc_size is increased by 2.4%.

Before

inline_code_size:          7,612,421
outlined_code_size:        6,322,015
code_region_size:         15,269,888
code_region_overhead:      1,335,452 ( 8.7%)
yjit_alloc_size:          28,784,713

After

inline_code_size:          7,559,850
outlined_code_size:        6,229,344
code_region_size:         15,462,400
code_region_overhead:      1,673,206 (10.8%)
yjit_alloc_size:          29,475,775
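
The quoted percentages follow directly from the two stat dumps. A small sketch of the calculation, using the values as printed above (the dictionary and function names are mine):

```python
before = {
    "inline_code_size": 7_612_421,
    "outlined_code_size": 6_322_015,
    "code_region_size": 15_269_888,
    "yjit_alloc_size": 28_784_713,
}
after = {
    "inline_code_size": 7_559_850,
    "outlined_code_size": 6_229_344,
    "code_region_size": 15_462_400,
    "yjit_alloc_size": 29_475_775,
}


def pct_change(old, new):
    """Relative change from old to new, in percent (negative means smaller)."""
    return (new - old) / old * 100


changes = {key: round(pct_change(before[key], after[key]), 1) for key in before}
```

`changes["inline_code_size"]` comes out at -0.7 and `changes["yjit_alloc_size"]` at 2.4, in line with the figures quoted in the comment.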

@maximecb maximecb left a comment

We may want to think about how to optimize memory usage for storing invariants in the future, but this PR seems like a clear step in the right direction 👍

@maximecb maximecb merged commit 4cc58ea into ruby:master Apr 17, 2024
101 checks passed
@k0kubun k0kubun deleted the yjit-ep-bp branch April 18, 2024 00:52

nobu commented Apr 19, 2024

This seems to be causing the recent hangs in make (check, --enable-yjit=dev, --yjit-call-threshold=1 --yjit-verify-ctx --yjit-code-gc).
