YJIT: Reduce paddings if --yjit-exec-mem-size <= 128 on arm64 #7671

Merged
merged 2 commits into from Apr 11, 2023

@k0kubun k0kubun commented Apr 7, 2023

Despite our past effort in #6790, YJIT's arm64 code still seems to be heavily bloated by nop instructions. For invalidation, we still waste a lot of space on padding for non-side-exit jumps.

The b instruction can jump to a target within +/-128MiB of the PC. Using that property, if the virtual code region size is <= 128MiB, we should always be able to encode an unconditional jump to JIT code with a single b.
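To make the range argument concrete, here is a minimal sketch of the reachability check, assuming the standard A64 encoding in which b carries a signed 26-bit immediate counted in 4-byte instructions. The names b_can_reach and region_always_reachable are hypothetical and are not taken from the YJIT source.

```rust
/// One mebibyte in bytes.
const MIB: u64 = 1024 * 1024;
/// A64 `b` encodes a signed 26-bit immediate, counted in instructions.
const B_IMM_BITS: u32 = 26;
/// Every A64 instruction is 4 bytes long.
const INSN_BYTES: i128 = 4;

/// Hypothetical helper: can a `b` located at byte address `src`
/// reach byte address `dst`?
fn b_can_reach(src: u64, dst: u64) -> bool {
    let offset = dst as i128 - src as i128;
    // Branch targets must be instruction-aligned relative to the branch.
    if offset % INSN_BYTES != 0 {
        return false;
    }
    let insns = offset / INSN_BYTES;
    let limit = 1i128 << (B_IMM_BITS - 1); // 2^25 instructions = 128MiB
    -limit <= insns && insns < limit
}

/// Hypothetical helper: if the whole code region is at most 128MiB,
/// two instructions inside it are at most 128MiB - 4 bytes apart,
/// so any jump within the region fits in a single `b`.
fn region_always_reachable(region_bytes: u64) -> bool {
    region_bytes <= 128 * MIB
}
```

With these definitions, b_can_reach(0x1067e0078, 0x1067e217c) holds for the jump in the "After" listing, and region_always_reachable(64 * MIB) holds for the default --yjit-exec-mem-size.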

We've already learned that the default 64MiB is more than enough for SFR, and we're even running Shopify Core with 64MiB. This PR minimizes the number of nop instructions when --yjit-exec-mem-size is <= 128MiB, assuming most people would use the default 64MiB and almost never need more than 128MiB.

Example

Before

  # Insn: branchunless
  0x1060f008c: tst x1, #-5
  0x1060f0090: b.eq #0x1060f2158
  0x1060f0094: nop
  0x1060f0098: nop
  0x1060f009c: nop
  0x1060f00a0: nop
  0x1060f00a4: nop
  0x1060f00a8: b #0x1060f218c
  0x1060f00ac: nop
  0x1060f00b0: nop
  0x1060f00b4: nop
  0x1060f00b8: nop

After

  # Insn: branchunless
  0x1067e006c: tst x1, #-5
  0x1067e0070: b.eq #0x1067e2148
  0x1067e0074: nop
  0x1067e0078: b #0x1067e217c

Unconditional jumps will always fit in a single b instruction, so no nop instructions are generated for them. For conditional jumps, b.cond + b (2 instructions) is the worst case, so at most 1 nop instruction is generated.
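The padding counts in the two listings can be modeled as "reserved slots minus emitted instructions". The sketch below is only an illustration: the slot counts for the large-region case (5 and 6) are read off the "Before" listing, not taken from the YJIT source, and JumpKind, reserved_insns, and nop_padding are hypothetical names.

```rust
/// Kind of patchable jump that occupies a slot (hypothetical model).
#[derive(Clone, Copy, PartialEq, Debug)]
enum JumpKind {
    Unconditional,
    Conditional,
}

/// Number of instruction slots reserved so the jump can be re-patched later.
/// `small_region` means the code region is <= 128MiB.
fn reserved_insns(kind: JumpKind, small_region: bool) -> usize {
    match (kind, small_region) {
        // A single `b` always reaches its target.
        (JumpKind::Unconditional, true) => 1,
        // Worst case is `b.cond` followed by `b`.
        (JumpKind::Conditional, true) => 2,
        // Assumption from the "Before" listing: `b` + 4 nops.
        (JumpKind::Unconditional, false) => 5,
        // Assumption from the "Before" listing: `b.eq` + 5 nops.
        (JumpKind::Conditional, false) => 6,
    }
}

/// Nops needed after emitting `emitted` real instructions.
/// Returns None if the emitted code does not fit in the reserved slots.
fn nop_padding(kind: JumpKind, small_region: bool, emitted: usize) -> Option<usize> {
    reserved_insns(kind, small_region).checked_sub(emitted)
}
```

For the branchunless example, the b.eq is one emitted instruction in a conditional slot, which gives 5 nops before and 1 nop after; the following b gives 4 nops before and none after.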

Code size

On railsbench,

Before

inline_code_size:          4,318,804
outlined_code_size:        3,064,516
code_region_size:          7,667,712

After

inline_code_size:          3,628,992
outlined_code_size:        2,679,464
code_region_size:          6,979,584

inline_code_size is reduced by 16%.
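The reduction can be rechecked from the stats above. This is a small sketch with a hypothetical helper, percent_reduction, applied to the three before/after pairs reported for railsbench.

```rust
/// Hypothetical helper: percentage by which `after` is smaller than `before`.
fn percent_reduction(before: u64, after: u64) -> f64 {
    (before as f64 - after as f64) / before as f64 * 100.0
}

/// Reductions for the railsbench numbers quoted in this PR, in the order
/// inline_code_size, outlined_code_size, code_region_size.
fn railsbench_reductions() -> [f64; 3] {
    [
        percent_reduction(4_318_804, 3_628_992),
        percent_reduction(3_064_516, 2_679_464),
        percent_reduction(7_667_712, 6_979_584),
    ]
}
```

Applied to those numbers, this gives roughly 16.0% for inline_code_size, 12.6% for outlined_code_size, and 9.0% for code_region_size.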

Benchmark

before: ruby 3.3.0dev (2023-04-06T17:35:25Z master bffadcd6d4) +YJIT [arm64-darwin22]
after: ruby 3.3.0dev (2023-04-07T01:04:22Z yjit-32bit-code 99a4cf74a6) +YJIT [arm64-darwin22]

----------  -----------  ----------  ----------  ----------  ------------  -------------
bench       before (ms)  stddev (%)  after (ms)  stddev (%)  before/after  after 1st itr
railsbench  647.7        1.2         642.0       1.9         1.01          1.02
30k_ifelse  381.7        0.6         357.2       0.3         1.07          1.02
----------  -----------  ----------  ----------  ----------  ------------  -------------

@k0kubun k0kubun marked this pull request as ready for review April 7, 2023 01:32
@matzbot matzbot requested a review from a team April 7, 2023 01:32

@XrXr XrXr left a comment

Nice! Very elegant solution. Consider making jmp_ptr_bytes a method on CodeBlock, though, so that it can use VirtualMem to make the decision on ARM and be friendlier to tests.

Also I suppose in the future we might allocate a smaller address space than what the option specifies for alignment or other reasons, but I don't think that'd cause issues for this change.

@k0kubun k0kubun force-pushed the yjit-32bit-code branch 2 times, most recently from 8b5b6ea to 1784a3b Compare April 10, 2023 17:09

@maximecb maximecb left a comment

Impressive code size reduction! :)

@maximecb maximecb merged commit 7297374 into ruby:master Apr 11, 2023
97 checks passed
@maximecb maximecb deleted the yjit-32bit-code branch April 11, 2023 15:02