Reserve 2 bits for expressing object layout by tenderlove · Pull Request #17139 · ruby/ruby

tenderlove · 2026-05-28T22:47:47Z

We would like to make instance variable reads in the JIT compiler faster (as well as simplify the JIT implementation). Currently, in order to read an instance variable, we have to:

Test for heap object
Load object flags to a 64 bit register
Mask the object header
Bit test against the masked header
JNE
Load field

We would like to:

Test for heap object
Load object shape to a 32 bit register
Bit test against the shape
JNE
Load field

The way we fetch instance variables is not consistent across objects. In order to realize our goal, we need to encode object layout inside the shape. If we encode object layout inside the shape, then the shape itself will guarantee that the access pattern generated by the JIT compiler is correct.

We should encode the following load patterns into the shape tag bits. This way we can share shapes on transitions, but be able to differentiate the access patterns for the JIT compiler. In other words, two objects can have an @a -> @b -> @c transition and share the same shape, but the tag bits can differentiate the access pattern so that the JIT compiler can be confident that the machine code is correct.

Here are the patterns:

Embedded/Extended T_OBJECT Instance Variables

Objects with direct references to instance variables or via malloc buffer

Objects with fields_objects fields

These are Data and TypedData objects. They have an associated axillary imemo/fields object that stores the instance variables. The access pattern is object[2] + 2. The fields object is the 3rd field, and the instance variables start at +2 inside the fields object. The fields object itself is a Ruby object, so it contains the usual header bits + class headers.

Non Boxable Classes / Modules

This is similar to Objects with fields_objects, but the fields object is stored at a different offset. We’re differentiating this from boxable classes and modules because those are harder to support.

Other

"Other" pattern is for objects that are rare, or have difficult-to-implement access patterns. This includes:

Boxable classes and modules
Structs (for now)
Objects that use the geniv table

Proposed shape bit layout:

  Current shape_id_t is 32 bits:
  31        28 27 26 25 24 23 22        19 18                         0
  +-----------+--+--+--+--+--+------------+----------------------------+
  | unused    |L1|L0|OI|FR|CX| heap index | shape tree offset          |
  +-----------+--+--+--+--+--+------------+----------------------------+
               |  |  |  |  |  |            |
               |  |  |  |  |  |            +-- bits 0-18: SHAPE_ID_OFFSET_MASK
               |  |  |  |  |  +--------------- bits 19-22: SHAPE_ID_HEAP_INDEX_MASK
               |  |  |  |  +------------------ bit 23: SHAPE_ID_FL_COMPLEX
               |  |  |  +--------------------- bit 24: SHAPE_ID_FL_FROZEN
               |  |  +------------------------ bit 25: SHAPE_ID_FL_HAS_OBJECT_ID
               +--+--------------------------- bits 26-27: SHAPE_ID_LAYOUT_MASK

The important part about these layout patterns is that they do not reflect the type of object, only how the object is laid out in memory. For example, we currently treat structs as "other", but we can refactor them to have the same layout as "Objects with fields_objects", and when we do that they should get a different bit in the shape header.

This commit only reserves the two bits, it doesn't use them in the JIT compiler yet.

XrXr · 2026-05-28T23:18:32Z

What's a non-boxable class? I thought every class can end up inside a box

tenderlove · 2026-05-29T00:03:21Z

What's a non-boxable class? I thought every class can end up inside a box

Ya, every class can end up inside a box, but only "box able" classes can be "seen" differently between boxes. For example the String class can have different ivars depending on the box that it's accessed from, but a user class that's created and passed between boxes will only ever have one set of IVs.

> RUBY_BOX=1 ./miniruby -e'p [Ruby::Box.new.eval("String.instance_variable_set(:@a, 1)"), String.instance_variable_get(:@a)]'
./miniruby: warning: Ruby::Box is experimental, and the behavior may change in the future!
See https://docs.ruby-lang.org/en/master/Ruby/Box.html for known issues, etc.
[1, nil]

The flag is here

The "boxable" flag depends on the boolean passed here, and BOX_MASTER_P(rb_current_box()) is only true during Ruby's boot process. User classes will all be defined as "not boxable". It's kind of a weird name for a flag but I don't know a better name off the top of my head.

Anyway if you inspect the current box after boot, you'll see it's a user box:

RUBY_BOX=1 ./miniruby -e'p Ruby::Box.current'
./miniruby: warning: Ruby::Box is experimental, and the behavior may change in the future!
See https://docs.ruby-lang.org/en/master/Ruby/Box.html for known issues, etc.
#<Ruby::Box:3,user,main>

So BOX_MASTER_P is false, and user classes will be "not boxable".

tenderlove · 2026-05-29T00:08:39Z

If we make the other case fall back to just a function call, then I think we could also eliminate the "main box" guards in the IV read path:

> RUBY_BOX=1 ./miniruby -e'p RubyVM::Shape.of(String).layout'
./miniruby: warning: Ruby::Box is experimental, and the behavior may change in the future!
See https://docs.ruby-lang.org/en/master/Ruby/Box.html for known issues, etc.
:other
> ./miniruby -e'p RubyVM::Shape.of(String).layout'
:rclass

tekknolagi · 2026-05-29T01:04:47Z

They are patchpoints and therefore don't cost much, if anything

byroot · 2026-05-29T05:36:45Z

I'll have a lot of nitpicks, but this is very much where I was going with the recent shape refactors and the paused work on #16843, so 👍 on the idea itself.

byroot

Lots of little nitpicks, but overall big 👍 on this, it's essentially what I was trying to do, and I'm happy to base myself of these changes.

byroot · 2026-05-29T05:37:24Z

    struct RArray fake_ary = {0};
    fake_ary.basic.flags = T_ARRAY;
    VALUE ary = (VALUE)&fake_ary;
+    RBASIC_SET_SHAPE_ID(ary, ROOT_SHAPE_ID | SHAPE_ID_LAYOUT_OTHER);


You would probably save yourself lots of trouble if SHAPE_ID_LAYOUT_OTHER was the default (aka 0).

Maybe. We could change it. TBH I think we should be explicitly setting shapes on objects at allocation time. This fake array worked because ROOT_SHAPE_ID happens to be 0. IOW I think this allocation point should have been setting a shape in the first place and I'm glad we caught it.

TBH I think we should be explicitly setting shapes on objects at allocation time.

Agreed. I started moving in that direction a while ago, e.g. introduction of rb_newobj_of_with_shape etc.

byroot · 2026-05-29T05:42:34Z

 }

+static inline shape_id_t
+rb_shape_id_with_robject_layout(shape_id_t shape_id)


Suggested change

rb_shape_id_with_robject_layout(shape_id_t shape_id)

rb_shape_transition_robject_layout(shape_id_t shape_id)

Not the end of the world, but the naming convention for function that transition to another shape (even if just flipping bits) is rb_shape_transition_xxx

It's not used as a transition though. The semantics are the same but the intent is a little different

byroot · 2026-05-29T05:45:30Z


+    // Means IVs are found at an offset from the object's addr, or in a
+    // malloc allocated side table
+    SHAPE_ID_LAYOUT_ROBJECT = 0,


So I know it's much harder (since I tried it: #16843), but I think ideally you do want two distinct layouts for ROBJECT. One for embedded, one for "extended".

Can be a followup though.

I think ideally you do want two distinct layouts for ROBJECT. One for embedded, one for "extended".

Why? Since the size class is embedded in the shape, the JIT compiler can know whether the object is embedded or extended based on the shape alone. Not against it, just wondering the reasons.

Another thing we were discussing is that when an object goes extended maybe we use an imemo/fields object and then the object becomes "rdata" layout.

Since the size class is embedded in the shape

Right, I didn't properly explain.

So first right now this is a bit annoying because it means that when removing an ivar, you have to know if you are dealing with an "extended RObject" in which case you may have to re-embed it as to not confuse the JITs.

That is why I do want to properly materialize IS_ROBJECT_EMBED as a bit in the shape_id (#17145) and not solely rely on the heap_id + shape->depth to tell us if we're embedded or not.

The more secondary reason I would like this is that I would want all types to have the heap_id in their shape (right now only ROBJECT does) so that you could more efficiently query the object slot size. Right now rb_gc_obj_slot_size() is used a bit across array.c/string.c but it's quite slow for what is it.

But to be clear I'm not asking you do implement any of this, I'm just sharing my own goals so that we hopefully don't step on each others toes too much (that being said most of this work has been paused for a couple weeks now, and unclear why I'll have the time+energy to resume it, so priority to you).

byroot · 2026-05-29T05:54:50Z

+    SHAPE_ID_LAYOUT_RCLASS = RBIMPL_SHAPE_ID_FL(3),
+
+    // Means this object is an RData or RTypedData and IVs are found in the
+    // fields_obj found on the RData/RTypedData struct
+    SHAPE_ID_LAYOUT_RDATA = RBIMPL_SHAPE_ID_FL(4),


So my plan was to move RClass.fields_obj at the same offset than RTypedData.fields_obj and RObject.as.extended (which right now is a raw buffer, but that I would like to make an IMEMO/fields #17145).

If you do this, SHAPE_ID_LAYOUT_RCLASS and SHAPE_ID_LAYOUT_RDATA can become the same (unless you need to discriminate classes for other reason? Ractors?).

This way you don't really care what the type is, it's really just at which offset you find the fields object.

So my plan was to move RClass.fields_obj at the same offset than RTypedData.fields_obj and RObject.as.extended (which right now is a raw buffer, but that I would like to make an IMEMO/fields #17145).

🤦 yes, I should have read all comments before responding.

So my plan was to move RClass.fields_obj

Were you only going to do this for non-boxable classes?

If you do this, SHAPE_ID_LAYOUT_RCLASS and SHAPE_ID_LAYOUT_RDATA can become the same (unless you need to discriminate classes for other reason? Ractors?).

This way you don't really care what the type is, it's really just at which offset you find the fields object.

I don't think there is any reason to discriminate. The only reason we have this layout type is to know that a class isn't boxable, and where its fields pointer is via the shape. If we can make it have the same layout as RDATA that would be great.

Yep, boxable classes can't have that "layout", but other than that, you can share the same flag for:

non-boxables classes

RTypedData

"extended" RObject

As they're all generate the exact same native code (search for IMEMO/fields reference at obj[2]).

byroot · 2026-05-29T06:01:32Z

        VALUE type = RB_BUILTIN_TYPE(obj);
        size_t slot_size = rb_gc_obj_slot_size(obj);
-        VALUE moved = rb_newobj(GET_EC(), 0, type, 0, wb_protected_types[type], slot_size);
+        VALUE moved = rb_newobj(GET_EC(), 0, type, RBASIC_SHAPE_ID(obj), wb_protected_types[type], slot_size);


Is it OK to copy the whole shape on an otherwise empty object? I'm thinking that if GC triggers before that object is initialized it might cause some bug.

You may want to only copy the "structural bits", so LAYOUT and HEAP parts.

byroot · 2026-05-29T06:09:57Z

    // In most case we can replicate the single `fields_obj` shape
    // but in namespaced case? Perhaps INVALID_SHAPE_ID?
-    RBASIC_SET_SHAPE_ID(obj, RBASIC_SHAPE_ID(new_fields_obj));
+    RBASIC_SET_SHAPE_ID(obj, rb_shape_id_layout(RBASIC_SHAPE_ID(obj)) | RBASIC_SHAPE_ID(new_fields_obj));


So on my experimental branch, I struggle quite a bit with keeping the "owner" and "field obj" shapes in sync, as it resulted in lot of brittle code like this.

What I was thinking was to introduce a split between the bits that represent the object type (slot size, location of fields, etc) and the parts that are shared with the fields obj like the offset frozen bit etc etc.

Note this isn't a blocker, if you PR is green I'm happy to refactor this in a followup, it's much easier to simplify working code.

@b

We would like to make instance variable reads in the JIT compiler faster (as well as simplify the JIT implementation). Currently, in order to read an instance variable, we have to: 1. Test for heap object 2. Load object to a 64 bit register 3. Mask the object header 4. Bit test against the masked header 5. JNE 6. Load field We would like to: 1. Test for heap object 2. Load object shape to a 32 bit register 3. Bit test against the shape 4. JNE 5. Load field The way we fetch instance variables is not consistent across objects. In order to realize our goal, we need to encode object layout inside the shape. If we encode object layout inside the shape, then the shape itself will guarantee that the access pattern generated by the JIT compiler is correct. We should encode the following load patterns into the shape tag bits. This way we can share shapes on transitions, but be able to differentiate the access patterns for the JIT compiler. In other words, two objects can have an `@a -> @b -> @c` transition and share the same shape, but the tag bits can differentiate the access pattern so that the JIT compiler can be confident that the machine code is correct. Here are the patterns: 1. Embedded/Extended T_OBJECT Instance Variables Objects with direct references to instance variables or via malloc buffer 2. Objects with fields_objects fields These are Data and TypedData objects. They have an associated axillary imemo/fields object that stores the instance variables. The access pattern is `object[2] + 2`. The fields object is the 3rd field, and the instance variables start at +2 inside the fields object. The fields object itself is a Ruby object, so it contains the usual header bits + class headers. 3. Non Boxable Classes / Modules This is similar to Objects with fields_objects, but the fields object is stored at a different offset. We’re differentiating this from boxable classes and modules because those are harder to support. 4. Other "Other" pattern is for objects that are rare, or have difficult-to-implement access patterns. This includes: * Boxable classes and modules * Structs (for now) * Objects that use the geniv table Proposed shape bit layout: ``` Current shape_id_t is 32 bits: 31 28 27 26 25 24 23 22 19 18 0 +-----------+--+--+--+--+--+------------+----------------------------+ | unused |L1|L0|OI|FR|CX| heap index | shape tree offset | +-----------+--+--+--+--+--+------------+----------------------------+ | | | | | | | | | | | | | +-- bits 0-18: SHAPE_ID_OFFSET_MASK | | | | | +--------------- bits 19-22: SHAPE_ID_HEAP_INDEX_MASK | | | | +------------------ bit 23: SHAPE_ID_FL_COMPLEX | | | +--------------------- bit 24: SHAPE_ID_FL_FROZEN | | +------------------------ bit 25: SHAPE_ID_FL_HAS_OBJECT_ID +--+--------------------------- bits 26-27: SHAPE_ID_LAYOUT_MASK ``` The important part about these layout patterns is that they do not reflect the _type_ of object, only how the object is laid out in memory. For example, we currently treat structs as "other", but we can refactor them to have the same layout as "Objects with fields_objects", and when we do that they should get a different bit in the shape header. This commit only reserves the two bits, it doesn't use them in the JIT compiler yet. Co-Authored-By: John Hawthorn <john@hawthorn.email> Co-Authored-By: Max Bernstein <tekknolagi@gmail.com>

Co-authored-by: Nobuyoshi Nakada <nobu.nakada@gmail.com>

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>

This reverts commit 900711d.

XrXr

Before implementing this in the JITs, you should be able to use this in the to simplify some class of cached ivar reading in the interpreter right? That's a good way to verify the shapes infra is working properly.

tenderlove · 2026-05-29T19:59:21Z

you should be able to use this in the to simplify some class of cached ivar reading in the interpreter right?

I don't know that this change would simplify any of the existing code. These bits teach us where to find the IV buffer and the code for locating the IV buffer is not part of the current inline cache hit code.

If we refactored the IV caches to do CC fast paths like methods, then we could eliminate this branch but that seems like a much bigger change than this PR.

tenderlove · 2026-05-29T22:09:23Z

@XrXr I made a commit that dogfoods these bits. I don't think it's any simpler at the moment, so I'd like to follow up with a PR after.

This reverts commit 63d9f09.

This reverts commit ddb5055.

tenderlove requested a review from byroot May 28, 2026 22:47

tenderworks force-pushed the messing-around branch 2 times, most recently from ae3bfd0 to c2c8d8c Compare May 28, 2026 22:54

tekknolagi mentioned this pull request May 28, 2026

ZJIT: Clean up specialized T_DATA path #17140

Merged

tenderworks force-pushed the messing-around branch 2 times, most recently from 005c652 to d584473 Compare May 29, 2026 00:35

nobu reviewed May 29, 2026

View reviewed changes

Comment thread gc.c

byroot mentioned this pull request May 29, 2026

Robject: use IMEMO/fields as buffer when transitioning to extended #17145

Draft

byroot reviewed May 29, 2026

View reviewed changes

tekknolagi reviewed May 29, 2026

View reviewed changes

Comment thread shape.c Outdated

tenderlove and others added 8 commits May 29, 2026 10:05

Update gc.c

80e045f

Co-authored-by: Nobuyoshi Nakada <nobu.nakada@gmail.com>

Update shape.h

12b61f2

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>

fix function name

6ef7f20

Update shape.c

aa0c41a

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>

fix function name

95846f5

Revert "Update shape.c"

5f24217

This reverts commit 900711d.

add comment

7df5509

tenderworks force-pushed the messing-around branch from 3bb14ff to 7df5509 Compare May 29, 2026 17:05

XrXr reviewed May 29, 2026

View reviewed changes

tenderlove merged commit 63d9f09 into ruby:master May 29, 2026
119 checks passed

tenderworks deleted the messing-around branch May 29, 2026 22:26

tekknolagi mentioned this pull request May 29, 2026

ZJIT: Use reserved shape layout bits to remove IntAnd and flags checks in ivar lookup Shopify/ruby#989

Open

matzbot pushed a commit that referenced this pull request May 29, 2026

Revert "Reserve 2 bits for expressing object layout (#17139)"

ddb5055

This reverts commit 63d9f09.

tenderlove mentioned this pull request May 29, 2026

Reapply "Reserve 2 bits for expressing object layout (#17139)" #17158

Open

tenderlove added a commit to tenderlove/ruby that referenced this pull request May 30, 2026

Reapply "Reserve 2 bits for expressing object layout (ruby#17139)"

7a4d91c

This reverts commit ddb5055.

	rb_shape_id_with_robject_layout(shape_id_t shape_id)
	rb_shape_transition_robject_layout(shape_id_t shape_id)

Conversation

tenderlove commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

XrXr commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tenderlove commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tenderlove commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tekknolagi commented May 29, 2026

Uh oh!

Uh oh!

byroot commented May 29, 2026

Uh oh!

byroot left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tenderlove May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

XrXr left a comment

Choose a reason for hiding this comment

Uh oh!

tenderlove commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tenderlove commented May 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

tenderlove commented May 28, 2026 •

edited

Loading

XrXr commented May 28, 2026 •

edited

Loading

tenderlove commented May 29, 2026 •

edited

Loading

tenderlove commented May 29, 2026 •

edited

Loading

tenderlove May 29, 2026 •

edited

Loading

tenderlove commented May 29, 2026 •

edited

Loading