Closed
Conversation
3c68653 to
5322c41
Compare
Change slot sizes from {40,80,160,320,640} to {64,128,256,512,1024}.
BASE_SLOT_SIZE is now 64 (2^6) and all pool sizes are powers of 2
This enables bit-shift slot indexing instead of magic number division.
RFLOAT mostly
because BASE_SLOT_SIZE is now 32 bytes, it's no longer suitable for use in tests that use it to assume the size of most RVALUE objects, like strings
When RVALUE_OVERHEAD is large (debug builds with RACTOR_CHECK_MODE + GC_DEBUG), the smallest size pool's usable size can be less than sizeof(struct RBasic). The capacity calculation underflows: (8 - 16) / 8 → 0xFFFF (via size_t wraparound, truncated to uint16_t) Since shape_grow_capa iterates capacities from index 0, the garbage 65535 at capacities[0] poisons all ivar capacity growth, causing a buffer overflow in the RUBY_DEBUG assertion that fills unused capacity with Qundef.
When RVALUE_OVERHEAD > 0 (GC_DEBUG, RACTOR_CHECK_MODE), heap[0]'s usable space can equal sizeof(struct RBasic), leaving zero bytes for instance variables. The capacity was incorrectly set to 1, allowing the shape system to embed an IV that overflows into the overhead area. Change the fallback capacity to 0 and switch shape_grow_capa to count-based iteration so that a zero capacity is not confused with the array sentinel.
rb_obj_embedded_size(0) returned sizeof(struct RBasic), which is too small on builds with RVALUE_OVERHEAD (GC_DEBUG) where heap[0] has no usable space beyond RBasic. The as.heap variant needs at least one VALUE of space for the external IV pointer. Clamp the minimum fields_count to 1 so T_OBJECT allocations always request enough space for the as union.
This assumes that the typical allocation pattern is going to build way more smaller objects than large (>512 byte) objects for most cases.
I don't know if this is the correct Fix!
Add the new 32-byte heap 0 to the stat_heap example, shift all heap indices by 1, and update the keys example to show [0, 1, 2, 3, 4, 5].
The assertion used 4 * RVALUE_SIZE (160) which was correct for the old heaps[2] = 160B, but no longer matches any pool boundary. Use the literal heaps[3] size (256B) since that is the target pool for class objects on 64-bit.
BASE_SLOT_SIZE is now 32 on 64-bit (16 on 32-bit), matching the smallest slot in heap_sizes[]. RVALUE_SIZE stays at 64/32 as the minimum slot for a standard RVALUE. Update stale comment about pool count from 5 to 6.
BASE_SLOT_SIZE is now 32 bytes, so this exclusion (which was explicitly marked for removal when that happened) can be dropped.
5322c41 to
34acfad
Compare
Slot sizes are not necessarily powers of BASE_SLOT_SIZE (e.g. the upcoming 48-byte pool). Objects at odd multiples of non-power-of-two slot sizes are VALUE-aligned but not BASE_SLOT_SIZE-aligned. Use sizeof(VALUE) as the minimum alignment for quick-reject in is_pointer_to_heap and related debug assertions.
Replace the formula-based slot size computation (BASE_SLOT_SIZE << i) with an explicit lookup table that includes a 48-byte pool at index 1. Pool layout: 32, 48, 64, 128, 256, 512, 1024 bytes (7 pools). The 48-byte pool targets the dominant 40-byte RVALUE allocation, reducing internal fragmentation from 37.5% to 16.7%. Init slots are now table-driven: pool 1 (48B) gets the full budget as the busiest pool; pool 0 (32B) gets fewer since little fits there.
The RVALUE pool is now 48 bytes (capacity 4 embedded IVs), down from 64 bytes (capacity 6). Adjust the test to use 5 ivars to trigger external storage, then remove 2 to test re-embedding with 3.
heap_idx_for_size used ceil(log2(size)) to map allocation sizes to pool indices, which assumed every pool was a power-of-two multiple of BASE_SLOT_SIZE. With the 48-byte pool at index 1, objects needing 49-64 bytes were placed in 48-byte slots, causing heap buffer overflows (e.g. embedded TypedData with embed_size=56). Handle the first two pools with direct comparisons, then use the log2 formula with +1 offset for the remaining power-of-two pools. slot_index_for_offset used offset >> slot_size_log2 for bitmap index calculation. Replace with precomputed reciprocal multiplication (offset * reciprocal >> 48) which handles arbitrary slot sizes in two instructions without branching. heap_add_page aligned page start with & ~(slot_size - 1), a bitmask trick that only works for powers of two. Replace with modulo-based round-up on this cold path. Reorder heap_page fields to keep freelist, start, and slot_size_reciprocal in the first cache line for allocation and SLOT_INDEX hot paths. Fix RVALUE_SIZE computation and tests that assumed power-of-two pool progression.
a676ebd to
9cf3de6
Compare
Align MMTk's GC::INTERNAL_CONSTANTS key with the default GC. Update all test references from SIZE_POOL_COUNT to HEAP_COUNT.
9cf3de6 to
7cb2c66
Compare
In debug builds RVALUE_OVERHEAD is 16 bytes (file + line tracking), pushing the minimum RVALUE size to 56 bytes which exceeds the 48-byte pool. Use size_allocatable_p instead of checking against a specific pool index so the assertion holds regardless of overhead size.
The 48B pool has capacity for 4 embedded ivars (down from 6 in the old 64B RVALUE pool). Setting @d (the 4th ivar on main) now triggers a shape extension, so ZJIT falls back to SetIvar instead of specializing the store.
The 24-byte non-power-of-two pool on 32-bit triggers a heap corruption
in test_marshal on i686. Revert to the original {16, 32, 64, 128, 256,
512} layout on 32-bit until the root cause is identified. The 48-byte
pool on 64-bit is unaffected.
c28ad6d to
98223e5
Compare
MMTk did not account for RVALUE_OVERHEAD. rb_gc_impl_obj_slot_size returned the full slot size, but the GC interface contract requires the usable size (full minus overhead). RACTOR_BELONGING_ID writes at obj + obj_slot_size expecting to land in the overhead region — without the subtraction, the write corrupted adjacent objects. This is a pre-existing bug exposed by allocation pattern changes. Add RVALUE_OVERHEAD awareness: define the overhead struct matching the default GC, subtract it in obj_slot_size, account for it in allocation sizing and heap_sizes, and call rb_ractor_setup_belonging after allocation to initialize the ractor ID.
eightbitraptor
pushed a commit
that referenced
this pull request
Apr 1, 2026
Move compilation steps from the heaviest jobs to the lightest to reduce the critical path of the Compilations workflow. Before: jobs ranged from 13-41 min (compile#12 had 4 steps, compile#3 had 10 clang versions). After: jobs range from 7-9 steps each (excluding compile#1 which has the LTO build), bringing the estimated critical path from ~41 min to ~30 min. Moves: - clang 23, 22, 21 from #3 to ruby#12 and ruby#10 - GCC 8, 7 from #2 to ruby#12 - `OPT_THREADED_CODE=1`, `OPT_THREADED_CODE=2` from ruby#7 to ruby#10
eightbitraptor
pushed a commit
that referenced
this pull request
Apr 2, 2026
pm_parse_process initializes the index_lookup_table but nothing seems to
use it after it has been allocated. However, pm_compile_scope_node will
overwrite the index_lookup_table and cause it to leak memory. This can
be seen during bootup with the following memory leaks reported by ASAN:
#0 0x60dba31b7af3 in malloc
#1 0x60dba32e0718 in rb_gc_impl_malloc gc/default/default.c:8287:5
#2 0x60dba32c7aa7 in ruby_xmalloc_body gc.c:5373:12
#3 0x60dba32c4a54 in ruby_xmalloc gc.c:5355:34
ruby#4 0x60dba3260314 in pm_index_lookup_table_init_heap prism_compile.h:89:29
ruby#5 0x60dba3209388 in pm_parse_process prism_compile.c:11366:5
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This does
notcontain the 48 byte size pool, that's on another branch on top of this one.This is very WIP