fix(codegen): two silent miscompiles on poly receivers #388
A setter call `obj.x = v` on a parameter whose static type is a
class union (one or more classes have an `attr_writer :x`, others
don't) silently no-ops: the dispatch loop in compile_call_expr's
SP_TAG_OBJ block recognises only explicit user methods and
attr_readers. Setters fall through with no cls_id arm, so
`if (.tag == SP_TAG_OBJ) {}` ends up empty and the store never
executes.
Add a third arm that recognises the attr_writer shape and emits a
direct `((sp_C *)recv.v.p)->iv_x = v` for the matching cls_id.
Also extend `poly_dispatch_return_type` to recognise setters so the
result temp's C type matches the ivar slot type (Ruby's `x = v`
returns v, not int) — without this, the new arm's `_t = v;`
mistypes when the slot is e.g. a string.
Repro lives in test/poly_attr_writer.rb. Surfaced while building
the tep Sinatra-clone; `res.body = out` was a no-op when `res`
widened to Request | Response (both having the ivar).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
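For readers without the test file at hand, a minimal sketch of the shape of the setter repro follows. The class names, method name and values are illustrative assumptions, not copied from test/poly_attr_writer.rb; the point is only that `obj` is passed two different classes, so the setter call has to go through poly dispatch, and that the assignment expression itself evaluates to `v`.

```ruby
# Hypothetical repro sketch; names are illustrative, not the real test.
class Request
  attr_accessor :body
  def initialize; @body = ""; end
end

class Response
  attr_accessor :body
  def initialize; @body = ""; end
end

# `obj` receives both classes, so its static type widens to a class
# union and `obj.body = v` is dispatched per class.
def set_body(obj, v)
  obj.body = v   # the assignment expression evaluates to v
end

req = Request.new
res = Response.new
r1 = set_body(req, "in")
r2 = set_body(res, "out")

puts req.body
puts res.body
puts r1
puts r2
```

Under CRuby both stores land and both calls return the assigned string; the compiled binary should behave the same once the setter arm is emitted.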
`box_value_to_poly` for `at == "nil"` returned the bare literal
`"sp_box_nil()"`, discarding the `val` argument. The poly dispatch
loop calls this with `val = "sp_<Subclass>_method(...)"` — the
actual method call. Discarding it means: when a subclass override's
static return type is nil (e.g. its body is `puts ...` or
`hash[k] = v` whose return is nil-typed), the cls_id arm emits a
bare `sp_box_nil()` and the override never executes.
The same applies to any base-class method with an empty body: its
inferred return type is also nil, so even the base call is dropped.
Emit the val for its side effect, then yield nil:
return "((void)(" + val + "), sp_box_nil())"
The `(void)` cast suppresses the (benign) "expression result unused"
warning when val is e.g. a function returning int.
Repro lives in test/poly_nil_return_dispatch.rb. Surfaced while
building the tep Sinatra-clone: Tep::Filter#before's empty base body
made every user-supplied filter silently no-op until we added a
placeholder return.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
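The before/after difference in the emitted C expression can be sketched as below. This is a simplified stand-in covering only the nil case; the helper names `box_nil_before` / `box_nil_after` and the sample call string are hypothetical, and the real `box_value_to_poly` handles other `at` values that this PR text does not show.

```ruby
# Simplified stand-in for the `at == "nil"` branch only (assumption:
# the real function takes more arguments and has more branches).
def box_nil_before(val)
  "sp_box_nil()"                          # val (the method call) is discarded
end

def box_nil_after(val)
  "((void)(" + val + "), sp_box_nil())"   # evaluate val, then yield nil
end

call = "sp_Sub_hook(self, arg)"           # hypothetical emitted call
puts box_nil_before(call)
puts box_nil_after(call)
```

The second form relies on C's comma operator: the left operand is evaluated for its side effect and the whole expression takes the value of `sp_box_nil()`.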
Code Review
This pull request introduces regression tests for polymorphic dispatch, specifically addressing issues where setter calls on class unions were ignored and subclass overrides returning nil were skipped. The tests verify that instance variable assignments occur and that method bodies are executed. The review feedback suggests strengthening these tests by explicitly verifying the return values of the setter expressions and dispatched calls, and by adding test cases for shared instance variables with differing types to ensure proper polymorphic boxing.
    foo = Foo.new
    bar = Bar.new
    set_x(foo, 42)
The PR description mentions that poly_dispatch_return_type was extended to ensure the return value of the setter matches the assigned value's type. However, the current test only verifies the side effect (the write to the ivar) and not the return value of the assignment expression itself. It would be more robust to verify that set_x and set_body return the expected value (e.g., by wrapping the calls in puts).
    class Req
      attr_accessor :body
      def initialize; @body = ""; end
    end

    class Res
      attr_accessor :body
      def initialize; @body = ""; end
    end
The current test for shared ivar names uses String for both Req#body and Res#body. To fully exercise the fix in poly_dispatch_return_type (which handles matching the C type to the ivar slot), it would be beneficial to add a case where the shared ivar has different types in different classes (e.g., String in one and Integer in another). This would verify that the polymorphic return value is correctly boxed when the types in the class union diverge.
    h = Holder.new
    h.set(Sub.new)
    h.call_hook("ok")
While the test correctly verifies that the subclass override is executed via its side effect (puts), it doesn't verify that the return value of the dispatched call is correctly handled as nil. Since the fix specifically addresses nil return values in polymorphic dispatch by ensuring they are properly boxed as sp_box_nil(), adding an assertion for the return value (e.g., puts h.call_hook("ok").nil?) would provide better coverage.
Follow-up to the previous attr_writer-on-poly-receiver fix, surfaced by review feedback (gemini-code-assist on PR matz#388).

When the new arm fires for a setter call whose function parameter widened to `poly` -- because the same setter is called with two differently-typed args from divergent call sites -- the rhs needs to be unboxed to the slot's concrete C type for each cls_id arm. Without this, a slot of type `int` (`mrb_int iv_slot`) ended up with the poly's `sp_RbVal lv_v` assigned directly, which fails C compilation.

Three rhs-fitting cases now:

- slot poly + arg concrete -> box the arg (existing path)
- slot concrete + arg poly -> unbox the arg via unbox_poly_to (new)
- else -> direct assign

Test (`test/poly_attr_writer.rb`) gains a third subcase: `IBox#slot:int` and `SBox#slot:String` sharing the slot name, with `set_slot(o, v)` called for both. Pre-fix this didn't even compile; with the unbox path, each cls_id arm picks the right concrete shape.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
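The three rhs-fitting cases can be read as a small decision function. The sketch below is an assumption about the shape of that decision, not the actual codegen: `fit_rhs` is a hypothetical name, types are modelled as plain strings, and it returns a symbol instead of emitting C, since the signature of `unbox_poly_to` is not shown in this PR.

```ruby
# Hypothetical decision helper mirroring the three cases in the commit
# message. Type names are plain strings here; the real codegen may
# represent them differently.
def fit_rhs(slot_type, arg_type)
  if slot_type == "poly" && arg_type != "poly"
    :box      # slot poly + arg concrete: box the arg
  elsif slot_type != "poly" && arg_type == "poly"
    :unbox    # slot concrete + arg poly: unbox to the slot's C type
  else
    :direct   # everything else: assign directly
  end
end

puts fit_rhs("poly", "int")
puts fit_rhs("int", "poly")
puts fit_rhs("string", "string")
```

Because the unbox decision is taken per cls_id arm, two classes that share a slot name but disagree on its type each get their own concrete unbox.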
… effect

Follow-up per gemini-code-assist's review on PR matz#388: the poly_nil_return_dispatch test verified that the subclass override's side effect (a `puts`) ran, but didn't verify that the dispatched call's return value was correctly observable as nil.

Add an explicit `result == nil` check after `h.call_hook("ok")` to confirm the boxing path threads the nil result through to the caller, not just the side effect.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
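A sketch of what such a check might look like is below. Only `Holder`, `Sub`, `set`, `call_hook` and the `result == nil` check come from this thread; the `Base` class, the `hook` method name and the method bodies are assumptions, and the real test may arrange the polymorphic receiver differently.

```ruby
# Hypothetical sketch of the nil-return dispatch check.
class Base
  def hook(msg); end          # empty body: return value is nil
end

class Sub < Base
  def hook(msg)
    puts "sub:" + msg         # puts returns nil, so the override does too
  end
end

class Holder
  def set(f); @f = f; end
  def call_hook(msg); @f.hook(msg); end
end

h = Holder.new
h.set(Sub.new)
result = h.call_hook("ok")    # side effect: the override must run
puts(result == nil)           # return value: must be observable as nil
```

This checks both halves of the fix: the `puts` inside the override proves the call was emitted, and the comparison proves the nil result reaches the caller.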
Addressed all three of @gemini-code-assist's points:
Re: Windows CI — |
Per matz#391 review: extend test/poly_attr_writer.rb with three subcases.

1. bool slot (the original tep symptom)
2. string slot with both classes agreeing
3. divergent slots: IBox{int} + SBox{string} sharing the `assign(o, v)` call site, so spinel widens both `o` and `v` to poly.

Each subcase also captures and checks the assignment expression's return value to confirm Ruby `obj.x = v` semantics (yields v).

Subcase 3 surfaced a hole in the original arm: when the rhs is poly (`sp_RbVal`) and the slot is concrete (mrb_int / const char *), the per-arm write needs to unbox the rhs into the slot's C type. Without that the C compiler errors with "assigning to const char * from sp_RbVal" -- because the divergent ivars demand different concrete unboxes. Mirrors the fix already in PR matz#388 for poly setter dispatch.

The arm now matches three rhs/slot shapes:

- slot poly, rhs concrete: box rhs.
- slot concrete, rhs poly: unbox rhs into the slot's type.
- both agree: pass through.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
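Subcase 3 might look roughly like the following. `IBox`, `SBox`, the `slot` name and `assign(o, v)` come from the commit message; the initial values and the arguments passed are assumptions made for illustration.

```ruby
# Hypothetical sketch of the divergent-slot subcase.
class IBox
  attr_accessor :slot
  def initialize; @slot = 0; end    # int slot
end

class SBox
  attr_accessor :slot
  def initialize; @slot = ""; end   # string slot
end

# One call site serves both classes with differently typed values, so
# both `o` and `v` are expected to widen to poly in the compiler.
def assign(o, v)
  o.slot = v
end

ib = IBox.new
sb = SBox.new
ri = assign(ib, 7)
rs = assign(sb, "seven")

puts ib.slot
puts sb.slot
puts ri
puts rs
```

Capturing `ri` and `rs` covers the return-value half of the check: each assignment should yield the value that was stored, in its original type.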
Two codegen fixes for polymorphic receivers
Both bugs surfaced while building tep, a
spinel-AOT'd Sinatra clone. Each is a silent miscompile — code runs,
no error, wrong behaviour — and each has a tiny, deterministic repro.
Repros:

- `repro_01_poly_ivar_write.rb`: see issue #N1
- `repro_02_empty_base_method.rb`: see issue #N2

Bug 1: `obj.x = v` no-ops when `obj`'s static type is a class union

The dispatch loop in `compile_call_expr`'s `if (recv.tag == SP_TAG_OBJ)` block emits arms only for explicit user methods and `attr_reader`s. Setter calls (`mname == "x="`) on an `attr_writer`-registered ivar fall through with no arm, so the surrounding `if (.tag == SP_TAG_OBJ) {}` block is empty. The store never executes.

Fix: add a third arm to the loop that recognises the `attr_writer` shape and emits a direct `((sp_C*)recv.v.p)->iv_x = v` inside the matching `cls_id` branch. Also extends `poly_dispatch_return_type` to recognise setters so the result temp's C type matches the ivar slot type (Ruby returns the rhs from `x=`, not an int default).

Bug 2: subclass override silently dropped when its return type is nil

`box_value_to_poly` for `at == "nil"` previously returned the bare literal `"sp_box_nil()"`, discarding the `val` argument. The dispatch loop calls this with `val = "sp_<Subclass>_method(...)"`, so the actual method call is dropped, the `cls_id` arm just produces `sp_box_nil()`, and the body of the override never runs.

This bites any base-class method with an empty body (or any subclass override whose final expression is `puts ...` / `Hash[]=` / similar nil-returning side effect).

Fix: emit the val for its side effect, then yield nil:

    return "((void)(" + val + "), sp_box_nil())"

The `(void)` cast suppresses the otherwise-benign "expression result unused" warning under -Wall configurations.

Verification

- `make test` passes (425/425).
- … passes (57/57), and several workarounds in tep can now be removed (renaming `Request#body` → `Request#raw_body` to avoid the poly-write bug; placeholder `0` returns in filter base methods to dodge the nil-return drop).

What this doesn't cover

- The `String#index` divergence, which returns `-1` instead of `nil` (filed as a separate issue, no PR; the choice between matching CRuby semantics vs. documenting the divergence is a maintainer call).
- … `int` in the absence of a concrete call site (deferred until I have a cleaner repro).