
ltm: per-element relative loop scores via subscripted IDs #472

Merged
bpowers merged 5 commits into main from ltm-per-element-rel-scores on Apr 25, 2026

@bpowers
Owner

@bpowers bpowers commented Apr 25, 2026

Summary

Adds per-element relative loop score access to the libsimlin FFI and pysimlin: callers can now pass r1[Boston], r1[2], or r1[Boston, 2] to address a specific slot of an arrayed (Apply-to-All) loop. Bare r1 on an arrayed loop now returns a signed argmax-abs aggregation across slots so single-line "this loop's importance" plots show the dominant element's contribution rather than silently reading slot 0.

Built on a separately-tracked engine bug fix (tech-debt #34): A2A loop_score variables were broadcasting slot 0 across every slot because compile_ltm_equation_fragment stubbed LTM-var-to-LTM-var dependencies as scalar, defeating per-element evaluation regardless of how the FFI consumer wired up subscripts. Without this fix, the per-element FFI would have shipped numerically identical values for every element.

Closes #463.

Commits

  1. engine: fix A2A loop_score scalar broadcast — root-cause fix in compile_ltm_equation_fragment so A2A loop_scores actually emit per-element values. Resolves tech-debt #34 (filed during this work after the bite check on the layout aggregation test surfaced that all per-element rel scores were identical).
  2. ltm: per-element relative loop scores via subscripted IDs — engine helpers (LoopElementIndex + resolver, per-element streaming helpers, argmax-abs aggregator), libsimlin SimState snapshot + cache + dispatch, layout compute_metadata rewire, pysimlin element= kwarg.
  3. build: ignore .claude/scheduled_tasks.lock — small .gitignore housekeeping.
  4. ltm: fix per-element rel-score stride and helper OOB — addresses the codex-cli review on this PR (P1: layout was using the loop's own n_slots to derive stride and n_steps from the helper's output, but the helper writes at the partition's max-slot stride; for mixed partitions this produced misaligned importance_series of length step_count * stride). Plus a latent OOB-read in compute_rel_loop_scores_per_element itself surfaced during investigation: when n_slots[i] < max_slots for an arrayed loop, the inner read OOB-read into the next variable's data. Both fixes centred on per-loop n_slots being the canonical bound; new aggregate_per_element_argmax_abs helper extracts the layout aggregation as a unit-testable pure function. Filed tech-debt #35 to track an adjacent engine quirk (A2A loops get partition = None due to a variable-level vs element-level stock-name mismatch in partition_for_loop).

Test plan

  • cargo test --workspace — all suites green (3,161 engine unit + 33 layout integration + 47 LTM integration + 16 libsimlin analysis + ~200 libsimlin total + 340 pysimlin tests).
  • cargo clippy --workspace --all-targets — clean.
  • cargo fmt --check — clean.
  • cd src/pysimlin && uv run pytest tests/ — 340 passed, 11 skipped.
  • New tests at every layer: LoopElementIndex builder & resolver (10), per-element streaming helpers (5), argmax-abs aggregator (5), subscript parser (13), libsimlin FFI dispatch (5), layout argmax-abs contract + bite check, pysimlin element kwarg + element_count + end-to-end arrayed access (3), helper OOB regression with sentinel-value fixture (1), aggregation helper unit tests including scalar-in-mixed-partition stride recovery (5), and an integration regression for layout importance_series length (1).
  • Two pre-existing tests had assertions that incorrectly assumed every element of every loop has dynamics; relaxed to "at least one element has dynamics" with comments explaining why the prior assertions passed only by accident.

Notes for reviewers

  • The original engine bug (commit 1) is small but high-leverage: a one-line change in dep-stub creation that was masking an entire category of arrayed-LTM correctness for years.
  • Slot-layout convention is row-major (last-dim-fastest), matching what the engine's compiler/dimensions.rs already documents. The resolver's offset arithmetic is linear = i_0 * s_1 + i_1 for 2D and the equivalent fold for ND.
  • The argmax-abs aggregator preserves sign so polarity information isn't lost in the single-line view; ties between slots are broken by lowest index for determinism.
  • Commit 4 addresses codex review #r3142129379. The reported P1 is fixed via the new aggregate_per_element_argmax_abs helper that recovers stride from the helper's actual output length; the helper itself is also hardened against an OOB-read that the codex review caused us to notice.
  • Tech-debt entries #34 (resolved) and #35 (filed) in docs/tech-debt.md document the engine-side discoveries from this work for future reference.
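
The signed argmax-abs rule from the notes above can be sketched for a single time step as follows. This is a minimal illustration, not the engine's helper: the name `signed_argmax_abs` and the slice-based signature are assumptions.

```rust
/// Return the value with the largest magnitude, keeping its sign.
/// Hypothetical helper for illustration; not the engine's actual API.
fn signed_argmax_abs(slots: &[f64]) -> f64 {
    let mut best = 0.0_f64;
    let mut best_abs = f64::NEG_INFINITY;
    for &v in slots {
        // Strict `>` keeps the lowest index on ties, and any comparison
        // against NaN is false, so a NaN never displaces a finite value.
        if v.abs() > best_abs {
            best = v;
            best_abs = v.abs();
        }
    }
    best
}
```

With slots `[0.2, -0.7, 0.5]` the sketch returns `-0.7`, so a dominant balancing element stays visibly negative in the single-line view. An empty slice yields `0.0`.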

bpowers added 3 commits April 25, 2026 07:47
When `compile_ltm_equation_fragment` resolved a dependency that referred
to another LTM synthetic variable (e.g. a loop_score's product equation
referring to its constituent link_scores), the dep stub was hardcoded to
`size: 1, ast: None` and forced the compiler to emit slot-0 reads for
every output slot.  As a result, A2A loop_score variables collapsed every
slot to slot 0's value even though the underlying link_scores were
correctly per-element distinct.  Probed in `arrayed_population.stmx`
(3 regions, distinct birth_rates): `link_score(births->population)` slot
values were 0.0306 / 0.0202 / 0.0102, but `loop_score` reported 0.0306
in all three slots.

The fix mirrors the existing dim-aware stub path used for explicit model
A2A vars (`build_stub_variable`): look up the dep's dimensions in the
salsa-cached `model_ltm_variables` result and synthesize an
`Ast::ApplyToAll(canonical_dims, dummy_const)` with the right slot count.
Scalar deps continue to fall through to the original 1-slot stub, so
non-arrayed loops are unchanged.

Two pre-existing tests had assertions that "every element must have
non-zero values" and only passed because of the broadcast bug; relaxed
to "at least one element has dynamics".  In `arrayed_population.stmx`,
LA's birth_rate equals the uniform death_rate (both 0.01), so LA's
population is in exact equilibrium and its link_scores legitimately
SAFEDIV to zero -- the broadcast bug had been hiding that all along.

The new `test_a2a_loop_score_has_distinct_per_element_values` pins the
contract going forward (heterogeneous birth_rate per region, asserts
`loop_score` slots differ visibly).  Tracked as tech-debt item #34;
marked RESOLVED with the discovery context preserved.

The libsimlin scalar FFI for relative loop scores silently returned
slot 0 for arrayed (Apply-to-All) loops.  This commit ships per-element
introspection: callers can now pass `r1[Boston]` (named element),
`r1[2]` (1-based integer for indexed dims), or `r1[Boston, 2]` (2D
mixed) to get a specific element's series; bare `r1` on an arrayed
loop returns a signed argmax-abs aggregation across slots so a
single-line "this loop's importance" view shows the dominant
element's contribution at each step rather than always slot 0.

Engine surface:
  - `ltm_post::LoopElementIndex` + `build_loop_element_index` map each
    loop_id to its dim metadata (canonical names, sizes, named-vs-
    indexed flag).  `resolve(&[&str])` does row-major slot offset
    arithmetic with internal canonicalize, accepts either named
    elements or 1-based integers, and returns informative
    `ResolveError` variants for FFI error formatting.
  - Per-element streaming helpers (`compute_partition_denominator_for_element`,
    `compute_rel_loop_score_for_element`) + signed argmax-abs aggregator
    (`compute_rel_loop_score_argmax_abs`).  Scalar loops broadcast
    slot 0 to keep mixed-shape partitions consistent with
    `compute_rel_loop_scores_per_element`.
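
As a rough sketch of the row-major offset arithmetic described for `resolve`, the following Rust shows one way to fold subscripts into a slot index. The `Dim` struct, the `resolve_slot` name, the lowercase canonicalization, and the `String` error type are all assumptions made for illustration; the real `LoopElementIndex` and `ResolveError` are not shown in this PR text. As a simplification, the sketch accepts a 1-based integer on any dimension.

```rust
/// One dimension of an arrayed loop. `names` is empty for an indexed
/// dimension. (Hypothetical shape, for illustration only.)
struct Dim {
    size: usize,
    names: Vec<String>,
}

/// Resolve one subscript per dimension to a row-major slot offset.
fn resolve_slot(dims: &[Dim], subscripts: &[&str]) -> Result<usize, String> {
    if dims.len() != subscripts.len() {
        return Err(format!(
            "expected {} subscript(s), got {}",
            dims.len(),
            subscripts.len()
        ));
    }
    let mut linear = 0usize;
    for (dim, raw) in dims.iter().zip(subscripts.iter()) {
        let sub = raw.trim().to_lowercase();
        // Try a named element first, then fall back to a 1-based integer.
        let by_name = dim.names.iter().position(|n| n.to_lowercase() == sub);
        let idx = match by_name {
            Some(i) => i,
            None => match sub.parse::<usize>() {
                Ok(n) if n >= 1 && n <= dim.size => n - 1,
                _ => return Err(format!("no element '{}'", sub)),
            },
        };
        // Row-major fold (last dimension fastest): for 2D this is
        // i_0 * s_1 + i_1.
        linear = linear * dim.size + idx;
    }
    Ok(linear)
}
```

For a 3-element named dimension followed by a 2-element indexed one, `["Boston", "2"]` maps to slot 1 when Boston is the first element, and the last element paired with `"2"` maps to slot 5.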

Layout:
  - `compute_metadata` now computes `FeedbackLoop::importance_series`
    via per-element rel scores aggregated by signed argmax-abs.  For
    scalar loops this reduces to identity; for arrayed loops with
    distinct per-element dynamics it differs from the prior slot-0
    collapse (verified by `test_arrayed_loop_importance_matches_argmax_abs_aggregation`).

libsimlin:
  - `SimState` snapshots `loop_element_index` at `simlin_sim_new`
    time.  Cache key changes from `Option<usize>` (partition) to
    `(Option<usize>, usize)` (partition x element) so per-element
    lookups amortize.
  - `simlin_analyze_get_relative_loop_score` parses subscripted IDs
    (new `parse_subscripted_loop_id` helper), resolves to a slot
    via `LoopElementIndex.resolve`, and dispatches to either the
    scalar helper, the per-element helper, or the argmax-abs
    aggregator based on subscript presence and loop arity.  Errors
    map `ResolveError` variants to readable FFI messages
    ("loop 'r1' dimension 'region' has no element 'tokyo'", etc.).
  - New `simlin_analyze_get_loop_element_count` accessor lets
    callers detect arrayed loops without parsing trial subscripts.
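
The three-way dispatch on subscript presence and loop arity might look roughly like the sketch below. The `Dispatch` enum and `choose_dispatch` function are hypothetical names, and rejecting a subscript on a scalar loop is an assumption; the PR text does not say how that case is handled.

```rust
/// Which helper a request is routed to. Hypothetical, for illustration.
#[derive(Debug, PartialEq)]
enum Dispatch {
    Scalar,
    PerElement(usize),
    ArgmaxAbs,
}

/// `n_slots` is the loop's element count; `slot` is the resolved
/// subscript, or `None` when the caller passed a bare loop id.
fn choose_dispatch(n_slots: usize, slot: Option<usize>) -> Result<Dispatch, String> {
    match slot {
        // Bare id on a scalar loop: the original scalar path.
        None if n_slots <= 1 => Ok(Dispatch::Scalar),
        // Bare id on an arrayed loop: aggregate across slots.
        None => Ok(Dispatch::ArgmaxAbs),
        // Assumption: a subscript on a scalar loop is treated as an error.
        Some(_) if n_slots <= 1 => Err("subscript given for a scalar loop".to_string()),
        Some(k) if k < n_slots => Ok(Dispatch::PerElement(k)),
        Some(k) => Err(format!("slot {} out of range for {} slots", k, n_slots)),
    }
}
```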

pysimlin:
  - `Sim.get_relative_loop_score` accepts an optional `element` kwarg
    (string, int, or tuple); the helper formats `loop_id[e1, e2, ...]`
    before passing to the FFI.
  - New `Sim.get_loop_element_count(loop_id)` introspection.

Closes the FFI surface portion of issue #463.  The visible per-element
behaviour required tech-debt #34 (A2A loop_score per-element fix) to
land first; with both in place, the layout argmax-abs path differs
from the slot-0 collapse for arrayed fixtures (the bite check in
`test_arrayed_loop_importance_matches_argmax_abs_aggregation` now
passes deterministically rather than by accident).

The Claude Code harness writes a per-session lock file at
`.claude/scheduled_tasks.lock` to coordinate scheduled task firings.
It's runtime state, not project content, and shouldn't appear in
`git status` for anyone using the harness in this checkout.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bf8f8bfbb3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/simlin-engine/src/layout/mod.rs Outdated
Comment on lines +3923 to +3924
let n_steps = series.len() / n;
(0..n_steps)

[P1] Index aggregated loop series with partition stride

compute_rel_loop_scores_per_element returns each loop’s data as step_count * max_slots_in_partition, but this code derives n_steps from the loop’s own slot count (n) and then indexes with that same stride. In mixed partitions (which are explicitly supported, e.g. scalar + arrayed loops), n < max_slots, so importance_series is read with the wrong layout and can become longer than the simulation horizon with misaligned values. This corrupts loop importance output for those models; the read should use the partition stride (or results.step_count) rather than per-loop n.
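
To make the stride mismatch concrete, here is a small Rust sketch. The function name `first_slot_per_step` and the sample values are hypothetical; the only thing taken from the review is that a loop's series is laid out as `step_count * max_slots_in_partition`.

```rust
/// Read one value per step, assuming `stride` values are stored per step.
/// Hypothetical helper for illustration only.
fn first_slot_per_step(series: &[f64], stride: usize) -> Vec<f64> {
    assert!(stride > 0, "stride must be positive");
    series.chunks(stride).map(|step| step[0]).collect()
}
```

Take a scalar loop in a partition whose largest loop has 3 slots, simulated for 2 steps, so the series holds 6 values. Reading it with the loop's own slot count (1) yields 6 entries, which is longer than the simulation horizon. Reading it with `series.len() / step_count` (3) yields the expected 2.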


@claude

claude Bot commented Apr 25, 2026

Code Review

[P2] Layout argmax-abs aggregation strides by loop's own n_slots instead of partition's max slots

File: src/simlin-engine/src/layout/mod.rs:3914-3942

compute_rel_loop_scores_per_element returns each loop's series with length step_count * max_slots, where max_slots is the partition's largest slot count (see ltm_post.rs:209 where it allocates vec![0.0; step_count * max_slots], and :234 where it writes series[step * max_slots + k]). The new layout aggregator computes n_steps = series.len() / n and indexes series[t * n + k] using n = n_slots_by_loop.get(&dl.id) — the loop's own slot count. These match only for uniform partitions where every member shares the same dimensions. In a mixed partition (e.g. a scalar loop coexisting with an arrayed loop, or two arrayed loops with different shapes — patterns the docstring on compute_rel_loop_scores_per_element explicitly anticipates), n_steps becomes step_count * max_slots / n ≠ step_count, producing an importance_series of wrong length whose values index into mis-aligned positions. The fixture used by test_arrayed_loop_importance_matches_argmax_abs_aggregation is uniform 3-region A2A, and the test's own "expected" computation mirrors the same buggy stride, so neither the layout nor the test catches this. To fix, derive the stride from series.len() / results.step_count (or thread the partition's max_slots through) and iterate k in 0..n.min(stride).

Overall correctness verdict

Not correct. The PR is well-tested for the FFI surface and the core engine fix is sound, but the layout's importance_series aggregation is incorrect for mixed-slot-count partitions and the layout test does not cover that case.

@claude

claude Bot commented Apr 25, 2026

Review of PR #472: ltm: per-element relative loop scores via subscripted IDs

[P2] compute_rel_loop_scores_per_element reads past a loop's slots when partitions mix arrayed loops with different slot counts

Location: src/simlin-engine/src/ltm_post.rs lines 219-235

For a cycle partition where two A2A loops have differing slot counts (e.g. slot_counts = [2, 3]), group_max_slots becomes 3 and the inner for k in 0..max_slots iterates to k = 2. For the 2-slot loop, slot_counts[i] > 1 so elem = k = 2, and row[off + 2] reads past that loop's allocated slots into the next variable's slot space (or panics if off + 2 >= step_size). Both the denominator (line 225) and the numerator (line 232) suffer the same bug. The streaming variant compute_partition_denominator_for_element clamps via effective_slot(), and the docstring on effective_slot (Saturation matches the partition-wide max-slots stride convention used by [compute_rel_loop_scores_per_element]…) even claims this function uses the same convention — but the code does not. Trigger requires a single SCC containing arrayed loops with different dimension products (e.g. stocks dimensioned by Region[3] and Cohort[4] linked through cross-element feedback); the included tests all use single-loop or scalar+arrayed partitions, so it is latent.

[P2] compute_metadata argmax-abs aggregation strides by the wrong slot count for partitions with mixed arrayed slot counts

Location: src/simlin-engine/src/layout/mod.rs lines 3915-3942

per_element_rel_scores[loop_id] is laid out as step_count * partition_max_slots (allocated at ltm_post.rs:209), but the aggregator reads it with n = n_slots_by_loop.get(&dl.id) — the loop's own slot count. When the loop's n_slots differs from the partition's max_slots, series.len() / n produces a wrong n_steps and series[t * n + k] walks across step boundaries instead of within a single step's slot block. The resulting importance_series is garbage for any arrayed loop whose partition contains another arrayed loop with a larger slot count. Same trigger condition as the previous finding (mixed-arity partitions); current fixtures don't exercise it, but the layout consumer is what callers see in the diagram UI.

Overall correctness

The core engine fix (LTM-var-to-LTM-var dep stub now respects A2A dimensions in db_ltm.rs:806-856) is correct and well-covered by the new bite check, the FFI subscript parser/resolver is solid, and the FFI per-element streaming/cache path is safe (it uses effective_slot to clamp). The two issues above are real correctness bugs in the post-sim per-element pipeline that trigger only with mixed-arity arrayed partitions. Patch is largely correct but has the noted latent bugs.

@codecov

codecov Bot commented Apr 25, 2026

Codecov Report

❌ Patch coverage is 94.35696% with 43 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.08%. Comparing base (e848621) to head (c264ab8).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/libsimlin/src/analysis.rs | 86.08% | 16 Missing ⚠️ |
| src/simlin-engine/tests/layout.rs | 90.26% | 11 Missing ⚠️ |
| src/libsimlin/tests/analysis.rs | 97.51% | 6 Missing ⚠️ |
| src/simlin-engine/tests/simulate_ltm.rs | 93.93% | 6 Missing ⚠️ |
| src/libsimlin/src/simulation.rs | 83.33% | 2 Missing ⚠️ |
| src/simlin-engine/src/db_ltm.rs | 94.73% | 1 Missing ⚠️ |
| src/simlin-engine/src/ltm_post.rs | 99.30% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #472      +/-   ##
==========================================
+ Coverage   81.90%   82.08%   +0.18%     
==========================================
  Files         187      187              
  Lines       53170    53884     +714     
==========================================
+ Hits        43550    44233     +683     
- Misses       9620     9651      +31     


Codex review on PR #472 flagged that `try_detect_ltm_loops_incremental`
in `layout/mod.rs` was deriving `n_steps` from a loop's own `n_slots`
and indexing the helper output with that same stride.  But
`compute_rel_loop_scores_per_element` writes each loop's series at the
partition's max-slot stride, not the loop's own n_slots.  For mixed
partitions (loops with different slot counts sharing a partition) this
silently produced importance_series of length `step_count * stride`
with misaligned values instead of length `step_count`.

While investigating, an adjacent OOB-read in the helper came to light:
when iterating `k in 0..max_slots` the inner read pulls
`row[off + k]` for arrayed members, OOB-reading past the loop's own
slot range when partitions mix arrayed loops of different
dimensionalities.

Two principled fixes, both centered on the loop's own n_slots being
the canonical bound:

(1) `compute_rel_loop_scores_per_element` now gates per-element reads
and writes by `k < n_slots[i]` for arrayed loops.  Scalar loops
continue broadcasting slot 0 (existing semantic).  Positions
`n_slots..max_slots` for arrayed loops with `n < max_slots` are
zero-filled rather than OOB-read; the partition denom only includes
contributions from loops that actually have an element at that index.
A new RED test (`per_element_helper_handles_arrayed_with_smaller_n`)
plants a sentinel `999.0` at the OOB target so the bug fails loudly
rather than passing on whatever uninitialised data the allocator
returns.
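
A minimal Rust sketch of the gated sweep for one time step, under stated assumptions: the `LoopSlots` struct, the `rel_scores_for_step` name, and the "score divided by the sum of absolute scores" form of the relative score are illustrative, not the helper's real signature.

```rust
/// A loop whose raw scores live at `row[off..off + n_slots]`.
/// Hypothetical shape, for illustration only.
struct LoopSlots {
    off: usize,
    n_slots: usize,
}

/// Per-element relative scores for one step, one Vec of length
/// `max_slots` per loop. Positions a loop does not own stay zero.
fn rel_scores_for_step(row: &[f64], loops: &[LoopSlots]) -> Vec<Vec<f64>> {
    let max_slots = loops.iter().map(|l| l.n_slots).max().unwrap_or(0);
    let mut out = vec![vec![0.0; max_slots]; loops.len()];
    for k in 0..max_slots {
        // Scalars broadcast slot 0; arrayed loops are read only when
        // k is within their own slot range, never past it.
        let read = |l: &LoopSlots| -> Option<f64> {
            if l.n_slots == 1 {
                Some(row[l.off])
            } else if k < l.n_slots {
                Some(row[l.off + k])
            } else {
                None
            }
        };
        let denom: f64 = loops.iter().filter_map(|l| read(l)).map(f64::abs).sum();
        for (i, l) in loops.iter().enumerate() {
            if let Some(v) = read(l) {
                if denom > 0.0 {
                    out[i][k] = v / denom;
                }
            }
        }
    }
    out
}
```

In a fixture with a 3-slot loop followed by a 2-slot loop and a sentinel placed right after the second loop's data, the sentinel never enters any denominator, and the 2-slot loop's third position stays at zero.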

(2) New `aggregate_per_element_argmax_abs` helper in `ltm_post.rs`
takes the per-element rel-score map, `n_slots_by_loop`, and
`step_count` and produces one signed argmax-abs series per loop.
Stride is recovered from `series.len() / step_count` so the layout no
longer needs to track partition stride.  Argmax-abs iterates the
loop's own n_slots only (always <= stride by construction).
`try_detect_ltm_loops_incremental` is rewired to call this helper
instead of the inlined buggy aggregation.  Five RED tests cover
scalar-in-mixed-partition stride recovery, arrayed-in-mixed-stride
iterating own slots only, NaN/Inf filtering, empty-series handling,
and a pure-arrayed sanity check.
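
The stride-recovery aggregation can be sketched as below. This is an illustration under assumptions: the function name `aggregate_argmax_abs`, the `HashMap<String, _>` parameter types, and the default of one slot for a loop missing from `n_slots_by_loop` are not taken from the actual `aggregate_per_element_argmax_abs` signature.

```rust
use std::collections::HashMap;

/// Collapse each loop's per-element series (stored at some stride per
/// step) into one signed argmax-abs value per step. Hypothetical sketch.
fn aggregate_argmax_abs(
    per_element: &HashMap<String, Vec<f64>>,
    n_slots_by_loop: &HashMap<String, usize>,
    step_count: usize,
) -> HashMap<String, Vec<f64>> {
    let mut out = HashMap::new();
    for (id, series) in per_element {
        // Recover the stride from the data instead of trusting n_slots.
        let stride = if step_count == 0 { 0 } else { series.len() / step_count };
        if stride == 0 {
            out.insert(id.clone(), Vec::new());
            continue;
        }
        // Only the loop's own slots are scanned; padding is ignored.
        let n = n_slots_by_loop.get(id).copied().unwrap_or(1).min(stride);
        let agg: Vec<f64> = series
            .chunks(stride)
            .take(step_count)
            .map(|step| {
                let mut best = 0.0_f64;
                let mut best_abs = f64::NEG_INFINITY;
                for &v in &step[..n] {
                    if v.abs() > best_abs {
                        best = v;
                        best_abs = v.abs();
                    }
                }
                // Non-finite picks are reported as 0.
                if best.is_finite() { best } else { 0.0 }
            })
            .collect();
        out.insert(id.clone(), agg);
    }
    out
}
```

Whatever the partition stride was, each output series has length `step_count`.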

The integration regression test
`test_compute_metadata_importance_series_length_matches_step_count`
provides forward-looking layout-level coverage.  It currently does
NOT trigger the mixed-stride path due to a separately-tracked engine
quirk (new tech-debt #35: A2A loops get `partition = None` because
`Loop::stocks` holds variable-level names while `stock_partition`
keys on element-level names).  Once that quirk is fixed, the same
fixture should immediately start exercising mixed-stride partitions
and this test will keep the layout aggregation honest -- the unit
tests cover the algorithm directly with hand-crafted inputs that
deliberately simulate mixed-stride partitions until then.

Filed tech-debt #35 to track the partition-mismatch quirk.
@claude

claude Bot commented Apr 25, 2026

[P3] FFI streaming helpers diverge from compute_rel_loop_scores_per_element after commit 4

Files: src/simlin-engine/src/ltm_post.rs:369-374 and src/simlin-engine/src/ltm_post.rs:385-397

Commit 4 changed compute_rel_loop_scores_per_element to skip (zero-contribute) arrayed loops at partition indices k >= n_slots rather than clamp, but the streaming helpers (effective_slot, compute_partition_denominator_for_element, compute_rel_loop_score_for_element, compute_rel_loop_score_argmax_abs) still clamp to n_slots - 1. The doc on effective_slot explicitly claims the saturation "matches the partition-wide max-slots stride convention used by compute_rel_loop_scores_per_element", and the doc on compute_partition_denominator_for_element says it lets the FFI amortize "without falling back to the non-streaming compute_rel_loop_scores_per_element" — both claims are now stale. As a result, for any mixed-stride partition (one arrayed loop with n_a slots sharing a partition with another arrayed loop with n_b < n_a slots), the FFI bare-arrayed simlin_analyze_get_relative_loop_score denom at k in n_b..n_a includes a clamped |row[off_b + n_b - 1]| contribution, whereas the layout's importance_series for the same loop excludes it — so the FFI and layout will silently produce different numbers for the same model. This is currently latent because tech-debt #35 means A2A loops never share a partition with anything (per_element_streaming_matches_full_sweep only covers same-shape partitions, so doesn't catch the divergence), but if/when #35 is fixed the streaming helpers will quietly disagree with the helper they document themselves as mirroring.


Overall correctness verdict

Correct (with caveat). The patch passes its tests and the engine-side fixes (commit 1 LTM-var dep stub, commit 4 zero-fill / stride-recovery) are solid. The latent divergence above only manifests when mixed-stride partitions exist, which the author has tracked as tech-debt #35; nothing currently observable breaks.

Codex review on the previous commit pointed out that
`compute_rel_loop_scores_per_element` was changed to *skip*
arrayed loops at partition indices `k >= n_slots`, but the
streaming helpers (`effective_slot`,
`compute_partition_denominator_for_element`,
`compute_rel_loop_score_for_element`,
`compute_rel_loop_score_argmax_abs`) still *clamped* to the
loop's last slot.  The two paths therefore disagreed for any
mixed-stride partition: the FFI bare-arrayed
`simlin_analyze_get_relative_loop_score` denom included a
clamped `|row[off + n - 1]|` contribution from a loop with
fewer slots than the partition max, while the layout's
`importance_series` (which goes through the full-sweep helper)
did not.  The two helpers' docstrings claimed they mirrored
each other; that claim was stale.

Fix: `effective_slot` now returns `Option<usize>`, returning
`None` for arrayed loops when `element_index >= n_slots`.
Callers thread the `None` through:
  - `compute_partition_denominator_for_element` skips
    `None`-returning members (no contribution to the partition
    sum).
  - `compute_rel_loop_score_for_element` returns an
    all-zeros series (the loop has no own element at this
    partition index, matching the full-sweep zero-fill).
  - `compute_rel_loop_score_argmax_abs` skips `None`
    iterations defensively (its k iteration is always within
    `0..n_slots` by construction so the branch is unreachable
    in practice, but keeping the call documents the invariant
    and guards future callers).
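
A rough Rust sketch of the skip semantic follows. The parameter lists are assumptions (the real `effective_slot` and denominator helper signatures are not shown here), and `partition_denominator` is a hypothetical stand-in for the streaming denominator helper.

```rust
/// Which slot of a loop to read at a partition element index, if any.
/// Scalars always read slot 0; arrayed loops return None past their
/// own slots instead of clamping. Signature assumed for illustration.
fn effective_slot(n_slots: usize, element_index: usize) -> Option<usize> {
    if n_slots <= 1 {
        Some(0)
    } else if element_index < n_slots {
        Some(element_index)
    } else {
        None
    }
}

/// Sum of |score| over partition members that own `element_index`.
/// Each member is an `(offset, n_slots)` pair into `row`.
fn partition_denominator(row: &[f64], members: &[(usize, usize)], element_index: usize) -> f64 {
    members
        .iter()
        .filter_map(|&(off, n_slots)| {
            effective_slot(n_slots, element_index).map(|s| row[off + s].abs())
        })
        .sum()
}
```

With a 3-slot loop holding `[1.0, 2.0, 5.0]` and a 2-slot loop holding `[3.0, 999.0]`, the denominator at element index 2 is 5.0 under the skip rule. Clamping to the shorter loop's last slot would have produced 1004.0 instead.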

Updated docstrings on `effective_slot` and the streaming
denominator helper to describe the skip semantic and remove
the stale "matches partition-wide max-slots stride convention"
language.

TDD: two new RED tests pinned the divergence (planted a
sentinel value at the clamp target so the bug failed loudly
with 1004.0 = |A[2]| + sentinel and 999.0 = sentinel/denom);
both now pass.  A new parity test
`streaming_helpers_match_full_sweep_in_mixed_stride_partition`
extends the existing same-shape parity check to mixed-stride
so any future drift is caught structurally rather than waiting
for a code reviewer to notice.

This divergence is currently latent in production due to
tech-debt #35 (A2A loops never share a partition with anything
because of a separate variable-vs-element-level stock-name
mismatch in `partition_for_loop`), but the fix lands now so
that resolving #35 doesn't immediately introduce silent
FFI/layout disagreement.
@claude

claude Bot commented Apr 25, 2026

Code Review

I reviewed the engine fix (LTM dep-stub in compile_ltm_equation_fragment), the new ltm_post.rs primitives (effective_slot, streaming partition-denominator/per-element/argmax-abs helpers, LoopElementIndex/build_loop_element_index, aggregate_per_element_argmax_abs), the libsimlin FFI subscript parser/resolver/dispatch and per-(partition, element_index) denominator cache, the layout compute_metadata rewire, and the pysimlin element= kwarg + get_loop_element_count.

I did not find any bugs introduced by this PR. Notable points I checked:

  • The effective_slot skip-vs-clamp distinction is consistent across both the streaming (compute_partition_denominator_for_element, compute_rel_loop_score_for_element) and full-sweep (compute_rel_loop_scores_per_element) paths, so mixed-stride partitions stay bit-aligned within the same iteration order. Sentinel-value tests (per_element_helper_handles_arrayed_with_smaller_n, streaming_partition_denominator_skips_arrayed_loops_past_own_slots, streaming_rel_score_returns_zeros_when_loop_has_no_own_element) explicitly cover the OOB-read regression.
  • The argmax-abs aggregator's NaN handling is correct: v.abs() > best_abs returns false for NaN, so a NaN never displaces a finite candidate, and aggregate_per_element_argmax_abs additionally filters non-finite picks to 0 to match the layout's existing convention.
  • FFI denominator cache keyed on (Option<usize>, usize) is invalidated in lockstep with results (cleared on run_to_end, reset, set_value_by_offset).
  • The subscript parser handles all the listed edge cases (r1[], r1[, r1], [r1], r1[a][b], r1[a]b, r1[a,], r1[,a]); tests cover all of these.
  • The engine fix in compile_ltm_equation_fragment correctly mirrors the build_stub_variable pattern for explicit A2A vars, looking up the dep's dimensions via salsa-cached model_ltm_variables and emitting an Ast::ApplyToAll stub when dimensions are non-empty.
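
For readers unfamiliar with the subscript syntax, a parser with these properties could be sketched as follows. The name `parse_loop_id`, the return type, and the choice to reject every malformed form listed above are assumptions; the real `parse_subscripted_loop_id` may differ.

```rust
/// Split `r1[Boston, 2]` into ("r1", ["Boston", "2"]). A bare `r1`
/// yields an empty subscript list. Hypothetical sketch.
fn parse_loop_id(input: &str) -> Result<(String, Vec<String>), String> {
    let malformed = || format!("malformed loop id '{}'", input);
    let s = input.trim();
    let open = match s.find('[') {
        Some(i) => i,
        None => {
            // No '[': a bare id, which must not contain a stray ']'.
            if s.is_empty() || s.contains(']') {
                return Err(malformed());
            }
            return Ok((s.to_string(), Vec::new()));
        }
    };
    let base = s[..open].trim();
    let rest = &s[open + 1..];
    // Require a non-empty id and a bracket that closes at the very end.
    if base.is_empty() || !rest.ends_with(']') {
        return Err(malformed());
    }
    let inner = &rest[..rest.len() - 1];
    // Reject a second bracket pair such as r1[a][b].
    if inner.contains('[') || inner.contains(']') {
        return Err(malformed());
    }
    let parts: Vec<String> = inner.split(',').map(|p| p.trim().to_string()).collect();
    // Reject empty subscripts: r1[], r1[a,], r1[,a].
    if parts.iter().any(|p| p.is_empty()) {
        return Err(malformed());
    }
    Ok((base.to_string(), parts))
}
```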

Overall correctness verdict

Correct. The patch is free of blocking issues and existing tests are extended (not weakened) at every layer. The two pre-existing tests that were relaxed (test_arrayed_population_ltm_exhaustive, test_cross_element_ltm_exhaustive) are documented as having passed by accident pre-tech-debt-#34, and the new assertions reflect real fixture semantics.

@bpowers bpowers merged commit 371a7ed into main Apr 25, 2026
12 checks passed
@bpowers bpowers deleted the ltm-per-element-rel-scores branch April 25, 2026 18:02
Development

Successfully merging this pull request may close these issues.

LTM scalar FFI returns element-0 only for arrayed A2A loops

1 participant