Merged
38 commits
547fe60
design: sentinel migration completion plan
hetoku May 18, 2026
31df6e8
plan: sentinel migration finish — Stages A-F task breakdown
hetoku May 18, 2026
10ff40c
test: sentinel-only baseline RFL — gates the migration
hetoku May 18, 2026
4566196
vec: add sentinel_is_null inline helper
hetoku May 18, 2026
1da4b93
vec: ray_vec_is_null reads sentinel, not bitmap
hetoku May 18, 2026
f8a2e9c
Revert "vec: ray_vec_is_null reads sentinel, not bitmap"
hetoku May 18, 2026
374e83a
audit: RAYFORCE_NULL_AUDIT instrumentation + revised migration strategy
hetoku May 18, 2026
1a71f0f
S1.1: cast_vec_copy_nulls walks source for sentinel fill
hetoku May 18, 2026
980656c
S1.2: par_set_null writes sentinel + bitmap (parallel-safe)
hetoku May 18, 2026
a34ddfa
S1.3: ray_vec_set_null_checked stamps sentinel + bitmap
hetoku May 18, 2026
d9e9624
S1.4: ray_vec_set_null_checked STR path zeroes the element
hetoku May 18, 2026
d90537c
audit: suppress divergence reports on BOOL/U8 (non-nullable)
hetoku May 18, 2026
235540e
audit: also suppress GUID (no defined NULL_GUID sentinel yet)
hetoku May 18, 2026
fdfdcff
S1.6: propagate_nulls fast path stamps sentinels
hetoku May 18, 2026
7d5e80e
S1.7: test_vec_null_inline restores payload before clearing null
hetoku May 18, 2026
83f5a75
S2a: flip ray_vec_is_null to sentinel for nullable types
hetoku May 18, 2026
77ae2af
S2b: flip RAY_ATOM_IS_NULL + NULL_GUID sentinel + test updates
hetoku May 18, 2026
69e21e6
S3.1: NULL_F32 = NaN; F32 joins sentinel-supported types
hetoku May 18, 2026
8564d06
docs: explain dual-encoding hold + BOOL/U8 deferred lockdown
hetoku May 18, 2026
24812b9
S3'.1: serde ser_null_bitmap derives bits from sentinel reads
hetoku May 18, 2026
d28f73e
S3'.2: expr.c null-handling switched to sentinel reads
hetoku May 18, 2026
3612144
S3'.3a: group.c reduce_range + cdpg + FIRST/LAST sentinel-aware
hetoku May 18, 2026
cce3641
S3'.3b: group.c (median/topk/pearson) + query.c sentinel-aware
hetoku May 18, 2026
99e6a0a
docs(vec): retarget bitmap-strip note to Stage 3'' converters
hetoku May 18, 2026
ff2aa08
S3''.1: morsel synthesizes null_bits from sentinels
hetoku May 18, 2026
8005b54
docs(vec): bitmap strip blocked on obsolete test removal approval
hetoku May 18, 2026
096d6d0
S3: strip bitmap writes for sentinel-supporting types
hetoku May 18, 2026
1d69475
S3.2: par_set_null strips bitmap write for sentinel types
hetoku May 18, 2026
657334a
docs(vec): note BOOL/U8 lockdown effort scope
hetoku May 18, 2026
d34fc31
S3.3: BOOL/U8 lockdown — set_null returns TYPE for non-nullable types
hetoku May 18, 2026
34e4681
S4.1: strip dead bitmap write paths now that the sentinel is sole truth
hetoku May 18, 2026
67028d0
S4.2: stop persisting / restoring ext_nullmap in col file format
hetoku May 18, 2026
65bb04e
S4.4a: strip ext_nullmap bitmap from CSV reader and splayed writer
hetoku May 18, 2026
907a5f1
S4.4: strip remaining ext_nullmap allocators (serde + linkop)
hetoku May 18, 2026
d674feb
S4.5: drop dead NULLMAP_EXT / ext_nullmap bitmap surface
hetoku May 18, 2026
d78ad5d
S4.6: drop ext_nullmap / str_ext_null names from the ray_t union arm
hetoku May 18, 2026
e32e18e
Scrub sentinel-migration narrative from comments, docs, and tests
hetoku May 18, 2026
07ef83b
fix(query): CDPG_BUF_INSERT macro local shadows caller's `v`
hetoku May 18, 2026
2 changes: 1 addition & 1 deletion docs/memory.svg
12 changes: 6 additions & 6 deletions docs/superpowers/plans/2026-05-04-universal-dag-vm.md
@@ -35,9 +35,9 @@

---

## Phase 1 — Boundary materialisation (Layer B)
## Pass 1 — Boundary materialisation (Layer B)

These tasks make it safe for producers to return lazy. After Phase 1 the codebase still produces no lazy values, so behaviour is unchanged — but the safety net is in place.
These tasks make it safe for producers to return lazy. After Pass 1 the codebase still produces no lazy values, so behaviour is unchanged — but the safety net is in place.

### Task 1: `ray_lazy_materialize` runs `ray_optimize`

@@ -180,11 +180,11 @@ computation."

- [ ] **Step 2: If any gaps found, add materialise prelude per the same pattern as Tasks 2–3**

Otherwise, no commit — Phase 1 is complete.
Otherwise, no commit — Pass 1 is complete.

---

## Phase 2 — Flip producers to return lazy (Layer A, partial)
## Pass 2 — Flip producers to return lazy (Layer A, partial)

Only the `AGG_VEC_VIA_DAG` macro flip in this phase. The single-op leaf cases in `ray_min_fn` / `ray_max_fn` (`agg.c:225, 254`) keep their `wrap+materialize` because they need `recast_i64_to_orig` post-processing that depends on a concrete result. That recast is a separate executor cleanup, deferred.

@@ -255,7 +255,7 @@ agg.c that were dormant code until now."

---

## Phase 3 — Lift four ops into the DAG (Layer C)
## Pass 3 — Lift four ops into the DAG (Layer C)

Each task is one op and is fully self-contained: opcode + builder + executor + dump entry + lazy-append type rule + `*_fn` refactor. Land in any order.

@@ -488,7 +488,7 @@ Same shape. `OP_REVERSE = 107`. Refactors `ray_reverse_fn` (`collection.c:1710`)

---

## Phase 4 — Idiom rewrite pass (Layer D)
## Pass 4 — Idiom rewrite pass (Layer D)

### Task 10: Skeleton — `idiom.h` + `idiom.c` with empty table, wired into `ray_optimize`

@@ -45,7 +45,7 @@ showed:
calls**. They work today only because no producer ever returns
lazy.

So the original "Phase 1 = lift four ops; Phase 2 = idiom rewriter"
So the original "Pass 1 = lift four ops; Pass 2 = idiom rewriter"
framing was incomplete. The honest framing is one principle with three
mechanical consequences. This revision restructures around that.

113 changes: 63 additions & 50 deletions include/rayforce.h
@@ -113,22 +113,27 @@ typedef enum {
typedef union ray_t {
/* Allocated: object header */
struct {
/* Bytes 0-15: nullable bitmask / slice / ext nullmap / index */
/* Bytes 0-15: slice / sym_dict / str_pool / index / link arm.
* Null state is sentinel-encoded in the payload (see
* src/vec/vec.c); this 16-byte slot carries no bitmap bits.
* The `nullmap` name is retained as the raw-byte view used by
* atoms (nullmap[0]&1), envs (builtin name @ nullmap[2..15]),
* tables / dicts / lists / str-pools (zero-init), and the col
* on-disk header. */
union {
uint8_t nullmap[16];
struct { union ray_t* slice_parent; int64_t slice_offset; };
struct { union ray_t* ext_nullmap; union ray_t* sym_dict; };
struct { union ray_t* str_ext_null; union ray_t* str_pool; };
struct { union ray_t* slice_parent; int64_t slice_offset; };
struct { uint8_t _aux_sym_lo[8]; union ray_t* sym_dict; };
struct { uint8_t _aux_str_lo[8]; union ray_t* str_pool; };
/* RAY_ATTR_HAS_INDEX (vectors): ray_t* of type RAY_INDEX
* carrying both the accelerator payload and the saved nullmap
* bytes. _idx_pad is reserved (must be NULL). See ops/idxop.h. */
struct { union ray_t* index; union ray_t* _idx_pad; };
/* RAY_ATTR_HAS_LINK (vectors, RAY_I32/RAY_I64 only): bytes 8-15
* hold an int64 sym ID naming the target table. link_lo[8]
* aliases bytes 0-7 (inline nullmap bits OR ext_nullmap pointer
* OR HAS_INDEX index pointer, depending on the other arm in use).
* See ops/linkop.h. */
struct { uint8_t link_lo[8]; int64_t link_target; };
* carrying the accelerator payload and the saved nullmap
* bytes. _idx_pad is reserved (must be NULL). See
* ops/idxop.h. */
struct { union ray_t* index; union ray_t* _idx_pad; };
/* RAY_ATTR_HAS_LINK (vectors, RAY_I32/RAY_I64 only): bytes
* 8-15 hold an int64 sym ID naming the target table.
* link_lo[8] aliases bytes 0-7. See ops/linkop.h. */
struct { uint8_t link_lo[8]; int64_t link_target; };
};
/* Bytes 16-31: metadata + value */
uint8_t mmod; /* 0=heap, 1=file-mmap */
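The layout described in this hunk can be sketched in isolation. The following is a minimal, hypothetical stand-in for the first 16 bytes of `ray_t` (the type `aux16_t`, its named arms, and the helper `link_target_via_raw` are illustrative and not part of the header). It assumes a 64-bit target with 8-byte pointers and only shows how the arms alias one 16-byte slot, with bytes 8-15 shared by `sym_dict` and `link_target`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the 16-byte header slot; NOT the real ray_t. */
typedef union aux16 {
    uint8_t raw[16];                                      /* raw-byte view */
    struct { void* slice_parent; int64_t slice_offset; } slice;
    struct { uint8_t lo[8]; void* sym_dict; } sym;        /* bytes 8-15 */
    struct { uint8_t lo[8]; int64_t link_target; } link;  /* bytes 8-15 */
} aux16_t;

/* Assumes 8-byte pointers; the slot must not grow past 16 bytes. */
static_assert(sizeof(aux16_t) == 16, "aux slot must stay 16 bytes");

/* Write through the link arm, read the same bytes back via the raw view. */
int64_t link_target_via_raw(int64_t sym_id) {
    aux16_t a;
    memset(&a, 0, sizeof a);
    a.link.link_target = sym_id;
    int64_t out;
    memcpy(&out, &a.raw[8], sizeof out);
    return out;
}
```

Because every arm shares storage, whichever attribute flag is set decides which arm is meaningful; the sketch carries no null-bitmap bits, matching the comment in the new header.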
@@ -310,48 +315,56 @@ ray_t* ray_typed_null(int8_t type);
* directly (e.g. `x == NULL_I64`, `x != x` for NaN); there are no predicate
* macros or aliases. Temporal types (DATE/TIME/TIMESTAMP) reuse NULL_I32 or
* NULL_I64 based on their storage width. SYM null = sym ID 0; STR null =
* empty string (length 0); BOOL and U8 are non-nullable.
*
* Phase 1 added the constants and locked BOOL/U8 down as non-nullable.
* Phase 2 wired NULL_F64 into the CSV parser, ray_typed_null, and the
* I64→F64 UPDATE cast — null F64 slots now hold NaN alongside the
* nullmap bit.
* Phase 3a generalized this to integer / temporal types (I16, I32, I64,
* DATE, TIME, TIMESTAMP). Producer surface mirrors Phase 2 — CSV
* parser, ray_typed_null, cast_vec_copy_nulls, set_all_null,
* store_typed_elem (lang/internal.h), UPDATE atom broadcast (3 sites),
* UPDATE WHERE numeric-promo cast, group-by key scatter (serial +
* parallel + grpt TOP_N), pivot key scatter, linkop deref. The
* grouped-aggregation consumer (da_accum_row + scalar_accum_row) gained
* per-agg integer-null guards in the SUM/AVG/STDDEV/VAR/PROD/MIN/MAX/
* FIRST/LAST arms — sentinel-compare (`v != precomputed_sentinel`)
* rather than nullmap consultation for cache-line efficiency; the
* tradeoff (a user-stored INT_MIN in a HAS_NULLS column is dropped)
* is bounded by dual encoding keeping the bitmap as source of truth.
* Phase 3b closed the documented finalization gaps in the
* scalar and direct-array (DA) grouped accumulators: per-(group, agg)
* non-null counts (`nn_count[gid * n_aggs + a]`) drive AVG / VAR /
* STDDEV divisors and gate MIN / MAX / PROD / FIRST / LAST result
* emission — all-null groups now produce a typed null (NULL_F64 /
* NULL_I64 plus the nullmap bit) instead of leaking the accumulator
* seed (DBL_MAX / -DBL_MAX / 0 / product identity). FIRST/LAST also
* gained "skip null rows" semantics: a null prefix no longer advances
* acc->first_row[gid]. The multi-key radix HT (accum_from_entry,
* ~line 2155) still inherits the pre-existing nullable-agg gap noted
* at the sparse-path fallback (~line 5728).
* Through Phase 7 (full cutover) the bitmap bit `nullmap[0] & 1` is
* kept in sync with the sentinel value for atoms ("dual encoding"), so
* legacy bitmap-aware readers and new sentinel-aware readers agree.
* After Phase 7 the bitmap arm is reclaimed for inline stats and the
* bit becomes a pure optimization hint. */
* empty string (length 0); BOOL and U8 are non-nullable. */
#define NULL_I16 ((int16_t)INT16_MIN)
#define NULL_I32 ((int32_t)INT32_MIN)
#define NULL_I64 ((int64_t)INT64_MIN)
#define NULL_F32 ((float)__builtin_nanf(""))
#define NULL_F64 (__builtin_nan(""))
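As a rough illustration of the "compare directly" convention stated in the comment above, here is a minimal sketch of sentinel-based null counting over plain arrays. The two macros copy the values from this hunk; the helper names `count_nulls_i64` and `count_nulls_f64` are hypothetical and do not exist in the codebase. `__builtin_nan` assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NULL_I64 ((int64_t)INT64_MIN)   /* same value as in the header */
#define NULL_F64 (__builtin_nan(""))    /* GCC/Clang builtin, as in the header */

/* Integer null: a plain equality test against the sentinel. */
size_t count_nulls_i64(const int64_t* v, size_t n) {
    assert(v != NULL || n == 0);
    size_t c = 0;
    for (size_t i = 0; i < n; i++) c += (v[i] == NULL_I64);
    return c;
}

/* Float null: NaN is the only value for which x != x holds. */
size_t count_nulls_f64(const double* v, size_t n) {
    assert(v != NULL || n == 0);
    size_t c = 0;
    for (size_t i = 0; i < n; i++) c += (v[i] != v[i]);
    return c;
}
```

One caveat worth keeping in mind: the `x != x` test relies on IEEE NaN semantics, so builds using `-ffast-math` or `-ffinite-math-only` may optimize it away.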

/* Null bitmap check for atoms — bit 0 of nullmap[0] marks typed nulls.
* Also matches RAY_NULL_OBJ (the untyped null singleton). */
#define RAY_ATOM_IS_NULL(x) (RAY_IS_NULL(x) || ((x)->type < 0 && ((x)->nullmap[0] & 1)))
/* Atom null check. RAY_NULL_OBJ is the untyped null singleton.
* Typed atoms with a defined NULL_* sentinel use payload-compare;
* types without a sentinel (BOOL/U8) fall back to the
* nullmap[0]&1 bit written by ray_typed_null. */
static inline bool ray_atom_is_null_fn(const union ray_t* x) {
if (RAY_IS_NULL(x)) return true;
if (x->type >= 0) return false;
switch (-x->type) {
case RAY_F64: return x->f64 != x->f64;
case RAY_F32: {
/* F32 atoms reuse the f64 union slot — see ray_f32 / atom.c. */
float f = (float)x->f64;
return f != f;
}
case RAY_I64:
case RAY_TIMESTAMP: return x->i64 == NULL_I64;
case RAY_I32:
case RAY_DATE:
case RAY_TIME: return x->i32 == NULL_I32;
case RAY_I16: return x->i16 == NULL_I16;
case RAY_SYM: return x->i64 == 0;
case RAY_STR:
/* STR atom null = empty string. Atoms use SSO (slen + sdata)
* for len<=7 and a pool pointer (obj) for longer strings; the
* union overlap means a non-zero obj pointer has a low byte
* that ALSO reads as slen via the SSO arm. Only when slen==0
* AND obj==NULL is the atom genuinely the empty string (see
* is_sso in src/vec/str.c). */
return x->slen == 0 && x->obj == NULL;
case RAY_GUID: {
/* GUID null = 16 all-zero bytes in obj's U8 buffer.
* obj is always populated by ray_guid / ray_typed_null —
* a NULL obj indicates corruption; treat as null
* defensively. */
if (!x->obj) return true;
const uint8_t* b = (const uint8_t*)((char*)x->obj + sizeof(union ray_t));
for (int i = 0; i < 16; i++) if (b[i]) return false;
return true;
}
default: return (x->nullmap[0] & 1) != 0;
}
}
#define RAY_ATOM_IS_NULL(x) ray_atom_is_null_fn(x)

/* ===== Vector API ===== */
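Two of the arms in `ray_atom_is_null_fn` above are easy to exercise in isolation. The sketch below is a hedged illustration only: `f32_slot_is_null` and `guid_bytes_are_null` are hypothetical helpers, not functions from the codebase, and they take plain values instead of a `ray_t*`. It shows that narrowing a NaN stored in a double slot to `float` keeps it NaN, and that the GUID null is an all-zero 16-byte payload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NULL_F32 ((float)__builtin_nanf(""))   /* as in the header; GCC/Clang */

/* F32 atom stored in an f64 slot: narrow, then use the NaN self-compare. */
bool f32_slot_is_null(double slot) {
    float f = (float)slot;
    return f != f;
}

/* GUID null: all 16 payload bytes are zero. */
bool guid_bytes_are_null(const uint8_t* b) {
    assert(b != NULL);
    for (int i = 0; i < 16; i++) {
        if (b[i]) return false;
    }
    return true;
}
```

The real function additionally treats a missing `obj` pointer as null for GUID atoms; the sketch leaves that out because it operates on the bytes directly.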

56 changes: 30 additions & 26 deletions src/core/morsel.c
@@ -68,37 +68,41 @@ bool ray_morsel_next(ray_morsel_t* m) {
m->morsel_len = remaining < RAY_MORSEL_ELEMS ? remaining : RAY_MORSEL_ELEMS;
m->morsel_ptr = (uint8_t*)ray_data(m->vec) + (size_t)m->offset * m->elem_size;

/* Null bitmap: only if HAS_NULLS.
* M5: null_bits points to the byte containing bit (m->offset).
* Callers must account for (m->offset % 8) bit offset within the
* first byte of null_bits when testing individual null bits.
/* Null bitmap: synthesized per-morsel from sentinel reads.
* null_bits points at null_bits_buf[0], which stands in for the
* source-bitmap byte that held bit (m->offset). Callers index
* starting at bit (m->offset & 7), just as they did with the
* previous source-bitmap layout.
*
* HAS_INDEX path: when an accelerator index is attached, the parent's
* 16-byte nullmap union holds the index pointer instead of bitmap data
* (or ext_nullmap pointer). The original bytes are preserved inside
* ix->saved_nullmap. Route through that snapshot here so null-aware
* loops still see the correct bits. */
* Synthesizing on demand sidesteps the source bitmap entirely:
* sentinel-supporting types (F64 / F32 / integer & temporal /
* STR / GUID) have the source bitmap stripped, so reading it
* directly would give stale zeros. Cost is one O(morsel_len)
* sentinel scan per chunk; cheap given morsel_len <= 1024. */
m->null_bits = NULL;
if (m->vec->attrs & RAY_ATTR_HAS_NULLS) {
if (m->vec->attrs & RAY_ATTR_HAS_INDEX) {
ray_index_t* ix = ray_index_payload(m->vec->index);
if (ix->saved_attrs & RAY_ATTR_NULLMAP_EXT) {
ray_t* ext;
memcpy(&ext, &ix->saved_nullmap[0], sizeof(ext));
m->null_bits = (uint8_t*)ray_data(ext) + (m->offset / 8);
} else if (m->offset < 128) {
m->null_bits = ix->saved_nullmap + (m->offset / 8);
int64_t bit0 = m->offset & 7;
int64_t base_byte = m->offset / 8;
int64_t total_bits = bit0 + m->morsel_len;
int64_t nbytes = (total_bits + 7) / 8;
if ((size_t)nbytes > sizeof(m->null_bits_buf)) {
/* Defensive: RAY_MORSEL_ELEMS bounds morsel_len to 1024
* (=128 bytes when m->offset is byte-aligned), which exactly
* fills the 128-byte buffer. Bail with null_bits left NULL if
* an unaligned offset or a larger future MORSEL needs more. */
return true;
}
memset(m->null_bits_buf, 0, (size_t)nbytes);
for (int64_t k = 0; k < m->morsel_len; k++) {
if (ray_vec_is_null(m->vec, m->offset + k)) {
int64_t b = bit0 + k;
m->null_bits_buf[b >> 3] |= (uint8_t)(1u << (b & 7));
}
} else if (m->vec->attrs & RAY_ATTR_NULLMAP_EXT) {
/* External bitmap: point to correct byte offset */
ray_t* ext = m->vec->ext_nullmap;
m->null_bits = (uint8_t*)ray_data(ext) + (m->offset / 8);
} else if (m->offset < 128) {
/* Inline bitmap is 16 bytes = 128 bits; vectors with HAS_NULLS
* and >128 elements must use external nullmap (RAY_ATTR_NULLMAP_EXT).
* Returns null_bits=NULL for offset>=128 when using inline bitmap. */
m->null_bits = m->vec->nullmap + (m->offset / 8);
}
/* Mimic the prior contract: pointer addresses the byte that
* holds bit (m->offset). Callers index into it starting at
* bit (m->offset & 7). */
m->null_bits = m->null_bits_buf;
(void)base_byte;
}

return true;
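The synthesis loop in this hunk can be sketched as a standalone function. This is a simplified illustration under stated assumptions, not the actual `ray_morsel_next`: it handles only an `int64_t` column with the `NULL_I64` sentinel, and the name `synth_null_bits` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NULL_I64 ((int64_t)INT64_MIN)   /* same value as in the header */

/* Builds a per-morsel bitmap in buf: bit ((offset & 7) + k) is set when
 * v[offset + k] holds the sentinel. Returns false, leaving buf untouched,
 * when buf_bytes cannot hold the required bits. */
bool synth_null_bits(const int64_t* v, int64_t offset, int64_t len,
                     uint8_t* buf, size_t buf_bytes) {
    assert(v != NULL && buf != NULL);
    if (offset < 0 || len < 0) return false;
    int64_t bit0 = offset & 7;
    int64_t nbytes = (bit0 + len + 7) / 8;
    if ((size_t)nbytes > buf_bytes) return false;
    memset(buf, 0, (size_t)nbytes);
    for (int64_t k = 0; k < len; k++) {
        if (v[offset + k] == NULL_I64) {
            int64_t b = bit0 + k;
            buf[b >> 3] |= (uint8_t)(1u << (b & 7));
        }
    }
    return true;
}
```

For sizing, a 1024-element morsel needs (0 + 1024 + 7) / 8 = 128 bytes at a byte-aligned offset and (7 + 1024 + 7) / 8 = 129 bytes in the worst unaligned case, which is why the sketch takes the buffer size as a parameter.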