feat: Column H — EntityTypeId on BindSpace (Phase 1 of 4) #272

Merged
AdaWorldAPI merged 1 commit into main from claude/bindspace-phase1-column-h
Apr 27, 2026

Conversation

@AdaWorldAPI
Owner

Summary

Phase 1 of the BindSpace Columns E/F/G/H integration plan (PR #271).

Adds Column H (EntityTypeId) — the Palantir Vertex "Object Type" equivalent.

Deliverables

  • D-H1: EntityTypeId = u16 type alias + entity_type_id(ontology, name) -> u16 function in contract::ontology. 1-based index into Ontology.schemas, 0 = untyped.
  • D-H2: entity_type: Box<[u16]> field on BindSpace SoA. +2 bytes/row.
  • D-H3: BindSpaceBuilder::push_typed() writes entity_type per row. Existing push() defaults to 0 for backward compat — no breaking change.
  • D-H4: 4 tests (default=0, set/get, builder push_typed, contract 1-based lookup).

What it changes

  • BindSpace::byte_footprint(): 71774 → 71776 per row (+2 bytes = 0.003%)
  • BindSpaceBuilder::push(): unchanged (passes entity_type=0)
  • New: BindSpaceBuilder::push_typed() with explicit entity_type

Brutal honest review

What's good:

  • Smallest possible change: 2 files, 95 lines, 0 breaking API changes. Purely additive.
  • entity_type_id() is a simple position lookup — no complexity hidden behind the API.
  • 1-based indexing means 0 is always "untyped" — no sentinel confusion, no off-by-one.
  • The push() → push_typed() delegation keeps all existing callers working.

What's not great:

  • entity_type_id() does a linear scan of Ontology.schemas — O(N) per lookup. Fine for N < 100 schemas, but if someone has 1000+ entity types this becomes a problem. Should be a HashMap<&str, EntityTypeId> cache on Ontology. Not worth optimizing now (N is ~10 for SMB), but flagged.
  • The dispatch() step that writes entity_type into the SoA (D-H3 in the plan) is NOT wired yet — this PR adds the FIELD but not the dispatch-time write. That's Phase 2 territory because it requires knowing which OntologySpec the current triplet matches, which is the novel-pattern-detection logic in D-E3.
  • No clippy gate on the shader-driver crate (only contract is gated). The shader-driver has pre-existing clippy debt from the parallel agents.

Bottom line: Phase 1 is the foundation. Column H exists, has tests, doesn't break anything. The interesting work (dispatch-time type binding) is Phase 2.

Test plan

  • 9 bindspace tests pass (6 existing + 3 new Column H)
  • 261 contract tests pass (260 existing + 1 new entity_type_id)
  • cargo check workspace clean
  • Backward compat: push() still works without entity_type arg

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh


Generated by Claude Code

Foundry Vertex "Object Type" equivalent. Per-row entity type binding
in the BindSpace SoA, enabling type-filtered queries without schema
re-parsing.

D-H1: `EntityTypeId = u16` + `entity_type_id(ontology, name) -> u16`
  in contract::ontology. 1-based index into Ontology.schemas. 0 = untyped.

D-H2: `entity_type: Box<[u16]>` field on BindSpace SoA.
  +2 bytes/row (71774 → 71776 footprint for 1 row).

D-H3: `BindSpaceBuilder::push_typed()` writes entity_type per row.
  `push()` defaults to 0 (untyped) for backward compat.

D-H4: 4 tests (entity_type defaults to 0, set/get, builder push_typed,
  contract entity_type_id 1-based lookup). All pass.

Phase 1 complete per bindspace-columns-v1.md §5.
Unblocks: LF-22 ObjectView usage, LF-40 type-filtered search.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 117a76a5ec


pub fn entity_type_id(ontology: &Ontology, name: &str) -> EntityTypeId {
    ontology.schemas.iter()
        .position(|s| s.name == name)
        .map(|idx| (idx + 1) as EntityTypeId)
        .unwrap_or(0) // not found → 0, the "untyped" sentinel
}

P2: Prevent EntityTypeId wraparound on large ontologies

Casting (idx + 1) to u16 here will silently wrap once ontology.schemas.len() >= 65_536, which can turn a real schema into 0 (the reserved “untyped” sentinel) or collide with another type ID. In those large-ontology cases this corrupts row typing semantics in BindSpace and makes type-based filtering unreliable; this lookup should detect overflow and fail explicitly (or otherwise enforce the maximum schema count) instead of truncating.


@AdaWorldAPI AdaWorldAPI merged commit 58c692a into main Apr 27, 2026
1 of 5 checks passed
AdaWorldAPI pushed a commit that referenced this pull request May 7, 2026
Three artifacts in one commit:

1. Post-merge governance for #352 (lance-graph-ontology v5 + ogit-cascade
   v1 plans, merged 2026-05-07 as 8e2f088):
   - PR_ARC_INVENTORY.md prepended with full Added/Locked/Deferred/Docs
     entry; Confidence line updatable.
   - LATEST_STATE.md table prepended; "Last updated" refreshed to
     2026-05-07.

2. New plan: .claude/plans/palantir-parity-cascade-v2.md (262 lines).
   Integration capstone over 4 prior Foundry parity docs and v1 cascade.
   Pillar 0 carry-forward: Foundry parity IS SoA-as-canon parity. Column
   H (PR #272 SHIPPED) is already the Foundry Object Type bridge; v2
   makes the SoA carry the Foundry-equivalent shape. 15 deliverables,
   top-3 ship with this plan (V2-1 ledger, V2-2 triangle, V2-3 BusDto
   bridge). Business Logic ↔ Thinking-style ↔ OGIT triangle introduced
   as routing knowledge artifact.

3. New knowledge doc: .claude/knowledge/soa-dto-dependency-ledger.md
   (210 lines). Append-only entropy table of 22 DTOs across 4 tiers
   (sensor → engine → contract → callcenter). Three classifications:
   bare-metal (9), SoA-glue (7), bridge-projection (6, with 3 OPEN
   re-classifications). Internal vs external O(1) mapping diagrams.
   Codec cascade column status: all 8 cascade columns OPEN, current
   registry uses (bridge_id, public_name) tuples + ogit_uri hashing
   per 2026-05-07 audit. Probe queue with pass criteria for
   D-CASCADE-V1-1/7/11 + D-PARITY-V2-3/10. Maintenance protocol attached.

Findings driving the artifacts:
- StreamDto, ResonanceDto, BusDto all live in thinking-engine::dto.rs
  (Tier 0/1/2), upstream of contract.
- ResonanceDto IS the SoA (4096 ripple energies), not a glue layer.
- OntologyRegistry has NO codec cascade columns today; D-CASCADE-V1-7
  is the wiring deliverable.
- Foundry parity has 5+ prior docs; v2 integrates, does not duplicate.

Append-only governance honored on PR_ARC, LATEST_STATE, INTEGRATION_PLANS
(prepend only; no past entries edited). Layer-2 AGENT_LOG.md (gitignored)
will carry the entry post-push.

https://claude.ai/code/session_01WevBiZ3jzVocu8fBpTY8sq
AdaWorldAPI pushed a commit that referenced this pull request May 8, 2026
Third addendum, written after actually loading Grok's bundle and the
relevant source files into one mental space, the way the prompt asked
for from the start. The audit shape used file-by-file slice reads;
this addendum used full-file parallel reads with the bundle held
together. The output is a different topology, not just additional
findings.

Substantive observations the audit and Grok's pass both missed:

1. AwarenessPlane16K already exists, six channels not one. Shipped
   2026-05-06 in crates/lance-graph-contract/src/splat.rs. Support /
   Contradiction / Forecast / Counterfactual / Style / Source.
   Forecast and Counterfactual are scenario-only and explicitly cannot
   promote ontology facts. Grok's single-channel AwarenessColumn
   undershoots — the workspace already separates "I am believing" from
   "I am imagining" at the substrate level.

2. The deposition kernel is geometric: (center_a << 8) ^ center_b mod
   16384. TriadicProjection (S/P, P/O, S/O) selects which Pearl-2³
   lens produced the codebook pair. Pearl 2³ as parallel dimensions is
   already implemented at the splat level via projection-byte addressing,
   spreading evidence across multiple Pearl-aware addresses.

3. Schema-as-MUL-priors lives in DomainProfile per StepDomain. Medcare
   = 0.92 + Human + fail-closed + HIPAA 6-year retention; SMB = 0.75 +
   Llm + commerce-grade. Unit-tested invariants. The "ontology-aware
   MUL trust thresholds" TODO at bindspace.rs:191-198 is exactly the
   missing wire between EntityTypeId (Column H, shipped PR #272) and
   DomainProfile.

4. Investigation-as-substrate-traversal is concrete: anchor by
   entity_type → traverse CausalEdge64.forward() chains accumulating
   into cycle_fingerprint → deposit splats per channel during traversal
   → read CamSplatCertificate at stabilization → emit four-way
   SplatDecision (Proceed / RequireExactReplay / PrefetchOnly /
   ScenarioOnly). Forecast and Counterfactual channels are how the
   substrate runs hypothetical investigations without committing to
   facts — this IS the preemption framing Grok proposed, with sharper
   semantics.

5. The 3-byte polyglot tag is dialect:u8 (CognitiveEventRow membrane
   projection) + MetaWord.thinking:6 + MetaWord.nars_f:8 — already
   produced, scattered across two structs that compose at the membrane
   boundary. The work is composing them as one queryable predicate,
   not inventing a new tag.

6. CrystalFingerprint is a multi-carrier enum carrying Binary16K /
   Structured5x5 / Vsa10kI8 / Vsa10kF32 / Vsa16kF32 deliberately. The
   user's "Vsa10000 deprecated" correction is per-call-site routing,
   not deletion: each carrier has a purpose (Vsa10000 for Markov
   bundling, Vsa16kF32 for collapse-gate + cycle column, Binary16K for
   bit-deposition splat planes, CAM-PQ for compressed search). The
   TECH_DEBT entry needs reframing as N decisions, not one migration.

7. DEBUG-STRINGIFY-1 (entropy 5): 35 format!("{:?}", logical_plan)
   sites read DataFusion LogicalPlan Debug as a stable surface
   workspace-wide. Hot-path Cypher migration is sized 1-2x larger
   than Grok's Phase 2 plan accounts for, because the typed-visitor
   work to eradicate the 35 sites is comparable in size to the parser
   wiring itself.

8. The substrate shape is six operations on one SoA, each operation a
   different addressing of the same rows: cognition (cycle column +
   MetaColumn + CausalEdge64.forward()); memory (entity_type / temporal
   / cycle_fp_hi-lo); imagination (SplatChannel Forecast/Counterfactual);
   awareness (AwarenessPlane16K six-channel); external traffic
   (CognitiveEventRow scalar projection across BBB); audit
   (step_trajectory_hash). One SoA. Six operations. Each operation is
   a different way of *reading* the same rows; writes stay narrow via
   CollapseGate. This is not "BindSpace + accessories"; it's an SoA
   deliberately designed so every cognitive operation is an addressing
   mode rather than a layer.

The single most valuable observation: the workspace's own
ARCHITECTURE_ENTROPY_LEDGER already classifies 22 DTOs across 4 tiers
with concrete spaghetti clusters and per-row entropy scores. Grok's
pass and my audit independently re-derived smaller versions of this
ledger. The right framing for the next session is "complete the seams
between structures that already exist" — sized as the high-entropy
rows in the existing ledger — not "add the proposed structures." The
substrate is more complete than either external pass realized.

Also: removed the .grok/ checkout from the working tree (per user's
note that the prefix is distinctive enough that bleed risk is low; the
files live on origin/main, no need to mirror them on the feature
branch).

https://claude.ai/code/session_01WevBiZ3jzVocu8fBpTY8sq