feat: Column H — EntityTypeId on BindSpace (Phase 1 of 4) #272

AdaWorldAPI merged 1 commit into main
Conversation
Foundry Vertex "Object Type" equivalent: per-row entity type binding in the BindSpace SoA, enabling type-filtered queries without schema re-parsing.

- D-H1: `EntityTypeId = u16` + `entity_type_id(ontology, name) -> u16` in `contract::ontology`. 1-based index into `Ontology.schemas`; 0 = untyped.
- D-H2: `entity_type: Box<[u16]>` field on the BindSpace SoA. +2 bytes/row (71774 → 71776 footprint for 1 row).
- D-H3: `BindSpaceBuilder::push_typed()` writes entity_type per row. `push()` defaults to 0 (untyped) for backward compat.
- D-H4: 4 tests (entity_type defaults to 0, set/get, builder push_typed, contract entity_type_id 1-based lookup). All pass.

Phase 1 complete per bindspace-columns-v1.md §5. Unblocks: LF-22 ObjectView usage, LF-40 type-filtered search.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
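The D-H2/D-H3 deliverables above can be sketched as a minimal stand-in. This is not the real `BindSpaceBuilder` — the actual SoA carries many more columns — only the `entity_type` column and the `push()` → `push_typed()` delegation named in the PR are shown; the struct layout and `finish()` helper are assumptions for illustration.

```rust
// Simplified stand-in for the Column H shape: only the entity_type
// column, not the full BindSpace SoA.
type EntityTypeId = u16; // 1-based index into Ontology.schemas; 0 = untyped

struct BindSpaceBuilder {
    entity_type: Vec<EntityTypeId>,
    // ... other SoA columns elided ...
}

impl BindSpaceBuilder {
    fn new() -> Self {
        Self { entity_type: Vec::new() }
    }

    /// Existing entry point: rows pushed here stay untyped (0).
    fn push(&mut self) {
        self.push_typed(0);
    }

    /// New in this PR: per-row entity type binding.
    fn push_typed(&mut self, ty: EntityTypeId) {
        self.entity_type.push(ty);
    }

    /// Hypothetical finalizer, matching the Box<[u16]> column type.
    fn finish(self) -> Box<[EntityTypeId]> {
        self.entity_type.into_boxed_slice()
    }
}

fn main() {
    let mut b = BindSpaceBuilder::new();
    b.push();        // backward-compatible: defaults to untyped (0)
    b.push_typed(3); // explicitly typed row
    let col = b.finish();
    assert_eq!(&*col, &[0, 3]);
}
```

The delegation direction (`push()` calling `push_typed(0)`) is what keeps existing callers source-compatible while adding the typed path.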
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 117a76a5ec
```rust
pub fn entity_type_id(ontology: &Ontology, name: &str) -> EntityTypeId {
    ontology.schemas.iter()
        .position(|s| s.name == name)
        .map(|idx| (idx + 1) as EntityTypeId)
```
Prevent EntityTypeId wraparound on large ontologies
Casting (idx + 1) to u16 here will silently wrap once ontology.schemas.len() >= 65_536, which can turn a real schema into 0 (the reserved “untyped” sentinel) or collide with another type ID. In those large-ontology cases this corrupts row typing semantics in BindSpace and makes type-based filtering unreliable; this lookup should detect overflow and fail explicitly (or otherwise enforce the maximum schema count) instead of truncating.
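One way to make the lookup fail explicitly, as the review suggests, is a checked `usize -> u16` conversion. A minimal sketch, assuming simplified stand-in `Ontology`/`Schema` types (the real ones live in `contract::ontology`) and a hypothetical `_checked` variant name — only the `try_from` idea is the point:

```rust
type EntityTypeId = u16;

// Stand-in types for illustration; not the real contract::ontology definitions.
struct Schema { name: String }
struct Ontology { schemas: Vec<Schema> }

/// Returns None both when the name is missing and when idx + 1 would
/// overflow u16, instead of silently wrapping into the 0 sentinel.
fn entity_type_id_checked(ontology: &Ontology, name: &str) -> Option<EntityTypeId> {
    ontology
        .schemas
        .iter()
        .position(|s| s.name == name)
        .and_then(|idx| EntityTypeId::try_from(idx + 1).ok())
}

fn main() {
    let ont = Ontology {
        schemas: vec![Schema { name: "Person".into() }],
    };
    assert_eq!(entity_type_id_checked(&ont, "Person"), Some(1));
    assert_eq!(entity_type_id_checked(&ont, "Missing"), None);
}
```

An alternative with the same effect is asserting a maximum schema count at ontology-load time, which keeps the lookup itself infallible.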
Three artifacts in one commit:

1. Post-merge governance for #352 (lance-graph-ontology v5 + ogit-cascade v1 plans, merged 2026-05-07 as 8e2f088):
   - PR_ARC_INVENTORY.md prepended with a full Added/Locked/Deferred/Docs entry; Confidence line updatable.
   - LATEST_STATE.md table prepended; "Last updated" refreshed to 2026-05-07.
2. New plan: .claude/plans/palantir-parity-cascade-v2.md (262 lines). Integration capstone over 4 prior Foundry parity docs and the v1 cascade. Pillar 0 carry-forward: Foundry parity IS SoA-as-canon parity. Column H (PR #272, shipped) is already the Foundry Object Type bridge; v2 makes the SoA carry the Foundry-equivalent shape. 15 deliverables; the top 3 ship with this plan (V2-1 ledger, V2-2 triangle, V2-3 BusDto bridge). The Business Logic ↔ Thinking-style ↔ OGIT triangle is introduced as a routing knowledge artifact.
3. New knowledge doc: .claude/knowledge/soa-dto-dependency-ledger.md (210 lines). Append-only entropy table of 22 DTOs across 4 tiers (sensor → engine → contract → callcenter). Three classifications: bare-metal (9), SoA-glue (7), bridge-projection (6, with 3 OPEN re-classifications). Internal vs external O(1) mapping diagrams. Codec cascade column status: all 8 cascade columns OPEN; the current registry uses (bridge_id, public_name) tuples + ogit_uri hashing per the 2026-05-07 audit. Probe queue with pass criteria for D-CASCADE-V1-1/7/11 + D-PARITY-V2-3/10. Maintenance protocol attached.

Findings driving the artifacts:

- StreamDto, ResonanceDto, and BusDto all live in thinking-engine::dto.rs (Tiers 0/1/2), upstream of contract.
- ResonanceDto IS the SoA (4096 ripple energies), not a glue layer.
- OntologyRegistry has NO codec cascade columns today; D-CASCADE-V1-7 is the wiring deliverable.
- Foundry parity has 5+ prior docs; v2 integrates, does not duplicate.

Append-only governance honored on PR_ARC, LATEST_STATE, INTEGRATION_PLANS (prepend only; no past entries edited). The Layer-2 AGENT_LOG.md (gitignored) will carry the entry post-push.
https://claude.ai/code/session_01WevBiZ3jzVocu8fBpTY8sq
Third addendum, written after actually loading Grok's bundle and the relevant source files into one mental space, the way the prompt asked for from the start. The audit shape used file-by-file slice reads; this addendum used full-file parallel reads with the bundle held together. The output is a different topology, not just additional findings.

Substantive observations the audit and Grok's pass both missed:

1. AwarenessPlane16K already exists, with six channels, not one. Shipped 2026-05-06 in crates/lance-graph-contract/src/splat.rs: Support / Contradiction / Forecast / Counterfactual / Style / Source. Forecast and Counterfactual are scenario-only and explicitly cannot promote ontology facts. Grok's single-channel AwarenessColumn undershoots — the workspace already separates "I am believing" from "I am imagining" at the substrate level.
2. The deposition kernel is geometric: (center_a << 8) ^ center_b mod 16384. TriadicProjection (S/P, P/O, S/O) selects which Pearl-2³ lens produced the codebook pair. Pearl 2³ as parallel dimensions is already implemented at the splat level via projection-byte addressing, spreading evidence across multiple Pearl-aware addresses.
3. Schema-as-MUL-priors lives in DomainProfile per StepDomain: Medcare = 0.92 + Human + fail-closed + HIPAA 6-year retention; SMB = 0.75 + Llm + commerce-grade. Unit-tested invariants. The "ontology-aware MUL trust thresholds" TODO at bindspace.rs:191-198 is exactly the missing wire between EntityTypeId (Column H, shipped in PR #272) and DomainProfile.
4. Investigation-as-substrate-traversal is concrete: anchor by entity_type → traverse CausalEdge64.forward() chains accumulating into cycle_fingerprint → deposit splats per channel during traversal → read CamSplatCertificate at stabilization → emit a four-way SplatDecision (Proceed / RequireExactReplay / PrefetchOnly / ScenarioOnly).
Forecast and Counterfactual channels are how the substrate runs hypothetical investigations without committing to facts — this IS the preemption framing Grok proposed, with sharper semantics.

5. The 3-byte polyglot tag is dialect:u8 (CognitiveEventRow membrane projection) + MetaWord.thinking:6 + MetaWord.nars_f:8 — already produced, scattered across two structs that compose at the membrane boundary. The work is composing them as one queryable predicate, not inventing a new tag.
6. CrystalFingerprint is a multi-carrier enum carrying Binary16K / Structured5x5 / Vsa10kI8 / Vsa10kF32 / Vsa16kF32 deliberately. The user's "Vsa10000 deprecated" correction is per-call-site routing, not deletion: each carrier has a purpose (Vsa10000 for Markov bundling, Vsa16kF32 for collapse-gate + cycle column, Binary16K for bit-deposition splat planes, CAM-PQ for compressed search). The TECH_DEBT entry needs reframing as N decisions, not one migration.
7. DEBUG-STRINGIFY-1 (entropy 5): 35 format!("{:?}", logical_plan) sites read the DataFusion LogicalPlan Debug output as a stable surface workspace-wide. The hot-path Cypher migration is 1-2x larger than Grok's Phase 2 plan accounts for, because the typed-visitor work to eradicate the 35 sites is comparable in size to the parser wiring itself.
8. The substrate shape is six operations on one SoA, each operation a different addressing of the same rows: cognition (cycle column + MetaColumn + CausalEdge64.forward()); memory (entity_type / temporal / cycle_fp_hi-lo); imagination (SplatChannel Forecast/Counterfactual); awareness (AwarenessPlane16K six-channel); external traffic (CognitiveEventRow scalar projection across BBB); audit (step_trajectory_hash). One SoA. Six operations. Each operation is a different way of *reading* the same rows; writes stay narrow via CollapseGate. This is not "BindSpace + accessories"; it's an SoA deliberately designed so every cognitive operation is an addressing mode rather than a layer.
The single most valuable observation: the workspace's own ARCHITECTURE_ENTROPY_LEDGER already classifies 22 DTOs across 4 tiers with concrete spaghetti clusters and per-row entropy scores. Grok's pass and my audit independently re-derived smaller versions of this ledger. The right framing for the next session is "complete the seams between structures that already exist" — sized as the high-entropy rows in the existing ledger — not "add the proposed structures." The substrate is more complete than either external pass realized.

Also: removed the .grok/ checkout from the working tree (per the user's note that the prefix is distinctive enough that bleed risk is low; the files live on origin/main, so there's no need to mirror them on the feature branch).

https://claude.ai/code/session_01WevBiZ3jzVocu8fBpTY8sq
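The deposition-kernel arithmetic from point 2 can be sketched concretely. This is a hedged illustration only: the commit gives the formula `(center_a << 8) ^ center_b mod 16384`, and this sketch assumes u8 codebook centers and a 16384-slot (16K) splat plane; the real kernel's operand widths, the TriadicProjection byte, and any further mixing are not shown.

```rust
// Assumed plane size, matching the "16K" in AwarenessPlane16K.
const PLANE_SLOTS: usize = 16_384;

/// Geometric deposition address: (center_a << 8) ^ center_b, folded
/// into the 16384-slot plane. Sketch only; operand widths are assumed.
fn deposit_address(center_a: u8, center_b: u8) -> usize {
    (((center_a as usize) << 8) ^ center_b as usize) % PLANE_SLOTS
}

fn main() {
    // Two codebook centers map to one of 16384 plane addresses.
    let addr = deposit_address(0x2A, 0x07);
    assert!(addr < PLANE_SLOTS);
    // The shift makes the kernel asymmetric in (a, b): swapping the
    // operands generally lands on a different address.
    assert_ne!(deposit_address(1, 2), deposit_address(2, 1));
}
```

Note that with u8 centers the raw XOR spans 0..65536, so the `% 16384` fold aliases four raw values onto each slot; that aliasing behavior is an artifact of the assumed widths, not a claim about the real kernel.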
Summary
Phase 1 of the BindSpace Columns E/F/G/H integration plan (PR #271).
Adds Column H (EntityTypeId) — the Palantir Vertex "Object Type" equivalent.
Deliverables
- `EntityTypeId = u16` type alias + `entity_type_id(ontology, name) -> u16` function in `contract::ontology`. 1-based index into `Ontology.schemas`, 0 = untyped.
- `entity_type: Box<[u16]>` field on the `BindSpace` SoA. +2 bytes/row.
- `BindSpaceBuilder::push_typed()` writes entity_type per row. Existing `push()` defaults to 0 for backward compat — no breaking change.

What it changes
- `BindSpace::byte_footprint()`: 71774 → 71776 per row (+2 bytes = 0.003%)
- `BindSpaceBuilder::push()`: unchanged (passes entity_type=0)
- `BindSpaceBuilder::push_typed()` with explicit entity_type

Brutal honest review
What's good:
- `entity_type_id()` is a simple position lookup — no complexity hidden behind the API.
- `push()` → `push_typed()` delegation keeps all existing callers working.

What's not great:
- `entity_type_id()` does a linear scan of `Ontology.schemas` — O(N) per lookup. Fine for N < 100 schemas, but if someone has 1000+ entity types this becomes a problem. Should be a `HashMap<&str, EntityTypeId>` cache on `Ontology`. Not worth optimizing now (N is ~10 for SMB), but flagged.
- The `dispatch()` step that writes entity_type into the SoA (D-H3 in the plan) is NOT wired yet — this PR adds the field but not the dispatch-time write. That's Phase 2 territory because it requires knowing which `OntologySpec` the current triplet matches, which is the novel-pattern-detection logic in D-E3.

Bottom line: Phase 1 is the foundation. Column H exists, has tests, doesn't break anything. The interesting work (dispatch-time type binding) is Phase 2.
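The flagged `HashMap` cache could look like the following. A sketch only, under the same stand-in `Ontology`/`Schema` types as above and a hypothetical `build_type_index` name — it is not part of this PR, just the shape of the O(N) → O(1) trade the review describes:

```rust
use std::collections::HashMap;

type EntityTypeId = u16;

// Stand-in types; the real ones live in contract::ontology.
struct Schema { name: String }
struct Ontology { schemas: Vec<Schema> }

/// Build the name -> id index once from Ontology.schemas. Ids stay
/// 1-based with 0 reserved for untyped, matching entity_type_id().
/// Assumes fewer than 65_536 schemas (the cast wraps beyond that).
fn build_type_index(ontology: &Ontology) -> HashMap<String, EntityTypeId> {
    ontology
        .schemas
        .iter()
        .enumerate()
        .map(|(idx, s)| (s.name.clone(), (idx + 1) as EntityTypeId))
        .collect()
}

fn main() {
    let ont = Ontology {
        schemas: vec![
            Schema { name: "Person".into() },
            Schema { name: "Company".into() },
        ],
    };
    let index = build_type_index(&ont);
    assert_eq!(index.get("Company").copied(), Some(2)); // 1-based
    assert_eq!(index.get("Unknown").copied(), None);
}
```

Whether the map is owned by `Ontology` or built lazily at first lookup is an open design choice; either way it trades one O(N) pass at build time for O(1) lookups afterwards.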
Test plan
- `cargo check` — workspace clean
- `push()` still works without an entity_type arg

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
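The second test-plan item — `push()` compiling and behaving unchanged with no entity_type argument — reduces to a check like this. The builder here is the same simplified stand-in used earlier in this thread, not the real type:

```rust
type EntityTypeId = u16;

// Minimal stand-in: only the entity_type column of the SoA.
#[derive(Default)]
struct BindSpaceBuilder {
    entity_type: Vec<EntityTypeId>,
}

impl BindSpaceBuilder {
    // Unchanged call signature: no entity_type argument.
    fn push(&mut self) {
        self.push_typed(0);
    }

    fn push_typed(&mut self, ty: EntityTypeId) {
        self.entity_type.push(ty);
    }
}

fn main() {
    let mut b = BindSpaceBuilder::default();
    b.push(); // old call sites compile as before
    b.push();
    // Every row pushed the old way lands as 0 (untyped).
    assert!(b.entity_type.iter().all(|&t| t == 0));
}
```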
Generated by Claude Code