feat(ogar-vocab): Project + OGAR codebook + LabelDTO + LE wire contract #60
Continue the OP <-> Redmine convergence ladder (scope split: this session
is OP/Redmine, other session is Odoo/Open-Source-Billing).
Project is the root container of project-domain work. It is already
referenced by the project_work_item().project and
billable_work_entry().project edges; this PR adds the canonical class
those edges resolve to.
- ogar-vocab: project() with 3 direct family edges
    HasMany work_items -> ProjectWorkItem (existing canonical)
    HasMany time_entries -> BillableWorkEntry (existing canonical)
    HasMany members -> ProjectActor (forward; future PR)
  + typed identity attributes (name, identifier as 'string').
  + Language::Unknown per the codex-P2 doctrine on synthetic classes.
Nested-project parent is real cross-curator but surfaced via mixins
(Redmine awesome_nested_set; OP Projects::Hierarchy) — the producer
doesn't decode mixin-borne parent today; documented as a follow-up.
- ogar-from-ruff: project_role(curator_name) maps Rails-AR association
names to the 3 canonical roles. Universal (time_entries), divergent
(issues / work_packages -> work_items), and the through-association
actor chain (members / memberships / users / member_principals /
principals -> members). Forward-compat 'parent' arm kept (no current
curator triggers it, harmless). project_canonical_roles(&Class) is the
v1 lineage-transcode projector for Project.
- Tests:
  - ogar-vocab: project_is_the_promoted_canonical_class pins shape +
    language + typed attributes.
  - ogar-from-ruff: project_role_maps_rails_dialect_synonyms +
    project_canonical_roles_covers_both_curators on synthetic fixtures.
  - ogar-from-rails real-corpus:
    redmine_and_openproject_projects_converge_through_canonical proves
    both Redmine Project and OpenProject Project lift to canonical_concept
    'project' and project to the IDENTICAL 3-role canonical role set on
    REAL source.
ogar-vocab 22, ogar-from-ruff 24 (+2 unit), ogar-from-rails 7 real-corpus
(+1 Project); clippy clean; disk 23 GB free.
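The role mapping described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in ogar-from-ruff: the `ProjectRole` enum, the `Option` return type, and the `canonical_roles` helper are assumptions made for the example; only the association names and the three canonical roles come from the commit message.

```rust
/// Hypothetical enum for the three canonical Project roles, plus the
/// forward-compatible `Parent` arm that no current curator triggers.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ProjectRole {
    WorkItems,
    TimeEntries,
    Members,
    Parent,
}

/// Map a Rails-AR association name to its canonical role, if it has one.
pub fn project_role(curator_name: &str) -> Option<ProjectRole> {
    match curator_name {
        // Universal: both curators use the same association name.
        "time_entries" => Some(ProjectRole::TimeEntries),
        // Divergent: Redmine says `issues`, OpenProject says `work_packages`.
        "issues" | "work_packages" => Some(ProjectRole::WorkItems),
        // Through-association actor chain.
        "members" | "memberships" | "users" | "member_principals" | "principals" => {
            Some(ProjectRole::Members)
        }
        "parent" => Some(ProjectRole::Parent),
        _ => None,
    }
}

/// Project a list of association names onto the sorted, de-duplicated
/// set of canonical roles (illustrative stand-in for the projector).
pub fn canonical_roles(associations: &[&str]) -> Vec<ProjectRole> {
    let mut roles: Vec<ProjectRole> = associations
        .iter()
        .filter_map(|name| project_role(name))
        .collect();
    roles.sort();
    roles.dedup();
    roles
}
```

With made-up association lists such as `["issues", "time_entries", "members"]` and `["work_packages", "time_entries", "member_principals"]`, both inputs reduce to the same three-role vector, which is the convergence property the real-corpus test pins.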
…ss is identity
The user's load-bearing insight, made code: 'labels are arbitrary if it
maps to the same binary codebook values'. The curator name stays whatever
shape the curator emits (Rails Issue/WorkPackage, Odoo
account.analytic.line, …); the OGAR codebook is what maps any of them to
the canonical u16 binary identity. The string layer collapses; the
address is the identity.
ogar-vocab additions:
- canonical_concept_id(concept: &str) -> u16
FNV-1a 32-bit XOR-folded to u16; pure + deterministic + portable.
The 0-slot is canon-reserved (NodeGuid::CLASSID_DEFAULT) so any
collision-with-0 hash is bumped to 1.
- ogar_codebook(alias: &str) -> u16
The 'leave it Odoo-shaped, map to canonical via OGAR codebook' API.
Composes canonical_concept (promoted-invariant table) with
canonical_concept_id (hash) so a curator label and the canonical
string label produce the SAME id.
- impl Class { pub fn canonical_id(&self) -> Option<u16> } convenience.
- LabelDTO { label: String, id: u16 } + LabelDTO::from_alias(label)
Consumer-facing primitive: two consumers with different label
conventions for the same concept produce LabelDTOs with DIFFERENT
labels and EQUAL ids. OGAR has the awareness; consumers carry their
own labels.
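For reference, the hash step described above (FNV-1a 32-bit, XOR-folded to u16, with 0 bumped to 1) can be sketched as below. The function names are hypothetical and the sketch applies no label normalization, so its outputs are not claimed to match the ids the crate produces.

```rust
/// Standard FNV-1a 32-bit parameters.
const FNV_OFFSET_BASIS: u32 = 0x811c_9dc5;
const FNV_PRIME: u32 = 0x0100_0193;

/// Plain FNV-1a over the input bytes: XOR the byte in, then multiply.
pub fn fnv1a_32(bytes: &[u8]) -> u32 {
    let mut hash = FNV_OFFSET_BASIS;
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// XOR-fold the high half into the low half, then keep the 0 slot free
/// by bumping a zero result to 1.
pub fn fold_to_u16(hash: u32) -> u16 {
    let folded = (hash as u16) ^ ((hash >> 16) as u16);
    if folded == 0 { 1 } else { folded }
}

/// Hypothetical composition of the two steps for a concept string.
pub fn concept_id_sketch(concept: &str) -> u16 {
    fold_to_u16(fnv1a_32(concept.as_bytes()))
}
```

The fold is lossy by construction: any two 32-bit hashes whose halves XOR to the same value share an id, which is worth keeping in mind when the id is treated as identity.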
5 new unit tests pin the contract:
- canonical_concept_id_is_deterministic_and_nonzero
- canonical_concept_id_distinct_for_promoted_concepts
- ogar_codebook_maps_curator_labels_to_canonical_id
- label_dto_carries_local_label_and_shared_codebook_id
- class_canonical_id_round_trips_through_codebook
Real-corpus tests now assert binary-codebook convergence directly on the
extracted classes:
- Redmine Issue and OP WorkPackage have the SAME canonical_id (the
ogar_codebook value for 'project_work_item').
- Redmine and OP Project have the SAME canonical_id.
- Redmine TimeEntry's canonical_id matches ogar_codebook
('account.analytic.line') without producer-side label-shaping —
cross-domain binary convergence proved on the Rails side without
reaching into Odoo source (other session's scope).
ogar-vocab 27 (+5), ogar-from-ruff 24, ogar-from-rails 7 real-corpus
green; clippy clean; disk 23 GB free.
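The "different labels, equal ids" contract can be illustrated with a small sketch. Everything here is an assumption for the example: `toy_codebook` stands in for ogar_codebook, the id values are placeholders, and the catch-all bucket replaces whatever the real resolver does for unpromoted labels.

```rust
/// Toy stand-in for `ogar_codebook`: resolves a curator alias to an
/// illustrative id. The id values are placeholders for this sketch.
pub fn toy_codebook(alias: &str) -> u16 {
    match alias {
        "Project" | "project" => 0x0001,
        "Issue" | "WorkPackage" | "project_work_item" => 0x0002,
        "TimeEntry" | "account.analytic.line" | "billable_work_entry" => 0x0003,
        // Hypothetical "unmapped" bucket, used by this sketch only.
        _ => 0xFFFF,
    }
}

/// Consumer-facing pair: the label stays consumer-local, the id is shared.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LabelDTO {
    pub label: String,
    pub id: u16,
}

impl LabelDTO {
    /// Keep the caller's label verbatim and attach the codebook id.
    pub fn from_alias(label: &str) -> Self {
        Self {
            label: label.to_string(),
            id: toy_codebook(label),
        }
    }
}
```

A Rails-side consumer calling `LabelDTO::from_alias("TimeEntry")` and an Odoo-side consumer calling `LabelDTO::from_alias("account.analytic.line")` end up with different `label` fields and the same `id`.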
…ok ids
Per the LE-contract direction: consumers (SurrealDB AST, lance-graph-
planner, kanban, …) consume the codebook value as 2 little-endian bytes,
matching the NodeGuid layout (canonical_node.rs is LE throughout).
- canonical_concept_id rustdoc: document the 2-byte LE wire format and
declare wire-compatibility with NodeGuid.classid's u16 low half.
- Class::canonical_id_le() -> Option<[u8; 2]>: convenience.
- LabelDTO::id_le() -> [u8; 2]: convenience.
- New unit test le_wire_contract_round_trips pins:
  * same canonical -> same LE bytes,
  * u16::from_le_bytes roundtrip is stable,
  * Class.canonical_id_le agrees with LabelDTO.id_le for the same
    canonical concept,
  * no canonical -> None on the wire.
LabelDTO documentation notes that long-term the type belongs in
lance-graph-contract alongside ClassId and the NodeGuid LE layout
(consumers depend on lance-graph-contract for ClassId anyway). Wire is
the source of truth; any encoder/decoder agreeing on u16 LE is
compatible regardless of which crate exports the DTO.
ogar-vocab 28 (+1 LE), ogar-from-ruff 24, real-corpus 7 green; clippy clean.
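The wire contract itself is small enough to sketch with the standard library alone. The helper names below are hypothetical; only `u16::to_le_bytes` and `u16::from_le_bytes` are real APIs.

```rust
/// Encode a codebook id as the 2-byte little-endian wire form.
pub fn id_to_wire(id: u16) -> [u8; 2] {
    id.to_le_bytes()
}

/// Decode the 2-byte little-endian wire form back to a codebook id.
pub fn id_from_wire(bytes: [u8; 2]) -> u16 {
    u16::from_le_bytes(bytes)
}

/// A class without a canonical id has nothing to put on the wire.
pub fn optional_id_to_wire(id: Option<u16>) -> Option<[u8; 2]> {
    id.map(u16::to_le_bytes)
}
```

For example, id `0x0102` goes on the wire as `[0x02, 0x01]`: low byte first. Any decoder that reads the two bytes with `u16::from_le_bytes` recovers the same value regardless of host endianness.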
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e780241b76
    let folded = (hash as u16) ^ ((hash >> 16) as u16);
    if folded == 0 { 1 } else { folded }
Avoid lossy hashes for codebook identity
Because LabelDTO.id and Class::canonical_id() are introduced as identity values, folding FNV into 16 bits makes unrelated labels indistinguishable whenever they collide. This is not just theoretical: with the current normalization canonical_concept_id("outcome") == canonical_concept_id("handle_out") (33032), and Odoo-scale vocabularies make birthday collisions likely. The integration contract in docs/INTEGRATION-MAP.md:92-93 says the ClassId/entity_type is minted uniquely by the registry and is never a content hash, so returning this folded hash directly can merge unrelated concepts downstream; use a checked registry/table or collision-resolving codebook instead.
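A rough birthday estimate makes the scale concern concrete. It assumes the folded hash behaves like a uniform draw over the $N = 2^{16}$ slots, which is an idealization rather than a measured property of the resolver:

$$
p(n) \approx 1 - \exp\!\left(-\frac{n(n-1)}{2N}\right), \qquad N = 2^{16} = 65536
$$

Under that assumption, $p(100) \approx 0.07$, $p(302) \approx 0.50$, and $p(1000) \approx 0.9995$, so a vocabulary of a few hundred labels already has about even odds of containing at least one colliding pair.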
Agreed and fixed in #61. Verified the collision locally — canonical_concept_id("outcome") == canonical_concept_id("handle_out") == 33032 under FNV-1a XOR-fold. The integration contract (docs/INTEGRATION-MAP.md:92-93) is the right authority: ClassId is minted uniquely by the registry, never a content hash. Replaced the hash with a curated const CODEBOOK: &[(&str, u16)] table — each promoted concept gets a stable assigned id (project = 0x0001, project_work_item = 0x0002, billable_work_entry = 0x0003). canonical_concept_id now returns Option<u16> — Some for promoted concepts, None for everything else. No silent collisions possible. codebook_has_no_duplicate_ids_or_zero pins the invariant; canonical_concept_id_returns_some_for_promoted_none_for_unknown explicitly tests that outcome and handle_out now return None.
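A minimal sketch of the curated-table approach described in this reply. The three rows and their ids are the ones quoted above; the lookup body and the `codebook_is_well_formed` helper are illustrative assumptions, not the code merged in #61.

```rust
use std::collections::HashSet;

/// Curated codebook: each promoted concept gets a stable assigned id.
pub const CODEBOOK: &[(&str, u16)] = &[
    ("project", 0x0001),
    ("project_work_item", 0x0002),
    ("billable_work_entry", 0x0003),
];

/// Some(id) for promoted concepts, None for everything else.
pub fn canonical_concept_id(concept: &str) -> Option<u16> {
    CODEBOOK
        .iter()
        .find(|(name, _)| *name == concept)
        .map(|&(_, id)| id)
}

/// Invariant check (hypothetical helper): no id is zero, no id repeats.
pub fn codebook_is_well_formed(table: &[(&str, u16)]) -> bool {
    let mut seen = HashSet::new();
    // `insert` returns false when the id was already present.
    table.iter().all(|&(_, id)| id != 0 && seen.insert(id))
}
```

Because ids are assigned rather than derived from content, an unpromoted label such as `"outcome"` resolves to `None` instead of landing on some other concept's id.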
    /// binary identity is the codebook value.
    #[must_use]
    pub fn ogar_codebook(alias: &str) -> u16 {
        canonical_concept_id(&canonical_concept(alias))
Normalize canonical class labels before hashing
When callers pass an OGAR canonical class label itself to the new ogar_codebook/LabelDTO::from_alias path, such as billable_work_entry().name == "BillableWorkEntry" or project_work_item().name == "ProjectWorkItem", canonical_concept does not recognize those PascalCase canonical names and falls back to hashing billableworkentry/projectworkitem. The resulting ids differ from canonical_concept_id("billable_work_entry") and Class::canonical_id(), so OGAR-owned labels no longer share the promised identity; add those canonical class-name spellings to the alias table or canonicalize them before hashing.
Agreed and fixed in #61. billable_work_entry().name is "BillableWorkEntry" (PascalCase); the resolver lowercased to "billableworkentry" and fell through to lexical, producing a different id than "billable_work_entry". Same for ProjectWorkItem and Project. Added the PascalCase + lowercase-concat aliases ("billableworkentry", "projectworkitem") into the promoted-invariant arms, plus an explicit Project/Projects arm. ogar_codebook("BillableWorkEntry") and ogar_codebook("ProjectWorkItem") now share the id with their snake_case forms and with the curator labels (TimeEntry, account.analytic.line, Issue, WorkPackage). Pinned by ogar_codebook_maps_curator_labels_to_canonical_id and label_dto_carries_local_label_and_shared_codebook_id.
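The alias fix can be pictured as a resolver that lowercases its input and then matches both the snake_case and the lowercase-concatenated spellings. This is a sketch under assumptions: the `Option<&'static str>` return type and the exact arm contents are illustrative, and the real resolver covers more aliases than shown.

```rust
/// Resolve a curator or OGAR-owned label to its canonical concept name.
/// Lowercasing turns "BillableWorkEntry" into "billableworkentry", so the
/// concatenated spelling needs its own arm next to the snake_case one.
pub fn canonical_concept(alias: &str) -> Option<&'static str> {
    let lowered = alias.to_lowercase();
    match lowered.as_str() {
        "billable_work_entry" | "billableworkentry" | "timeentry" | "time_entry"
        | "account.analytic.line" => Some("billable_work_entry"),
        "project_work_item" | "projectworkitem" | "issue" | "workpackage"
        | "work_package" => Some("project_work_item"),
        "project" | "projects" => Some("project"),
        // Unpromoted labels have no canonical concept in this sketch.
        _ => None,
    }
}
```

With these arms, `canonical_concept("BillableWorkEntry")`, `canonical_concept("billable_work_entry")`, and `canonical_concept("TimeEntry")` all return the same canonical name, so any id lookup keyed on that name agrees across the spellings.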
…x P2 #145)

The 0x0700/0x0701 codebook rows had no resolver arm, so PascalCase model-name inputs (OsintSystem/OsintPerson, the builders' Class::new names) lexically landed on osintsystem/osintperson, which are NOT in the codebook, and ogar_codebook returned None.

Add the canonical class-name arm mirroring the PR #60 pattern every other promoted class already has, plus a round-trip test asserting OsintSystem->0x0700 and OsintPerson->0x0701 (and via the builders' .name).

Additive; ogar-vocab suite green. Addresses codex review comment on PR #145.

Co-Authored-By: Claude <noreply@anthropic.com>
Three things, one PR: the OP <-> Redmine convergence ladder

1. `Project` canonical class

Referenced by `project_work_item().project` and `billable_work_entry().project` edges; this PR adds the canonical class those edges resolve to.

- `ogar_vocab::project()`: 3 direct family edges (`work_items` → `ProjectWorkItem`, `time_entries` → `BillableWorkEntry`, `members` → `ProjectActor`) + typed identity attributes (`name`, `identifier` as `string`) + `Language::Unknown`. Nested-project parent waits on mixin-decode (Redmine `awesome_nested_set`; OP `Projects::Hierarchy`); documented as follow-up.
- `ogar_from_ruff::project_role(...)` + `project_canonical_roles(&Class)`: Rails-dialect → canonical role resolver.
- Redmine and OpenProject `Project` both lift to the canonical 3-role surface.

2. OGAR codebook + `LabelDTO`: labels decorative, address is identity

The load-bearing insight made code:

- `canonical_concept_id(concept: &str) -> u16`: FNV-1a 32 XOR-folded to `u16`. Pure + deterministic + portable.
- `ogar_codebook(alias) -> u16`: composes the promoted-invariant resolver with the hash; leave the curator label whatever shape it is, the codebook maps it to the canonical target.
- `Class::canonical_id() -> Option<u16>`: convenience.
- `LabelDTO { label, id, canonical }`: consumer-facing triple. `label` is consumer-local (not normalised); `id` is the codebook value; `canonical` is the canonical-AST label for downstream consumers (SurrealAST, lance-graph-planner, kanban) that need a portable symbol.

3. LE wire contract: 2 little-endian bytes

Per direction: consumers consume the codebook id as 2 little-endian bytes, matching the `NodeGuid` LE layout in `lance-graph-contract`.

- `Class::canonical_id_le() -> Option<[u8; 2]>`.
- `LabelDTO::id_le() -> [u8; 2]`.
- `canonical_concept_id` rustdoc documents the wire format and declares wire-compatibility with `NodeGuid.classid`'s u16 low half.
- The `LabelDTO` type itself belongs in `lance-graph-contract` long-term (alongside `ClassId` + `NodeGuid`); a follow-up PR migrates it. Wire is the source of truth: any encoder/decoder agreeing on u16 LE is compatible regardless of which crate exports the DTO.

Real-corpus tests now assert binary convergence

- Redmine `Issue` and OP `WorkPackage` have the same `canonical_id`.
- Redmine and OP `Project` have the same `canonical_id`.
- Redmine `TimeEntry`'s `canonical_id` matches `ogar_codebook("account.analytic.line")` without producer-side label-shaping: cross-domain binary convergence proven on the Rails side without reaching into Odoo source (other session's scope).

Green

`ogar-vocab` 28 (+6: Project, codebook, ogar_codebook, LabelDTO, canonical_id, LE wire) · `ogar-from-ruff` 24 (+2 Project) · `ogar-from-rails` 7 real-corpus (+1 Project, binary-id assertions on existing three) · clippy clean · disk 23 GB free.

🤖 Generated with Claude Code