Reconcile overlap between relationships and hierarchy #56

@mindsocket

Problem

hierarchy and relationships in the schema metadata serve overlapping purposes but are implemented separately, creating complexity and inconsistency.

A hierarchy level is conceptually a relationship with:

  • Structural ordering (position in the type chain)
  • A resolvedParents resolution phase (post-parse link following)
  • Optional selfRef/selfRefField for same-type parent links

Relationships have:

  • Embedding format hints (heading, list, table, page)
  • Matchers for heading text
  • fieldOn to control which side holds the link

Both support field, fieldOn, and multi. However, hierarchy links are resolved by resolveHierarchyEdges() in a post-parse pass, while relationships are resolved inline during extractEmbeddedNodes(). This split means:

  • Hierarchy nodes can't use heading/list/table embedding patterns
  • Relationships don't get resolvedParents, which makes cross-node queries harder
  • parse-embedded has to maintain separate logic paths for hierarchy-typed nodes vs. relationship-typed nodes, even though they often appear in the same markdown structure
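To make the overlap concrete, here is a minimal sketch of how the two metadata shapes might share a common base. The field names field, fieldOn, multi, selfRef and selfRefField come from the description above; everything else (the interface names, the "parent" | "child" values for fieldOn, the kind discriminator, the sharedFields helper) is an assumption for illustration and not the current schema.

```typescript
// Hypothetical types: only the field names mentioned in this issue are real.
type FieldOn = "parent" | "child"; // assumed values for "which side holds the link"

// The part both concepts already support.
interface LinkBase {
  field: string;    // name of the field that holds the link
  fieldOn: FieldOn; // which side holds the link
  multi: boolean;   // whether the field may hold several links
}

// Hierarchy-only additions. Structural ordering is assumed to come from
// the level's position in the hierarchy array, so it needs no field here.
interface HierarchyLevel extends LinkBase {
  kind: "hierarchy";
  type: string;
  selfRef?: boolean;
  selfRefField?: string;
}

type EmbeddingFormat = "heading" | "list" | "table" | "page";

// Relationship-only additions.
interface Relationship extends LinkBase {
  kind: "relationship";
  parent: string;
  type: string;
  format?: EmbeddingFormat; // embedding format hint
  matchers?: string[];      // heading text matchers
}

type Edge = HierarchyLevel | Relationship;

// Code that only needs the shared part can accept either shape.
function sharedFields(edge: Edge): LinkBase {
  return { field: edge.field, fieldOn: edge.fieldOn, multi: edge.multi };
}
```

With a shape like this, validation of field, fieldOn and multi could be written once against LinkBase, while ordering and embedding hints stay on their own side.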

What to investigate / decide

  1. Can hierarchy be modelled as a constrained subset of relationships? Specifically, can we add ordering + resolvedParents semantics to relationships without losing the distinction that hierarchy imposes (strict type chain, depth inference, no arbitrary embedding)?

  2. Should resolvedParents apply to relationship-typed nodes too? If a node is embedded via a relationship (e.g. Application inside a Solution), it would be useful to have resolvedParents populated, just as hierarchy nodes do.

  3. Where should the resolution phase live? Currently resolveHierarchyEdges() is a separate post-parse pass that runs only for hierarchy. If relationships also need resolution, this pass would have to be generalised.

  4. What stays distinct? Hierarchy's depth-based type inference and structural ordering are specific to the tree metaphor. These probably shouldn't bleed into general relationships. The goal is to reduce duplicated logic, not to collapse the concepts entirely.
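For questions 2 and 3, a generalised post-parse pass might look roughly like the sketch below. It is not the existing resolveHierarchyEdges(): the GraphNode and EdgeRule shapes, the resolveParents name, and the assumption that link fields hold node ids are all hypothetical. The point is only that one pass can fill resolvedParents from any rule, whether that rule came from hierarchy or from relationships.

```typescript
// Hypothetical node shape; the real parsed node type will differ.
interface GraphNode {
  id: string;
  type: string;
  fields: Record<string, string[]>; // assumed: link fields hold target node ids
  resolvedParents: string[];
}

// One rule per hierarchy level or relationship (hypothetical).
interface EdgeRule {
  parentType: string;
  childType: string;
  field: string;
  fieldOn: "parent" | "child"; // assumed values
}

// Populate resolvedParents for every rule, regardless of its origin.
function resolveParents(nodes: GraphNode[], rules: EdgeRule[]): void {
  const byId = new Map<string, GraphNode>();
  for (const n of nodes) byId.set(n.id, n);

  for (const rule of rules) {
    const holderType = rule.fieldOn === "child" ? rule.childType : rule.parentType;
    for (const holder of nodes) {
      if (holder.type !== holderType) continue;
      for (const targetId of holder.fields[rule.field] ?? []) {
        const target = byId.get(targetId);
        if (!target) continue; // dangling link: left for a validation step
        const parent = rule.fieldOn === "child" ? target : holder;
        const child = rule.fieldOn === "child" ? holder : target;
        if (parent.type !== rule.parentType || child.type !== rule.childType) continue;
        if (!child.resolvedParents.includes(parent.id)) {
          child.resolvedParents.push(parent.id);
        }
      }
    }
  }
}
```

In this sketch an Application that links to a Solution through a relationship rule ends up with the Solution's id in resolvedParents, the same way a hierarchy child would. Depth inference and ordering are deliberately left out, in line with question 4.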

Expected outcome

  • Reduced duplication between resolveHierarchyEdges, validate-hierarchy, and the relationship parsing/validation code
  • A clearer mental model: hierarchy is the primary type chain; relationships are lateral or embedded associations, some of which also warrant parent resolution
  • parse-embedded can use a single traversal path that handles both, gated by which metadata is active
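As a rough illustration of the last point, a single classification step could check relationship matchers first and fall back to depth-based hierarchy inference, each gated by whether that metadata is present. This is a sketch under assumptions, not the parse-embedded code: SchemaMetadata, Signal, classifyHeading, the case-insensitive exact matcher comparison, and the type names in the usage note are all hypothetical.

```typescript
// Hypothetical metadata shape: either part may be absent.
interface SchemaMetadata {
  hierarchy?: string[]; // ordered type chain, root first
  relationships?: { type: string; matchers: string[] }[];
}

type Signal =
  | { kind: "relationship"; type: string }
  | { kind: "hierarchy"; type: string }
  | { kind: "none" };

// Classify one heading found under a node of type parentType.
// depthBelowParent is how many heading levels deeper it sits (1 = direct child).
function classifyHeading(
  text: string,
  depthBelowParent: number,
  parentType: string,
  meta: SchemaMetadata,
): Signal {
  // Relationship path: active only when relationships metadata exists.
  const wanted = text.trim().toLowerCase();
  for (const rel of meta.relationships ?? []) {
    if (rel.matchers.some((m) => m.toLowerCase() === wanted)) {
      return { kind: "relationship", type: rel.type };
    }
  }

  // Hierarchy path: infer the type from position in the type chain.
  const chain = meta.hierarchy ?? [];
  const parentIndex = chain.indexOf(parentType);
  if (parentIndex === -1 || depthBelowParent < 1) return { kind: "none" };
  const inferred = chain[parentIndex + depthBelowParent];
  return inferred !== undefined
    ? { kind: "hierarchy", type: inferred }
    : { kind: "none" };
}
```

For example, with a hierarchy of ["Vision", "Mission", "Goal"] and a relationship whose matchers include "Risks", a "Risks" heading under a Mission would classify as a relationship, while any other heading one level down would be inferred as a Goal.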

Related

  • Companion issue: Refactor parse-embedded to use clean mdast traversal with explicit signal tracking
  • This is likely a blocker for that refactor — the traversal design depends on how these two concepts relate
