Define ontology schema file format for domain-specific schemas #547

@AlexMikhalev

Parent

Epic: #544

Problem

The --schema flag proposed in #545 and #546 needs a file format specification. The publishing pipeline currently stores domain models as domain-model.json, but terraphim has no standard ontology schema format.

Proposed Format

An ontology schema file defines the expected entity types, relationship types, and their properties for a specific domain. This maps the v1.9.0 feature gates (ontology, medical, hgnc) to a user-definable file.

{
  "version": "1.0",
  "domain": "publishing/context-graphs-for-engineers",
  "entity_types": [
    {
      "id": "context_graph",
      "label": "Context graph",
      "description": "A knowledge graph extended with decision traces...",
      "aliases": ["Decision Trace Infrastructure", "Agent Governance Layer"],
      "anti_patterns": ["Operational KG", "Enhanced knowledge graph"],
      "broader": ["context_engineering"],
      "narrower": ["decision_trace", "temporal_metadata"]
    }
  ],
  "relationship_types": [
    {
      "id": "extends",
      "label": "extends",
      "source_types": ["context_graph"],
      "target_types": ["knowledge_graph"]
    }
  ],
  "normalization": {
    "methods": ["Exact", "Fuzzy"],
    "fuzzy_threshold": 0.6
  }
}
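One possible reading of how the normalization block could drive term matching, as a rough sketch only. The match_term helper is hypothetical, difflib.SequenceMatcher stands in for whatever fuzzy metric terraphim actually uses, and case-insensitive comparison is an assumption.

```python
from difflib import SequenceMatcher

# Trimmed copy of the example schema above.
SCHEMA = {
    "entity_types": [
        {
            "id": "context_graph",
            "label": "Context graph",
            "aliases": ["Decision Trace Infrastructure", "Agent Governance Layer"],
            "anti_patterns": ["Operational KG", "Enhanced knowledge graph"],
        }
    ],
    "normalization": {"methods": ["Exact", "Fuzzy"], "fuzzy_threshold": 0.6},
}


def match_term(text, schema):
    """Hypothetical helper: return (entity_id, method, is_anti_pattern) or None."""
    norm = schema["normalization"]
    needle = text.strip().lower()

    # Collect every surface form together with the entity it points to.
    candidates = []
    for entity in schema["entity_types"]:
        for form in [entity["label"], *entity.get("aliases", [])]:
            candidates.append((form.lower(), entity["id"], False))
        for form in entity.get("anti_patterns", []):
            candidates.append((form.lower(), entity["id"], True))

    # Exact: the lowercased text equals a known surface form.
    if "Exact" in norm["methods"]:
        for form, entity_id, flagged in candidates:
            if form == needle:
                return entity_id, "Exact", flagged

    # Fuzzy: best similarity ratio, accepted only at or above the threshold.
    if "Fuzzy" in norm["methods"] and candidates:
        scored = [
            (SequenceMatcher(None, needle, form).ratio(), entity_id, flagged)
            for form, entity_id, flagged in candidates
        ]
        score, entity_id, flagged = max(scored, key=lambda item: item[0])
        if score >= norm["fuzzy_threshold"]:
            return entity_id, "Fuzzy", flagged

    return None
```

In this sketch an anti-pattern still resolves to its entity, with the third tuple element marking it as flagged usage.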

Key Decisions

  1. ID format: snake_case identifiers (same as domain-model.json term keys)
  2. URI scheme: domain-model://{domain}/terms/{id} for GroundingMetadata
  3. Anti-patterns: Terms to flag as incorrect usage (taken from the domain model's avoid lists)
  4. Normalization config: Which normalization methods to apply, and the thresholds they use
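Decisions 1 and 2 can be illustrated with a small sketch. The function names and the exact snake_case regular expression are assumptions; only the URI template comes from the list above.

```python
import re

# Assumed definition of snake_case: lowercase words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def is_valid_id(term_id):
    """Hypothetical check that an identifier is snake_case."""
    return SNAKE_CASE.fullmatch(term_id) is not None


def grounding_uri(domain, term_id):
    """Build a domain-model://{domain}/terms/{id} URI for a term."""
    if not is_valid_id(term_id):
        raise ValueError(f"not a snake_case id: {term_id!r}")
    return f"domain-model://{domain}/terms/{term_id}"


print(grounding_uri("publishing/context-graphs-for-engineers", "context_graph"))
```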

Derivation from domain-model.json

A script (derive-ontology-schema.py) converts the publishing domain-model.json to this format. The schema format is a subset of the domain model, limited to what terraphim-agent needs for extraction and coverage.
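The conversion might look roughly like the sketch below. The layout of domain-model.json is not specified in this issue, so the input shape here (a terms mapping keyed by term ID, with label, definition, aliases, avoid, broader and narrower fields) is entirely hypothetical, as is the derive_schema name. The normalization defaults are copied from the example above.

```python
def derive_schema(domain_model):
    """Hypothetical sketch: map an assumed domain-model.json shape to the schema format."""
    entity_types = []
    for term_id, term in domain_model["terms"].items():
        entity_types.append(
            {
                "id": term_id,  # term keys are already snake_case
                "label": term.get("label", term_id),
                "description": term.get("definition", ""),
                "aliases": term.get("aliases", []),
                "anti_patterns": term.get("avoid", []),  # avoid list -> anti_patterns
                "broader": term.get("broader", []),
                "narrower": term.get("narrower", []),
            }
        )
    return {
        "version": "1.0",
        "domain": domain_model["domain"],
        "entity_types": entity_types,
        "relationship_types": [],  # relationship derivation is left out of this sketch
        "normalization": {"methods": ["Exact", "Fuzzy"], "fuzzy_threshold": 0.6},
    }
```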

Compatibility

  • The ontology feature gate (default) provides the core types
  • Domain-specific schemas extend the base with their own entity/relationship types
  • The schema file replaces compile-time feature gates with runtime configuration
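A minimal sketch of how a domain schema could extend a base set of types at runtime. The extend_schema name is hypothetical, the base_entity type is a placeholder rather than an actual terraphim core type, and letting the domain definition win on an ID collision is an assumption.

```python
def extend_schema(base, domain):
    """Hypothetical merge: base types first, domain types added or overriding by id."""
    merged = dict(domain)
    for key in ("entity_types", "relationship_types"):
        by_id = {item["id"]: item for item in base.get(key, [])}
        # A domain entry with the same id replaces the base entry.
        by_id.update({item["id"]: item for item in domain.get(key, [])})
        merged[key] = list(by_id.values())
    return merged


# Placeholder base schema; the real core types are not listed in this issue.
base = {"entity_types": [{"id": "base_entity", "label": "Base entity"}]}
domain = {
    "version": "1.0",
    "domain": "publishing/context-graphs-for-engineers",
    "entity_types": [{"id": "context_graph", "label": "Context graph"}],
    "relationship_types": [{"id": "extends", "label": "extends"}],
}
merged = extend_schema(base, domain)
```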

Use Case

Any domain (publishing, medical, engineering) can define its own ontology schema without recompiling terraphim. The publishing pipeline generates the schema from domain-model.json; other domains can create schemas manually or via their own tooling.
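For hand-written schemas, a small sanity check could catch common mistakes before the file reaches terraphim. This is a sketch under assumptions: check_schema is a hypothetical helper, and both the set of required top-level keys and the rule that every referenced type must be defined in the same file are guesses rather than part of the proposal.

```python
# Assumed set of required top-level keys.
REQUIRED_KEYS = ("version", "domain", "entity_types", "relationship_types")


def check_schema(schema):
    """Hypothetical check: return a list of human-readable problems (empty if none)."""
    problems = [f"missing top-level key: {key}" for key in REQUIRED_KEYS if key not in schema]
    known = {entity["id"] for entity in schema.get("entity_types", [])}

    # broader/narrower should point at entity types defined in this file.
    for entity in schema.get("entity_types", []):
        for field in ("broader", "narrower"):
            for ref in entity.get(field, []):
                if ref not in known:
                    problems.append(f"{entity['id']}.{field}: unknown entity type {ref}")

    # Relationship endpoints should also be defined entity types.
    for rel in schema.get("relationship_types", []):
        for field in ("source_types", "target_types"):
            for ref in rel.get(field, []):
                if ref not in known:
                    problems.append(f"{rel['id']}.{field}: unknown entity type {ref}")

    return problems
```

Run against the trimmed example in this issue, the check would report context_engineering, decision_trace, temporal_metadata and knowledge_graph as undefined, so unresolved references may be better treated as warnings than as hard errors.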
