feat: Integrate ExoRDF to RDF/RDFS Mapping into SPARQL Triple Store #367

@kitelev

Context & Problem Statement

Background

The triple store currently generates RDF triples for Exocortex assets using custom ExoRDF predicates only. To enable semantic interoperability and SPARQL inference, we need to also generate standard RDF/RDFS vocabulary triples that map ExoRDF concepts to W3C standards.

Current State

  • InMemoryTripleStore generates triples with custom ExoRDF predicates only (e.g., exo__Instance_class)
  • No rdfs:subClassOf relationships for class hierarchy
  • No rdfs:subPropertyOf relationships for property hierarchy
  • SPARQL queries cannot use standard RDF/RDFS predicates for inference

Desired State

  • Triple store generates BOTH ExoRDF AND RDF/RDFS triples
  • Class hierarchy triples (ems__Task rdfs:subClassOf exo__Asset, etc.)
  • Property hierarchy triples (exo__Instance_class rdfs:subPropertyOf rdf:type, etc.)
  • SPARQL queries can use rdfs:subClassOf* for transitive class queries
  • Inference support for subclass/subproperty relationships

User Impact

  • SPARQL queries can use standard semantic web predicates
  • Queries like "find all assets" work via rdfs:subClassOf inference
  • Better integration with semantic web tools
  • Future reasoning capabilities (RDFS inference, OWL reasoning)

Acceptance Criteria

Functional Requirements

Given asset with exo__Instance_class property
When triple store generates triples
Then both ExoRDF (exo__Instance_class) and RDF/RDFS (rdf:type) triples are created

Given SPARQL query using rdfs:subClassOf
When querying for all assets via subclass inference
Then results include tasks, projects, areas (all asset subtypes)

Given property with ExoRDF to RDF/RDFS mapping
When triple store initializes
Then generates rdfs:subPropertyOf triple for mapping

Given SPARQL query using rdf:type with rdfs:subClassOf* inference
When querying assets by type hierarchy
Then transitive closure includes all subtypes

Non-Functional Requirements

  • Performance: Triple generation adds less than 5 ms of overhead per asset
  • Memory: Additional triples increase memory usage by no more than 20%
  • Compatibility: Existing SPARQL queries still work (backward compatible)
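
One way to check the per-asset overhead target is to time the ExoRDF-only path against the path that also emits mapped triples and compare the averages. The sketch below is only an illustration: `averageMs` and `overheadPerAssetMs` are hypothetical helpers, not existing project code, and the clock is injectable so the measurement can be tested deterministically.

```typescript
/** Average wall-clock cost of one call to `fn`, in milliseconds. */
function averageMs(
  fn: () => void,
  iterations: number,
  now: () => number = Date.now
): number {
  const start = now();
  for (let i = 0; i < iterations; i++) {
    fn();
  }
  return (now() - start) / iterations;
}

/**
 * Extra time per asset that the RDF/RDFS mapping adds on top of the
 * ExoRDF-only path. `baseline` and `enhanced` are assumed to each
 * process exactly one asset per call.
 */
function overheadPerAssetMs(
  baseline: () => void,
  enhanced: () => void,
  iterations = 1000,
  now: () => number = Date.now
): number {
  const enhancedMs = averageMs(enhanced, iterations, now);
  const baselineMs = averageMs(baseline, iterations, now);
  return enhancedMs - baselineMs;
}
```

A performance test could then assert that `overheadPerAssetMs(...)` stays below 5. Averaging over many iterations matters because `Date.now` only has millisecond resolution.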

Definition of Done

  • Update InMemoryTripleStore to generate RDF/RDFS vocabulary triples
  • Generate rdfs:subClassOf triples for ExoRDF class hierarchy
  • Generate rdfs:subPropertyOf triples for ExoRDF property mappings
  • Generate rdf:type triples alongside exo__Instance_class
  • Implement SPARQL inference for rdfs:subClassOf (transitive)
  • Implement SPARQL inference for rdfs:subPropertyOf (transitive)
  • Unit tests for triple generation (at least 80% coverage)
  • SPARQL query tests using RDF/RDFS predicates
  • Performance tests (triple generation overhead)
  • All existing tests pass (backward compatibility)
  • PR merged to main

Technical Details

Architecture

Affected Layers:

  • Infrastructure (packages/core/src/infrastructure/rdf/)
  • Application (packages/obsidian-plugin/src/application/services/SPARQLApi.ts)
  • Domain (packages/core/src/domain/models/rdf/) - May need inference models

Key Files to Modify:

packages/core/src/infrastructure/rdf/InMemoryTripleStore.ts  # Add RDF/RDFS triple generation
packages/core/src/infrastructure/rdf/RDFVocabularyMapper.ts  # NEW - Mapping logic
packages/obsidian-plugin/src/application/services/ObsidianTripleStore.ts  # Integration
packages/core/src/domain/models/rdf/Namespace.ts  # Ensure RDF/RDFS namespaces exist

Technical Approach

Step 1: Create RDF Vocabulary Mapper

// packages/core/src/infrastructure/rdf/RDFVocabularyMapper.ts

import { Triple, IRI, Namespace } from "../../domain/models/rdf";

export interface ClassMapping {
  exoClass: string;  // "exo__Asset"
  rdfClass: IRI;     // rdfs:Resource
}

export interface PropertyMapping {
  exoProperty: string;  // "exo__Instance_class"
  rdfProperty: IRI;     // rdf:type
}

export class RDFVocabularyMapper {
  private readonly classMappings: ClassMapping[] = [
    { exoClass: "exo__Asset", rdfClass: Namespace.RDFS.term("Resource") },
    { exoClass: "exo__Class", rdfClass: Namespace.RDFS.term("Class") },
    { exoClass: "exo__Property", rdfClass: Namespace.RDF.term("Property") },
  ];

  private readonly propertyMappings: PropertyMapping[] = [
    { exoProperty: "exo__Asset_isDefinedBy", rdfProperty: Namespace.RDFS.term("isDefinedBy") },
    { exoProperty: "exo__Class_superClass", rdfProperty: Namespace.RDFS.term("subClassOf") },
    { exoProperty: "exo__Instance_class", rdfProperty: Namespace.RDF.term("type") },
    { exoProperty: "exo__Property_range", rdfProperty: Namespace.RDFS.term("range") },
    { exoProperty: "exo__Property_domain", rdfProperty: Namespace.RDFS.term("domain") },
    { exoProperty: "exo__Property_superProperty", rdfProperty: Namespace.RDFS.term("subPropertyOf") },
  ];

  /**
   * Generate rdfs:subClassOf triples for ExoRDF class hierarchy.
   * Example: <ems:Task> rdfs:subClassOf <exo:Asset>
   */
  generateClassHierarchyTriples(): Triple[] {
    const triples: Triple[] = [];

    // Domain class hierarchy (hardcoded for now, could be dynamic later)
    const hierarchy = [
      { child: "ems__Task", parent: "exo__Asset" },
      { child: "ems__Project", parent: "exo__Asset" },
      { child: "ems__Area", parent: "exo__Asset" },
    ];

    for (const { child, parent } of hierarchy) {
      triples.push(
        new Triple(
          this.constructClassIRI(child),
          Namespace.RDFS.term("subClassOf"),
          this.constructClassIRI(parent)
        )
      );
    }

    // ExoRDF core classes map onto their W3C counterparts via classMappings,
    // so the mapping table is used instead of repeating the pairs here.
    for (const { exoClass, rdfClass } of this.classMappings) {
      triples.push(
        new Triple(
          this.constructClassIRI(exoClass),
          Namespace.RDFS.term("subClassOf"),
          rdfClass
        )
      );
    }

    return triples;
  }

  /**
   * Generate rdfs:subPropertyOf triples for ExoRDF property mappings.
   * Example: <exo:Instance_class> rdfs:subPropertyOf rdf:type
   */
  generatePropertyHierarchyTriples(): Triple[] {
    return this.propertyMappings.map((mapping) => {
      const exoPropertyIRI = Namespace.EXO.term(mapping.exoProperty.replace("exo__", ""));
      
      return new Triple(
        exoPropertyIRI,
        Namespace.RDFS.term("subPropertyOf"),
        mapping.rdfProperty
      );
    });
  }

  /**
   * Given ExoRDF property and value, generate corresponding RDF/RDFS triple.
   * Example: exo__Instance_class: "ems__Task" → <asset> rdf:type <ems:Task>
   */
  generateMappedTriple(
    subject: IRI,
    exoProperty: string,
    value: string | IRI
  ): Triple | null {
    const mapping = this.propertyMappings.find(
      (m) => m.exoProperty === exoProperty
    );

    if (!mapping) {
      return null; // No RDF/RDFS mapping for this property
    }

    // Convert value to IRI if needed
    const objectIRI = typeof value === "string" 
      ? this.constructClassIRI(value)
      : value;

    return new Triple(subject, mapping.rdfProperty, objectIRI);
  }

  private constructClassIRI(className: string): IRI {
    // Handle namespace prefixes
    if (className.startsWith("rdfs:")) {
      // slice() always yields a string, unlike split(":")[1]
      return Namespace.RDFS.term(className.slice("rdfs:".length));
    }
    if (className.startsWith("rdf:")) {
      return Namespace.RDF.term(className.slice("rdf:".length));
    }
    if (className.startsWith("exo__")) {
      return Namespace.EXO.term(className.replace("exo__", ""));
    }
    if (className.startsWith("ems__")) {
      return Namespace.EMS.term(className.replace("ems__", ""));
    }
    
    // Default to EXO namespace
    return Namespace.EXO.term(className);
  }
}
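
The hardcoded hierarchy in generateClassHierarchyTriples() could later be derived from the class definitions themselves, for instance from their exo__Class_superClass values. A minimal sketch of that idea follows; the `ClassDefinition` shape and the `buildHierarchy` helper are assumptions made for illustration and do not exist in the codebase.

```typescript
// Hypothetical shape: one entry per class asset, with the values of its
// exo__Class_superClass property already resolved to plain class names.
interface ClassDefinition {
  name: string;
  superClasses: string[];
}

interface HierarchyEdge {
  child: string;
  parent: string;
}

/**
 * Turn class definitions into child/parent pairs, in the same shape as the
 * hardcoded `hierarchy` array. Self-references and duplicate pairs are dropped.
 */
function buildHierarchy(definitions: readonly ClassDefinition[]): HierarchyEdge[] {
  const seen = new Set<string>();
  const edges: HierarchyEdge[] = [];

  for (const { name, superClasses } of definitions) {
    for (const parent of superClasses) {
      const key = `${name} -> ${parent}`;
      if (name === parent || seen.has(key)) {
        continue;
      }
      seen.add(key);
      edges.push({ child: name, parent });
    }
  }
  return edges;
}

const exampleEdges = buildHierarchy([
  { name: "ems__Task", superClasses: ["exo__Asset"] },
  { name: "ems__Project", superClasses: ["exo__Asset", "exo__Asset"] },
  { name: "exo__Asset", superClasses: ["exo__Asset"] },
]);
```

The resulting pairs could be fed into the same loop that builds the rdfs:subClassOf triples, so the mapper would not need to change when new classes are added to the vault.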

Step 2: Update InMemoryTripleStore

// packages/core/src/infrastructure/rdf/InMemoryTripleStore.ts

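// AssetMetadata is assumed to be imported already by the existing file.
import { Triple, IRI, Namespace } from "../../domain/models/rdf";
import { RDFVocabularyMapper } from "./RDFVocabularyMapper";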

export class InMemoryTripleStore {
  private readonly vocabMapper: RDFVocabularyMapper;

  constructor() {
    // ... existing code ...
    this.vocabMapper = new RDFVocabularyMapper();
    this.initializeVocabularyTriples();
  }

  /**
   * Initialize triple store with RDF/RDFS vocabulary triples.
   * Called once during construction.
   */
  private initializeVocabularyTriples(): void {
    // Add class hierarchy triples
    const classTriples = this.vocabMapper.generateClassHierarchyTriples();
    for (const triple of classTriples) {
      this.add(triple);
    }

    // Add property hierarchy triples
    const propertyTriples = this.vocabMapper.generatePropertyHierarchyTriples();
    for (const triple of propertyTriples) {
      this.add(triple);
    }
  }

  /**
   * Add asset triples (existing method, now enhanced).
   */
  addAssetTriples(assetURI: IRI, metadata: AssetMetadata): void {
    // Generate ExoRDF triples (existing logic)
    for (const [key, value] of Object.entries(metadata.frontmatter)) {
      const exoPropertyIRI = Namespace.EXO.term(key.replace(/^exo__/, ""));
      const valueNode = this.convertValueToNode(value);
      this.add(new Triple(assetURI, exoPropertyIRI, valueNode));

      // NEW: Generate corresponding RDF/RDFS triple if mapping exists
      const mappedTriple = this.vocabMapper.generateMappedTriple(
        assetURI,
        key,
        value
      );
      if (mappedTriple) {
        this.add(mappedTriple);
      }
    }
  }
}

Step 3: SPARQL Inference Support

// Extend SPARQL query engine to support rdfs:subClassOf* transitive property

// In query execution, expand rdfs:subClassOf* patterns:
// ?class rdfs:subClassOf* exo:Asset
// → Find all classes that are subclasses (directly or transitively) of exo:Asset
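
The expansion described in Step 3 amounts to a graph walk over rdfs:subClassOf edges. The following is a rough sketch under simplifying assumptions: triples are reduced to plain string tuples, and `SimpleTriple` and `subClassesOf` are hypothetical names rather than parts of the existing query engine.

```typescript
// Hypothetical stand-in for store triples: [subject, predicate, object].
type SimpleTriple = readonly [string, string, string];

const SUB_CLASS_OF = "rdfs:subClassOf";

/**
 * Return every class that reaches `root` through zero or more
 * rdfs:subClassOf steps. `root` itself is included because the
 * `*` path operator also matches paths of length zero.
 */
function subClassesOf(triples: readonly SimpleTriple[], root: string): Set<string> {
  // Index parent -> direct children once, so the walk visits each edge once.
  const children = new Map<string, string[]>();
  for (const [subject, predicate, object] of triples) {
    if (predicate !== SUB_CLASS_OF) continue;
    const list = children.get(object) ?? [];
    list.push(subject);
    children.set(object, list);
  }

  // Breadth-first walk; the `seen` set also stops the walk on cyclic data.
  const seen = new Set<string>([root]);
  const queue: string[] = [root];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const child of children.get(current) ?? []) {
      if (!seen.has(child)) {
        seen.add(child);
        queue.push(child);
      }
    }
  }
  return seen;
}

const demoTriples: SimpleTriple[] = [
  ["ems:Task", SUB_CLASS_OF, "exo:Asset"],
  ["ems:Project", SUB_CLASS_OF, "exo:Asset"],
  ["ems:Area", SUB_CLASS_OF, "exo:Asset"],
  ["exo:Asset", SUB_CLASS_OF, "rdfs:Resource"],
];
const assetClasses = subClassesOf(demoTriples, "exo:Asset");
```

With the demo data, `assetClasses` holds exo:Asset and its three ems subclasses. rdfs:Resource is a superclass and is therefore not included.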

Key Dependencies

Gotchas & Edge Cases

⚠️ Watch out for:

  • Memory impact of additional triples (an increase of up to 20% is acceptable)
  • Backward compatibility with existing SPARQL queries
  • Circular subclass relationships (validation needed)
  • Transitive closure computation performance (cache results)
  • A single frontmatter property may yield more than one triple (e.g., exo__Instance_class produces both an exo__Instance_class triple and an rdf:type triple)
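
For the circular-subclass gotcha, validation could run once over the hierarchy before any triples are generated. Below is a minimal sketch of such a check using a depth-first search; `Edge` and `findSubClassCycle` are hypothetical names introduced only for this example.

```typescript
// Hypothetical edge shape, matching the child/parent pairs used by the mapper.
type Edge = { child: string; parent: string };

/**
 * Return one cycle as a list of class names (first and last entry equal),
 * or null when the subclass hierarchy is acyclic.
 */
function findSubClassCycle(edges: readonly Edge[]): string[] | null {
  const parents = new Map<string, string[]>();
  for (const { child, parent } of edges) {
    const list = parents.get(child) ?? [];
    list.push(parent);
    parents.set(child, list);
  }

  const done = new Set<string>(); // nodes whose ancestors are fully explored
  const path: string[] = [];      // current depth-first path

  const visit = (node: string): string[] | null => {
    const index = path.indexOf(node);
    if (index !== -1) {
      // The node is already on the current path, so the path loops back to it.
      return [...path.slice(index), node];
    }
    if (done.has(node)) return null;

    path.push(node);
    for (const parent of parents.get(node) ?? []) {
      const cycle = visit(parent);
      if (cycle) return cycle;
    }
    path.pop();
    done.add(node);
    return null;
  };

  for (const node of Array.from(parents.keys())) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}
```

A store could reject or log the hierarchy when the result is non-null. Whether a cycle should be a hard error is a design decision this issue leaves open.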

Integration Points

  • ObsidianTripleStore - Uses InMemoryTripleStore
  • SPARQLApi - Query execution with inference
  • RDF serializers - May need to output RDF/RDFS triples

AI Agent Guidance

Step-by-Step Implementation

  1. Create RDF Vocabulary Mapper

    cd packages/core/src/infrastructure/rdf
    touch RDFVocabularyMapper.ts
    • Implement class hierarchy generation
    • Implement property hierarchy generation
    • Implement mapped triple generation
  2. Update InMemoryTripleStore

    • Add initializeVocabularyTriples() method
    • Call during constructor
    • Enhance addAssetTriples() to generate mapped triples
    • Ensure backward compatibility
  3. Add SPARQL Inference

    • Detect rdfs:subClassOf* patterns in queries
    • Compute transitive closure
    • Cache results for performance
  4. Write comprehensive tests

    cd packages/core/tests/unit/infrastructure/rdf
    touch RDFVocabularyMapper.test.ts

    Test cases:

    • Class hierarchy triple generation
    • Property hierarchy triple generation
    • Mapped triple generation for assets
    • SPARQL queries using rdfs:subClassOf
    • SPARQL queries using rdf:type
    • Performance tests (triple generation overhead)
    • Backward compatibility (existing queries still work)
  5. Integration testing

    • Load real vault data
    • Verify RDF/RDFS triples generated
    • Test SPARQL queries with inference
    • Verify performance acceptable
  6. Validation

    npm run test:unit
    npm run test:e2e
    npm run build
    npm run lint

Example Code References

Similar patterns:

  • packages/core/src/infrastructure/rdf/InMemoryTripleStore.ts (triple management)
  • packages/core/src/domain/models/rdf/Namespace.ts (namespace handling)
  • packages/obsidian-plugin/src/application/services/ObsidianTripleStore.ts (integration)

Common Mistakes to Avoid

❌ Generating too many triples (memory explosion)
✅ Only generate necessary hierarchy triples once at initialization

❌ Breaking existing SPARQL queries
✅ Ensure backward compatibility so that existing queries still work

❌ Slow transitive closure computation
✅ Cache rdfs:subClassOf* results, compute once

❌ Hardcoding class hierarchy
✅ Make it configurable; consider dynamic loading from ontology files
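
The caching advice above can be sketched as a small memoizing wrapper that recomputes a closure only after the hierarchy changes. `SubClassClosureCache` is a hypothetical class, and clearing the whole cache on every new edge is an assumption chosen for simplicity rather than a requirement of this issue.

```typescript
/** Memoizes subclass closures and drops them whenever a new edge is added. */
class SubClassClosureCache {
  private readonly parentToChildren = new Map<string, Set<string>>();
  private readonly cache = new Map<string, ReadonlySet<string>>();

  /** Number of closures actually computed; exposed only to make caching observable. */
  computeCount = 0;

  addSubClassOf(child: string, parent: string): void {
    const children = this.parentToChildren.get(parent) ?? new Set<string>();
    children.add(child);
    this.parentToChildren.set(parent, children);
    // Any new edge may change any closure, so invalidate everything.
    this.cache.clear();
  }

  /** All classes that are `root` or a direct or transitive subclass of it. */
  closure(root: string): ReadonlySet<string> {
    const cached = this.cache.get(root);
    if (cached) return cached;

    this.computeCount++;
    const result = new Set<string>([root]);
    const stack: string[] = [root];
    while (stack.length > 0) {
      const current = stack.pop() as string;
      this.parentToChildren.get(current)?.forEach((child) => {
        if (!result.has(child)) {
          result.add(child);
          stack.push(child);
        }
      });
    }
    this.cache.set(root, result);
    return result;
  }
}

const closureCache = new SubClassClosureCache();
closureCache.addSubClassOf("ems:Task", "exo:Asset");
closureCache.closure("exo:Asset"); // computed
closureCache.closure("exo:Asset"); // served from the cache
```

Because the vocabulary triples are generated once at initialization, invalidation should be rare in practice and most queries would hit the cache.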


Testing Requirements

Unit Tests

Minimum Coverage: 80%

Test Cases:

  • RDFVocabularyMapper.generateClassHierarchyTriples()
  • RDFVocabularyMapper.generatePropertyHierarchyTriples()
  • RDFVocabularyMapper.generateMappedTriple()
  • InMemoryTripleStore.initializeVocabularyTriples()
  • InMemoryTripleStore.addAssetTriples() with mapping
  • SPARQL query with rdfs:subClassOf
  • SPARQL query with rdfs:subClassOf* (transitive)
  • SPARQL query with rdf:type
  • Performance test: triple generation overhead
  • Backward compatibility: existing queries work

E2E Tests

  • Load vault with real assets
  • Verify RDF/RDFS triples present in triple store
  • Execute SPARQL query: SELECT ?asset WHERE { ?asset rdf:type ?type . ?type rdfs:subClassOf* exo:Asset }
  • Verify all asset types returned (tasks, projects, areas)
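
To make the expected result of the query above concrete, the two triple patterns can be evaluated by hand over a small data set. This is only a sketch with plain string tuples: `T3` and `assetsOfType` are hypothetical names, and the sample assets are invented for the example.

```typescript
// Hypothetical stand-in for store triples: [subject, predicate, object].
type T3 = readonly [string, string, string];

/**
 * Evaluate `?asset rdf:type ?type . ?type rdfs:subClassOf* rootClass`
 * and return the matching assets in sorted order.
 */
function assetsOfType(triples: readonly T3[], rootClass: string): string[] {
  // Pattern 2: collect rootClass and everything below it by repeating a
  // pass over the triples until no new class is added (a simple fixpoint).
  const types = new Set<string>([rootClass]);
  let grew = true;
  while (grew) {
    grew = false;
    for (const [subject, predicate, object] of triples) {
      if (predicate === "rdfs:subClassOf" && types.has(object) && !types.has(subject)) {
        types.add(subject);
        grew = true;
      }
    }
  }

  // Pattern 1: keep every subject whose rdf:type is one of those classes.
  const assets = new Set<string>();
  for (const [subject, predicate, object] of triples) {
    if (predicate === "rdf:type" && types.has(object)) {
      assets.add(subject);
    }
  }
  return Array.from(assets).sort();
}

const vaultTriples: T3[] = [
  ["ems:Task", "rdfs:subClassOf", "exo:Asset"],
  ["ems:Project", "rdfs:subClassOf", "exo:Asset"],
  ["ems:Area", "rdfs:subClassOf", "exo:Asset"],
  ["asset:t1", "rdf:type", "ems:Task"],
  ["asset:p1", "rdf:type", "ems:Project"],
  ["asset:a1", "rdf:type", "ems:Area"],
  ["asset:x1", "rdf:type", "other:Thing"],
];
const matchedAssets = assetsOfType(vaultTriples, "exo:Asset");
```

An E2E test could compare the SPARQL engine's result for the same data against such a hand-evaluated list. Here `asset:x1` is excluded because `other:Thing` is not connected to exo:Asset.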

Documentation Requirements

Code Documentation

  • JSDoc on RDFVocabularyMapper methods
  • Inline comments explaining mapping logic
  • Performance notes in comments

Developer Documentation

  • Update docs/rdf/ExoRDF-Mapping.md (implementation notes)
  • Update docs/sparql/Developer-Guide.md (inference capabilities)
  • Add examples using rdfs:subClassOf in queries

Related Issues

Depends on:

Blocks:

Related:

  • InMemoryTripleStore.ts - Core triple storage
  • Namespace.ts - RDF/RDFS namespace definitions

Additional Notes

Timeline Estimate: 6-8 hours (mapper + triple store update + inference + tests)

Key Design Decisions:

  1. Generate vocabulary triples once at initialization (not per asset)
  2. Generate mapped triples per asset alongside ExoRDF triples
  3. Backward compatible: existing ExoRDF triples remain
  4. Inference support for rdfs:subClassOf* transitive queries
  5. Performance target: <5ms overhead per asset, <20% memory increase

Success Criteria: SPARQL queries can use standard RDF/RDFS predicates with inference, all assets queryable via class hierarchy
