Obsidian format parser: aliases, wikilinks, and tags extraction #604

@AlexMikhalev

Summary

Extend parse_markdown_directives_content() in terraphim_automata/src/markdown_directives.rs to parse Obsidian-native patterns alongside the existing Logseq synonyms:: syntax.

Current Parser

The parser handles 4 directives:

  • synonyms:: -> synonyms (Logseq property syntax)
  • type::: -> document type
  • route:: -> LLM routing config
  • priority:: -> priority score

New Patterns to Parse

1. YAML Frontmatter

---
aliases: [context management, semantic scope]
tags: [engineering, knowledge-graph]
---
  • aliases: -> treated as synonyms (same semantics as synonyms::)
  • tags: -> stored as concept categories (new field in MarkdownDirectives)
  • Supports both YAML flow-sequence syntax ([a, b]) and block-sequence syntax (one "- item" per line)
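
A minimal sketch of the frontmatter step, using only the standard library. The names Frontmatter, clean_item, and parse_frontmatter are hypothetical and not part of terraphim_automata; a real implementation would more likely delegate to a YAML crate. The sketch handles only the two list forms named above and does not support commas inside quoted items.

```rust
/// Hypothetical holder for the two frontmatter keys of interest.
#[derive(Debug, Default, PartialEq)]
pub struct Frontmatter {
    pub aliases: Vec<String>,
    pub tags: Vec<String>,
}

/// Trims whitespace and one pair of matching quotes; returns None if empty.
fn clean_item(raw: &str) -> Option<String> {
    let t = raw.trim();
    let t = t
        .strip_prefix('"')
        .and_then(|s| s.strip_suffix('"'))
        .or_else(|| t.strip_prefix('\'').and_then(|s| s.strip_suffix('\'')))
        .unwrap_or(t)
        .trim();
    if t.is_empty() { None } else { Some(t.to_string()) }
}

/// Reads `aliases:` and `tags:` from a leading `---` ... `---` block.
pub fn parse_frontmatter(content: &str) -> Frontmatter {
    let mut fm = Frontmatter::default();
    let mut lines = content.lines();
    // Frontmatter only counts if it opens on the very first line.
    if lines.next().map(str::trim_end) != Some("---") {
        return fm;
    }
    // Key that subsequent `- item` lines belong to (block-sequence form).
    let mut current: Option<&str> = None;
    for line in lines {
        let trimmed = line.trim();
        if trimmed == "---" {
            break;
        }
        if let Some(item) = trimmed.strip_prefix("- ") {
            match current {
                Some("aliases") => fm.aliases.extend(clean_item(item)),
                Some("tags") => fm.tags.extend(clean_item(item)),
                _ => {}
            }
            continue;
        }
        current = None;
        let Some((key, value)) = trimmed.split_once(':') else { continue };
        let key = key.trim();
        let target = match key {
            "aliases" => &mut fm.aliases,
            "tags" => &mut fm.tags,
            _ => continue,
        };
        let value = value.trim();
        if value.is_empty() {
            // `key:` with nothing after it: items follow on later lines.
            current = Some(key);
        } else {
            // Flow form `[a, b]`; a bare scalar is treated as a one-item list.
            let inner = value
                .strip_prefix('[')
                .and_then(|v| v.strip_suffix(']'))
                .unwrap_or(value);
            target.extend(inner.split(',').filter_map(clean_item));
        }
    }
    fm
}
```

Under the proposal, the aliases would then be appended to synonyms and the tags to the new categories field.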

2. Wikilinks

Related to [[knowledge graph]] and [[ontology]] concepts.
  • Extract all [[target]] and [[target|display text]] patterns
  • Store as related concepts (new Vec<String> field in MarkdownDirectives)
  • These become edges in the rolegraph, not synonyms
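
The wikilink extraction could look roughly like the following standard-library sketch. The function name extract_wikilinks is hypothetical, and two behaviours are assumptions rather than requirements from this issue: duplicate targets are collapsed, and targets spanning a line break are ignored.

```rust
/// Hypothetical helper: returns wikilink targets in first-seen order.
/// `[[target|display text]]` yields `target`; the display text is dropped.
pub fn extract_wikilinks(content: &str) -> Vec<String> {
    let mut targets: Vec<String> = Vec::new();
    let mut rest = content;
    while let Some(open) = rest.find("[[") {
        let after_open = &rest[open + 2..];
        let Some(close) = after_open.find("]]") else { break };
        let mut inner = &after_open[..close];
        // If another `[[` appears before the close, use the innermost one.
        if let Some(i) = inner.rfind("[[") {
            inner = &inner[i + 2..];
        }
        let target = inner.split('|').next().unwrap_or("").trim();
        // Assumption: skip empty or multi-line targets and collapse duplicates.
        if !target.is_empty()
            && !target.contains('\n')
            && !targets.iter().any(|t| t == target)
        {
            targets.push(target.to_string());
        }
        rest = &after_open[close + 2..];
    }
    targets
}
```

The returned targets would populate the related-concepts field rather than the synonyms list, matching the edge semantics described above.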

3. Inline Tags

This concept is important for #engineering and #search.
  • Extract #tag patterns (excluding headings ## heading)
  • Store as categories alongside YAML frontmatter tags
  • Deduplicate with YAML tags
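
A rough sketch of inline tag extraction and deduplication, again standard library only. The names is_tag_char, extract_inline_tags, and merge_categories are hypothetical. Assumptions: a tag must start a whitespace-delimited token, tag characters are alphanumerics plus -, _ and /, purely numeric tags are skipped, and deduplication is by exact match. Heading markers are excluded naturally because the # in a heading is followed by another # or a space. The sketch does not skip fenced code blocks.

```rust
/// Assumed tag alphabet: alphanumerics plus `-`, `_`, and `/`.
fn is_tag_char(c: char) -> bool {
    c.is_alphanumeric() || matches!(c, '-' | '_' | '/')
}

/// Hypothetical helper: collects `#tag` tokens in first-seen order.
pub fn extract_inline_tags(content: &str) -> Vec<String> {
    let mut tags: Vec<String> = Vec::new();
    for token in content.split_whitespace() {
        // Only tokens that start with `#` are candidates.
        let Some(body) = token.strip_prefix('#') else { continue };
        // Stop at the first non-tag character (e.g. trailing punctuation).
        let tag: String = body.chars().take_while(|c| is_tag_char(*c)).collect();
        // `##`, a lone `#`, and purely numeric tokens are not tags.
        if tag.is_empty() || tag.chars().all(|c| c.is_ascii_digit()) {
            continue;
        }
        if !tags.contains(&tag) {
            tags.push(tag);
        }
    }
    tags
}

/// Merges frontmatter tags with inline tags, dropping exact duplicates.
pub fn merge_categories(yaml_tags: &[String], inline_tags: &[String]) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    for tag in yaml_tags.iter().chain(inline_tags) {
        if !out.contains(tag) {
            out.push(tag.clone());
        }
    }
    out
}
```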

Parser Mode Configuration

Add a format field to haystack config:

[[roles.engineer.haystacks]]
location = "~/.config/terraphim/kg"
format = "terraphim"  # synonyms:: + type::: + route:: + priority:: (default)

[[roles.engineer.haystacks]]
location = "~/synced/pages"
format = "logseq"     # synonyms:: only

[[roles.engineer.haystacks]]
location = "~/synced/ObsidianVault"
format = "obsidian"   # aliases: + [[wikilinks]] + #tags + synonyms::

All formats output the same MarkdownDirectives struct. The obsidian format is a superset that also accepts synonyms:: (for Dataview users).
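
One way the format field could be modelled on the Rust side is sketched below. The enum HaystackFormat and its methods are hypothetical names, not existing terraphim_config items; the variant set and the default mirror the three config values shown above.

```rust
use std::str::FromStr;

/// Hypothetical enum backing the `format` key in the haystack config.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum HaystackFormat {
    /// synonyms:: + type::: + route:: + priority:: (default)
    #[default]
    Terraphim,
    /// synonyms:: only
    Logseq,
    /// aliases: + [[wikilinks]] + #tags + synonyms::
    Obsidian,
}

impl FromStr for HaystackFormat {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "terraphim" => Ok(Self::Terraphim),
            "logseq" => Ok(Self::Logseq),
            "obsidian" => Ok(Self::Obsidian),
            other => Err(format!("unknown haystack format: {other}")),
        }
    }
}

impl HaystackFormat {
    /// Whether type:::, route::, and priority:: lines are interpreted.
    pub fn accepts_terraphim_directives(self) -> bool {
        matches!(self, Self::Terraphim)
    }

    /// Whether frontmatter, wikilinks, and inline tags are parsed.
    pub fn accepts_obsidian_patterns(self) -> bool {
        matches!(self, Self::Obsidian)
    }
}
```

Every variant accepts synonyms::, so no flag is needed for it; the parser entry point could branch on the two methods while still returning the same MarkdownDirectives.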

Extended Struct

pub struct MarkdownDirectives {
    pub doc_type: DocumentType,
    pub synonyms: Vec<String>,           // existing
    pub route: Option<RouteDirective>,    // existing
    pub priority: Option<u8>,            // existing
    pub categories: Vec<String>,         // NEW: from tags: and #tag
    pub related_concepts: Vec<String>,   // NEW: from [[wikilinks]]
}

Test Plan

  • Fixture: real Obsidian vault files with YAML frontmatter, wikilinks, and tags
  • Test: aliases parsed as synonyms alongside existing synonyms:: support
  • Test: wikilinks extracted as related concepts
  • Test: tags extracted from both YAML frontmatter and inline #tags, deduplicated
  • Test: mixed format (file has both synonyms:: and aliases:) merges correctly
  • Test: obsidian format config flag routes to extended parser
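
The mixed-format case depends on a merge rule that the issue does not spell out. The sketch below assumes one possible rule: synonyms:: entries come first, aliases: entries are appended, and duplicates are dropped by ASCII case-insensitive comparison. The function name merge_synonyms is hypothetical.

```rust
/// Hypothetical merge rule for files carrying both `synonyms::` and `aliases:`.
/// Assumption: first occurrence wins and comparison ignores ASCII case.
pub fn merge_synonyms(synonyms: &[&str], aliases: &[&str]) -> Vec<String> {
    let mut merged: Vec<String> = Vec::new();
    for term in synonyms.iter().chain(aliases.iter()) {
        let term = term.trim();
        if term.is_empty() {
            continue;
        }
        if !merged.iter().any(|m| m.eq_ignore_ascii_case(term)) {
            merged.push(term.to_string());
        }
    }
    merged
}
```

A unit test for the mixed case could then assert on the merged list directly, for example that "Context Management" from aliases: does not duplicate "context management" from synonyms::.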

Affected Crates

  • terraphim_automata (primary -- extend markdown_directives.rs)
  • terraphim_types (add categories + related_concepts to MarkdownDirectives)
  • terraphim_config (add format field to haystack config)

Estimated Effort

~4 hours

Part of

Epic #603


Labels

architecture, enhancement
