Tag Index Engine
Build a tag indexing system that extracts #tags from note content and maintains a searchable index using the LSM prefix scan.
Storage Design
Use column family tag: with prefix-based lookups:
cf "tag":
tag:{tagname} -> JSON array of note paths that contain this tag
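A minimal sketch of the key and value layout described above, using only the standard library. The helper names `tag_key` and `encode_paths` are hypothetical, and the hand-rolled JSON encoder is an assumption made to keep the example dependency-free; a real implementation would more likely use a JSON library.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: builds the key for one tag in the "tag" column family.
fn tag_key(tag: &str) -> String {
    format!("tag:{tag}")
}

/// Hypothetical helper: serializes note paths as a JSON array of strings.
/// Quotes, backslashes, and control characters are escaped.
fn encode_paths(paths: &BTreeSet<String>) -> String {
    let mut out = String::from("[");
    for (i, path) in paths.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        out.push('"');
        for c in path.chars() {
            match c {
                '"' => out.push_str("\\\""),
                '\\' => out.push_str("\\\\"),
                c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
                c => out.push(c),
            }
        }
        out.push('"');
    }
    out.push(']');
    out
}
```

Keeping the paths in a `BTreeSet` means the stored array is already deduplicated and in lexicographic order, which matches the ordering `search_by_tag` is expected to return.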
Components
TagIndex::index_tags(note_path, tags: Vec<String>)
- Diff old vs new tags for the note
- Add note path to tag:{tag} for new tags
- Remove note path from tag:{tag} for removed tags
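The diff step could look roughly like the sketch below. It uses an in-memory `BTreeMap` as a stand-in for the "tag" column family, and the `note_tags` map is a hypothetical substitute for wherever the note's previous tags are actually stored; neither reflects the real engine API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// In-memory stand-in for the index (assumption, not the real engine).
#[derive(Default)]
struct TagIndex {
    /// "tag:{tag}" -> note paths carrying that tag.
    postings: BTreeMap<String, BTreeSet<String>>,
    /// Hypothetical metadata: note path -> tags last indexed for it.
    note_tags: BTreeMap<String, BTreeSet<String>>,
}

impl TagIndex {
    fn index_tags(&mut self, note_path: &str, tags: Vec<String>) {
        let new: BTreeSet<String> = tags.into_iter().collect();
        let old = self.note_tags.remove(note_path).unwrap_or_default();

        // Tags present now but not before: add the note to their lists.
        for tag in new.difference(&old) {
            self.postings
                .entry(format!("tag:{tag}"))
                .or_default()
                .insert(note_path.to_string());
        }
        // Tags present before but not now: remove the note.
        for tag in old.difference(&new) {
            let key = format!("tag:{tag}");
            if let Some(paths) = self.postings.get_mut(&key) {
                paths.remove(note_path);
                // Assumption: empty lists are deleted rather than kept as [].
                if paths.is_empty() {
                    self.postings.remove(&key);
                }
            }
        }
        if !new.is_empty() {
            self.note_tags.insert(note_path.to_string(), new);
        }
    }
}
```

Tags that appear in both the old and new sets are never touched, so re-indexing an unchanged note performs no writes to the posting lists.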
TagIndex::remove_note_tags(note_path)
- Get all tags for the note from metadata
- Remove note path from each tag:{tag}
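A sketch of the removal path, under the same assumptions as before: the `postings` map stands in for the "tag" column family, and `note_tags` is a hypothetical stand-in for the note metadata that records which tags a note carries.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Default)]
struct TagIndex {
    /// "tag:{tag}" -> note paths (stand-in for the column family).
    postings: BTreeMap<String, BTreeSet<String>>,
    /// Hypothetical metadata: note path -> tags.
    note_tags: BTreeMap<String, BTreeSet<String>>,
}

impl TagIndex {
    fn remove_note_tags(&mut self, note_path: &str) {
        // Look up (and forget) the tags recorded for this note.
        let tags = match self.note_tags.remove(note_path) {
            Some(tags) => tags,
            None => return, // nothing indexed for this note
        };
        for tag in tags {
            let key = format!("tag:{tag}");
            if let Some(paths) = self.postings.get_mut(&key) {
                paths.remove(note_path);
                // Assumption: empty lists are deleted rather than kept as [].
                if paths.is_empty() {
                    self.postings.remove(&key);
                }
            }
        }
    }
}
```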
TagIndex::search_by_tag(tag, cursor, limit) -> (Vec<String>, Option<String>)
- Use Engine::search_prefix("tag:{tag}") with cursor pagination
- Return note paths sorted lexicographically
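The paging part of `search_by_tag` might be sketched as follows. This models only the cursor logic over an already-decoded, sorted set of note paths; it does not assume anything about the signature of `Engine::search_prefix`. The function name `page_paths` and the convention that the cursor is the last path of the previous page are assumptions for illustration.

```rust
use std::collections::BTreeSet;
use std::ops::Bound;

/// Returns up to `limit` paths strictly after `cursor`, plus the cursor
/// for the next page (None when no further results remain).
fn page_paths(
    paths: &BTreeSet<String>,
    cursor: Option<&str>,
    limit: usize,
) -> (Vec<String>, Option<String>) {
    let lower: Bound<&str> = match cursor {
        Some(c) => Bound::Excluded(c),
        None => Bound::Unbounded,
    };
    let upper: Bound<&str> = Bound::Unbounded;

    // BTreeSet iterates in lexicographic order, so pages come out sorted.
    let mut rest = paths.range::<str, _>((lower, upper));
    let page: Vec<String> = rest.by_ref().take(limit).cloned().collect();

    // Only hand out a cursor if at least one more path exists.
    let next = if rest.next().is_some() {
        page.last().cloned()
    } else {
        None
    };
    (page, next)
}
```

Using the last returned path as the cursor keeps pagination stable when notes are added or removed between requests, since each page resumes from a position in the sort order rather than from a numeric offset.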
TagIndex::list_all_tags() -> Vec<String>
- Scan tag: prefix, collect unique tag names
- Support pagination via cursor
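A possible shape for the paginated scan, sketched over a `BTreeMap` that stands in for the sorted key space of the engine. The function name `list_tags_page` and its `cursor`/`limit` parameters are hypothetical; the signature above returns a plain `Vec<String>`, so how the cursor is exposed is left open here.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

const PREFIX: &str = "tag:";

/// Lists tag names after `cursor` (a tag name from the previous page),
/// returning at most `limit` names and the cursor for the next page.
fn list_tags_page(
    cf: &BTreeMap<String, String>,
    cursor: Option<&str>,
    limit: usize,
) -> (Vec<String>, Option<String>) {
    // Resume strictly after the cursor, or start at the first "tag:" key.
    let start: Bound<String> = match cursor {
        Some(tag) => Bound::Excluded(format!("{PREFIX}{tag}")),
        None => Bound::Included(PREFIX.to_string()),
    };
    let end: Bound<String> = Bound::Unbounded;

    // Keys are sorted, so the scan can stop at the first key without the prefix.
    let mut tags = cf
        .range::<String, _>((start, end))
        .map_while(|(key, _)| key.strip_prefix(PREFIX))
        .map(String::from);

    let page: Vec<String> = tags.by_ref().take(limit).collect();
    let next = if tags.next().is_some() {
        page.last().cloned()
    } else {
        None
    };
    (page, next)
}
```

Because each tag has exactly one key, the names come out unique without an extra deduplication pass.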
Tag Parsing Rules (from markdown)
- #tag: recognized only at a word boundary
- #tag/subtag: nested tags (store as full path)
- #tag with spaces: not valid (tags cannot contain spaces; see allowed chars below)
- #tag#: trailing hash is not part of the tag
- Ignore tags inside code blocks, inline code, and HTML comments
- Max tag length: 100 chars
- Allowed chars: [a-zA-Z0-9_/-]
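The parsing rules above can be sketched as a small hand-written scanner. Several details are assumptions rather than part of the rules: a word boundary is taken to mean start of line or preceding whitespace, tags longer than the limit are skipped rather than truncated, inline code spans are tracked per line, and a fence is any line that starts with three backticks. The function name `extract_tags` is hypothetical.

```rust
const MAX_TAG_LEN: usize = 100;
// Three backticks, written with escapes so this example nests in a fence.
const FENCE: &str = "\u{60}\u{60}\u{60}";

fn is_tag_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || matches!(c, '_' | '/' | '-')
}

/// Extracts tags in order of appearance, skipping fenced code blocks,
/// inline code, and HTML comments. Duplicates are not removed.
fn extract_tags(markdown: &str) -> Vec<String> {
    let mut tags = Vec::new();
    let (mut in_fence, mut in_comment) = (false, false);

    for line in markdown.lines() {
        if !in_comment && line.trim_start().starts_with(FENCE) {
            in_fence = !in_fence;
            continue;
        }
        if in_fence {
            continue;
        }
        let mut in_code = false; // inline code span, reset on each line
        let mut prev: Option<char> = None;
        let mut i = 0;
        while i < line.len() {
            let rest = &line[i..];
            let c = rest.chars().next().unwrap();
            if in_comment {
                if rest.starts_with("-->") {
                    in_comment = false;
                    prev = Some('>');
                    i += 3;
                    continue;
                }
            } else if in_code {
                if c == '`' {
                    in_code = false;
                }
            } else if rest.starts_with("<!--") {
                in_comment = true;
                i += 4;
                continue;
            } else if c == '`' {
                in_code = true;
            } else if c == '#' && prev.map_or(true, char::is_whitespace) {
                // Take the longest run of allowed chars after the hash.
                let tag: String = rest[1..].chars().take_while(|&ch| is_tag_char(ch)).collect();
                i += 1 + tag.len();
                prev = tag.chars().last().or(Some('#'));
                if !tag.is_empty() && tag.len() <= MAX_TAG_LEN {
                    tags.push(tag);
                }
                continue;
            }
            prev = Some(c);
            i += c.len_utf8();
        }
    }
    tags
}
```

Markdown headings such as `# Heading` yield no tag because the hash is followed by a space, and `a#b` is ignored because the hash is not at a word boundary. The returned list can be passed to `TagIndex::index_tags` as the `tags` argument.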
Acceptance Criteria
- search_by_tag returns correct notes with pagination
- Nested tags (tag/subtag) work as expected
Parent Epic
#275