Janitor
The Janitor is one of four AI-powered tools in Alfred for managing an Obsidian vault. It periodically scans every file in the vault for structural problems and automatically fixes issues like broken wikilinks, missing frontmatter fields, and orphaned files.
Janitor maintains vault health by detecting and repairing structural problems across all vault records. It runs periodic sweeps with a multi-stage pipeline that combines deterministic Python fixes with targeted LLM calls for complex repairs.
Key capabilities:
- Detects broken wikilinks, invalid frontmatter, orphaned files, and stub records
- Applies deterministic fixes for common structural issues
- Uses LLM calls for link disambiguation and stub enrichment
- Supports both light (structural-only) and deep (full agent) sweep modes
- Respects vault scope rules (can edit and delete, but not create new records)
Janitor scans for the following issue types:
| Issue Code | Description |
|---|---|
| FM001 | Missing required frontmatter fields (type, status, name/title) |
| FM002 | Invalid type (not in KNOWN_TYPES) |
| FM003 | Invalid status for the record type |
| FM004 | Wrong field types (string provided where list expected, or vice versa) |
| LK001 | Broken wikilinks (target file doesn't exist in vault) |
| ORPHAN | Files with no incoming wikilinks from other records |
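The two link-related checks (LK001 and ORPHAN) can be sketched in a few lines. This is a minimal illustration, not Alfred's scanner: the function names, the in-memory `records` mapping, and the rule that a link may match either the full path or the bare file name are all assumptions.

```python
import re
from pathlib import PurePosixPath

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures only the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def extract_links(text):
    """Return the link targets found in a record's text."""
    return [target.strip() for target in WIKILINK.findall(text)]


def scan_links(records):
    """records maps a vault-relative path to the file's text.

    Returns a list of (issue_code, path, detail) tuples.
    """
    # Assumption: a link may name the path without extension or just the file stem.
    by_key = {}
    for path in records:
        p = PurePosixPath(path)
        by_key[str(p.with_suffix(""))] = path
        by_key.setdefault(p.stem, path)

    issues = []
    linked = set()
    for path, text in records.items():
        for target in extract_links(text):
            resolved = by_key.get(target)
            if resolved is None:
                issues.append(("LK001", path, target))  # target file does not exist
            elif resolved != path:
                linked.add(resolved)  # self-links do not count as incoming

    for path in records:
        if path not in linked:
            issues.append(("ORPHAN", path, ""))  # no incoming wikilinks
    return issues
```

Calling `scan_links` on a small dictionary of records returns one tuple per problem, which maps directly onto the issue codes in the table.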
Janitor uses a three-stage pipeline to repair issues, progressing from fast deterministic fixes to targeted LLM interventions.
Applies deterministic fixes for frontmatter issues (FM001-FM004) without any LLM calls.
What it fixes:
- Infers record type from directory placement (e.g., a file in `person/` gets `type: person`)
- Infers name/title from filename if missing
- Fixes field type mismatches (converts strings to lists, lists to strings as needed)
- Repairs invalid status values to valid ones for the record type
- Adds missing required fields with sensible defaults
Output: Reports counts of fixed/flagged/skipped issues.
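A rough sketch of what such deterministic fixes could look like is shown below. Everything here is hypothetical: `autofix_frontmatter`, the contents of `KNOWN_TYPES`, and the `LIST_FIELDS` set are placeholders rather than Alfred's real schema, and status repair (FM003) is left out because the valid status values are not listed on this page.

```python
from pathlib import PurePosixPath

# Assumption: illustrative values only; the real schema lives in Alfred itself.
KNOWN_TYPES = {"person", "project", "note"}
LIST_FIELDS = {"tags", "related"}


def autofix_frontmatter(path, fm):
    """Return (fixed_frontmatter, applied) without mutating the input dict.

    applied is a list of (issue_code, field) pairs describing each fix.
    """
    fixed = dict(fm)
    applied = []
    p = PurePosixPath(path)
    directory = p.parts[0] if len(p.parts) > 1 else ""

    # FM001/FM002: infer the record type from the top-level directory.
    current = fixed.get("type")
    type_ok = isinstance(current, str) and current in KNOWN_TYPES
    if not type_ok and directory in KNOWN_TYPES:
        applied.append(("FM001" if current is None else "FM002", "type"))
        fixed["type"] = directory

    # FM001: infer the name from the filename when neither name nor title is set.
    if not fixed.get("name") and not fixed.get("title"):
        fixed["name"] = p.stem
        applied.append(("FM001", "name"))

    # FM004: wrap a bare string in a list where a list is expected.
    for field in sorted(LIST_FIELDS):
        if isinstance(fixed.get(field), str):
            fixed[field] = [fixed[field]]
            applied.append(("FM004", field))

    return fixed, applied
```

Returning the list of applied fixes alongside the new frontmatter makes it straightforward to produce the fixed/flagged/skipped counts mentioned above.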
For each file with broken wikilinks, makes a focused LLM call to disambiguate and repair the links.
Process:
- Provides the file content and the broken link text
- Includes a list of existing vault records as candidate targets
- LLM suggests the correct target for each broken link
- Applies fixes via `alfred vault edit`
Example: A broken link [[John]] might be resolved to [[person/John Smith]] or [[person/John Doe]] based on context.
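How the candidate list is assembled is not described here. One plausible approach, sketched below under that caveat, is to narrow the vault down to records whose file name contains the broken link text before handing the list to the LLM. The function name `candidate_targets` and the substring rule are assumptions.

```python
from pathlib import PurePosixPath


def candidate_targets(broken, record_paths, limit=10):
    """Return link targets (path without extension) whose file name contains the broken text."""
    needle = broken.casefold()
    hits = []
    for path in record_paths:
        p = PurePosixPath(path)
        if needle in p.stem.casefold():
            hits.append(str(p.with_suffix("")))
    # Sort for a stable prompt and cap the list so the LLM call stays focused.
    return sorted(hits)[:limit]
```

For the `[[John]]` example, this would offer `person/John Doe` and `person/John Smith` as candidates and leave the final choice to the model.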
For stub records (files with minimal body content), makes an LLM call to enrich them using existing vault context.
Guidelines:
- Only adds verifiable facts from existing vault records
- Expands relationships using existing wikilinks
- Does NOT generate speculative or filler content
- Preserves the record's original purpose and scope
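What counts as "minimal body content" is not defined on this page. The sketch below assumes a simple word-count threshold on the text after the YAML frontmatter; both `is_stub` and the default of 20 words are hypothetical.

```python
def split_frontmatter(text):
    """Split a record into (frontmatter, body); frontmatter is '' if none is present."""
    if text.startswith("---\n"):
        end = text.find("\n---", 3)
        if end != -1:
            return text[4:end], text[end + 4:].lstrip("\n")
    return "", text


def is_stub(text, min_words=20):
    """Treat a record as a stub when its body has fewer than min_words words."""
    _, body = split_frontmatter(text)
    return len(body.split()) < min_words
```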
Fast scan that runs autofix (Stage 1) only. No LLM calls are made.
Use when:
- You want frequent health checks without API costs
- Frontmatter issues are the main concern
- Agent backend is not configured
Configuration:

    janitor:
      interval: 300  # seconds between light sweeps
      structural_only: true

Runs all three stages, including LLM-powered link repair and stub enrichment.
Use when:
- You have broken links that need disambiguation
- Stub records need enrichment
- Agent backend is configured and available
Configuration:

    janitor:
      interval: 300            # light sweep interval
      deep_interval_hours: 24  # deep sweep interval
      structural_only: false

Janitor configuration lives in the `janitor` section of `config.yaml`:

    janitor:
      # Scan interval for light sweeps (seconds)
      interval: 300
      # Deep sweep interval (hours)
      deep_interval_hours: 24
      # Whether to apply fixes or just report issues
      fix_mode: true
      # Skip LLM stages (Stage 2 & 3), run autofix only
      structural_only: false

Janitor uses the global `agent` section for backend selection:

    agent:
      backend: claude  # or 'openclaw', 'zo'
      claude:
        default_model: claude-opus-4-6
      openclaw:
        agent_id: vault-janitor
        stagger_startup_seconds: 10
      zo:
        api_key: ${ZO_API_KEY}
        model: anthropic/claude-opus-4-6

Run a single structural scan and print a report (no fixes applied):

    alfred janitor scan

Output: Lists all detected issues with file paths and issue codes.
Run a single scan and apply fixes:

    alfred janitor fix

Behavior: Runs the full pipeline (all 3 stages) if an agent is configured, or just autofix (Stage 1) if `structural_only: true`.
Run periodic sweeps as a foreground daemon:

    alfred janitor watch

Behavior: Runs light sweeps every `interval` seconds, and deep sweeps every `deep_interval_hours` hours (if configured).
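The light/deep scheduling described above can be pictured as a small decision function. This is only a sketch under stated assumptions: `next_sweep_kind` is a hypothetical name, and the choice to prefer a deep sweep when both are due is a guess rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone


def next_sweep_kind(now, last_sweep, last_deep_sweep,
                    interval=300, deep_interval_hours=24, structural_only=False):
    """Return 'deep', 'light', or None if nothing is due yet."""
    # Deep sweeps need the LLM stages, so they are skipped in structural-only mode.
    if not structural_only and deep_interval_hours:
        deep_gap = timedelta(hours=deep_interval_hours)
        if last_deep_sweep is None or now - last_deep_sweep >= deep_gap:
            return "deep"
    if last_sweep is None or now - last_sweep >= timedelta(seconds=interval):
        return "light"
    return None


if __name__ == "__main__":
    now = datetime(2026, 2, 23, 10, 32, tzinfo=timezone.utc)
    last = datetime(2026, 2, 23, 10, 30, tzinfo=timezone.utc)
    last_deep = datetime(2026, 2, 22, 8, 0, tzinfo=timezone.utc)
    print(next_sweep_kind(now, last, last_deep))
```

A daemon loop would call a function like this on each tick and sleep when it returns `None`.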
Start Janitor as a background process:

    alfred up --only janitor

Check status:

    alfred status

Stop daemon:

    alfred down

The 3-stage pipeline mode was designed for OpenClaw and works best with its agent architecture.
Setup:
- Register a `vault-janitor` agent in OpenClaw
- Set the agent's workspace to include vault schema files
- Configure `janitor.structural_only: false` to enable all stages
Concurrency: OpenClaw requires concurrency: 1 due to session locking.
Uses a single-call legacy approach: all issues for a sweep are sent to Claude in one agent invocation.
Tradeoffs: Less granular than pipeline mode, but works well for small vaults.
Uses a single-call legacy approach with snapshot/diff fallback for mutation tracking.
Tradeoffs: No per-file pipeline, but good for HTTP-based agent workflows.
Janitor maintains state in data/janitor_state.json:

    {
      "processed_hashes": {},
      "last_sweep": "2026-02-23T10:30:00Z",
      "last_deep_sweep": "2026-02-22T08:00:00Z",
      "sweep_count": 42
    }

Purpose: Tracks sweep history and timing. Can be deleted to force a fresh sweep.
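Because the state file can be deleted at any time, a reader of it needs to tolerate its absence. The sketch below shows one way to do that; `load_state` and `parse_ts` are hypothetical helpers, and the defaults simply mirror the keys in the example above.

```python
import json
from datetime import datetime
from pathlib import Path


def default_state():
    """Fresh state, as if no sweep had ever run."""
    return {
        "processed_hashes": {},
        "last_sweep": None,
        "last_deep_sweep": None,
        "sweep_count": 0,
    }


def load_state(path):
    """Load the state file, falling back to defaults when it is missing."""
    state = default_state()
    p = Path(path)
    if p.exists():
        state.update(json.loads(p.read_text(encoding="utf-8")))
    return state


def parse_ts(value):
    """Parse an ISO 8601 timestamp with a trailing 'Z'; None stays None."""
    if value is None:
        return None
    # Replacing 'Z' keeps this working on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))
```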
- Tool log: `data/janitor.log` (daemon activity, scan results, error messages)
- Audit log: `data/vault_audit.log` (append-only JSONL of every vault mutation)
For CLI backends (Claude, OpenClaw), changes are tracked via session-scoped JSONL files:

    {"op": "edit", "path": "person/John Smith.md", "fields_changed": ["status", "tags"]}
    {"op": "edit", "path": "project/Alpha.md", "fields_changed": ["related"]}

Location: `vault/.mutations/{session-id}.jsonl`
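A session's mutation file can be summarized with a few lines of Python. The sketch assumes only the three keys shown in the example lines (`op`, `path`, `fields_changed`); `summarize_mutations` is a hypothetical helper, not part of Alfred.

```python
import json
from collections import defaultdict


def summarize_mutations(lines):
    """Map each touched path to the sorted list of fields changed in the session."""
    changed = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL file
        entry = json.loads(line)
        changed[entry["path"]].update(entry.get("fields_changed", []))
    return {path: sorted(fields) for path, fields in changed.items()}
```

The function accepts any iterable of lines, so an open file handle for the session's `.jsonl` file can be passed in directly.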
Janitor operates under the janitor scope, which allows:
- Edit: Modify frontmatter and body content
- Delete: Remove orphaned or invalid files
- Move: Rename files (via Obsidian CLI if available)
Restricted:
- Create: Cannot create new records (use Curator for that)
See src/alfred/vault/scope.py for full scope definitions.
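Conceptually, a scope is an allow-list of operations. The snippet below is a simplified sketch of that idea and does not reproduce `scope.py`; the `SCOPES` table, `ScopeError`, and `check_scope` are hypothetical names used only for illustration.

```python
# Assumption: only the janitor scope from this page is modeled here.
SCOPES = {
    "janitor": frozenset({"edit", "delete", "move"}),
}


class ScopeError(PermissionError):
    """Raised when a tool attempts an operation outside its scope."""


def check_scope(scope, op):
    """Raise ScopeError unless the scope permits the operation."""
    allowed = SCOPES.get(scope, frozenset())
    if op not in allowed:
        raise ScopeError(f"scope {scope!r} does not permit {op!r}")
```

With this table, `check_scope("janitor", "edit")` passes silently, while `check_scope("janitor", "create")` raises.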
Run a light sweep every 5 minutes, deep sweep once per day:

    janitor:
      interval: 300
      deep_interval_hours: 24
      fix_mode: true
      structural_only: false

Start Janitor as a background process:

    alfred up --only janitor

Fast, free, frequent scans with no API costs:

    janitor:
      interval: 60
      structural_only: true
      fix_mode: true

Start Janitor as a background process:

    alfred up --only janitor

Scan the vault once and apply fixes interactively:

    # Review issues first
    alfred janitor scan
    # Apply fixes
    alfred janitor fix

Check:
- Ensure records are in correct directories (type must match directory)
- Verify `KNOWN_TYPES` includes the record types in your vault
- Check `data/janitor.log` for scan errors
Fix:
- Stage 2 relies on LLM understanding of context
- Ensure the agent has access to `vault/CLAUDE.md` (schema documentation)
- For OpenClaw, verify the workspace includes vault schema files
Explanation: ORPHAN detection only checks for incoming wikilinks. Hub files (dashboards, indexes) may legitimately have no incoming links.
Solution: None yet. Excluding specific files or directories from orphan checks is not yet implemented.
Check:
- Verify `structural_only: false` in config
- Ensure agent backend is configured
- Check the `last_deep_sweep` timestamp in `data/janitor_state.json`
- Review `data/janitor.log` for agent errors
- Curator: Processes inbox files into structured vault records
- Distiller: Extracts latent knowledge from operational records
- Surveyor: Discovers semantic relationships via embeddings and clustering
See the main Alfred documentation for architecture and setup guides.