Janitor

The Janitor is one of four AI-powered tools in Alfred for managing an Obsidian vault. It periodically scans every file in the vault for structural problems and automatically fixes issues like broken wikilinks, missing frontmatter fields, and orphaned files.

Overview

Janitor maintains vault health by detecting and repairing structural problems across all vault records. It runs periodic sweeps with a multi-stage pipeline that combines deterministic Python fixes with targeted LLM calls for complex repairs.

Key capabilities:

Detects broken wikilinks, invalid frontmatter, orphaned files, and stub records
Applies deterministic fixes for common structural issues
Uses LLM calls for link disambiguation and stub enrichment
Supports both light (structural-only) and deep (full agent) sweep modes
Respects vault scope rules (can edit, delete, and move records, but not create new ones)

Issue Detection

Janitor scans for the following issue types:

Issue Code	Description
FM001	Missing required frontmatter fields (type, status, name/title)
FM002	Invalid type (not in KNOWN_TYPES)
FM003	Invalid status for the record type
FM004	Wrong field types (string provided where list expected, or vice versa)
LK001	Broken wikilinks (target file doesn't exist in vault)
ORPHAN	Files with no incoming wikilinks from other records
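
The frontmatter checks in the table can be sketched as a single function over a parsed frontmatter dict. This is a minimal illustration, not Janitor's actual scanner: the contents of KNOWN_TYPES, VALID_STATUSES, LIST_FIELDS, and STRING_FIELDS are placeholder assumptions, and check_frontmatter is a hypothetical name.

```python
# Placeholder schema values (assumptions); the real lists live in Alfred's source.
KNOWN_TYPES = {"person", "project"}
VALID_STATUSES = {"person": {"active", "inactive"}, "project": {"active", "done"}}
LIST_FIELDS = ("tags", "related")            # fields assumed to hold lists
STRING_FIELDS = ("type", "status", "name")   # fields assumed to hold strings


def check_frontmatter(fm: dict) -> list[str]:
    """Return the FM issue codes that apply to one record's frontmatter."""
    issues = []

    # FM001: type, status, and a name or title must all be present.
    has_name = "name" in fm or "title" in fm
    if "type" not in fm or "status" not in fm or not has_name:
        issues.append("FM001")

    # FM002: the type must be one of the known record types.
    rec_type = fm.get("type")
    if isinstance(rec_type, str) and rec_type not in KNOWN_TYPES:
        issues.append("FM002")

    # FM003: the status must be valid for this record type.
    status = fm.get("status")
    if (isinstance(rec_type, str) and rec_type in VALID_STATUSES
            and isinstance(status, str)
            and status not in VALID_STATUSES[rec_type]):
        issues.append("FM003")

    # FM004: a string where a list is expected, or the other way round.
    wrong_list = any(f in fm and not isinstance(fm[f], list) for f in LIST_FIELDS)
    wrong_str = any(f in fm and not isinstance(fm[f], str) for f in STRING_FIELDS)
    if wrong_list or wrong_str:
        issues.append("FM004")

    return issues
```

A record can carry several codes at once; for example, a person record with an unknown status, no name, and a string-valued tags field would report FM001, FM003, and FM004 under these placeholder rules.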

3-Stage Fix Pipeline

Janitor uses a three-stage pipeline to repair issues, progressing from fast deterministic fixes to targeted LLM interventions.

Stage 1: Autofix (Pure Python)

Applies deterministic fixes for frontmatter issues (FM001-FM004) without any LLM calls.

What it fixes:

Infers record type from directory placement (e.g., file in person/ gets type: person)
Infers name/title from filename if missing
Fixes field type mismatches (converts strings to lists, lists to strings as needed)
Replaces invalid status values with a valid one for the record type
Adds missing required fields with sensible defaults

Output: Reports counts of fixed/flagged/skipped issues.
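
As a rough sketch of what a deterministic autofix pass might look like, the function below infers type from the top-level directory, infers name from the filename, and wraps string values in list-valued fields. The names autofix, KNOWN_TYPES, and LIST_FIELDS are assumptions for illustration; the real Stage 1 logic may differ, for example in which defaults it picks for missing fields.

```python
from pathlib import PurePosixPath

# Placeholder schema values (assumptions), not Alfred's real lists.
KNOWN_TYPES = {"person", "project"}
LIST_FIELDS = ("tags", "related")


def autofix(rel_path: str, fm: dict) -> tuple[dict, list[str]]:
    """Return a repaired copy of the frontmatter and the names of fixed fields."""
    fixed = dict(fm)
    applied = []
    path = PurePosixPath(rel_path)

    # Infer the record type from the top-level directory, e.g. person/ -> person.
    if "type" not in fixed and len(path.parts) > 1 and path.parts[0] in KNOWN_TYPES:
        fixed["type"] = path.parts[0]
        applied.append("type")

    # Infer the name from the filename (without the .md extension).
    if "name" not in fixed and "title" not in fixed:
        fixed["name"] = path.stem
        applied.append("name")

    # Convert a bare string into a one-element list where a list is expected.
    for field in LIST_FIELDS:
        if isinstance(fixed.get(field), str):
            fixed[field] = [fixed[field]]
            applied.append(field)

    return fixed, applied


fixed, applied = autofix("person/John Smith.md", {"status": "active", "tags": "friend"})
print(applied)  # ['type', 'name', 'tags']
```

Returning a copy plus a list of applied fixes keeps the function free of side effects, which makes it straightforward to produce the fixed/flagged/skipped counts from the results.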

Stage 2: Link Repair (LLM, per-file)

For each file with broken wikilinks, makes a focused LLM call to disambiguate and repair the links.

Process:

Provides the file content and the broken link text
Includes a list of existing vault records as candidate targets
LLM suggests the correct target for each broken link
Applies fixes via alfred vault edit

Example: A broken link [[John]] might be resolved to [[person/John Smith]] or [[person/John Doe]] based on context.
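
The inputs to that LLM call could be gathered roughly as follows. This sketch assumes links resolve by full record path without the .md extension, which is a simplification, and the helper names broken_links and candidates are hypothetical. It only prepares the broken links and a candidate list; choosing between candidates is left to the model.

```python
import re

# Matches [[target]], [[target|alias]], and [[target#heading]]; captures the target.
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def broken_links(body: str, existing: set[str]) -> list[str]:
    """Return wikilink targets in the body that match no existing record."""
    targets = [m.group(1).strip() for m in WIKILINK_RE.finditer(body)]
    return [t for t in targets if t not in existing]


def candidates(link: str, existing: set[str]) -> list[str]:
    """Return existing records whose path contains the broken link text."""
    needle = link.lower()
    return sorted(r for r in existing if needle in r.lower())


records = {"person/John Smith", "person/John Doe", "project/Alpha"}
body = "Met [[John]] to discuss [[project/Alpha|Alpha]]."

for link in broken_links(body, records):
    print(link, "->", candidates(link, records))
# John -> ['person/John Doe', 'person/John Smith']
```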

Stage 3: Stub Enrichment (LLM, per-file)

For stub records (files with minimal body content), makes an LLM call to enrich them using existing vault context.

Guidelines:

Only adds verifiable facts from existing vault records
Expands relationships using existing wikilinks
Does NOT generate speculative or filler content
Preserves the record's original purpose and scope
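
The page does not say how "minimal body content" is measured. One plausible heuristic, shown here purely as an assumption, is a word-count threshold on the text after the frontmatter block. The names split_frontmatter and is_stub and the 20-word default are illustrative only.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a note into (frontmatter_text, body), assuming '---' delimiters."""
    if text.startswith("---\n"):
        end = text.find("\n---", 3)
        if end != -1:
            return text[4:end], text[end + 4:].lstrip("\n")
    return "", text


def is_stub(text: str, min_words: int = 20) -> bool:
    """Treat a record as a stub when its body has fewer than min_words words."""
    _, body = split_frontmatter(text)
    return len(body.split()) < min_words


note = "---\ntype: person\nname: John Smith\n---\n\nTODO\n"
print(is_stub(note))  # True
```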

Sweep Modes

Light Sweep (Structural-Only)

Fast scan that runs autofix (Stage 1) only. No LLM calls are made.

Use when:

You want frequent health checks without API costs
Frontmatter issues are the main concern
Agent backend is not configured

Configuration:

janitor:
  interval: 300  # seconds between light sweeps
  structural_only: true

Deep Sweep (Full Pipeline)

Runs all three stages, including LLM-powered link repair and stub enrichment.

Use when:

You have broken links that need disambiguation
Stub records need enrichment
Agent backend is configured and available

Configuration:

janitor:
  interval: 300  # light sweep interval
  deep_interval_hours: 24  # deep sweep interval
  structural_only: false

Configuration

Janitor configuration lives in the janitor section of config.yaml:

janitor:
  # Scan interval for light sweeps (seconds)
  interval: 300

  # Deep sweep interval (hours)
  deep_interval_hours: 24

  # Whether to apply fixes or just report issues
  fix_mode: true

  # Skip LLM stages (Stage 2 & 3), run autofix only
  structural_only: false
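
How these flags interact can be summarized in a few lines. This is a sketch of the behavior described on this page, not the actual implementation; stages_to_run is a hypothetical name, and falling back to Stage 1 when no agent backend is configured is an assumption.

```python
def stages_to_run(fix_mode: bool, structural_only: bool, agent_configured: bool) -> list[int]:
    """Return the pipeline stages a fixing sweep would run for the given settings."""
    if not fix_mode:
        return []            # report issues only, apply nothing
    if structural_only or not agent_configured:
        return [1]           # autofix only, no LLM calls (fallback is assumed)
    return [1, 2, 3]         # autofix, link repair, stub enrichment


print(stages_to_run(fix_mode=True, structural_only=False, agent_configured=True))
# [1, 2, 3]
```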

Global Agent Configuration

Janitor uses the global agent section for backend selection:

agent:
  backend: claude  # or 'openclaw', 'zo'

  claude:
    default_model: claude-opus-4-6

  openclaw:
    agent_id: vault-janitor
    stagger_startup_seconds: 10

  zo:
    api_key: ${ZO_API_KEY}
    model: anthropic/claude-opus-4-6

CLI Commands

One-Shot Scan

Run a single structural scan and print a report (no fixes applied):

alfred janitor scan

Output: Lists all detected issues with file paths and issue codes.

One-Shot Fix

Run a single scan and apply fixes:

alfred janitor fix

Behavior: Runs the full pipeline (all three stages) if an agent backend is configured, or autofix only (Stage 1) if structural_only: true.

Watch Daemon

Run periodic sweeps as a foreground daemon:

alfred janitor watch

Behavior: Runs a light sweep every interval seconds and a deep sweep every deep_interval_hours hours (if configured).
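
The choice between a light and a deep sweep on each tick can be sketched as below. The function name next_sweep_kind and the rule that a missing last-deep-sweep timestamp triggers a deep sweep are assumptions; the daemon's real scheduling logic may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_sweep_kind(now: datetime,
                    last_deep: Optional[datetime],
                    deep_interval_hours: Optional[float],
                    structural_only: bool) -> str:
    """Return 'deep' when a deep sweep is due, otherwise 'light'."""
    if structural_only or deep_interval_hours is None:
        return "light"
    if last_deep is None:
        return "deep"  # assumed: no recorded deep sweep means one is due
    if now - last_deep >= timedelta(hours=deep_interval_hours):
        return "deep"
    return "light"


now = datetime(2026, 2, 23, 10, 30, tzinfo=timezone.utc)
last_deep = datetime(2026, 2, 22, 8, 0, tzinfo=timezone.utc)
print(next_sweep_kind(now, last_deep, 24, structural_only=False))  # deep
```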

Background Daemon

Start Janitor as a background process:

alfred up --only janitor

Check status:

alfred status

Stop daemon:

alfred down

Backend Support

OpenClaw (Recommended for Pipeline Mode)

The 3-stage pipeline mode was designed for OpenClaw and works best with its agent architecture.

Setup:

Register a vault-janitor agent in OpenClaw
Set the agent's workspace to include vault schema files
Configure janitor.structural_only: false to enable all stages

Concurrency: OpenClaw requires concurrency: 1 due to session locking.

Claude Code

Uses a single-call legacy approach: all issues for a sweep are sent to Claude in one agent invocation.

Tradeoffs: Less granular than pipeline mode, but works well for small vaults.

Zo Computer

Uses a single-call legacy approach with snapshot/diff fallback for mutation tracking.

Tradeoffs: No per-file pipeline, but good for HTTP-based agent workflows.

State & Logging

State File

Janitor maintains state in data/janitor_state.json:

{
  "processed_hashes": {},
  "last_sweep": "2026-02-23T10:30:00Z",
  "last_deep_sweep": "2026-02-22T08:00:00Z",
  "sweep_count": 42
}

Purpose: Tracks sweep history and timing. Can be deleted to force a fresh sweep.
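
For scripting against the state file, the timestamps can be parsed with the standard library. The sketch below works on the example content shown above and assumes the field names and the trailing-Z UTC format stay as shown; load_state is a hypothetical helper.

```python
import json
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def load_state(text: str) -> dict:
    """Load janitor state JSON and convert the sweep timestamps to datetimes."""
    state = json.loads(text)
    for key in ("last_sweep", "last_deep_sweep"):
        if state.get(key):
            state[key] = parse_ts(state[key])
    return state


example = """{
  "processed_hashes": {},
  "last_sweep": "2026-02-23T10:30:00Z",
  "last_deep_sweep": "2026-02-22T08:00:00Z",
  "sweep_count": 42
}"""

state = load_state(example)
gap = state["last_sweep"] - state["last_deep_sweep"]
print(gap.total_seconds() / 3600)  # 26.5
```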

Log Files

Tool log: data/janitor.log — daemon activity, scan results, error messages
Audit log: data/vault_audit.log — append-only JSONL of every vault mutation

Mutation Tracking

For CLI backends (Claude, OpenClaw), changes are tracked via session-scoped JSONL files:

{"op": "edit", "path": "person/John Smith.md", "fields_changed": ["status", "tags"]}
{"op": "edit", "path": "project/Alpha.md", "fields_changed": ["related"]}

Location: vault/.mutations/{session-id}.jsonl
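
A session file in this format can be summarized per record with a few lines of Python. The function name summarize_mutations is hypothetical, and the sketch assumes every entry has a path and an optional fields_changed list, as in the example lines above.

```python
import json
from collections import defaultdict


def summarize_mutations(lines) -> dict[str, list[str]]:
    """Map each mutated path to the sorted set of fields changed in the session."""
    changed = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL file
        entry = json.loads(line)
        changed[entry["path"]].update(entry.get("fields_changed", []))
    return {path: sorted(fields) for path, fields in changed.items()}


session = [
    '{"op": "edit", "path": "person/John Smith.md", "fields_changed": ["status", "tags"]}',
    '{"op": "edit", "path": "project/Alpha.md", "fields_changed": ["related"]}',
]
print(summarize_mutations(session))
# {'person/John Smith.md': ['status', 'tags'], 'project/Alpha.md': ['related']}
```

Because the function accepts any iterable of lines, an open file handle for a session's .jsonl file can be passed in directly.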

Vault Scope Rules

Janitor operates under the janitor scope, which allows:

Edit: Modify frontmatter and body content
Delete: Remove orphaned or invalid files
Move: Rename files (via Obsidian CLI if available)

Restricted:

Create: Cannot create new records (use Curator for that)

See src/alfred/vault/scope.py for full scope definitions.
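
Conceptually, a scope is a table of allowed operations that is checked before each vault mutation. The sketch below illustrates that idea only; it is not the contents of scope.py, and JANITOR_SCOPE and require_permission are hypothetical names.

```python
# Illustrative permission table for the janitor scope, as described above.
JANITOR_SCOPE = {"edit": True, "delete": True, "move": True, "create": False}


def require_permission(operation: str) -> None:
    """Raise PermissionError unless the janitor scope allows the operation."""
    if not JANITOR_SCOPE.get(operation, False):
        raise PermissionError(f"janitor scope does not allow '{operation}'")


require_permission("edit")        # allowed, returns silently
try:
    require_permission("create")  # denied
except PermissionError as exc:
    print(exc)
```

Unknown operations are denied by default here, which is the safer choice for a tool that can delete files.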

Common Workflows

Daily Health Check

Run a light sweep every 5 minutes, deep sweep once per day:

janitor:
  interval: 300
  deep_interval_hours: 24
  fix_mode: true
  structural_only: false

alfred up --only janitor

Structural-Only Mode (No LLM)

Fast, free, frequent scans with no API costs:

janitor:
  interval: 60
  structural_only: true
  fix_mode: true

alfred up --only janitor

Manual Fix Run

Review the detected issues first, then apply fixes in a separate one-shot run:

# Review issues first
alfred janitor scan

# Apply fixes
alfred janitor fix

Troubleshooting

"No issues detected" but vault has problems

Check:

Ensure records are in correct directories (type must match directory)
Verify KNOWN_TYPES includes the record types in your vault
Check data/janitor.log for scan errors

Link repairs are incorrect

Check:

Stage 2 relies on the LLM's understanding of context, so repairs are only as good as the context the agent can see
Ensure the agent has access to vault/CLAUDE.md (schema documentation)
For OpenClaw, verify the workspace includes vault schema files

Orphan detection flags valid files

Explanation: ORPHAN detection only checks for incoming wikilinks. Hub files (dashboards, indexes) may legitimately have no incoming links.

Solution: None yet. Excluding specific files or directories from orphan checks is not yet implemented, so ORPHAN flags on hub files need to be reviewed manually.
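
To see why hub files get flagged, consider a minimal orphan check over a link graph. This sketch assumes the graph is a dict mapping each record to the set of records it links to; find_orphans is a hypothetical name, not Janitor's actual function.

```python
def find_orphans(links: dict[str, set[str]]) -> list[str]:
    """Return records that no other record links to."""
    linked = set()
    for source, targets in links.items():
        # Self-links do not count: only links from other records matter.
        linked |= {t for t in targets if t != source}
    return sorted(record for record in links if record not in linked)


graph = {
    "index": {"person/John Smith", "project/Alpha"},  # hub: links out, nothing links in
    "person/John Smith": {"project/Alpha"},
    "project/Alpha": set(),
}
print(find_orphans(graph))  # ['index']
```

The index record links to everything but is linked from nowhere, so a check based purely on incoming links reports it as an orphan.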

Deep sweeps not running

Check:

Verify structural_only: false in config
Ensure agent backend is configured
Check last_deep_sweep timestamp in data/janitor_state.json
Review data/janitor.log for agent errors

Related Tools

Curator: Processes inbox files into structured vault records
Distiller: Extracts latent knowledge from operational records
Surveyor: Discovers semantic relationships via embeddings and clustering

See the main Alfred documentation for architecture and setup guides.
