Watcher as standalone change-detection service #185

@gregoryfoster

Summary

Decouple Watcher from the Archiver SDK at runtime by storing all pipeline-critical state (URL, extraction specs, change history) locally. Archiver becomes a client of Watcher's API (control plane) rather than a runtime dependency Watcher calls each pipeline cycle. Non-Archiver clients can define WatchedItems with a URL and source specs directly. Closes #184 as resolved-by-architecture.

Design doc

docs/plans/2026-06-07-watcher-standalone-service-design.md

Scope

Phase A — Pipeline decoupling (ships first)

  • effective_url + source_specs + archiver_info_source_id added to WatchedItem; info_item_id → nullable
  • health_status / last_checked_at / last_changed_at moved from Watch → WatchedItem
  • Watch drops target_info_source_id, info_item_id, effective_url
  • New change_revisions table (Watcher-local change history keyed by watched_item_id)
  • pending_source_revisions → pending_archiver_sync (rekeyed, Archiver-backed items only)
  • last_known_revisions table dropped (replaced by change_revisions query)
  • fetch_info_item_bindings / InfoItemBindings / InfoSourceProto deleted
  • Pipeline reads URL + specs from WatchedItem; no Archiver SDK call at runtime
  • Drain worker rekeyed to change_revision_id; _resolve_sub_aspect_watch deleted
  • New GET /api/v1/watched-items/{id}/revisions endpoint
  • POST /watched-items accepts {url, source_specs} directly; info_item_id optional
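
The Phase A data-model changes could be sketched roughly as below. This is an illustration under stated assumptions, not the schema from the design doc: the field types, the `content_hash` and `detected_at` columns, the `"unknown"` default for `health_status`, and the `latest_revision` helper are all hypothetical. Only the table and field names listed above come from this issue. The helper shows how a query over change_revisions might stand in for the dropped last_known_revisions table.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class WatchedItem:
    """Pipeline-critical state held locally by Watcher (types are assumed)."""

    id: int
    effective_url: str
    source_specs: list[dict] = field(default_factory=list)
    archiver_info_source_id: Optional[int] = None  # set for Archiver-backed items
    info_item_id: Optional[int] = None  # nullable after Phase A
    health_status: str = "unknown"  # assumed default; moved here from Watch
    last_checked_at: Optional[datetime] = None
    last_changed_at: Optional[datetime] = None


@dataclass
class ChangeRevision:
    """One row of the Watcher-local change history."""

    id: int
    watched_item_id: int
    content_hash: str  # hypothetical column
    detected_at: datetime  # hypothetical column


def latest_revision(
    revisions: Iterable[ChangeRevision], watched_item_id: int
) -> Optional[ChangeRevision]:
    """Return the newest revision for an item, or None if it has no history."""
    candidates = [r for r in revisions if r.watched_item_id == watched_item_id]
    return max(candidates, key=lambda r: r.detected_at, default=None)
```

In a real database this lookup would presumably be an ordered query on change_revisions filtered by watched_item_id; the in-memory version only shows the intent.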

Phase B — Control plane inversion

  • Auth mechanism for Archiver to call Watcher's API as a trusted caller
  • Dashboard stripped of InfoItem-coupled routes (picker, binding tree)
  • POST /watched-items becomes Archiver's primary WatchedItem creation surface
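
As a rough sketch of what the POST /watched-items creation surface might validate once `info_item_id` is optional: the function name `validate_create_payload`, the error messages, the rule that `source_specs` must be a list, and the mapping of `url` onto `effective_url` are assumptions for illustration, not the actual handler. The auth mechanism is deliberately left out because this issue does not define it.

```python
from typing import Any
from urllib.parse import urlsplit


def validate_create_payload(payload: dict[str, Any]) -> dict[str, Any]:
    """Check a hypothetical POST /watched-items body and normalize it."""
    url = payload.get("url")
    if not isinstance(url, str):
        raise ValueError("url is required and must be a string")

    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("url must be an absolute http(s) URL")

    specs = payload.get("source_specs")
    if not isinstance(specs, list):
        raise ValueError("source_specs is required and must be a list")

    # info_item_id is optional: non-Archiver clients simply omit it.
    info_item_id = payload.get("info_item_id")
    if info_item_id is not None and not isinstance(info_item_id, int):
        raise ValueError("info_item_id must be an integer when present")

    return {
        "effective_url": url,  # assumed mapping onto the WatchedItem field
        "source_specs": specs,
        "info_item_id": info_item_id,
    }
```

A non-Archiver client would then send only `url` and `source_specs`, and the resulting record carries `info_item_id` as null.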

Out of scope

  • Phase C (Redis state sync) — Archiver will PATCH Watcher directly on URL/spec changes
  • Selector variance detection beyond zero-content fallback
  • Per-Watch spec overrides
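
The direct PATCH that replaces Phase C could look something like this from the Archiver side. It is a sketch only: the path `/api/v1/watched-items/{id}` is inferred from the revisions endpoint above, and the body field names, the `build_patch_request` helper, and the example host are hypothetical. No auth header is shown because the Phase B mechanism is not yet specified.

```python
import json
from typing import Optional
from urllib.request import Request


def build_patch_request(
    base_url: str,
    watched_item_id: int,
    *,
    url: Optional[str] = None,
    source_specs: Optional[list] = None,
) -> Request:
    """Build (but do not send) a PATCH carrying only the changed fields."""
    body = {}
    if url is not None:
        body["url"] = url
    if source_specs is not None:
        body["source_specs"] = source_specs
    if not body:
        raise ValueError("nothing to update")

    return Request(
        f"{base_url.rstrip('/')}/api/v1/watched-items/{watched_item_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; building the request separately keeps the sketch testable without a running Watcher.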
