Extract Information service to standalone Archiver repo #149

@gregoryfoster

Summary

Extract the in-tree Information service (src/information/) into its own standalone repo at /home/exedev/archiver, renamed to Archiver. Follows the Notifier extraction pattern (#132): separate repo on the same VM, separate systemd unit, separate Postgres database, generated Python SDK consumed as a path dependency.

This is the rename + lift; data-model evolution toward InfoSource / SourceSpec / SourceRevision (per docs/research/2026-05-06-archiver-information-model.md) lands as a deliberate v2 in the new repo afterward.

Design doc

docs/plans/2026-05-06-archiver-extraction-design.md

Scope

  • Pure rename + extract; storage tier deferred to a future Replicator service.
  • Service-name-only rename: InfoItem, info_item_id, InfoSpec, info_spec_id, /info-items/, info.changes, watcher columns — all unchanged. Only the service binary, repo path, systemd unit, port-8020 binding, and SDK package name (archiver_client) change.
  • Notifier-style mirror for shared content-acquisition code (fetchers/, extractors/, simhash, extraction_defaults, logging). Discipline-based sync; revisit if drift bites.
  • Fetcher microservice deferred indefinitely; per-service local fetcher boundary preserved for future swap.
  • Separate Postgres database for archiver (archiver / archiver_test), not just separate schema. Watcher loses direct DB access to InfoItem/InfoSpec rows; SDK is the only path.
  • Big-bang migration (pre-production state, no dual-stack window).
  • Pre-extraction cleanup: relocate extraction_config from src/information/core/tools/ to src/core/extraction_defaults.py in each repo (fixes existing reverse-import from src/workers/pipeline.py).
  • Watcher INFORMATION_* env vars renamed to ARCHIVER_*. Hard cut; no transitional period.
  • Future model evolution and Replicator stand-up are explicitly out of scope here.
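One way to enforce the hard cut on the INFORMATION_* to ARCHIVER_* rename is to refuse to start while any legacy variable is still set. The following is a minimal sketch, not the project's actual config loader: the function name `load_archiver_settings` and the example variable `ARCHIVER_BASE_URL` are hypothetical, since the issue names only the prefixes.

```python
import os
from typing import Mapping

LEGACY_PREFIX = "INFORMATION_"
NEW_PREFIX = "ARCHIVER_"


def load_archiver_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect ARCHIVER_* settings; fail if any INFORMATION_* variable remains.

    Hypothetical helper: the real variable suffixes are not listed in the issue.
    """
    legacy = sorted(k for k in env if k.startswith(LEGACY_PREFIX))
    if legacy:
        # Hard cut: no fallback to the old names, so fail loudly instead.
        raise RuntimeError(
            "Legacy variables must be renamed to ARCHIVER_*: " + ", ".join(legacy)
        )
    # Strip the prefix and lowercase the rest, e.g. ARCHIVER_BASE_URL -> base_url.
    return {
        k[len(NEW_PREFIX):].lower(): v
        for k, v in env.items()
        if k.startswith(NEW_PREFIX)
    }
```

Calling this at Watcher startup would surface a stale systemd unit or `.env` file immediately, rather than letting the old names be silently ignored.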
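If the discipline-based sync of the mirrored content-acquisition code ever needs a safety net, a small drift check could compare file hashes between the two repos. This is only a sketch under assumptions: the helper names are hypothetical, and the mirrored directories are assumed to sit at the same relative path under whatever root is passed in for each repo.

```python
import hashlib
from pathlib import Path

# Directories named in the scope; their location within each repo is an assumption.
MIRRORED = ["fetchers", "extractors"]


def tree_digests(root: Path, subdirs: list[str]) -> dict[str, str]:
    """Map each mirrored .py file (path relative to root) to its SHA-256 digest."""
    digests: dict[str, str] = {}
    for sub in subdirs:
        base = root / sub
        if not base.is_dir():
            continue  # a missing directory shows up as drift via the other repo
        for path in sorted(base.rglob("*.py")):
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def find_drift(repo_a: Path, repo_b: Path, subdirs: list[str] = MIRRORED) -> list[str]:
    """Return relative paths that differ between the repos or exist in only one."""
    a = tree_digests(repo_a, subdirs)
    b = tree_digests(repo_b, subdirs)
    return sorted(rel for rel in a.keys() | b.keys() if a.get(rel) != b.get(rel))
```

An empty list from `find_drift` means the mirrored files are byte-identical. Any returned path is a candidate for manual reconciliation.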

Related
