Summary
Extract the in-tree Information service (src/information/) into its own standalone repo at /home/exedev/archiver, renamed to Archiver. Follows the Notifier extraction pattern (#132): separate repo on the same VM, separate systemd unit, separate Postgres database, generated Python SDK consumed as a path dependency.
This is the rename + lift; data-model evolution toward InfoSource / SourceSpec / SourceRevision (per docs/research/2026-05-06-archiver-information-model.md) lands as a deliberate v2 in the new repo afterward.
Design doc
docs/plans/2026-05-06-archiver-extraction-design.md
Scope
- Pure rename + extract; storage tier deferred to a future Replicator service.
- Service-name-only rename: InfoItem, info_item_id, InfoSpec, info_spec_id, /info-items/, info.changes, and the watcher columns are all unchanged. Only the service binary, repo path, systemd unit, port-8020 binding, and SDK package name (archiver_client) change.
- Notifier-style mirror for shared content-acquisition code (fetchers/, extractors/, simhash, extraction_defaults, logging). Discipline-based sync; revisit if drift bites.
- Fetcher microservice deferred indefinitely; per-service local fetcher boundary preserved for future swap.
- Separate Postgres database for archiver (archiver / archiver_test), not just a separate schema. Watcher loses direct DB access to InfoItem/InfoSpec rows; the SDK is the only path.
- Big-bang migration (pre-production state, no dual-stack window).
- Pre-extraction cleanup: relocate extraction_config from src/information/core/tools/ to src/core/extraction_defaults.py in each repo (fixes existing reverse-import from src/workers/pipeline.py).
- Watcher INFORMATION_* env vars renamed to ARCHIVER_*. Hard cut; no transitional period.
- Future model evolution and Replicator stand-up are explicitly out of scope here.
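To make the env-var hard cut concrete, here is a minimal sketch of a startup guard the watcher could run. It is an illustration under assumptions, not code from this PR: the function name load_archiver_env and the example variable ARCHIVER_BASE_URL are hypothetical, and the real settings loader may look different. The only behavior taken from the scope above is that INFORMATION_* names get no fallback.

```python
import os
from typing import Mapping

LEGACY_PREFIX = "INFORMATION_"  # old names, no longer honored
NEW_PREFIX = "ARCHIVER_"        # new names introduced by this rename


def load_archiver_env(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect ARCHIVER_* settings; refuse to start if legacy names remain.

    Hypothetical helper: the name and return shape are assumptions.
    """
    legacy = sorted(k for k in environ if k.startswith(LEGACY_PREFIX))
    if legacy:
        # Hard cut: fail loudly instead of silently reading the old names.
        raise RuntimeError(
            "legacy env vars still set, rename to ARCHIVER_*: " + ", ".join(legacy)
        )
    # Strip the prefix and lowercase the rest, e.g. ARCHIVER_BASE_URL -> base_url.
    return {
        key[len(NEW_PREFIX):].lower(): value
        for key, value in environ.items()
        if key.startswith(NEW_PREFIX)
    }
```

Calling load_archiver_env({"ARCHIVER_BASE_URL": "http://archiver.example"}) returns {"base_url": "http://archiver.example"}, while any leftover INFORMATION_* key raises RuntimeError naming the offending variables. Failing at startup rather than falling back keeps a stale deployment config from going unnoticed.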
Related
- docs/plans/2026-05-03-information-source-specifications-design.md: original Information service design (whose "Later: extract" step this implements)
- docs/research/2026-05-06-archiver-information-model.md: Archiver future-state research (carries into the new repo)