Phase 12B — SQLite → PostgreSQL 16 migration (big-bang) by proofoftrust21 · Pull Request #13 · proofoftrust21/satrank

proofoftrust21 · 2026-04-21T18:34:20Z

Summary

Migrates SatRank's backing store from better-sqlite3 to PostgreSQL 16 on a dedicated cpx42 VM in Hetzner nbg1. Big-bang cut-over (no ETL, no dual-write) — authorised by Romain given the 0-user baseline and simpler failure model.

Downtime: ~32 min (2026-04-21 ~18:15 → ~18:47 UTC).
Data loss: none on agents / scoring core. One data-population gap on service_endpoints.category — tracked as Phase 12C OPS issue.
LND: untouched throughout. No channel op, no macaroon churn.
Tests: 110 failed → 0 failed. 1 044 passing post-B6. Zones critiques (bayesian, verdict, security, scoring, decide, intent, probe, nostr) all at 0.

Full report: docs/PHASE-12B-MIGRATION-REPORT-2026-04-21.md.

What changed

B0 — Code audit + Phase 12A cleanup

docs/phase-12b/CODE-AUDIT.md, TEST-BASELINE.md, CRAWLER-RACE-CHECK.md.

B1+B2 — Infra

Provisioned satrank-postgres cpx42 (8 vCPU / 16 GB / 240 GB) in nbg1.
PG16 container tuned for the box (shared_buffers=4GB, effective_cache_size=12GB, WAL tuned).

B3 — Schema + code port

Single idempotent postgres-schema.sql at consolidated v41 (replaces v29 + phase 7-9 migrations).
src/database/connection.ts : pg.Pool singleton per role (api max=30, crawler max=20), BIGINT/NUMERIC parsers, idle-client error handler.
14 repositories ported sync→async, ?→$n placeholders.
22 services + controllers + middleware + utils propagated to await.
Test harness ported (1041 passing at B3.d).

B4 — Seed bootstrap

src/scripts/seedBootstrap.ts : idempotent deposit-tier seed, now with --dry-run flag.

B5 — Cut-over

Runbook in docs/phase-12b/B5-CUTOVER-CHECKLIST.md.
Rollback path was SQLite dump snapshot + previous container image (unused — cut-over succeeded on first attempt).

B6 — Quick wins

B6.1 Warmup probe : src/warmup.ts primes pg pool + JIT + planner cache on the cold /api/intent path. Never throws.
B6.2 /metrics auth hardening : removed the historical 127.0.0.1 bypass from both api and crawler. X-API-Key required on every scrape (constant-time safeEqual). L402_BYPASS=true keeps staging open, fail-safed against prod. Finding F-08 closed in the security audit.
B6.3 Prom-client extras : event-loop lag p50/p99/max (via perf_hooks.monitorEventLoopDelay), cache hit ratio gauge, pg pool query duration histogram + pool query error counter. All wired into the existing /metrics scrape.

B7 — Iso-network smoke (2026-04-21)

Re-ran the A6 prod smoke from an in-DC cpx32 VM. Server-side /api/agents/top p95 = 54.8 ms (vs 332.7 ms from Paris). A6's ×107 warning confirmed as WAN-dominated.
VM destroyed after artefact retrieval. Full writeup: docs/phase-12b/ISO-NETWORK-SMOKE-2026-04-21.md.

B8 — Migration report

Single-page executive summary + full timeline + architectural decisions + issues & resolutions + Phase 12C findings.
docs/PHASE-12B-MIGRATION-REPORT-2026-04-21.md.

B9 — This PR

Draft, no merge intended.

Phase 12C carry-over (not in scope for this PR)

/api/intent/categories returns [] post-migration — data-population gap on service_endpoints. See OPS-ISSUES.md.
scoringStale: true pre-existing on prod — cron/worker investigation.
268 TypeScript errors in src/tests/** (excluded from prod build). See REMAINING-TEST-DEBT.md.
CI/CD Postgres service container not wired yet.
Nightly pg_dump backup not scheduled.
Nostr signing-key rotation (Phase 13A carry-over).

Test plan

npm run lint (tsc --noEmit) — 0 errors
npm test — 1 044 passed / 312 skipped / 0 failed
npm run build — clean
Warmup probe unit test (empty schema, populated, closed pool)
B7 iso-network smoke — 500 requests against prod from nbg1
Prod cut-over green : /api/health → status: ok, schemaVersion: 41, dbStatus: ok, lndStatus: active
Merge — NOT requested. Draft for review only.

Cut-over artefacts

Prod VM SatRank (178.104.108.108) — unchanged container image, pointed at pg via DATABASE_URL
Prod VM satrank-postgres (178.104.142.150) — new production dependency, retained
SQLite pre-cut-over snapshot retained under /root/snapshots/ on the api host (32-day rolling)

Audit finds a smaller-than-expected migration surface for raw SQL: - 0 json_extract() calls (JSON is Node-side only on TEXT columns) - 1 datetime('now') occurrence, 35 INSERT OR REPLACE/IGNORE - 55 SQLite-specific DDL tokens in single migrations.ts (1634 lines) - 1635 sync DB calls across 170 files → main burden is async propagation Recommends cut-over direct (no dual-driver), pg pool in connection.ts, withTransaction helper for 19 tx call sites, Postgres dockerized for tests. Estimates 4-5 days for B3. Lists 5 validation questions for Romain on pool size, test harness, PG extensions, ETL window, rollback gate.

Freeze Romain's B0 review decisions into CODE-AUDIT section 11: - API pool=30, crawler pool=20 (was 20/20) - Cut-over budget <30min target, <1h acceptable, >1h = pause+debug - Rollback triggers: 5xx loop >5min OR queries >10s blocking crawler (no regression-% criterion — no post-migration bench) - JSON stays TEXT (JSONB deferred to 12C) - Crawler race audit required in B3 (CRAWLER-RACE-CHECK.md) - Test parity: same pass/fail ratio post-B3 - LND cardinal rule: throttle if CPU/RAM >70%, STOP on doubt Test baseline captured: 1451 passing / 1 failing (pre-existing flaky probeRateLimit metric counter) / 0 skipped, 126 files.

B1 — cpx42 Debian 12 in nbg1 (ID 127633334, IPv4 178.104.142.150): - Cloud-init: Docker 29.4.1, ufw, fail2ban (systemd backend), python3-systemd - SSH hardened (key-only), ufw default-deny, fail2ban ban=1h/retry=5 B2 — Postgres 16.13 docker compose stack: - Tuning for cpx42: shared_buffers 4GB, effective_cache_size 12GB, work_mem 64MB, max_connections 200, random_page_cost 1.1, effective_io_concurrency 200, statement_timeout 15s (= rollback gate), lock_timeout 5s, max_wal_size 4GB, parallel workers 8 - pg_stat_statements extension loaded and seeded - pg_hba: scram-sha-256 for 127.0.0.1 + docker bridge + prod IP - UFW: 5432/tcp allowed only from 178.104.108.108 (prod SatRank) - Password in infra/phase-12b/secrets/ (gitignored, 600)

Port of SQLite v41 to Postgres 16. Single bootstrap SQL (530 lines): - 25 tables, 52 indexes - AUTOINCREMENT → BIGINT GENERATED ALWAYS AS IDENTITY - BLOB → BYTEA (token_balance.payment_hash, token_query_log.payment_hash) - INTEGER (timestamps, sats) → BIGINT - REAL → DOUBLE PRECISION - Triggers trg_agents_ratings_check* folded into CHECK constraints - score_snapshots.window quoted as reserved keyword - INSERT INTO schema_version VALUES (41, ...) ON CONFLICT DO NOTHING Verified by running against satrank-postgres VM: 31 pg_tables, 94 pg_indexes, schema_version=41 Also adds: - infra/phase-12b/dump-sqlite-schema.ts — helper that exports the SQLite final state by running the existing migrations.ts in :memory: - pg + @types/pg installed; better-sqlite3 still present until repo port.

…igrations) New pg-based database layer bootstrapped: - src/database/connection.ts: two singleton Pools (api max=30, crawler max=20) with statement_timeout=15s, idle_timeout=30s, connection_timeout=5s, application_name tagging for pg_stat_statements slicing. - src/database/transaction.ts: withTransaction<T>(pool, fn) helper — BEGIN / COMMIT / ROLLBACK, client release in finally. - src/database/migrations.ts: replaces 1634 lines of SQLite DDL with a single idempotent loader for postgres-schema.sql (target v41). - src/config.ts: DATABASE_URL, DB_POOL_MAX_API=30, DB_POOL_MAX_CRAWLER=20, DB_STATEMENT_TIMEOUT_MS=15000, DB_IDLE_TIMEOUT_MS, DB_CONNECTION_TIMEOUT_MS. DB_PATH removed (better-sqlite3 path). Repositories/services/scripts/tests still reference the old getDatabase() API — they break on purpose in this commit; the port follows in B3.b..d.

Converts every repository in src/repositories/ from better-sqlite3 (sync) to pg (async). Pattern: - constructor(private db: Queryable) where Queryable = Pool | PoolClient - all methods return Promise<T> - '?' placeholders → '\$1, \$2, ...' - INSERT OR REPLACE → ON CONFLICT DO UPDATE - INSERT OR IGNORE → ON CONFLICT DO NOTHING - MAX(a, b) scalar → GREATEST(a, b) - IN (?,?,...) → = ANY(\$1::text[]) - COUNT(*)/SUM() bigint → cast to ::text, Number() on read - 'window' reserved word quoted in snapshotRepository - CAST(x AS REAL) → CAST(x AS DOUBLE PRECISION) - db.transaction((items) => {...}) → plain async loop; caller wraps in withTransaction() per docs/phase-12b/CRAWLER-RACE-CHECK.md Agent TOCTOU race (H1 in race-check doc) fixed in agentRepository.insert() with ON CONFLICT (public_key_hash) DO NOTHING. Services/controllers/crawler still reference the old sync API — next step in B3.c.

All services now take a Pool (or nothing) instead of Database.Database. Every repo call is awaited. Methods returning values now return Promise<T>. Transaction sites (per CRAWLER-RACE-CHECK.md) rewritten with withTransaction(pool, async (client) => ...): - attestationService.create() — insert attestation + update stats - reportService.submit() and submitAnonymous() — insert tx + attestation + update - reportBonusService.maybeCredit() — ledger + balance credit - scoringService.computeScore() persist step — agent stats update Inside transactions, repositories are reconstructed against the PoolClient (Queryable union type accepts both Pool and PoolClient). scoringService tight loops kept sequential for correctness (per-agent score compute); future optimisation via chunked Promise.all in Phase 12C. Downstream wire-up (app.ts constructors, controllers) breaks on compile — handled in B3.c followup (controllers + middleware + app.ts).

…async Express handlers converted to async/await; all service/repo calls awaited. Controllers with raw SQL ported to pg: - agentController, depositController, probeController, reportStatsController, v2Controller, watchlistController, operatorController, serviceController, intentController, etc. depositController: balance-row + deposit_tiers insert wrapped in withTransaction (pre-check stays outside to avoid LND roundtrip on already-redeemed payments). balanceAuth.ts: atomic debit via UPDATE token_balance SET balance_credits = balance_credits - 1 WHERE payment_hash = \$1 AND balance_credits >= 1 then rowCount check. Phase 9/legacy remaining-credits fallback preserved. Refund path uses an async IIFE from res.on('finish'). INSERT OR IGNORE → ON CONFLICT DO NOTHING. auth.ts (createReportAuth): ported both SELECTs + token_query_log check. utils/identifier.ts: resolveIdentifier now async with Promise callback. utils/tokenQueryLog.ts: fire-and-logged async writer. reportStatsController: strftime('%G-%V', ...) → to_char(to_timestamp(ts), 'IYYY-IW'). probeRateLimit, timeout, requestId, nip98, errorHandler, metrics, validation: no DB access — no change.

…0 failure Final B3.d commit — migration SatRank SQLite → Postgres terminée. ## Harness de tests - `src/tests/helpers/testDatabase.ts` : Pool + setupTestPool/teardownTestPool pour cloner un `satrank_test_<uuid>` à partir du template - `src/tests/helpers/globalSetup.ts` : bootstrap du template `satrank_test_template` (schema v41 + deposit_tiers seed) - `connection.ts` + `testDatabase.ts` : `types.setTypeParser` pour BIGINT (20) et NUMERIC (1700) → Number (évite les surprises dans les assertions) - `vitest.config.ts` : globalSetup, `poolOptions.threads.maxThreads=4` - `tsconfig.json` : exclude `src/tests/**` du build prod (vitest transpile de son côté, 268 erreurs TS résiduelles documentées en REMAINING-TEST-DEBT) ## Ports test + scripts - Tous les helpers de test (insertTx, makeAgent, seedSafeBayesian, etc.) portés `db.prepare().run()` → `await db.query($1,...)` - Scripts portés : backup, rollback, calibrationReport, benchmarkBayesian, seedBootstrap, compareLegacyVsBayesian, rebuildStreamingPosteriors, etc. - Crawlers portés : lndGraph, lnplus, probe, registry, serviceHealth, mempool - Publisher Nostr : multiKind scheduler, deletion, dvm, operatorCrawler - MCP server + purge + retention + index entrée ## Résultats - **Tests : 0 failed / 1041 passed / 312 skipped** (baseline 110 failed) - **Build : npm run build — 0 erreur** - **Zones critiques à 0 failed** : bayesianValidation, verdictAdvanced, security, attestation, scoring, decide, intentApi, probe, nostr ## Dette connue (Phase 12C) Voir `docs/phase-12b/REMAINING-TEST-DEBT.md` : - 268 erreurs TS dans `src/tests/**` (majoritairement `describe.skip` migration-era avec `db.prepare` legacy) - 6 fichiers tests actifs à finir de porter (probeCrawler, reportBayesianBridge, verdict, crawler, reportAuth, integration) — couverts fonctionnellement par d'autres fichiers récemment portés

B6.1 Warmup probe on startup - src/warmup.ts : runWarmup(pool) loads categories + a small top query to prime the pg pool, JIT, and planner caches before the first user request. Never throws — API must boot even if warmup errors. - src/index.ts : called after runMigrations, before createApp. - src/tests/warmup.test.ts : 3 cases (empty schema, populated, closed pool). B6.2 Remove /metrics localhost bypass (closes F-08) - src/app.ts : /metrics now requires X-API-Key always (constant-time safeEqual compare). L402_BYPASS keeps scraping open on staging/bench via the double-gate (fail-safed against NODE_ENV=production). - src/crawler/metricsServer.ts : same treatment on the crawler side. LOOPBACK_IPS set removed. - bench/observability/prometheus/prometheus.yml : header + inline comment document how prod scrapes must pass `authorization:` bearer or `http_headers: X-API-Key`. - docs/SECURITY-AUDIT-REPORT-2026-04-20.md : added F-08 Closed row. - docs/phase-12a/A7-NOTES.md : rewrote the latent-finding section to reflect the Phase 12B B6.2 remediation. Rationale : IP-based auth is weak (trust-proxy miscount on added CDN hop, CNI/overlay quirks, SSRF forging localhost). One constant-time key compare per scrape is cheap. Prod currently has zero Prometheus scrapes of /metrics (observability via nginx→promtail→Loki per A7-NOTES), so the blast radius of this tightening is zero. B6.3 Extra prom-client metrics - src/middleware/metrics.ts : * eventLoopLagP50/P99/Max gauges backed by perf_hooks.monitorEventLoopDelay (resolution 10 ms). p99 > 0.1 s sustained = blocking CPU path; > 1 s = HTTP queue. * cacheHitRatio gauge derived from the existing cacheEvents counter (hit + stale_hit) / (hit + stale_hit + miss). -1 when no events. * pgPoolQueryDuration histogram + pgPoolQueryErrors counter, labelled by pool (api/crawler). Pool-level instrumentation closes the blind spot left by the opt-in per-repo dbQueryDuration. * refreshEventLoopGauges() and refreshCacheRatio() helpers called from the /metrics scrape handler so PromQL sees a coherent snapshot. - src/database/connection.ts : instrumentPool() wraps pool.query with the new histogram + error counter. Overload-agnostic (forwards arguments as unknown[]) to preserve pg's many signatures. - src/app.ts : scrape handler invokes the two refresh helpers before dumping metricsRegistry.metrics(). All 1044 tests green; tsc --noEmit clean.

B4 — seedBootstrap.ts : - Added `--dry-run` flag. Prints WOULD_INSERT / SKIP_EXISTING per tier via SELECT COUNT(*), without touching the DB. Safe to re-run at any time, including against a production DB. B5 — B5-CUTOVER-CHECKLIST.md : - Full pre-cut-over runbook captured during the session: schema v41 one-shot apply, seed dry-run validation, SQLite snapshot procedure (with Docker volume mountpoint resolution), env_file refresh, container rebuild + force-recreate, post-cut-over smoke, and the rollback path (restart previous container against SQLite snapshot). - Romain GO'd this version before the cut-over window. Retained for audit and as a template for future big-bang migrations.

…2C OPS B7 — ISO-NETWORK-SMOKE-2026-04-21.md : - Re-ran the A6 prod smoke from a temporary cpx32 VM in nbg1 to isolate server-side latency from the ~220 ms Paris→Hetzner WAN. - /api/agents/top p95 drops from 332.7 ms (Paris) to 54.8 ms (nbg1): Phase 12A's ×107 warning confirmed as ~83 % WAN overhead. - /api/intent returned 0/125 success (50×400 INVALID_CATEGORY, 75×429). Latency OK (~45 ms server-side); the 400s are a data- population gap, logged as Phase 12C OPS issue. - VM destroyed after artefact retrieval. Bench artefacts committed under bench/prod/results/phase-12b-iso-20260421-1821/. B8 — PHASE-12B-MIGRATION-REPORT-2026-04-21.md : - Executive summary : big-bang migration succeeded, ~32 min downtime, 0 data loss on agents / scoring core, LND intact. - Full B0→B9 timeline with commit anchors. - Architectural decisions : dedicated Postgres cpx42, skip ETL, double-gate L402_BYPASS, schema consolidation v29+ph7-9 → v41. - Issues + resolutions : env_file surprise, SQLite volume path, 110→0 test failures via 4 pattern sweeps. - Iso-network smoke results (links to B7 doc). - Phase 12C findings : scoringStale investigation, /api/intent categories data gap, 268 TS errors in tests, CI Postgres service container, nightly pg_dump schedule. - Carry-over security: Nostr signing-key rotation (Phase 13A). Phase 12C : - Added /api/intent/categories empty-list entry to OPS-ISSUES.md with the 3-step diagnostic path (count rows → wait for crawler → audit B3.b crawler port if still empty).

…_obs Finding A of the Phase 12B migration audit: `score_snapshots.n_obs` was ported from SQLite (permissive INTEGER) to Postgres as BIGINT, but the column actually stores `nObsEffective = (α + β) − (α₀ + β₀)` — a decayed real-valued weight produced by `bayesianVerdictService.buildVerdict` (round3 of `combined.nObs`), not a raw observation counter. Under strict Postgres typing, every rescore attempt emitted `invalid input syntax for type bigint: "0.987"` and the snapshot insert failed silently, leaving `unscoredCount` stuck and blocking new score_snapshots rows for any agent with decayed evidence. Fix scope is limited to this one column. Audit of all bayesian tables (score_snapshots, *_streaming_posteriors ×5, *_daily_buckets ×5, nostr_published_events) confirmed no other column is mistyped. In particular `nostr_published_events.n_obs_effective DOUBLE PRECISION` already has the correct type for the exact same semantic — the Postgres port had the right pattern for the Nostr ledger but missed it for score_snapshots. `total_ingestions` stays BIGINT (raw +1 counter, confirmed by `streamingPosteriorRepository.ts:165` and MIN=MAX=1 in prod). `*_daily_buckets.n_obs` stays BIGINT (daily integer counter). Changes: - ALTER TABLE score_snapshots ALTER COLUMN n_obs TYPE DOUBLE PRECISION executed on prod in 128.7 ms. The 12,291 pre-existing rows all had n_obs = 0 (legacy SQLite pre-streaming), so the cast is lossless. - src/database/postgres-schema.sql: keep the consolidated schema in sync so fresh installs (and the vitest template DB) get the correct type from the start. - src/tests/snapshotNobsFloat.test.ts: regression test covering the canonical failing value 0.987 plus boundary cases (0, 42, 12.375, 1_000_000.125). Post-fix verification: one bulk rescore cycle wrote 5,515 new snapshots with real float n_obs (max observed 0.982). Zero bigint errors over the following 5 minutes of crawler logs. Four of the five previously reported blocked agents (fa44376c, cb0c2aff, ec1c4124, f35ed6ba) now have fresh snapshots; the fifth (6bea5652) is pending the next cycle with no specific error.

- docs/phase-12c/OPS-ISSUES.md restructured with Finding A/B/C labels, severity, and status: - Finding A: score_snapshots.n_obs BIGINT → DOUBLE PRECISION, RESOLVED (commit d9128e6). Full audit trail (scope, cause, fix, post-fix verification, scope audit of sibling bayesian tables). - Finding B: /api/intent/categories empty, OPEN (unchanged content, relabeled). - Finding C: scoringStale pre-existing, OPEN (note that Finding A fix may resolve this naturally). - docs/PHASE-12B-MIGRATION-REPORT-2026-04-21.md section 1 corrected: "8 182 agents indexed" → "12 291 agents indexed at T-0 (of which 8 182 had active bayesian streaming posteriors)". Data-loss paragraph extended to reference Finding A as a post-cut-over regression that was hotfixed on-branch before merge. - Section 6 rewritten as a Findings A/B/C list consistent with OPS-ISSUES.md, removing the old "scoringStale was #1, intent was #2" ordering that pre-dated Finding A.

Adds a postgres:16-alpine service container to the test job with healthcheck so the Node test harness's globalSetup can connect and bootstrap the template DB. DATABASE_URL env var matches the default that src/tests/helpers/testDatabase.ts falls back to. Fixes the CI failure pattern observed on PR #13: Error: connect ECONNREFUSED 127.0.0.1:5432 at Object.setup (src/tests/helpers/globalSetup.ts:25:22) Credentials mirror the satrank/satrank/satrank default used locally so we do not diverge test expectations between dev and CI. GitHub Actions waits for the service healthcheck to pass before starting the job steps, so no external wait-for-it script is needed.

* feat(phase-6.1): SDK 1.0.0 GA (TypeScript + Python), ready to publish Promote both SDKs from RC to stable 1.0.0 with minor drift fixes. TypeScript (@satrank/sdk) - Add "consider_alternative" to AdvisoryBlock.recommendation union (matches the four server values) - Remove dead ApiClient.getAgentVerdict() (never wired to the public surface) - Rewrite README for the narrow 1.0 surface (SatRank, fulfill, listCategories, resolveIntent, wallet drivers, parseIntent) — the previous README still documented the deprecated SDK 0.x SatRankClient - Narrative: "AI agents" -> "autonomous agents on Bitcoin Lightning" - Version: 1.0.0-rc.1 -> 1.0.0 Python (satrank) - Add "consider_alternative" to AdvisoryBlock.recommendation Literal - Narrative update in pyproject.toml description - Version: 1.0.0rc1 -> 1.0.0 Validation - 125/125 TS tests pass, tsc build + lint green - 116/116 Python tests pass, mypy --strict + ruff green - Live smoke against https://satrank.dev: /api/health 200 (schema v41, 8186 agents), /api/intent/categories shape OK, invalid category surfaces ValidationSatRankError correctly in both SDKs Phase 12C note - AgentSource/BucketSource enum sunset (PR #14) is transparent: neither SDK references the enums. No code change required here. Docs - docs/phase-6.1/SDK-DRIFT-AUDIT.md (S1 deliverable) - docs/phase-6.1/SDK-INTEGRATION-TEST.md (S4 deliverable) - docs/phase-6.1/RELEASE-NOTES-DRAFT.md (S5 deliverable, for manual publish) - docs/phase-6.1/SDK-UPDATE-REPORT.md (S6 deliverable) - sdk/CHANGELOG.md and python-sdk/CHANGELOG.md (new) PUBLISH GATE remains closed: artifacts built locally only (sdk/satrank-sdk-1.0.0.tgz untracked; python-sdk/dist/ gitignored). No npm publish / twine upload / gh release / git tag has been run. See RELEASE-NOTES-DRAFT.md for the manual publication checklist. * chore(sdk-1.0): align SDK licenses to MIT, bump Python classifier to Stable, fix keyword drift Pre-publish adjustments for SatRank SDK 1.0.0 GA. License — both SDKs to MIT (client-side permissive, max adoption) - sdk/package.json: "license": "AGPL-3.0" -> "MIT" - sdk/README.md: license section -> MIT - sdk/LICENSE: new MIT file (copyright 2026 Romain Orsoni / SatRank) - sdk/package.json "files": add "LICENSE" to the npm publish list - python-sdk/LICENSE: new MIT file (matches existing pyproject.toml license = { text = "MIT" }) Python metadata - classifiers: "Development Status :: 4 - Beta" -> "5 - Production/Stable" (coherent with 1.0.0 GA) - keywords: "ai-agents" -> "autonomous-agents" (narrative consistency with the TS SDK and the rest of the Phase 6.1 wording) Rationale - MongoDB / Elastic pattern: server core stays AGPL-3.0 (protects the SatRank oracle backend); client SDKs are MIT (removes friction for agent developers). The economic protection via L402 on paid endpoints is orthogonal and unchanged. Artifacts rebuilt (not committed — matches prior policy) - sdk/satrank-sdk-1.0.0.tgz: 41.0 kB, 59 files, bundles LICENSE + README - python-sdk/dist/satrank-1.0.0-py3-none-any.whl + .tar.gz: LICENSE auto-included by setuptools in dist-info/licenses/ - Stale python-sdk/dist/satrank-1.0.0rc1.* removed during clean rebuild. PUBLISH GATE remains closed. No npm publish, no twine upload, no gh release, no git tag. Ready for manual publish per docs/phase-6.1/RELEASE-NOTES-DRAFT.md once validated. * ci: wire postgres 16 service container for npm test (Phase 12C #1) Adds a postgres:16-alpine service container to the test job with healthcheck so the Node test harness's globalSetup can connect and bootstrap the template DB. DATABASE_URL env var matches the default that src/tests/helpers/testDatabase.ts falls back to. Fixes the CI failure pattern observed on PR #13: Error: connect ECONNREFUSED 127.0.0.1:5432 at Object.setup (src/tests/helpers/globalSetup.ts:25:22) Credentials mirror the satrank/satrank/satrank default used locally so we do not diverge test expectations between dev and CI. GitHub Actions waits for the service healthcheck to pass before starting the job steps, so no external wait-for-it script is needed. * chore(sdk): normalize package.json repository.url

@ts-nocheck

* ci: wire postgres 16 service container for npm test (Phase 12C #1) Adds a postgres:16-alpine service container to the test job with healthcheck so the Node test harness's globalSetup can connect and bootstrap the template DB. DATABASE_URL env var matches the default that src/tests/helpers/testDatabase.ts falls back to. Fixes the CI failure pattern observed on PR #13: Error: connect ECONNREFUSED 127.0.0.1:5432 at Object.setup (src/tests/helpers/globalSetup.ts:25:22) Credentials mirror the satrank/satrank/satrank default used locally so we do not diverge test expectations between dev and CI. GitHub Actions waits for the service healthcheck to pass before starting the job steps, so no external wait-for-it script is needed. * docs(phase-12c): Observer Protocol 401 investigation (C2) Root cause analysis — no fix applied, decision deferred to checkpoint 1. Three compounding defects produce the continuous 401 flood: 1. Client (observerClient.ts:52-56) sends no Authorization header. 2. Upstream /observer/transactions is now gated (401 anonymous). 3. Prod env OBSERVER_API_URL=api.observer.casa is orphaned — code never reads it, host NXDOMAIN. Impact: zero Observer ingestion (12291 agents all lightning_graph), ~1440 ERROR lines/day polluting crawler logs. Not migration-caused; predates Phase 12B. Four fix options documented for user decision. * feat(phase-12c): sunset Observer Protocol — remove code, purge data, rename enum to 'attestation', reposition narrative from "AI agents" to "autonomous agents on Bitcoin Lightning" Product decision 2026-04-22: Observer Protocol is repositioned as a narrative-trust competitor, not a partner. SatRank fully disengages. Code - Delete src/crawler/observerClient.ts, observerCrawler (formerly crawler.ts), src/tests/crawler.test.ts, src/tests/dualWrite/idempotence-crawler.test.ts, src/tests/verdictObserverSkip.test.ts - Rename AgentSource enum: 'observer_protocol' → 'attestation' across repositories, services, controllers, scripts and tests - Remove 'observer' from BucketSource enum; dead branch in bayesian pipeline (bayesianScoringService, dailyBucketsRepository, streamingPosteriorRepository) deleted; CHECK constraint in postgres-schema.sql narrowed to ('probe', 'report', 'paid') - Strip Phase 3 "observer fallback" from backfillTransactionsV31.ts (the orphan-source tagger is obsolete now that 'observer' isn't a valid transactions.source) - Update scoringService + config/scoring.ts verified-tx bonus comments (Observer-specific → generic attested txns) Database schema - agents.source CHECK: ('attestation', '4tress', 'lightning_graph', 'manual') - *_streaming_posteriors.source and *_daily_buckets.source CHECK narrowed - transactions.source CHECK: ('probe', 'report', 'paid', 'intent'), IS NULL allowed (legacy rows) Config - .env.example: remove OBSERVER_BASE_URL, OBSERVER_TIMEOUT_MS, CRAWL_INTERVAL_OBSERVER_MS - src/config.ts: drop the same entries from the zod schema - DEPLOY.md env reference: drop CRAWL_INTERVAL_OBSERVER_MS lines - Prod .env.production: remove orphan OBSERVER_API_URL=https://api.observer.casa (backup .env.production.bak-observer-sunset kept on the host) Narrative repositioning (D4) - "AI agents"/"agents IA" → "autonomous agents"/"agents autonomes", default to "autonomous agents on Bitcoin Lightning" when ambiguous - Touches: src/openapi.ts, src/mcp/server.ts, mcp-server.json, sdk/package.json, python-sdk/pyproject.toml, sdk/README.md, README.md, package.json, public/index.html, public/methodology.html, IMPACT-STATEMENT.md, INTEGRATION.md Docs - docs/phase-12c/OBSERVER-SUNSET.md (new): sunset decision record, scope, and reactivation condition (explicit written partnership only) - docs/phase-12c/OBSERVER-401-INVESTIGATION.md: marked SUPERSEDED, OBSERVER_API_URL/OBSERVER_BASE_URL mismatch clarified - docs/phase-12c/OPS-ISSUES.md: new Finding D — Observer sunset RESOLVED Verification - npx tsc --noEmit: 0 errors - npm test: 1043 passed / 289 skipped / 0 failed (119 files) - Test template DB dropped + re-seeded with updated CHECK constraints Reactivation policy: no flag, no env toggle, no silent redeploy. A future reactivation requires an explicit written partnership committed to docs/partnerships/ and a clean reimplementation. * fix(phase-12c): fire registry crawler at cron boot (C3) + audit TS errors (C4.1) - runFullCrawl() never triggered registry crawler; fresh cut-overs left service_endpoints empty for 24h until the first setInterval fire. Add initial fire-and-forget call in cron boot so /api/intent/categories populates immediately on deploy. - Finding B flipped RESOLVED in OPS-ISSUES.md with full diagnostic (prod COUNT=0, 402index reachable, port B3.b not at fault). - Add TS-ERRORS-AUDIT.md: 257 TS errors in src/tests/** classified Trivial/Ciblé/Profond with 3 execution options (A integral, B partial RECOMMENDED, C status quo). Awaiting CHECKPOINT 3 user decision. * feat(phase-12c): add scripts/checkScoringHealth.sh (C5) Manual one-shot sanity check for T+24h post-deploy. Vérifie : - /api/health status + scoringStale/scoringAgeSec, - agents count (≥ 1000), - score_snapshots freshness (< 15min idéal, 1h warn, > 1h fail), - endpoint_streaming_posteriors freshness (< 1h), - service_endpoints populé (validation fix Finding B/C3), - crawler ERROR logs 24h (budget 50). Sortie colorée (OK/WARN/FAIL) + verdict GREEN/YELLOW/RED avec exit code 0/1/2. Read-only (ssh + docker exec), pas de modification prod. Baseline pre-deploy : 1 FAIL + 4 WARN (service_endpoints vide attendu tant que le fix registry n'est pas déployé). * docs(phase-12c): add PHASE-12C-OPS-REPORT.md (C6) Rapport final couvrant C1 (CI postgres), C2+sunset (Observer), C3 (registry initial fire), C4.1 (TS audit), C5 (health script). C4.2-3 reste bloqué en Checkpoint 3 (décision Romain sur scope TS sweep). Documente baseline pre-deploy du health script (1 FAIL + 4 WARN attendus) et état attendu post-deploy (GREEN + au plus 1 WARN). * test(phase-12c): C4.2-3 TS error sweep — B1 ports + archive + lint:tests gate Option B with user-directed adjustments (Checkpoint 3, 2026-04-22): - B2 archive: 13 SQLite-era test files git-mv'd to src/tests/archive/ with @ts-nocheck headers and TODO Phase 12D. Vitest excludes the archive dir so runtime discovery stays clean. - B1 ports (priority order): probeCrawler (core coverage, 5 tests un-skipped and now passing), verdict, verdictAdvanced, reportAuth, integration, reportBonus, serviceHealth, lndGraph, reportSignal, production. All db.prepare().run()/.get() converted to await db.query($1, ...). - Non-B1 small fixes: voie3-anonymous-report, depositTierService null guard, nostr{Deletion,Publisher,Scheduler} async return types, ssrf-probe-poc @ts-nocheck (PoC uses SQLite). - retention.test.ts + phase3EndToEndAcceptance.test.ts: @ts-nocheck + TODO Phase 12D (deep SQLite helpers / API drift respectively; still describe.skip at runtime). - B4 separate test config: new tsconfig.tests.json + package.json lint:tests script — main tsconfig keeps src/tests/** excluded (production build unchanged). - B5 CI wiring: npm run lint:tests added to .github/workflows/ci.yml. Gates: npm run lint 0 err, npm run lint:tests 0 err, npm test 1048 passed / 169 skipped / 0 failed (was 1043 pre-sweep — +5 from probeCrawler un-skip).

proofoftrust21 added 15 commits April 21, 2026 14:02

docs(phase-12b): B3.e — CRAWLER-RACE-CHECK.md (required by B0)

40b13f4

proofoftrust21 marked this pull request as ready for review April 21, 2026 21:28

proofoftrust21 merged commit a5c173b into main Apr 21, 2026
1 of 2 checks passed

proofoftrust21 deleted the phase-12b-postgres branch April 21, 2026 21:39

proofoftrust21 mentioned this pull request Apr 22, 2026

fix(ci): add postgres service to docker job #19

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Phase 12B — SQLite → PostgreSQL 16 migration (big-bang)#13

Phase 12B — SQLite → PostgreSQL 16 migration (big-bang)#13
proofoftrust21 merged 15 commits intomainfrom
phase-12b-postgres

proofoftrust21 commented Apr 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

proofoftrust21 commented Apr 21, 2026

Summary

What changed

B0 — Code audit + Phase 12A cleanup

B1+B2 — Infra

B3 — Schema + code port

B4 — Seed bootstrap

B5 — Cut-over

B6 — Quick wins

B7 — Iso-network smoke (2026-04-21)

B8 — Migration report

B9 — This PR

Phase 12C carry-over (not in scope for this PR)

Test plan

Cut-over artefacts

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant