Fix Fathom sync reliability: 10-day silent failure, missing calls #437

@jonschull

Description

Problem

The Fathom sync has been silently failing for 10 consecutive days (Jan 31 – Feb 10, 2026). 18 calls were missed, including the Jan 21 Town Hall transcript. No alerts were generated.

Failure timeline

  • Jan 31: python: command not found — PATH issue in cron environment
  • Feb 1: ModuleNotFoundError: No module named 'era2_config' — working directory issue
  • Feb 2–10: Fathom API failures — DNS resolution failures (Failed to resolve 'api.fathom.ai'), 503 Service Unavailable

Root cause

sync_fathom.py defaults to a 7-day lookback, so any call that ages past 7 days while the sync is down is permanently missed without manual intervention. The failure is also silent: cron runs and errors are logged, but nothing alerts anyone.
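To make the lookback arithmetic concrete, here is a minimal sketch. It assumes one sync run per day, that every run inside the outage fails, and that a successful run on day D fetches calls dated from D minus the lookback through D. The function name and the call dates in the usage note are hypothetical and are not taken from sync_fathom.py or from the real list of missed calls.

```python
from datetime import date, timedelta


def missed_calls(call_dates, outage_start, outage_end, lookback_days):
    """Return the call dates that no successful sync run ever covers.

    ASSUMPTIONS (not from sync_fathom.py): one run per day, every run from
    outage_start through outage_end (inclusive) fails, and a successful run
    on day D fetches calls dated D - lookback_days through D.
    """
    last_good_run = outage_start - timedelta(days=1)
    first_good_run = outage_end + timedelta(days=1)
    # Oldest call date the first post-outage run can still reach.
    oldest_recoverable = first_good_run - timedelta(days=lookback_days)
    return [d for d in call_dates if last_good_run < d < oldest_recoverable]
```

With the outage window from this issue (Jan 31 to Feb 10, 2026) and made-up call dates of Jan 30, Feb 1, Feb 3, Feb 4, and Feb 9, a 7-day lookback loses the Feb 1 and Feb 3 calls, while a 30-day lookback loses none under these assumptions.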

Fixes needed

1. Extend default lookback (quick fix)

Change --days=7 to --days=30 in era2/scripts/daily_sync.sh. This provides a buffer against short outages.

2. Add failure alerting

If the sync fails for 2+ consecutive days, send an alert (macOS notification or email). Detect failures by checking era2/logs/daily_sync.log for error patterns.
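One possible shape for the check, as a sketch only. The error patterns come from the failure timeline above. The log format is an assumption: each line is taken to start with an ISO date such as 2026-02-03, which may not match what daily_sync.log actually contains. All function names here are hypothetical.

```python
import re
import subprocess

# Error patterns taken from the failure timeline in this issue.
ERROR_PATTERNS = re.compile(
    r"command not found|ModuleNotFoundError|Failed to resolve|503 Service Unavailable"
)
# ASSUMPTION: every relevant log line starts with an ISO date (YYYY-MM-DD).
DATE_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2})")


def failed_days(log_lines):
    """Map each logged day to True if any of its lines matches an error pattern."""
    status = {}
    for line in log_lines:
        match = DATE_PREFIX.match(line)
        if not match:
            continue  # skip lines without a date prefix
        day = match.group(1)
        status[day] = status.get(day, False) or bool(ERROR_PATTERNS.search(line))
    return status


def trailing_failures(status):
    """Count consecutive failed days, ending at the most recent logged day."""
    count = 0
    for day in sorted(status, reverse=True):  # ISO dates sort chronologically
        if not status[day]:
            break
        count += 1
    return count


def notify_macos(message):
    """Show a macOS notification via osascript (macOS only).

    The message is interpolated into AppleScript, so keep it free of quotes.
    """
    script = f'display notification "{message}" with title "Fathom sync"'
    subprocess.run(["osascript", "-e", script], check=False)
```

A daily job could read the log lines, call `trailing_failures(failed_days(lines))`, and call `notify_macos` when the result is 2 or more. Days on which cron did not run at all leave no log lines, so this sketch would not count them as failures; a stricter check would also compare the newest logged date against today.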

3. Weekly gap-heal cron

Add a Sunday cron that runs sync_fathom.py --days=60 to heal any gaps from extended outages.

4. Fix cron PATH issue

The Jan 31 failure was python: command not found. Ensure daily_sync.sh uses absolute paths to Python (miniconda). Verify the script works in a clean cron environment.
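To check the wrapper outside an interactive shell, a small harness can run a command with a stripped-down environment similar to what cron provides. This is a sketch under assumptions: `/usr/bin:/bin` is a common default cron PATH but should be confirmed against the local crontab(5), and `run_like_cron` is a hypothetical helper, not part of the repository.

```python
import os
import subprocess

# ASSUMPTION: cron starts jobs with a minimal environment. /usr/bin:/bin is a
# common default PATH; confirm against crontab(5) on the target machine.
CRON_LIKE_ENV = {
    "PATH": "/usr/bin:/bin",
    "SHELL": "/bin/sh",
    "HOME": os.environ.get("HOME", "/"),
}


def run_like_cron(command):
    """Run a shell command with a minimal, cron-like environment.

    Returns (exit_code, combined stdout and stderr as text).
    """
    result = subprocess.run(
        ["/bin/sh", "-c", command],
        env=CRON_LIKE_ENV,  # replaces the inherited environment entirely
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.returncode, result.stdout
```

Running `run_like_cron("era2/scripts/daily_sync.sh")` from the repository root should surface a `command not found` error and a non-zero exit code if the wrapper still depends on the interactive PATH, and should exit cleanly once it uses an absolute path to the miniconda Python.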

5. Consolidate redundant syncs

Two syncs run daily (ERA2 at 3 AM, FathomInventory at 4 AM). FathomInventory consistently reports "0 new calls", so it is likely redundant. Consider disabling one.

Files

  • era2/scripts/daily_sync.sh — cron wrapper
  • era2/scripts/sync_fathom.py — sync implementation
  • era2/logs/daily_sync.log — failure evidence
  • era2/lib/fathom.py — Fathom API client

Acceptance criteria

  • Default lookback extended to 30 days
  • Alerting on 2+ consecutive failures
  • Weekly gap-heal cron configured
  • Cron PATH issue fixed
  • daily_sync.log shows successful runs after fix

Metadata

Assignees

No one assigned

    Labels

    auto-merge (Non-strategy change: auto-merge after review PASS)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
