Description
Problem
The Fathom sync has been silently failing for 10 consecutive days (Jan 31 – Feb 10, 2026). 18 calls were missed, including the Jan 21 Town Hall transcript. No alerts were generated.
Failure timeline
- Jan 31: "python: command not found" (PATH issue in the cron environment)
- Feb 1: "ModuleNotFoundError: No module named 'era2_config'" (working directory issue)
- Feb 2–10: Fathom API failures: DNS resolution failures ("Failed to resolve 'api.fathom.ai'") and 503 Service Unavailable
Root cause
The 7-day default lookback in sync_fathom.py means once a call ages past 7 days during an outage, it's permanently missed without manual intervention. This is a silent failure mode — cron runs, logs errors, but nothing alerts anyone.
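To make the window arithmetic concrete, here is a minimal sketch of how a call falls out of the lookback. The function name in_lookback and the inclusive boundaries are assumptions for illustration; the actual cutoff logic in sync_fathom.py may differ.

```python
from datetime import date, timedelta


def in_lookback(call_date: date, run_date: date, lookback_days: int) -> bool:
    """Return True if a sync running on run_date would still fetch the call.

    Hypothetical helper: assumes the window is [run_date - lookback_days, run_date],
    inclusive on both ends.
    """
    window_start = run_date - timedelta(days=lookback_days)
    return window_start <= call_date <= run_date


# A hypothetical call recorded on the first day of the outage.
call = date(2026, 1, 31)
# Earliest day a sync could succeed again, given failures through Feb 10.
first_good_run = date(2026, 2, 11)

print(in_lookback(call, first_good_run, 7))   # False: window starts Feb 4
print(in_lookback(call, first_good_run, 30))  # True: window starts Jan 12
```

With a 7-day window the call is already out of reach on the first successful run, while a 30-day window still covers it.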
Fixes needed
1. Extend default lookback (quick fix)
Change --days=7 to --days=30 in era2/scripts/daily_sync.sh. Provides buffer against short outages.
2. Add failure alerting
If sync fails for 2+ consecutive days, send an alert (macOS notification or email). Check era2/logs/daily_sync.log for error patterns.
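One possible shape for the consecutive-failure check is sketched below. It assumes each line in daily_sync.log starts with an ISO date (YYYY-MM-DD), which this issue does not confirm, and it matches only the error strings seen in the failure timeline. The function names are hypothetical, and the actual notification step (macOS notification or email) is left out.

```python
import re
from datetime import date, timedelta

# Assumed log format: lines begin with an ISO date, e.g. "2026-02-03 03:00:01 ...".
DATE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})\b")
# Error patterns taken from the failures observed in this incident.
ERROR_RE = re.compile(
    r"command not found|ModuleNotFoundError|Failed to resolve|503 Service Unavailable"
)


def failed_days(log_lines):
    """Collect the set of dates that have at least one matching error line."""
    days = set()
    for line in log_lines:
        match = DATE_RE.match(line)
        if match and ERROR_RE.search(line):
            days.add(date.fromisoformat(match.group(1)))
    return days


def consecutive_failures(log_lines, today):
    """Count how many days in a row, ending at today, logged an error."""
    failed = failed_days(log_lines)
    streak = 0
    day = today
    while day in failed:
        streak += 1
        day -= timedelta(days=1)
    return streak


def should_alert(log_lines, today, threshold=2):
    """True once the failure streak reaches the threshold (2+ days by default)."""
    return consecutive_failures(log_lines, today) >= threshold


sample = [
    "2026-02-01 03:00:01 ModuleNotFoundError: No module named 'era2_config'",
    "2026-02-02 03:00:02 Failed to resolve 'api.fathom.ai'",
    "2026-02-03 03:00:02 503 Service Unavailable",
]
print(consecutive_failures(sample, date(2026, 2, 3)))  # 3
print(should_alert(sample, date(2026, 2, 3)))          # True
```

This sketch treats any day with an error line as failed, even if a later retry succeeded, and it does not notice days where cron never ran at all. A check for a missing success line per day would cover that second case.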
3. Weekly gap-heal cron
Add a Sunday cron that runs sync_fathom.py --days=60 to heal any gaps from extended outages.
4. Fix cron PATH issue
The Jan 31 failure was python: command not found. Ensure daily_sync.sh uses absolute paths to Python (miniconda). Verify the script works in a clean cron environment.
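A rough way to approximate a clean cron environment before deploying is to run the command with a stripped-down environment. The sketch below assumes cron's PATH is /usr/bin:/bin, which is a common default but should be checked on the target machine; run_like_cron is a hypothetical helper, not part of the repo.

```python
import os
import subprocess
from typing import Optional

# Minimal environment similar to what cron typically provides (assumed values).
CRON_LIKE_ENV = {
    "PATH": "/usr/bin:/bin",
    "SHELL": "/bin/sh",
    "HOME": os.environ.get("HOME", "/"),
}


def run_like_cron(command: str, cwd: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run a shell command with a cron-like environment and capture its output."""
    return subprocess.run(
        ["/bin/sh", "-c", command],
        env=CRON_LIKE_ENV,
        cwd=cwd,
        capture_output=True,
        text=True,
    )


# Check whether a bare "python" resolves on the restricted PATH.
result = run_like_cron("command -v python || echo 'python not on cron PATH'")
print(result.stdout.strip())
```

Running the wrapper the same way, for example run_like_cron("era2/scripts/daily_sync.sh", cwd=os.path.expanduser("~")), should surface both the PATH failure and the working directory failure before cron does.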
5. Consolidate redundant syncs
Two syncs run daily (ERA2 at 3 AM, FathomInventory at 4 AM). FathomInventory reports "0 new calls" consistently — likely redundant. Consider disabling one.
Files
- era2/scripts/daily_sync.sh: cron wrapper
- era2/scripts/sync_fathom.py: sync implementation
- era2/logs/daily_sync.log: failure evidence
- era2/lib/fathom.py: Fathom API client
Acceptance criteria
- Default lookback extended to 30 days
- Alerting on 2+ consecutive failures
- Weekly gap-heal cron configured
- Cron PATH issue fixed
- daily_sync.log shows successful runs after fix