fix(spot): re-export AWS_REGION into spot shell - close #241 .env-removal regression (Sat SF DataPhase1 P0) #247
Merged
…oval regression

PR 9f (#241) removed `.env` sourcing from the spot bootstrap in favor of runtime get_secret() SSM lookups. That handled secrets, but the same `.env` was also the only thing exporting AWS_REGION, a plain env var (not a secret) that alpha_engine_lib.preflight.check_env_vars() hard-requires and boto3 needs as a default region.

Result: 2026-05-16 Saturday SF DataPhase1 aborted ~1m52s in, at `weekly_collector.py --morning-enrich` preflight:

    RuntimeError: Pre-flight: required env vars missing: ['AWS_REGION']

Whole Saturday pipeline aborted (no downstream stale-data risk: it failed before any write).

Fix: ENV_SOURCE (interpolated into every remote `run_remote bash` heredoc) now also exports AWS_REGION + AWS_DEFAULT_REGION from the dispatcher-side $AWS_REGION (already defaulted to us-east-1). Applied to both spot_data_weekly.sh and spot_drift_detection.sh, because the identical #241 regression affects the Saturday DriftDetection state too.

Regression test pins the ENV_SOURCE region exports in both scripts so a future ENV_SOURCE edit can't silently drop them again (shim-deletion launch-mechanism class).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
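To make the failure mode concrete, here is a minimal sketch of an env-var preflight gate. It is not the real `alpha_engine_lib.preflight.check_env_vars()`; the `environ` parameter is a hypothetical addition for illustration, and only the error message shape is taken from the traceback quoted above.

```python
import os


def check_env_vars(*names, environ=None):
    """Stand-in preflight gate: raise if any named variable is unset or empty.

    `environ` is a hypothetical injection point so the sketch can be run
    without touching the real process environment.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Pre-flight: required env vars missing: {missing}")


# Spot shell after #241: .env is no longer sourced, so AWS_REGION is absent.
shell_after_241 = {"XDG_CACHE_HOME": "/tmp"}

# Spot shell after this fix: ENV_SOURCE exports both region variables.
shell_after_fix = {
    "XDG_CACHE_HOME": "/tmp",
    "AWS_REGION": "us-east-1",
    "AWS_DEFAULT_REGION": "us-east-1",
}

try:
    check_env_vars("AWS_REGION", environ=shell_after_241)
except RuntimeError as exc:
    print(exc)

check_env_vars("AWS_REGION", environ=shell_after_fix)  # passes silently
```

The point of the sketch is that the gate reads the process environment only, so moving secrets to SSM does nothing for a plain variable that nothing else exports.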
cipher813 added a commit that referenced this pull request May 16, 2026
…nv-var check (#248)

Second facet of the #241/#242 .env-deprecation regression, surfaced by the 2026-05-16 Saturday SF recovery run: the AWS_REGION fix (#247) let DataPhase1 clear preflight and MorningEnrich completed (polygon 913/921 + FRED 4/4 fetched fine via get_secret()), but `weekly_collector.py --phase 1` then aborted at preflight:

    RuntimeError: Pre-flight: required env vars missing: ['FRED_API_KEY', 'POLYGON_API_KEY']

Every collector AND both reachability probes in this file already resolve these keys via get_secret() (SSM). The only stale code was DataPreflight.run()'s `check_env_vars("FRED_API_KEY","POLYGON_API_KEY")` (and the phase2 FMP/FINNHUB/EDGAR equivalent): an os.environ assertion that the env-deprecation arc migrated every consumer away from but missed here. MorningEnrich slipped through because its preflight only checks AWS_REGION; phase1/phase2 hard-failed on the stale gate.

Fix: AWS_REGION stays an env-var check (plain boto3 region, not a secret); the API keys now go through a new `_check_secrets()` helper that calls get_secret(required=False). It keeps the same <1s fail-fast intent and the same RuntimeError shape, but is sourced from SSM (with get_secret's env fallback) instead of os.environ. phase2 had the identical latent bug and is fixed in the same change.

Tests updated to the get_secret() reality (patch preflight.get_secret rather than os.environ); full suite 1050 passed, 1 skipped.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 16, 2026
Merged
Root cause (P0: 2026-05-16 Saturday SF DataPhase1 failure)

Saturday pipeline aborted ~1m52s into `DataPhase1` at `weekly_collector.py --morning-enrich` preflight:

    RuntimeError: Pre-flight: required env vars missing: ['AWS_REGION']

PR 9f (#241, 61253df) removed the spot-side `.env` sourcing in favor of runtime `get_secret()` SSM lookups. Diff line 118→119:

    - ENV_SOURCE="set -a; ... source .../.env; set +a; export XDG_CACHE_HOME=/tmp; ..."
    + ENV_SOURCE="export XDG_CACHE_HOME=/tmp; export PYTHON_BIN=$REMOTE_PYTHON;"

The audit correctly migrated secrets to `get_secret()`, but the same `.env` was also the only thing exporting `AWS_REGION`, a plain env var (not a secret) that `alpha_engine_lib.preflight.check_env_vars()` hard-requires and boto3 needs as a default region. This is the [shim-deletion / launch-mechanism] regression class.

No stale-data risk: failure happened at preflight, before any S3/ArcticDB write; spot self-terminated.

Fix

`ENV_SOURCE` (interpolated into every `run_remote bash -s <<HEREDOC` workload) now also exports `AWS_REGION` + `AWS_DEFAULT_REGION` from the dispatcher-side `$AWS_REGION` (already defaulted to `us-east-1`). Applied to both `spot_data_weekly.sh` and `spot_drift_detection.sh`, because the identical #241 regression also affects the Saturday `DriftDetection` state.

Test

`tests/test_spot_env_source_aws_region.py` pins both scripts' `ENV_SOURCE` region exports so a future edit can't silently drop them. Subset run: 278 passed, 1 skipped.

Recovery

After merge, re-trigger the Saturday SF (dispatcher git-pulls `main` before running the spot script). DataPhase1 must clear before downstream Research/Predictor/Backtester.

🤖 Generated with Claude Code
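For readers who want a feel for what pinning the `ENV_SOURCE` region exports could look like, here is a rough sketch. It is not the contents of `tests/test_spot_env_source_aws_region.py`: the helper name `env_source_exports` is hypothetical, `FIXED_SCRIPT` is an assumed form of the post-fix line (the PR does not quote it), and the real test presumably reads the two script files rather than inline strings. `REGRESSED_SCRIPT` uses the post-#241 line quoted in the root-cause section.

```python
import re

# Assumed shape of the fixed assignment; the exact line is not shown in the PR.
FIXED_SCRIPT = '''\
AWS_REGION="${AWS_REGION:-us-east-1}"
ENV_SOURCE="export AWS_REGION=$AWS_REGION; export AWS_DEFAULT_REGION=$AWS_REGION; export XDG_CACHE_HOME=/tmp; export PYTHON_BIN=$REMOTE_PYTHON;"
'''

# The post-#241 assignment as quoted in the PR description.
REGRESSED_SCRIPT = '''\
ENV_SOURCE="export XDG_CACHE_HOME=/tmp; export PYTHON_BIN=$REMOTE_PYTHON;"
'''


def env_source_exports(script_text):
    """Return the set of variable names exported inside ENV_SOURCE="..."."""
    match = re.search(r'^ENV_SOURCE="(.*)"\s*$', script_text, flags=re.MULTILINE)
    if match is None:
        raise AssertionError("ENV_SOURCE assignment not found")
    return set(re.findall(r"export\s+([A-Za-z_][A-Za-z0-9_]*)=", match.group(1)))


def test_env_source_exports_region():
    """Both region variables must be exported by ENV_SOURCE."""
    exported = env_source_exports(FIXED_SCRIPT)
    assert {"AWS_REGION", "AWS_DEFAULT_REGION"} <= exported


def test_regressed_script_is_detected():
    """The post-#241 line lacks the region exports, so the pin would fail on it."""
    assert "AWS_REGION" not in env_source_exports(REGRESSED_SCRIPT)


test_env_source_exports_region()
test_regressed_script_is_detected()
```

A text-level check like this is cheap because it needs neither a spot instance nor AWS credentials; it only guards the string that gets interpolated into the remote heredoc.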