Delete trading_calendar + health_checker (moved to alpha-engine-dashboard) #43
Merged
…oard)

Part 3/3 of ae-dashboard de-bloat. Both scripts were copied verbatim to alpha-engine-dashboard in cipher813/alpha-engine-dashboard#18, and the Step Function SSM commands were repointed in #42 (merged 2026-04-16) to run from /home/ec2-user/alpha-engine-dashboard. This PR removes the original copies from alpha-engine-data now that nothing in the weekday Step Function or Saturday pipeline references them anymore.

The data repo is now scoped purely to data-production code (collectors, builders, features, weekly_collector), which matches the producer-vs-observability seam documented in the earlier commits.

Pre-merge requirements (MERGE ORDER IS IMPORTANT):
1. Friday 2026-04-17 weekday Step Function run must complete successfully using the new dashboard paths (verify CheckTradingDay logs `cd /home/ec2-user/alpha-engine-dashboard`, HealthCheck writes /var/log/health-check.log normally).
2. Update ae-dashboard crontab: there's a `0 */6 * * *` entry still running `cd /home/ec2-user/alpha-engine-data && .venv/bin/python health_checker.py --alert`. The operator must run `crontab -e` on ae-dashboard and swap the path to /home/ec2-user/alpha-engine-dashboard before this PR merges; otherwise the cron breaks until that edit happens.

Pairs with cipher813/alpha-engine-dashboard#19 (removes alpha-engine-data from the dashboard's boot-pull.sh REPOS list).

Tests: full suite 49 passed (was 71 before moves; the 22 delta is 26 tests deleted with the moved files + 5 new test_module_health tests added earlier + others).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
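The crontab swap in pre-merge requirement 2 can be sketched as a small text transform. This is a minimal illustration, not tooling from either repo: the function name `repoint_health_check_cron` is hypothetical, and it assumes the only change needed is the repo path on lines that run `health_checker.py`.

```python
# Hypothetical helper: rewrite the repo path on health_checker cron lines.
# The two paths and the cron entry come from the commit message; the
# function itself is an illustrative assumption, not code from the repos.
OLD_REPO = "/home/ec2-user/alpha-engine-data"
NEW_REPO = "/home/ec2-user/alpha-engine-dashboard"


def repoint_health_check_cron(crontab_text: str) -> str:
    """Swap OLD_REPO for NEW_REPO on lines that run health_checker.py.

    Lines that do not mention health_checker.py are returned unchanged,
    and running the function twice gives the same result as running it once.
    """
    out = []
    for line in crontab_text.splitlines():
        if "health_checker.py" in line and OLD_REPO in line:
            line = line.replace(OLD_REPO, NEW_REPO)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    before = (
        "0 */6 * * * cd /home/ec2-user/alpha-engine-data && "
        ".venv/bin/python health_checker.py --alert"
    )
    print(repoint_health_check_cron(before))
```

In practice the operator would still apply the result through `crontab -e` (or pipe it to `crontab -`) on ae-dashboard; the sketch only shows which lines change and that unrelated entries are left alone.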
cipher813 added a commit to cipher813/alpha-engine-dashboard that referenced this pull request on Apr 16, 2026

Part of the ae-dashboard de-bloat split. After cipher813/alpha-engine-data#43 merges (deletes trading_calendar.py + health_checker.py from the data repo), ae-dashboard no longer has any runtime dependency on alpha-engine-data: both files moved to this repo in #18 and the Step Function's CheckTradingDay + HealthCheck commands now run from here.

Removing alpha-engine-data from the REPOS array stops the daily 12:00 UTC boot-pull from refreshing the now-unused clone. Existing clone at /home/ec2-user/alpha-engine-data on ae-dashboard can be rm'd manually post-merge.

Pre-merge requirements:
- cipher813/alpha-engine-data#43 merged (or mergeable in same batch)
- ae-dashboard crontab updated so the `0 */6 * * *` health_checker cron no longer references /home/ec2-user/alpha-engine-data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request on Apr 16, 2026

DataPhase1 now runs on a self-terminating c5.large spot instance (same pattern as Backtester + PredictorTraining) instead of hammering the t3.micro. The micro becomes a dispatcher: pulls the latest launcher script, sources .env, invokes bash infrastructure/spot_data_phase1.sh. All heavy Python work (yfinance, polygon, FRED, ArcticDB append) runs on the spot.

Rationale: the 2026-04-16 OOM incident showed that running data-refresh workloads on a 1 GB RAM instance is fragile-by-design. Even though Saturday DataPhase1 has historically fit in micro RAM (it uses different code paths than the daily feature compute that OOM'd today), consolidating all heavy weekly compute onto self-terminating spots aligns DataPhase1 with the existing Backtester/PredictorTraining pattern and removes the 1 GB ceiling from future data-refresh growth.

Also: SaturdayHealthCheck SSM command repointed from /home/ec2-user/alpha-engine-data (health_checker.py was deleted from that repo in #43) to /home/ec2-user/alpha-engine-dashboard where it now lives. Mirrors the same fix applied to the weekday HealthCheck step in #42.

Files:
- new infrastructure/spot_data_phase1.sh (spot launcher, mirrors spot_backtest.sh)
- edit infrastructure/step_function.json (DataPhase1 + SaturdayHealthCheck commands)

Timeout bumped 1800 → 2700s to accommodate spot bootstrap overhead (~7 min for instance launch + pip install on top of ~20 min workload).

Deferred (separate PR): migrate RAGIngestion + DriftDetection to spot as well. They still run on the micro and need alpha-engine-data restored on ae-dashboard for now.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request on Apr 16, 2026

…44)

* Migrate DataPhase1 to spot + fix SaturdayHealthCheck path

DataPhase1 now runs on a self-terminating c5.large spot instance (same pattern as Backtester + PredictorTraining) instead of hammering the t3.micro. The micro becomes a dispatcher: pulls the latest launcher script, sources .env, invokes bash infrastructure/spot_data_phase1.sh. All heavy Python work (yfinance, polygon, FRED, ArcticDB append) runs on the spot.

Rationale: the 2026-04-16 OOM incident showed that running data-refresh workloads on a 1 GB RAM instance is fragile-by-design. Even though Saturday DataPhase1 has historically fit in micro RAM (it uses different code paths than the daily feature compute that OOM'd today), consolidating all heavy weekly compute onto self-terminating spots aligns DataPhase1 with the existing Backtester/PredictorTraining pattern and removes the 1 GB ceiling from future data-refresh growth.

Also: SaturdayHealthCheck SSM command repointed from /home/ec2-user/alpha-engine-data (health_checker.py was deleted from that repo in #43) to /home/ec2-user/alpha-engine-dashboard where it now lives. Mirrors the same fix applied to the weekday HealthCheck step in #42.

Files:
- new infrastructure/spot_data_phase1.sh (spot launcher, mirrors spot_backtest.sh)
- edit infrastructure/step_function.json (DataPhase1 + SaturdayHealthCheck commands)

Timeout bumped 1800 → 2700s to accommodate spot bootstrap overhead (~7 min for instance launch + pip install on top of ~20 min workload).

Deferred (separate PR): migrate RAGIngestion + DriftDetection to spot as well. They still run on the micro and need alpha-engine-data restored on ae-dashboard for now.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Bundle DataPhase1+RAGIngestion on one spot; migrate DriftDetection

Extends the DataPhase1-to-spot migration to cover all three Saturday SF steps that were running heavy alpha-engine-data workloads on the t3.micro:

- DataPhase1 + RAGIngestion now share a single spot instance via spot_data_weekly.sh (renamed from spot_data_phase1.sh). Both workloads use the same alpha-engine-data clone + pip install, so bundling saves ~7 min of bootstrap overhead and one spot request. RAGIngestion SF state chain (RAGIngestion + WaitForRAGIngestion + CheckRAGStatus + RAGWait + ExtractRAGError) removed; DataPhase1's success now wires directly to Research.
- DriftDetection moves to its own spot via spot_drift_detection.sh. Launcher clones BOTH alpha-engine-data and alpha-engine-predictor (drift_detector lives in data/monitoring/ but imports from predictor via PYTHONPATH). Overkill cost-wise for the ~5 min workload (~7 min bootstrap + ~5 min work vs ~5 min on micro), but completes the architectural goal: zero heavy venvs on the micro.

Net effect on ae-dashboard after next boot-pull:
- alpha-engine-data: cloned (for launcher scripts only, ~300 lines bash)
- alpha-engine-data/.venv: can be deleted permanently
- 0 heavy Python workloads running on the t3.micro at any point in the Saturday pipeline

Timeout bumps:
- DataPhase1 (bundled): 2700s → 3600s (phase1 ~20min + rag ~15min + bootstrap ~7min)
- DriftDetection: 300s → 1200s (bootstrap ~7min + workload ~5min)

SF state count: 34 → 30 (-4 RAG chain states).

Followup roadmap P2: bundle DriftDetection onto PredictorTraining's spot since drift reads predictor weights produced by that step, which would save another bootstrap cycle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
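The timeout bumps quoted in these commit messages can be sanity-checked with a little arithmetic. The sketch below uses only the approximate minute estimates and timeout values stated above; the names `STEPS` and `headroom_seconds` are illustrative, and the real durations will vary from run to run.

```python
# Approximate durations (minutes) and Step Function timeouts (seconds),
# copied from the commit messages. The estimates are the authors' "~"
# figures, so the headroom values are rough, not guarantees.
STEPS = {
    "DataPhase1 (spot, phase1 only)": {"minutes": [7, 20], "timeout_s": 2700},
    "DataPhase1 (bundled with RAGIngestion)": {"minutes": [20, 15, 7], "timeout_s": 3600},
    "DriftDetection (own spot)": {"minutes": [7, 5], "timeout_s": 1200},
}


def headroom_seconds(minutes, timeout_s):
    """Seconds left in the timeout after the estimated work completes."""
    return timeout_s - 60 * sum(minutes)


if __name__ == "__main__":
    for name, step in STEPS.items():
        slack = headroom_seconds(step["minutes"], step["timeout_s"])
        print(f"{name}: {slack}s headroom")
```

Under these estimates both DataPhase1 configurations keep about 18 minutes of slack (1080 s), while DriftDetection keeps 8 minutes (480 s).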
cipher813 added a commit that referenced this pull request on Apr 16, 2026

The ae-dashboard de-bloat split (#43) moved this file to alpha-engine-dashboard on the (incorrect) assumption that the Step Function's CheckTradingDay was the sole consumer. Spot smoke test surfaced that collectors/universe_returns.py:215 also imports it (`from trading_calendar import is_trading_day as nyse_is_trading_day`), which broke the spot migration's DataPhase1 workflow (ModuleNotFoundError in weekly_collector.py --phase 1).

This is a dual-copy smell (same file now lives in both repos), flagged in ROADMAP.md. The architecturally correct fix is to move it into alpha-engine-lib so all consumers pull from one source of truth, but that requires wiring up private-lib git auth on the spot, which is out of scope for the Saturday-unblock window.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
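The missed consumer described above is the kind of thing a static import scan can surface before a module is deleted. The following is a hedged sketch using the standard-library `ast` module; `find_importers` is a hypothetical helper, not part of any Alpha Engine repo, and it only catches absolute imports (`import x`, `from x import y`), not relative imports or dynamic `importlib` calls.

```python
import ast
from pathlib import Path


def find_importers(repo_root: str, module: str) -> list[tuple[str, int]]:
    """Return sorted (relative path, line number) pairs for imports of `module`.

    Hypothetical pre-deletion check: walks every .py file under repo_root,
    skipping anything inside a .venv directory.
    """
    root = Path(repo_root)
    hits = []
    for path in root.rglob("*.py"):
        if ".venv" in path.relative_to(root).parts:
            continue  # ignore installed packages in a local virtualenv
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                names = [node.module or ""]
            else:
                continue
            # Match the module itself or any of its submodules.
            if any(n == module or n.startswith(module + ".") for n in names):
                hits.append((path.relative_to(root).as_posix(), node.lineno))
    return sorted(hits)
```

Calling `find_importers(".", "trading_calendar")` from the root of the data repo before the deletion would, under these assumptions, have listed collectors/universe_returns.py alongside any other consumers.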
cipher813 added a commit that referenced this pull request on Apr 16, 2026

Closes the loop on the de-bloat split regression. collectors/universe_returns.py was importing `from trading_calendar import is_trading_day`, which broke when #43 moved the file out of this repo. Rather than restore a local copy (the patchwork approach in the now-closed #48), import from alpha_engine_lib.trading_calendar, a single source of truth shared across all Alpha Engine modules.

- collectors/universe_returns.py: `from alpha_engine_lib.trading_calendar ...`
- requirements.txt: bump alpha-engine-lib pin @v0.1.1 → @v0.1.3

Requires cipher813/alpha-engine-lib#4 merged + tagged v0.1.3 first. Spots pip-install from the git tag, so the new module appears at install time without any other plumbing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request on Apr 16, 2026

#49)

Closes the loop on the de-bloat split regression. collectors/universe_returns.py was importing `from trading_calendar import is_trading_day`, which broke when #43 moved the file out of this repo. Rather than restore a local copy (the patchwork approach in the now-closed #48), import from alpha_engine_lib.trading_calendar, a single source of truth shared across all Alpha Engine modules.

- collectors/universe_returns.py: `from alpha_engine_lib.trading_calendar ...`
- requirements.txt: bump alpha-engine-lib pin @v0.1.1 → @v0.1.3

Requires cipher813/alpha-engine-lib#4 merged + tagged v0.1.3 first. Spots pip-install from the git tag, so the new module appears at install time without any other plumbing.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Draft — do not merge until Friday 2026-04-17 weekday Step Function run validates the new paths.
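Part 3/3 of the ae-dashboard de-bloat split. Removes `trading_calendar.py`, `health_checker.py`, and their tests from this repo now that both scripts have been copied to alpha-engine-dashboard (cipher813/alpha-engine-dashboard#18) and the Step Function SSM commands have been repointed to run from `/home/ec2-user/alpha-engine-dashboard` (#42).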
After this merges, `alpha-engine-data` contains only data-production code (collectors, builders, features, weekly_collector) — the producer vs observability seam is clean.
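Pre-merge checklist (must be in this order)
1. Friday 2026-04-17 weekday Step Function run completes successfully using the new dashboard paths.
2. The ae-dashboard crontab entry for `health_checker.py --alert` is updated to run from `/home/ec2-user/alpha-engine-dashboard`.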
Post-merge cleanup
Manual op on ae-dashboard: `rm -rf /home/ec2-user/alpha-engine-data` to remove the now-unused clone.
Test plan
Deletions
🤖 Generated with Claude Code