Delete trading_calendar + health_checker (moved to alpha-engine-dashboard) #43

Merged
cipher813 merged 1 commit into main from chore/remove-moved-files on Apr 16, 2026

Conversation

@cipher813
Owner

Summary

Draft — do not merge until Friday 2026-04-17 weekday Step Function run validates the new paths.

Part 3/3 of the ae-dashboard de-bloat split. Removes `trading_calendar.py`, `health_checker.py`, and their tests from this repo now that:

  1. Both scripts are present in alpha-engine-dashboard (merged: cipher813/alpha-engine-dashboard#18)
  2. The Step Function's `CheckTradingDay` + `HealthCheck` SSM commands run from `/home/ec2-user/alpha-engine-dashboard` (merged + deployed: cipher813/alpha-engine-data#42)

After this merges, `alpha-engine-data` contains only data-production code (collectors, builders, features, weekly_collector) — the producer vs observability seam is clean.

Pre-merge checklist (must be in this order)

  • Friday 2026-04-17 weekday Step Function run completes successfully with new paths — verify in SF console, confirm `/var/log/health-check.log` on ae-dashboard is populated
  • Update ae-dashboard crontab: there's a `0 */6 * * *` entry still running `cd /home/ec2-user/alpha-engine-data && .venv/bin/python health_checker.py --alert`. Operator runs `crontab -e` on ae-dashboard and swaps `/home/ec2-user/alpha-engine-data` → `/home/ec2-user/alpha-engine-dashboard`. If this isn't done first, the cron will break when this PR merges.
  • Pair merge with cipher813/alpha-engine-dashboard#19 (removes alpha-engine-data from the dashboard's boot-pull.sh REPOS list)
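The crontab swap described in the checklist could be sketched as follows. This is only an illustration, assuming the crontab contents have been captured as a string (for example from `crontab -l`); the function name `repoint_health_check_cron` is hypothetical and exists in neither repo.

```python
# Paths taken from the checklist above.
OLD_REPO = "/home/ec2-user/alpha-engine-data"
NEW_REPO = "/home/ec2-user/alpha-engine-dashboard"


def repoint_health_check_cron(crontab_text: str) -> str:
    """Swap the repo path on lines that run health_checker.py.

    Hypothetical helper: other crontab lines are returned unchanged,
    and running it twice has no further effect.
    """
    rewritten = []
    for line in crontab_text.splitlines():
        if "health_checker.py" in line and OLD_REPO in line:
            line = line.replace(OLD_REPO, NEW_REPO)
        rewritten.append(line)
    return "\n".join(rewritten)


if __name__ == "__main__":
    entry = (
        "0 */6 * * * cd /home/ec2-user/alpha-engine-data "
        "&& .venv/bin/python health_checker.py --alert"
    )
    print(repoint_health_check_cron(entry))
```

The operator would still review and install the result by hand via `crontab -e`, as the checklist describes; the sketch only shows the intended text change.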

Post-merge cleanup

Manual op on ae-dashboard: `rm -rf /home/ec2-user/alpha-engine-data` to remove the now-unused clone.

Test plan

  • Local `pytest` — 49 passed (deletions don't break anything)
  • Deletions are file-level; no other code in alpha-engine-data imports the removed modules
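The "no other code imports the removed modules" claim can be checked mechanically. The sketch below is one way to do so, not the check that was actually run for this PR: it parses every `.py` file under a repo root with the standard-library `ast` module and reports imports of the removed modules. The function name `find_importers` and the choice to skip `.venv` are assumptions made for illustration.

```python
import ast
from pathlib import Path

# Module names taken from the Deletions list in this PR.
REMOVED_MODULES = frozenset({"trading_calendar", "health_checker"})


def find_importers(repo_root, removed=REMOVED_MODULES):
    """Return (path, line_number, module) for each import of a removed module.

    Hypothetical helper: only absolute imports are considered, and files
    under a .venv directory or files that fail to parse are skipped.
    """
    root = Path(repo_root)
    hits = []
    for path in sorted(root.rglob("*.py")):
        if ".venv" in path.relative_to(root).parts:
            continue
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                names = [node.module]
            else:
                continue
            for name in names:
                # Match on the top-level package so "x.y" counts as "x".
                if name.split(".")[0] in removed:
                    hits.append((str(path), node.lineno, name))
    return hits


if __name__ == "__main__":
    for path, lineno, module in find_importers("."):
        print(f"{path}:{lineno} imports {module}")
```

Run after the deletions are applied, an empty result supports the claim; any hit is a consumer that still needs the module.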

Deletions

  • `trading_calendar.py` (134 lines)
  • `health_checker.py` (291 lines)
  • `tests/test_trading_calendar.py` (66 lines)
  • `tests/test_health_checker.py` (196 lines)

🤖 Generated with Claude Code

…oard)

Part 3/3 of ae-dashboard de-bloat. Both scripts were copied verbatim
to alpha-engine-dashboard in cipher813/alpha-engine-dashboard#18, and
the Step Function SSM commands were repointed in
#42 (merged 2026-04-16) to run from
/home/ec2-user/alpha-engine-dashboard.

This PR removes the original copies from alpha-engine-data now that
nothing in the weekday Step Function or Saturday pipeline references
them anymore. The data repo is now scoped purely to data-production
code (collectors, builders, features, weekly_collector) — matches the
producer-vs-observability seam documented in the earlier commits.

Pre-merge requirements (MERGE ORDER IS IMPORTANT):
  1. Friday 2026-04-17 weekday Step Function run must complete
     successfully using the new dashboard paths (verify CheckTradingDay
     logs `cd /home/ec2-user/alpha-engine-dashboard`, HealthCheck writes
     /var/log/health-check.log normally)
  2. Update ae-dashboard crontab — there's a
     `0 */6 * * *` entry still running `cd /home/ec2-user/alpha-engine-data
     && .venv/bin/python health_checker.py --alert`. Operator must
     crontab -e on ae-dashboard and swap the path to
     /home/ec2-user/alpha-engine-dashboard before this PR merges, else
     the cron breaks until that edit happens

Pairs with cipher813/alpha-engine-dashboard#19 (removes
alpha-engine-data from the dashboard's boot-pull.sh REPOS list).

Tests: full suite 49 passed (was 71 before the moves; the net drop of 22 is
26 tests deleted with the moved files, partly offset by 5 new
test_module_health tests added earlier, plus other changes).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cipher813 added a commit to cipher813/alpha-engine-dashboard that referenced this pull request Apr 16, 2026
Part of the ae-dashboard de-bloat split. After
cipher813/alpha-engine-data#43 merges (deletes trading_calendar.py +
health_checker.py from the data repo), ae-dashboard no longer has
any runtime dependency on alpha-engine-data — both files moved to
this repo in #18 and the Step
Function's CheckTradingDay + HealthCheck commands now run from here.

Removing alpha-engine-data from the REPOS array stops the daily
12:00 UTC boot-pull from refreshing the now-unused clone. Existing
clone at /home/ec2-user/alpha-engine-data on ae-dashboard can be
rm'd manually post-merge.

Pre-merge requirements:
  - cipher813/alpha-engine-data#43 merged (or mergeable in same batch)
  - ae-dashboard crontab updated so the `0 */6 * * *` health_checker
    cron no longer references /home/ec2-user/alpha-engine-data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 marked this pull request as ready for review April 16, 2026 16:53
@cipher813 cipher813 merged commit 7c1b22d into main Apr 16, 2026
1 check passed
@cipher813 cipher813 deleted the chore/remove-moved-files branch April 16, 2026 16:53
cipher813 added a commit that referenced this pull request Apr 16, 2026
DataPhase1 now runs on a self-terminating c5.large spot instance
(same pattern as Backtester + PredictorTraining) instead of
hammering the t3.micro. The micro becomes a dispatcher: pulls the
latest launcher script, sources .env, invokes bash
infrastructure/spot_data_phase1.sh. All heavy Python work
(yfinance, polygon, FRED, ArcticDB append) runs on the spot.

Rationale: the 2026-04-16 OOM incident showed that running
data-refresh workloads on a 1 GB RAM instance is fragile-by-design.
Even though Saturday DataPhase1 has historically fit in micro RAM
(it uses different code paths than the daily feature compute that
OOM'd today), consolidating all heavy weekly compute onto
self-terminating spots aligns DataPhase1 with the existing
Backtester/PredictorTraining pattern and removes the 1 GB ceiling
from future data-refresh growth.

Also: SaturdayHealthCheck SSM command repointed from
/home/ec2-user/alpha-engine-data (health_checker.py was deleted
from that repo in #43) to /home/ec2-user/alpha-engine-dashboard
where it now lives. Mirrors the same fix applied to the weekday
HealthCheck step in #42.

Files:
  - new  infrastructure/spot_data_phase1.sh      (spot launcher, mirrors spot_backtest.sh)
  - edit infrastructure/step_function.json        (DataPhase1 + SaturdayHealthCheck commands)

Timeout bumped 1800 → 2700s to accommodate spot bootstrap overhead
(~7 min for instance launch + pip install on top of ~20 min workload).

Deferred (separate PR): migrate RAGIngestion + DriftDetection to
spot as well. They still run on the micro and need alpha-engine-data
restored on ae-dashboard for now.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request Apr 16, 2026
…44)

* Migrate DataPhase1 to spot + fix SaturdayHealthCheck path


* Bundle DataPhase1+RAGIngestion on one spot; migrate DriftDetection

Extends the DataPhase1-to-spot migration to cover all three Saturday
SF steps that were running heavy alpha-engine-data workloads on the
t3.micro:

- DataPhase1 + RAGIngestion now share a single spot instance via
  spot_data_weekly.sh (renamed from spot_data_phase1.sh). Both
  workloads use the same alpha-engine-data clone + pip install —
  bundling saves ~7 min of bootstrap overhead and one spot request.
  RAGIngestion SF state chain (RAGIngestion + WaitForRAGIngestion +
  CheckRAGStatus + RAGWait + ExtractRAGError) removed; DataPhase1's
  success now wires directly to Research.

- DriftDetection moves to its own spot via spot_drift_detection.sh.
  Launcher clones BOTH alpha-engine-data and alpha-engine-predictor
  (drift_detector lives in data/monitoring/ but imports from
  predictor via PYTHONPATH). Overkill cost-wise for the ~5 min
  workload (~7 min bootstrap + ~5 min work vs ~5 min on micro), but
  completes the architectural goal: zero heavy venvs on the micro.

Net effect on ae-dashboard after next boot-pull:
  - alpha-engine-data: cloned (for launcher scripts only, ~300 lines bash)
  - alpha-engine-data/.venv: can be deleted permanently
  - 0 heavy Python workloads running on the t3.micro at any point in
    the Saturday pipeline

Timeout bumps:
  - DataPhase1 (bundled): 2700s → 3600s (phase1 ~20min + rag ~15min + bootstrap ~7min)
  - DriftDetection: 300s → 1200s (bootstrap ~7min + workload ~5min)
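As a rough check on these bumps, the budget arithmetic can be sketched in Python. The helper names are hypothetical, and the minute figures are the approximations quoted in the commit message, not measured values.

```python
# Approximate spot bootstrap (instance launch + pip install), per the commit message.
BOOTSTRAP_MIN = 7


def budget_seconds(*workload_minutes, bootstrap_minutes=BOOTSTRAP_MIN):
    """Expected wall-clock time in seconds: bootstrap plus all bundled workloads."""
    return 60 * (bootstrap_minutes + sum(workload_minutes))


def headroom_seconds(timeout_s, *workload_minutes):
    """Seconds left between the expected runtime and the configured timeout."""
    return timeout_s - budget_seconds(*workload_minutes)


# DataPhase1 (bundled): phase1 ~20 min + rag ~15 min + bootstrap ~7 min = 42 min
print(headroom_seconds(3600, 20, 15))  # 3600 - 2520 = 1080
# DriftDetection: workload ~5 min + bootstrap ~7 min = 12 min
print(headroom_seconds(1200, 5))       # 1200 - 720 = 480
```

On these estimates both timeouts leave several minutes of slack, 18 minutes for the bundled step and 8 minutes for drift detection.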

SF state count: 34 → 30 (-4 RAG chain states).

Followup roadmap P2: bundle DriftDetection onto PredictorTraining's
spot since drift reads predictor weights produced by that step —
would save another bootstrap cycle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request Apr 16, 2026
The ae-dashboard de-bloat split (#43) moved trading_calendar.py to
alpha-engine-dashboard on the (incorrect) assumption that the Step
Function's CheckTradingDay was the sole consumer. A spot smoke test
surfaced that collectors/universe_returns.py:215 also imports it:

    from trading_calendar import is_trading_day as nyse_is_trading_day

which broke the spot migration's DataPhase1 workflow
(ModuleNotFoundError in weekly_collector.py --phase 1).

This is a dual-copy smell (same file now lives in both repos) —
flagged in ROADMAP.md. The architecturally correct fix is to move
it into alpha-engine-lib so all consumers pull from one source of
truth, but that requires wiring up private-lib git auth on the spot
which is out of scope for the Saturday-unblock window.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request Apr 16, 2026
Closes the loop on the de-bloat split regression. collectors/universe_returns.py
was importing `from trading_calendar import is_trading_day`, which broke
when #43 moved the file out of this repo. Rather than restore a local copy
(the patchwork approach in the now-closed #48), import from
alpha_engine_lib.trading_calendar — a single source of truth shared across
all Alpha Engine modules.

- collectors/universe_returns.py: `from alpha_engine_lib.trading_calendar ...`
- requirements.txt: bump alpha-engine-lib pin @v0.1.1 → @v0.1.3

Requires cipher813/alpha-engine-lib#4 merged + tagged v0.1.3 first.
Spots pip-install from the git tag, so the new module appears at
install time without any other plumbing.
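Since the regression only surfaced in a spot smoke test, a cheap pre-flight check that required modules resolve after `pip install` might catch the same class of failure earlier. The sketch below is an assumption about how such a check could look, not something this commit adds; `unresolved_modules` is a hypothetical name, and it relies only on the standard-library `importlib.util.find_spec`.

```python
import importlib.util


def unresolved_modules(module_names):
    """Return the subset of module_names that cannot be located for import.

    Hypothetical helper. find_spec imports parent packages for dotted names,
    so a missing parent raises ModuleNotFoundError, which is treated the same
    as a missing module.
    """
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module path taken from this commit message.
    required = ["alpha_engine_lib.trading_calendar"]
    missing = unresolved_modules(required)
    if missing:
        raise SystemExit(f"unresolved modules: {', '.join(missing)}")
```

A check like this confirms that the module can be found, not that `is_trading_day` exists in it or behaves correctly.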

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request Apr 16, 2026
#49)

Closes the loop on the de-bloat split regression. collectors/universe_returns.py
was importing `from trading_calendar import is_trading_day`, which broke
when #43 moved the file out of this repo. Rather than restore a local copy
(the patchwork approach in the now-closed #48), import from
alpha_engine_lib.trading_calendar — a single source of truth shared across
all Alpha Engine modules.

- collectors/universe_returns.py: `from alpha_engine_lib.trading_calendar ...`
- requirements.txt: bump alpha-engine-lib pin @v0.1.1 → @v0.1.3

Requires cipher813/alpha-engine-lib#4 merged + tagged v0.1.3 first.
Spots pip-install from the git tag, so the new module appears at
install time without any other plumbing.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>