feat(research): wire factor-profiles producer into the graph (un-orphan compute_and_write_factor_profiles) #203
Merged
Splice a new `compute_factor_profiles_node` into graph/research_graph.py
between `macro_economist_node` and `compute_focus_list_node`:

    fetch_data → load_regime_substrate_node → macro_economist_node
      → compute_factor_profiles_node → compute_focus_list_node → dispatch …

Splice point + why:
- The producer `scoring.factor_scoring.compute_and_write_factor_profiles`
needs `sector_map` + `run_date`, both populated in `fetch_data` and
NOT mutated by load_regime_substrate_node or macro_economist_node.
- It must land `factors/profiles/{run_date}/by_ticker.json` +
`latest.json` in S3 BEFORE both consumers do their existing
`read_factor_profiles_from_s3()`: `compute_focus_list_node` (~:1198)
and `score_aggregator` (~:1322, downstream of the dispatch off
compute_focus_list_node). Splicing on the
macro→compute_focus_list_node edge satisfies both with one re-route.
- This edge was chosen over `fetch_data → load_regime_substrate_node`
because the Stage-C serial chain (fetch_data → substrate loader →
macro) is a pinned topology invariant
(tests/test_regime_stage_b_graph_topology.py); the macro→focus-list
edge is the cleanest splice that preserves every pinned edge and the
regime / macro / focus-list / dispatch chain + conditional edges.
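
The re-route described above can be sketched as a pure edge-list operation. This is a minimal illustration, not the code in graph/research_graph.py: the `splice_node` helper and the tuple-based edge list are assumptions made for the example, and only the node names come from this PR.

```python
from typing import List, Tuple

Edge = Tuple[str, str]


def splice_node(edges: List[Edge], src: str, dst: str, new: str) -> List[Edge]:
    """Replace the edge src -> dst with src -> new -> dst; all other edges are untouched.

    Hypothetical helper for illustration only.
    """
    if (src, dst) not in edges:
        raise ValueError(f"edge {src} -> {dst} not found")
    out: List[Edge] = []
    for edge in edges:
        if edge == (src, dst):
            out.extend([(src, new), (new, dst)])
        else:
            out.append(edge)
    return out


# Serial chain before the splice (node names taken from the PR description).
BEFORE: List[Edge] = [
    ("fetch_data", "load_regime_substrate_node"),
    ("load_regime_substrate_node", "macro_economist_node"),
    ("macro_economist_node", "compute_focus_list_node"),
]

# The pinned fetch_data -> substrate loader -> macro edges must survive the splice.
PINNED: List[Edge] = BEFORE[:2]

AFTER = splice_node(
    BEFORE,
    "macro_economist_node",
    "compute_focus_list_node",
    "compute_factor_profiles_node",
)
```

Because only the macro→focus-list edge is rewritten, every edge in `PINNED` is still present in `AFTER`, which is the property the topology test guards.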
Graceful-degrade: the node wraps the producer so ANY failure (missing
run_date, missing/short `features/{run_date}/*.parquet`, S3 error,
compute exception) is caught, logged flow-doctor-visibly
(warning/error), and the node returns cleanly
({"factor_profiles_written": False, "factor_profiles_s3_key": ""}) so
the graph continues. The consumers then degrade exactly as they do
today when the substrate is absent (they already `if not
factor_profiles: skip`) — i.e. no worse than the prior orphaned state.
The weekly research run is never hard-failed on this new dependency.
Profiles are NOT threaded through state — consumers read from S3 by
design; only a small observability delta flows
(`factor_profiles_written` / `factor_profiles_s3_key`).
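
A minimal sketch of the graceful-degrade wrapper described above. It assumes the producer is called with `run_date` and `sector_map` keywords and returns the S3 key it wrote; the real signature of `compute_and_write_factor_profiles` is not shown in this PR, so the producer is injected and the `make_factor_profiles_node` factory is hypothetical.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)

# Observability delta returned when the producer did not write anything.
_NOT_WRITTEN: Dict[str, Any] = {
    "factor_profiles_written": False,
    "factor_profiles_s3_key": "",
}


def make_factor_profiles_node(
    producer: Callable[..., str],
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Build a node that never raises; `producer` stands in for the real function."""

    def node(state: Dict[str, Any]) -> Dict[str, Any]:
        run_date = state.get("run_date")
        if not run_date:
            logger.warning("factor profiles skipped: run_date missing from state")
            return dict(_NOT_WRITTEN)
        try:
            # Assumed call shape; the actual producer signature may differ.
            key = producer(run_date=run_date, sector_map=state.get("sector_map", {}))
        except Exception:
            logger.error("factor profiles producer failed", exc_info=True)
            return dict(_NOT_WRITTEN)
        # Only the small delta flows through state; profiles themselves stay in S3.
        return {"factor_profiles_written": True, "factor_profiles_s3_key": key}

    return node
```

The node returns only the two observability keys, so downstream consumers keep reading profiles from S3 rather than from graph state.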
Behavior-safety: NO flag is flipped. `config.FACTOR_BLEND_ENABLED`
and `config.FOCUS_LIST_GATING_ENABLED` stay default-false — this is
substrate-only: it makes `s3://alpha-engine-research/factors/` exist
(it is empty in prod today since the producer was orphaned / test-only)
and lets the focus-list shadow audit populate
`scanner_evaluations.focus_*`. No scoring/agent behavior changes.
Closes ROADMAP P1 "Wire the orphaned factor-profiles producer into the
Saturday SF" and unblocks the FOCUS_LIST P0's real gate (its shadow
audit now sees a populated factor substrate each run).
Tests:
- tests/test_factor_profiles_node.py (new): (a) node calls
compute_and_write_factor_profiles with the state's run_date +
sector_map and returns the observability delta on success;
(b) producer exception / missing run_date → node logs + returns
cleanly, no raise (graph continues); (c) static-AST graph-wiring
assertions (mirroring test_regime_stage_b_graph_topology.py) that the
node is registered, runs after fetch_data via the macro chain, and
strictly before compute_focus_list_node AND score_aggregator, without
altering the sector dispatch.
- tests/test_dry_run.py: fixed a pre-existing order-dependent
test-isolation bug surfaced (not caused) by the new test file.
`TestGraphModuleGuard.test_skips_late_bound_patches_when_graph_absent`
and `TestInstallRestore` setup/teardown evicted/replaced the real
`sys.modules["graph.research_graph"]` WITHOUT restoring it; a later
re-import created a second module object so other test modules'
collection-time-bound `_build_signals_payload` no longer saw their
`monkeypatch.setattr("graph.research_graph.<FLAG>", …)` (the leak the
test_regime_stage_b_graph_topology.py docstring documents). Now both
snapshot + restore the real module. Full suite: 1366 passed.
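
The snapshot-and-restore fix for the `sys.modules` leak can be sketched with a small context manager. This illustrates the pattern only, not the code in tests/test_dry_run.py: `preserved_module` and the stand-in module name `demo_research_graph` are hypothetical.

```python
import sys
import types
from contextlib import contextmanager
from typing import Iterator

_MISSING = object()


@contextmanager
def preserved_module(name: str) -> Iterator[None]:
    """Snapshot sys.modules[name] and put the same object back on exit."""
    saved = sys.modules.get(name, _MISSING)
    try:
        yield
    finally:
        if saved is _MISSING:
            # The module was absent before the block, so leave it absent.
            sys.modules.pop(name, None)
        else:
            # Restore the original object so earlier references and patches still match.
            sys.modules[name] = saved


# Demonstration with a stand-in name (not the real graph module).
DEMO_NAME = "demo_research_graph"
real_module = types.ModuleType(DEMO_NAME)
sys.modules[DEMO_NAME] = real_module

with preserved_module(DEMO_NAME):
    # Simulate a test that replaces the module with a different object.
    sys.modules[DEMO_NAME] = types.ModuleType(DEMO_NAME)

restored_is_original = sys.modules[DEMO_NAME] is real_module
```

Restoring the identical module object matters here: a string-targeted `monkeypatch.setattr` resolves the module through `sys.modules`, so a replaced entry would receive the patch while functions bound to the original module at collection time would not see it.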
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request on May 19, 2026
…ollow-up to #203) (#204)

#203 wired the producer with graceful-degrade (catch/log/continue). Per Brian + feedback_no_silent_fails: that recreates the exact orphaned-producer silent-failure class this wiring exists to fix. A failing producer would log a warning nobody reads while focus-list + factor-blend silently go inert again.

compute_factor_profiles_node now RAISES on any failure (missing run_date, producer exception) → the Research SF state fails loudly + alerts.

Not spuriously fragile: features/{run_date}/*.parquet is produced by DataPhase1 UPSTREAM in the same Saturday SF, so its absence is already an incident (DataPhase1 should have failed). This surfaces real breakage and never fails a healthy run. Matches the system's fail-loud norm (DataPhase2 populated-ratio gate; optimizer PR5 empty-order-book-not-legacy-fallback).

Still substrate-only: no flag flipped, no scoring change. Docstring + 2 tests flipped graceful-return → pytest.raises. Suite 1366 passed.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
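
A minimal sketch of the fail-loud behavior this follow-up describes, under the same assumptions as before: the producer is injected, is called with `run_date` and `sector_map` keywords, and returns the S3 key. The `make_fail_loud_node` factory and the choice of `ValueError` for a missing `run_date` are hypothetical; the commit only states that the node raises.

```python
from typing import Any, Callable, Dict


def make_fail_loud_node(
    producer: Callable[..., str],
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Build a node that raises on any failure instead of logging and continuing."""

    def node(state: Dict[str, Any]) -> Dict[str, Any]:
        run_date = state.get("run_date")
        if not run_date:
            # Assumed exception type; the real node may raise something else.
            raise ValueError("compute_factor_profiles_node: run_date missing from state")
        # No try/except: a producer exception propagates and fails the run.
        key = producer(run_date=run_date, sector_map=state.get("sector_map", {}))
        return {"factor_profiles_written": True, "factor_profiles_s3_key": key}

    return node
```

With this shape, a test that previously asserted the clean `False` delta would instead wrap the call in `pytest.raises`, as the commit message notes.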
What

Splice a new `compute_factor_profiles_node` into `graph/research_graph.py` so the previously-orphaned producer `scoring.factor_scoring.compute_and_write_factor_profiles` (zero production callers, test-only → `s3://alpha-engine-research/factors/` empty in prod) actually runs every Saturday SF research run.

Splice point + why

Spliced on the `macro_economist_node → compute_focus_list_node` edge (one re-route: that edge becomes `macro → compute_factor_profiles_node` + new `compute_factor_profiles_node → compute_focus_list_node`):

- The producer needs `sector_map` + `run_date`, both populated in `fetch_data` and not mutated by `load_regime_substrate_node` / `macro_economist_node`.
- It must land `factors/profiles/{run_date}/by_ticker.json` + `latest.json` before both consumers' existing `read_factor_profiles_from_s3()`: `compute_focus_list_node` (:1198) and `score_aggregator` (:1322, downstream of the dispatch that hangs off `compute_focus_list_node`). This one splice satisfies both.
- The `fetch_data → load_regime_substrate_node` edge was not used because the Stage-C serial chain is a pinned topology invariant (`tests/test_regime_stage_b_graph_topology.py`). The macro→focus-list edge is the cleanest splice preserving every pinned edge and the regime / macro / focus-list / dispatch chain + conditional edges.

No SF/infra change required: it is a plain in-graph serial node.

Graceful-degrade

The node wraps the producer so any failure (missing `run_date`, missing/short `features/{run_date}/*.parquet`, S3 error, compute exception) is caught, logged flow-doctor-visibly (`logger.warning` / `logger.error`), and the node returns cleanly (`{"factor_profiles_written": False, "factor_profiles_s3_key": ""}`). The graph continues; the consumers degrade exactly as they do today when the substrate is absent (they already `if not factor_profiles: skip`), which is no worse than the prior orphaned state. The weekly research run is never hard-failed on this new dependency. Profiles are not threaded through state: consumers read from S3 by design, and only a small observability delta flows.

Behavior-safety

No flag is flipped. `config.FACTOR_BLEND_ENABLED` and `config.FOCUS_LIST_GATING_ENABLED` stay default-false. This is substrate-only: it makes the factor substrate exist/ready and lets the focus-list shadow audit populate `scanner_evaluations.focus_*`. No scoring or agent behavior changes.

Closes / unblocks

- Closes ROADMAP P1 "Wire the orphaned factor-profiles producer into the Saturday SF".
- Unblocks the FOCUS_LIST P0's real gate (its shadow audit now sees a populated factor substrate each run).

Tests

- `tests/test_factor_profiles_node.py` (new): (a) node calls `compute_and_write_factor_profiles` with the state's `run_date` + `sector_map` and returns the observability delta on success; (b) producer exception / missing `run_date` → node logs + returns cleanly, no raise; (c) static-AST graph-wiring assertions (mirroring `test_regime_stage_b_graph_topology.py`) that the node is registered, runs after `fetch_data` via the macro chain, and strictly before `compute_focus_list_node` AND `score_aggregator`, without altering the sector dispatch.
- `tests/test_dry_run.py`: fixed a pre-existing order-dependent test-isolation bug surfaced (not caused) by the new test file. `TestGraphModuleGuard.test_skips_late_bound_patches_when_graph_absent` and `TestInstallRestore` setup/teardown evicted/replaced the real `sys.modules["graph.research_graph"]` without restoring it; a later re-import created a second module object so other test modules' collection-time-bound `_build_signals_payload` no longer saw their `monkeypatch.setattr("graph.research_graph.<FLAG>", …)` (the exact leak the `test_regime_stage_b_graph_topology.py` docstring documents and previously only "sidestepped" via filename ordering). Both sites now snapshot + restore the real module.

Full suite: 1366 passed (`~/Development/alpha-engine-research/.venv/bin/python -m pytest -q`).

🤖 Generated with Claude Code
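
The static-AST wiring assertions in (c) can be approximated as follows. This sketch assumes the graph is built with `add_edge("src", "dst")` calls that take two string literals; the PR does not show how graph/research_graph.py declares its edges, and conditional or dispatch edges would need extra handling. `literal_edges`, `reachable`, and `SAMPLE_SOURCE` are hypothetical names used only for illustration.

```python
import ast
from typing import List, Set, Tuple

Edge = Tuple[str, str]


def literal_edges(source: str) -> List[Edge]:
    """Collect (src, dst) pairs from add_edge("src", "dst") calls without importing the module."""
    edges: List[Edge] = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "add_edge"
            and len(node.args) == 2
            and all(
                isinstance(arg, ast.Constant) and isinstance(arg.value, str)
                for arg in node.args
            )
        ):
            edges.append((node.args[0].value, node.args[1].value))
    return edges


def reachable(edges: List[Edge], start: str) -> Set[str]:
    """Return every node reachable from `start` by following edges forward."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        current = stack.pop()
        for src, dst in edges:
            if src == current and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


# Stand-in source text; a real test would read graph/research_graph.py instead.
SAMPLE_SOURCE = '''
g.add_edge("fetch_data", "load_regime_substrate_node")
g.add_edge("load_regime_substrate_node", "macro_economist_node")
g.add_edge("macro_economist_node", "compute_factor_profiles_node")
g.add_edge("compute_factor_profiles_node", "compute_focus_list_node")
'''

EDGES = literal_edges(SAMPLE_SOURCE)
```

Parsing the source rather than importing it keeps the check independent of `sys.modules` state, which is relevant given the isolation bug described above.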