feat(sf): wire counterfactual rule fit Lambda into Saturday SF #168
Merged
Companion to alpha-engine-backtester #140 (counterfactual Lambda implementation). Closes the agent-justification triple end-to-end: all three signals now fire from the Saturday SF on the same trailing-8-week corpus.

SF chain after this PR: ... → EvalJudge{Weekly,FirstSaturday} → EvalRollingMean → CheckSkipRationaleClustering → RationaleClustering → CheckSkipReplayConcordance → ReplayConcordance → CheckSkipCounterfactual → Counterfactual → SaturdayHealthCheck → ...

Three independent skip-gates, each landing on the next signal's gate rather than the health check (so skipping one signal doesn't bundle-skip the others):
- {"skip_rationale_clustering": true} → CheckSkipReplayConcordance
- {"skip_replay_concordance": true} → CheckSkipCounterfactual
- {"skip_counterfactual": true} → SaturdayHealthCheck

Default Counterfactual payload pins production cadence:
- end_time_iso = SF execution start time
- window_days = 56 (8 weeks)
- max_depth = 3 ("3-deep rule")

No target_models payload: counterfactual doesn't replay against a target model; sklearn fits a tree on actual (input → decision) pairs.

IAM updates:
- github-actions-lambda-deploy.json: alpha-engine-replay-counterfactual added to LambdaUpdate + LambdaInvokeCanary lists. Asymmetric-IAM-grant antipattern compliance: 5th Lambda of this shape; durable CreateFunction grant from data #165 already covers create + update.
- deploy_step_function.sh: SF role inline LambdaInvoke list updated with the new function ARN so SF can invoke it.

Tests:
- TestStatesPresent: CheckSkipCounterfactual + Counterfactual added to required-states pin.
- TestSkipReplayConcordance: rerouted assertion (now lands at CheckSkipCounterfactual, not SaturdayHealthCheck).
- TestReplayConcordance: success + Catch reroutes (same).
- TestSkipCounterfactual: skip_counterfactual flag → SaturdayHealthCheck.
- TestCounterfactual: live alias, payload required fields (end_time_iso, window_days=56, max_depth=3), 600s timeout matches Lambda cap, success + Catch routes, retry posture.

Suite 459 → 467.

Composes with the agent-justification dashboard surface. All five metrics emit under the AlphaEngine/Eval namespace:
- agent_quality_score (eval-judge)
- agent_quality_score_4w_mean (eval-rolling-mean)
- agent_rationale_template_concentration (clustering)
- agent_cheap_model_concordance (concordance)
- agent_counterfactual_rule_fit (this one)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Companion to alpha-engine-backtester #140 (counterfactual Lambda implementation). Closes the agent-justification triple end-to-end — all three signals now fire from the Saturday SF on the same trailing-8-week corpus.
SF chain after this PR
```
... → EvalJudge{Weekly,FirstSaturday} → EvalRollingMean
→ CheckSkipRationaleClustering → RationaleClustering
→ CheckSkipReplayConcordance → ReplayConcordance
→ CheckSkipCounterfactual → Counterfactual
→ SaturdayHealthCheck → ...
```
Three independent skip-gates
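Each lands on the next signal's gate rather than SaturdayHealthCheck, so skipping one signal doesn't bundle-skip the others:
- `{"skip_rationale_clustering": true}` → CheckSkipReplayConcordance
- `{"skip_replay_concordance": true}` → CheckSkipCounterfactual
- `{"skip_counterfactual": true}` → SaturdayHealthCheck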
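For reviewers who want to see the routing concretely, here is a minimal sketch of the three gates as ASL `Choice` states, built and simulated in Python. The state and flag names come from this PR; the helper names `skip_gate` and `route`, and the exact shape of each `Choice` rule (an `IsPresent` guard combined with `BooleanEquals`), are assumptions for illustration and may not match the deployed definition.

```python
def skip_gate(flag: str, skip_to: str, run_state: str) -> dict:
    """Build an ASL Choice state that jumps to `skip_to` when `flag` is true."""
    variable = f"$.{flag}"
    return {
        "Type": "Choice",
        "Choices": [
            {
                # Assumed rule shape: only skip when the flag exists AND is true.
                "And": [
                    {"Variable": variable, "IsPresent": True},
                    {"Variable": variable, "BooleanEquals": True},
                ],
                "Next": skip_to,
            }
        ],
        # No flag (or flag false): run the signal this gate protects.
        "Default": run_state,
    }


# Each gate skips to the NEXT gate, not straight to SaturdayHealthCheck.
GATES = {
    "CheckSkipRationaleClustering": skip_gate(
        "skip_rationale_clustering", "CheckSkipReplayConcordance", "RationaleClustering"
    ),
    "CheckSkipReplayConcordance": skip_gate(
        "skip_replay_concordance", "CheckSkipCounterfactual", "ReplayConcordance"
    ),
    "CheckSkipCounterfactual": skip_gate(
        "skip_counterfactual", "SaturdayHealthCheck", "Counterfactual"
    ),
}


def route(gate_name: str, execution_input: dict) -> str:
    """Simulate which state a gate transitions to for a given execution input."""
    gate = GATES[gate_name]
    rule = gate["Choices"][0]
    flag = rule["And"][0]["Variable"][len("$."):]
    return rule["Next"] if execution_input.get(flag) is True else gate["Default"]
```

With an input of `{"skip_replay_concordance": True}`, only the middle signal is bypassed: `route("CheckSkipReplayConcordance", ...)` returns `CheckSkipCounterfactual`, while the other two gates still fall through to their own Lambda states.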
Default Counterfactual payload
```json
{
"end_time_iso.$": "$$.Execution.StartTime",
"window_days": 56,
"max_depth": 3
}
```
No `target_models` — counterfactual doesn't replay against a target model; sklearn fits a tree on actual (input → decision) pairs.
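To illustrate what "fits a tree on actual (input → decision) pairs" can look like, here is a rough sketch using scikit-learn's `DecisionTreeClassifier` with `max_depth=3`. The feature names, the toy rows, and the use of training accuracy as the fit score are all assumptions made for this example; the actual feature set and the definition of `agent_counterfactual_rule_fit` live in alpha-engine-backtester #140 and are not shown here.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy corpus: one row per agent decision in the window.
# Columns are made-up input features; labels are the decisions actually taken.
FEATURE_NAMES = ["momentum_score", "valuation_score"]
X = [
    [0.90, 0.20],
    [0.80, 0.10],
    [0.85, 0.30],
    [0.20, 0.90],
    [0.10, 0.70],
    [0.15, 0.80],
]
y = ["buy", "buy", "buy", "sell", "sell", "sell"]

# max_depth=3 mirrors the payload's "3-deep rule" cap.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# Assumption: score the rule by how often the shallow tree reproduces
# the decisions it was fitted on.
rule_fit = tree.score(X, y)

# Human-readable version of the fitted rule.
rules = export_text(tree, feature_names=FEATURE_NAMES)
```

Printing `rules` shows the learned splits as nested if/else text, which is a convenient way to eyeball whether a shallow rule explains the agent's decisions.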
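IAM updates
- `github-actions-lambda-deploy.json`: `alpha-engine-replay-counterfactual` added to the LambdaUpdate + LambdaInvokeCanary lists. Asymmetric-IAM-grant antipattern compliance: 5th Lambda of this shape; the durable CreateFunction grant from data #165 already covers create + update.
- `deploy_step_function.sh`: SF role inline LambdaInvoke list updated with the new function ARN so the SF can invoke it.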
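Test plan
- TestStatesPresent: CheckSkipCounterfactual + Counterfactual added to the required-states pin.
- TestSkipReplayConcordance: rerouted assertion (now lands at CheckSkipCounterfactual, not SaturdayHealthCheck).
- TestReplayConcordance: success + Catch reroutes (same).
- TestSkipCounterfactual: `skip_counterfactual` flag → SaturdayHealthCheck.
- TestCounterfactual: live alias, payload required fields (`end_time_iso`, `window_days=56`, `max_depth=3`), 600s timeout matches Lambda cap, success + Catch routes, retry posture.
- Suite 459 → 467.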
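As a rough picture of the kind of pin the Counterfactual test applies, here is a self-contained sketch against a hand-written stand-in for the state. The `COUNTERFACTUAL_STATE` dict is hypothetical: the payload, the 600s timeout, the live alias, and the function name come from this PR, but the exact `Resource`, the `Catch` shape, and the assumption that both success and Catch route to SaturdayHealthCheck are illustrative guesses. The real tests read the deployed definition and also cover retry posture, which is omitted here.

```python
# Hypothetical stand-in for the Counterfactual Task state.
COUNTERFACTUAL_STATE = {
    "Type": "Task",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Parameters": {
        # ":live" targets the live alias of the function.
        "FunctionName": "alpha-engine-replay-counterfactual:live",
        "Payload": {
            "end_time_iso.$": "$$.Execution.StartTime",
            "window_days": 56,
            "max_depth": 3,
        },
    },
    "TimeoutSeconds": 600,
    # Assumed: a failure here should not block the health check.
    "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "SaturdayHealthCheck"}],
    "Next": "SaturdayHealthCheck",
}


def test_live_alias():
    name = COUNTERFACTUAL_STATE["Parameters"]["FunctionName"]
    assert name.endswith(":live")


def test_payload_required_fields():
    payload = COUNTERFACTUAL_STATE["Parameters"]["Payload"]
    assert payload["end_time_iso.$"] == "$$.Execution.StartTime"
    assert payload["window_days"] == 56
    assert payload["max_depth"] == 3
    # Counterfactual does not replay against a target model.
    assert "target_models" not in payload


def test_timeout_matches_lambda_cap():
    assert COUNTERFACTUAL_STATE["TimeoutSeconds"] == 600


def test_success_and_catch_routes():
    assert COUNTERFACTUAL_STATE["Next"] == "SaturdayHealthCheck"
    catch_targets = {c["Next"] for c in COUNTERFACTUAL_STATE["Catch"]}
    assert catch_targets == {"SaturdayHealthCheck"}
```

The functions use plain `assert` statements, so they run under pytest or can be called directly.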
Agent-justification dashboard surface
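All five metrics emit under `AlphaEngine/Eval`:
- `agent_quality_score` (eval-judge)
- `agent_quality_score_4w_mean` (eval-rolling-mean)
- `agent_rationale_template_concentration` (clustering)
- `agent_cheap_model_concordance` (concordance)
- `agent_counterfactual_rule_fit` (this one)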
🤖 Generated with Claude Code