refactor(iam): take ownership of shared orchestration roles #172
Merged
Step Functions execution role + EventBridge cron role move here from alpha-engine/infrastructure/iam/. Their grants are derived from code that lives in this repo (SF JSON Lambda invoke targets, EC2 instances the SF SSMs, EventBridge SFN target ARNs). Co-locating the codified IAM with the source of those grants tightens the coupling so a single PR can change SF behavior + matching IAM atomically.

Files added:
- infrastructure/iam/alpha-engine-step-functions-role.json
- infrastructure/iam/alpha-engine-eventbridge-sfn-role.json
- infrastructure/iam/check-drift.py (flat-layout variant of alpha-engine's directory-per-role drift checker)
- infrastructure/iam/README.md (documents which roles this repo owns + the single-writer rule)
- .github/workflows/iam-drift-check.yml (PR + daily 09:30 UTC + manual)

Files updated:
- infrastructure/deploy_step_function.sh: drops the surviving inline put-role-policy block against alpha-engine-step-functions-role (PR #170 dropped the EB-SFN twin; PR #151 dropped the daily-script twin; this completes the trio). The script kept a stale narrower policy that clobbered ssm:DescribeInstanceInformation + ec2:StopInstances + the trading-instance SSM ARN every Saturday deploy. Trust policy + create-role bootstrap stay (one-time setup).
- infrastructure/add-ssm-policy.sh: drops alpha-engine-executor-role + alpha-engine-predictor-role from the ROLES list. Both now have alpha-engine-ssm-read codified in their home repos (executor: already; predictor: covered by separate PR codifying its existing live grant). Script remains the writer for non-codified Lambda execution roles only.

OIDC trust policy on github-actions-iam-drift-check widened live to also permit repo:cipher813/alpha-engine-data so the new drift-check workflow here can authenticate with the existing OIDC role.

Companion PRs:
- alpha-engine #137: removes the codified directories, updates the cross-repo foreign-writer guard.
- alpha-engine-predictor (separate): codifies existing live alpha-engine-ssm-read grant on predictor-role.

Supersedes alpha-engine-data #171 (which only addressed the Saturday script's inline write).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
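To illustrate the kind of clobbering the commit message describes, here is a minimal sketch of a drift comparison between a codified policy document and the live one. It is not the repo's check-drift.py: the function names, the `"*"` resources, and the `ssm:SendCommand` action are assumptions for illustration, and fetching the live document from AWS is left out so the comparison stays a pure function.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM allows a single value or a list for Statement/Action/Resource."""
    return value if isinstance(value, list) else [value]


def allowed_pairs(policy: dict) -> set:
    """Flatten every Allow statement into (action, resource) pairs."""
    pairs = set()
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action", [])):
            for resource in _as_list(stmt.get("Resource", [])):
                pairs.add((action, resource))
    return pairs


def drift(codified: dict, live: dict) -> dict:
    """Report grants present on only one side of the comparison."""
    want, have = allowed_pairs(codified), allowed_pairs(live)
    return {
        "missing_live": sorted(want - have),  # codified but absent in AWS
        "extra_live": sorted(have - want),    # in AWS but not codified
    }


# Hypothetical documents: the codified policy carries the two grants named
# in the commit message, while the live one reflects a narrower inline write.
CODIFIED = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ssm:SendCommand",  # assumed action, for illustration only
                "ssm:DescribeInstanceInformation",
                "ec2:StopInstances",
            ],
            "Resource": "*",  # assumed resource, for illustration only
        }
    ],
}
LIVE = {
    "Version": "2012-10-17",
    "Statement": {"Effect": "Allow", "Action": "ssm:SendCommand", "Resource": "*"},
}

if __name__ == "__main__":
    print(drift(CODIFIED, LIVE))
```

In this sketch a second writer that puts a narrower policy shows up as entries under `missing_live`, which is the situation the single-writer rule is meant to prevent.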
This was referenced May 6, 2026
cipher813 added a commit that referenced this pull request May 13, 2026
…; requires lib v0.15.0) (#226)

Wave 1 PR β of the institutional data-revamp arc (plan doc: ~/Development/alpha-engine-docs/private/data-revamp-260513.md). Producer-side concrete adapter implementations + multi-source aggregator. Pairs with alpha-engine-lib PR #46 (PR α, v0.15.0) which defined NewsSource Protocol + NewsArticle shape.

Architectural pattern: data is the producer; lib defines the contract; research is the consumer (will read producer outputs via S3 + RAG retrieval in future sub-PRs, never imports adapters directly).

New modules:

collectors/news_sources/
  polygon.py: FREE. Uses our existing polygon_client (data repo's copy) for rate-limit reuse. Normalizes /v2/reference/news.
  gdelt.py: FREE (no key). GDELT 2.0 DOC API; academic-grade event-extracted news. Requires ticker→name map for query building.
  yahoo_rss.py: FREE (fallback). Pure feedparser-based; matches existing collectors/alternative.py pattern but normalized into NewsArticle.
  benzinga.py: PAID stub. Raises NotImplementedError on init.
  ravenpack.py: PAID stub.
  bloomberg.py: PAID stub.

collectors/news_aggregator.py
  NewsAggregator(sources, trust_weights): fan-in across enabled NewsSource adapters → dedup (composite fingerprint: normalized title + URL path hash with querystring/fragment stripped) → preserve all source-provenance variants → return AggregatedNewsArticle records sorted by earliest_published_at desc.
  DEFAULT_TRUST_WEIGHTS: paid 0.95-1.0, polygon 0.9, gdelt 0.85, edgar_press 0.95, yahoo_rss 0.5.

Lib pin bumps (lockstep, both must move per the pin-lockstep test):
  requirements.txt  v0.12.0 → v0.15.0
  Dockerfile        v0.12.0 → v0.15.0

What's deferred (subsequent Wave 1 sub-PRs):
  PR A.1: NLP pipeline (Loughran-McDonald + FinBERT + spaCy NER + LLM event extraction). Heavier deps; separate PR.
  PR A.2: Structured aggregates writer (S3 parquet per ticker per day). Joined onto research's snapshot in PR F.
  PR A.3: RAG ingest path: news → chunked → embedded → indexed in pgvector alongside existing SEC filings corpus.
  PR B: Filings substrate expansion (EDGAR full coverage: 10-K/Q/14A/S-1/13D/G/13F/Form-4).
  PR C: Analyst substrate (yfinance + FMP adapters + self-derived revisions tracking from daily snapshots).
  PR D: Async + S3 cache + per-vendor rate limiters.
  PR E: Wire RAG retrieval tools into research repo's thesis_update + sector agents.
  PR F: Wire new substrate into research's fetch_data (supersedes #170's per-ticker pre-fetch).

+37 unit tests:
- Protocol structural-subtyping for all 3 free adapters
- Polygon: happy + transient-failure-per-ticker + schema-drift-skip
- GDELT: happy + query building (multi-word vs single-word) + failure-skips-ticker + missing-name-map-fallback
- Yahoo RSS: happy + entries-older-than-cutoff-dropped + no-link-skipped + fetch-failure-skips-ticker
- Paid stubs: all 3 raise NotImplementedError on init
- Aggregator: fan-in + URL/title dedup + canonical-title-longest + canonical-url-highest-trust + ticker-union + one-broken-source-isolated + output-sorted-desc + empty-fan-in
- Trust weights: defaults + overrides + unknown-source-defaults-half
- Fingerprint determinism
- Lib shape contract pin (extra='forbid' + frozen)

Suite: 848 passing.

Composes with:
- alpha-engine-lib PR #46 (v0.15.0): required for shapes + Protocols
- alpha-engine-research PR #172 (CLOSED): original mis-located substrate; relocated here per architectural correction

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
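The dedup step in the referenced commit (composite fingerprint of normalized title plus URL path with querystring/fragment stripped, canonical title = longest, canonical URL = highest trust) could be read roughly as follows. This is a sketch under stated assumptions, not the code in collectors/news_aggregator.py: the normalization rule, the dict-based records, and all function names are hypothetical, and only the free-source trust weights quoted in the commit message are reproduced.

```python
import hashlib
import re
from urllib.parse import urlsplit

# Weights quoted in the commit message; the dict layout is an assumption.
DEFAULT_TRUST_WEIGHTS = {
    "polygon": 0.9,
    "gdelt": 0.85,
    "edgar_press": 0.95,
    "yahoo_rss": 0.5,
}
UNKNOWN_SOURCE_WEIGHT = 0.5  # per the "unknown-source-defaults-half" test name


def normalize_title(title: str) -> str:
    """Lowercase and collapse non-alphanumerics to spaces (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def url_path_key(url: str) -> str:
    """Host + path only; querystring and fragment are dropped."""
    parts = urlsplit(url)
    return f"{parts.netloc.lower()}{parts.path.rstrip('/')}"


def fingerprint(title: str, url: str) -> str:
    """Deterministic composite key over normalized title and URL path."""
    payload = f"{normalize_title(title)}|{url_path_key(url)}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def trust(source: str) -> float:
    return DEFAULT_TRUST_WEIGHTS.get(source, UNKNOWN_SOURCE_WEIGHT)


def dedup(articles: list) -> list:
    """Group dicts with keys source/title/url by fingerprint, keep provenance."""
    groups: dict = {}
    for art in articles:
        groups.setdefault(fingerprint(art["title"], art["url"]), []).append(art)
    merged = []
    for fp, variants in groups.items():
        most_trusted = max(variants, key=lambda a: trust(a["source"]))
        merged.append({
            "fingerprint": fp,
            "title": max((v["title"] for v in variants), key=len),
            "url": most_trusted["url"],
            "sources": sorted({v["source"] for v in variants}),
        })
    return merged


# Hypothetical records: the same story seen through two adapters.
SAMPLE = [
    {"source": "yahoo_rss", "title": "Acme beats estimates",
     "url": "https://example.com/news/acme?utm=1#top"},
    {"source": "polygon", "title": "ACME Beats Estimates!!",
     "url": "https://example.com/news/acme/"},
]

if __name__ == "__main__":
    print(dedup(SAMPLE))
```

Hashing host plus path rather than the full URL is what lets tracking parameters differ between sources without splitting one story into two records.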
Summary
Move the codified inline policies for `alpha-engine-step-functions-role` and `alpha-engine-eventbridge-sfn-role` from `alpha-engine/infrastructure/iam/` to this repo. Their grants are derived from code that lives here (SF JSON Lambda invoke targets, EC2 instances the SF SSMs, EventBridge SFN target ARNs); co-locating tightens the coupling so a single PR can change SF behavior + matching IAM atomically.

Also drops the last surviving inline `put-role-policy` block from `deploy_step_function.sh` (Saturday script: PR #170 dropped its EB-SFN twin, PR #151 dropped the daily script's SF role twin; this completes the trio) and scopes `add-ssm-policy.sh` to non-codified Lambda execution roles only.

What's added
Live operations performed
OIDC trust policy on `github-actions-iam-drift-check` widened live to also permit `repo:cipher813/alpha-engine-data:*` (added 2 sub patterns alongside the existing 2 for alpha-engine). Trust policies + role creation stay out-of-band per existing convention.

Companion / supersedes
Predictor follow-up (NOT in this PR)
`add-ssm-policy.sh`'s ROLES list drops `alpha-engine-predictor-role` along with `alpha-engine-executor-role`. Executor's `alpha-engine-ssm-read` is already codified in alpha-engine. Predictor's live `alpha-engine-ssm-read` inline grant stays live (nothing deletes it) but is no longer maintained by this script. Codifying it on the predictor side is a separate small PR; alpha-engine-predictor's `infrastructure/iam/` would need a directory-per-role refactor first to support multiple inline policies on the same role.

Test plan
- `bash -n` syntax-clean on both deploy scripts.
- `python3 infrastructure/iam/check-drift.py` clean against live AWS for all 3 roles owned here (lambda-deploy, SF, EB-SFN).

🤖 Generated with Claude Code