refactor(iam): take ownership of shared orchestration roles by cipher813 · Pull Request #172 · cipher813/alpha-engine-data

cipher813 · 2026-05-06T14:45:04Z

Summary

Move the codified inline policies for alpha-engine-step-functions-role and alpha-engine-eventbridge-sfn-role from alpha-engine/infrastructure/iam/ to this repo. Their grants are derived from code that lives here (SF JSON Lambda invoke targets, EC2 instances the SF SSMs, EventBridge SFN target ARNs) — co-locating tightens the coupling so a single PR can change SF behavior + matching IAM atomically.

Also drops the last surviving inline put-role-policy block from deploy_step_function.sh (Saturday script — PR #170 dropped its EB-SFN twin, PR #151 dropped the daily script's SF role twin; this completes the trio) and scopes add-ssm-policy.sh to non-codified Lambda execution roles only.

What's added

infrastructure/iam/
├── alpha-engine-step-functions-role.json    # NEW
├── alpha-engine-eventbridge-sfn-role.json   # NEW
├── github-actions-lambda-deploy.json        # was already here
├── apply.sh                                 # was already here
├── check-drift.py                           # NEW (flat-layout variant)
└── README.md                                # NEW

.github/workflows/iam-drift-check.yml         # NEW
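
The core comparison a drift checker such as check-drift.py needs can be sketched as below. This is a minimal illustration, not the script's actual code: `canonical` and `has_drift` are hypothetical names, and the live document is assumed to have been fetched already (for example with `aws iam get-role-policy`). Both documents are normalized so that ordering and string-versus-list differences are not reported as drift.

```python
import json

# IAM accepts either a single string or a list for these keys.
LIST_KEYS = {"Action", "NotAction", "Resource", "NotResource"}


def canonical(node, key=None):
    """Return a copy of a policy document with lists sorted and scalars wrapped."""
    if isinstance(node, dict):
        return {k: canonical(v, k) for k, v in node.items()}
    if isinstance(node, list):
        items = [canonical(v) for v in node]
        # Sort by JSON text so lists of strings and lists of dicts both order stably.
        return sorted(items, key=lambda v: json.dumps(v, sort_keys=True))
    if key in LIST_KEYS:
        return [node]  # "ssm:SendCommand" and ["ssm:SendCommand"] are equivalent
    return node


def has_drift(codified, live):
    """True when the codified policy and the live policy differ semantically."""
    return canonical(codified) != canonical(live)


# Illustrative documents only; the real grants live in the JSON files above.
codified = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ssm:DescribeInstanceInformation", "ec2:StopInstances"],
        "Resource": "*",
    }],
}
reordered = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ec2:StopInstances", "ssm:DescribeInstanceInformation"],
        "Resource": ["*"],
    }],
}
clobbered = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "ssm:DescribeInstanceInformation",
        "Resource": "*",
    }],
}
```

Here `has_drift(codified, reordered)` is false because only ordering and list wrapping differ, while `has_drift(codified, clobbered)` is true because an action has gone missing from the live side.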

Live operations performed

OIDC trust policy on github-actions-iam-drift-check widened live to also permit repo:cipher813/alpha-engine-data:* (added 2 sub patterns alongside the existing 2 for alpha-engine). Trust policies + role creation stay out-of-band per existing convention.
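
Why the widening was needed can be illustrated with a small wildcard matcher in the style of IAM's StringLike operator (`*` matches any run of characters, `?` matches one). This is a sketch under assumptions: `string_like` and `sub_allowed` are hypothetical helpers, the alpha-engine pattern and the example `sub` claim are assumed values, and the real trust policy carries four patterns that are not reproduced here.

```python
import re


def string_like(pattern, value):
    """Approximate IAM StringLike: '*' is any run of characters, '?' is one character."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def sub_allowed(patterns, sub):
    """True when the OIDC token's sub claim matches at least one trusted pattern."""
    return any(string_like(p, sub) for p in patterns)


# Assumed pattern for the original repo; the exact live patterns are not in this PR.
before = ["repo:cipher813/alpha-engine:*"]
# Pattern named in the PR description.
after = before + ["repo:cipher813/alpha-engine-data:*"]

# Assumed example of a sub claim issued to a pull_request workflow in this repo.
sub = "repo:cipher813/alpha-engine-data:pull_request"
```

The alpha-engine pattern does not cover this repo, because the literal `:` after the repo name cannot match `-data`. The workflow here could therefore only authenticate once the alpha-engine-data pattern was added.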

Companion / supersedes

Companion: refactor(iam): foreign-writer guard + relocate shared roles to alpha-engine-data alpha-engine#137 — removes the codified directories from alpha-engine + updates the cross-repo foreign-writer guard to handle multi-repo discovery + flat layout.
Supersedes: fix(infra): drop inline SF-role IAM write from saturday deploy #171 (closed; broader relocation includes its scope).

Predictor follow-up (NOT in this PR)

add-ssm-policy.sh's ROLES list drops alpha-engine-predictor-role along with alpha-engine-executor-role. Executor's alpha-engine-ssm-read is already codified in alpha-engine. Predictor's live alpha-engine-ssm-read inline grant stays live (nothing deletes it) but is no longer maintained by this script. Codifying it on the predictor side is a separate small PR — alpha-engine-predictor's infrastructure/iam/ would need a directory-per-role refactor first to support multiple inline policies on the same role.
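
The layout constraint mentioned above can be made concrete with a sketch of policy discovery. It assumes a flat layout of `<role>.json` and a directory-per-role layout of `<role>/<policy>.json`; both conventions, the `discover` helper, and the second policy file name are hypothetical and not taken from either repo.

```python
from pathlib import PurePosixPath


def discover(paths):
    """Map policy files to (role_name, policy_name) pairs for both layouts."""
    found = []
    for raw in paths:
        p = PurePosixPath(raw)
        if p.suffix != ".json":
            continue  # skip apply.sh, README.md, check-drift.py, ...
        if len(p.parts) == 1:
            # Flat layout: the file name carries only the role, so a role can
            # have at most one codified inline policy.
            found.append((p.stem, None))
        elif len(p.parts) == 2:
            # Directory-per-role layout: the directory is the role, each file a policy.
            found.append((p.parts[0], p.stem))
    return found


flat = discover(["alpha-engine-step-functions-role.json", "apply.sh"])
per_role = discover([
    "alpha-engine-predictor-role/alpha-engine-ssm-read.json",
    "alpha-engine-predictor-role/another-inline-policy.json",  # hypothetical second policy
])
```

Under these assumptions the flat layout has no place to put a second inline policy for the same role, which is why the predictor repo would need the refactor first.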

Test plan

bash -n syntax-clean on both deploy scripts.
python3 infrastructure/iam/check-drift.py clean against live AWS for all 3 roles owned here (lambda-deploy, SF, EB-SFN).
Foreign-writer guard from companion PR runs clean against this branch.
CI runs both checks on this PR.
Saturday SF auto-fire (Sat 2026-05-09) confirms no regression.
Weekday SF auto-fire (Thu 2026-05-07) confirms no regression.

🤖 Generated with Claude Code

Step Functions execution role + EventBridge cron role move here from alpha-engine/infrastructure/iam/. Their grants are derived from code that lives in this repo (SF JSON Lambda invoke targets, EC2 instances the SF SSMs, EventBridge SFN target ARNs). Co-locating the codified IAM with the source of those grants tightens the coupling so a single PR can change SF behavior + matching IAM atomically.

Files added:
- infrastructure/iam/alpha-engine-step-functions-role.json
- infrastructure/iam/alpha-engine-eventbridge-sfn-role.json
- infrastructure/iam/check-drift.py (flat-layout variant of alpha-engine's directory-per-role drift checker)
- infrastructure/iam/README.md (documents which roles this repo owns + the single-writer rule)
- .github/workflows/iam-drift-check.yml (PR + daily 09:30 UTC + manual)

Files updated:
- infrastructure/deploy_step_function.sh: drops the surviving inline put-role-policy block against alpha-engine-step-functions-role (PR #170 dropped the EB-SFN twin; PR #151 dropped the daily-script twin; this completes the trio). The script kept a stale, narrower policy that clobbered ssm:DescribeInstanceInformation + ec2:StopInstances + the trading-instance SSM ARN on every Saturday deploy. Trust policy + create-role bootstrap stay (one-time setup).
- infrastructure/add-ssm-policy.sh: drops alpha-engine-executor-role + alpha-engine-predictor-role from the ROLES list. Both now have alpha-engine-ssm-read codified in their home repos (executor: already; predictor: covered by separate PR codifying its existing live grant). Script remains the writer for non-codified Lambda execution roles only.

OIDC trust policy on github-actions-iam-drift-check widened live to also permit repo:cipher813/alpha-engine-data so the new drift-check workflow here can authenticate with the existing OIDC role.

Companion PRs:
- alpha-engine #137: removes the codified directories, updates the cross-repo foreign-writer guard.
- alpha-engine-predictor (separate): codifies existing live alpha-engine-ssm-read grant on predictor-role.

Supersedes alpha-engine-data #171 (which only addressed the Saturday script's inline write).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…; requires lib v0.15.0) (#226)

Wave 1 PR β of the institutional data-revamp arc (plan doc: ~/Development/alpha-engine-docs/private/data-revamp-260513.md). Producer-side concrete adapter implementations + multi-source aggregator. Pairs with alpha-engine-lib PR #46 (PR α, v0.15.0) which defined NewsSource Protocol + NewsArticle shape.

Architectural pattern: data is the producer; lib defines the contract; research is the consumer (will read producer outputs via S3 + RAG retrieval in future sub-PRs, never imports adapters directly).

New modules:

collectors/news_sources/
- polygon.py: FREE. Uses our existing polygon_client (data repo's copy) for rate-limit reuse. Normalizes /v2/reference/news.
- gdelt.py: FREE (no key). GDELT 2.0 DOC API; academic-grade event-extracted news. Requires ticker→name map for query building.
- yahoo_rss.py: FREE (fallback). Pure feedparser-based; matches existing collectors/alternative.py pattern but normalized into NewsArticle.
- benzinga.py: PAID stub. Raises NotImplementedError on init.
- ravenpack.py: PAID stub.
- bloomberg.py: PAID stub.

collectors/news_aggregator.py
- NewsAggregator(sources, trust_weights): fan-in across enabled NewsSource adapters → dedup (composite fingerprint: normalized title + URL path hash with querystring/fragment stripped) → preserve all source-provenance variants → return AggregatedNewsArticle records sorted by earliest_published_at desc.
- DEFAULT_TRUST_WEIGHTS: paid 0.95-1.0, polygon 0.9, gdelt 0.85, edgar_press 0.95, yahoo_rss 0.5.

Lib pin bumps (lockstep, both must move per the pin-lockstep test):
- requirements.txt v0.12.0 → v0.15.0
- Dockerfile v0.12.0 → v0.15.0

What's deferred (subsequent Wave 1 sub-PRs):
- PR A.1: NLP pipeline (Loughran-McDonald + FinBERT + spaCy NER + LLM event extraction). Heavier deps; separate PR.
- PR A.2: Structured aggregates writer (S3 parquet per ticker per day). Joined onto research's snapshot in PR F.
- PR A.3: RAG ingest path: news → chunked → embedded → indexed in pgvector alongside existing SEC filings corpus.
- PR B: Filings substrate expansion (EDGAR full coverage: 10-K/Q/14A/S-1/13D/G/13F/Form-4).
- PR C: Analyst substrate (yfinance + FMP adapters + self-derived revisions tracking from daily snapshots).
- PR D: Async + S3 cache + per-vendor rate limiters.
- PR E: Wire RAG retrieval tools into research repo's thesis_update + sector agents.
- PR F: Wire new substrate into research's fetch_data (supersedes #170's per-ticker pre-fetch).

+37 unit tests:
- Protocol structural-subtyping for all 3 free adapters
- Polygon: happy + transient-failure-per-ticker + schema-drift-skip
- GDELT: happy + query building (multi-word vs single-word) + failure-skips-ticker + missing-name-map-fallback
- Yahoo RSS: happy + entries-older-than-cutoff-dropped + no-link-skipped + fetch-failure-skips-ticker
- Paid stubs: all 3 raise NotImplementedError on init
- Aggregator: fan-in + URL/title dedup + canonical-title-longest + canonical-url-highest-trust + ticker-union + one-broken-source-isolated + output-sorted-desc + empty-fan-in
- Trust weights: defaults + overrides + unknown-source-defaults-half
- Fingerprint determinism
- Lib shape contract pin (extra='forbid' + frozen)

Suite: 848 passing.

Composes with:
- alpha-engine-lib PR #46 (v0.15.0): required for shapes + Protocols
- alpha-engine-research PR #172 (CLOSED): original mis-located substrate; relocated here per architectural correction

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
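
The dedup fingerprint described for NewsAggregator above (normalized title plus URL with querystring and fragment stripped) might look roughly like the sketch below. It is not the code in collectors/news_aggregator.py: the function names, the exact title normalization, the choice of SHA-256, and the decision to include the host alongside the path are all assumptions made for illustration.

```python
import hashlib
import re
from urllib.parse import urlsplit


def normalize_title(title):
    """Lowercase and collapse punctuation/whitespace runs to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def fingerprint(title, url):
    """Hash the normalized title with the URL's host + path (no query, no fragment)."""
    parts = urlsplit(url)
    # Assumption: keep the host so identical paths on different sites stay distinct.
    canonical_url = f"{parts.netloc.lower()}{parts.path.rstrip('/')}"
    payload = f"{normalize_title(title)}|{canonical_url}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical articles: the same story with and without tracking parameters.
a = fingerprint("Apple Beats Estimates!", "https://example.com/news/apple-q2?utm_source=rss#top")
b = fingerprint("apple beats estimates", "https://example.com/news/apple-q2")
c = fingerprint("apple beats estimates", "https://example.com/news/apple-q3")
```

With this scheme `a` and `b` collapse to one record, while `c` stays separate because its path differs. Hashing is deterministic, which is the property the "Fingerprint determinism" test in the list above refers to.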

This was referenced May 6, 2026

fix(infra): drop inline SF-role IAM write from saturday deploy #171

Closed

refactor(iam): foreign-writer guard + relocate shared roles to alpha-engine-data cipher813/alpha-engine#137

Merged

cipher813 added 2 commits May 6, 2026 08:00

ci: re-trigger after OIDC trust + scope widening

ac254b5

ci: re-trigger pull_request workflow after OIDC widening

859c692

cipher813 merged commit f6818da into main May 6, 2026
2 checks passed

cipher813 deleted the refactor/move-shared-iam-to-data branch May 6, 2026 15:03

cipher813 merged 3 commits into main from refactor/move-shared-iam-to-data
