feat(iam): grant github-actions-lambda-deploy access to changelog/ prefix#120

Merged
cipher813 merged 2 commits into main from feat/iam-changelog-grant
May 1, 2026

@cipher813
Owner

Summary

Adds two Sids to the `github-actions-lambda-deploy` OIDC role so every alpha-engine* repo's deploy workflow can append entries to the new system-wide changelog at `s3://alpha-engine-research/changelog/`.

  • SystemChangelogAppend: `PutObject` + `GetObject` on `s3://alpha-engine-research/changelog/*`. Used by the `append-changelog` composite action.
  • SystemChangelogList: `ListBucket` scoped via `s3:prefix` condition to `changelog/`. Used by the daily aggregator cron.
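As a rough illustration of the two statements listed above, the following sketch builds them as a Python dict and prints the resulting policy JSON. The Sids, actions, bucket, and prefix come from this summary; the exact JSON layout, the `StringLike` operator, and the prefix patterns are assumptions, not a copy of the committed policy file.

```python
import json

BUCKET_ARN = "arn:aws:s3:::alpha-engine-research"

# Sids and actions are taken from the PR summary. The statement layout and
# the StringLike patterns below are assumptions for illustration only.
changelog_statements = [
    {
        "Sid": "SystemChangelogAppend",
        "Effect": "Allow",
        "Action": ["s3:PutObject", "s3:GetObject"],
        # Object-level actions are scoped by the object ARN pattern.
        "Resource": f"{BUCKET_ARN}/changelog/*",
    },
    {
        "Sid": "SystemChangelogList",
        "Effect": "Allow",
        "Action": "s3:ListBucket",
        # ListBucket is a bucket-level action, so the prefix restriction
        # lives in a condition rather than in the resource ARN.
        "Resource": BUCKET_ARN,
        "Condition": {
            "StringLike": {"s3:prefix": ["changelog/", "changelog/*"]}
        },
    },
]

policy = {"Version": "2012-10-17", "Statement": changelog_statements}
print(json.dumps(policy, indent=2))
```

The split into two statements reflects that `ListBucket` applies to the bucket while `PutObject` and `GetObject` apply to objects, so a single resource pattern cannot cover both.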

Companion PR

alpha-engine-docs PR (opening next) — composite action + aggregator cron.

Apply state

Already applied live via `infrastructure/iam/apply.sh github-actions-lambda-deploy`. Smoke-tested via manual `PutObject` → `ListBucket` → `DeleteObject` cycle before committing this codification.

Why

System-wide deploy provenance is currently scattered across each repo's git log. Cross-repo incident debugging (e.g. 2026-05-01 SF timeout cascade — root cause crossed alpha-engine-data + alpha-engine-predictor) requires reconstructing "what shipped where, when" by querying each repo separately. This grant enables a single chronological log.

🤖 Generated with Claude Code

…efix

Adds two Sids to the OIDC role used by every alpha-engine* repo's deploy
workflow:

- SystemChangelogAppend: PutObject + GetObject on
  s3://alpha-engine-research/changelog/*. Used by the
  append-changelog composite action in alpha-engine-docs.
- SystemChangelogList: ListBucket scoped via s3:prefix condition to
  changelog/. Used by the daily aggregator cron in alpha-engine-docs
  to materialize the system-wide CHANGELOG.md from the per-deploy
  JSON entries.

Why: every successful deploy across any alpha-engine* repo will now
emit one JSON entry to s3://alpha-engine-research/changelog/, and a
daily cron aggregates those entries into a Markdown view. See the
alpha-engine-docs PR for the full mechanism + caller pattern.

Already applied live via infrastructure/iam/apply.sh; this PR is
the codification of the source-of-truth file. Smoke-tested with a
manual put/list/delete cycle before committing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cipher813 cipher813 merged commit cdd87ab into main May 1, 2026
1 check passed
@cipher813 cipher813 deleted the feat/iam-changelog-grant branch May 1, 2026 15:26
cipher813 added a commit that referenced this pull request May 1, 2026
* feat(ci): wire deploy.yml + deploy-infrastructure.yml into system-wide changelog

Adds a final step to both deploy workflows that calls the
append-changelog composite action in alpha-engine-docs. Each
deploy, successful or failed, now emits one JSON entry to
s3://alpha-engine-research/changelog/.

Two distinct entries per merge that touches both surfaces:
- deploy.yml          → Phase 2 Lambda image rebuild + alias bump
- deploy-infrastructure.yml → SF + CF stamp re-deploy

Distinguished by the deploy_workflow field on each entry, so the
materialized CHANGELOG.md can show both as separate items under the
same SHA.
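
A minimal sketch of the grouping described above, assuming each entry carries a commit SHA next to its deploy_workflow field. Only deploy_workflow is named in this commit message; the `sha` and `status` field names and the sample values are hypothetical.

```python
from collections import defaultdict


def group_by_sha(entries):
    """Group changelog entries by commit SHA, keeping input order per SHA."""
    grouped = defaultdict(list)
    for entry in entries:
        # "sha" is a hypothetical field name; "deploy_workflow" is from the source.
        grouped[entry["sha"]].append(entry["deploy_workflow"])
    return dict(grouped)


# Placeholder entries for one merge that touched both deploy surfaces.
entries = [
    {"sha": "abc1234", "deploy_workflow": "deploy.yml", "status": "success"},
    {"sha": "abc1234", "deploy_workflow": "deploy-infrastructure.yml", "status": "failure"},
]

for sha, workflows in group_by_sha(entries).items():
    print(sha, workflows)
```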

Uses if: always() + ternary on job.status so failed deploys also
register in the log — the failure signal is itself a useful
provenance record.

Companion: alpha-engine-docs PR #3 (composite action + aggregator),
alpha-engine-data PR #120 (IAM grant — already merged).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(orchestration): SNS→S3 changelog incident mirror Lambda

Adds a small Lambda subscribed to the alpha-engine-alerts SNS topic
that mirrors every alert as one JSON entry under
s3://alpha-engine-research/changelog/incidents/. Closes the event-
mining loop alongside the deploy-side log: now both "what shipped"
and "what failed" feed the same time-ordered changelog.

Why
The 2026-05-01 weekday SF timeout cascade is the canonical example.
The deploy log records the 4 PRs that fixed it, but it never
captured the original SNS alert email at 06:01 PT — the failure
event itself. With this Lambda, that alert would have landed at
changelog/incidents/2026/05/01T13-01-XX_alpha-engine-alerts_*.json
with full subject + body, queryable months later for retro mining
("show me every SF failure incident this quarter").

Resources added (4)
- ChangelogIncidentMirrorRole       — minimal: PutObject scoped to
  changelog/incidents/* + AWSLambdaBasicExecutionRole for logs.
- ChangelogIncidentMirrorFunction   — python3.12, arm64, 256 MB,
  30s timeout. Inline ZipFile (~50 lines). Reads SNS Records,
  builds a JSON entry, and writes it with S3 PutObject. Tolerates
  malformed timestamps by falling back to "now".
- ChangelogIncidentMirrorSubscription — SNS subscription on
  AlertsTopic with Protocol: lambda.
- ChangelogIncidentMirrorPermission — Lambda::Permission letting
  SNS invoke the function.

Schema (matches the deploy-side action's event_type discriminator)
{
  "ts_utc": ...,
  "event_type": "incident",
  "source": "alpha-engine-alerts",
  "subject": "...",
  "summary": "...",        // first 240 chars of subject or message line 1
  "details": "...",        // full message body
  "sns_message_id": "...",
  "topic_arn": "..."
}
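
A hedged sketch of how a handler producing this schema might look. It is not the inline ZipFile from the template. The SNS record fields (`Subject`, `Message`, `MessageId`, `TopicArn`, `Timestamp`) are the standard Lambda SNS event shape; the `ts_utc` format, the key suffix (elided as `*` above), and the injectable `s3` parameter are assumptions made for illustration and testability.

```python
import json
from datetime import datetime, timezone

BUCKET = "alpha-engine-research"
PREFIX = "changelog/incidents"
SOURCE = "alpha-engine-alerts"


def parse_ts(raw):
    """Parse an SNS timestamp; fall back to the current UTC time if malformed."""
    try:
        parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%fZ")
        return parsed.replace(tzinfo=timezone.utc)
    except (TypeError, ValueError):
        return datetime.now(timezone.utc)


def build_entry(sns, ts):
    """Build one changelog entry from the 'Sns' part of an event record."""
    subject = sns.get("Subject") or ""
    message = sns.get("Message") or ""
    first_line = message.splitlines()[0] if message else ""
    return {
        "ts_utc": ts.isoformat(),  # assumed format; the source elides it
        "event_type": "incident",
        "source": SOURCE,
        "subject": subject,
        "summary": (subject or first_line)[:240],
        "details": message,
        "sns_message_id": sns.get("MessageId", ""),
        "topic_arn": sns.get("TopicArn", ""),
    }


def build_key(ts, message_id):
    # Using the message id as the key suffix is an assumption.
    return f"{PREFIX}/{ts:%Y/%m/%dT%H-%M-%S}_{SOURCE}_{message_id}.json"


def handler(event, context, s3=None):
    """Mirror each SNS record as one JSON object in S3."""
    if s3 is None:
        import boto3  # imported lazily so the pure helpers work without it
        s3 = boto3.client("s3")
    for record in event.get("Records", []):
        sns = record.get("Sns", {})
        ts = parse_ts(sns.get("Timestamp"))
        entry = build_entry(sns, ts)
        s3.put_object(
            Bucket=BUCKET,
            Key=build_key(ts, entry["sns_message_id"]),
            Body=json.dumps(entry).encode("utf-8"),
            ContentType="application/json",
        )
```

Keeping `parse_ts`, `build_entry`, and `build_key` free of AWS calls lets them be unit-tested with a stub client, while the role only needs PutObject on the incidents prefix.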

Apply state
Already applied live via aws cloudformation execute-change-set;
smoke-tested with one SNS publish — entry landed at
s3://alpha-engine-research/changelog/incidents/2026/05/01T15-52-57_*
within 2s, schema validated, then cleaned up. This PR is the
codification of the source-of-truth template.

Companions
- alpha-engine-docs PR #5 (event_type schema + aggregator support)
- Future: flow-doctor S3 notifier, manual CLI helper.

Note on template description
The template's description says "Does NOT manage Lambda functions or
IAM roles." Strictly we now manage one of each: a narrow exception
for the SNS mirror, because it's tightly coupled to the AlertsTopic
defined here. Not updating the description in this commit; will
revisit if a second exception lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 added a commit that referenced this pull request May 19, 2026
…ence/ (#270)

ROADMAP P1 "predictor/ S3 namespace rationalization Wave 3" — start the
write-both soak that migrates the 10y price_cache parquet tree from
predictor/price_cache/ (under the predictor module's namespace) to
reference/price_cache/ (long-lived data-module references). Mirrors the
shape of Wave 1's predictor/daily_closes/ -> staging/daily_closes/ but
uses write-both + soak instead of hard-cutover because this writer only
rewrites STALE tickers — a hard cut would leave fresh tickers in legacy
and the new prefix incomplete for a full yfinance refresh cycle. CLAUDE.md
S3 Contract Safety mandates the write-both + >=1 week soak for any path
change of this shape.

## What ships in PR1 (producer-side only — zero reader changes)

- builders/_price_cache_writeboth.py (new): the single chokepoint.
  `price_cache_write_prefixes(primary)` returns [legacy, new] for the
  production default and [primary] for any custom string. Legacy ordered
  first so a fail-loud on the legacy write preserves pre-Wave-3 failure
  semantics — the new prefix never silently masks a legacy write error.
- collectors/prices.py: yfinance refresh upload now writes both prefixes.
- collectors/fred_history.py: FRED backfill upload now writes both prefixes.
- weekly_collector.py: chronic-gap self-heal patch writes both prefixes
  (the get_object read stays on legacy since readers haven't migrated).
- infrastructure/backfill_reference_price_cache.sh (new): one-shot
  `aws s3 sync` operator script to seed reference/price_cache/ with the
  ~934 objects currently in predictor/price_cache/. Idempotent; --dry-run
  supported. Run ONCE as part of PR1's deploy.
- tests/test_price_cache_writeboth.py (new, 7 tests): helper contract
  (legacy default returns both, custom returns single, ordering pinned)
  + each of the 3 production writers exercised end-to-end with stubbed
  s3 + recording asserts that BOTH keys land per ticker with identical
  bodies.
- tests/test_fred_history_fetcher.py: updated the pre-existing
  test_uploads_to_s3_when_not_dry_run from asserting a single upload to
  asserting write-both behavior. Required by zero-tolerance test policy.
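
A minimal sketch of the write-both chokepoint described in the list above, not the contents of builders/_price_cache_writeboth.py. The function name and the two prefixes come from this commit message; the default argument, the `write_both` helper, and the `{ticker}.parquet` key layout are assumptions.

```python
LEGACY_PREFIX = "predictor/price_cache/"
NEW_PREFIX = "reference/price_cache/"


def price_cache_write_prefixes(primary=LEGACY_PREFIX):
    """Return the prefixes a writer should target during the soak.

    The production default fans out to [legacy, new]; any custom prefix
    (for example a test or scratch location) is written once, as given.
    """
    if primary == LEGACY_PREFIX:
        # Legacy first: if that write raises, the new prefix is never
        # written, so a legacy failure cannot be masked.
        return [LEGACY_PREFIX, NEW_PREFIX]
    return [primary]


def write_both(s3, bucket, ticker, body, primary=LEGACY_PREFIX):
    """Hypothetical writer: upload one ticker's bytes to every target prefix."""
    keys = []
    for prefix in price_cache_write_prefixes(primary):
        key = f"{prefix}{ticker}.parquet"  # key layout is an assumption
        s3.put_object(Bucket=bucket, Key=key, Body=body)  # raises on failure
        keys.append(key)
    return keys
```

Routing every writer through one helper means the PR4 cutover only has to change the returned list, rather than each upload site.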

## What does NOT ship in PR1

- Reader migrations: ~10 read sites across alpha-engine-data,
  alpha-engine-predictor, alpha-engine-backtester, alpha-engine-dashboard
  stay on the legacy prefix. PR3+ migrates them with legacy fallback.
- IAM grant expansion to cover reference/price_cache/* — PR2 mirrors
  Wave 1 #120's IAM pattern on the alpha-engine repo's
  alpha-engine-s3-access.json.
- builders/daily_append.py:_load_parquet_warmup (reader, not writer) —
  migrates in PR3.
- sector_map.json (separate concern — write-once-per-Saturday, not part
  of the stale-ticker churn). Handled at cutover or PR3.
- The cutover itself: PR4 will flip primary -> reference/, drop the
  legacy entry from price_cache_write_prefixes, retire reader fallbacks,
  and `aws s3 rm --recursive` the legacy prefix. Gated on >=1 week of
  clean write-both observation.

## Soak contract

PR1 merge -> deploy this commit live -> run the backfill script ONCE to
seed the new prefix -> next Saturday SF firing's first write to both
prefixes starts the soak clock -> after >=4 Saturday firings (matches
Wave 4's discipline) with no parity divergence, PR3 reader migrations
go in, then PR4 cutover.

## Tests

  pytest tests/ -q  -> 1387 passed, 1 skipped, 0 failed

Composes with: ROADMAP Wave 4 slim-deletion arc currently in flight
(institutional pattern for data-tier prefix changes — dual-read /
dual-write + lib reconcile observation), Wave 1 PR #112 (template),
S3 Contract Safety in CLAUDE.md.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>