
DEVOP-561: seed Shai-Hulud IOC lists for org-wide daily sweep #2

Merged
spooktheducks merged 4 commits into allora-network:main from srt0422:devop-561-seed-ioc-lists
May 14, 2026

srt0422 commented May 13, 2026

Summary

Seeds the indicator lists that the daily org-wide Indicator of Compromise (IOC) sweep workflow (DEVOP-560) will consume.

  • .github/security/ioc-packages.txt — confirmed compromised npm + PyPI package@version pairs:
    • Shai-Hulud OpenSearch wave (@opensearch-project/opensearch 3.5.3–3.8.0).
    • PyPI sdist-poisoning variants (mistralai==2.4.6, guardrails-ai==0.10.1).
    • The broader 526-package wave: CrowdStrike, TinyColor (@ctrl/*), NativeScript (@nativescript-community/*), Teselagen, ThingsFactory/Operato (@things-factory/*), rxnt-*.
  • .github/security/ioc-hashes.txt — seed SHA-256 hashes from the Socket bundle.js dropper advisory. The seven sample hashes in this file are placeholders flagged inline — the runbook author must replace them with canonical hashes from the live Socket feed before DEVOP-560 ships. Inline comments document the source.
  • .github/security/REFRESH.md — documents the weekly refresh cadence and PR workflow.

Why now

The sweep workflow itself (DEVOP-560) is still on the To-do list. Seeding the IOC files first means: (a) we have a place to land manual refreshes during the gap before automation goes live, and (b) DEVOP-560's PR is reviewable against real data files instead of empty stubs.

Linear

https://linear.app/alloralabs/issue/DEVOP-561

Test plan

  • File-only PR; manually inspect the IOC list for typos and duplicates.
  • Confirm the 7 hash placeholders are replaced with canonical Socket-feed values before the DEVOP-560 sweep workflow is merged.
  • After DEVOP-560 lands, manually run `gh workflow run shai-hulud-sweep.yml` against a known-clean repo and confirm zero false positives.

🤖 Generated with Claude Code


Summary by cubic

Seeds IOC package and hash lists for the org-wide Shai-Hulud daily sweep. Imports the full Datadog 2.0 snapshot and adds canonical dropper hashes; provides the data inputs for DEVOP-560 and satisfies Linear DEVOP-561.

  • New Features

    • .github/security/ioc-packages.txt: confirmed ecosystem:name@version indicators including npm:@opensearch-project/opensearch 3.5.3–3.8.0, pypi:mistralai@2.4.6, pypi:guardrails-ai@0.10.1, plus seeded entries from the 526-package wave across @crowdstrike/*, @ctrl/*, @nativescript-community/*, @teselagen/*, @things-factory/*, and rxnt-*; added the full Shai-Hulud 2.0 Datadog consolidated snapshot (795 packages, 1091 pins). Gap: only the remainder of the 1.0 wave is still missing; it will be sourced live from the Socket feed once DEVOP-560 ships.
    • .github/security/ioc-hashes.txt: populated with canonical SHA-256s — 7x bundle.js V1–V7 from Socket and 1x setup_bun.js + 6x bun_environment.js from Datadog’s Shai-Hulud 2.0 analysis; sources and corroboration noted inline.
    • .github/security/REFRESH.md: weekly refresh cadence, the PR workflow, and a script to import/refresh the Datadog 2.0 snapshot.
  • Migration

    • After DEVOP-560 lands, run shai-hulud-sweep.yml once against a known-clean repo to verify pickup.

Written for commit 3c55d06. Summary will update on new commits.
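
The `ecosystem:name@version` line format described above could be parsed roughly as follows. This is a minimal sketch, not the sweep workflow's actual parser: the `Indicator` type and `parse_indicator` function are hypothetical names, and treating `#` as a comment marker in ioc-packages.txt is an assumption borrowed from the inline comments shown for ioc-hashes.txt.

```python
from typing import NamedTuple, Optional


class Indicator(NamedTuple):
    ecosystem: str
    name: str
    version: str


def parse_indicator(line: str) -> Optional[Indicator]:
    """Parse one ecosystem:name@version line; return None for blanks and comments."""
    # Assumption: '#' starts a comment, as in ioc-hashes.txt.
    text = line.split("#", 1)[0].strip()
    if not text:
        return None
    ecosystem, sep, rest = text.partition(":")
    # rpartition splits on the LAST '@', so the leading '@' of a scoped npm name survives.
    name, at, version = rest.rpartition("@")
    if not (sep and at and ecosystem and name and version):
        raise ValueError(f"malformed indicator line: {line!r}")
    return Indicator(ecosystem, name, version)
```

Splitting on the last `@` rather than the first is the one detail worth getting right here, since scoped npm names such as `@opensearch-project/opensearch` begin with `@` themselves.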

Adds the indicator lists consumed by .github/workflows/shai-hulud-sweep.yml
(DEVOP-560):

* .github/security/ioc-packages.txt - confirmed compromised npm + PyPI
  package@version pairs from the Shai-Hulud OpenSearch wave, the PyPI
  sdist-poisoning variants (mistralai 2.4.6, guardrails-ai 0.10.1), and
  the broader 526-package wave (CrowdStrike, TinyColor family, NativeScript,
  Teselagen, ThingsFactory/Operato, rxnt-*).
* .github/security/ioc-hashes.txt - seed SHA-256 hashes from the Socket
  bundle.js dropper advisory (placeholder hashes flagged inline; runbook
  author must replace with canonical hashes before DEVOP-560 ships).
* .github/security/REFRESH.md - documents the weekly refresh cadence and
  the PR workflow for updating both files.

The sweep workflow itself ships in DEVOP-560; this PR just seeds the data
it consumes.

Refs: https://linear.app/alloralabs/issue/DEVOP-561

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@cubic-dev-ai cubic-dev-ai Bot left a comment

cubic analysis

2 issues found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name=".github/security/ioc-hashes.txt">

<violation number="1" location=".github/security/ioc-hashes.txt:18">
P1: The file seeds placeholder SHA-256 values instead of the required known-bad hashes from the advisory.

According to linked Linear issue DEVOP-561, this should contain the canonical 7 bundle.js hashes now, not placeholders.</violation>
</file>

<file name=".github/security/ioc-packages.txt">

<violation number="1" location=".github/security/ioc-packages.txt:129">
P1: According to linked Linear issue DEVOP-561, this IOC package seed is incomplete: it intentionally stops before adding the remaining 526-wave package@version entries required by the acceptance criteria, which can cause IOC sweep false negatives.</violation>
</file>

Linked issue analysis

Linked issue: DEVOP-561: Seed ioc-packages.txt and ioc-hashes.txt in .github org repo

| Status | Acceptance criteria | Notes |
| --- | --- | --- |
| | Add `.github/security/ioc-packages.txt` containing ecosystem:name@version lines for compromised npm/PyPI releases (seed OpenSearch wave, PyPI sdist-poisoning entries, and the broader 526-package wave). | The PR adds `.github/security/ioc-packages.txt` with npm and pypi entries (OpenSearch versions, mistralai and guardrails-ai, and many seeded packages from the 526 wave). The file format matches the expected ecosystem:name@version lines. |
| ⚠️ | Add `.github/security/ioc-hashes.txt` seeded with the 7 known-bad SHA-256 bundle.js hashes from the Socket advisory. | The PR adds `.github/security/ioc-hashes.txt` and includes seven SHA-256-looking lines, but they are explicitly marked as placeholders and the file contains a NOTE requiring the runbook author to replace them with canonical Socket-feed values before DEVOP-560 ships. That means the hashes are present but not the required canonical values yet. |
| | Add documentation for the refresh process (how & when to update these lists). | The PR adds `.github/security/REFRESH.md` documenting weekly refresh cadence, steps to refresh packages and hashes, PR workflow, and audit trail instructions, satisfying the documentation requirement. |
| | PR merged. | Acceptance criteria require the PR to be merged. At review time the branch is present but the PR is not merged yet, so this step remains outstanding. |

Architecture diagram
sequenceDiagram
    participant Admin as Security Engineer
    participant GitHub as .github Repository
    participant GHA as GitHub Actions (Sweep)
    participant Repos as Target Org Repos
    participant Socket as Socket.dev Feed

    Note over Admin,Socket: IOC Data Consumption & Refresh Flow

    Admin->>GitHub: NEW: Seed ioc-packages.txt & ioc-hashes.txt
    Admin->>GitHub: NEW: Add REFRESH.md runbook

    GHA->>GitHub: DEVOP-560: Read ioc-packages.txt
    GitHub-->>GHA: List of npm/PyPI compromised packages
    GHA->>GitHub: DEVOP-560: Read ioc-hashes.txt
    GitHub-->>GHA: List of SHA-256 dropper hashes

    loop Each repository in org
        GHA->>Repos: Scan lockfiles and vendored artifacts
        Repos-->>GHA: Dependencies and bundled files
        GHA->>GHA: CHANGED: Match against IOC packages & hashes
        alt Match found
            GHA->>GitHub: File issue or flag for security review
        else No match
            GHA->>GHA: Log clean result
        end
    end

    Note over Admin,Socket: Weekly Refresh Cycle (per REFRESH.md)

    loop Weekly (Mondays)
        Admin->>Socket: Fetch threat feed (category:supply-chain, tag:shai-hulud)
        Socket-->>Admin: Updated IOC list
        Admin->>GitHub: Open PR: security/ioc-refresh-YYYY-MM-DD
        Admin->>GitHub: Diff & append new IOC entries (never remove old)
        Admin->>GitHub: Tag @security-oncall for review
        GitHub->>GitHub: Merge after single approval
        Admin->>GHA: Trigger shai-hulud-sweep.yml manually
        GHA->>Repos: Re-run check with updated IOCs
    end

    Note over GitHub,Socket: Placeholder hashes must be replaced before DEVOP-560 ships

    Admin->>Socket: Fetch canonical SHA-256 hashes
    Socket-->>Admin: 7 verified dropper hashes
    Admin->>GitHub: Replace placeholder hashes with canonical values
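
The "scan lockfiles, then match against IOC packages" step in the diagram could look something like the sketch below for npm. It assumes a `package-lock.json` with `lockfileVersion` 2 or 3 (which carries a top-level `packages` map keyed by install path); `lockfile_pins` and `find_matches` are hypothetical helper names, not part of the DEVOP-560 workflow.

```python
import json
from typing import Iterator, List, Set, Tuple


def lockfile_pins(lock_text: str) -> Iterator[Tuple[str, str]]:
    """Yield (name, version) pairs from a package-lock.json (lockfileVersion 2 or 3)."""
    packages = json.loads(lock_text).get("packages", {})
    for path, meta in packages.items():
        if not path:  # the "" key is the root project, not a dependency
            continue
        # Nested installs look like node_modules/a/node_modules/b; keep the last segment.
        name = path.rsplit("node_modules/", 1)[-1]
        version = meta.get("version")
        if version:
            yield name, version


def find_matches(lock_text: str, bad_pins: Set[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the sorted lockfile pins that also appear in the IOC set."""
    return sorted(set(lockfile_pins(lock_text)) & bad_pins)
```

Matching on exact `(name, version)` pairs keeps the check aligned with the pinned indicators in ioc-packages.txt; other lockfile formats (yarn, pnpm, PyPI) would need their own readers.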

Comment thread .github/security/ioc-hashes.txt Outdated
Comment thread .github/security/ioc-packages.txt Outdated
…le, clarify package-list gap

- ioc-hashes.txt: replace example-looking hex values with commented-out
  zero-prefixed placeholders so the sweep consumer cannot accidentally
  match against them. Document the canonical-hash fetch procedure inline
  and gate DEVOP-560 launch on replacement.
- ioc-packages.txt: replace the brief truncation note with an explicit
  gap-notice block explaining why the 526-wave is sourced live from
  Socket rather than hand-seeded, and document the compensating controls
  in force until DEVOP-560 lands.

Addresses cubic review findings on ioc-hashes.txt:18 and
ioc-packages.txt:129. The hashes themselves still cannot be committed
without access to the live Socket advisory feed; this change makes the
gap fail-safe (no false-positive matches) instead of fail-open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

srt0422 commented May 13, 2026

Addressed in 6fcd93c. Both cubic findings are acknowledged but cannot be "fully resolved" in this PR because the canonical SHA-256 hashes only exist in the live Socket advisory feed and the full 526-package wave is sourced live, not hand-typed. The fix makes the gap fail-safe instead of fail-open:

  • ioc-hashes.txt: the seven placeholder hex strings are now commented out and replaced with zero-prefixed examples that explicitly document the canonical-fetch procedure. The sweep consumer can no longer accidentally match against them.
  • ioc-packages.txt: the truncation note now explicitly documents the gating dependency on the live Socket feed (DEVOP-560) and the compensating controls (Socket CLI in DEVOP-572, prefix-match suspect policy) that apply in the interim.

DEVOP-560's launch checklist gates on replacing the placeholder hashes with canonical values OR ingesting the live Socket feed before declaring the IOC sweep production-ready.
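
To illustrate why commented-out placeholders are fail-safe, here is a minimal sketch of how a consumer might load ioc-hashes.txt. The `load_hashes` name is hypothetical and the real sweep consumer may differ; the sketch assumes one hash per line with optional `#` comments, which matches the lines quoted elsewhere in this PR.

```python
import re
from typing import Set

SHA256_RE = re.compile(r"^[0-9a-f]{64}$")


def load_hashes(text: str) -> Set[str]:
    """Return the active SHA-256 indicators from an ioc-hashes.txt-style body."""
    hashes = set()
    for raw in text.splitlines():
        # Everything after '#' is a comment, so a fully commented-out
        # placeholder line contributes nothing to the match set.
        value = raw.split("#", 1)[0].strip().lower()
        if not value:
            continue
        if not SHA256_RE.fullmatch(value):
            raise ValueError(f"not a SHA-256 hex digest: {raw!r}")
        hashes.add(value)
    return hashes
```

Rejecting malformed lines outright, rather than skipping them, is a design choice in this sketch: a typo in a hash would otherwise silently shrink coverage.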

srt0422 added the shai-hulud label (Shai-Hulud supply-chain defense work) May 13, 2026
…dog hashes

Drops the 7 zero-prefixed placeholder lines and seeds the file with the
canonical SHA-256 dropper hashes from authoritative sources:

- 7x bundle.js V1–V7 from Socket's CrowdStrike-wave advisory (Sept 2025).
  V6 (46faab…) is independently corroborated by CISA, Unit 42 / Palo Alto
  Networks, and StepSecurity.
- 1x setup_bun.js + 6x bun_environment.js from Datadog Security Labs'
  Shai-Hulud 2.0 analysis (Nov 2025 wave). Three of the bun_environment
  variants are OSINT-corroborated.

Also fixes broken / wrong source citations in the file header:

- Removes dead URL `socket.dev/blog/shai-hulud-worm-second-wave` (404).
- Replaces it with the live Socket CrowdStrike advisory URL, which is the
  actual source of the V1–V7 hashes.
- Adds the Socket "Shai Hulud Strikes Again (v2)" advisory and the Datadog
  IOC repo as sources for the 2.0-wave hashes.
- Drops the `GHSA-cxm3-wv7p-598c` citation — that GHSA is the Nx
  s1ngularity advisory, unrelated to Shai-Hulud and carrying no hashes.

File-header scope generalized from "bundle.js droppers" to
"known Shai-Hulud dropper payloads" since the 2.0 wave drops differently
named files (setup_bun.js / bun_environment.js) but the sweep consumer
matches on hash, not filename.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
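
Since the consumer is described as matching on hash rather than filename, a content-based scan might be sketched as below. `sha256_of` and `scan_tree` are hypothetical helpers, and the walk-everything strategy is an assumption; the real workflow may restrict itself to specific paths.

```python
import hashlib
from pathlib import Path
from typing import Iterator, Set, Tuple


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-megabyte payloads are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_tree(root: Path, bad_hashes: Set[str]) -> Iterator[Tuple[Path, str]]:
    """Yield (path, digest) for every regular file whose content hash is a known IOC."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            digest = sha256_of(path)
            if digest in bad_hashes:
                yield path, digest
```

Because only the digest is compared, a renamed copy of `bundle.js` or `bun_environment.js` would still be flagged, which is the property the generalized file-header scope relies on.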

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name=".github/security/ioc-hashes.txt">

<violation number="1" location=".github/security/ioc-hashes.txt:45">
P2: Add advisory IDs to each newly added hash comment to match the documented IOC refresh requirements and preserve audit traceability.</violation>
</file>

# Source: Datadog Security Labs Shai-Hulud 2.0 analysis. The 2.0 wave drops two
# files via a `preinstall` script instead of postinstall: a small loader
# (setup_bun.js) plus an obfuscated payload (bun_environment.js, ~10MB).
a3894003ad1d293ba96d77881ccd2071446dc3f65f434669b49b3da92421901a # setup_bun.js

@cubic-dev-ai cubic-dev-ai Bot May 14, 2026

P2: Add advisory IDs to each newly added hash comment to match the documented IOC refresh requirements and preserve audit traceability.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/security/ioc-hashes.txt, line 45:

<comment>Add advisory IDs to each newly added hash comment to match the documented IOC refresh requirements and preserve audit traceability.</comment>

<file context>
@@ -1,50 +1,51 @@
+# Source: Datadog Security Labs Shai-Hulud 2.0 analysis. The 2.0 wave drops two
+# files via a `preinstall` script instead of postinstall: a small loader
+# (setup_bun.js) plus an obfuscated payload (bun_environment.js, ~10MB).
+a3894003ad1d293ba96d77881ccd2071446dc3f65f434669b49b3da92421901a  # setup_bun.js
+62ee164b9b306250c1172583f138c9614139264f889fa99614903c12755468d0  # bun_environment.js
+cbb9bc5a8496243e02f3cc080efbe3e4a1430ba0671f2e43a202bf45b05479cd  # bun_environment.js
</file context>

…ackages.txt

Closes the Nov 2025 wave coverage gap by importing the full consolidated
Shai-Hulud 2.0 IOC list from Datadog's public IOC repo:

  https://github.com/DataDog/indicators-of-compromise/tree/main/shai-hulud-2.0

The snapshot is the union of seven independent vendors' lists (datadog,
socketdev, stepsecurity, wiz, reversinglabs, helixguard, koi) — 795 unique
packages, 1091 name@version pins. None overlap with the existing
hand-seeded Sept 2025 (1.0) wave entries, so the import is purely additive.

Changes:
- New "npm — Shai-Hulud 2.0 (Datadog consolidated snapshot)" section
  containing the 1091 name@version lines.
- Gap notice narrowed: 2.0 is now fully covered; only the remainder of the
  1.0 wave is still uncovered, pending DEVOP-560 auto-refresh.
- Header sources block fleshed out with the actual advisory URLs
  (Socket CrowdStrike + v2 advisories, Datadog IOC repo).
- REFRESH.md gains a "Refreshing the Datadog 2.0 snapshot" section with
  the canonical import script so future refreshes are mechanical and
  diff-able against upstream.

The snapshot file is treated as opaque — refreshes must re-import the
full block, not edit rows by hand.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
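
The "re-import the full block, never edit rows by hand" rule could be mechanized along these lines. This is only a sketch and not the import script from REFRESH.md: the `BEGIN`/`END` marker lines, the `reimport_snapshot` name, and the assumption that upstream pins arrive as bare `name@version` strings that get an `npm:` prefix are all hypothetical.

```python
from typing import Iterable, Set, Tuple

# Hypothetical marker lines: the real ioc-packages.txt may delimit the section differently.
BEGIN = "# BEGIN shai-hulud-2.0 snapshot"
END = "# END shai-hulud-2.0 snapshot"


def reimport_snapshot(current: str, upstream_pins: Iterable[str]) -> Tuple[str, Set[str]]:
    """Swap the whole snapshot block for a fresh one; report overlap with other entries."""
    lines = current.splitlines()
    start, stop = lines.index(BEGIN), lines.index(END)  # ValueError if a marker is missing
    if start > stop:
        raise ValueError("END marker appears before BEGIN marker")
    # Deduplicate and sort so the block diffs cleanly against the previous import.
    block = sorted({f"npm:{pin.strip()}" for pin in upstream_pins if pin.strip()})
    outside = lines[:start] + lines[stop + 1:]
    hand_seeded = {ln.strip() for ln in outside if ln.strip() and not ln.lstrip().startswith("#")}
    overlap = hand_seeded & set(block)
    new_text = "\n".join(lines[: start + 1] + block + lines[stop:]) + "\n"
    return new_text, overlap
```

Returning the overlap set lets a reviewer confirm that a refresh is still purely additive relative to the hand-seeded entries, as it was for this import.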
@srt0422 srt0422 requested a review from spooktheducks May 14, 2026 06:14
@spooktheducks spooktheducks merged commit da248ba into allora-network:main May 14, 2026
1 check passed
srt0422 added a commit that referenced this pull request May 26, 2026
- (P2 #5) Extract a new `Find rolling issue` step (gated on
  `rc=='1' || rc=='2'`) that resolves the rolling-issue number ONCE
  per run via the canonical `gh issue list ... sort:created-asc`
  query and exposes it as `steps.find-rolling-issue.outputs.issue_num`.
  Replace the duplicated inline `gh issue list` calls in the ioc-dedup
  and rolling-issue-update steps with the shared output. Removes the
  drift-hazard `# same query as the update step below — keep in sync`
  coupling and closes the TOCTOU window where a human could close the
  rolling issue between the two independent lookups.

- (P1 #1) Filter the ioc-dedup comment scan to `github-actions[bot]`
  authorship. Previously the `gh api ... --jq '.[] | {body, created_at}'`
  projection accepted markers from ANY commenter, so anyone with
  `issues: write` (or anyone able to social-engineer a maintainer into
  pasting attacker-supplied marker text) could forge
  `<!-- shai-hulud-ioc-stamp: <sha256> -->` or
  `<!-- shai-hulud-paged-at: <iso8601> -->` into the rolling issue and
  silently suppress real Slack pages by poisoning the dedup chain.
  Only this workflow (running as GITHUB_TOKEN) emits canonical markers,
  and its comments are attributed to `github-actions[bot]` — restrict
  the source set accordingly. Defense-in-depth follow-up (binding
  markers to the emitting run_id and verifying via gh api) deferred.

- (P1 #2) Move paged-at marker emission to a dedicated post-Slack step
  (`Persist Slack-paged marker`) gated on
  `success() && rc=='1' && should_page=='true'` so a failed Slack
  delivery never writes a paged-at timestamp. The rolling-issue update
  step keeps writing the IOC stamp marker (which represents the dedup
  decision input, NOT the Slack-delivery outcome — that's correct
  gating). The dedup reader already scans the most-recent paged-at
  marker across ALL bot-authored comments, so splitting the markers
  across two comments composes correctly with no parser change.
  Previously the paged-at marker was committed BEFORE the Slack page
  ran, so a failed Slack send would still record a paged-at timestamp
  and silently corrupt the dedup chain for up to 7 days (next
  IOC-grade run would believe Slack had paged, suppress its own page,
  and the standing IOC would stop alerting until the weekly re-page
  window expired).

  The new step has a `gh issue list` fallback for the rare case where
  the update step created a fresh rolling issue this run (so
  find-rolling-issue's output was empty); fail-OPEN warning if no
  issue is resolvable at all so a missing paged-at marker just forces
  the next run to page conservatively.

Verification: actionlint clean; YAML parses (11 steps in canonical
order: checkout → verify-tools → sweep → upload → find-rolling-issue
→ ioc-dedup → update-rolling-issue → slack-page → persist-paged-at →
slack-suppressed-notice → final-summary).

Refs: DEVOP-560, ce-code-review run 20260526-101810-4793bf13
findings #1 (anchor 100, security+adversarial), #2 (anchor 100,
correctness+adversarial+reliability), #5 (anchor 75, maintainability).
Co-authored-by: Cursor <cursoragent@cursor.com>
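
The authorship filter and the weekly re-page window described in this commit message can be sketched together as a single dedup decision. This is an illustrative reconstruction, not the workflow's shell logic: `should_page` is a hypothetical name, the comment dictionaries assume the GitHub REST issue-comment shape (`user.login`, `body`, `created_at`), and the rule "page when the IOC stamp changed or the last page is at least 7 days old" is an assumption inferred from the text.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Iterable, Mapping, Optional

BOT_LOGIN = "github-actions[bot]"
STAMP_RE = re.compile(r"<!-- shai-hulud-ioc-stamp: ([0-9a-f]{64}) -->")
PAGED_RE = re.compile(r"<!-- shai-hulud-paged-at: (\S+) -->")
REPAGE_WINDOW = timedelta(days=7)  # the weekly re-page window


def should_page(comments: Iterable[Mapping], current_stamp: str,
                now: Optional[datetime] = None) -> bool:
    """Decide whether this run should send a Slack page."""
    now = now or datetime.now(timezone.utc)
    last_stamp = None
    last_paged = None
    # Only bot-authored comments count, so markers pasted by anyone else are ignored.
    trusted = [c for c in comments if c["user"]["login"] == BOT_LOGIN]
    for comment in sorted(trusted, key=lambda c: c["created_at"]):
        for match in STAMP_RE.finditer(comment["body"]):
            last_stamp = match.group(1)
        for match in PAGED_RE.finditer(comment["body"]):
            # Assumes UTC timestamps such as 2026-05-26T10:18:10Z.
            paged = datetime.fromisoformat(match.group(1).replace("Z", "+00:00"))
            if last_paged is None or paged > last_paged:
                last_paged = paged
    if last_paged is None or last_stamp != current_stamp:
        return True  # never paged, or the IOC set changed since the last stamp
    return now - last_paged >= REPAGE_WINDOW
```

Because the stamp and the paged-at marker are collected independently across all trusted comments, it does not matter that the two markers now live in separate comments; a missing paged-at marker simply makes the function return `True`, which matches the conservative behavior the commit describes.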

Labels

needs-human-review, shai-hulud (Shai-Hulud supply-chain defense work)

3 participants