feat(workflows): add batch-workflow MCP tools #61001
Add MCP surface for batch workflows (one-off broadcasts + recurring schedules): workflows-blast-radius, workflows-run-batch, workflows-schedule-create (hand-rolled with a blast-radius echo-back guard), plus enabling workflows-list-batch-jobs and workflows-update-schedule. Schedules are surfaced read-only on workflows-get; the batch trigger audience rejects event/action filters.
MCP surface for batch workflows (one-off broadcasts + recurring schedules): workflows-blast-radius, workflows-run-batch, workflows-schedule-create (hand-rolled with a blast-radius echo-back guard), plus workflows-list-batch-jobs and workflows-update-schedule. Schedules surfaced read-only on workflows-get; batch audiences reject event/action filters. Custom hog_flow actions get personal-API-key scopes (read/write, method-aware for dual-method actions) so MCP can call them.
MCP UI Apps size report
Size Change: 0 B · Total Size: 81.6 MB
Reject behavioral (event-based) cohort references in batch/schedule workflow audiences via a serializer guard (mirrors the feature-flag pattern), enforced for non-web callers even on drafts so MCP fails fast. Expose hide_behavioral_cohorts on cohorts-list and clarify the batch audience mechanics (properties = mixed person-condition + cohort-ref list) in the workflow tool descriptions.
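A minimal sketch of the audience guard described in this commit, assuming cohort references in `properties` look like `{"type": "cohort", "value": <id>}` and that the set of behavioral cohort IDs is already known. The function name, the entry shape, and the use of `ValueError` in place of a DRF `ValidationError` are illustrative assumptions, not the PR's actual code.

```python
from typing import Any, Iterable, Optional


def reject_behavioral_cohorts(
    properties: Optional[Iterable[dict[str, Any]]],
    behavioral_cohort_ids: set[int],
) -> None:
    """Raise if any cohort reference in the audience points at a behavioral cohort.

    Hypothetical helper; the real guard lives in the hog_flow serializer.
    """
    for prop in properties or []:
        # Person-property conditions pass through; only cohort references are checked.
        if prop.get("type") != "cohort":
            continue
        if prop.get("value") in behavioral_cohort_ids:
            raise ValueError(
                f"Cohort {prop.get('value')} is behavioral (event-based) "
                "and can't be used in a batch audience."
            )
```

Because the list mixes person conditions and cohort references, the sketch skips anything that is not a cohort reference rather than rejecting unknown entry types.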
… harley/workflows-mcp-batch # Conflicts: # products/workflows/mcp/tools.yaml
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
products/workflows/backend/api/hog_flow.py:210-234
Inconsistent draft enforcement for event/action filters vs. behavioral cohorts. Behavioral cohorts are already blocked for MCP/API drafts via `_should_enforce_audience_guard`, but the `events`/`actions` filter check only fires for non-drafts. An MCP caller can currently save a draft batch trigger that contains event or action filters without getting the error they'd expect. The PR description specifically calls out "non-web callers enforced even on drafts" — this check belongs under the same guard so the two save-time rejections stay in sync.
```suggestion
elif data.get("config", {}).get("type") == "batch":
    filters = data.get("config", {}).get("filters", {})
    if not is_draft:
        if not filters:
            raise serializers.ValidationError({"filters": "Filters are required for batch triggers."})
        if not isinstance(filters, dict):
            raise serializers.ValidationError({"filters": "Filters must be a dictionary."})
        properties = filters.get("properties", None)
        if properties is not None and not isinstance(properties, list):
            raise serializers.ValidationError({"filters": {"properties": "Properties must be an array."}})
    if self._should_enforce_audience_guard(is_draft) and isinstance(filters, dict):
        # The audience targets who a person is (properties / cohort membership), not what they did.
        # Event/action filters are silently dropped by the person-based blast radius (resolving to
        # "everyone"), so reject them outright; this mirrors the guard on conditional branches. The UI
        # already prevents this; this closes the same gap for API/MCP callers.
        if filters.get("events") or filters.get("actions"):
            raise serializers.ValidationError(
                {
                    "filters": (
                        "Batch trigger audiences can't filter on event behavior. Target person "
                        "properties or a cohort instead (create a cohort for 'users who did event X')."
                    )
                }
            )
        self._reject_behavioral_cohorts_in_audience(filters.get("properties"))
```
### Issue 2 of 3
products/workflows/backend/api/test/test_hog_flow.py:853-888
Two pairs of near-identical tests could be collapsed into parameterized tests, per the team's stated preference. `test_hog_flow_batch_trigger_rejects_event_filters` and `test_hog_flow_batch_trigger_rejects_action_filters` differ only in which extra filter key is set; `test_hog_flow_batch_trigger_allows_static_cohort` and `test_hog_flow_batch_trigger_allows_property_cohort` differ only in cohort construction. Parameterizing removes duplication and makes it easier to extend coverage.
```suggestion
@parameterized.expand(
    [
        ("events", {"events": [{"id": "$pageview", "type": "events"}]}),
        ("actions", {"actions": [{"id": "5", "type": "actions"}]}),
    ]
)
def test_hog_flow_batch_trigger_rejects_behavioral_filter(self, _name, extra_filters):
    trigger_action = {
        "id": "trigger_node",
        "name": "trigger_1",
        "type": "trigger",
        "config": {
            "type": "batch",
            "filters": {"properties": [], **extra_filters},
        },
    }
    hog_flow = {"name": "Test Batch Flow", "status": "active", "actions": [trigger_action]}
    response = self.client.post(f"/api/projects/{self.team.id}/hog_flows", hog_flow)
    assert response.status_code == 400, response.json()
```
### Issue 3 of 3
services/mcp/src/tools/workflows/batch.ts:9-11
Duplicated cap constant across service boundaries. `BATCH_WORKFLOW_MAX_AUDIENCE_SIZE = 5000` mirrors `CDP_BATCH_WORKFLOW_MAX_AUDIENCE_SIZE` in `nodejs/src/cdp/config.ts`, and the file comment asks to "Keep the two values in sync." If the batch consumer's cap is raised, only the Node.js file is likely to be updated, leaving the MCP guard at the old value — producing confusing over-cap rejections from an outdated threshold. Consider a shared config entry or a CI assertion that both constants match.
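One way the suggested CI assertion could look, as a rough sketch: read both source files as text, pull out the first integer literal that follows each constant name on its line, and fail if the two differ. The helper names are hypothetical, and the "first integer on the same line" rule is an assumption about how the constants are written in each file.

```python
import re


def extract_int_constant(source: str, name: str) -> int:
    """Return the first integer literal after `name` on the same line.

    Tolerates `_` digit separators (e.g. 5_000). Hypothetical helper.
    """
    match = re.search(rf"\b{re.escape(name)}\b[^\n0-9]*?([0-9][0-9_]*)", source)
    if match is None:
        raise LookupError(f"{name} not found")
    return int(match.group(1).replace("_", ""))


def assert_caps_in_sync(mcp_source: str, cdp_source: str) -> None:
    """Fail loudly when the MCP guard and the batch consumer disagree on the cap."""
    mcp_cap = extract_int_constant(mcp_source, "BATCH_WORKFLOW_MAX_AUDIENCE_SIZE")
    cdp_cap = extract_int_constant(cdp_source, "CDP_BATCH_WORKFLOW_MAX_AUDIENCE_SIZE")
    assert mcp_cap == cdp_cap, f"cap drift: MCP={mcp_cap}, CDP={cdp_cap}"
```

A CI job would call `assert_caps_in_sync` with the contents of `services/mcp/src/tools/workflows/batch.ts` and `nodejs/src/cdp/config.ts`. A shared config entry removes the duplication outright, while a check like this only detects drift.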
Reviews (1): Last reviewed commit: "Merge branch 'master' into harley/workfl..."
PR overview
This PR adds MCP tooling for batch workflows, including operations around batch jobs and schedules for workflow execution. It also updates backend workflow API handling related to batch job and schedule mutation paths. There are still two open security issues around batch workflow fan-out controls. Personal API keys may be able to create batch jobs or schedules without the same person-scope permission expected for audience fan-out, and schedule updates can modify active batch broadcasts without a fresh affected-count confirmation. Four issues have already been addressed, but the remaining gaps still allow attacker-initiated changes with meaningful impact on scheduled or audience-targeted workflow execution.
Open issues: 2 · Fixed/addressed: 4 · PR risk: 6/10
Move the event/action audience rejection under the same draft discriminator as the behavioral-cohort guard, so MCP/API drafts can't slip an event-behavior audience through — the two save-time rejections now stay in sync. Steer batch audience guidance toward re-evaluating person-property conditions / dynamic cohorts over static frozen-list cohorts, and parameterize the batch-audience validation tests.
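The draft discriminator referred to here can be sketched as a single predicate: only drafts saved from the web UI get leniency, and every other save is enforced. The function signature and the `"web"` / `"mcp"` / `"api"` source labels are assumptions for illustration; the PR derives the source via `get_event_source`.

```python
def should_enforce_audience_guard(is_draft: bool, event_source: str) -> bool:
    """Return True when audience validation must run.

    Web-UI drafts are lenient; active saves and API/MCP drafts are enforced.
    The source labels are hypothetical stand-ins for get_event_source() output.
    """
    return not (is_draft and event_source == "web")
```

Putting both rejections (event/action filters and behavioral cohorts) behind the same predicate is what keeps them in sync.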
… harley/workflows-mcp-batch
The batch_jobs POST endpoint created a run for any workflow status — the active check lived only in the UI (disabled trigger button) and the scheduler, so a direct API/MCP call could mint a queued job + CDP request for a draft that the consumer then drops. Mirror the scheduler's gate: reject the run unless the workflow is active.
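A minimal sketch of that gate, assuming the workflow status is available as a plain string before the job is created. The exception class and function name are hypothetical; the real endpoint would return a validation error response instead.

```python
class WorkflowNotActiveError(Exception):
    """Raised when a batch run is requested for a workflow that is not active."""


def ensure_batch_run_allowed(workflow_status: str) -> None:
    # Reject before any job row or CDP request is created, so nothing is
    # queued that the consumer would later drop.
    if workflow_status != "active":
        raise WorkflowNotActiveError(
            f"Batch runs require an active workflow (status is '{workflow_status}')."
        )
```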
list=extend_schema(
    parameters=[
        OpenApiParameter(
            name="hide_behavioral_cohorts",
Exposed so MCP can list cohorts that are workflow-compatible.
- scope_object_read_actions = ["list", "retrieve", "logs", "metrics", "metrics_totals"]
- scope_object_write_actions = ["create", "update", "partial_update", "destroy", "invocations"]
+ scope_object_read_actions = ["list", "retrieve", "logs", "metrics", "metrics_totals", "user_blast_radius"]
+ scope_object_write_actions = [
scope_object_write_actions declares which workflow actions require hog_flow:write so personal-API-key (MCP) callers can reach them — it lists the custom write actions (invocations, schedule_detail, bulk_delete) plus the standard CRUD writes, which have to be re-listed because setting this attribute replaces the default CRUD list rather than extending it.
Active (enabled) workflows are read-only via MCP until revisions exist — editing can break already-scheduled or in-flight runs with no rollback. Enforced in the backend (perform_update), so workflows-update returns to a plain generated tool (no acknowledge_live_edit opt-in). Also deregisters workflows-disable from the MCP surface — exposing it invited a disable->edit->enable workaround; the factory stays in lifecycle.ts for easy re-enable. Strengthens the blast-radius / run-batch copy to demand explicit user confirmation before broadcasting.
… harley/workflows-mcp-batch
Query snapshots: Backend query snapshots updated
Changes: 1 snapshots (1 modified, 0 added, 0 deleted)
…p-batch # Conflicts: # posthog/api/cohort.py # products/cohorts/frontend/generated/api.schemas.ts # services/mcp/src/api/generated.ts # services/mcp/src/generated/cohorts/api.ts # services/mcp/src/tools/generated/cohorts.ts # services/mcp/tests/unit/__snapshots__/tool-schemas/cohorts-list.json
… harley/workflows-mcp-batch
The user_blast_radius endpoint counts persons/groups matching caller-supplied filters, but was gated only on hog_flow:read, letting a workflow-read PAT/OAuth token probe person counts (an existence oracle, e.g. "does email X exist?") without person-data access. Require person:read in addition to hog_flow:read on that action (via dangerously_get_required_scopes; AND semantics). The web builder uses session auth, so live audience sizing is unaffected. The MCP tools that size audiences (workflows-blast-radius / -run-batch / -schedule-create) gain person:read in their declared scopes accordingly. Note: feature_flags has the identical pattern and is left to its owners (tightening it is a breaking change for existing integrations).
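To make the AND semantics concrete, here is a small sketch assuming scopes are plain strings and that a token must hold every required scope. It deliberately ignores any "write implies read" rule the real permission layer may apply, and the names are hypothetical.

```python
# Scopes the blast-radius action requires after this change.
BLAST_RADIUS_SCOPES = ["hog_flow:read", "person:read"]


def has_required_scopes(granted: set[str], required: list[str]) -> bool:
    """AND semantics: every required scope must be present on the token."""
    return all(scope in granted for scope in required)
```

Under this check a token holding only `hog_flow:read` can no longer size an audience, which closes the existence oracle.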
… harley/workflows-mcp-batch
dmarchuk left a comment
Looks good! 👍 Left just one comment for a schedule trigger cleanup we should do before merging
…eate on active

Addresses two PR review comments:
- dmarchuk: remove the generic 'schedule' trigger type from the MCP surface (matches the UI hiding it from the trigger-type dropdown). workflows-create/-update/-get descriptions drop it, and workflows-schedule-create is now batch-only. Recurring batch scheduling (schedule-create / update-schedule) is untouched.
- veria-ai: close the draft-batch schedule-create audience bypass: ack a narrow audience on a draft, broaden the trigger, then enable, and the scheduled run would broadcast wider than acknowledged. schedule-create now requires the batch workflow to be active first (an active workflow's trigger can't be MCP-edited, so the acknowledged audience is locked); acknowledged_affected_count is now required.
Documents the new batch workflow tools added in PostHog/posthog#61001:
- workflows-blast-radius: Size audience before batch operations
- workflows-run-batch: Trigger one-off broadcasts
- workflows-schedule-create: Create recurring schedules
- workflows-list-batch-jobs: List past batch runs
- workflows-update-schedule: Update schedule configuration

Also updates guardrails to reflect that:
- Active workflows are now read-only via MCP
- workflows-disable was removed
- Blast-radius confirmation is required for batch operations
# lists above can't distinguish GET (read) from POST (write) on the same action. Without
# this, these actions declare no scope and reject all personal-API-key (MCP) access.
if self.action in ("batch_jobs", "schedules"):
    return ["hog_flow:read"] if request.method in ("GET", "HEAD", "OPTIONS") else ["hog_flow:write"]
Medium: Batch fan-out is exposed without person scope
This override makes mutating /batch_jobs and /schedules available to personal API keys that only have hog_flow:write. A token without person:read can now start an active batch workflow with caller-supplied person filters, or attach a recurring batch schedule, which is the same person-audience fan-out the MCP tools gate on person:read; require person:read for these mutating batch/schedule paths or enforce it after loading batch-trigger workflows.
  - hog_flow:write
annotations:
  readOnly: false
  destructive: false
Medium: Schedule updates bypass batch confirmation
Enabling workflows-update-schedule as a non-destructive raw PATCH lets an MCP caller change the RRULE, start time, or variables on an active batch schedule without the acknowledged_affected_count guard used by workflows-schedule-create. A prompt-injected tool call can make an existing scheduled broadcast run hourly against the current audience; keep this generated tool disabled and add a wrapper that fetches the workflow, requires person:read plus a fresh blast-radius acknowledgment for active batch schedules, or only allows updates while paused.
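The `acknowledged_affected_count` guard mentioned here can be sketched as an echo-back check: the caller repeats the count it saw from blast-radius, the server recomputes it, and any mismatch or over-cap value is rejected. The function name, the exact-match definition of drift, and the use of `ValueError` are assumptions; the 5000 cap is the value quoted in the review above.

```python
from typing import Optional

MAX_AUDIENCE_SIZE = 5000  # cap value quoted in the review; assumed to apply here


def check_acknowledgment(
    acknowledged: Optional[int],
    current: int,
    cap: int = MAX_AUDIENCE_SIZE,
) -> None:
    """Validate the caller's echoed count against a freshly recomputed one."""
    if acknowledged is None:
        raise ValueError("acknowledged_affected_count is required; run blast-radius first.")
    if current > cap:
        raise ValueError(f"Audience of {current} exceeds the cap of {cap}.")
    if acknowledged != current:
        # Drift: the audience changed between sizing and the mutating call.
        raise ValueError(f"Audience changed: acknowledged {acknowledged}, now {current}.")
```

A wrapper around schedule updates, as this comment proposes, could run the same check before forwarding the PATCH for an active batch schedule.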
Problem
MCP had the workflow lifecycle (#60388) but no way to run a batch workflow or schedule a workflow. A broadcast messages every matching person, so its audience needs real guardrails. Wiring these endpoints up for API/MCP also surfaced several safeguards that, until now, lived only in the frontend.
Loom demo
Loom demo without disable/edit/enable flow

Note
When testing this on localhost via Claude Desktop, I had to explicitly force `posthog:exec` or MCP v2 by passing `?mode=cli` in my MCP config as the way to use it. Because I have other MCP tools loaded (github, incident.io, slack, etc) I rapidly ran out of context with the default `tools` mode. The downside of this is we lose MCP-ui apps. This isn't a new issue - just worth flagging.

Changes
New batch tools (hand-rolled, composing existing REST — no new endpoints):
- `workflows-blast-radius`: size the audience.
- `workflows-run-batch`: one-off broadcast; requires active + `acknowledged_affected_count` from blast-radius (recomputed server-side, rejects on drift or over-cap).
- `workflows-schedule-create`: recurring RRULE schedule.

Also: enabled `workflows-list-batch-jobs` / `workflows-update-schedule`; nested read-only `schedules` on `workflows-get`/`-create`/`-update`; exposed `hide_behavioral_cohorts` on `cohorts-list` so agents can find workflow-usable cohorts; descriptions steer toward inline person-property conditions or dynamic cohorts (which re-evaluate) over static frozen-list cohorts; PAK scopes on the custom `hog_flow` actions so MCP can call them at all (see Agent context).

Guards moved server-side
Exposing the batch endpoints to personal API keys made these frontend-only gates reachable directly, so they're now enforced in the serializer/endpoint (the UI keeps its own copies). Draft leniency for all of them is scoped to the web UI via `get_event_source`: MCP/API callers are enforced even on drafts, so an agent fails fast.

`hideBehavioralCohorts`.)

How did you test this code?
- `test_hog_flow.py`, `test_hog_flow_schedule.py`): audience validation (behavioral/event/action rejection across batch + schedule, nested deps, web-draft-lenient vs MCP-draft-enforced), active-run gate, schedule CRUD, nested-schedule surfacing.
- `lint-tool-names`, `tsc`, `ruff`.
- `workflows.integration.test.ts`) against a local stack via a gitignored `.env.test`, incl. the cross-layer behavioral-cohort rejection (MCP tool → API → guard).

Automatic notifications
Docs update
Self-described via serializer `help_text` + tool descriptions, which flow into the generated schemas.

🤖 Agent context
Claude Code. Recurring theme: safeguards that lived only in the frontend became reachable once these endpoints were opened to API/MCP, so they're now enforced server-side, with draft leniency scoped to the web UI via `get_event_source`.

`dangerously_get_required_scopes`: `batch_jobs` / `schedules` are dual-method (GET=read, POST=write), but the per-action scope lists map one scope per action name, not per method. The override returns read-for-GET / write-for-POST and `None` elsewhere (keeping default derivation); it never loosens anything.
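The method-aware override can be illustrated in isolation as a pure function, assuming the action name and HTTP method are passed in directly. The function and constant names are hypothetical; the scope strings and method tuple follow the snippet quoted in the review thread.

```python
from typing import Optional

READ_METHODS = ("GET", "HEAD", "OPTIONS")
DUAL_METHOD_ACTIONS = ("batch_jobs", "schedules")


def required_scopes_override(action: str, method: str) -> Optional[list[str]]:
    """Return an explicit scope list for dual-method actions, else None.

    None means "no override": the caller falls back to the default
    per-action derivation, so nothing is loosened.
    """
    if action in DUAL_METHOD_ACTIONS:
        return ["hog_flow:read"] if method in READ_METHODS else ["hog_flow:write"]
    return None
```

For example, `required_scopes_override("schedules", "POST")` yields the write scope, while any action outside the dual-method set yields `None`.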