Production patterns for agentic workflows — the ops discipline AI agents are missing.
Maintained by the team at Agentled.ai — we built this because we kept explaining the same hard-won patterns to every developer who hit the wall. All content is platform-agnostic. PRs from any platform welcome.
Add to your project's CLAUDE.md:

```markdown
# Load agentic-ops patterns
See: https://github.com/agentled/agentic-ops
```

Or reference directly in your Claude Code session:

```shell
# Fetch the skill and append to your CLAUDE.md
curl -sL https://raw.githubusercontent.com/agentled/agentic-ops/main/CLAUDE.md >> CLAUDE.md
```

Should I use a scheduled trigger or a real-time event trigger for email intake?
| | Schedule (polling) | App Event (real-time) |
|---|---|---|
| Latency | minutes–hours | seconds |
| Idempotency | trivial — label marks processed | must dedupe on messageId; re-deliveries happen |
| Backfill | widen the query window | doesn't exist |
| Replay after outage | automatic on next run | events can be permanently lost |
| Debugging | read last execution log | subscription + delivery + filter + dedupe all need checking |
Rule: default to Schedule + label-based dedup for email/document intake. Use an event trigger only when the user explicitly needs < 1 minute latency.
Copy-paste Gmail polling query: `-label:processed newer_than:1d`
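The schedule + label-dedup rule above can be sketched in a few lines. This is a minimal, platform-agnostic illustration: `fetch` and `add_label` are stand-ins for real Gmail API calls, not an actual client library, and the in-memory inbox exists only to show the idempotent behavior.

```python
def poll_once(fetch, add_label, process):
    """One scheduled run: fetch only unprocessed mail, handle it, mark it.

    The query itself is the dedup gate: anything labeled `processed`
    never comes back, so re-running after an outage is a safe backfill.
    """
    handled = []
    for msg in fetch("-label:processed newer_than:1d"):
        process(msg)
        add_label(msg["id"], "processed")  # mark AFTER a successful process
        handled.append(msg["id"])
    return handled


# In-memory stand-ins (illustrative only) to demonstrate idempotency:
inbox = [{"id": "m1", "labels": set()}, {"id": "m2", "labels": set()}]

def fake_fetch(query):
    return [m for m in inbox if "processed" not in m["labels"]]

def fake_add_label(msg_id, label):
    next(m for m in inbox if m["id"] == msg_id)["labels"].add(label)

first = poll_once(fake_fetch, fake_add_label, process=lambda m: None)
second = poll_once(fake_fetch, fake_add_label, process=lambda m: None)
# first run handles both messages; the second run finds nothing new
```

Because the label is applied only after a successful `process`, a crash mid-run leaves unlabeled messages to be picked up automatically on the next scheduled run.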
| File | Pattern | One-line rule |
|---|---|---|
| 00-why-agentic-ops | Why structured workflows beat ad-hoc prompting | Ad-hoc prompting doesn't scale; agentic-ops is the ops layer that makes AI agents production-ready. |
| 01-trigger-design | Polling vs event triggers | Default to schedule for intake; use events only when latency < 1 min is a hard requirement. |
| 02-dedup-gates | Idempotency and dedup | Always resolve label IDs before use — never pass display names to Gmail API. |
| 03-credit-efficiency | Not burning money while debugging | Fix → retry from failed step → verify. Never start a new execution to debug a failed one. |
| 04-loop-patterns | Iterating without N+1 or data loss | Always wait for loop completion before consuming results; never assume order. |
| 05-child-workflow-contracts | Composable workflows with typed return contracts | Child workflows use return, not milestone; always define a typed return contract. |
| 06-conditional-routing | Conditions that actually fire | Use criteria/variable (not conditions/field) — wrong field names silently skip steps. |
| 07-error-handling | skip vs stop vs wait | skip for optional data, stop for hard prerequisites, wait for async completion. |
| 08-composed-email-approval | Composed email with approval gate | One AI step generates + sends; never separate draft + send actions. |
| 09-reports-and-knowledge-storage | Report rendering + KG persistence | Always render reports via a config layout; persist structured results to the KG. |
| 10-person-research-ladder | Person research: signal-based lookup + fallback ladder | Pick the lookup by the strongest input signal; fall down the ladder, not across. |
| 11-company-research-ladder | Company research: match source to question | LinkedIn for people, website for positioning, directories for financials. |
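As a concrete illustration of the typed return contract from 05-child-workflow-contracts, here is one hedged sketch: the dataclass name, fields, and `validate_child_return` helper are all hypothetical, but the shape — a frozen contract the orchestrator validates at the boundary — is the point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnrichPersonResult:
    """Contract the child workflow must return — nothing more, nothing less."""
    person_id: str
    company: str
    confidence: float  # 0.0–1.0; lets the parent branch on result quality


def validate_child_return(payload: dict) -> EnrichPersonResult:
    """Fail loudly at the parent/child boundary, not silently downstream."""
    # dataclass __init__ raises TypeError on missing or unexpected keys
    result = EnrichPersonResult(**payload)
    if not 0.0 <= result.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {result.confidence}")
    return result


ok = validate_child_return(
    {"person_id": "p1", "company": "Acme", "confidence": 0.9}
)
```

The orchestrator then branches on `ok.confidence` with a plain condition rather than re-asking an LLM whether the result looks good.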
| File | Pattern | One-line rule |
|---|---|---|
| 10-observability | Structured logging, execution tracing, alerting on silent failures | Declare a business-outcome assertion and emit count signals at every fetch, loop, and write. |
| 11-human-in-the-loop | Approval gates, async review, timeout and escalation | Every gate needs notification, preview, timeout, and escalation; route to a role, default to the safe action. |
| 12-idempotency | Safe retries, dedup keys, exactly-once at step level | Every side-effect step needs a deterministic idempotency key derived from execution + inputs. |
| 13-multi-agent-handoff | Passing context between agents without prompt drift | Pass typed structured payloads and always include the original input as an immutable reference. |
| 14-secret-and-credential-management | Env vars, rotation, per-user vs per-workspace scoping | Reference credentials by name and narrowest scope; never inline secrets in workflow JSON or prompts. |
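The idempotency rule in 12-idempotency can be shown directly. This is a sketch, not a specific platform's API: the key is derived deterministically from the execution id, the step name, and the step's inputs, so a retry of the same step reuses the same key and the side effect runs at most once.

```python
import hashlib
import json


def idempotency_key(execution_id: str, step: str, inputs: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) makes the hash
    # stable across retries regardless of dict insertion order.
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(
        f"{execution_id}:{step}:{canonical}".encode()
    ).hexdigest()
    return f"{step}-{digest[:16]}"


k1 = idempotency_key("exec-42", "send_email", {"to": "a@b.com", "subject": "hi"})
k2 = idempotency_key("exec-42", "send_email", {"subject": "hi", "to": "a@b.com"})
k3 = idempotency_key("exec-43", "send_email", {"to": "a@b.com", "subject": "hi"})
# k1 == k2 (same execution, same inputs in any order); k3 is a fresh key
```

The downstream side effect (email send, row insert) checks the key before acting: seen key, skip; new key, act and record it.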
| File | Anti-pattern | One-line rule |
|---|---|---|
| anti-pattern-prompt-in-loop | Calling LLM per iteration when batch works | Default to a single batch prompt with array output; loop only for distinct per-item tool calls. |
| anti-pattern-fire-and-forget | Async steps with no completion signal | Async dispatch is not async completion; wait on the completion event, not the dispatch. |
| anti-pattern-god-workflow | 30-step monoliths vs composable child workflows | If a workflow has > ~15 steps or reusable slices, split into orchestrator + typed children. |
| anti-pattern-llm-as-router | AI for binary decisions a condition handles for free | Use AI for structured output, use conditions to branch on it. |
| anti-pattern-missing-dedup | Polling workflows without a dedup gate (the cost math) | Polling without dedup burns credits proportional to source size × poll frequency. |
| anti-pattern-event-for-intake | App-event triggers where polling + label dedup is correct | Default to scheduled polling with label-based dedup for email/document intake. |
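The prompt-in-loop anti-pattern from the table above is easy to see with a call counter. `llm` below is a stub that counts invocations; in production it would be your model client (the name is illustrative).

```python
import json

calls = {"n": 0}


def llm(prompt: str) -> str:
    calls["n"] += 1
    return prompt.upper()  # stand-in for model output


items = ["alpha", "beta", "gamma"]

# Anti-pattern: one call per item — cost and latency grow with N.
per_item = [llm(f"classify: {item}") for item in items]

# Correct pattern: one call with an array payload, one structured response.
batch = llm("classify each: " + json.dumps(items))
```

After running both, the loop has made `len(items)` calls while the batch made one; at real per-call pricing that difference is the entire cost story.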
Every pattern follows the same structure so they're fast to scan:
## Pattern Name
**Problem**: One sentence — what goes wrong without this.
**Why it fails silently**: The specific failure mode developers don't see coming.
### Anti-pattern
[the wrong way]
### Correct pattern
[the right way]
### One-line rule
> Always X, never Y. Reason in one clause.
Patterns are most valuable when they come from real production failures — not theoretical advice.
- Open an issue using the new pattern template
- Or submit a PR adding a file to `patterns/v1/` following the format above
- Patterns must include a "why it fails silently" section — that's the hard-won knowledge
See MAINTAINERS.md for the people who review PRs.
Hit a pattern that didn't work? Found something missing? → Open a feedback issue — it directly shapes what gets built in v2.
MIT — use freely, attribution appreciated.