runcycles · amavashev · Mar 19, 2026 · Mar 19, 2026 · Mar 19, 2026
diff --git a/.vitepress/config.ts b/.vitepress/config.ts
@@ -197,6 +197,7 @@ export default defineConfig({
           text: 'Integrations',
           collapsed: false,
           items: [
+            { text: 'Ecosystem', link: '/how-to/ecosystem', badge: { text: 'New', type: 'tip' } },
             { text: 'Overview', link: '/how-to/integrations-overview' },
             { text: 'MCP (Claude, Cursor, Windsurf)', link: '/how-to/integrating-cycles-with-mcp' },
             { text: 'OpenAI', link: '/how-to/integrating-cycles-with-openai' },
@@ -264,6 +265,13 @@ export default defineConfig({
             { text: 'Security Hardening', link: '/how-to/security-hardening' },
             { text: 'Changelog', link: '/changelog' },
           ]
+        },
+        {
+          text: 'Community',
+          collapsed: true,
+          items: [
+            { text: 'Built with Cycles Badges', link: '/community/badges' },
+          ]
         }
       ],
     },

diff --git a/blog/agent-budget-patterns-visual-guide.md b/blog/agent-budget-patterns-visual-guide.md
@@ -0,0 +1,271 @@
+---
+title: "AI Agent Budget Patterns: A Practical Guide"
+date: 2026-03-19
+author: Cycles Team
+tags: [patterns, budgets, architecture, guide]
+description: "A practical reference for structuring AI agent budgets — covering tenant isolation, workflow caps, run-level limits, graceful degradation, and more."
+blog: true
+sidebar: false
+---
+
+# AI Agent Budget Patterns: A Practical Guide
+
+Every team running AI agents in production eventually faces the same question: how should we structure our budgets? Too coarse and a single runaway agent burns through the allocation. Too granular and the overhead of managing hundreds of micro-budgets becomes its own problem. This guide covers the six patterns we see most often, with concrete examples and trade-offs for each.
+
+<!-- more -->
+
+These patterns aren't mutually exclusive — most production systems combine two or three. The [common budget patterns](/how-to/common-budget-patterns) page in our docs covers the Cycles-specific implementation details; this post focuses on the architectural thinking behind each approach.
+
+## Pattern 1: Tenant isolation budgets
+
+**When to use:** Multi-tenant platforms where each customer or team gets their own AI agent access and you need hard spend isolation between them.
+
+The simplest and most common starting point. Each tenant gets an independent budget that cannot be exceeded, regardless of what other tenants are doing.
+
+```python
+# Tenant isolation: each tenant has a completely independent budget
+tenant_budget = cycles.create_budget(
+    scope=f"tenant:{tenant_id}",
+    limit_dollars=500.00,
+    period="monthly",
+    on_exhausted="deny"
+)
+
+# Every agent call for this tenant checks against their budget
+async def run_agent_for_tenant(tenant_id, task):
+    budget = cycles.get_budget(scope=f"tenant:{tenant_id}")
+    result = await budget.execute(
+        agent.run(task),
+        estimated_cost=estimate_task_cost(task)
+    )
+    return result
+```
+
+**Trade-offs:**
+- Provides complete blast-radius isolation — one tenant's runaway agent cannot affect others
+- Simple to reason about and explain to customers
+- Can lead to underutilization: if Tenant A uses 10% of their budget and Tenant B hits 100%, there's no sharing
+- Requires careful initial sizing — set too low and legitimate workloads get blocked
+
+This pattern maps directly to how [tenant, workflow, and run budgets](/how-to/how-to-model-tenant-workflow-and-run-budgets-in-cycles) work in Cycles.
+
+## Pattern 2: Workflow-level caps
+
+**When to use:** When different agent workflows have different cost profiles and risk levels, and you want to cap each independently.
+
+A code review agent and a deep research agent have very different cost characteristics. Workflow-level caps let you set appropriate limits for each.
+
+```python
+# Different workflows get different budgets
+workflow_budgets = {
+    "code-review":     {"limit": 2.00,  "per": "run"},
+    "deep-research":   {"limit": 25.00, "per": "run"},
+    "summarization":   {"limit": 5.00,  "per": "run"},
+    "chat":            {"limit": 1.00,  "per": "session"},
+}
+
+async def run_workflow(workflow_type, input_data):
+    config = workflow_budgets[workflow_type]
+    budget = cycles.create_budget(
+        scope=f"workflow:{workflow_type}:{run_id}",
+        limit_dollars=config["limit"],
+        period=config["per"]
+    )
+    return await budget.execute(agent.run(input_data))
+```
+
+**Trade-offs:**
+- Right-sized limits for each use case reduce both waste and false denials
+- Makes cost profiles explicit and auditable
+- Requires understanding the cost distribution of each workflow upfront
+- New workflows need budget configuration before deployment
+
+## Pattern 3: Per-run budgets with graceful degradation
+
+**When to use:** When you want agents to produce _some_ result even when they hit budget limits, rather than failing entirely.
+
+This is the pattern that separates production-grade agent systems from prototypes. Instead of a hard stop at budget exhaustion, the agent downgrades its approach.
+
+```python
+async def research_with_degradation(query, budget_dollars=10.00):
+    budget = cycles.create_budget(
+        scope=f"run:{run_id}",
+        limit_dollars=budget_dollars
+    )
+
+    # Phase 1: Use the best model
+    remaining = budget.remaining()
+    if remaining > 5.00:
+        result = await budget.execute(
+            agent.run(query, model="claude-opus-4-20250514")
+        )
+    # Phase 2: Fall back to a cheaper model
+    elif remaining > 1.00:
+        result = await budget.execute(
+            agent.run(query, model="claude-sonnet-4-20250514")
+        )
+    # Phase 3: Return cached/partial results
+    else:
+        result = get_cached_or_partial_result(query)
+        result.metadata["degraded"] = True
+
+    return result
+```
+
+**Trade-offs:**
+- Users get a result instead of an error, improving perceived reliability
+- Requires designing multiple quality tiers for each workflow
+- The "degraded" signal needs to propagate to the user — silent degradation erodes trust
+- More complex to test: you need to validate each fallback tier
+
+We cover degradation strategies in detail in [How to Think About Degradation Paths](/how-to/how-to-think-about-degradation-paths-in-cycles-deny-downgrade-disable-or-defer).
+
+## Pattern 4: Shared pool with priority tiers
+
+**When to use:** When you want to maximize utilization of a fixed budget across multiple agents or users, with guarantees for high-priority work.
+
+Instead of giving each consumer a fixed allocation, you share a pool but enforce priority ordering when the pool runs low.
+
+```python
+# Shared pool with priority tiers
+pool = cycles.create_budget(
+    scope="org:engineering",
+    limit_dollars=5000.00,
+    period="monthly"
+)
+
+# Priority tiers determine who gets denied first
+PRIORITY_THRESHOLDS = {
+    "critical":  0.0,   # Only denied at $0 remaining
+    "high":      0.10,  # Denied below 10% remaining
+    "normal":    0.25,  # Denied below 25% remaining
+    "low":       0.50,  # Denied below 50% remaining
+    "bulk":      0.70,  # Denied below 70% remaining (off-peak only)
+}
+
+async def execute_with_priority(task, priority="normal"):
+    remaining_fraction = pool.remaining() / pool.limit
+    threshold = PRIORITY_THRESHOLDS[priority]
+
+    if remaining_fraction <= threshold:
+        raise BudgetExhaustedError(
+            f"Pool at {remaining_fraction:.0%}, "
+            f"threshold for '{priority}' is {threshold:.0%}"
+        )
+
+    return await pool.execute(task)
+```
+
+**Trade-offs:**
+- Higher overall utilization — no budget sits idle while another is exhausted
+- Critical work is protected even under heavy load
+- Harder to predict per-team or per-user costs for billing purposes
+- Requires agreement on what constitutes "critical" vs. "low" priority
+- Risk of low-priority work getting permanently starved in busy periods
+
+## Pattern 5: Shadow mode rollout
+
+**When to use:** When you're introducing budget controls to an existing system and need to validate limits before enforcing them.
+
+This is less a budget _structure_ and more a deployment pattern, but it's essential for any team that isn't starting from scratch. Shadow mode tracks what _would_ have been denied without actually denying anything.
+
+```python
+# Shadow mode: log but don't enforce
+budget = cycles.create_budget(
+    scope=f"tenant:{tenant_id}",
+    limit_dollars=100.00,
+    period="daily",
+    mode="shadow"  # Track but don't enforce
+)
+
+# In shadow mode, execute() always succeeds but logs violations
+result = await budget.execute(agent.run(task))
+
+# After a validation period, check the shadow logs
+shadow_report = cycles.get_shadow_report(
+    scope=f"tenant:{tenant_id}",
+    period="last_7_days"
+)
+# Output: "23 calls would have been denied. Peak overage: $47.30."
+# Now you can tune the limit before switching to enforce mode.
+```
+
+**Trade-offs:**
+- Zero risk of breaking production workflows during rollout
+- Generates real data for sizing budgets accurately
+- Adds latency (the budget check still happens, just without enforcement)
+- Teams sometimes stay in shadow mode too long, delaying the value of enforcement
+
+Our [shadow mode rollout guide](/how-to/shadow-mode-in-cycles-how-to-roll-out-budget-enforcement-without-breaking-production) walks through the full process, including how to analyze shadow logs and choose enforcement cutover criteria.
+
+## Pattern 6: Hybrid model (tokens + dollars)
+
+**When to use:** When you need to track both the raw resource consumption (tokens) and the monetary cost (dollars), because they don't always move in lockstep.
+
+Token counts and dollar costs diverge when you use multiple models, when pricing changes, or when non-LLM tools (web search, code execution) are part of the agent's toolkit.
+
+```python
+# Hybrid budget: track both dimensions
+budget = cycles.create_budget(
+    scope=f"run:{run_id}",
+    limits={
+        "tokens": 500_000,       # Hard cap on token consumption
+        "dollars": 15.00,        # Hard cap on dollar spend
+    },
+    on_exhausted="deny_and_notify"
+)
+
+async def execute_hybrid(task):
+    # Both limits are checked atomically
+    result = await budget.execute(
+        agent.run(task),
+        estimated={
+            "tokens": estimate_tokens(task),
+            "dollars": estimate_cost(task),
+        }
+    )
+    return result
+
+# Useful for cases where a cheap model uses many tokens
+# or an expensive model uses few
+```
+
+**Trade-offs:**
+- Catches scenarios that a single-dimension budget misses (e.g., a cheap model looping uses few dollars but millions of tokens)
+- Useful for capacity planning beyond just cost
+- More complex to configure and explain to users
+- Requires accurate estimation for both dimensions
+
+## Combining patterns
+
+Most production systems layer two or three of these patterns. A common combination:
+
+1. **Tenant isolation** (Pattern 1) as the outer boundary
+2. **Workflow caps** (Pattern 2) within each tenant
+3. **Graceful degradation** (Pattern 3) within each workflow run
+4. **Shadow mode** (Pattern 5) for rollout
+
+This gives you hard isolation between customers, right-sized limits per use case, user-friendly behavior at the limits, and a safe path to enforcement.
+
+```
+Tenant Budget ($500/mo)
+├── Code Review Workflow ($2/run)
+│   └── Per-run with degradation
+├── Research Workflow ($25/run)
+│   └── Per-run with degradation
+└── Chat Workflow ($1/session)
+    └── Hard deny at limit
+```
+
+The [budget allocation and management guide](/how-to/budget-allocation-and-management-in-cycles) covers how to implement these hierarchies in Cycles, and the [cost estimation cheat sheet](/how-to/cost-estimation-cheat-sheet) helps with initial sizing for each tier.
+
+## Choosing your starting point
+
+If you're unsure where to begin:
+
+- **Single-tenant, single-agent:** Start with Pattern 3 (per-run with degradation)
+- **Multi-tenant SaaS:** Start with Pattern 1 (tenant isolation) + Pattern 5 (shadow mode)
+- **Internal platform with multiple teams:** Start with Pattern 4 (shared pool with priority)
+- **Migrating from no controls:** Start with Pattern 5 (shadow mode) to gather data first
+
+The most important step isn't picking the perfect pattern — it's having _any_ budget boundary in the execution path. You can always refine the structure later. You can't un-spend money that an uncontrolled agent already burned.