# Veronica Core: Circuit Breaker for AG2 Agents

Production multi-agent systems fail in two distinct ways:

- **Individual agent failure** - one LLM endpoint degrades while others are healthy.
- **System-wide emergency** - something is deeply wrong and every agent must stop immediately.

The [veronica-core](https://github.com/amabito/veronica-core) library handles both with a single
`CircuitBreakerCapability` that attaches to any AG2 agent via the standard `add_to_agent()` pattern.

This notebook walks through three demos:

1. **Basic circuit breaker** - an agent trips after repeated failures; callers receive `None` instead of hanging.
2. **System-wide SAFE_MODE** - a shared `VeronicaIntegration` blocks all agents instantly on anomaly detection, then recovers in two steps.
3. **Per-agent isolation** - a broken agent's open circuit does not affect healthy agents sharing the same capability instance.

## Installation

```bash
pip install -U "autogen[openai]" veronica-core
```

## Imports

In [None]:
# Copyright (c) 2023 - 2026, AG2ai, Inc., AG2ai open-source projects maintainers and core contributors
# SPDX-License-Identifier: Apache-2.0

from autogen import ConversableAgent

from veronica_core import (
    CircuitBreakerCapability,
    CircuitState,
    MemoryBackend,
    VeronicaIntegration,
    VeronicaState,
)

## Demo 1: Basic Circuit Breaker

A `CircuitBreakerCapability` wraps `agent.generate_reply()` transparently.
When an agent returns `None` (the AG2 convention for "I have no reply"),
the breaker counts it as a failure. After `failure_threshold` consecutive
failures the circuit opens, and subsequent calls return `None` immediately
without invoking the agent.
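
Before running the demo, it may help to picture the counting rule as a toy breaker in plain Python. This is only a sketch under stated assumptions: `ToyCircuitState` and `ToyBreaker` are hypothetical names, the reset-on-success behaviour is inferred from the word "consecutive", and any recovery logic the real library implements is left out. It is not veronica-core's implementation.

```python
from enum import Enum
from typing import Callable, List, Optional


class ToyCircuitState(Enum):
    CLOSED = "closed"  # calls pass through to the wrapped function
    OPEN = "open"      # calls are short-circuited


class ToyBreaker:
    """Hypothetical consecutive-failure breaker (illustration only)."""

    def __init__(self, failure_threshold: int) -> None:
        self.failure_threshold = failure_threshold
        self.failure_count = 0
        self.state = ToyCircuitState.CLOSED

    def call(self, fn: Callable[[], Optional[str]]) -> Optional[str]:
        # Open circuit: return None without invoking fn at all.
        if self.state is ToyCircuitState.OPEN:
            return None
        result = fn()
        if result is None:
            # A None result counts as one failure.
            self.failure_count += 1
            if self.failure_count >= self.failure_threshold:
                self.state = ToyCircuitState.OPEN
        else:
            # Assumption: a success resets the consecutive-failure streak.
            self.failure_count = 0
        return result


invocations: List[int] = []  # records every real call to the failing function


def always_fails() -> Optional[str]:
    invocations.append(1)
    return None


toy_breaker = ToyBreaker(failure_threshold=3)
for _ in range(4):
    toy_breaker.call(always_fails)

print(toy_breaker.state)   # ToyCircuitState.OPEN
print(len(invocations))    # 3 -- the fourth call never reached always_fails
```

The fourth call returns `None` without touching `always_fails`, which is the "fail fast instead of hanging" behaviour the demo shows with a real agent.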

In [None]:
# An agent whose backend is completely broken (always returns None)
planner = ConversableAgent("planner", llm_config=False)
planner.register_reply(
    trigger=lambda _: True,
    reply_func=lambda agent, messages, sender, config: (True, None),
    position=0,
    remove_other_reply_funcs=True,
)

cap = CircuitBreakerCapability(failure_threshold=3)
cap.add_to_agent(planner)

breaker = cap.get_breaker("planner")
print(f"initial state  : {breaker.state}")  # CircuitState.CLOSED

msg = [{"role": "user", "content": "test"}]

# Three None replies trip the circuit
for _ in range(3):
    planner.generate_reply(msg)

print(f"after 3 failures: {breaker.state}")       # CircuitState.OPEN
print(f"failure count   : {breaker.failure_count}")  # 3

# Subsequent calls are short-circuited -- the agent is never invoked
reply = planner.generate_reply(msg)
print(f"reply when open : {reply!r}")  # None

## Demo 2: System-wide SAFE_MODE

When multiple agents share a single `VeronicaIntegration`, any component can
trigger a system-wide halt by transitioning the shared state to `SAFE_MODE`.
All agents are blocked immediately, with no code changes at call sites.

Recovery requires two explicit transitions (`SAFE_MODE → IDLE → SCREENING`);
skipping straight to `SCREENING` is not valid.
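
The transition rules above can be pictured as a small lookup table. The sketch below is an assumption-laden illustration: `ToyMode`, `TOY_ALLOWED`, and `ToyModeMachine` are hypothetical names, the table lists only the edges this notebook mentions (the real library may allow others), and returning `False` on an invalid transition is a choice made here, not a statement about how veronica-core reports it.

```python
from enum import Enum


class ToyMode(Enum):
    SCREENING = "screening"
    SAFE_MODE = "safe_mode"
    IDLE = "idle"


# Only the edges described in this notebook; illustrative, not exhaustive.
TOY_ALLOWED = {
    (ToyMode.SCREENING, ToyMode.SAFE_MODE),  # halt
    (ToyMode.SAFE_MODE, ToyMode.IDLE),       # recovery step 1
    (ToyMode.IDLE, ToyMode.SCREENING),       # recovery step 2
}


class ToyModeMachine:
    """Hypothetical state holder that starts in SCREENING."""

    def __init__(self) -> None:
        self.current = ToyMode.SCREENING

    def transition(self, target: ToyMode) -> bool:
        """Move to target if the edge is allowed; report whether it happened."""
        if (self.current, target) not in TOY_ALLOWED:
            return False
        self.current = target
        return True


machine = ToyModeMachine()
machine.transition(ToyMode.SAFE_MODE)             # halt
skipped = machine.transition(ToyMode.SCREENING)   # rejected: must pass through IDLE
machine.transition(ToyMode.IDLE)
resumed = machine.transition(ToyMode.SCREENING)

print(skipped, resumed, machine.current)  # False True ToyMode.SCREENING
```

Forcing the intermediate `IDLE` step means resuming always takes two deliberate actions, so a single stray call cannot lift a halt.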

In [None]:
def _always_ok(agent, messages, sender, config):
    return True, f"{agent.name}: ok"


# MemoryBackend keeps state in-process -- no files written during the demo
veronica = VeronicaIntegration(backend=MemoryBackend())
cap2 = CircuitBreakerCapability(failure_threshold=5, veronica=veronica)

msg = [{"role": "user", "content": "test"}]

planner2 = ConversableAgent("planner", llm_config=False)
executor2 = ConversableAgent("executor", llm_config=False)
for agent in (planner2, executor2):
    agent.register_reply(
        trigger=lambda _: True,
        reply_func=_always_ok,
        position=0,
        remove_other_reply_funcs=True,
    )
    cap2.add_to_agent(agent)

# Both agents are healthy
print(planner2.generate_reply(msg))   # planner: ok
print(executor2.generate_reply(msg))  # executor: ok

# Anomaly detected -- halt everything immediately
# VeronicaIntegration starts in SCREENING, so SCREENING -> SAFE_MODE is valid
veronica.state.transition(VeronicaState.SAFE_MODE, reason="anomaly detected")
print(planner2.generate_reply(msg))   # None -- blocked by SAFE_MODE
print(executor2.generate_reply(msg))  # None -- blocked by SAFE_MODE

# Two-step recovery: confirm stability (IDLE), then resume screening
veronica.state.transition(VeronicaState.IDLE, reason="anomaly resolved")
veronica.state.transition(VeronicaState.SCREENING, reason="resuming")
print(planner2.generate_reply(msg))   # planner: ok
print(executor2.generate_reply(msg))  # executor: ok

## Demo 3: Per-agent Isolation

Each call to `add_to_agent()` creates an independent `CircuitBreaker` for that
agent. A broken agent's circuit opening does not affect any other agent, even
when they share the same `CircuitBreakerCapability` instance.
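
One way to picture this isolation is a registry that maps each agent name to its own counter. The sketch below is a toy under stated assumptions: `ToyCounter` and `ToyCapability` are hypothetical names, and keying by agent name is inferred from the `get_breaker(agent.name)` calls in this notebook rather than from the library's source.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ToyCounter:
    """Hypothetical stand-in for a per-agent breaker."""

    threshold: int
    failures: int = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.threshold


class ToyCapability:
    """One shared object holding one independent counter per agent name."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self._breakers: Dict[str, ToyCounter] = {}

    def register(self, agent_name: str) -> ToyCounter:
        # Each registration creates a fresh counter; nothing is shared between agents.
        self._breakers[agent_name] = ToyCounter(self.threshold)
        return self._breakers[agent_name]

    def get_breaker(self, agent_name: str) -> ToyCounter:
        return self._breakers[agent_name]


toy_cap = ToyCapability(threshold=2)
toy_cap.register("healthy")
toy_cap.register("broken")

# Two failures recorded against "broken" only.
toy_cap.get_breaker("broken").failures += 2

print(toy_cap.get_breaker("broken").is_open)   # True
print(toy_cap.get_breaker("healthy").is_open)  # False
```

Because the only shared value is the threshold, failures recorded for one name cannot move another agent's counter.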

In [None]:
cap3 = CircuitBreakerCapability(failure_threshold=2)

healthy = ConversableAgent("healthy", llm_config=False)
healthy.register_reply(
    trigger=lambda _: True,
    reply_func=lambda agent, messages, sender, config: (True, "healthy: ok"),
    position=0,
    remove_other_reply_funcs=True,
)

broken = ConversableAgent("broken", llm_config=False)
broken.register_reply(
    trigger=lambda _: True,
    reply_func=lambda agent, messages, sender, config: (True, None),
    position=0,
    remove_other_reply_funcs=True,
)

cap3.add_to_agent(healthy)
cap3.add_to_agent(broken)

msg = [{"role": "user", "content": "test"}]

# Trip the broken agent's circuit
broken.generate_reply(msg)
broken.generate_reply(msg)
print(f"broken  state: {cap3.get_breaker('broken').state}")   # CircuitState.OPEN

# The healthy agent is completely unaffected -- same cap, independent breaker
print(f"healthy reply: {healthy.generate_reply(msg)!r}")         # 'healthy: ok'
print(f"healthy state: {cap3.get_breaker('healthy').state}")    # CircuitState.CLOSED

## Summary

| Feature | API |
|---------|-----|
| Protect an agent | `cap.add_to_agent(agent)` |
| Inspect circuit state | `cap.get_breaker(agent.name).state` |
| System-wide halt | `veronica.state.transition(VeronicaState.SAFE_MODE, ...)` |
| Recovery | `SAFE_MODE -> IDLE -> SCREENING` (two explicit transitions) |
| Backend for demos | `MemoryBackend()` (no file I/O) |

Existing `agent.generate_reply(messages)` calls need no changes.
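
Call sites can stay unchanged when a guard replaces the method on the agent instance itself. The sketch below shows one common way to do that in plain Python. `ToyAgent` and `guard` are hypothetical, and nothing here claims to be how veronica-core actually hooks into AG2.

```python
from typing import List, Optional


class ToyAgent:
    """Hypothetical stand-in for an agent; not an AG2 class."""

    def __init__(self, name: str) -> None:
        self.name = name

    def generate_reply(self, messages: List[dict]) -> Optional[str]:
        return f"{self.name}: ok"


def guard(agent: ToyAgent, blocked: List[bool]) -> None:
    """Replace agent.generate_reply with a wrapper that checks a shared flag."""
    original = agent.generate_reply  # bound method captured before patching

    def wrapped(messages: List[dict]) -> Optional[str]:
        if blocked[0]:
            return None  # short-circuit without calling the original
        return original(messages)

    # The instance attribute shadows the class method, so callers see no difference.
    agent.generate_reply = wrapped


halt = [False]  # shared mutable flag, standing in for a system-wide state
toy_agent = ToyAgent("planner")
guard(toy_agent, halt)

toy_msg = [{"role": "user", "content": "test"}]
before = toy_agent.generate_reply(toy_msg)
halt[0] = True
after = toy_agent.generate_reply(toy_msg)

print(before, after)  # planner: ok None
```

The caller writes `toy_agent.generate_reply(toy_msg)` in both cases; only the object behind the name has changed.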