# Quantum Volume Finder

## A Deep Agent Example for Actual QV Measurement

This notebook demonstrates a multi-agent system that **finds the highest achievable Quantum Volume** for IBM Quantum backends through **actual hardware execution**.

Unlike simple analysis tools, this agent **runs experiments** and reports **actual results**.
It uses a top-down strategy: start at the highest requested depth and work down until it finds a depth that passes the QV criteria.

### What is Quantum Volume?

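Quantum Volume (QV) 2^n is **achieved** when n-qubit, depth-n random circuits satisfy:
- Heavy Output Probability (HOP) > 2/3
- HOP = (shots resulting in heavy outputs) / (total shots)
- A heavy output is a bitstring whose ideal (noise-free) probability is above the median over all 2^n bitstrings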
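
To make the definition concrete, here is a minimal sketch that picks the heavy outputs of a toy distribution and computes HOP from counts. It treats a bitstring as heavy when its ideal probability is above the median, which is the rule the helper functions later in this notebook use. The probabilities and counts are invented for illustration, and the function names are hypothetical.

```python
from statistics import median


def heavy_outputs(ideal_probs: dict[str, float]) -> set[str]:
    """Return the bitstrings whose ideal probability is above the median."""
    cutoff = median(ideal_probs.values())
    return {bits for bits, p in ideal_probs.items() if p > cutoff}


def hop(counts: dict[str, int], heavy: set[str]) -> float:
    """Fraction of shots that landed on a heavy output."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for bits, c in counts.items() if bits in heavy) / total


# Invented ideal distribution for a 2-qubit circuit (median is 0.25).
ideal = {"00": 0.40, "01": 0.10, "10": 0.35, "11": 0.15}
heavy = heavy_outputs(ideal)

# Invented measurement counts, 1000 shots in total.
counts = {"00": 380, "01": 120, "10": 330, "11": 170}
value = hop(counts, heavy)

print(sorted(heavy), value, value > 2 / 3)
```

Here the heavy outputs are `00` and `10`, so HOP is (380 + 330) / 1000 = 0.71, which is above the 2/3 threshold.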

### Strategy: Top-Down Search

1. Start at n = max_depth (e.g., 5)
2. Run the n-qubit, depth-n QV circuit on hardware
3. Calculate HOP from the measurement results
4. If HOP > 2/3: **SUCCESS!** QV 2^n achieved
5. If HOP <= 2/3: retry with n - 1
6. Repeat until a depth passes or depth 2 has been tried
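
The search itself is a plain descending loop. The sketch below shows the control flow only; `measure_hop` is a hypothetical callable standing in for the whole transpile, submit, and count pipeline, and the HOP values are invented.

```python
from typing import Callable, Optional

THRESHOLD = 2 / 3


def find_highest_passing_depth(
    max_depth: int,
    measure_hop: Callable[[int], float],
    min_depth: int = 2,
) -> Optional[int]:
    """Try depths from max_depth down to min_depth; return the first that passes."""
    for depth in range(max_depth, min_depth - 1, -1):
        if measure_hop(depth) > THRESHOLD:
            return depth
    return None  # no depth passed


# Invented HOP values standing in for hardware results.
fake_hop = {5: 0.58, 4: 0.64, 3: 0.72, 2: 0.81}
best = find_highest_passing_depth(5, fake_hop.__getitem__)

if best is None:
    print("No depth passed")
else:
    print(f"Highest passing depth: {best} (QV {2 ** best})")
```

Because the loop stops at the first pass, lower depths are never run once a higher one succeeds, which saves hardware time at the cost of not confirming the lower depths.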

## Architecture

```
                    QUANTUM VOLUME FINDER
                      (Coordinator Agent)
                             |
          +------------------+------------------+
          |                  |                  |
          v                  v                  v
   BACKEND ANALYST    QUBIT CHAIN         QV EXPERIMENT
                      OPTIMIZER           RUNNER
          |                  |                  |
          v                  v                  v
   qiskit-ibm-        qiskit-ibm-        qiskit-ibm-
   runtime-mcp        runtime-mcp        transpiler-mcp
   (backends)         (QV qubit tools    + runtime-mcp
                       searches ALL      (transpile,
                       qubits)           run_sampler,
                                         get_results)
```

### Key Tools
- `find_optimal_qv_qubits_tool`: Searches **ALL** qubits on backend (not just first 10)
- `hybrid_ai_transpile_tool`: AI-powered circuit transpilation
- `run_sampler_tool`: Submits circuits to hardware
- `get_job_results_tool`: Retrieves measurement counts

### Local Helper Functions
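- `generate_qv_qasm()`: Creates a QV circuit as OpenQASM 3 (no measurements, no heavy outputs)
- `generate_qv_circuit_with_ideal_distribution()`: Creates a measured QV circuit and computes its heavy outputs from an ideal statevector simulation
- `calculate_heavy_output_probability()`: Calculates HOP from measurement counts and a heavy-output list
- `analyze_qv_experiment_results()`: Computes the mean HOP and a confidence interval across multiple circuits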
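
HOP is only meaningful when the heavy-output bitstrings and the count keys use the same bit order. The following library-free sketch shows how a reversed convention silently changes the result; the bitstrings, counts, and function names are invented for illustration.

```python
def hop(counts: dict[str, int], heavy: set[str]) -> float:
    """Fraction of shots whose bitstring is in the heavy set."""
    total = sum(counts.values())
    return sum(c for bits, c in counts.items() if bits in heavy) / total


def reverse_keys(counts: dict[str, int]) -> dict[str, int]:
    """Flip every key, e.g. '01' becomes '10' (opposite bit-order convention)."""
    return {bits[::-1]: c for bits, c in counts.items()}


heavy = {"01", "11"}
counts = {"00": 100, "01": 600, "10": 100, "11": 200}

matched = hop(counts, heavy)                   # same convention on both sides
mismatched = hop(reverse_keys(counts), heavy)  # counts in the opposite order

print(matched, mismatched)
```

With matching conventions the HOP is 0.8; with the keys reversed it drops to 0.3, even though the underlying measurements are identical. Palindromic bitstrings such as `00` and `11` are unaffected, so the error can be easy to miss on small circuits.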

## Setup

```bash
pip install deepagents langchain langchain-mcp-adapters python-dotenv
pip install langchain-anthropic
pip install qiskit-mcp-servers
```
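
The notebook reads `QISKIT_IBM_TOKEN` and `ANTHROPIC_API_KEY` from the environment (optionally loaded from a `.env` file). A small pre-flight check like the sketch below can catch a missing credential before any MCP server is started; the helper name `missing_env_vars` is hypothetical.

```python
import os

REQUIRED_VARS = ("QISKIT_IBM_TOKEN", "ANTHROPIC_API_KEY")


def missing_env_vars(names=REQUIRED_VARS, environ=None) -> list[str]:
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```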

In [None]:
import os
import sys
from datetime import datetime

from deepagents import create_deep_agent
from dotenv import load_dotenv
from langchain_anthropic import ChatAnthropic
from langchain_core.callbacks import BaseCallbackHandler
from langchain_mcp_adapters.client import MultiServerMCPClient


load_dotenv()

print("Configuration:")
print(f"  QISKIT_IBM_TOKEN: {'Set' if os.getenv('QISKIT_IBM_TOKEN') else 'Not set'}")
print(f"  ANTHROPIC_API_KEY: {'Set' if os.getenv('ANTHROPIC_API_KEY') else 'Not set'}")


# Callback handler for agent observability
class AgentActivityHandler(BaseCallbackHandler):
    """Shows what the agent is doing during execution."""

    def __init__(self, verbose: bool = True):
        self.verbose = verbose
        self.indent_level = 0
        self.current_tool = None

    def _timestamp(self) -> str:
        return datetime.now().strftime("%H:%M:%S")

    def _print(self, msg: str, color: str = "") -> None:
        indent = "  " * self.indent_level
        if color and sys.stdout.isatty():
            colors = {
                "blue": "\033[94m",
                "green": "\033[92m",
                "yellow": "\033[93m",
                "red": "\033[91m",
                "cyan": "\033[96m",
                "magenta": "\033[95m",
                "reset": "\033[0m",
            }
            print(f"{colors.get(color, '')}{indent}{msg}{colors['reset']}", flush=True)
        else:
            print(f"{indent}{msg}", flush=True)

    def on_tool_start(self, serialized: dict | None, input_str: str, **kwargs) -> None:
        tool_name = serialized.get("name", "unknown_tool") if serialized else "unknown_tool"
        self.current_tool = tool_name
        self._print(f"\n[{self._timestamp()}] 🔧 TOOL: {tool_name}", "cyan")
        self.indent_level += 1
        if self.verbose and input_str:
            input_preview = str(input_str)[:200] + ("..." if len(str(input_str)) > 200 else "")
            self._print(f"Input: {input_preview}", "blue")

    def on_tool_end(self, output: str, **kwargs) -> None:
        if self.verbose and output:
            output_preview = str(output)[:300] + ("..." if len(str(output)) > 300 else "")
            self._print(f"Output: {output_preview}", "green")
        self.indent_level = max(0, self.indent_level - 1)
        self._print(f"[{self._timestamp()}] ✓ {self.current_tool} complete", "green")

    def on_tool_error(self, error: Exception, **kwargs) -> None:
        self.indent_level = max(0, self.indent_level - 1)
        self._print(f"[{self._timestamp()}] ✗ {self.current_tool} failed: {error}", "red")

    def on_agent_action(self, action, **kwargs) -> None:
        tool = getattr(action, "tool", "unknown")
        self._print(f"\n[{self._timestamp()}] 🤖 Agent calling: {tool}", "yellow")

    def on_agent_finish(self, finish, **kwargs) -> None:
        self._print(f"\n[{self._timestamp()}] ✅ Agent finished", "green")

    def on_llm_start(self, serialized: dict | None, prompts: list, **kwargs) -> None:
        if self.verbose:
            model = (
                (serialized.get("name") or serialized.get("id", ["LLM"])[-1])
                if serialized
                else "LLM"
            )
            self._print(f"[{self._timestamp()}] 💭 {model} thinking...", "blue")


# Create the callback handler (toggle verbose to control output)
callback_handler = AgentActivityHandler(verbose=True)

In [None]:
# System prompts for iterative QV finding

COORDINATOR_PROMPT = """
##############################################################################
#                        STOP! READ THIS FIRST!                              #
##############################################################################

When you call the `task` tool, you MUST ALWAYS provide BOTH parameters:

    task(subagent_type="...", description="...")
         ^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^
         REQUIRED              REQUIRED (DO NOT OMIT!)

If you call task WITHOUT description, it WILL FAIL with "Field required" error.

WRONG (WILL FAIL):  task(subagent_type="qv-experiment-runner")
RIGHT (WILL WORK):  task(subagent_type="qv-experiment-runner", description="Run QV...")

##############################################################################

You are the Quantum Volume Finder. Find the highest achievable QV through experiments.

## Strategy
Start from the HIGHEST depth and work DOWN until HOP > 2/3.

## Your Subagents (use with task tool - ALWAYS include description!)

1. **backend-analyst**: Get backend info
   task(subagent_type="backend-analyst", description="Get properties for ibm_boston")

2. **qubit-chain-optimizer**: Find optimal qubits
   task(subagent_type="qubit-chain-optimizer", description="Find 10 qubit subsets for depth 5 on ibm_boston")

3. **qv-experiment-runner**: Run QV experiment (include circuit QASM in description!)
   task(subagent_type="qv-experiment-runner", description="Backend: ibm_boston. Qubits: [1,2,3,4,5]. Circuit: OPENQASM 3.0; ...")

## Output Format
Report ACTUAL results for each depth tried:
- Depth, Qubits, Job ID, Counts, HOP, PASS/FAIL
- Final: Highest achieved QV

DO NOT use write_file. Return results as text."""

BACKEND_ANALYST_PROMPT = """You are the Backend Analyst. List backends, get properties."""

QUBIT_CHAIN_PROMPT = """You are the Qubit Chain Optimizer.

Use find_optimal_qv_qubits_tool - it searches ALL qubits on the backend (not just first 10).
Request num_results=10 to get multiple candidates.
Return top 10 qubit subsets with scores."""

QV_EXPERIMENT_RUNNER_PROMPT = """
##############################################################################
#                        CRITICAL INSTRUCTIONS                               #
##############################################################################

You MUST complete ALL 5 steps. Do NOT stop early.

When calling run_sampler_tool in Step 2, you MUST pass the circuit_qpy value
from Step 1's response. If you call it with empty arguments, it WILL FAIL.

##############################################################################

## STEP 1: Transpile the circuit
Call hybrid_ai_transpile_tool with the QASM from your task description.
SAVE the "circuit_qpy" value from the response - you need it for Step 2!

## STEP 2: Submit to hardware
Call run_sampler_tool with:
- circuit: THE circuit_qpy VALUE FROM STEP 1 (a long base64 string)
- backend_name: the backend
- shots: 4096

CRITICAL: The circuit parameter is REQUIRED. Use the circuit_qpy from Step 1.

## STEP 3: Wait for completion
Poll get_job_status_tool until job_status is "DONE".

## STEP 4: Get results
Call get_job_results_tool to get the counts.

## STEP 5: Report back
Return: Backend, Depth, Qubits, Job ID, Counts
"""
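
Step 3 of the runner prompt asks the agent to poll until the job is done. For reference, the same logic written as ordinary code looks roughly like the sketch below. `get_status` is a hypothetical callable standing in for a call to `get_job_status_tool`. Only `"DONE"` comes from the prompt above; treating `"ERROR"` and `"CANCELLED"` as terminal states is an assumption.

```python
import time
from typing import Callable

# "DONE" is named in the prompt; the other two states are assumed here.
TERMINAL_STATES = {"DONE", "ERROR", "CANCELLED"}


def wait_for_job(
    get_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until a terminal state is seen or the timeout elapses."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"Job still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Simulated status sequence; no real job is involved.
fake_statuses = iter(["QUEUED", "RUNNING", "DONE"])
final = wait_for_job(lambda: next(fake_statuses), interval_s=0.0, sleep=lambda s: None)
print(final)
```

Stopping on failure states as well as success keeps the loop from spinning until the timeout when a job has already failed.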

In [None]:
from typing import Any


def get_mcp_config():
    """MCP config for QV experiments.

    Note: Only qiskit-ibm-runtime and qiskit-ibm-transpiler are included.
    qiskit-mcp-server is excluded because hybrid_ai_transpile_tool from
    qiskit-ibm-transpiler accepts backend_name directly (simpler for agents).
    """
    return {
        "qiskit-ibm-runtime": {
            "transport": "stdio",
            "command": "qiskit-ibm-runtime-mcp-server",
            "args": [],
            "env": {
                "QISKIT_IBM_TOKEN": os.getenv("QISKIT_IBM_TOKEN", ""),
                "QISKIT_IBM_RUNTIME_MCP_INSTANCE": os.getenv("QISKIT_IBM_RUNTIME_MCP_INSTANCE", ""),
            },
        },
        "qiskit-ibm-transpiler": {
            "transport": "stdio",
            "command": "qiskit-ibm-transpiler-mcp-server",
            "args": [],
            "env": {"QISKIT_IBM_TOKEN": os.getenv("QISKIT_IBM_TOKEN", "")},
        },
    }


def generate_qv_qasm(num_qubits: int, depth: int | None = None, seed: int = 42) -> str:
    """Generate a true Quantum Volume circuit using Qiskit's library."""
    from qiskit.circuit.library import quantum_volume
    from qiskit.qasm3 import dumps

    qv_circuit = quantum_volume(num_qubits, depth=depth, seed=seed)
    return dumps(qv_circuit.decompose())


def generate_qv_circuit_with_ideal_distribution(
    num_qubits: int,
    depth: int | None = None,
    seed: int | None = None,
) -> dict[str, Any]:
    """Generate a QV circuit and compute its ideal heavy output bitstrings."""
    import logging

    import numpy as np
    from qiskit import QuantumCircuit
    from qiskit.circuit.library import quantum_volume
    from qiskit.qasm3 import dumps
    from qiskit.quantum_info import Statevector

    logger = logging.getLogger(__name__)

    try:
        if num_qubits < 2:
            num_qubits = 2
        elif num_qubits > 10:
            logger.warning(f"QV with {num_qubits} qubits will be slow to simulate.")

        if depth is None:
            depth = num_qubits
        elif depth < 1:
            depth = 1
        elif depth > num_qubits:
            depth = num_qubits

        if seed is None:
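            # default_rng().integers avoids the int32 bound error that
            # np.random.randint(0, 2**31) can raise on some platforms
            seed = int(np.random.default_rng().integers(0, 2**31))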

        qv_circuit = quantum_volume(num_qubits, depth=depth, seed=seed)
        qv_decomposed = qv_circuit.decompose()

        statevector = Statevector.from_label("0" * num_qubits)
        final_state = statevector.evolve(qv_decomposed)
        probabilities = final_state.probabilities()

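        # Key each probability by its bitstring. The [::-1] writes qubit 0 first
        # (leftmost); Qiskit count keys put qubit 0 last, so confirm that the
        # counts passed to calculate_heavy_output_probability() use this order.
        ideal_probs = {}
        for i, prob in enumerate(probabilities):
            bitstring = format(i, f"0{num_qubits}b")[::-1]
            ideal_probs[bitstring] = prob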

        median_prob = float(np.median(probabilities))
        heavy_outputs = [bs for bs, prob in ideal_probs.items() if prob > median_prob]

        qv_with_meas = QuantumCircuit(num_qubits, num_qubits)
        qv_with_meas.compose(qv_decomposed, inplace=True)
        qv_with_meas.measure(range(num_qubits), range(num_qubits))
        qasm3_circuit = dumps(qv_with_meas)

        result = {
            "status": "success",
            "circuit_qasm": qasm3_circuit,
            "num_qubits": num_qubits,
            "depth": depth,
            "seed": seed,
            "heavy_outputs": heavy_outputs,
            "num_heavy_outputs": len(heavy_outputs),
            "median_probability": median_prob,
            "message": f"Generated QV-{num_qubits} circuit with {len(heavy_outputs)} heavy outputs",
        }

        if num_qubits <= 6:
            result["ideal_probabilities"] = ideal_probs

        return result

    except Exception as e:
        logger.error(f"Failed to generate QV circuit: {e}")
        return {"status": "error", "message": f"Failed to generate QV circuit: {e!s}"}


def calculate_heavy_output_probability(
    counts: dict[str, int],
    heavy_outputs: list[str],
) -> dict[str, Any]:
    """Calculate the Heavy Output Probability (HOP) for QV validation."""
    import logging

    logger = logging.getLogger(__name__)

    try:
        if not counts:
            return {"status": "error", "message": "No counts provided"}

        if not heavy_outputs:
            return {"status": "error", "message": "No heavy outputs provided"}

        heavy_set = set(heavy_outputs)
        total_shots = sum(counts.values())
        heavy_counts = sum(count for bitstring, count in counts.items() if bitstring in heavy_set)

        hop = heavy_counts / total_shots if total_shots > 0 else 0.0
        threshold = 2 / 3
        above_threshold = hop > threshold

        return {
            "status": "success",
            "heavy_output_probability": hop,
            "total_shots": total_shots,
            "heavy_counts": heavy_counts,
            "num_heavy_bitstrings": len(heavy_outputs),
            "threshold": threshold,
            "above_threshold": above_threshold,
            "message": f"HOP = {hop:.4f} ({'above' if above_threshold else 'below'} threshold)",
        }

    except Exception as e:
        logger.error(f"Failed to calculate HOP: {e}")
        return {"status": "error", "message": f"Failed to calculate HOP: {e!s}"}


def analyze_qv_experiment_results(
    hop_values: list[float],
    confidence_level: float = 0.975,
) -> dict[str, Any]:
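    """Analyze results from multiple QV circuit runs.

    confidence_level is the one-sided level for the lower bound on mean HOP;
    the default 0.975 makes (ci_lower, ci_upper) a two-sided 95% interval.
    """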
    import logging

    import numpy as np
    from scipy import stats

    logger = logging.getLogger(__name__)

    try:
        if not hop_values:
            return {"status": "error", "message": "No HOP values provided"}

        hop_array = np.array(hop_values)
        n = len(hop_array)

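        # A sample standard deviation (ddof=1) needs at least two values
        if n < 2:
            return {"status": "error", "message": "Need at least 2 HOP values"}

        if n < 10:
            logger.warning(f"Only {n} HOP values. Recommend at least 100.")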

        mean_hop = float(np.mean(hop_array))
        std_hop = float(np.std(hop_array, ddof=1))
        sem = std_hop / np.sqrt(n)

        t_critical = stats.t.ppf(confidence_level, df=n - 1)
        ci_lower = mean_hop - t_critical * sem
        ci_upper = mean_hop + t_critical * sem

        threshold = 2 / 3
        qv_achieved = bool(ci_lower > threshold)
        margin = float(ci_lower - threshold)

        message = (
            f"QV {'ACHIEVED' if qv_achieved else 'NOT achieved'}! "
            f"Mean HOP = {mean_hop:.4f}, CI lower = {ci_lower:.4f}"
        )

        return {
            "status": "success",
            "qv_achieved": qv_achieved,
            "mean_hop": mean_hop,
            "std_hop": std_hop,
            "standard_error": sem,
            "confidence_interval": (ci_lower, ci_upper),
            "confidence_level": confidence_level,
            "num_circuits": n,
            "threshold": threshold,
            "margin": margin,
            "message": message,
        }

    except Exception as e:
        logger.error(f"Failed to analyze QV results: {e}")
        return {"status": "error", "message": f"Failed to analyze: {e!s}"}
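
As a cross-check on the pass criterion used by `analyze_qv_experiment_results`, the sketch below computes a lower bound on the mean HOP with the standard library only. It is a simplification: it uses a normal quantile rather than the Student t quantile used above, so the bound is slightly less conservative for small samples. The HOP values are invented, and `hop_lower_bound` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def hop_lower_bound(hop_values: list[float], confidence: float = 0.975) -> float:
    """One-sided lower bound on the mean HOP (normal approximation)."""
    if len(hop_values) < 2:
        raise ValueError("need at least two HOP values")
    z = NormalDist().inv_cdf(confidence)  # about 1.96 for 0.975
    sem = stdev(hop_values) / sqrt(len(hop_values))  # standard error of the mean
    return mean(hop_values) - z * sem


# Invented per-circuit HOP values.
hops = [0.70, 0.72, 0.69, 0.71, 0.73, 0.70, 0.72, 0.71]
lower = hop_lower_bound(hops)

print(f"lower bound = {lower:.4f}, passes = {lower > 2 / 3}")
```

Requiring the lower bound, not just the mean, to clear 2/3 guards against declaring success on the strength of a few lucky circuits.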

In [None]:
async def create_agent():
    mcp_config = get_mcp_config()
    mcp_client = MultiServerMCPClient(mcp_config)

    # Load tools using get_tools() which creates self-managing tools
    # that handle their own sessions (new session per tool call)
    all_tools, server_tools = [], {}
    for name in mcp_config:
        try:
            tools = await mcp_client.get_tools(server_name=name)
            server_tools[name] = tools
            all_tools.extend(tools)
            print(f"{name}: {len(tools)} tools")
        except Exception as e:
            print(f"{name}: FAILED - {e}")

    llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0, max_tokens=8192)

    subagents = [
        {
            "name": "backend-analyst",
            "description": "IBM Quantum backend expert",
            "system_prompt": BACKEND_ANALYST_PROMPT,
            "tools": server_tools.get("qiskit-ibm-runtime", []),
        },
        {
            "name": "qubit-chain-optimizer",
            "description": "Topology analysis expert with algorithmic qubit finding tools",
            "system_prompt": QUBIT_CHAIN_PROMPT,
            "tools": server_tools.get("qiskit-ibm-runtime", []),
        },
        {
            "name": "qv-experiment-runner",
            "description": "Expert in transpiling and running QV experiments on hardware",
            "system_prompt": QV_EXPERIMENT_RUNNER_PROMPT,
            "tools": (
                server_tools.get("qiskit-ibm-runtime", [])
                + server_tools.get("qiskit-ibm-transpiler", [])
            ),
        },
    ]

    # IMPORTANT: Coordinator only gets runtime tools (not transpilation tools)
    # This forces it to delegate transpilation to qv-experiment-runner subagent
    coordinator_tools = server_tools.get("qiskit-ibm-runtime", [])
    print(f"Coordinator tools: {len(coordinator_tools)} (runtime only)")

    return create_deep_agent(
        model=llm,
        tools=coordinator_tools,
        system_prompt=COORDINATOR_PROMPT,
        subagents=subagents,
    )


agent = await create_agent()
print("\nAgent ready!")

In [None]:
# Configuration
BACKEND = "ibm_brisbane"  # Change to your preferred backend
MAX_DEPTH = 5  # Maximum QV depth to try (starts here, works down)

# Generate QV circuits with heavy outputs for each depth
print(f"Generating QV circuits for depths {MAX_DEPTH} down to 2...")
qv_data = {}
for depth in range(MAX_DEPTH, 1, -1):
    print(f"  QV-{depth}...", end=" ")
    result = generate_qv_circuit_with_ideal_distribution(depth, seed=42 + depth)
    if result["status"] == "success":
        qv_data[depth] = result
        print(f"OK ({result['num_heavy_outputs']} heavy outputs)")
    else:
        print("FAILED")

# Build QV circuit info for the request
qv_sections = []
for depth in range(MAX_DEPTH, 1, -1):
    if depth in qv_data:
        data = qv_data[depth]
        # Pass the full heavy-output list: the agent computes HOP from it, so a
        # truncated list would undercount heavy shots (QV-5 typically has 16).
        heavy_str = ", ".join(f'"{h}"' for h in data["heavy_outputs"])
        qv_sections.append(f"""
### QV-{depth} (for QV 2^{depth} = {2**depth})
```qasm
{data["circuit_qasm"]}
```
Heavy outputs: [{heavy_str}]
""")

qv_text = "\n".join(qv_sections)

# Build the request
request = f"""
# FIND THE HIGHEST ACHIEVABLE QUANTUM VOLUME

Backend: **{BACKEND}**
Max depth to try: {MAX_DEPTH}

## Step 1: Get Backend Info
Use backend-analyst to verify {BACKEND} is available.

## Step 2: Find Optimal Qubits
Use qubit-chain-optimizer with find_optimal_qv_qubits_tool:
- Get 10 candidates for depth {MAX_DEPTH}
- The tool searches ALL qubits (not just first 10)

## Step 3: Run Iterative QV Experiments (TOP-DOWN)

Start at depth {MAX_DEPTH}, work DOWN until PASS:

For each depth:
1. Transpile circuit using hybrid_ai_transpile_tool:
   hybrid_ai_transpile_tool(circuit=<the QASM from this depth>, backend_name="{BACKEND}", optimization_level=3, ai_layout_mode="optimize")
2. Submit transpiled circuit (run_sampler_tool, 4096 shots)
3. Poll until DONE (get_job_status_tool)
4. Get counts (get_job_results_tool)
5. Calculate HOP = (heavy count) / (total shots)
6. If HOP > 2/3 (about 0.6667): PASS, stop. Else: try depth - 1

## QV Circuits and Heavy Outputs
{qv_text}

## Expected Output

```
## QV EXPERIMENT RESULTS

### Depth {MAX_DEPTH}
- Qubits: [list]
- Job ID: xxx
- Counts: {{...}}
- HOP: value
- Result: PASS/FAIL

### Depth {MAX_DEPTH - 1} (if needed)
...

## CONCLUSION
Highest Achieved QV: 2^N = value
```
"""

print(f"\nStarting QV finder for {BACKEND}, max depth {MAX_DEPTH}...")
print("=" * 70)

result = await agent.ainvoke(
    {"messages": [{"role": "user", "content": request}]},
    config={"callbacks": [callback_handler]},
)

print("\n" + "=" * 70)
print("QV FINDING COMPLETE")
print("=" * 70)
print(result.get("messages", [])[-1].content if result.get("messages") else "No response")

In [None]:
# Interactive follow-up with activity logging
# Type 'verbose' to toggle detailed logging, 'quit' to exit
print("Commands: 'quit' to exit, 'verbose' to toggle activity logging")

while True:
    query = input("You: ").strip()
    if query.lower() in ["quit", "exit", "q"]:
        break
    if not query:
        continue
    if query.lower() == "verbose":
        callback_handler.verbose = not callback_handler.verbose
        print(f"Verbose logging is now {'ON' if callback_handler.verbose else 'OFF'}\n")
        continue

    result = await agent.ainvoke(
        {"messages": [{"role": "user", "content": query}]},
        config={"callbacks": [callback_handler]},
    )
    print(
        f"\nAssistant: {result.get('messages', [])[-1].content if result.get('messages') else 'No response'}\n"
    )