# LangChain Agent with Qiskit Gym MCP Server

This notebook demonstrates how to create an AI agent using LangChain that connects to the **qiskit-gym-mcp-server** via the Model Context Protocol (MCP).

The agent can interact with qiskit-gym's reinforcement learning environments to:
- Create RL environments for quantum circuit synthesis (Permutation, LinearFunction, Clifford)
- Train models using PPO or AlphaZero algorithms
- Run training in background mode for long sessions
- Synthesize optimal circuits using trained models
- Manage models (save, load, list, delete)

## Architecture

```
┌─────────────┐     MCP Protocol     ┌──────────────────────────┐
│  LangChain  │ ◄──────────────────► │  qiskit-gym-mcp-server   │
│    Agent    │                      │                          │
└─────────────┘                      │  ┌────────────────────┐  │
                                     │  │    qiskit-gym      │  │
                                     │  │  (RL Environments) │  │
                                     │  └────────────────────┘  │
                                     └──────────────────────────┘
```

## Setup

### 1. Install Dependencies

Run these commands in your terminal:

```bash
# Install the MCP server
pip install qiskit-gym-mcp-server

# Install LangChain dependencies
pip install langchain langchain-mcp-adapters python-dotenv

# Install your preferred LLM provider (choose one):
pip install langchain-openai       # For OpenAI
pip install langchain-anthropic    # For Anthropic Claude
pip install langchain-google-genai # For Google Gemini
pip install langchain-ollama       # For local Ollama
pip install langchain-ibm          # For IBM Watsonx
```

### 2. Configure Environment Variables

Set the API key for your LLM provider.

You can do this in any of the following ways:
- Put it in a `.env` file in this directory
- Export it as an environment variable
- Enter it in the cell below

**Note:** This server doesn't require IBM Quantum credentials; it runs RL training locally with qiskit-gym.

In [None]:
import os
from dotenv import load_dotenv

# LangChain imports
from langchain.agents import create_agent
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_mcp_adapters.tools import load_mcp_tools
from langchain_core.messages import HumanMessage

# Load from .env file if it exists
load_dotenv()

# Set your LLM provider API key (uncomment the one you're using):
# os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
# os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-api-key"
# os.environ["GOOGLE_API_KEY"] = "your-google-api-key"

# Verify configuration for each key offered above
print("Configuration status:")
for key in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY"):
    print(f"  {key}: {'✓ Set' if os.getenv(key) else '✗ Not set'}")

## Choose Your LLM Provider

Run **one** of the following cells based on your preferred LLM provider:

In [None]:
# Option 1: OpenAI
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o", temperature=0)
print("Using OpenAI GPT-4o")

In [None]:
# Option 2: Anthropic Claude
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0)
print("Using Anthropic Claude Sonnet")

In [None]:
# Option 3: Google Gemini
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-2.5-pro", temperature=0)
print("Using Google Gemini Pro")

In [None]:
# Option 4: Local Ollama (no API key needed)
from langchain_ollama import ChatOllama
llm = ChatOllama(model="llama3.2", temperature=0)
print("Using local Ollama with Llama 3.2")

In [None]:
# Option 5: IBM Watsonx (reads WATSONX_URL and WATSONX_PROJECT_ID from the environment)
from langchain_ibm import ChatWatsonx
llm = ChatWatsonx(
    model_id="ibm/granite-3-8b-instruct",
    url=os.getenv("WATSONX_URL", "https://us-south.ml.cloud.ibm.com"),
    project_id=os.getenv("WATSONX_PROJECT_ID"),
    params={"temperature": 0, "max_tokens": 4096},
)
print("Using IBM Watsonx Granite")
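
If you would rather not pick a cell by hand, the choice can be derived from whichever API key is set. The sketch below is a convenience under stated assumptions, not part of the notebook's setup: `PROVIDER_KEYS` and `pick_provider` are hypothetical names, and the fallback to Ollama only reflects that the Ollama option above needs no API key. Watsonx is left out because its credential variable is not shown in this notebook.

```python
import os

# Hypothetical preference order: the first key that is set wins.
PROVIDER_KEYS = [
    ("OPENAI_API_KEY", "openai"),
    ("ANTHROPIC_API_KEY", "anthropic"),
    ("GOOGLE_API_KEY", "google"),
]


def pick_provider(env=None, default="ollama"):
    """Return a provider label based on which API key is present."""
    env = os.environ if env is None else env
    for key, provider in PROVIDER_KEYS:
        if env.get(key):  # treats a missing or empty value as "not set"
            return provider
    return default
```

The returned label could then decide which of the `llm = ...` cells to run.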

## Define the System Prompt

This prompt tells the agent what it can do and how to behave:

In [None]:
SYSTEM_PROMPT = """You are a helpful quantum computing assistant with access to qiskit-gym's
reinforcement learning-based circuit synthesis through the MCP server.

You can help users train RL models and synthesize optimal quantum circuits.

## IMPORTANT: Three Problem Types (Choose Correctly!)

There are THREE distinct environment types. Pay attention to what the user asks for:

1. **Permutation** (create_permutation_env_tool):
   - For QUBIT ROUTING / SWAP gate synthesis
   - Input: A permutation like [2, 0, 1, 3] meaning qubit 0→2, 1→0, 2→1, 3→3
   - Output: Optimal SWAP circuit to achieve that permutation
   - Keywords: "permutation", "swap", "routing", "qubit mapping"

2. **LinearFunction** (create_linear_function_env_tool):
   - For CNOT circuit synthesis of LINEAR BOOLEAN FUNCTIONS
   - Input: An invertible binary matrix representing the linear function
   - Output: Optimal CNOT-only circuit
   - Keywords: "linear function", "CNOT", "cx", "linear reversible", "parity network"

3. **Clifford** (create_clifford_env_tool):
   - For CLIFFORD CIRCUIT synthesis (H, S, CNOT gates)
   - Input: A Clifford tableau
   - Output: Optimal Clifford circuit
   - Keywords: "clifford", "stabilizer", "H+S+CNOT"

**When the user says "linear", determine if they mean:**
- "linear topology" → refers to the COUPLING MAP shape (a line: 0-1-2-3)
- "linear function" → refers to the LinearFunction ENVIRONMENT TYPE

## Environment Creation
- create_permutation_env_tool: Create PermutationGym for SWAP routing
- create_linear_function_env_tool: Create LinearFunctionGym for CNOT synthesis
- create_clifford_env_tool: Create CliffordGym for Clifford circuit synthesis
- list_environments_tool: List active environments
- delete_environment_tool: Remove an environment

## Training
- start_training_tool: Start RL training (supports background=True for async training)
- batch_train_environments_tool: Train multiple environments (supports background=True)
- get_training_status_tool: Check training progress
- wait_for_training_tool: Wait for background training to complete
- stop_training_tool: Stop a running training session
- list_training_sessions_tool: List all training sessions

## Synthesis
- synthesize_permutation_tool: Generate optimal SWAP circuit for a permutation
- synthesize_linear_function_tool: Generate optimal CNOT circuit
- synthesize_clifford_tool: Generate optimal Clifford circuit

## Model Management
- save_model_tool: Save trained model to disk
- load_model_tool: Load model from disk
- list_saved_models_tool: List models on disk
- list_loaded_models_tool: List models in memory

## Utility Tools
- generate_random_permutation_tool: Generate random permutation for testing
- generate_random_linear_function_tool: Generate random linear function
- generate_random_clifford_tool: Generate random Clifford element

## Hardware Presets
Available presets for coupling maps:
- linear_5, linear_10: Linear chain topologies
- grid_3x3, grid_5x5: Grid topologies
- ibm_heron_r1, ibm_heron_r2: IBM Heron heavy-hex topology
- ibm_nighthawk: IBM Nighthawk 10x12 grid

## RL Algorithms
- ppo: Proximal Policy Optimization (recommended, fast)
- alphazero: MCTS with neural networks (better for complex problems, slower)

## Policy Networks
- basic: Simple feedforward network (good for <8 qubits)
- conv1d: 1D convolutional network (better for larger problems)

## IMPORTANT: Background Training
**ALWAYS use background=True for training** to avoid connection timeouts:
- start_training_tool(..., background=True) - returns immediately with session_id
- batch_train_environments_tool(..., background=True) - returns immediately with session_ids
- Use get_training_status_tool(session_id) to check progress
- Use wait_for_training_tool(session_id) to block until complete

Only use synchronous training (background=False) for very short demos (< 10 iterations).

## Workflow Tips
1. **Always use background=True** for any real training
2. For batch training multiple environments, use batch_train_environments_tool with background=True
3. Poll progress with get_training_status_tool or wait with wait_for_training_tool
4. Save models you want to keep with save_model_tool
5. Use list_training_sessions_tool to see all running/completed training

When a user asks to train a model:
1. Create an appropriate environment using create_*_env_tool
2. Start training with start_training_tool(..., background=True)
3. Use get_training_status_tool to check progress periodically
4. When complete, save the model if needed with save_model_tool
"""

## Connect to the MCP Server and Create Agent

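First configure the MCP client for the qiskit-gym server. The connection itself is opened later, inside each `async with mcp_client.session(...)` block, which is also where the agent is created: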

In [None]:
# Configure MCP client for qiskit-gym server
mcp_client = MultiServerMCPClient({
    "qiskit-gym": {
        "transport": "stdio",
        "command": "qiskit-gym-mcp-server",
        "args": [],
        "env": {},
    }
})

print("MCP client configured for qiskit-gym-mcp-server")
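
With the `stdio` transport, the client launches `qiskit-gym-mcp-server` as a subprocess, so the executable has to be discoverable on `PATH`. A small pre-flight check, sketched here with the hypothetical helper name `find_server`, can surface an installation problem before the first session is opened.

```python
import shutil


def find_server(command="qiskit-gym-mcp-server"):
    """Return the absolute path of the executable, or None if it is not on PATH."""
    return shutil.which(command)


if find_server() is None:
    print("qiskit-gym-mcp-server not found on PATH; check the pip install step.")
```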

In [None]:
# Helper function to run queries with optional conversation history
async def run_agent_query(agent, query: str, history: list | None = None) -> tuple[str, list]:
    """Run a query through the agent with conversation history.
    
    Args:
        agent: The LangChain agent
        query: The user's question or request
        history: Optional list of previous messages for context
        
    Returns:
        Tuple of (response_text, updated_history)
    """
    messages = list(history) if history else []
    messages.append(HumanMessage(content=query))
    
    result = await agent.ainvoke({"messages": messages})
    result_messages = result.get("messages", [])
    
    if result_messages:
        response = result_messages[-1].content
        return response, result_messages
    
    return "No response generated.", messages
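
The returned history can be passed back in to continue a conversation. The sketch below shows that pattern with stand-ins so it runs without an LLM or an MCP server: `Msg`, `StubAgent`, and `ask` are hypothetical, and `ask` mirrors `run_agent_query` with a plain dataclass in place of `HumanMessage`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Msg:
    content: str


class StubAgent:
    """Stand-in for the real agent: echoes the last message it receives."""

    async def ainvoke(self, state):
        messages = list(state["messages"])
        messages.append(Msg(f"reply to: {messages[-1].content}"))
        return {"messages": messages}


async def ask(agent, query, history=None):
    # Same shape as run_agent_query: append the query, return (text, history).
    messages = list(history) if history else []
    messages.append(Msg(query))
    result = await agent.ainvoke({"messages": messages})
    return result["messages"][-1].content, result["messages"]


async def demo():
    agent = StubAgent()
    first, history = await ask(agent, "Create env")
    second, history = await ask(agent, "Train it", history)  # reuse history
    return first, second, len(history)
```

In a notebook, `await demo()` runs the sketch. With the real agent, the equivalent second turn is `response, history = await run_agent_query(agent, "...", history)`.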

## Example 1: Basic Training Workflow

Let's create an environment and train a simple model:

In [None]:
async with mcp_client.session("qiskit-gym") as session:
    # Load tools and create agent
    tools = await load_mcp_tools(session)
    print(f"Loaded {len(tools)} tools from MCP server")
    
    agent = create_agent(llm, tools, system_prompt=SYSTEM_PROMPT)
    
    # Run a training query (using background=True as recommended)
    query = """Create a permutation environment for a 5-qubit linear chain
    and train a model with PPO for 20 iterations using background=True.
    Then wait for training to complete and show the results."""
    
    print(f"Query: {query}\n")
    print("-" * 50)
    response, _ = await run_agent_query(agent, query)
    print(response)
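
Because the system prompt names tools explicitly, it can drift out of sync with what the server actually exposes. The following sketch compares the two; `TOOL_NAME` and `tools_missing_from_server` are hypothetical names, and the regular expression assumes that every tool name ends in `_tool`, as the names in this notebook do.

```python
import re

# Matches identifiers such as start_training_tool; wildcard forms like
# create_*_env_tool are not matched because of the "*".
TOOL_NAME = re.compile(r"\b[a-z][a-z0-9_]*_tool\b")


def tools_missing_from_server(prompt, available):
    """Return tool names mentioned in the prompt but absent from `available`."""
    mentioned = set(TOOL_NAME.findall(prompt))
    return sorted(mentioned - set(available))
```

Inside a session, `tools_missing_from_server(SYSTEM_PROMPT, [t.name for t in tools])` returning an empty list would indicate that every tool named in the prompt was loaded.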

## Example 2: Background Training

For longer training sessions, use background mode:

In [None]:
async with mcp_client.session("qiskit-gym") as session:
    tools = await load_mcp_tools(session)
    agent = create_agent(llm, tools, system_prompt=SYSTEM_PROMPT)
    
    # Start background training
    query = """Create a Clifford environment for a 4-qubit grid (2x2)
    and start training in the background with PPO for 50 iterations.
    Then check the training status."""
    
    print(f"Query: {query}\n")
    print("-" * 50)
    response, _ = await run_agent_query(agent, query)
    print(response)

## Example 3: Wait for Training and Synthesize

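Run a complete workflow in a single query: create an environment, train in the background, wait for completion, then synthesize a circuit with the trained model: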

In [None]:
async with mcp_client.session("qiskit-gym") as session:
    tools = await load_mcp_tools(session)
    agent = create_agent(llm, tools, system_prompt=SYSTEM_PROMPT)
    
    # Complete workflow: environment -> train (background) -> wait -> synthesize
    query = """Help me with this workflow:
    1. Create a permutation environment for a 4-qubit linear chain
    2. Train a model with PPO for 30 iterations (use background=True)
    3. Wait for training to complete
    4. Generate a random permutation for testing
    5. Use the trained model to synthesize an optimal circuit
    """
    
    print(f"Query: {query}\n")
    print("-" * 50)
    response, _ = await run_agent_query(agent, query)
    print(response)

## Example 4: List Resources

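Check what environments and models are available. Each `async with mcp_client.session(...)` block opens a new session, and with the `stdio` transport that starts a fresh server process, so in-memory state from earlier cells may not carry over: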

In [None]:
async with mcp_client.session("qiskit-gym") as session:
    tools = await load_mcp_tools(session)
    agent = create_agent(llm, tools, system_prompt=SYSTEM_PROMPT)
    
    query = "List all active environments, training sessions, and loaded models."
    
    print(f"Query: {query}\n")
    print("-" * 50)
    response, _ = await run_agent_query(agent, query)
    print(response)

## Example 5: Custom Query

Try your own query:

In [None]:
# Edit this query to try your own requests
my_query = "What hardware presets are available for creating environments?"

async with mcp_client.session("qiskit-gym") as session:
    tools = await load_mcp_tools(session)
    agent = create_agent(llm, tools, system_prompt=SYSTEM_PROMPT)
    
    print(f"Query: {my_query}\n")
    print("-" * 50)
    response, _ = await run_agent_query(agent, my_query)
    print(response)

## Available Tools Reference

Here's a quick reference of the main tools (the `generate_random_*` utility tools listed in the system prompt are omitted):

### Environment Management
| Tool | Description |
|------|-------------|
| `create_permutation_env_tool` | Create PermutationGym for SWAP routing |
| `create_linear_function_env_tool` | Create LinearFunctionGym for CNOT synthesis |
| `create_clifford_env_tool` | Create CliffordGym with custom gate sets |
| `list_environments_tool` | List active environments |
| `delete_environment_tool` | Remove an environment |
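
For intuition about the problem the permutation environment addresses, here is a naive baseline, which is not the RL method: on a linear chain, bubble sort yields a valid sequence of adjacent SWAPs, with no claim that it matches what a trained model produces. It assumes the convention from the system prompt, where `perm[i]` is the destination of qubit `i`; `swaps_on_line` and `apply_swaps` are illustrative names.

```python
def swaps_on_line(perm):
    """Adjacent SWAPs (as wire pairs) that route qubit i to wire perm[i]."""
    targets = list(perm)  # targets[p] = destination of the qubit now on wire p
    swaps = []
    for end in range(len(targets) - 1, 0, -1):
        for i in range(end):
            if targets[i] > targets[i + 1]:
                targets[i], targets[i + 1] = targets[i + 1], targets[i]
                swaps.append((i, i + 1))
    return swaps


def apply_swaps(n, swaps):
    """Return wires, where wires[p] is the qubit sitting on wire p afterwards."""
    wires = list(range(n))
    for a, b in swaps:
        wires[a], wires[b] = wires[b], wires[a]
    return wires


perm = [2, 0, 1, 3]
swaps = swaps_on_line(perm)
wires = apply_swaps(len(perm), swaps)
print(swaps, wires)
```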

### Training
| Tool | Description |
|------|-------------|
| `start_training_tool` | Start RL training (supports `background=True`) |
| `batch_train_environments_tool` | Train multiple environments (supports `background=True`) |
| `wait_for_training_tool` | Wait for background training to complete |
| `get_training_status_tool` | Get training progress and metrics |
| `stop_training_tool` | Stop a training session |
| `list_training_sessions_tool` | List all training sessions |

### Synthesis
| Tool | Description |
|------|-------------|
| `synthesize_permutation_tool` | Generate optimal SWAP circuit |
| `synthesize_linear_function_tool` | Generate optimal CNOT circuit |
| `synthesize_clifford_tool` | Generate optimal Clifford circuit |
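
The system prompt describes the linear-function input as an invertible binary matrix. Before handing a matrix to the agent, invertibility over GF(2) can be checked locally. The sketch below uses plain Gaussian elimination with XOR row operations; `is_invertible_gf2` is a hypothetical helper, and the server may perform its own validation.

```python
def is_invertible_gf2(matrix):
    """Return True if a square 0/1 matrix is invertible over GF(2)."""
    rows = [[int(x) & 1 for x in row] for row in matrix]
    n = len(rows)
    if any(len(row) != n for row in rows):
        return False  # not square
    for col in range(n):
        # Find a row at or below the diagonal with a 1 in this column.
        pivot = next((r for r in range(col, n) if rows[r][col]), None)
        if pivot is None:
            return False  # no pivot: the matrix is singular
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Clear the column in every other row by XOR-ing in the pivot row.
        for r in range(n):
            if r != col and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    return True
```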

### Model Management
| Tool | Description |
|------|-------------|
| `save_model_tool` | Save trained model to disk |
| `load_model_tool` | Load model from disk |
| `list_saved_models_tool` | List models on disk |
| `list_loaded_models_tool` | List models in memory |

### Important: Use Background Training!
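**Always use `background=True`** for training to avoid connection timeouts. The snippet below shows the agent's tool calls in Python-like notation; these are not importable functions: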
```python
# Good - returns immediately
start_training_tool(env_id, num_iterations=100, background=True)
batch_train_environments_tool(env_ids, num_iterations=100, background=True)

# Then check progress or wait
get_training_status_tool(session_id)
wait_for_training_tool(session_id, timeout=600)
```
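
If you drive the status tool from your own code rather than through the agent, polling with a timeout is one option. This is a minimal sketch: `poll_until_done` and `check_status` are hypothetical, and the terminal status strings are assumptions, since the server's actual status values are not listed in this notebook.

```python
import asyncio
import time

# Assumed terminal states; adjust to whatever the status tool really returns.
TERMINAL_STATES = ("completed", "failed", "stopped")


async def poll_until_done(check_status, interval=5.0, timeout=600.0):
    """Await check_status() repeatedly until it reports a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        status = await check_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"training still '{status}' after {timeout}s")
        await asyncio.sleep(interval)
```

Here `check_status` would be an async callable that wraps a call to `get_training_status_tool` and extracts the status field from its result.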

## Hardware Presets

| Preset | Qubits | Topology | Description |
|--------|--------|----------|-------------|
| `linear_5` | 5 | Line | 5-qubit linear chain |
| `linear_10` | 10 | Line | 10-qubit linear chain |
| `grid_3x3` | 9 | Grid | 3x3 square grid |
| `grid_5x5` | 25 | Grid | 5x5 square grid |
| `ibm_heron_r1` | 133 | Heavy-hex | IBM Heron r1 processor |
| `ibm_heron_r2` | 156 | Heavy-hex | IBM Heron r2 processor |
| `ibm_nighthawk` | 120 | 10x12 grid | IBM Nighthawk |
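
To make the line and grid topologies concrete, the sketch below builds coupling-map edge lists for them. The function names and the row-major qubit numbering are assumptions for illustration; the server's presets may number qubits differently.

```python
def linear_edges(n):
    """Edges of an n-qubit chain: 0-1-2-...-(n-1)."""
    return [(i, i + 1) for i in range(n - 1)]


def grid_edges(rows, cols):
    """Edges of a rows x cols square grid with row-major qubit indices."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            q = r * cols + c
            if c != cols - 1:
                edges.append((q, q + 1))     # neighbor to the right
            if r != rows - 1:
                edges.append((q, q + cols))  # neighbor below
    return edges


print(linear_edges(5))
print(len(grid_edges(3, 3)), len(grid_edges(10, 12)))
```

Under this construction a 3x3 grid has 12 edges and a 10x12 grid has 218.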