# Monitoring AI Connectors and Agent Builder with OpenRouter

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/supporting-blog-content/openrouter-agent-builder-monitoring/openrouter-agent-builder-monitoring.ipynb)

This notebook demonstrates how to:
- Create an AI Connector for Agent Builder using OpenRouter
- Set up an inference endpoint for data enrichment
- Build an ingest pipeline that extracts structured fields from product descriptions
- Create an Agent Builder agent that queries enriched data
- Monitor LLM usage with OpenRouter Broadcast and Elastic APM

## Architecture Overview

We'll build an AI-enriched audio products catalog where:
1. An **ingest pipeline** uses OpenRouter to extract structured fields (category, features, use_case) from product descriptions
2. An **Agent Builder** agent answers questions about the products using semantic search and ES|QL
3. **OpenRouter Broadcast** sends traces to Elastic APM for monitoring costs, latency, and token usage

## Install Dependencies

First, let's install the required Python packages.

In [None]:
!pip install -qU elasticsearch requests

## Setup Credentials

You'll need the following credentials:

- **ELASTIC_URL**: Your Elasticsearch endpoint URL
- **KIBANA_URL**: Your Kibana endpoint URL
- **ELASTIC_API_KEY**: An API key with permissions to create indices, pipelines, inference endpoints, and connectors
- **OPENROUTER_AGENT_KEY**: Your OpenRouter API key for Agent Builder interactions
- **OPENROUTER_INGESTION_KEY**: (Optional) A separate OpenRouter API key for ingestion. If not provided, uses the agent key.

To get started with Elastic Cloud, [sign up for a free trial](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook).

In [None]:
from getpass import getpass

ELASTIC_URL = getpass("Elastic URL: ")
KIBANA_URL = getpass("Kibana URL: ")
ELASTIC_API_KEY = getpass("Elastic API Key: ")
OPENROUTER_AGENT_KEY = getpass("OpenRouter Agent API Key: ")

# Using separate keys allows differentiating costs between ingestion and agent chat in monitoring dashboards
OPENROUTER_INGESTION_KEY = (
    getpass("OpenRouter Ingestion API Key (press Enter to use Agent key): ")
    or OPENROUTER_AGENT_KEY
)
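
For non-interactive runs (for example a scheduled job), the same fallback can be sketched over environment variables instead of `getpass`. The helper name `resolve_openrouter_keys` and the use of the credential names as environment variable names are assumptions made for illustration, not part of the original notebook.

```python
import os
from typing import Mapping, Optional


def resolve_openrouter_keys(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the agent and ingestion keys, falling back to the agent key."""
    env = os.environ if env is None else env
    agent_key = env.get("OPENROUTER_AGENT_KEY", "")
    if not agent_key:
        raise ValueError("OPENROUTER_AGENT_KEY is required")
    # A missing or empty ingestion key falls back to the agent key,
    # mirroring the `or` fallback in the cell above.
    ingestion_key = env.get("OPENROUTER_INGESTION_KEY") or agent_key
    return {"agent": agent_key, "ingestion": ingestion_key}


# Demo with a placeholder value instead of a real key.
keys = resolve_openrouter_keys({"OPENROUTER_AGENT_KEY": "sk-agent-demo"})
print(keys["ingestion"] == keys["agent"])
```

Passing a mapping explicitly keeps the helper easy to test; calling it with no argument reads from `os.environ`.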

## Initialize Elasticsearch Client

Create the Elasticsearch client with a 60-second request timeout, since inference operations can take longer than ordinary requests.

In [None]:
from elasticsearch import Elasticsearch
import requests

es = Elasticsearch(hosts=[ELASTIC_URL], api_key=ELASTIC_API_KEY, request_timeout=60)

# Verify connection
info = es.info()
print(f"Connected to Elasticsearch {info['version']['number']}")

## Create AI Connector for Agent Builder

The AI Connector allows Agent Builder to communicate with LLMs through OpenRouter. We use a reasoning-capable model like GPT-5.2 for the agent since it needs to handle complex queries and tool orchestration.

In [None]:
connector_payload = {
    "name": "OpenRouter Agent Connector",
    "connector_type_id": ".gen-ai",
    "config": {
        "apiProvider": "Other",
        "apiUrl": "https://openrouter.ai/api/v1/chat/completions",
        "defaultModel": "openai/gpt-5.2",
        "enableNativeFunctionCalling": True,
    },
    "secrets": {"apiKey": OPENROUTER_AGENT_KEY},
}

response = requests.post(
    f"{KIBANA_URL}/api/actions/connector",
    headers={
        "kbn-xsrf": "true",
        "Authorization": f"ApiKey {ELASTIC_API_KEY}",
        "Content-Type": "application/json",
    },
    json=connector_payload,
)

connector = response.json()
if "id" in connector:
    print(f"Connector created: {connector['id']}")
else:
    print(f"Response: {connector}")
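
The same three headers are sent on every Kibana API call in this notebook, and the cell above only checks for an `id` in the body. A small sketch of two helpers that centralize the headers and report the HTTP status follows. Both function names are hypothetical, and the `message` field is an assumption about the shape of the error body; if it is absent, the whole body is shown instead.

```python
def kibana_headers(api_key: str) -> dict:
    """Headers used for the Kibana API requests in this notebook."""
    return {
        "kbn-xsrf": "true",
        "Authorization": f"ApiKey {api_key}",
        "Content-Type": "application/json",
    }


def summarize_kibana_response(status_code: int, body: dict) -> str:
    """Turn a status code and parsed JSON body into a one-line summary."""
    if 200 <= status_code < 300 and "id" in body:
        return f"created: {body['id']}"
    # Fall back to the full body when no `message` field is present.
    return f"failed (HTTP {status_code}): {body.get('message', body)}"


print(summarize_kibana_response(200, {"id": "demo-connector-id"}))
print(summarize_kibana_response(401, {"message": "Unauthorized"}))
```

With a real response object, this would be called as `summarize_kibana_response(response.status_code, response.json())`.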

## Create Inference Endpoint

The inference endpoint allows Elasticsearch to call LLMs during data processing. We use a faster, cheaper model like GPT-4.1-mini for bulk ingestion tasks that don't require advanced reasoning capabilities.

In [None]:
inference_config = {
    "service": "openai",
    "service_settings": {
        "model_id": "openai/gpt-4.1-mini",
        "api_key": OPENROUTER_INGESTION_KEY,
        "url": "https://openrouter.ai/api/v1/chat/completions",
    },
}

try:
    response = es.inference.put(
        inference_id="openrouter-inference-endpoint",
        task_type="completion",
        body=inference_config,
    )
    print(f"Inference endpoint created: {response['inference_id']}")
except Exception as e:
    if "resource_already_exists" in str(e).lower():
        print("Inference endpoint already exists")
    else:
        raise

## Create Ingest Pipeline

The ingest pipeline extracts structured fields from product descriptions using the LLM. The key is to list the allowed values for each field in the prompt so the LLM labels products consistently. Otherwise, we might get variations like "Noise Cancellation", "ANC", and "noise-cancelling" that are harder to aggregate.

The pipeline has four processors:
1. **Script**: Builds the extraction prompt with the product description
2. **Inference**: Calls OpenRouter to extract structured data
3. **JSON**: Parses the response and adds fields to the document
4. **Remove**: Cleans up temporary fields

In [None]:
# Define the extraction prompt
EXTRACTION_PROMPT = (
    "Extract audio product information from this description. "
    "Return raw JSON only, no markdown, no explanation. Fields: "
    "category (string, one of: Headphones/Earbuds/Speakers/Microphones/Accessories), "
    "features (array of strings from: wireless/noise_cancellation/long_battery/waterproof/voice_assistant/fast_charging/portable/surround_sound), "
    "use_case (string, one of: Travel/Office/Home/Fitness/Gaming/Studio). "
    "Description: "
)

# Create the enrichment pipeline
pipeline_config = {
    "processors": [
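        {
            "script": {
                # Pass the prompt as a parameter so quotes in it cannot break the Painless source
                "source": "ctx.prompt = params.prompt + ctx.description",
                "params": {"prompt": EXTRACTION_PROMPT},
            }
        },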
        {
            "inference": {
                "model_id": "openrouter-inference-endpoint",
                "input_output": {
                    "input_field": "prompt",
                    "output_field": "ai_response",
                },
            }
        },
        {"json": {"field": "ai_response", "add_to_root": True}},
        {"remove": {"field": ["prompt", "ai_response"]}},
    ]
}

es.ingest.put_pipeline(id="product-enrichment-pipeline", body=pipeline_config)

print("Pipeline created: product-enrichment-pipeline")
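
Before indexing real documents, the pipeline can be dry-run with the ingest simulate API. The sketch below only builds the `docs` payload so that it runs without a cluster; the helper name `build_simulate_docs` is hypothetical, and the commented call assumes the `es` client from this notebook. Simulation is expected to invoke the inference endpoint, so it may still incur LLM cost.

```python
from typing import Iterable, List


def build_simulate_docs(descriptions: Iterable[str]) -> List[dict]:
    """Wrap raw descriptions in the document shape the simulate API expects."""
    return [{"_source": {"description": text}} for text in descriptions]


docs = build_simulate_docs(
    ["Compact waterproof speaker with 20-hour battery life."]
)
print(docs)

# With a live cluster, the dry run would look like this:
# result = es.ingest.simulate(id="product-enrichment-pipeline", docs=docs)
# for item in result["docs"]:
#     print(item.get("doc", item))  # items that failed carry an "error" key instead
```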

## Create Index with Mapping

Create the index with the appropriate mappings. The enriched fields (category, features, use_case) are mapped as keywords for efficient filtering and aggregations.

In [None]:
index_mapping = {
    "mappings": {
        "properties": {
            "name": {"type": "text"},
            "description": {"type": "text"},
            "price": {"type": "float"},
            "category": {"type": "keyword"},
            "features": {"type": "keyword"},
            "use_case": {"type": "keyword"},
        }
    }
}

es.indices.create(index="products-enriched", body=index_mapping)

print("Index created: products-enriched")

## Index Sample Products

Index sample audio products using the enrichment pipeline. Each document will be processed by the LLM to extract structured fields.

> **Note:** For production use with larger datasets, use the [Bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk) for better performance. We use individual indexing here to clearly show each document being processed.

In [None]:
products = [
    {
        "name": "Wireless Noise-Canceling Headphones",
        "description": "Premium wireless Bluetooth headphones with active noise cancellation, 30-hour battery life, and premium leather ear cushions. Perfect for travel and office use.",
        "price": 299.99,
    },
    {
        "name": "Portable Bluetooth Speaker",
        "description": "Compact waterproof speaker with 360-degree surround sound. 20-hour battery life, perfect for outdoor adventures and pool parties.",
        "price": 149.99,
    },
    {
        "name": "Studio Condenser Microphone",
        "description": "Professional USB microphone with noise cancellation and voice assistant compatibility. Ideal for podcasting, streaming, and home studio recording.",
        "price": 199.99,
    },
]

for i, product in enumerate(products):
    es.index(
        index="products-enriched",
        id=i,
        body=product,
        pipeline="product-enrichment-pipeline",
    )
    print(f"Indexed: {product['name']}")

# Refresh to make documents searchable
es.indices.refresh(index="products-enriched")
print("\nAll products indexed and refreshed")
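
For larger catalogs, the Bulk API mentioned in the note above avoids one round trip per document. The sketch below builds the action dictionaries consumed by `elasticsearch.helpers.bulk`. The helper name `build_bulk_actions` is hypothetical, and the commented call assumes that `helpers.bulk` forwards the `pipeline` keyword argument to the Bulk API.

```python
from typing import Iterable, Iterator


def build_bulk_actions(products: Iterable[dict], index: str) -> Iterator[dict]:
    """Yield one bulk action per product, using the list position as the ID."""
    for i, product in enumerate(products):
        yield {"_index": index, "_id": str(i), "_source": product}


sample = [{"name": "Demo Speaker", "description": "Portable speaker.", "price": 10.0}]
actions = list(build_bulk_actions(sample, "products-enriched"))
print(actions)

# With a live cluster and the `es` client from this notebook:
# from elasticsearch import helpers
# helpers.bulk(es, actions, pipeline="product-enrichment-pipeline")
```

Each document still triggers an LLM call inside the pipeline, so bulk indexing reduces request overhead but not inference latency or cost.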

## Verify Enrichment

Let's verify that the pipeline correctly extracted structured fields from the product descriptions.

In [None]:
results = es.search(
    index="products-enriched", body={"query": {"match_all": {}}, "size": 10}
)

print("Enriched Products:\n")
for hit in results["hits"]["hits"]:
    source = hit["_source"]
    print(f"Name: {source['name']}")
    print(f"  Category: {source.get('category', 'N/A')}")
    print(f"  Features: {source.get('features', 'N/A')}")
    print(f"  Use Case: {source.get('use_case', 'N/A')}")
    print(f"  Price: ${source['price']}")
    print()
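
The loop above only prints the extracted fields. A stricter check compares them with the allowed values listed in the extraction prompt, which catches cases where the model drifts from the requested labels. This is a minimal sketch; the function name `find_enrichment_problems` is hypothetical.

```python
ALLOWED_CATEGORIES = {"Headphones", "Earbuds", "Speakers", "Microphones", "Accessories"}
ALLOWED_FEATURES = {
    "wireless", "noise_cancellation", "long_battery", "waterproof",
    "voice_assistant", "fast_charging", "portable", "surround_sound",
}
ALLOWED_USE_CASES = {"Travel", "Office", "Home", "Fitness", "Gaming", "Studio"}


def _is_allowed(value, allowed: set) -> bool:
    # Non-string values (None, lists, numbers) are never valid labels.
    return isinstance(value, str) and value in allowed


def find_enrichment_problems(source: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if not _is_allowed(source.get("category"), ALLOWED_CATEGORIES):
        problems.append(f"category: {source.get('category')!r}")
    if not _is_allowed(source.get("use_case"), ALLOWED_USE_CASES):
        problems.append(f"use_case: {source.get('use_case')!r}")
    features = source.get("features")
    if not isinstance(features, list):
        problems.append(f"features: {features!r}")
    else:
        for value in features:
            if not _is_allowed(value, ALLOWED_FEATURES):
                problems.append(f"feature: {value!r}")
    return problems


print(find_enrichment_problems(
    {"category": "Headphones", "features": ["wireless", "ANC"], "use_case": "Travel"}
))
```

In the notebook, this could be applied to each `hit["_source"]` from the search results.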

## Create Agent Builder Agent

Create an Agent Builder agent that can answer questions about the product catalog using semantic search and ES|QL for analytics.

In [None]:
agent_payload = {
    "id": "audio-product-assistant",
    "name": "Audio Product Assistant",
    "description": "Answers questions about audio product catalog using semantic search and analytics",
    "labels": ["audio"],
    "avatar_color": "#BFDBFF",
    "avatar_symbol": "AU",
    "configuration": {
        "tools": [
            {
                "tool_ids": [
                    "platform.core.search",
                    "platform.core.list_indices",
                    "platform.core.get_index_mapping",
                    "platform.core.execute_esql",
                ]
            }
        ],
        "instructions": """You are an audio product assistant that helps users find and analyze audio equipment.

Use the products-enriched index for all queries. The extracted fields are:
- category: Headphones, Earbuds, Speakers, Microphones, or Accessories
- features: array of product features like wireless, noise_cancellation, long_battery
- use_case: Travel, Office, Home, Fitness, Gaming, or Studio

For analytical questions, use ES|QL to aggregate data.
For product searches, use semantic search on the description field.""",
    },
}

response = requests.post(
    f"{KIBANA_URL}/api/agent_builder/agents",
    headers={
        "kbn-xsrf": "true",
        "Authorization": f"ApiKey {ELASTIC_API_KEY}",
        "Content-Type": "application/json",
    },
    json=agent_payload,
)

agent = response.json()
if "id" in agent:
    print(f"Agent created: {agent['id']}")
else:
    print(f"Response: {agent}")
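
Once the agent exists, it needs to receive some questions before any agent traffic shows up in monitoring. The sketch below builds a chat request for it. It assumes a `POST /api/agent_builder/converse` endpoint that accepts `agent_id` and `input` in the body; that path and those field names are assumptions to verify against the Agent Builder API documentation for your Kibana version, and `build_converse_request` is a hypothetical helper.

```python
def build_converse_request(
    kibana_url: str, api_key: str, agent_id: str, question: str
) -> dict:
    """Build keyword arguments for `requests.post` (endpoint shape is assumed)."""
    return {
        "url": f"{kibana_url.rstrip('/')}/api/agent_builder/converse",
        "headers": {
            "kbn-xsrf": "true",
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
        # Assumed body fields: the target agent and the user's message.
        "json": {"agent_id": agent_id, "input": question},
    }


request_kwargs = build_converse_request(
    "https://kibana.example.com",  # placeholder URL
    "demo-api-key",  # placeholder key
    "audio-product-assistant",
    "Which products are best for travel?",
)
print(request_kwargs["url"])

# With real credentials:
# response = requests.post(**request_kwargs)
# print(response.json())
```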

## Test with ES|QL Query

Let's run a simple ES|QL query to verify the data is accessible and the enriched fields work correctly.

In [None]:
esql_query = """
FROM products-enriched
| STATS avg_price = AVG(price), count = COUNT(*) BY category
| SORT avg_price DESC
"""

result = es.esql.query(query=esql_query)
print("Average Price by Category:\n")
print(result)
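
The ES|QL response is columnar: a `columns` list describing each column and a `values` list of rows. A small sketch that zips them into one dictionary per row can make the output easier to read. `esql_rows` is a hypothetical helper, and the sample response uses made-up values rather than actual notebook output.

```python
def esql_rows(response) -> list:
    """Convert an ES|QL response into a list of {column_name: value} dicts."""
    names = [column["name"] for column in response["columns"]]
    return [dict(zip(names, row)) for row in response["values"]]


# Illustrative response shape with invented values.
sample_response = {
    "columns": [
        {"name": "avg_price", "type": "double"},
        {"name": "count", "type": "long"},
        {"name": "category", "type": "keyword"},
    ],
    "values": [[299.99, 1, "Headphones"], [149.99, 1, "Speakers"]],
}

for row in esql_rows(sample_response):
    print(row)
```

In the notebook, the same helper could be called as `esql_rows(result)`.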

## Configure OpenRouter Broadcast

To monitor LLM usage, costs, and performance, configure OpenRouter Broadcast to send traces to Elastic APM.

### Step 1: Get OpenTelemetry Endpoint

Navigate to the APM tutorial in Kibana:
```
https://<your_kibana_url>/app/observabilityOnboarding/otel-apm/?category=application
```

Collect the URL and authentication token from the **OpenTelemetry** tab.

### Step 2: Configure Broadcast in OpenRouter

1. Go to [OpenRouter Broadcast settings](https://openrouter.ai/settings/broadcast)
2. Add a new destination for "OpenTelemetry Collector"
3. Configure the endpoint with the `/v1/traces` path:

```
Endpoint: https://xxxxx.ingest.us-east-2.aws.elastic-cloud.com:443/v1/traces
Headers: {"Authorization": "Bearer YOUR_APM_SECRET_TOKEN"}
```

**Important:** The ingest endpoint you configure here must be reachable from the public internet, because OpenRouter sends the traces to it directly.
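
A common mistake is pasting the base URL without the `/v1/traces` path, or adding the path twice. The sketch below normalizes the endpoint and builds the header in the format shown above. `build_broadcast_destination` is a hypothetical helper, and the URL and token in the demo are placeholders.

```python
def build_broadcast_destination(base_url: str, secret_token: str) -> dict:
    """Return the endpoint and headers to paste into the Broadcast settings."""
    endpoint = base_url.rstrip("/")
    # Append the traces path only if it is not already present.
    if not endpoint.endswith("/v1/traces"):
        endpoint += "/v1/traces"
    return {
        "endpoint": endpoint,
        "headers": {"Authorization": f"Bearer {secret_token}"},
    }


destination = build_broadcast_destination(
    "https://example.ingest.elastic-cloud.com:443/",  # placeholder URL
    "YOUR_APM_SECRET_TOKEN",  # placeholder token
)
print(destination["endpoint"])
```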

### Step 3: Test Connection

Press **Test connection** in OpenRouter and verify you see a success message.

### Monitoring Data

After configuration, you'll see documents in Kibana under:
- Data stream: `traces-generic.otel-default`
- Service name: `openrouter`

The traces include:
- Token usage (prompt, completion, total)
- Cost (in USD)
- Latency (time to first token, total)
- Model information
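
To compare ingestion and agent costs, the trace data can be rolled up per model. The sketch below does this over plain dictionaries. The keys `model`, `prompt_tokens`, `completion_tokens`, and `cost_usd` are hypothetical placeholders, not the actual attribute names in the trace documents, so map them from whatever fields you see in Kibana. The sample values are invented.

```python
from collections import defaultdict


def summarize_usage(spans: list) -> dict:
    """Aggregate request count, total tokens, and cost per model."""
    totals = defaultdict(lambda: {"requests": 0, "total_tokens": 0, "cost_usd": 0.0})
    for span in spans:
        entry = totals[span["model"]]
        entry["requests"] += 1
        entry["total_tokens"] += span["prompt_tokens"] + span["completion_tokens"]
        entry["cost_usd"] += span["cost_usd"]
    return dict(totals)


# Invented sample spans using the hypothetical keys described above.
sample_spans = [
    {"model": "openai/gpt-4.1-mini", "prompt_tokens": 100, "completion_tokens": 20, "cost_usd": 0.25},
    {"model": "openai/gpt-4.1-mini", "prompt_tokens": 80, "completion_tokens": 30, "cost_usd": 0.5},
    {"model": "openai/gpt-5.2", "prompt_tokens": 500, "completion_tokens": 200, "cost_usd": 2.0},
]

for model, stats in summarize_usage(sample_spans).items():
    print(model, stats)
```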

## Cleanup (Optional)

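Uncomment and run the following code to clean up the Elasticsearch resources created in this notebook. The connector and agent were created through the Kibana API and are not removed by this cell; delete them from Kibana if you no longer need them.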

In [None]:
# Uncomment to clean up resources

# Delete index
# es.indices.delete(index="products-enriched")
# print("Deleted index: products-enriched")

# Delete pipeline
# es.ingest.delete_pipeline(id="product-enrichment-pipeline")
# print("Deleted pipeline: product-enrichment-pipeline")

# Delete inference endpoint
# es.inference.delete(inference_id="openrouter-inference-endpoint")
# print("Deleted inference endpoint: openrouter-inference-endpoint")