Skip to content

lipish/xtrace

Repository files navigation

xtrace

xtrace is a server-side component for AI/LLM observability that collects, stores, and queries traces/observations/metrics to help you diagnose latency, cost, quality, and failure patterns in production.

This crate is currently published as a binary (executable service) (only src/main.rs), so it does not provide a Rust SDK API that can be imported as a dependency. You can deploy and call it as an HTTP service.

Running

Dependencies: PostgreSQL.

Environment variables: DATABASE_URL (required) API_BEARER_TOKEN (required, protects the API) BIND_ADDR (optional, default 127.0.0.1:8742) DEFAULT_PROJECT_ID (optional, default default) XTRACE_PUBLIC_KEY (optional, for Langfuse public API BasicAuth compatibility) XTRACE_SECRET_KEY (optional, for Langfuse public API BasicAuth compatibility) RATE_LIMIT_QPS (optional, default 20, per-token query rate limit QPS) RATE_LIMIT_BURST (optional, default 40, per-token query rate limit burst cap)

Compatibility: Also supports legacy names LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY.

Start:

DATABASE_URL=postgresql://xinference@localhost:5432/xtrace \
API_BEARER_TOKEN=... \
cargo run --release

Health check:

curl http://127.0.0.1:8742/healthz

HTTP API (Core Routes)

Except /healthz, all other endpoints require a Bearer token: Authorization: Bearer $API_BEARER_TOKEN

POST /v1/l/batch Batch ingest events.

Request body structure (simplified): trace (optional) + observations (array)

{
  "trace": {
    "id": "00000000-0000-0000-0000-000000000000",
    "timestamp": "2026-01-01T00:00:00Z",
    "name": "chat",
    "userId": "alice",
    "tags": ["prod"]
  },
  "observations": [
    {
      "id": "00000000-0000-0000-0000-000000000001",
      "traceId": "00000000-0000-0000-0000-000000000000",
      "type": "GENERATION",
      "name": "llm",
      "startTime": "2026-01-01T00:00:00Z",
      "endTime": "2026-01-01T00:00:01Z",
      "model": "gpt-4o-mini",
      "input": {"role": "user", "content": "hi"},
      "output": {"role": "assistant", "content": "hello"}
    }
  ]
}

GET /api/public/traces Paginated trace query.

GET /api/public/traces/:traceId Fetch a single trace's details.

GET /api/public/metrics/daily Query daily aggregated metrics.

Nebula Integration (Metrics)

To support Nebula reporting GPU/node metrics (time-series metrics), xtrace provides additional metrics write and query endpoints.

Note: The current implementation works in single-tenant/single-project mode; all written metrics go to DEFAULT_PROJECT_ID and environment is fixed to default.

Write

POST /v1/metrics/batch

Request body:

{
  "metrics": [
    {
      "name": "gpu_utilization",
      "labels": {"node_id": "node-1", "gpu_index": "0"},
      "value": 85.0,
      "timestamp": "2026-02-12T12:18:00Z"
    }
  ]
}

Example:

NOW=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
curl -H "Authorization: Bearer $API_BEARER_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"metrics\":[{\"name\":\"gpu_utilization\",\"labels\":{\"node_id\":\"node-1\",\"gpu_index\":\"0\"},\"value\":85.0,\"timestamp\":\"$NOW\"}]}" \
  "http://127.0.0.1:8742/v1/metrics/batch"

Query

GET /api/public/metrics/names

Returns all metric names under the current project:

curl -H "Authorization: Bearer $API_BEARER_TOKEN" \
  "http://127.0.0.1:8742/api/public/metrics/names"

GET /api/public/metrics/query

Parameters:

  • name (required)
  • from/to (optional, ISO8601; default last 1 hour)
  • labels (optional, JSON string; backend filters with labels @> ...)
  • step (optional: 1m/5m/1h/1d; default 1m)
  • agg (optional: avg/max/min/sum/last; default avg)

Example:

curl -H "Authorization: Bearer $API_BEARER_TOKEN" \
  "http://127.0.0.1:8742/api/public/metrics/query?name=gpu_utilization&step=1m&agg=last&labels=%7B%22node_id%22%3A%22node-1%22%2C%22gpu_index%22%3A%220%22%7D"

Example:

curl -H "Authorization: Bearer $API_BEARER_TOKEN" \
  "http://127.0.0.1:8742/api/public/traces?page=1&limit=50"

Rust SDK (xtrace-client)

The repository includes an xtrace-client crate (HTTP SDK, based on reqwest).

[dependencies]
xtrace-client = "0.0.1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }

Example:

use xtrace_client::{Client, TraceListQuery};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::new("http://127.0.0.1:8742/", "YOUR_TOKEN")?;
    client.healthz().await?;

    let traces = client.list_traces(&TraceListQuery::default()).await?;
    println!("{}", traces.data.len());

    let now = chrono::Utc::now();
    client
        .push_metrics(&[xtrace_client::MetricPoint {
            name: "gpu_utilization".to_string(),
            labels: std::collections::HashMap::from([
                ("node_id".to_string(), "node-1".to_string()),
                ("gpu_index".to_string(), "0".to_string()),
            ]),
            value: 85.0,
            timestamp: now,
        }])
        .await?;
    Ok(())
}

The xtrace crate is positioned as a server-side component; splitting the SDK separately avoids pulling server-side dependencies (axum/sqlx) into client applications.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 3

  •  
  •  
  •