xtrace is a server-side component for AI/LLM observability that collects, stores, and queries traces/observations/metrics to help you diagnose latency, cost, quality, and failure patterns in production.
This crate is currently published as a binary (executable service) (only src/main.rs), so it does not provide a Rust SDK API that can be imported as a dependency. You can deploy and call it as an HTTP service.
Dependencies: PostgreSQL.
Environment variables:
DATABASE_URL (required)
API_BEARER_TOKEN (required, protects the API)
BIND_ADDR (optional, default 127.0.0.1:8742)
DEFAULT_PROJECT_ID (optional, default default)
XTRACE_PUBLIC_KEY (optional, for Langfuse public API BasicAuth compatibility)
XTRACE_SECRET_KEY (optional, for Langfuse public API BasicAuth compatibility)
RATE_LIMIT_QPS (optional, default 20, per-token query rate limit QPS)
RATE_LIMIT_BURST (optional, default 40, per-token query rate limit burst cap)
Compatibility:
Also supports legacy names LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY.
Start:
DATABASE_URL=postgresql://xinference@localhost:5432/xtrace \
API_BEARER_TOKEN=... \
cargo run --releaseHealth check:
curl http://127.0.0.1:8742/healthzExcept /healthz, all other endpoints require a Bearer token:
Authorization: Bearer $API_BEARER_TOKEN
POST /v1/l/batch
Batch ingest events.
Request body structure (simplified):
trace (optional) + observations (array)
{
"trace": {
"id": "00000000-0000-0000-0000-000000000000",
"timestamp": "2026-01-01T00:00:00Z",
"name": "chat",
"userId": "alice",
"tags": ["prod"]
},
"observations": [
{
"id": "00000000-0000-0000-0000-000000000001",
"traceId": "00000000-0000-0000-0000-000000000000",
"type": "GENERATION",
"name": "llm",
"startTime": "2026-01-01T00:00:00Z",
"endTime": "2026-01-01T00:00:01Z",
"model": "gpt-4o-mini",
"input": {"role": "user", "content": "hi"},
"output": {"role": "assistant", "content": "hello"}
}
]
}GET /api/public/traces
Paginated trace query.
GET /api/public/traces/:traceId
Fetch a single trace's details.
GET /api/public/metrics/daily
Query daily aggregated metrics.
To support Nebula reporting GPU/node metrics (time-series metrics), xtrace provides additional metrics write and query endpoints.
Note: The current implementation works in single-tenant/single-project mode; all written metrics go to DEFAULT_PROJECT_ID and environment is fixed to default.
POST /v1/metrics/batch
Request body:
{
"metrics": [
{
"name": "gpu_utilization",
"labels": {"node_id": "node-1", "gpu_index": "0"},
"value": 85.0,
"timestamp": "2026-02-12T12:18:00Z"
}
]
}Example:
NOW=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
curl -H "Authorization: Bearer $API_BEARER_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"metrics\":[{\"name\":\"gpu_utilization\",\"labels\":{\"node_id\":\"node-1\",\"gpu_index\":\"0\"},\"value\":85.0,\"timestamp\":\"$NOW\"}]}" \
"http://127.0.0.1:8742/v1/metrics/batch"GET /api/public/metrics/names
Returns all metric names under the current project:
curl -H "Authorization: Bearer $API_BEARER_TOKEN" \
"http://127.0.0.1:8742/api/public/metrics/names"GET /api/public/metrics/query
Parameters:
- name (required)
- from/to (optional, ISO8601; default last 1 hour)
- labels (optional, JSON string; backend filters with
labels @> ...) - step (optional: 1m/5m/1h/1d; default 1m)
- agg (optional: avg/max/min/sum/last; default avg)
Example:
curl -H "Authorization: Bearer $API_BEARER_TOKEN" \
"http://127.0.0.1:8742/api/public/metrics/query?name=gpu_utilization&step=1m&agg=last&labels=%7B%22node_id%22%3A%22node-1%22%2C%22gpu_index%22%3A%220%22%7D"Example:
curl -H "Authorization: Bearer $API_BEARER_TOKEN" \
"http://127.0.0.1:8742/api/public/traces?page=1&limit=50"The repository includes an xtrace-client crate (HTTP SDK, based on reqwest).
[dependencies]
xtrace-client = "0.0.1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }Example:
use xtrace_client::{Client, TraceListQuery};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let client = Client::new("http://127.0.0.1:8742/", "YOUR_TOKEN")?;
client.healthz().await?;
let traces = client.list_traces(&TraceListQuery::default()).await?;
println!("{}", traces.data.len());
let now = chrono::Utc::now();
client
.push_metrics(&[xtrace_client::MetricPoint {
name: "gpu_utilization".to_string(),
labels: std::collections::HashMap::from([
("node_id".to_string(), "node-1".to_string()),
("gpu_index".to_string(), "0".to_string()),
]),
value: 85.0,
timestamp: now,
}])
.await?;
Ok(())
}The xtrace crate is positioned as a server-side component; splitting the SDK separately avoids pulling server-side dependencies (axum/sqlx) into client applications.