feat(otel): add OpenTelemetry ingest, query, and frontend traces UI #18
Merged
… integration
- Introduced `temps-otel` crate to the workspace and updated dependencies in `Cargo.toml`.
- Added `OtelRead` and `OtelWrite` permissions to the `Permission` enum in `temps-auth`.
- Registered `OtelPlugin` in the console API for OpenTelemetry metrics, traces, and logs collection.
- Created migration for OpenTelemetry tables in the database.
- Updated relevant files to integrate OpenTelemetry functionality across the application.

- Added `.env` to `.gitignore` to prevent sensitive information from being tracked.
- Updated `Cargo.toml` to include new crates: `temps-environments`, `temps-screenshots`, and `temps-embeddings`.
- Added `tower` and `uuid` dependencies to `Cargo.lock` and `Cargo.toml`.
- Enhanced `CHANGELOG.md` with new features related to PostgreSQL backups and preset providers.
- Updated `docker-compose.yml` for PostgreSQL configuration to support WAL-G for backups.
- Improved CLI error handling and added source map management commands in `temps-cli`.
- Refined analytics event handling and introduced console event ingestion in `temps-analytics-events`.

- Added resource monitoring tab in the project sidebar and a dedicated monitoring settings page with per-environment CPU, memory, and disk metrics.
- Introduced `status_code_class` query parameter for proxy log stats endpoints to filter by status code classes (e.g., "2xx", "3xx").
- Implemented TimescaleDB compression and retention policies for the `proxy_logs` hypertable, optimizing data management.
- Enabled `cargo clippy` pre-commit hook to catch lint issues before CI, improving code quality.
- Updated various components and API types to support new monitoring functionalities and enhance user experience.

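The `status_code_class` filter mentioned above maps a class string to a range of HTTP status codes. A minimal sketch of that mapping, assuming classes `1xx` through `5xx` and case-insensitive input; `status_class_range` is a hypothetical helper for illustration, not the function used in the Temps codebase:

```rust
use std::ops::RangeInclusive;

/// Parse a status code class such as "2xx" into the inclusive range of
/// HTTP status codes it covers. Returns `None` for any other input.
/// NOTE: hypothetical helper; the name and the accepted inputs are assumptions.
pub fn status_class_range(class: &str) -> Option<RangeInclusive<u16>> {
    let bytes = class.as_bytes();
    // Expect exactly one digit followed by "xx" (any letter case).
    if bytes.len() != 3 || !bytes[1..].eq_ignore_ascii_case(b"xx") {
        return None;
    }
    match bytes[0] {
        digit @ b'1'..=b'5' => {
            let base = u16::from(digit - b'0') * 100;
            Some(base..=base + 99)
        }
        _ => None,
    }
}
```

A handler could then translate `status_class_range("2xx")` into a `status_code BETWEEN 200 AND 299` style condition, and reject the request when the function returns `None`.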
- Complete temps-otel crate: OTLP/HTTP protobuf ingest (traces, metrics, logs), query handlers, TimescaleDB storage, rate limiting, quota checks, anomaly detection, health summaries, and sidecar config generation
- Auth: support tk_ (API key) and dt_ (deployment token) authentication for OTel ingest with path-based and header-based routes
- Frontend: Traces list page with filtering (time range, service, status), trace detail page with span waterfall visualization and span detail panel, setup section with OTLP endpoint and Next.js code snippets
- Add deployment_id to deployment tokens for OTel context propagation
- Fix protobuf Span.flags from uint32 to fixed32 per OTLP v1.1.0+ spec
- Remove server-side tail sampling (sampling is client SDK responsibility)
- Add OtelRead/OtelWrite permissions, plugin registered in console
- 117 passing unit tests, zero clippy warnings

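The `Span.flags` fix matters because `uint32` and `fixed32` use different protobuf wire types, so a decoder built against the wrong type does not read the field as intended. The sketch below hand-encodes the same value both ways to show the difference. It assumes `flags` is field number 16 in the OTLP `Span` message; the function names are hypothetical, and a real implementation would rely on generated protobuf code rather than manual encoding.

```rust
/// Append `value` to `out` as a protobuf base-128 varint.
fn put_varint(out: &mut Vec<u8>, mut value: u64) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

const WIRE_VARINT: u64 = 0; // wire type used by uint32
const WIRE_FIXED32: u64 = 5; // wire type used by fixed32
const FLAGS_FIELD: u64 = 16; // assumed field number of Span.flags

/// Encode `flags` as a `uint32` field: varint tag, then varint value.
pub fn encode_flags_as_uint32(flags: u32) -> Vec<u8> {
    let mut out = Vec::new();
    put_varint(&mut out, (FLAGS_FIELD << 3) | WIRE_VARINT);
    put_varint(&mut out, u64::from(flags));
    out
}

/// Encode `flags` as a `fixed32` field: varint tag, then 4 little-endian bytes.
pub fn encode_flags_as_fixed32(flags: u32) -> Vec<u8> {
    let mut out = Vec::new();
    put_varint(&mut out, (FLAGS_FIELD << 3) | WIRE_FIXED32);
    out.extend_from_slice(&flags.to_le_bytes());
    out
}
```

For `flags = 0x101`, the `uint32` form is `80 01 81 02` and the `fixed32` form is `85 01 01 01 00 00`: the tag bytes differ, and the `fixed32` payload is always four bytes.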
- Add protobuf-compiler installation to all CI jobs that compile the workspace (check, clippy, build-tests, unit-tests, integration-tests)
- Add temps-otel to unit-b test group
- Add OTel feature entry to CHANGELOG.md [Unreleased] section
- Document protoc and wasm-pack as prerequisites in CONTRIBUTING.md with platform-specific installation instructions
- Add changelog reminder to PR checklist

Add GET /otel/trace-summaries that returns one row per trace (grouped by trace_id) with root span name, service name, deployment environment, span count, error count, and duration. This fixes the pagination bug where the old endpoint returned flat spans, causing only ~5 traces to display per page.
- Add TraceSummary type with deployment_environment field
- Add query_trace_summaries() and count_traces() to OtelStorage trait
- Implement both in TimescaleDB (GROUP BY trace_id with array_agg)
- Implement both in MockOtelStorage for tests
- Add TraceSummariesResponse handler with proper total count
- Register new route and OpenAPI annotations
- Update TracesList.tsx to use new endpoint (remove client-side groupByTrace)
- Show Environment column as badge when viewing all environments
- Change page size from 50 to 10 traces per page
- Auto-inject OTel env vars in workflow_planner for deployments

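The pagination fix comes down to grouping before paging: a page of 50 flat spans may belong to only a handful of traces, whereas a page of trace summaries always holds one row per trace. The in-memory sketch below illustrates the grouping step under simplifying assumptions. `Span`, `TraceSummaryRow`, `summarize_traces`, and `page` are hypothetical stand-ins, not the `TraceSummary` type or `OtelStorage` methods from this PR, and the real implementation performs the grouping in SQL.

```rust
use std::collections::BTreeMap;

/// Simplified span record (hypothetical; the real schema has more fields).
#[derive(Debug, Clone)]
pub struct Span {
    pub trace_id: String,
    pub parent_span_id: Option<String>,
    pub name: String,
    pub is_error: bool,
    pub start_ns: u64,
    pub end_ns: u64,
}

/// One row per trace, loosely mirroring the summary described above.
#[derive(Debug, Clone, PartialEq)]
pub struct TraceSummaryRow {
    pub trace_id: String,
    pub root_span_name: Option<String>,
    pub span_count: usize,
    pub error_count: usize,
    pub duration_ns: u64,
}

/// Group flat spans by `trace_id`, the in-memory analogue of `GROUP BY trace_id`.
pub fn summarize_traces(spans: &[Span]) -> Vec<TraceSummaryRow> {
    // trace_id -> (summary so far, earliest start, latest end)
    let mut acc: BTreeMap<&str, (TraceSummaryRow, u64, u64)> = BTreeMap::new();
    for span in spans {
        let entry = acc.entry(span.trace_id.as_str()).or_insert_with(|| {
            let row = TraceSummaryRow {
                trace_id: span.trace_id.clone(),
                root_span_name: None,
                span_count: 0,
                error_count: 0,
                duration_ns: 0,
            };
            (row, u64::MAX, 0)
        });
        entry.0.span_count += 1;
        if span.is_error {
            entry.0.error_count += 1;
        }
        // A span without a parent is treated as the root span of its trace.
        if span.parent_span_id.is_none() {
            entry.0.root_span_name = Some(span.name.clone());
        }
        entry.1 = entry.1.min(span.start_ns);
        entry.2 = entry.2.max(span.end_ns);
    }
    acc.into_values()
        .map(|(mut row, start, end)| {
            row.duration_ns = end.saturating_sub(start);
            row
        })
        .collect()
}

/// Return page number `page` (zero-based) of `rows`, `page_size` rows per page.
pub fn page<T: Clone>(rows: &[T], page: usize, page_size: usize) -> Vec<T> {
    rows.iter().skip(page * page_size).take(page_size).cloned().collect()
}
```

Paging the output of `summarize_traces` yields exactly `page_size` traces per page (until the last page), and the length of the summarized list plays the role of the total count returned alongside the rows.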
- TracesList: filters stack vertically on mobile (flex-col sm:flex-row), selects go full-width, hide Kind/Spans/Timestamp columns on mobile, compact pagination, overflow-x-auto on table
- TraceDetail: waterfall + detail panel stack vertically on mobile (flex-col lg:flex-row), detail panel goes full-width, span name column narrower on mobile, min-width on scrollable rows
- Add mobile responsiveness guidelines to CLAUDE.md

…ummaries
The deployment_environment column comes from the OTel resource attribute, which most SDKs don't set. JOIN the deployments and environments tables to get the actual environment name, falling back to the resource attribute via COALESCE. Also qualify all column references with a table alias, since the query now involves multiple tables.

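The COALESCE fallback described above picks the first non-NULL value: the environment name from the join if present, otherwise the resource attribute. As a rough Rust analogue, with `NULL` modeled as `None` (`resolve_environment` is a hypothetical name used only for illustration):

```rust
/// Mirror `COALESCE(joined_env_name, resource_attr)`: prefer the environment
/// name obtained from the join, fall back to the OTel resource attribute.
/// NOTE: hypothetical helper; the real fallback happens inside the SQL query.
pub fn resolve_environment(
    joined_env_name: Option<&str>,
    resource_attr: Option<&str>,
) -> Option<String> {
    joined_env_name.or(resource_attr).map(str::to_owned)
}
```

As with COALESCE, the result is absent only when both inputs are absent, and an empty string counts as a present value.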
…ent and proxying capabilities
- Added `temps-external-plugins` crate for managing standalone binary plugins, including discovery, lifecycle management, and HTTP proxying.
- Implemented `temps-plugin-sdk` crate to provide a standardized interface for plugin authors, including manifest definitions and service registration.
- Integrated external plugins into the main Temps application, allowing for dynamic loading and management via Unix domain sockets.
- Updated `Cargo.toml` and `Cargo.lock` to include new dependencies and workspace members for the external plugins system.
- Enhanced the console API to support graceful shutdown of external plugins and added routes for listing running plugins.
- Documented the new external plugin features in the CHANGELOG.md.

Description
Add a complete OpenTelemetry observability stack to Temps — from OTLP/HTTP protobuf ingest through TimescaleDB storage to a frontend traces visualization UI.
Backend (`temps-otel` crate)
- … (`tk_`) and deployment tokens (`dt_`), with header-based and path-based ingest routes
- … `time_bucket`, query logs, pipeline stats, health summaries, insights

Auth & Permissions
- `OtelRead`/`OtelWrite` permissions added
- `deployment_id` added to deployment tokens for full OTel context propagation

Frontend
Notable fixes
- `Span.flags` changed from `uint32` to `fixed32` per OTLP v1.1.0+ spec
- `TraceDetail` data extraction (`data.data` not `data.spans`)
- … (`ERROR`/`OK` from API)

Type of change
Checklist
- … (`cargo test --lib`)
- `cargo check --lib` passes with no warnings

Related issues
Ref #17