feat(logs): add structured log aggregator for Docker container observability #22
Merged
…ollection

Implement temps-log-aggregator crate for real-time Docker container log collection, compressed NDJSON chunk storage, full-text search, and live tail via SSE. This provides comprehensive observability for deployed containers on the platform.

Key features:
- Real-time Docker log streaming with automatic container discovery
- Compressed NDJSON chunk storage (zstd) on filesystem or S3
- Dual search paths: TimescaleDB index for ERROR/WARN, archive scan for full-text
- Live tail via Server-Sent Events with project/service/level filtering
- Automatic retention cleanup with configurable policies
- Permission-guarded handlers (LogsRead, LogsDelete) with audit logging

Integration points:
- Deploy containers labeled with sh.temps.* for automatic log collection
- Streaming resilience: reconnect tracking, container-gone detection, bounded retries
- Events listener with outer retry loop for permanent liveness
- Plugin registered in console.rs with configurable storage backend

Test coverage: 101 tests (unit + integration) covering parser, storage, chunk writer, search, tail, metadata, handlers, permissions, compression roundtrips, and large batch scenarios.
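The dual search paths listed above can be illustrated with a small routing function. This is a minimal sketch and not the crate's actual code: `Level`, `SearchQuery`, `SearchPath`, and `route` are hypothetical names, and the rule it encodes (index only when the query is restricted to WARN/ERROR and carries no free text) is an assumption inferred from the commit message.

```rust
/// Log severity levels (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Debug,
    Info,
    Warn,
    Error,
}

/// A hypothetical search request: optional level filter plus optional free text.
#[derive(Debug, Default)]
pub struct SearchQuery {
    pub levels: Vec<Level>,
    pub text: Option<String>,
}

/// Which backend serves the query.
#[derive(Debug, PartialEq, Eq)]
pub enum SearchPath {
    /// Database index, assumed to hold only WARN/ERROR entries.
    Index,
    /// Scan of the compressed NDJSON chunks.
    Archive,
}

/// Assumed routing rule: use the index only when every requested level is
/// indexed and no full-text term is present; otherwise scan the archive.
pub fn route(query: &SearchQuery) -> SearchPath {
    let only_indexed_levels = !query.levels.is_empty()
        && query
            .levels
            .iter()
            .all(|l| matches!(l, Level::Warn | Level::Error));
    let has_text = query
        .text
        .as_deref()
        .map_or(false, |t| !t.trim().is_empty());

    if only_indexed_levels && !has_text {
        SearchPath::Index
    } else {
        SearchPath::Archive
    }
}
```

Under these assumptions, a query for `[Level::Error]` with no text goes to the index, while the same query with a search term, or one that also asks for `Level::Info`, falls back to the archive scan.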
…log history UI

- Refactor log aggregator from UUID to i32 project_id throughout the entire write and read path (collector, chunk writer, metadata, search, storage keys)
- Remove unused log_events table and TimescaleDB hypertable dependency
- Fix BuildKit build log output: emit vertex names (build step descriptions) in addition to command output, so cached layers are visible
- Revert tar context creation to file-based spawn_blocking approach
- Restore temps-environments, temps-screenshots, temps-embeddings in workspace
- Consolidate migrations into single m20260225_000001
- Add frontend log history viewer with filters, pagination, and virtualized rendering
- Add History tab to project runtime logs page
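Since the project ID is now an `i32` and containers are identified through `sh.temps.*` labels, the collector has to turn string labels into typed values somewhere. The following is a rough sketch of that step under stated assumptions: `ContainerIdentity` and `identity_from_labels` are hypothetical names, and which labels are required versus optional is a guess. Only the label keys themselves come from the PR description.

```rust
use std::collections::HashMap;

/// Typed view of the labels attached to a deployed container
/// (hypothetical struct; field types other than `project_id` are assumptions).
#[derive(Debug, PartialEq, Eq)]
pub struct ContainerIdentity {
    pub project_id: i32,
    pub service: String,
    pub namespace: Option<String>,
}

/// Extracts an identity from Docker labels.
/// Returns `None` when a required label is missing or `project_id`
/// does not parse as an `i32`, so unlabeled containers are skipped.
pub fn identity_from_labels(labels: &HashMap<String, String>) -> Option<ContainerIdentity> {
    let project_id = labels.get("sh.temps.project_id")?.parse::<i32>().ok()?;
    let service = labels.get("sh.temps.service")?.clone();
    let namespace = labels.get("sh.temps.namespace").cloned();

    Some(ContainerIdentity {
        project_id,
        service,
        namespace,
    })
}
```

Returning `None` rather than an error keeps discovery cheap: containers that were not deployed by the platform carry no such labels and are ignored.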
- Introduced a new `_typos.toml` file to extend words with the `flate2` crate.
- Updated `.gitignore` to include `.env`, `.env.local`, and `content/` directories.
- Enabled `cargo clippy` pre-commit hook in `.pre-commit-config.yaml` for linting Rust code.
- Updated `Cargo.lock` with new dependencies including `fixedbitset`, `multimap`, `petgraph`, and `prost-build`.
- Added `temps-otel`, `temps-plugin-sdk`, and `temps-external-plugins` crates to the workspace in `Cargo.toml`.
- Enhanced `CHANGELOG.md` with new features and improvements related to OpenTelemetry and plugin systems.
…ory viewer

- Added `temps-log-aggregator` crate for real-time Docker container log collection with features like automatic container discovery, compressed NDJSON storage, dual search paths, and live tail via Server-Sent Events.
- Implemented a frontend log history viewer with search filters, pagination, and virtualized rendering, accessible through a new History tab in the project runtime logs page.
- Upgraded Bollard to 0.20.1, migrating all crates to the new API.
- Enhanced BuildKit log output to include vertex names for better visibility in deployment logs.
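The live tail is described elsewhere in this PR as filtering by project, service, and level. A minimal sketch of that matching step follows; `LogLine`, `TailFilter`, and `matches` are hypothetical names, and treating an unset field as "match anything" with case-insensitive level comparison is an assumption rather than confirmed behavior.

```rust
/// One parsed log line as it would be broadcast to tail subscribers
/// (hypothetical shape for this sketch).
#[derive(Debug, Clone)]
pub struct LogLine {
    pub project_id: i32,
    pub service: String,
    pub level: String,
    pub message: String,
}

/// Per-subscriber filter; `None` means "do not filter on this field".
#[derive(Debug, Default)]
pub struct TailFilter {
    pub project_id: Option<i32>,
    pub service: Option<String>,
    pub level: Option<String>,
}

impl TailFilter {
    /// Returns true when the line satisfies every field that is set.
    pub fn matches(&self, line: &LogLine) -> bool {
        self.project_id.map_or(true, |p| p == line.project_id)
            && self
                .service
                .as_deref()
                .map_or(true, |s| s == line.service.as_str())
            && self
                .level
                .as_deref()
                .map_or(true, |l| l.eq_ignore_ascii_case(&line.level))
    }
}
```

In a broadcast-based design, each SSE connection would hold its own `TailFilter` and drop non-matching lines before writing them to the client.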
Summary
Closes #21
Adds a new `temps-log-aggregator` crate that provides comprehensive structured log aggregation for Docker containers deployed on the platform.

- … `sh.temps.*` labels, with reconnect resilience, container-gone detection, and bounded retries
- … `LogsRead`/`LogsDelete` permission guards on all endpoints

Changes
New crate: `temps-log-aggregator` (8,200+ lines)

- `parser.rs`: Docker JSON log parsing, plain-text level detection, structured field extraction
- `storage/`: Pluggable storage backends (filesystem, S3) with zstd compression
- `services/chunk_writer.rs`: Per-container buffering with 1MB / 30s flush triggers
- `services/collector.rs`: Docker log streaming with resilience (last_seen_ts tracking, container-gone detection, max consecutive errors)
- `services/metadata.rs`: TimescaleDB operations for log_chunks and log_events
- `services/search.rs`: Dual-path search routing (index vs archive), field filters, pagination
- `services/tail.rs`: Broadcast-based live tail with filter matching
- `services/retention.rs`: Chunk cleanup by age with storage deletion
- `handlers/`: HTTP endpoints: `POST /logs/search`, `GET /logs/context`, `GET /logs/tail` (SSE), `DELETE /projects/{id}/logs`
- `plugin.rs`: Platform integration: startup scan, Docker events listener, retention scheduler

Modified crates
- … `labels: HashMap<String, String>` to `DeployRequest`, applied to container creation
- … `sh.temps.project_id`, `sh.temps.deployment_id`, `sh.temps.environment_id`, `sh.temps.service`, `sh.temps.namespace` labels
- … `LogsDelete` permission to all 5 locations (enum, Display, from_str, all(), Role::permissions)
- … `log_chunks` and `log_events` Sea-ORM entities
- … `m20260225_000001_create_log_aggregator_tables` migration
- … `TEMPS_LOG_STORAGE_BACKEND` and S3 config env vars, registers plugin

Test Coverage
101 tests (was 89 before this session, +12 new):
No regressions: `temps-auth` (102 passed), `temps-deployer` (39 passed)
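For reviewers who want a concrete picture of the "1MB / 30s flush triggers" listed for `services/chunk_writer.rs`, here is a minimal sketch of a per-container buffer that flushes on either condition. It is not taken from the crate: `ChunkBuffer`, `push`, `flush_if_stale`, and both constants are hypothetical, and measuring age from the first buffered line is an assumption. Compression and storage upload are left out.

```rust
use std::time::{Duration, Instant};

/// Size trigger: flush once roughly 1 MB of NDJSON is buffered.
pub const MAX_CHUNK_BYTES: usize = 1024 * 1024;
/// Age trigger: flush once the oldest buffered line is 30 s old.
pub const MAX_CHUNK_AGE: Duration = Duration::from_secs(30);

/// In-memory buffer for one container (hypothetical type for this sketch).
pub struct ChunkBuffer {
    lines: Vec<String>,
    bytes: usize,
    opened_at: Option<Instant>,
}

impl ChunkBuffer {
    pub fn new() -> Self {
        ChunkBuffer {
            lines: Vec::new(),
            bytes: 0,
            opened_at: None,
        }
    }

    /// Appends one NDJSON line. Returns the buffered lines when the
    /// size trigger fires, otherwise `None`.
    pub fn push(&mut self, line: String, now: Instant) -> Option<Vec<String>> {
        // Remember when the first line of this chunk arrived.
        self.opened_at.get_or_insert(now);
        self.bytes += line.len() + 1; // +1 for the newline separator
        self.lines.push(line);
        if self.bytes >= MAX_CHUNK_BYTES {
            Some(self.take())
        } else {
            None
        }
    }

    /// Intended to be called from a periodic timer. Returns the buffered
    /// lines when the age trigger fires, otherwise `None`.
    pub fn flush_if_stale(&mut self, now: Instant) -> Option<Vec<String>> {
        match self.opened_at {
            Some(opened) if now.duration_since(opened) >= MAX_CHUNK_AGE => Some(self.take()),
            _ => None,
        }
    }

    /// Drains the buffer and resets both triggers.
    fn take(&mut self) -> Vec<String> {
        self.bytes = 0;
        self.opened_at = None;
        std::mem::take(&mut self.lines)
    }
}
```

Passing `now` in explicitly, instead of calling `Instant::now()` inside the buffer, keeps both triggers testable without sleeping.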