feat: use local json logs for observability#98
Conversation
- Replace Axiom OTLP export with local JSON file logging - Add local JSONFileSpanExporter for OpenTelemetry traces - Use BatchSpanProcessor to prevent blocking event loop - Write telemetry to blacki-telemetry.log - Write traces to blacki-traces.log - Update compose.yaml to use bind mounts for logs and state - Update gitignore to exclude local logs and state
There was a problem hiding this comment.
Code Review
This pull request transitions the observability stack from OTLP exports to local JSON file logging and tracing, including updates to the Docker environment and .gitignore. Reviewers suggested several enhancements: adding parent_id to trace spans for hierarchy reconstruction, using ISO 8601 timestamps, and including process/thread IDs in logs. Additionally, feedback highlighted opportunities to deduplicate directory path logic, improve JSON serialization robustness using default=str, and simplify OpenTelemetry resource creation.
| span_data = { | ||
| "name": span.name, | ||
| "context": { | ||
| "trace_id": format(span.context.trace_id, "032x"), | ||
| "span_id": format(span.context.span_id, "016x"), | ||
| }, |
There was a problem hiding this comment.
The exported span data is missing the parent_span_id. Without this field, it is impossible to reconstruct the trace hierarchy (parent-child relationships) in DuckDB or other analysis tools.
| span_data = { | |
| "name": span.name, | |
| "context": { | |
| "trace_id": format(span.context.trace_id, "032x"), | |
| "span_id": format(span.context.span_id, "016x"), | |
| }, | |
| span_data = { | |
| "name": span.name, | |
| "context": { | |
| "trace_id": format(span.context.trace_id, "032x"), | |
| "span_id": format(span.context.span_id, "016x"), | |
| }, | |
| "parent_id": format(span.parent.span_id, "016x") if span.parent else None, |
| class JSONFormatter(logging.Formatter): | ||
| def format(self, record: logging.LogRecord) -> str: | ||
| log_record = { | ||
| "timestamp": self.formatTime(record, self.datefmt), |
There was a problem hiding this comment.
For machine-readable JSON logs, it is highly recommended to use ISO 8601 format for timestamps. The default formatTime behavior can be locale-dependent and inconsistent across environments.
| "timestamp": self.formatTime(record, self.datefmt), | |
| "timestamp": datetime.datetime.fromtimestamp(record.created, tz=datetime.timezone.utc).isoformat(), |
| log_record = { | ||
| "timestamp": self.formatTime(record, self.datefmt), | ||
| "level": record.levelname, | ||
| "name": record.name, | ||
| "message": record.getMessage(), | ||
| } |
There was a problem hiding this comment.
Consider adding process and thread IDs to the JSON log record. This is especially useful for debugging concurrency issues in a FastAPI application.
| log_record = { | |
| "timestamp": self.formatTime(record, self.datefmt), | |
| "level": record.levelname, | |
| "name": record.name, | |
| "message": record.getMessage(), | |
| } | |
| log_record = { | |
| "timestamp": self.formatTime(record, self.datefmt), | |
| "level": record.levelname, | |
| "name": record.name, | |
| "message": record.getMessage(), | |
| "process": record.process, | |
| "thread": record.threadName, | |
| } |
|
|
||
| # Configure local JSON file handler | ||
| # We use ./logs locally and /app/logs inside Docker, so check both | ||
| log_dir = "/app/logs" if Path("/.dockerenv").exists() else "./logs" |
| if span.events | ||
| else [], | ||
| } | ||
| f.write(json.dumps(span_data) + "\n") |
There was a problem hiding this comment.
Using json.dumps without a default handler can raise a TypeError if any span attributes or events contain non-serializable types. It is safer to use default=str to ensure the exporter doesn't crash.
| f.write(json.dumps(span_data) + "\n") | |
| f.write(json.dumps(span_data, default=str) + "\n") |
| resource_attrs = {} | ||
| if "OTEL_RESOURCE_ATTRIBUTES" in os.environ: | ||
| for pair in os.environ["OTEL_RESOURCE_ATTRIBUTES"].split(","): | ||
| if "=" in pair: | ||
| key, value = pair.split("=", 1) | ||
| resource_attrs[key] = value |
There was a problem hiding this comment.
Manual parsing of OTEL_RESOURCE_ATTRIBUTES is redundant here. The OpenTelemetry SDK's Resource.create() method automatically reads and merges attributes from the environment variable by default.
| resource_attrs = {} | |
| if "OTEL_RESOURCE_ATTRIBUTES" in os.environ: | |
| for pair in os.environ["OTEL_RESOURCE_ATTRIBUTES"].split(","): | |
| if "=" in pair: | |
| key, value = pair.split("=", 1) | |
| resource_attrs[key] = value | |
| resource = Resource.create() |
- Add parent_id to trace spans for hierarchy - Use ISO 8601 timestamps for logs and spans - Include process and thread IDs in logs - Deduplicate directory path logic - Improve JSON serialization using default=str - Simplify OpenTelemetry resource creation
What
Replace Axiom OTLP export with local JSON file logging and trace export via DuckDB.
Why
To simplify the observability stack, lower reliance on external third-party services (Axiom), and prevent logs/traces from being lost during CI/CD redeployments. It enables utilizing the server disk and DuckDB for log analytics.
How
logging.FileHandlerto output JSON lines to./logs/blacki-telemetry.log.JSONFileSpanExporterusingBatchSpanProcessorto capture OpenTelemetry traces and write to./logs/blacki-traces.logasynchronously to prevent blocking the event loop.compose.yamlto use bind mounts for logs (./logs:/app/logs) and ADK state (./.adk_state:/app/src/.adk) to protect data fromdocker system pruneduring CI/CD..gitignoreto avoid checking in the locallogs/and.adk_state/folders.Tests
docker compose up -d.blacki-telemetry.logandblacki-traces.logare populated and formatted correctly.read_json_auto.