This document provides an overview of the Log Processor, a Go-based service that continuously consumes, processes, and structures logs from microservices. It works closely with Kafka (as the log stream), a time-series or relational database (for structured data storage), and optionally Neo4j (for visualizing microservice dependencies).
- Purpose
- High-Level Architecture
- Key Components
- Data Flow
- Typical Use Cases
- Pitfalls and Recommendations
- Extensibility
- Contributing
## Purpose

The Log Processor’s primary goals:
- Ingest and store raw logs from Kafka topics (published by microservices).
- Reconstruct complete request flows (traces) by correlating data with `trace_id` and `span_id`.
- Convert raw data into a structured format, enabling easy querying, analytics, and continuous updates to runtime models.
- Periodically update a service dependency graph (if desired, in a graph database like Neo4j).
## High-Level Architecture

- **Kafka Consumer**
  - Reads log messages published by microservices to Kafka.
  - Each message typically includes (an illustrative Go struct appears after this list):
    - `trace_id` / `span_id`
    - Source service, destination service
    - Request/response details
    - Timestamps
    - Possible parent–child relationships for hierarchical call chains
- **Database Storage**
  - A time-series or relational database is used for:
    - Raw Logs: storing individual log records as they arrive.
    - Structured Logs: storing aggregated records (grouped by `trace_id` to rebuild full request flows).
- **Dependency Graph (Optional)**
  - A scheduler periodically pulls processed data from the database and updates a dependency graph, capturing:
    - Service relationships (which service calls which)
    - Call metrics (call frequency, average latency, error counts)
- **Dashboard / Analytics**
  - Analytical dashboards (e.g., Grafana) or custom UIs can query the Log Processor’s outputs for real-time observability.
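The exact message schema varies by team; a minimal Go sketch of the fields listed above might look like the following (field and JSON key names are illustrative assumptions, not a fixed contract):

```go
package logprocessor

import "time"

// LogEntry is an illustrative shape for a single log message consumed from
// Kafka. Field names and JSON keys are assumptions, not a fixed contract.
type LogEntry struct {
	TraceID       string    `json:"trace_id"`
	SpanID        string    `json:"span_id"`
	ParentSpanID  string    `json:"parent_span_id,omitempty"` // empty for root spans
	SourceService string    `json:"source_service"`
	DestService   string    `json:"destination_service"`
	Request       string    `json:"request"`  // e.g. HTTP method and path
	Response      string    `json:"response"` // e.g. status code or error text
	Timestamp     time.Time `json:"timestamp"`
	DurationMs    int64     `json:"duration_ms"`
}
```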
## Key Components

- **Kafka Consumer**
  - Language: Go
  - Responsibility:
    - Subscribes to one or more Kafka topics (e.g. `microservice-logs-topic`).
    - Batches or streams log entries to the Log Processor’s core logic.
  - A minimal consumer sketch appears after this list.
- **Trace Correlation**
  - Description:
    - Groups log entries by `trace_id`.
    - Reconstructs call hierarchies using parent–child `span_id` references.
    - Performs transformations or adds metadata (e.g. user sessions, environment markers).
  - A simplified correlation sketch appears after this list.
- **Database Storage**
  - Purpose:
    - Holds both raw and enriched log data for querying.
    - Typically includes tables for `raw_logs` and `structured_logs`, keyed by `trace_id`.
  - A possible table layout is sketched after this list.
- **Dependency Graph Updater**
  - Purpose:
    - Reads from `structured_logs` in small batches (e.g., every few seconds/minutes).
    - Inserts or updates edges in a graph database to reflect new inter-service calls or changes in latencies.
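A minimal consumer sketch, assuming the segmentio/kafka-go client and placeholder broker, group, and topic settings (the real service would read these from configuration):

```go
package main

import (
	"context"
	"encoding/json"
	"log"

	"github.com/segmentio/kafka-go"
)

func main() {
	// Broker address, consumer group, and topic are placeholders.
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		GroupID: "log-processor",
		Topic:   "microservice-logs-topic",
	})
	defer r.Close()

	for {
		msg, err := r.ReadMessage(context.Background())
		if err != nil {
			log.Printf("reader stopped: %v", err)
			break
		}

		// In the real service this would decode into the full LogEntry
		// shape sketched earlier; only two fields are shown here.
		var entry struct {
			TraceID string `json:"trace_id"`
			SpanID  string `json:"span_id"`
		}
		if err := json.Unmarshal(msg.Value, &entry); err != nil {
			log.Printf("skipping malformed log entry: %v", err)
			continue
		}

		// Hand the entry to the core logic: raw_logs insert, correlation, etc.
		log.Printf("received trace_id=%s span_id=%s", entry.TraceID, entry.SpanID)
	}
}
```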
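Correlation itself is mostly bookkeeping: group spans by `trace_id` and rebuild the parent–child tree from `span_id` references. A simplified sketch, with illustrative field names and a naive sum for total duration:

```go
package logprocessor

// Span is a minimal view of one log record used for correlation; field
// names are illustrative.
type Span struct {
	TraceID      string
	SpanID       string
	ParentSpanID string // empty for the root span
	Service      string
	DurationMs   int64
	Err          bool
}

// Trace is the aggregated record that would be written to structured_logs.
type Trace struct {
	TraceID    string
	RootSpan   string
	Services   []string
	TotalMs    int64 // naive sum of span durations; the root span's duration may be preferable
	ErrorCount int
	Children   map[string][]string // parent span_id -> child span_ids
}

// Correlate groups raw spans by trace_id and rebuilds the parent-child
// hierarchy. Grouping is keyed purely on IDs, so out-of-order arrival is
// tolerated; spans whose trace never completes simply remain partial.
func Correlate(spans []Span) map[string]*Trace {
	traces := make(map[string]*Trace)
	for _, s := range spans {
		t, ok := traces[s.TraceID]
		if !ok {
			t = &Trace{TraceID: s.TraceID, Children: make(map[string][]string)}
			traces[s.TraceID] = t
		}
		t.Services = append(t.Services, s.Service)
		t.TotalMs += s.DurationMs
		if s.Err {
			t.ErrorCount++
		}
		if s.ParentSpanID == "" {
			t.RootSpan = s.SpanID
		} else {
			t.Children[s.ParentSpanID] = append(t.Children[s.ParentSpanID], s.SpanID)
		}
	}
	return traces
}
```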
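One possible table layout, shown here as DDL embedded in a Go constant and assuming a PostgreSQL-style store; column names and types are assumptions:

```go
package logprocessor

// Illustrative DDL for the two tables; exact columns and types would be
// tuned to the chosen database.
const schemaDDL = `
CREATE TABLE IF NOT EXISTS raw_logs (
    id                  BIGSERIAL PRIMARY KEY,
    trace_id            TEXT NOT NULL,
    span_id             TEXT NOT NULL,
    parent_span_id      TEXT,
    source_service      TEXT,
    destination_service TEXT,
    payload             JSONB,
    logged_at           TIMESTAMPTZ NOT NULL
);

CREATE TABLE IF NOT EXISTS structured_logs (
    trace_id          TEXT PRIMARY KEY,
    services          TEXT[],
    total_duration_ms BIGINT,
    error_count       INT,
    updated_at        TIMESTAMPTZ NOT NULL
);
`
```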
## Data Flow

- **Microservice -> Kafka**
  Each microservice logs events to Kafka, tagging each entry with a `trace_id` and `span_id` for correlation.
- **Kafka -> Raw Logs Table**
  The Log Processor consumes messages from Kafka and writes them to a `raw_logs` table. This ensures minimal overhead during ingestion.
- **Log Processor -> Structured Logs**
  After correlation and trace reconstruction, the processor writes aggregated data (per trace) to `structured_logs`. This can include total duration, involved services, and error info.
- **Structured Logs -> Dependency Graph**
  The processor (or a background job) uses the aggregated data to update a dependency graph, capturing service-to-service calls and associated metrics (a Cypher sketch follows this list).
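A sketch of the graph update step, assuming the official Neo4j Go driver (v5); the Cypher query, node labels, relationship properties, and connection details are all illustrative:

```go
package main

import (
	"context"
	"log"

	"github.com/neo4j/neo4j-go-driver/v5/neo4j"
)

// upsertEdge MERGEs a service-to-service edge and refreshes its metrics.
// Labels and property names are assumptions, not a fixed graph model.
const upsertEdge = `
MERGE (src:Service {name: $src})
MERGE (dst:Service {name: $dst})
MERGE (src)-[c:CALLS]->(dst)
SET c.calls = coalesce(c.calls, 0) + $calls,
    c.avg_latency_ms = $avgLatencyMs,
    c.errors = coalesce(c.errors, 0) + $errors
`

func main() {
	ctx := context.Background()
	// Connection details are placeholders.
	driver, err := neo4j.NewDriverWithContext("bolt://localhost:7687",
		neo4j.BasicAuth("neo4j", "password", ""))
	if err != nil {
		log.Fatal(err)
	}
	defer driver.Close(ctx)

	session := driver.NewSession(ctx, neo4j.SessionConfig{})
	defer session.Close(ctx)

	// In the real service this would run periodically over fresh rows
	// read from structured_logs, not a hard-coded example edge.
	_, err = session.Run(ctx, upsertEdge, map[string]any{
		"src": "orders", "dst": "payments",
		"calls": 42, "avgLatencyMs": 87.5, "errors": 1,
	})
	if err != nil {
		log.Fatal(err)
	}
}
```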
## Typical Use Cases

- **Real-Time Monitoring**
  Continuously track call frequency, average and 95th-percentile latency, and error rates, and feed the data into dashboards or alerts.
- **Distributed Tracing**
  Reconstruct full call flows from `trace_id` and `span_id`, enabling quick identification of bottlenecks or anomalies.
- **Architecture Insights**
  Maintain an always-fresh overview of how services interact at runtime. Detect cyclical dependencies, newly introduced services, or changes in network topology.
## Pitfalls and Recommendations

- **High-Volume Logging**
  - With many services, log volume can be massive. Consider partitioning Kafka topics and scaling the Log Processor horizontally.
- **Indexing**
  - Ensure critical columns (e.g. `trace_id`, `span_id`) are properly indexed in your database to prevent query bottlenecks (an index sketch follows this list).
- **Partial Traces**
  - Logs may arrive out of order or may be missing entirely. Handle incomplete data gracefully (e.g., partial correlation, skipping incomplete spans).
- **Fault Tolerance**
  - Use replication for Kafka and the database.
  - The Log Processor itself should be stateless, allowing easy redeployment or autoscaling.
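For example, with the PostgreSQL-style layout sketched earlier, the relevant indexes could be created alongside the tables; names and columns are assumptions:

```go
package logprocessor

// Illustrative index DDL for the hot lookup paths; table and column
// names follow the earlier schema sketch and are assumptions.
const indexDDL = `
CREATE INDEX IF NOT EXISTS idx_raw_logs_trace_id  ON raw_logs (trace_id);
CREATE INDEX IF NOT EXISTS idx_raw_logs_span_id   ON raw_logs (span_id);
CREATE INDEX IF NOT EXISTS idx_raw_logs_logged_at ON raw_logs (logged_at);
`
```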
## Extensibility

- **Custom Enrichment**
  Add domain-specific metadata (e.g., a user token or geo-location) to each structured log entry (a small hook sketch follows this list).
- **Alternative Storage**
  Replace or add other data stores (e.g. Elasticsearch, InfluxDB) for specialized indexing or analytics demands.
- **Integration with Analytics**
  Pipe your structured logs into ML pipelines or anomaly detection services for advanced performance insights.
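One lightweight way to support custom enrichment is a hook that runs before a structured entry is persisted; the hook shape below is an illustrative assumption, not an existing API:

```go
package logprocessor

// Enricher mutates the metadata attached to a structured log entry before
// it is written to structured_logs. The signature is illustrative.
type Enricher func(traceID string, metadata map[string]string)

// WithEnvironment is a tiny example hook that tags every trace with the
// deployment environment (e.g. "staging", "production").
func WithEnvironment(env string) Enricher {
	return func(traceID string, metadata map[string]string) {
		metadata["environment"] = env
	}
}
```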
## Contributing

- Fork the repository and create a feature branch.
- Implement and test changes or bug fixes.
- Open a Pull Request describing your changes.