This document provides an overview of the Log Processor, a Go-based service that continuously consumes, processes, and structures logs from microservices. It works closely with Kafka (as the log stream), a time-series or relational database (for structured data storage), and optionally Neo4j (for visualizing microservice dependencies).
- Purpose
- High-Level Architecture
- Key Components
- Data Flow
- Typical Use Cases
- Pitfalls and Recommendations
- Extensibility
- Contributing
## Purpose

The Log Processor’s primary goals:
- Ingest and store raw logs from Kafka topics (published by microservices).
- Reconstruct complete request flows (traces) by correlating data with `trace_id` and `span_id`.
- Convert raw data into a structured format, enabling easy querying, analytics, and continuous updates to runtime models.
- Periodically update a service dependency graph (if desired, in a graph database like Neo4j).
## High-Level Architecture

- **Kafka Consumer**
  - Reads log messages published by microservices to Kafka.
  - Each message typically includes (an illustrative Go struct appears after this list):
    - `trace_id` / `span_id`
    - Source service, destination service
    - Request/response details
    - Timestamps
    - Possible parent–child relationships for hierarchical call chains
- **Database Storage**
  - A time-series or relational database is used for:
    - Raw Logs: storing individual log records as they arrive.
    - Structured Logs: storing aggregated records (grouped by `trace_id` to rebuild full request flows).
- **Dependency Graph (Optional)**
  - A scheduler periodically pulls processed data from the database and updates a dependency graph, capturing:
    - Service relationships (which service calls which)
    - Call metrics (call frequency, average latency, error counts)
- **Dashboard / Analytics**
  - Analytical dashboards (e.g., Grafana) or custom UIs can query the Log Processor’s outputs for real-time observability.
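The exact message schema varies by team; a minimal Go sketch of the fields listed above might look like the following (field and JSON key names are illustrative assumptions, not a fixed contract):

```go
package logprocessor

import "time"

// LogEntry is an illustrative shape for a single log message consumed from
// Kafka. Field names and JSON keys are assumptions, not a fixed contract.
type LogEntry struct {
	TraceID       string    `json:"trace_id"`
	SpanID        string    `json:"span_id"`
	ParentSpanID  string    `json:"parent_span_id,omitempty"` // empty for root spans
	SourceService string    `json:"source_service"`
	DestService   string    `json:"destination_service"`
	Request       string    `json:"request"`  // e.g. HTTP method and path
	Response      string    `json:"response"` // e.g. status code or error text
	Timestamp     time.Time `json:"timestamp"`
	DurationMs    int64     `json:"duration_ms"`
}
```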
## Key Components

- **Kafka Consumer**
  - Language: Go
  - Responsibility:
    - Subscribes to one or more Kafka topics (e.g. `microservice-logs-topic`).
    - Batches or streams log entries to the Log Processor’s core logic.
  - A minimal consumer sketch appears after this list.
- **Trace Correlation**
  - Description:
    - Groups log entries by `trace_id`.
    - Reconstructs call hierarchies using parent–child `span_id` references.
    - Performs transformations or adds metadata (e.g. user sessions, environment markers).
  - A simplified correlation sketch appears after this list.
- **Database Storage**
  - Purpose:
    - Holds both raw and enriched log data for querying.
    - Typically includes tables for `raw_logs` and `structured_logs`, keyed by `trace_id`.
  - A possible table layout is sketched after this list.
- **Dependency Graph Updater**
  - Purpose:
    - Reads from `structured_logs` in small batches (e.g., every few seconds/minutes).
    - Inserts or updates edges in a graph database to reflect new inter-service calls or changes in latencies.
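A minimal consumer sketch, assuming the segmentio/kafka-go client and placeholder broker, group, and topic settings (the real service would read these from configuration):

```go
package main

import (
	"context"
	"encoding/json"
	"log"

	"github.com/segmentio/kafka-go"
)

func main() {
	// Broker address, consumer group, and topic are placeholders.
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		GroupID: "log-processor",
		Topic:   "microservice-logs-topic",
	})
	defer r.Close()

	for {
		msg, err := r.ReadMessage(context.Background())
		if err != nil {
			log.Printf("reader stopped: %v", err)
			break
		}

		// In the real service this would decode into the full LogEntry
		// shape sketched earlier; only two fields are shown here.
		var entry struct {
			TraceID string `json:"trace_id"`
			SpanID  string `json:"span_id"`
		}
		if err := json.Unmarshal(msg.Value, &entry); err != nil {
			log.Printf("skipping malformed log entry: %v", err)
			continue
		}

		// Hand the entry to the core logic: raw_logs insert, correlation, etc.
		log.Printf("received trace_id=%s span_id=%s", entry.TraceID, entry.SpanID)
	}
}
```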
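Correlation itself is mostly bookkeeping: group spans by `trace_id` and rebuild the parent–child tree from `span_id` references. A simplified sketch, with illustrative field names and a naive sum for total duration:

```go
package logprocessor

// Span is a minimal view of one log record used for correlation; field
// names are illustrative.
type Span struct {
	TraceID      string
	SpanID       string
	ParentSpanID string // empty for the root span
	Service      string
	DurationMs   int64
	Err          bool
}

// Trace is the aggregated record that would be written to structured_logs.
type Trace struct {
	TraceID    string
	RootSpan   string
	Services   []string
	TotalMs    int64 // naive sum of span durations; the root span's duration may be preferable
	ErrorCount int
	Children   map[string][]string // parent span_id -> child span_ids
}

// Correlate groups raw spans by trace_id and rebuilds the parent-child
// hierarchy. Grouping is keyed purely on IDs, so out-of-order arrival is
// tolerated; spans whose trace never completes simply remain partial.
func Correlate(spans []Span) map[string]*Trace {
	traces := make(map[string]*Trace)
	for _, s := range spans {
		t, ok := traces[s.TraceID]
		if !ok {
			t = &Trace{TraceID: s.TraceID, Children: make(map[string][]string)}
			traces[s.TraceID] = t
		}
		t.Services = append(t.Services, s.Service)
		t.TotalMs += s.DurationMs
		if s.Err {
			t.ErrorCount++
		}
		if s.ParentSpanID == "" {
			t.RootSpan = s.SpanID
		} else {
			t.Children[s.ParentSpanID] = append(t.Children[s.ParentSpanID], s.SpanID)
		}
	}
	return traces
}
```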
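One possible table layout, shown here as DDL embedded in a Go constant and assuming a PostgreSQL-style store; column names and types are assumptions:

```go
package logprocessor

// Illustrative DDL for the two tables; exact columns and types would be
// tuned to the chosen database.
const schemaDDL = `
CREATE TABLE IF NOT EXISTS raw_logs (
    id                  BIGSERIAL PRIMARY KEY,
    trace_id            TEXT NOT NULL,
    span_id             TEXT NOT NULL,
    parent_span_id      TEXT,
    source_service      TEXT,
    destination_service TEXT,
    payload             JSONB,
    logged_at           TIMESTAMPTZ NOT NULL
);

CREATE TABLE IF NOT EXISTS structured_logs (
    trace_id          TEXT PRIMARY KEY,
    services          TEXT[],
    total_duration_ms BIGINT,
    error_count       INT,
    updated_at        TIMESTAMPTZ NOT NULL
);
`
```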
## Data Flow

- **Microservice -> Kafka**
  Each microservice logs events to Kafka, tagging each entry with a `trace_id` and `span_id` for correlation.
- **Kafka -> Raw Logs Table**
  The Log Processor consumes messages from Kafka and writes them to a `raw_logs` table. This ensures minimal overhead during ingestion.
- **Log Processor -> Structured Logs**
  After correlation and trace reconstruction, the processor writes aggregated data (per trace) to `structured_logs`. This can include total duration, involved services, and error info.
- **Structured Logs -> Dependency Graph**
  The processor (or a background job) uses the aggregated data to update a dependency graph, capturing service-to-service calls and associated metrics (a Cypher sketch follows this list).
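A sketch of the graph update step, assuming the official Neo4j Go driver (v5); the Cypher query, node labels, relationship properties, and connection details are all illustrative:

```go
package main

import (
	"context"
	"log"

	"github.com/neo4j/neo4j-go-driver/v5/neo4j"
)

// upsertEdge MERGEs a service-to-service edge and refreshes its metrics.
// Labels and property names are assumptions, not a fixed graph model.
const upsertEdge = `
MERGE (src:Service {name: $src})
MERGE (dst:Service {name: $dst})
MERGE (src)-[c:CALLS]->(dst)
SET c.calls = coalesce(c.calls, 0) + $calls,
    c.avg_latency_ms = $avgLatencyMs,
    c.errors = coalesce(c.errors, 0) + $errors
`

func main() {
	ctx := context.Background()
	// Connection details are placeholders.
	driver, err := neo4j.NewDriverWithContext("bolt://localhost:7687",
		neo4j.BasicAuth("neo4j", "password", ""))
	if err != nil {
		log.Fatal(err)
	}
	defer driver.Close(ctx)

	session := driver.NewSession(ctx, neo4j.SessionConfig{})
	defer session.Close(ctx)

	// In the real service this would run periodically over fresh rows
	// read from structured_logs, not a hard-coded example edge.
	_, err = session.Run(ctx, upsertEdge, map[string]any{
		"src": "orders", "dst": "payments",
		"calls": 42, "avgLatencyMs": 87.5, "errors": 1,
	})
	if err != nil {
		log.Fatal(err)
	}
}
```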
## Typical Use Cases

- **Real-Time Monitoring**
  Continuously track call frequency, average and 95th-percentile latency, and error rates, and feed the data into dashboards or alerts.
- **Distributed Tracing**
  Reconstruct full call flows from `trace_id` and `span_id`, enabling quick identification of bottlenecks or anomalies.
- **Architecture Insights**
  Maintain an always-fresh overview of how services interact at runtime. Detect cyclical dependencies, newly introduced services, or changes in network topology.
## Pitfalls and Recommendations

- **High-Volume Logging**
  - With many services, log volume can be massive. Consider partitioning Kafka topics and scaling the Log Processor horizontally.
- **Indexing**
  - Ensure critical columns (e.g. `trace_id`, `span_id`) are properly indexed in your database to prevent query bottlenecks (an index sketch follows this list).
- **Partial Traces**
  - Logs may arrive out of order or may be missing entirely. Handle incomplete data gracefully (e.g., partial correlation, skipping incomplete spans).
- **Fault Tolerance**
  - Use replication for Kafka and the database.
  - The Log Processor itself should be stateless, allowing easy redeployment or autoscaling.
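For example, with the PostgreSQL-style layout sketched earlier, the relevant indexes could be created alongside the tables; names and columns are assumptions:

```go
package logprocessor

// Illustrative index DDL for the hot lookup paths; table and column
// names follow the earlier schema sketch and are assumptions.
const indexDDL = `
CREATE INDEX IF NOT EXISTS idx_raw_logs_trace_id  ON raw_logs (trace_id);
CREATE INDEX IF NOT EXISTS idx_raw_logs_span_id   ON raw_logs (span_id);
CREATE INDEX IF NOT EXISTS idx_raw_logs_logged_at ON raw_logs (logged_at);
`
```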
## Extensibility

- **Custom Enrichment**
  Add domain-specific metadata (e.g., a user token or geo-location) to each structured log entry (a small hook sketch follows this list).
- **Alternative Storage**
  Replace or add other data stores (e.g. Elasticsearch, InfluxDB) for specialized indexing or analytics demands.
- **Integration with Analytics**
  Pipe your structured logs into ML pipelines or anomaly detection services for advanced performance insights.
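One lightweight way to support custom enrichment is a hook that runs before a structured entry is persisted; the hook shape below is an illustrative assumption, not an existing API:

```go
package logprocessor

// Enricher mutates the metadata attached to a structured log entry before
// it is written to structured_logs. The signature is illustrative.
type Enricher func(traceID string, metadata map[string]string)

// WithEnvironment is a tiny example hook that tags every trace with the
// deployment environment (e.g. "staging", "production").
func WithEnvironment(env string) Enricher {
	return func(traceID string, metadata map[string]string) {
		metadata["environment"] = env
	}
}
```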
## Contributing

- Fork the repository and create a feature branch.
- Implement and test changes or bug fixes.
- Open a Pull Request describing your changes.