Add kmsg receiver for Linux kernel log messages by neeme-praks-sympower · Pull Request #309 · streamfold/rotel

neeme-praks-sympower · 2026-02-17T11:04:15Z

Summary

Add a new kmsg receiver for collecting Linux kernel log messages from /dev/kmsg. This enables rotel to ingest kernel-level diagnostics and forward them as OpenTelemetry logs.

Features

Kernel message ingestion - Read messages from /dev/kmsg using async I/O
Priority filtering - Filter messages by syslog priority level (0-7)
Batch processing - Configurable batch size and timeout for efficient processing
Ring buffer overflow handling - Graceful recovery when kernel buffer overflows
Continuation message tracking - Mark multi-part kernel messages with kmsg.continuation attribute
OTLP conversion - Convert kmsg records to OpenTelemetry log format with proper severity mapping
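For readers unfamiliar with the record layout, here is a minimal sketch of parsing one /dev/kmsg record and applying the priority filter. It assumes the kernel's documented header of `priority,sequence,timestamp_us,flags;message`, where the priority field packs facility and level together. The `KmsgRecord` type and both function names are hypothetical and not taken from this PR.

```rust
/// One parsed /dev/kmsg record (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
pub struct KmsgRecord {
    pub level: u8,          // syslog level, 0 (emergency) ..= 7 (debug)
    pub facility: u32,      // syslog facility, taken from the upper bits
    pub sequence: u64,      // kernel-assigned sequence number
    pub timestamp_us: u64,  // microseconds since boot
    pub continuation: bool, // true for the 'c' and '+' flags
    pub message: String,
}

/// Parse a record such as "6,1234,5678901,-;usb 1-1: new device".
/// Returns None when the header is malformed.
pub fn parse_record(raw: &str) -> Option<KmsgRecord> {
    let (header, body) = raw.split_once(';')?;
    let mut fields = header.split(',');
    let prio: u32 = fields.next()?.parse().ok()?;
    let sequence: u64 = fields.next()?.parse().ok()?;
    let timestamp_us: u64 = fields.next()?.parse().ok()?;
    let flags = fields.next()?;
    // The first body line is the message text; any following lines carry
    // key=value metadata, which this sketch ignores.
    let message = body.lines().next().unwrap_or("").to_string();
    Some(KmsgRecord {
        level: (prio & 7) as u8,
        facility: prio >> 3,
        sequence,
        timestamp_us,
        continuation: flags.starts_with('c') || flags.starts_with('+'),
        message,
    })
}

/// Keep records whose level is at or below the configured priority_level
/// (lower numbers are more severe).
pub fn passes_filter(record: &KmsgRecord, priority_level: u8) -> bool {
    record.level <= priority_level
}
```

With `priority_level: 6`, a record at level 6 (info) passes and a level 7 (debug) record is dropped.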

Configuration

receivers:
  kmsg:
    priority_level: 6        # Include messages up to INFO level
    read_existing: false     # Start from current position (not historical)
    batch_size: 100
    batch_timeout_ms: 1000

Part of #299

neeme-praks-sympower · 2026-02-17T15:06:30Z

I'll fix conflicts once #310 is merged (otherwise tests are flaky).

neeme-praks-sympower · 2026-02-17T16:09:03Z

I'll fix conflicts once #310 is merged (otherwise tests are flaky).

Done

mheffner

This looks like a great addition to me, thanks for adding this.

At some point would we want the ability to save the current offset so that rotel would begin reading at the same location in kmsg after a restart of rotel? This could happen similarly to the file receiver's persistence. However, I'm guessing on an IoT device restart there's a chance that the persistence file location is wiped.

src/receivers/kmsg/receiver.rs

src/receivers/kmsg/convert.rs

neeme-praks-sympower · 2026-02-23T15:35:31Z

At some point would we want the ability to save the current offset so that rotel would begin reading at the same location in kmsg after a restart of rotel? This could happen similarly to the file receiver's persistence. However, I'm guessing on an IoT device restart there's a chance that the persistence file location is wiped.

Excellent idea, added in 36d2084

rjenkins · 2026-02-23T17:25:29Z

At some point would we want the ability to save the current offset so that rotel would begin reading at the same location in kmsg after a restart of rotel? This could happen similarly to the file receiver's persistence. However, I'm guessing on an IoT device restart there's a chance that the persistence file location is wiped.

Excellent idea, added in 36d2084

@neeme-praks-sympower, nice addition. Not a blocker for now, but in the future you might want to leverage the end-to-end message acknowledgement feature to "track" the pending offsets at the receiver and acknowledge them once they've actually been exported. That way you'll get at-least-once semantics. Right now if you crash, you might update your offset before the messages are published, and on restart you won't re-read them, so you'll lose some messages: essentially at-most-once delivery. Check out the Kafka receiver or File receiver for examples of how to wire in offset Metadata, handle message acknowledgement from the exporter, and track offset acknowledgements to drive offset state persistence.

mheffner

Looks good, thanks for taking a look at the feedback!

One small thing: I removed the integration-tests feature flag and moved it behind an env var to decouple it from build features, so I think I broke the merge. You can see the change in this commit (7a72d56); could you model a similar KMSG_INTEGRATION_TESTS env var? Besides that we should be good to go.

Add a new receiver that reads kernel messages from /dev/kmsg and converts them to OpenTelemetry logs. This enables collection of kernel-level diagnostics on Linux systems.

Features:
- Async I/O using AsyncFd for efficient event-based reading
- Priority filtering by syslog level (0-7)
- Configurable batching (size and timeout)
- Ring buffer overflow recovery (EPIPE handling)
- Continuation message tracking via kmsg.continuation attribute
- Proper severity mapping to OpenTelemetry levels
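As an illustration of the severity mapping mentioned in this commit, the sketch below maps syslog levels to OpenTelemetry severity numbers. It follows the syslog example mapping in the OpenTelemetry logs data model as I understand it; the mapping actually used in convert.rs may differ, and the function name is hypothetical.

```rust
/// Map a syslog level (0..=7) to an OpenTelemetry (SeverityNumber, short name)
/// pair. Assumption: this mirrors the syslog example in the OpenTelemetry logs
/// data model, not necessarily the mapping implemented in this PR.
pub fn syslog_to_otel_severity(level: u8) -> (i32, &'static str) {
    match level {
        0 => (21, "FATAL"),  // emergency
        1 => (19, "ERROR3"), // alert
        2 => (18, "ERROR2"), // critical
        3 => (17, "ERROR"),  // error
        4 => (13, "WARN"),   // warning
        5 => (10, "INFO2"),  // notice
        6 => (9, "INFO"),    // informational
        _ => (5, "DEBUG"),   // debug, plus anything out of range
    }
}
```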

Implement offset persistence for the kmsg receiver to resume reading from where it left off after a Rotel restart. This prevents duplicate processing of kernel messages during normal service restarts while correctly handling system reboots.

Key features:
- Persist last-read sequence number to a JSON file using atomic writes
- Use Linux boot_id to detect system reboots and invalidate stale state
- Configurable checkpoint interval (default 5s, minimum 100ms)
- Directory fsync on shutdown for durability
- Automatic resume on restart: reads from ring buffer start and skips already-processed messages based on persisted sequence
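The persistence scheme in this commit can be sketched with the standard library alone: write the state to a temporary file, fsync it, then rename it over the real path so readers never see a half-written file. The JSON field names, the `.tmp` suffix, and every function name below are assumptions for illustration; the PR's actual file layout is not shown in this thread.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Render the persisted state as JSON. The boot_id is a UUID string,
/// so this sketch does not bother with JSON escaping.
pub fn render_state(boot_id: &str, sequence: u64) -> String {
    format!("{{\"boot_id\":\"{}\",\"sequence\":{}}}", boot_id, sequence)
}

/// Write to a temporary sibling file, flush it to disk, then rename it into
/// place. A rename within one filesystem replaces the target atomically.
pub fn write_state_atomically(path: &Path, boot_id: &str, sequence: u64) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(render_state(boot_id, sequence).as_bytes())?;
        file.sync_all()?;
    }
    fs::rename(&tmp, path)
}

/// fsync the containing directory so the rename itself is durable
/// (on Linux a directory can be opened read-only and synced).
pub fn sync_parent_dir(path: &Path) -> io::Result<()> {
    match path.parent() {
        Some(dir) if !dir.as_os_str().is_empty() => File::open(dir)?.sync_all(),
        _ => Ok(()),
    }
}

/// Read the current boot id; it changes on every system boot.
pub fn current_boot_id() -> io::Result<String> {
    Ok(fs::read_to_string("/proc/sys/kernel/random/boot_id")?.trim().to_string())
}

/// Persisted state is only trusted when it was written during this boot.
pub fn is_state_valid(persisted_boot_id: &str, current_boot_id: &str) -> bool {
    persisted_boot_id.trim() == current_boot_id.trim()
}
```

Comparing boot ids matters because kernel sequence numbers restart after a reboot, so a sequence persisted during an earlier boot says nothing about the current ring buffer.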

Prevent silent infinite retry loops when /dev/kmsg becomes inaccessible after a successful start. Track consecutive read error duration and exit after a configurable threshold (default 60 seconds).

Changes:
- Add max_read_error_duration config option (default 60s)
- Exit immediately with friendly message for permission denied (EACCES), suggesting CAP_SYSLOG capability
- Exit immediately with friendly message for device not found (ENOENT), noting container/chroot environments
- For other errors: track duration, warn while within threshold, exit when threshold exceeded
- Log debug message when reads recover after previous failures
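The error policy described in this commit might be modeled along these lines. This is a sketch under assumptions: the type names, the messages, and the choice to pass `now` in explicitly (which keeps the logic testable) are mine, not the PR's.

```rust
use std::io::{Error, ErrorKind};
use std::time::{Duration, Instant};

/// What the read loop should do after a failed read (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum ReadErrorAction {
    ExitNow(&'static str), // unrecoverable: stop with a friendly message
    Retry,                 // still within the allowed error window
    GiveUp,                // errors have persisted past the threshold
}

pub struct ReadErrorTracker {
    first_error_at: Option<Instant>,
    max_duration: Duration,
}

impl ReadErrorTracker {
    pub fn new(max_duration: Duration) -> Self {
        Self { first_error_at: None, max_duration }
    }

    /// Classify a read error. EACCES surfaces as PermissionDenied and ENOENT
    /// as NotFound in std::io.
    pub fn on_error(&mut self, err: &Error, now: Instant) -> ReadErrorAction {
        match err.kind() {
            ErrorKind::PermissionDenied => ReadErrorAction::ExitNow(
                "permission denied reading /dev/kmsg; CAP_SYSLOG may be required",
            ),
            ErrorKind::NotFound => ReadErrorAction::ExitNow(
                "/dev/kmsg not found; it may be hidden in a container or chroot",
            ),
            _ => {
                // Remember when the current run of failures started.
                let started = *self.first_error_at.get_or_insert(now);
                if now.duration_since(started) >= self.max_duration {
                    ReadErrorAction::GiveUp
                } else {
                    ReadErrorAction::Retry
                }
            }
        }
    }

    /// Call after a successful read. Returns true when this read ended a run
    /// of failures, which is where a recovery message could be logged.
    pub fn on_success(&mut self) -> bool {
        self.first_error_at.take().is_some()
    }
}
```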

Replace conditional_wait with poll_pending to prevent dropping in-flight batches when cancellation wins the select! race. The old conditional_wait used .take() which removed the future from the Option before awaiting. If cancellation fired first, the pending send future was dropped and the batch lost. The new poll_pending polls the future in place without taking ownership. If cancellation wins, the future remains in the Option and shutdown code can complete it via complete_pending_send().

Move boot time calculation from convert_to_otlp_logs (called per batch) to BatchSender initialization (called once at receiver start). This eliminates repeated /proc/uptime reads during normal operation. The convert_to_otlp_logs function now accepts boot_time_ns as a parameter, allowing the caller to control when and how boot time is obtained.
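To make the arithmetic concrete, here is a small sketch of deriving boot time once and reusing it per record. It assumes boot time is computed as the current wall-clock time minus the first field of /proc/uptime, and that record timestamps are microseconds since boot. The function names are hypothetical; only the `boot_time_ns` parameter name comes from the commit message.

```rust
/// Parse the first field of /proc/uptime (seconds since boot),
/// e.g. "350735.47 234388.90".
pub fn parse_uptime_secs(proc_uptime: &str) -> Option<f64> {
    proc_uptime.split_whitespace().next()?.parse().ok()
}

/// Boot time in nanoseconds since the Unix epoch: now minus uptime.
/// Computed once at receiver start in this sketch.
pub fn boot_time_ns(now_unix_ns: u64, uptime_secs: f64) -> u64 {
    now_unix_ns.saturating_sub((uptime_secs * 1e9) as u64)
}

/// Absolute timestamp for one record: boot time plus its offset since boot.
pub fn record_time_ns(boot_time_ns: u64, timestamp_us: u64) -> u64 {
    boot_time_ns.saturating_add(timestamp_us.saturating_mul(1_000))
}
```

Passing `boot_time_ns` into the conversion function also makes it easy to test with a fixed value instead of reading /proc/uptime.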

Previously, the kmsg receiver checkpointed offsets immediately after reading messages, which could cause data loss if the process crashed before messages were exported. This change defers offset persistence until downstream exporters acknowledge successful export.

Key changes:
- Add KmsgOffsetTracker to track in-flight sequences using a BTreeSet, enabling O(1) lookup of the lowest pending sequence for safe checkpoint calculation
- Add KmsgOffsetCommitter as a separate task that processes acks and persists offsets, surviving the main receiver loop to drain remaining acks on shutdown
- Attach KmsgMetadata with sequence numbers and ack channel to outgoing messages
- Change resume logic to skip sequences < persisted (was <=) to match new semantics where persisted sequence is the one to resume from

The persistable sequence is now the lowest pending sequence (if any are in-flight) or hwm+1 (if all acknowledged), ensuring we never checkpoint past unacknowledged messages.
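The checkpoint rule in this commit can be sketched as follows. This is not the PR's KmsgOffsetTracker; the `OffsetTracker` type and its method names are hypothetical, and only the rule itself (lowest pending sequence, else hwm+1, and skip `< persisted` on resume) is taken from the commit message.

```rust
use std::collections::BTreeSet;

/// Tracks sequences that were sent downstream but not yet acknowledged
/// (hypothetical type, for illustration only).
#[derive(Default)]
pub struct OffsetTracker {
    pending: BTreeSet<u64>,
    high_water_mark: Option<u64>,
}

impl OffsetTracker {
    /// Record a sequence as in-flight when it is handed to the pipeline.
    pub fn track(&mut self, seq: u64) {
        self.pending.insert(seq);
        self.high_water_mark = Some(self.high_water_mark.map_or(seq, |h| h.max(seq)));
    }

    /// Mark a sequence as exported once the exporter acknowledges it.
    pub fn ack(&mut self, seq: u64) {
        self.pending.remove(&seq);
    }

    /// The sequence that is safe to persist: the lowest in-flight sequence,
    /// or hwm + 1 when everything has been acknowledged. BTreeSet::first
    /// returns the smallest element without scanning the set.
    pub fn persistable(&self) -> Option<u64> {
        match self.pending.first() {
            Some(lowest) => Some(*lowest),
            None => self.high_water_mark.map(|h| h + 1),
        }
    }
}

/// On resume, the persisted sequence is the one to start from,
/// so only strictly lower sequences are skipped.
pub fn should_skip(seq: u64, persisted: u64) -> bool {
    seq < persisted
}
```

For example, with sequences 10, 11, and 12 in flight, acknowledging 11 alone leaves the persistable sequence at 10, so a crash at that point re-reads 10 onward rather than losing it.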

Replace the integration-tests feature flag with an environment variable for kmsg integration tests, matching the pattern used for Kafka tests.

Changes:
- Add KMSG_INTEGRATION_TESTS env var handling in build.rs
- Update test file to use cfg(kmsg_integration_tests = "true")
- Pass env var to Docker container in helper script
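A rough sketch of how a build script could turn the env var into the cfg used by the tests. The accepted values ("true" or "1") and the helper name are assumptions; the actual build.rs in the repository is not shown in this thread.

```rust
/// Build the Cargo directives a build script would print for a given value of
/// KMSG_INTEGRATION_TESTS (hypothetical helper; accepted values are assumed).
pub fn cfg_directives(env_value: Option<&str>) -> Vec<String> {
    // Ask Cargo to re-run the build script whenever the variable changes.
    let mut out = vec!["cargo:rerun-if-env-changed=KMSG_INTEGRATION_TESTS".to_string()];
    if matches!(env_value, Some("true") | Some("1")) {
        // Enables #[cfg(kmsg_integration_tests = "true")] in the test file.
        out.push("cargo:rustc-cfg=kmsg_integration_tests=\"true\"".to_string());
    }
    out
}
```

In a real build.rs, `main` would read the variable with `std::env::var("KMSG_INTEGRATION_TESTS").ok()` and print each returned line with `println!`.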

neeme-praks-sympower · 2026-02-24T12:04:38Z

You might want to leverage the end-to-end message acknowledgement feature to "track" the pending offsets at the receiver and acknowledge them once they've actually been exported. That way you'll get at-least-once semantics. Right now if you crash, you might update your offset before the messages are published, and on restart you won't re-read them, so you'll lose some messages: essentially at-most-once delivery. Check out the Kafka receiver or File receiver for examples of how to wire in offset Metadata, handle message acknowledgement from the exporter, and track offset acknowledgements to drive offset state persistence.

Even better, added in a4ee9d1.

neeme-praks-sympower · 2026-02-24T12:06:43Z

One small thing: I removed the integration-tests feature flag and moved it behind an env var to decouple it from build features, so I think I broke the merge. You can see the change in this commit (7a72d56); could you model a similar KMSG_INTEGRATION_TESTS env var? Besides that we should be good to go.

Applied a similar change in bb101a8. I rebased the entire branch on top of latest changes from upstream main (so the commit you referenced would be visible in this branch).

mheffner

Integration test flagging looks good from my side. @rjenkins will have a better handle on the offset tracking portion

rjenkins

👍 to merge.

neeme-praks-sympower force-pushed the feature/kmsg-receiver branch from ec31ead to a12db1a Compare February 17, 2026 11:11

neeme-praks-sympower force-pushed the feature/kmsg-receiver branch from a12db1a to 53c17c9 Compare February 17, 2026 15:31

mheffner approved these changes Feb 18, 2026

mheffner requested a review from rjenkins February 18, 2026 15:58

rjenkins reviewed Feb 19, 2026

rjenkins approved these changes Feb 23, 2026

mheffner approved these changes Feb 23, 2026

neeme-praks-sympower added 8 commits February 24, 2026 13:46

Fix formatting

9944215

neeme-praks-sympower force-pushed the feature/kmsg-receiver branch from a95c584 to bb101a8 Compare February 24, 2026 12:03

mheffner approved these changes Feb 24, 2026

rjenkins approved these changes Feb 25, 2026

mheffner merged commit 40daa06 into streamfold:main Feb 25, 2026
6 checks passed

neeme-praks-sympower deleted the feature/kmsg-receiver branch February 26, 2026 06:39
