Description
Component(s)
receiver/filelog
What happened?
Description
SAP audit log files are UTF-16LE encoded, continuously appended with new logs, and made up of fixed-length records with no line termination. This should be supportable using the filelog receiver's encoding and multiline settings, but I've hit a reproducible bug and found a workaround.
Steps to Reproduce
- Create a utf-16le file named auditlog.txt containing 10 SAP audit log records in the format:
2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29 2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29 2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29 2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29 2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29 2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29 2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29 2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29 2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29 2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29
- Configure the filelog receiver as follows:
receivers:
  filelog/sap:
    include: [ auditlog.txt ]
    encoding: utf-16le
    multiline:
      line_start_pattern: '([23])[A-Z][A-Z][A-Z0-9]\d{14}00'
    preserve_trailing_whitespaces: true
    start_at: beginning
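For anyone reproducing this, the test file from the first step could be generated with a short script along these lines. This is only a sketch: the sample record is copied from above, but the right-padding with spaces to 200 characters is an assumption (the exact padding of real SAP records is not shown in this report), and the names `SAMPLE`, `build_audit_log` and `write_audit_log` are made up for illustration.

```python
# Sketch: build a UTF-16LE file of fixed-length records with no line terminators.
# Assumption: each record is right-padded with spaces to 200 characters.
SAMPLE = (
    "2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 "
    "0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29"
)
RECORD_LENGTH = 200


def build_audit_log(count=10):
    """Return `count` padded records encoded as UTF-16LE (no BOM, no newlines)."""
    record = SAMPLE.ljust(RECORD_LENGTH)
    assert len(record) == RECORD_LENGTH
    return (record * count).encode("utf-16-le")


def write_audit_log(path="auditlog.txt", count=10):
    """Write the encoded records to `path` in binary mode."""
    with open(path, "wb") as handle:
        handle.write(build_audit_log(count))
```

Calling `write_audit_log()` writes a 4000-byte `auditlog.txt` (10 records x 200 characters x 2 bytes) into the current directory.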
Expected Result
10 log events
Actual Result
1 log event
Workaround
I suspected that the multiline processing is not honouring the file encoding, so the pattern fails to match. To test this theory, I adjusted the multiline pattern to match only the first byte of each 16-bit character, with a wildcard standing in for the second byte:
receivers:
  filelog/sap:
    include: [ auditlog.txt ]
    encoding: utf-16le
    multiline:
      line_start_pattern: '([23]).[A-Z].[A-Z].[A-Z0-9].(\d.){14}0.0.'
    preserve_trailing_whitespaces: true
    start_at: beginning
This configuration outputs 10 log records, each containing a complete 200-character record.
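The theory can also be illustrated outside the collector. The sketch below uses Python's `re` module, not the Go regexp engine the receiver uses, so it only demonstrates the byte-level mismatch and says nothing about the receiver's internals. The 200-character space padding of the record is again an assumption, and `count_matches` is a hypothetical helper.

```python
import re

# Assumption: sample record from this report, right-padded with spaces to 200 chars.
RECORD = (
    "2AUK20250227000000002316500018D110.102.8BATCH_ALRI SAPMSSY1 "
    "0501Z91_VALR_IF&&Z91_VAL_PLSTATUS 10.122.81.29"
).ljust(200)
text = RECORD * 10              # what the pattern would see after decoding
raw = text.encode("utf-16-le")  # what it would see if applied to raw bytes

ORIGINAL = r"([23])[A-Z][A-Z][A-Z0-9]\d{14}00"
WORKAROUND = r"([23]).[A-Z].[A-Z].[A-Z0-9].(\d.){14}0.0."


def count_matches(pattern, data):
    """Count non-overlapping matches; use a bytes pattern for bytes input."""
    if isinstance(data, bytes):
        pattern = pattern.encode("ascii")
    return sum(1 for _ in re.finditer(pattern, data))


decoded_hits = count_matches(ORIGINAL, text)    # pattern on decoded text
raw_hits = count_matches(ORIGINAL, raw)         # pattern on undecoded bytes
workaround_hits = count_matches(WORKAROUND, raw)
print(decoded_hits, raw_hits, workaround_hits)  # prints: 10 0 10
```

The original pattern finds all 10 record starts in the decoded text but none in the raw UTF-16LE bytes, because every ASCII character is followed by a 0x00 byte. The workaround pattern finds all 10 in the raw bytes, which is consistent with the behaviour described above.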
Collector version
v0.122.0
Environment information
Environment
OS: macOS 15.3.2
OpenTelemetry Collector configuration
receivers:
  filelog/sap:
    include: [ auditlog.txt ]
    encoding: utf-16le
    multiline:
      line_start_pattern: '([23])[A-Z][A-Z][A-Z0-9]\d{14}00'
    preserve_trailing_whitespaces: true
    start_at: beginning
exporters:
  file/debug:
    path: debug.json
service:
  pipelines:
    logs:
      receivers:
        - filelog/sap
      exporters:
        - file/debug
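To compare the expected and actual event counts, the records written to debug.json can be counted with a small script. This sketch assumes the file exporter writes one OTLP JSON object per line with the usual `resourceLogs` / `scopeLogs` / `logRecords` nesting; `count_log_records` is a hypothetical helper, not part of the collector.

```python
import json


def count_log_records(lines):
    """Sum logRecords across OTLP JSON lines (assumed file exporter output)."""
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        payload = json.loads(line)
        for resource_logs in payload.get("resourceLogs", []):
            for scope_logs in resource_logs.get("scopeLogs", []):
                total += len(scope_logs.get("logRecords", []))
    return total
```

For example, `count_log_records(open("debug.json", encoding="utf-8"))` should return 10 once the records are split correctly.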
Log output
Additional context
No response