
decode_opentelemetry: guard NULL fqname and label name in compute_metric_hash #265

Merged
edsiper merged 2 commits into fluent:master from enoquefcd:fix-otel-decode-null-deref
May 4, 2026

Contributor

@enoquefcd enoquefcd commented Apr 30, 2026

compute_metric_hash dereferences map->opts->fqname and label_value->name without NULL checks. Its peer get_or_create_metric_metadata_context, which the same caller invokes immediately afterwards, already guards map->opts and map->opts->fqname. The unguarded path is reachable from real OTLP/HTTP histogram payloads: cfl_sds_len(NULL) segfaults the worker.

Reported and reproduced via fluent/fluent-bit#11764 with Quarkus 3.32.x clients pushing histograms over OTLP/HTTP. Stack trace:

[engine] caught signal (SIGSEGV)
#0  get_or_create_data_point_metadata_context() at src/cmt_decode_opentelemetry.c:331
#1  decode_histogram_data_point()                at src/cmt_decode_opentelemetry.c:1021
#2  decode_histogram_data_point_list()           at src/cmt_decode_opentelemetry.c:1061
#3  decode_histogram_entry()                     at src/cmt_decode_opentelemetry.c:1158
#4  decode_metrics_entry()                       at src/cmt_decode_opentelemetry.c:1545
#5  decode_scope_metrics_entry()                 at src/cmt_decode_opentelemetry.c:1763
#6  decode_resource_metrics_entry()              at src/cmt_decode_opentelemetry.c:1871
#7  decode_service_request()                     at src/cmt_decode_opentelemetry.c:1931
#8  cmt_decode_opentelemetry_create()            at src/cmt_decode_opentelemetry.c:1954

Root cause

decode_data_point_labels falls into its else branch for unrecognised AnyValue.value_case values (e.g. NOT_SET = 0) and passes NULL to append_new_metric_label_value. create_label(NULL, …) leaves label->name as NULL, which crashes any encoder calling cfl_sds_len(label->name) — not only compute_metric_hash but also cmt_encode_text and cmt_encode_splunk_hec.
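The fallback described above can be illustrated with a minimal, self-contained sketch. It does not use the cmetrics or protobuf-c types: `sketch_value_case`, `sketch_any_value`, and `sketch_label_text` are hypothetical names invented for this example, and the real decoder handles more value cases than the single string case shown here. The point is only that the unrecognised branch yields an empty string rather than NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the AnyValue oneof discriminator. */
enum sketch_value_case {
    SKETCH_VALUE_NOT_SET = 0,
    SKETCH_VALUE_STRING  = 1
};

/* Hypothetical stand-in for a decoded attribute value. */
struct sketch_any_value {
    enum sketch_value_case value_case;
    const char *string_value;   /* only meaningful for SKETCH_VALUE_STRING */
};

/*
 * Returns the text to store for a label. Never returns NULL:
 * unset or unrecognised values map to "" so that later length
 * or hash computations always receive a valid string.
 */
static const char *sketch_label_text(const struct sketch_any_value *value)
{
    if (value != NULL &&
        value->value_case == SKETCH_VALUE_STRING &&
        value->string_value != NULL) {
        return value->string_value;
    }

    return "";
}
```

With this shape, a caller that later computes `strlen(sketch_label_text(&v))` gets 0 for a NOT_SET value instead of dereferencing NULL.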

Changes

  1. decode_data_point_labels: store "" (empty string) in the unrecognised-value fallback, consistent with STRING_VALUE_STRINDEX. Prevents NULL from entering metric->labels.

  2. compute_metric_hash: extend the early-return guard to cover map->opts == NULL || map->opts->fqname == NULL (matching get_or_create_metric_metadata_context), and skip NULL label names in the hash loop — defence in depth.
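The guard in change 2 can be sketched as follows. This is an illustration under stated assumptions, not the code from the diff: the struct layouts, the `sketch_` names, the -1 return value, and the use of FNV-1a as the hash are all hypothetical placeholders, since the actual hash function and return convention of compute_metric_hash are not shown in this PR description.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the map and its options. */
struct sketch_opts { const char *fqname; };
struct sketch_map  { struct sketch_opts *opts; };

/* FNV-1a (64-bit), used only as a placeholder hash for this sketch. */
static uint64_t sketch_fnv1a(uint64_t hash, const char *data, size_t length)
{
    size_t index;

    for (index = 0; index < length; index++) {
        hash ^= (unsigned char) data[index];
        hash *= 1099511628211ULL;           /* FNV 64-bit prime */
    }

    return hash;
}

/*
 * Hashes the fully qualified name plus every non-NULL label name.
 * Returns 0 on success, or -1 when the map, its opts, or fqname is missing.
 */
static int sketch_compute_hash(const struct sketch_map *map,
                               const char *const *label_names,
                               size_t label_count,
                               uint64_t *out_hash)
{
    uint64_t hash;
    size_t index;

    /* Early return: check every pointer before it is dereferenced. */
    if (map == NULL || map->opts == NULL || map->opts->fqname == NULL ||
        out_hash == NULL || (label_names == NULL && label_count > 0)) {
        return -1;
    }

    hash = 14695981039346656037ULL;         /* FNV 64-bit offset basis */
    hash = sketch_fnv1a(hash, map->opts->fqname, strlen(map->opts->fqname));

    for (index = 0; index < label_count; index++) {
        if (label_names[index] == NULL) {
            continue;                       /* defence in depth: skip NULL names */
        }
        hash = sketch_fnv1a(hash, label_names[index], strlen(label_names[index]));
    }

    *out_hash = hash;

    return 0;
}
```

One consequence of skipping NULL entries is that a label list containing a NULL hashes the same as the list without it. That is acceptable for a defensive fallback, because change 1 is meant to keep NULL out of the list in the first place.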

Regression test

tests/data/otlp_null_label_histogram.bin — a serialised ExportMetricsServiceRequest with one Histogram data point whose sole attribute has AnyValue.value_case = NOT_SET. Without this fix the test exits SIGSEGV (signal 11); with it, CMT_DECODE_OPENTELEMETRY_SUCCESS.

ctest --test-dir build -R cmt-test-opentelemetry --output-on-failure
# or
./build/tests/cmt-test-opentelemetry opentelemetry_histogram_null_label_no_crash

…d compute_metric_hash

decode_data_point_labels falls into its else branch for unrecognised
AnyValue.value_case values (e.g. NOT_SET = 0) and passes NULL to
append_new_metric_label_value, leaving label->name as NULL.  Any
encoder calling cfl_sds_len(label->name) -- including cmt_encode_text
and cmt_encode_splunk_hec -- then crashes.

Fix: store "" (empty string) in the unrecognised-value fallback,
consistent with STRING_VALUE_STRINDEX.  Also extend the early-return
guard in compute_metric_hash to cover map->opts == NULL and
map->opts->fqname == NULL (matching get_or_create_metric_metadata_context),
and skip NULL label names in the hash loop, for defence in depth.

Signed-off-by: Enoque Duarte <enoquefcd@gmail.com>
…Value

Add otlp_null_label_histogram.bin -- a serialised
ExportMetricsServiceRequest with one Histogram data point whose sole
attribute has AnyValue.value_case = NOT_SET.  The test
opentelemetry_histogram_null_label_no_crash verifies the decoder
returns CMT_DECODE_OPENTELEMETRY_SUCCESS rather than exiting with
SIGSEGV.

Signed-off-by: Enoque Duarte <enoquefcd@gmail.com>
@enoquefcd enoquefcd force-pushed the fix-otel-decode-null-deref branch from fd32c59 to da4b9fa Compare April 30, 2026 20:44
@edsiper edsiper merged commit 89d2654 into fluent:master May 4, 2026
63 of 71 checks passed
Member

edsiper commented May 4, 2026

thank you
