Skip to content

[Geneva Exporter]: Avoid heap reallocations by using thread-local buffers #310

Open
@lalitb

Description

@lalitb

Avoid heap reallocations by using thread-local buffers

Background

Currently, with every request processed by OtlpEncoder, we create fresh Vec buffers in hot paths such as:

  • determine_fields (for building up the schema field list)
  • write_row_data (for serializing per-log record rows)
  • temporary vectors in batch assembly

Since the encoder can be invoked at high frequency and potentially from multiple threads (e.g., via a multi-threaded runtime), this results in repeated heap allocations per request, putting additional pressure on the allocator and causing unnecessary CPU and memory churn.

Example Hot Code Path

fn determine_fields(&self, log: &LogRecord) -> Vec<FieldDef> {
    let estimated_capacity = 7 + 4 + log.attributes.len();
    let mut fields = Vec::with_capacity(estimated_capacity);  // <-- Allocates every call
    ...
}

Similarly, in write_row_data:

fn write_row_data(&self, log: &LogRecord, sorted_fields: &[FieldDef]) -> Vec<u8> {
    let mut buffer = Vec::with_capacity(sorted_fields.len() * 50); // <-- Allocates every call
    ...
}

Proposal

Introduce thread-local buffers for encoding temporary vectors.

Use Rust's std::thread_local! or std::cell::RefCell for maintaining per-thread Vec<FieldDef> and Vec<u8> buffers.

On each encode call, clear and reuse the thread-local buffer, instead of allocating a new one.

This approach is safe and effective, since each request is independent, and sharing across threads isn't needed.

Why thread-local?

Each thread maintains its own reusable buffer, so no synchronization is required.

Dramatically reduces allocator overhead for high-throughput encoding.

Suggested Implementation

Add a thread-local pool for each kind of temporary vector used in encoding (e.g., fields, row data).

Refactor code to obtain the buffer from the thread-local storage, clear it, and use it during the encode call.

For example:

thread_local! {
    static FIELD_DEFS_BUF: RefCell<Vec<FieldDef>> = RefCell::new(Vec::with_capacity(32));
    static ROW_BUF: RefCell<Vec<u8>> = RefCell::new(Vec::with_capacity(4096));
}

Usage example:

FIELD_DEFS_BUF.with(|buf| {
    let mut fields = buf.borrow_mut();
    fields.clear();
    // use fields as mutable Vec<FieldDef> instead of allocating a new one
});

Benefits

  • Reduces per-request heap allocations by reusing buffers across requests within a thread.
  • Lowers CPU usage and allocator contention under load.
  • Zero risk of cross-thread races (buffers are per-thread).
  • No external crate required; can use Rust std only.

Next Steps

  • Audit code for all per-request allocations of Vec, Vec, and other large temporaries.
  • Refactor these to use thread-local buffers as described.
  • Add a benchmark to demonstrate reduction in allocation count and performance win.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions