Description:
I am experiencing a memory "leak" (unbounded growth) when using snmalloc as the global allocator in a Rust-based query server. The application uses DataFusion and delta-rs to query object storage.
The Issue:
Memory is not released back to the OS (or reused effectively) once a REST request completes, so RSS climbs steadily until the process is OOM-killed.
With jemalloc: Memory is reclaimed/recycled correctly.
With snmalloc: Memory usage climbs linearly until the process crashes.
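To make the comparison above measurable, here is a minimal sketch of how RSS could be sampled around a unit of work. It assumes Linux and reads the `VmRSS` field of `/proc/self/status`. The helper names `parse_vm_rss_kib` and `current_rss_kib` are hypothetical and not part of the server, and the 64 MiB buffer is only a stand-in for a request's Arrow allocations.

```rust
use std::fs;
use std::hint::black_box;

/// Extract the `VmRSS` value (in KiB) from the text of `/proc/<pid>/status`.
/// Hypothetical helper, for illustration only.
fn parse_vm_rss_kib(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|value| value.parse().ok())
}

/// Read the current process's resident set size in KiB.
/// Returns `None` where `/proc/self/status` is unavailable (non-Linux).
fn current_rss_kib() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    parse_vm_rss_kib(&status)
}

fn main() {
    let before = current_rss_kib();

    // Stand-in for one request: a large, short-lived buffer.
    // Filling it with non-zero bytes forces the pages to become resident.
    let buffer = vec![1u8; 64 * 1024 * 1024];
    black_box(&buffer);
    let during = current_rss_kib();

    drop(buffer);
    let after = current_rss_kib();

    println!("RSS KiB before={before:?} during={during:?} after={after:?}");
}
```

Logging the same three samples per request under each allocator would show whether the "after" value returns toward the "before" value or keeps ratcheting upward.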
Environment:
OS: Linux
snmalloc-rs version: 0.3.8
Relevant Crates: DataFusion, delta-rs (which relies heavily on arrow and FFI).
Runtime: Tokio 1.48
Observations:
Interestingly, in the snmalloc trace, the OOM occurs well before the "Limits" (32 GB) defined in our monitoring. The allocator seems to struggle with the specific allocation patterns of DataFusion's execution plan (large buffers for Arrow record batches).
Request:
Are there specific configurations for snmalloc or known issues with the way it interacts with large, short-lived Arrow buffers that might prevent timely deallocation?
Reference links:
delta-io/delta-rs#4241 (comment)