Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 25 additions & 1 deletion datafusion/execution/src/memory_pool/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -68,11 +68,35 @@ pub use pool::*;
/// Note that a `MemoryPool` can be shared by concurrently executing plans,
/// which can be used to control memory usage in a multi-tenant system.
///
/// # How MemoryPool works by example
///
/// Scenario 1:
/// For `Filter` operator, `RecordBatch`es will stream through it, so it
/// don't have to keep track of memory usage through [`MemoryPool`].
///
/// Scenario 2:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

/// For `CrossJoin` operator, if the input size gets larger, the intermediate
/// state will also grow. So `CrossJoin` operator will use [`MemoryPool`] to
/// limit the memory usage.
/// 2.1 `CrossJoin` operator has read a new batch, asked memory pool for
/// additional memory. Memory pool updates the usage and returns success.
/// 2.2 `CrossJoin` has read another batch, and tries to reserve more memory
/// again, memory pool does not have enough memory. Since `CrossJoin` operator
/// has not implemented spilling, it will stop execution and return an error.
///
/// Scenario 3:
/// For `Aggregate` operator, its intermediate states will also accumulate as
/// the input size gets larger, but with spilling capability. When it tries to
/// reserve more memory from the memory pool, and the memory pool has already
/// reached the memory limit, it will return an error. Then, `Aggregate`
/// operator will spill the intermediate buffers to disk, and release memory
/// from the memory pool, and continue to retry memory reservation.
///
/// # Implementing `MemoryPool`
///
/// You can implement a custom allocation policy by implementing the
/// [`MemoryPool`] trait and configuring a `SessionContext` appropriately.
/// However, mDataFusion comes with the following simple memory pool implementations that
/// However, DataFusion comes with the following simple memory pool implementations that
/// handle many common cases:
///
/// * [`UnboundedMemoryPool`]: no memory limits (the default)
Expand Down