You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Is your feature request related to a problem or challenge?
For any embedder running multi-tenant DataFusion workloads, two operational signals are missing from the Java binding today:
Per-query memory — there is no way to ask DataFusion how many bytes a specific in-flight query is currently holding, or what its peak was. SessionContextBuilder.memoryLimit(...) (PR feat: configure SessionContext and RuntimeEnv via builder #28) lets you cap the global pool, but if a tenant blows past their fair-share allocation you can't tell whose query it is. Without this you cannot do fair-share scheduling, abuse detection, or correctly attribute OOMs.
Tokio runtime stats — the JNI library drives a single shared multi-threaded Tokio runtime in lib.rs. There is no Java-side window into how busy that runtime is right now: how many worker threads exist, how saturated they are, how deep the global queue is. Embedders that report node-level health (e.g. an OpenSearch _nodes/stats endpoint) need these numbers; today they have to hand-roll a parallel native bridge to surface them.
Both problems share the same FFI shape: snapshot a small struct of counters across the boundary on demand. They are bundled here so the design conversation only needs to happen once.
Describe the solution you'd like
Two narrow additions, both opt-in.
1. Per-query memory: MemoryUsage snapshot from a DataFrame
DataFramedf = ctx.sql(...);
ArrowReaderr = df.executeStream(allocator);
// ... while another thread drains r ...MemoryUsageusage = df.memoryUsage(); // peakBytes / currentBytes / consumerCount
The Java surface is a single new accessor on DataFrame. The Rust side wraps the session's MemoryPool with a per-query tracking adapter (modelled on TrackConsumersPool upstream) so the snapshot reports only the consumers tied to this query's MemoryConsumers, not the whole pool. The accessor is non-consuming, can be polled from any thread, and returns immutable values.
For most users the wrapper is harmless and adds no measurable overhead; it can be inserted automatically by SessionContext so callers don't have to opt in per query.
2. Tokio runtime stats: SessionContext.runtimeStats()
A small immutable POJO mirroring the subset of tokio_metrics::RuntimeMetrics that operations dashboards typically render. Snapshotted under the JNI guard; cheap (one struct copy across FFI).
Because tokio_metrics requires the --cfg tokio_unstable build flag to compile in, runtime stats are gated behind a default-off runtime-metrics Cargo feature on datafusion-jni. Default cargo build is unaffected (no new build flags, no new dep). To enable:
The Java method always exists; if invoked against a build that compiled the feature off it throws a clear RuntimeException pointing at the flag. Same shape as the just-shipped substrait feature (PR #75).
The two pieces are intentionally bundled. Per-query memory tracking and runtime stats both need the same JNI plumbing pattern (snapshot a struct of counters across FFI), the same lifecycle considerations (polling from another thread while a query runs), and the same opt-in gating story for builds that don't want the extra dependency. Splitting them would mean two near-identical reviews.
Describe alternatives you've considered
Always-on Tokio metrics, set --cfg tokio_unstable globally. Would force every contributor and every CI environment to override RUSTFLAGS. Pollutes the build env and breaks anyone who already has their own RUSTFLAGS. Rejected.
Additional context
tokio-metrics 0.5.0 is the right version against the Tokio currently used in native/Cargo.toml (tokio = "1"). It exposes RuntimeMonitor whose intervals() iterator yields RuntimeMetrics snapshots; the JNI handler grabs one snapshot per call.
Is your feature request related to a problem or challenge?
For any embedder running multi-tenant DataFusion workloads, two operational signals are missing from the Java binding today:
Per-query memory — there is no way to ask DataFusion how many bytes a specific in-flight query is currently holding, or what its peak was.
SessionContextBuilder.memoryLimit(...)(PR feat: configure SessionContext and RuntimeEnv via builder #28) lets you cap the global pool, but if a tenant blows past their fair-share allocation you can't tell whose query it is. Without this you cannot do fair-share scheduling, abuse detection, or correctly attribute OOMs.Tokio runtime stats — the JNI library drives a single shared multi-threaded Tokio runtime in
lib.rs. There is no Java-side window into how busy that runtime is right now: how many worker threads exist, how saturated they are, how deep the global queue is. Embedders that report node-level health (e.g. an OpenSearch_nodes/statsendpoint) need these numbers; today they have to hand-roll a parallel native bridge to surface them.Both problems share the same FFI shape: snapshot a small struct of counters across the boundary on demand. They are bundled here so the design conversation only needs to happen once.
Describe the solution you'd like
Two narrow additions, both opt-in.
1. Per-query memory:
MemoryUsagesnapshot from aDataFrameThe Java surface is a single new accessor on
DataFrame. The Rust side wraps the session'sMemoryPoolwith a per-query tracking adapter (modelled onTrackConsumersPoolupstream) so the snapshot reports only the consumers tied to this query'sMemoryConsumers, not the whole pool. The accessor is non-consuming, can be polled from any thread, and returns immutable values.For most users the wrapper is harmless and adds no measurable overhead; it can be inserted automatically by
SessionContextso callers don't have to opt in per query.2. Tokio runtime stats:
SessionContext.runtimeStats()A small immutable POJO mirroring the subset of
tokio_metrics::RuntimeMetricsthat operations dashboards typically render. Snapshotted under the JNI guard; cheap (one struct copy across FFI).Because
tokio_metricsrequires the--cfg tokio_unstablebuild flag to compile in, runtime stats are gated behind a default-offruntime-metricsCargo feature ondatafusion-jni. Defaultcargo buildis unaffected (no new build flags, no new dep). To enable:RUSTFLAGS="--cfg tokio_unstable" cargo build --features runtime-metricsThe Java method always exists; if invoked against a build that compiled the feature off it throws a clear
RuntimeExceptionpointing at the flag. Same shape as the just-shippedsubstraitfeature (PR #75).The two pieces are intentionally bundled. Per-query memory tracking and runtime stats both need the same JNI plumbing pattern (snapshot a struct of counters across FFI), the same lifecycle considerations (polling from another thread while a query runs), and the same opt-in gating story for builds that don't want the extra dependency. Splitting them would mean two near-identical reviews.
Describe alternatives you've considered
--cfg tokio_unstableglobally. Would force every contributor and every CI environment to overrideRUSTFLAGS. Pollutes the build env and breaks anyone who already has their own RUSTFLAGS. Rejected.Additional context
tokio-metrics 0.5.0is the right version against the Tokio currently used innative/Cargo.toml(tokio = "1"). It exposesRuntimeMonitorwhoseintervals()iterator yieldsRuntimeMetricssnapshots; the JNI handler grabs one snapshot per call.