v0.21.0 Release Summary

What Changed

v0.21.0 adds a read-side caching layer over cloud object stores and tunes the DataFusion SessionContext for higher-concurrency GCS/S3 workloads. A bug where the trace summary table skipped vacuum after compaction is fixed. An internal record type rename is propagated across all crates.

Breaking Changes

None. No schema changes, no migration required.

Changes

Object store caching layer (`CachingStore`)

A new `CachingStore<T: ObjectStore>` wrapper in `scouter_dataframe` caches `head()` responses and small `get_range()` reads (≤2 MB) from cloud object stores.

After Delta Lake Z-ORDER compaction, Parquet files are immutable: the same path always returns the same bytes. DataFusion issues repeated HEAD and footer range reads on every query. Without caching, each read is a separate cloud round-trip (~30–60 ms on GCS). `CachingStore` removes the repeated round-trips by serving those reads from an in-process `mini_moka` cache.

Cache configuration:

| Setting | Default | Env var |
|---|---|---|
| Max cache size | 64 MB | `SCOUTER_OBJECT_CACHE_MB` |
| TTL | 1 hour | - |
| Max cacheable range read | 2 MB | - |

All mutating and streaming operations (`put`, `delete`, `list`, and `get` for large ranges) pass through to the inner store uncached.
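
The read-through behavior described above can be sketched roughly as follows. This is a simplified, synchronous, std-only illustration, not the actual `CachingStore`: the `RangeStore` trait, `CountingStore`, and `CachingRangeStore` are hypothetical names invented for the example, a plain `HashMap` stands in for the `mini_moka` cache, and the TTL and total size cap are omitted.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Largest range read that is cached, mirroring the 2 MB limit in the notes.
const MAX_CACHEABLE_RANGE: usize = 2 * 1024 * 1024;

/// Hypothetical stand-in for the read side of an object store.
trait RangeStore {
    fn get_range(&mut self, path: &str, range: Range<usize>) -> Vec<u8>;
}

/// Fake remote store that counts how many round-trips it served.
struct CountingStore {
    data: Vec<u8>,
    round_trips: usize,
}

impl RangeStore for CountingStore {
    fn get_range(&mut self, _path: &str, range: Range<usize>) -> Vec<u8> {
        self.round_trips += 1;
        self.data[range].to_vec()
    }
}

/// Read-through cache keyed by (path, start, end).
/// Assumes objects are immutable, so entries never need invalidation.
struct CachingRangeStore<S: RangeStore> {
    inner: S,
    cache: HashMap<(String, usize, usize), Vec<u8>>,
}

impl<S: RangeStore> CachingRangeStore<S> {
    fn new(inner: S) -> Self {
        Self { inner, cache: HashMap::new() }
    }

    fn get_range(&mut self, path: &str, range: Range<usize>) -> Vec<u8> {
        // Large reads bypass the cache and always go to the inner store.
        if range.len() > MAX_CACHEABLE_RANGE {
            return self.inner.get_range(path, range);
        }
        let key = (path.to_string(), range.start, range.end);
        if let Some(hit) = self.cache.get(&key) {
            return hit.clone();
        }
        // Cache miss: fetch once from the inner store, then remember the bytes.
        let bytes = self.inner.get_range(path, range);
        self.cache.insert(key, bytes.clone());
        bytes
    }
}
```

In this sketch, reading the same small range twice costs one inner round-trip, while a 3 MB read costs one round-trip every time it is issued.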

DataFusion `SessionContext` tuning

The shared `SessionContext` used for trace queries now includes explicit read-path and write-path settings:

| Setting | Old | New | Why |
|---|---|---|---|
| `metadata_size_hint` | 512 KB | 1 MB | Captures bloom filter + footer + column indexes in one GCS round-trip instead of the default multi-step chain |
| `bloom_filter_on_read` | default | `true` | Activates bloom filters on `trace_id` and `entity_id` to skip non-matching row groups before decoding |
| `schema_force_view_types` | default | `true` | Zero-copy `Utf8View`/`BinaryView`; prevents DataFusion from downgrading these on read-back from Parquet |
| `meta_fetch_concurrency` | 32 | 64 | Parallel HEAD stats during Delta log replay; matches `pool_max_idle_per_host` |
| `maximum_parallel_row_group_writers` | default | 4 | Concurrent row group encoding during compaction and flush |
| `maximum_buffered_record_batches_per_stream` | default | 8 | Smooths bursty reads from GCS |

Connection pool tuning

Cloud object store HTTP client settings updated:

| Setting | Old | New |
|---|---|---|
| `pool_max_idle_per_host` | 16 | 64 |
| `pool_idle_timeout` | 90s | 120s |
| Request timeout | - | 30s |
| Connect timeout | - | 5s |

Bug fix: vacuum missing after summary optimize

`TraceSummaryDBEngine::run_maintenance()` called `optimize_table()` but not `vacuum_table()` afterward. Compaction tombstones old Parquet files; without an immediate vacuum those files remain on storage until the next scheduled vacuum cycle.

Fixed to vacuum immediately after a successful optimize:

```rust
Ok(()) => {
    // A vacuum failure is logged but does not fail the maintenance run.
    if let Err(e) = self.vacuum_table(0).await {
        error!("Post-optimize vacuum failed: {}", e);
    }
    // release task ...
}
```

This matches the existing behavior in `TraceSpanDBEngine`.
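
To illustrate why the ordering matters, here is a toy, std-only model of the file lifecycle. It is not the Delta Lake API or the engine code: `ToyTable` and its methods are hypothetical names, and the model only tracks which files still occupy storage.

```rust
/// Toy model of a table's files; not the Delta Lake API.
#[derive(Default)]
struct ToyTable {
    live: Vec<String>,
    tombstoned: Vec<String>,
}

impl ToyTable {
    /// Compaction: rewrite all live files into one, tombstoning the inputs.
    fn optimize(&mut self) -> Result<(), String> {
        let old = std::mem::take(&mut self.live);
        self.tombstoned.extend(old);
        self.live.push("compacted.parquet".to_string());
        Ok(())
    }

    /// Vacuum: physically delete tombstoned files, returning how many were removed.
    fn vacuum(&mut self) -> Result<usize, String> {
        let removed = self.tombstoned.len();
        self.tombstoned.clear();
        Ok(removed)
    }

    /// Files still occupying storage, whether live or tombstoned.
    fn files_on_storage(&self) -> usize {
        self.live.len() + self.tombstoned.len()
    }
}

/// Maintenance pass: vacuum runs only after a successful optimize,
/// and a vacuum failure is logged rather than propagated.
fn run_maintenance(table: &mut ToyTable) -> Result<(), String> {
    table.optimize()?;
    if let Err(e) = table.vacuum() {
        eprintln!("Post-optimize vacuum failed: {e}");
    }
    Ok(())
}
```

In this model, calling `optimize()` alone on a two-file table leaves three files on storage (two tombstoned inputs plus the compacted output), whereas `run_maintenance()` leaves one.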

Internal record type rename (PR #221)

An internal record type is renamed across `scouter_client`, `scouter_drift`, `scouter_evaluate`, `scouter_events`, `scouter_server`, and `py-scouter`. The public Python API is unchanged; the stub files are updated to match.

Upgrading from v0.20.0

No action required. All changes are additive or internal.

`SCOUTER_OBJECT_CACHE_MB` is optional. The default (64 MB) is appropriate for most deployments. Increase it if you have many concurrent readers querying large numbers of Parquet files.
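
As a rough sketch of how an optional setting like this could be resolved, the helper below reads the variable and falls back to the 64 MB default. The function names are hypothetical, and two behaviors are assumptions of the sketch rather than documented facts: falling back to the default on an unparsable value, and treating 1 MB as 1024 × 1024 bytes.

```rust
/// Default cache size from the release notes, in megabytes.
const DEFAULT_CACHE_MB: u64 = 64;

/// Resolve the cache size in bytes from the raw env var value.
/// Assumption: a missing or unparsable value falls back to the default.
/// Assumption: 1 MB is treated as 1024 * 1024 bytes.
fn cache_size_bytes(raw: Option<&str>) -> u64 {
    let mb = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_CACHE_MB);
    // Saturate instead of overflowing on absurdly large inputs.
    mb.saturating_mul(1024 * 1024)
}

/// Read the setting from the process environment.
fn cache_size_from_env() -> u64 {
    let raw = std::env::var("SCOUTER_OBJECT_CACHE_MB").ok();
    cache_size_bytes(raw.as_deref())
}
```

Keeping the parsing in `cache_size_bytes` and the environment lookup in `cache_size_from_env` lets the fallback logic be exercised without mutating process-wide state.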
