v0.21.0
v0.21.0 Release Summary
What Changed
v0.21.0 adds a read-side caching layer over cloud object stores and tunes the DataFusion SessionContext for higher-concurrency GCS/S3 workloads. A bug where the trace summary table skipped vacuum after compaction is fixed. An internal record type rename is propagated across all crates.
Breaking Changes
None. No schema changes, no migration required.
Changes
Object store caching layer (CachingStore)
A new CachingStore<T: ObjectStore> wrapper in scouter_dataframe caches head() responses and small get_range() reads (≤2 MB) from cloud object stores.
After Delta Lake Z-ORDER compaction, Parquet files are immutable — the same path always returns the same bytes. DataFusion issues repeated HEAD + footer range reads on every query. Without caching, each read is a separate cloud round-trip (~30–60 ms on GCS). CachingStore eliminates these by serving repeated reads from an in-process mini_moka cache.
Cache configuration:
| Setting | Default | Env var |
|---|---|---|
| Max cache size | 64 MB | SCOUTER_OBJECT_CACHE_MB |
| TTL | 1 hour | — |
| Max cacheable range read | 2 MB | — |
All mutating and streaming operations (put, delete, list, get for large ranges) pass through to the inner store uncached.
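The caching rule above can be sketched with the standard library alone. This is a simplified illustration, not the scouter_dataframe implementation: `RangeCache`, `fetch_range`, and the closure standing in for the inner store are hypothetical names, and a plain `HashMap` replaces mini_moka, so there is no TTL or size-based eviction here.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Largest range read that is cached (2 MB, per the table above).
const MAX_CACHEABLE_RANGE: usize = 2 * 1024 * 1024;

/// Hypothetical stand-in for CachingStore: caches small range reads by (path, start, end).
/// A plain HashMap is used for brevity, so there is no TTL and no size-based eviction.
pub struct RangeCache {
    entries: HashMap<(String, usize, usize), Vec<u8>>,
    /// Number of reads that actually reached the inner store.
    pub inner_reads: usize,
}

impl RangeCache {
    pub fn new() -> Self {
        Self { entries: HashMap::new(), inner_reads: 0 }
    }

    /// Returns the bytes for `range` of `path`. `fetch` stands in for the inner
    /// store's range read; it runs only on a cache miss or for oversized ranges.
    pub fn fetch_range<F>(&mut self, path: &str, range: Range<usize>, fetch: F) -> Vec<u8>
    where
        F: FnOnce(&str, Range<usize>) -> Vec<u8>,
    {
        let len = range.end.saturating_sub(range.start);
        if len > MAX_CACHEABLE_RANGE {
            // Large reads pass through to the inner store uncached.
            self.inner_reads += 1;
            return fetch(path, range);
        }

        let key = (path.to_string(), range.start, range.end);
        if let Some(bytes) = self.entries.get(&key) {
            // Cache hit: no round-trip to the inner store.
            return bytes.clone();
        }

        // Cache miss: read once from the inner store and remember the result.
        self.inner_reads += 1;
        let bytes = fetch(path, range);
        self.entries.insert(key, bytes.clone());
        bytes
    }
}
```

Keying on the path and exact byte range is safe only because the files are immutable after compaction; a store whose objects can be overwritten in place would also need invalidation on write.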
DataFusion SessionContext tuning
The shared SessionContext used for trace queries now includes explicit read-path and write-path settings:
| Setting | Old | New | Why |
|---|---|---|---|
| metadata_size_hint | 512 KB | 1 MB | Captures bloom filter + footer + column indexes in one GCS round-trip instead of the default multi-step chain |
| bloom_filter_on_read | default | true | Activates bloom filters on trace_id and entity_id to skip non-matching row groups before decoding |
| schema_force_view_types | default | true | Zero-copy Utf8View/BinaryView; prevents DataFusion from downgrading these on read-back from Parquet |
| meta_fetch_concurrency | 32 | 64 | Parallel HEAD stats during Delta log replay; matches pool_max_idle_per_host |
| maximum_parallel_row_group_writers | default | 4 | Concurrent row group encoding during compaction and flush |
| maximum_buffered_record_batches_per_stream | default | 8 | Smooths bursty reads from GCS |
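For reference, the new values can be written out against DataFusion's string option keys. This is a sketch under assumptions: the setting names in the table are taken to correspond to the `datafusion.execution.*` keys shown, `trace_session_settings` is a hypothetical helper, and the release may set these through typed config fields instead of string keys.

```rust
/// Hypothetical helper: the new values from the table above, keyed by the
/// DataFusion option names they are assumed to correspond to.
pub fn trace_session_settings() -> Vec<(&'static str, String)> {
    vec![
        // Read path
        (
            "datafusion.execution.parquet.metadata_size_hint",
            (1024 * 1024).to_string(), // 1 MB, expressed in bytes
        ),
        (
            "datafusion.execution.parquet.bloom_filter_on_read",
            true.to_string(),
        ),
        (
            "datafusion.execution.parquet.schema_force_view_types",
            true.to_string(),
        ),
        (
            "datafusion.execution.meta_fetch_concurrency",
            64.to_string(),
        ),
        // Write path
        (
            "datafusion.execution.parquet.maximum_parallel_row_group_writers",
            4.to_string(),
        ),
        (
            "datafusion.execution.parquet.maximum_buffered_record_batches_per_stream",
            8.to_string(),
        ),
    ]
}
```

With the datafusion crate available, each pair could be folded into a `SessionConfig` (for example with `set_str`) before building the `SessionContext`; that wiring is left out here so the block compiles without external crates.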
Connection pool tuning
Cloud object store HTTP client settings updated:
| Setting | Old | New |
|---|---|---|
| pool_max_idle_per_host | 16 | 64 |
| pool_idle_timeout | 90s | 120s |
| Request timeout | — | 30s |
| Connect timeout | — | 5s |
Bug fix: vacuum missing after summary optimize
TraceSummaryDBEngine::run_maintenance() called optimize_table() but not vacuum_table() afterward. Compaction tombstones old Parquet files; without an immediate vacuum those files remain on storage until the next scheduled vacuum cycle.
Fixed to vacuum immediately after a successful optimize:
```rust
Ok(()) => {
    // Vacuum right after a successful optimize; a failure is logged, not propagated.
    if let Err(e) = self.vacuum_table(0).await {
        error!("Post-optimize vacuum failed: {}", e);
    }
    // release task ...
}
```

This matches the existing behavior in TraceSpanDBEngine.
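The control flow of the fix can be exercised in isolation with a small synchronous model. This is a sketch, not the engine code: the `TableEngine` trait, `run_maintenance` free function, and `RecordingEngine` test double are hypothetical, the real methods are async, and the meaning of the argument to `vacuum_table` is not stated in these notes.

```rust
/// Hypothetical, synchronous stand-in for a table engine (the real methods are async).
pub trait TableEngine {
    fn optimize_table(&mut self) -> Result<(), String>;
    /// The source only shows `vacuum_table(0)`; the argument's meaning is assumed opaque here.
    fn vacuum_table(&mut self, retention: u64) -> Result<(), String>;
}

/// Runs optimize, then vacuums immediately if optimize succeeded.
/// A vacuum failure is logged but does not fail the maintenance run.
pub fn run_maintenance<E: TableEngine>(engine: &mut E) -> Result<(), String> {
    engine.optimize_table()?;
    if let Err(e) = engine.vacuum_table(0) {
        eprintln!("Post-optimize vacuum failed: {}", e);
    }
    Ok(())
}

/// Test double that records the order of calls.
#[derive(Default)]
pub struct RecordingEngine {
    pub calls: Vec<&'static str>,
    pub optimize_fails: bool,
    pub vacuum_fails: bool,
}

impl TableEngine for RecordingEngine {
    fn optimize_table(&mut self) -> Result<(), String> {
        self.calls.push("optimize");
        if self.optimize_fails {
            Err("optimize failed".to_string())
        } else {
            Ok(())
        }
    }

    fn vacuum_table(&mut self, _retention: u64) -> Result<(), String> {
        self.calls.push("vacuum");
        if self.vacuum_fails {
            Err("vacuum failed".to_string())
        } else {
            Ok(())
        }
    }
}
```

In this model a failed optimize skips the vacuum entirely, while a failed vacuum is only logged, mirroring the `if let Err(e)` handling in the snippet above.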
Internal record type rename (PR #221)
Internal record type renamed across scouter_client, scouter_drift, scouter_evaluate, scouter_events, scouter_server, and py-scouter. No public API change for Python users — stub files updated.
Upgrading from v0.20.0
No action required. All changes are additive or internal.
SCOUTER_OBJECT_CACHE_MB is optional. The default (64 MB) is appropriate for most deployments. Increase it if you have many concurrent readers querying large numbers of Parquet files.
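As an illustration of how such a setting is typically resolved, the sketch below turns the environment variable into a byte budget. The function names are hypothetical, and the fallback to 64 MB on an unparsable value is an assumption; these notes do not say how Scouter treats invalid input.

```rust
use std::env;

/// Default cache size in megabytes, per the configuration table.
const DEFAULT_CACHE_MB: u64 = 64;

/// Converts a raw SCOUTER_OBJECT_CACHE_MB value into a byte budget.
/// Falls back to the default when the value is missing or not a valid integer
/// (the fallback on invalid input is an assumption, not documented behavior).
pub fn cache_bytes(raw: Option<&str>) -> u64 {
    let mb = raw
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_CACHE_MB);
    // Saturate instead of overflowing on absurdly large values.
    mb.saturating_mul(1024 * 1024)
}

/// Reads the variable from the process environment.
pub fn cache_bytes_from_env() -> u64 {
    cache_bytes(env::var("SCOUTER_OBJECT_CACHE_MB").ok().as_deref())
}
```

Under these assumptions, setting `SCOUTER_OBJECT_CACHE_MB=256` before starting the server would yield a 256 MB budget, and leaving it unset keeps the 64 MB default.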