
v0.23.0


@thorrester thorrester released this 18 Mar 17:46
· 307 commits to main since this release
9f2b77c

v0.23.0 Release Summary

What Changed

This release adds distributed Delta Lake support for the trace storage engine. In multi-pod deployments, reader pods now automatically pick up new data committed by writer pods without a restart. A new configurable refresh interval controls how often each pod refreshes its in-memory Delta table snapshot from shared object storage.


Breaking Changes

None. No schema changes, no migration required. The new SCOUTER_TRACE_REFRESH_INTERVAL_SECS env var defaults to 10 and requires no action for existing deployments.


Changes

Distributed trace storage: cross-pod Delta Lake refresh

Previously, each pod's TraceSpanDBEngine loaded the Delta table snapshot once at startup. In a multi-pod deployment sharing GCS/S3, reader pods would never see data committed by the writer pod until they restarted.

The engine's actor loop now runs a periodic refresh_table() tick alongside its existing command and compaction handlers. On each tick it calls update_incremental() on a cloned DeltaTable. If the version advanced, it deregisters and re-registers the DataFusion SessionContext table provider so subsequent queries return fresh results. If the incremental update fails (empty table, transient network error), the clone is discarded and the original table state is preserved.
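The clone-then-swap behavior described above can be sketched with the standard library alone. This is an illustration, not the engine's code: `TableSnapshot`, `RefreshOutcome`, and the closure standing in for `update_incremental()` are hypothetical, and the real implementation works on a `DeltaTable` and re-registers the DataFusion table provider where the sketch simply swaps the snapshot.

```rust
/// Hypothetical stand-in for the in-memory Delta table snapshot.
#[derive(Clone, Debug, PartialEq)]
struct TableSnapshot {
    version: i64,
}

/// Hypothetical result type describing what a refresh tick did.
#[derive(Debug, PartialEq)]
enum RefreshOutcome {
    /// The update succeeded but the version did not advance.
    Unchanged,
    /// The update succeeded and the live snapshot was replaced.
    Advanced { from: i64, to: i64 },
    /// The update failed; the live snapshot was left untouched.
    Failed,
}

/// Runs one refresh tick. `update_incremental` is an assumed callback that
/// mutates the clone in place, mirroring the call named in the release notes.
fn refresh_table<F>(current: &mut TableSnapshot, update_incremental: F) -> RefreshOutcome
where
    F: FnOnce(&mut TableSnapshot) -> Result<(), String>,
{
    // Work on a clone so a failed update cannot corrupt the live snapshot.
    let mut candidate = current.clone();

    if update_incremental(&mut candidate).is_err() {
        // Discard the clone and keep the original state.
        return RefreshOutcome::Failed;
    }

    if candidate.version > current.version {
        let from = current.version;
        // In the real engine this is where the table provider would be
        // deregistered and re-registered so queries see the new version.
        *current = candidate;
        RefreshOutcome::Advanced { from, to: current.version }
    } else {
        RefreshOutcome::Unchanged
    }
}
```

Because the update only ever touches the clone, a failed tick leaves readers on the last good snapshot and the next tick simply tries again.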

Key details:

| Setting | Default | Env var |
| --- | --- | --- |
| Refresh interval | 10 seconds | SCOUTER_TRACE_REFRESH_INTERVAL_SECS |

Set a lower value (e.g. 5) for faster cross-pod visibility at the cost of more object-store LIST calls. Set a higher value to reduce overhead when read latency is not critical.
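As a rough illustration of how a setting like this can be resolved, the sketch below reads the variable and falls back to the documented default of 10 seconds. The helper names are hypothetical, and the handling of missing, unparsable, or zero values is an assumption; the release notes do not say how the engine treats invalid input.

```rust
use std::time::Duration;

/// Default refresh interval in seconds, as stated in the release notes.
const DEFAULT_REFRESH_SECS: u64 = 10;

/// Hypothetical helper: converts the raw env value into a Duration.
/// Assumption: missing, non-numeric, or zero values fall back to the default.
fn refresh_interval_from(raw: Option<&str>) -> Duration {
    let secs = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|&s| s > 0)
        .unwrap_or(DEFAULT_REFRESH_SECS);
    Duration::from_secs(secs)
}

/// Hypothetical helper: reads SCOUTER_TRACE_REFRESH_INTERVAL_SECS from the
/// process environment and resolves it with the rules above.
fn refresh_interval() -> Duration {
    let raw = std::env::var("SCOUTER_TRACE_REFRESH_INTERVAL_SECS").ok();
    refresh_interval_from(raw.as_deref())
}
```

Separating the parsing from the environment lookup keeps the fallback rules easy to test without mutating process-wide state.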

The refresh runs independently on every pod. Unlike compaction, it is not guarded by control-table mutual exclusion: each pod refreshes only its own in-memory snapshot.

Trace engine: safer incremental updates

All update_incremental calls in the engine (compaction, writes, optimizations) now call update_datafusion_session() after updating the table, ensuring the DataFusion SessionContext always has the correct object store registered. This fixes a class of bugs where DeltaScan::scan() could fail to resolve file URLs after a table update in cloud-backed deployments.

CI: release workflow fix

The release workflow tag comparison step now uses ${{ github.ref_name }} instead of $GITHUB_REF_NAME, fixing an interpolation issue where the tag name was not correctly resolved in the version check step.


Upgrading from v0.22.0

No action required. The refresh interval defaults to 10 seconds. To tune it, set SCOUTER_TRACE_REFRESH_INTERVAL_SECS in your environment.