Description
The Grafana panel "Data Size On Disk" occasionally shows EiB-scale spikes, which are far larger than the real data size.
Context
Panel query:
sum(ticdc_event_store_on_disk_data_size{...}) by (instance)
Metric source:
collectAndReportStoreMetrics calls diskSpaceUsage(stats) and exports
ticdc_event_store_on_disk_data_size.
In diskSpaceUsage, pebble.Metrics.Compact.InProgressBytes (int64) is cast
directly to uint64 and added:
usageBytes += uint64(m.Compact.InProgressBytes)
In Pebble, InProgressBytes is signed and may go negative when compactions finish.
If it becomes negative, the uint64 cast wraps to 2^64 minus the magnitude of the negative value. The result is just under 16 EiB, which matches the EiB-scale spikes on the panel.
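The wraparound can be reproduced in isolation. The following is a minimal sketch, not the TiCDC code: `wrapToUint64` is a hypothetical helper that mirrors the unguarded conversion, and the value -4096 is an assumed example reading, not one observed in the metrics.

```go
package main

import "fmt"

// wrapToUint64 is a hypothetical helper that mirrors the unguarded
// int64 -> uint64 conversion described in this issue.
func wrapToUint64(v int64) uint64 {
	return uint64(v)
}

func main() {
	// Assumed example: a small negative in-progress byte count.
	inProgress := int64(-4096)

	wrapped := wrapToUint64(inProgress)
	fmt.Println(wrapped) // 18446744073709547520, i.e. 2^64 - 4096

	// Expressed in EiB (1 EiB = 2^60 bytes), the value is about 16.
	fmt.Printf("%.2f EiB\n", float64(wrapped)/float64(uint64(1)<<60))
}
```

Even a tiny negative input lands near the top of the uint64 range, so a single bad sample is enough to dominate the summed series for that instance.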
Relevant code:
- logservice/eventstore/event_store.go:1090-1103, 1138-1144
- pebble/metrics.go: Compact.InProgressBytes is int64
Expected vs Actual
Expected: on_disk_data_size reflects real on-disk bytes and never shows EiB spikes.
Actual: occasional EiB-level spikes on the panel.
Proposed Solution
Clamp Compact.InProgressBytes to 0 when it is negative (or otherwise ensure a non-negative
value) before casting to uint64 in diskSpaceUsage.
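One possible shape for the clamp is sketched below. `nonNegativeBytes` is a hypothetical helper name, and the surrounding `main` only stands in for the accumulation in diskSpaceUsage. The actual function body is not reproduced here.

```go
package main

import "fmt"

// nonNegativeBytes is a hypothetical helper: it clamps a signed byte
// count to zero before converting, so a negative reading cannot wrap.
func nonNegativeBytes(v int64) uint64 {
	if v < 0 {
		return 0
	}
	return uint64(v)
}

func main() {
	// Stand-in for the running total built up in diskSpaceUsage.
	var usageBytes uint64 = 1 << 30 // assumed 1 GiB of other usage

	// Assumed negative reading for Compact.InProgressBytes.
	inProgress := int64(-4096)

	usageBytes += nonNegativeBytes(inProgress)
	fmt.Println(usageBytes) // 1073741824: the negative reading adds nothing
}
```

With this approach the call site would become `usageBytes += nonNegativeBytes(m.Compact.InProgressBytes)`. Clamping treats a negative reading as "no compaction bytes in progress", which slightly under-reports during that window but keeps the gauge within a plausible range.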