
tsdb/wlog[PERF]: optimize WAL watcher reads (up to 540x less B/op; 13000x less allocs/op)#18250

Merged
bwplotka merged 7 commits into main from bwplotka/wal-watcher-optimize on Mar 11, 2026
Conversation

@bwplotka (Member) commented Mar 6, 2026

This optimizes the biggest contributors to WAL-watching allocations. In practice I see ~20% fewer allocated bytes/op for the scrape -> WAL -> remote-write path.

Fixes #18256

Changes:

  • Reuse Ref* buffers. We already do this on every TSDB appender commit, but we never did it for WAL watching.
  • Reuse sampleToSend etc. Using the same array is safe given the kind of filtering we do.
  • Move the error wrapping of record reader .Err() errors. EOF is a common error when tailing segments and is handled on the caller side, yet we wrapped it with fmt.Sprintf every single time.
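The buffer-reuse idea behind the first two bullets can be sketched as follows. Note that RefSample and decodeInto here are hypothetical simplifications of the record.Ref* types and the decoding path, not the actual watcher code:

```go
package main

import "fmt"

// RefSample is a simplified stand-in for record.RefSample.
type RefSample struct {
	Ref uint64
	T   int64
	V   float64
}

// decodeInto appends decoded samples into buf, reusing its backing array.
// Resetting with buf[:0] keeps the capacity, so once the buffer has grown
// to the working-set size, steady-state reads allocate nothing.
func decodeInto(buf []RefSample, n int) []RefSample {
	buf = buf[:0]
	for i := 0; i < n; i++ {
		buf = append(buf, RefSample{Ref: uint64(i)})
	}
	return buf
}

func main() {
	var buf []RefSample
	// Each iteration reuses the same backing array instead of allocating.
	for i := 0; i < 3; i++ {
		buf = decodeInto(buf, 4)
	}
	fmt.Println(len(buf), cap(buf) >= 4)
}
```

The trade-off, flagged in review below, is that consumers must copy what they need before the next read overwrites the buffer.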

See the detailed analysis: https://docs.google.com/document/d/1efVAMcEw7-R_KatHHcobcFBlNsre-DoThVHI8AO2SDQ/edit?tab=t.0

Benchmarks

I ran extensive benchmarks using synthetic data as well as real WAL segments pulled from the prombench runs.

All benchmarks are here: https://github.com/prometheus/prometheus/compare/bwplotka/wal-reuse?expand=1

I might propose some of these benchmarks for main, but it's debatable whether we want them:
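The tables that follow report B/op and allocs/op from Go's benchmarking framework. A minimal sketch of an allocation-reporting benchmark of this kind, where readSegment is a hypothetical stand-in for replaying a WAL segment through the watcher (the real benchmarks live on the linked branch):

```go
package main

import (
	"fmt"
	"testing"
)

// readSegment is a hypothetical stand-in for replaying one WAL segment
// into a reused buffer; it is not the actual watcher code.
func readSegment(buf []byte) []byte {
	buf = buf[:0]
	for i := 0; i < 64; i++ {
		buf = append(buf, byte(i))
	}
	return buf
}

func main() {
	// testing.Benchmark runs a benchmark function outside `go test` and
	// returns the same B/op and allocs/op figures shown in the tables.
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		var buf []byte
		for i := 0; i < b.N; i++ {
			buf = readSegment(buf)
		}
		_ = buf
	})
	fmt.Println("allocs/op:", res.AllocsPerOp())
}
```

Comparisons like the ones below are then produced by running the benchmark on both branches and feeding the output to benchstat.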

goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/cmd/prometheus
cpu: Apple M4 Pro
                                 │    main    │           main-v2           │
                                 │   sec/op   │   sec/op    vs base         │
E2EScrapeAndRemoteWriteNoChurn-4   1.015 ± 0%   1.014 ± 0%  ~ (p=0.589 n=6)

                                 │       main       │              main-v2              │
                                 │ recv_requests/op │ recv_requests/op  vs base         │
E2EScrapeAndRemoteWriteNoChurn-4         504.5 ± 1%         504.0 ± 0%  ~ (p=0.792 n=6)

                                 │      main       │              main-v2               │
                                 │ recv_samples/op │ recv_samples/op  vs base           │
E2EScrapeAndRemoteWriteNoChurn-4       1.000M ± 0%       1.000M ± 0%  ~ (p=1.000 n=6) ¹
¹ all samples are equal

                                 │     main     │               main-v2               │
                                 │     B/op     │     B/op      vs base               │
E2EScrapeAndRemoteWriteNoChurn-4   320.9Mi ± 1%   255.1Mi ± 0%  -20.50% (p=0.002 n=6)

                                 │    main     │              main-v2              │
                                 │  allocs/op  │  allocs/op   vs base              │
E2EScrapeAndRemoteWriteNoChurn-4   3.035M ± 0%   3.011M ± 0%  -0.79% (p=0.002 n=6)

                                 │             main-v2             │
                                 │ wal_watcher_notifications_total │
E2EScrapeAndRemoteWriteNoChurn-4                       1.014k ± 2%

                                 │         main-v2         │
                                 │ wal_watcher_reads_total │
E2EScrapeAndRemoteWriteNoChurn-4               627.0 ± 10%
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/tsdb/wlog
cpu: Apple M4 Pro
                                                                        │    main     │              main-v2              │
                                                                        │   sec/op    │   sec/op     vs base              │
Watcher_ReadSegment/data=pr18062/compression=snappy/case=one-go-2         291.2m ± 4%   285.2m ± 1%  -2.07% (p=0.002 n=6)
Watcher_ReadSegment/data=pr18062/compression=snappy/case=per-scrape-2     321.2m ± 3%   310.7m ± 1%  -3.24% (p=0.002 n=6)
Watcher_ReadSegment/data=main18062/compression=snappy/case=one-go-2       291.7m ± 3%   283.4m ± 1%  -2.83% (p=0.002 n=6)
Watcher_ReadSegment/data=main18062/compression=snappy/case=per-scrape-2   323.1m ± 6%   309.6m ± 1%  -4.19% (p=0.002 n=6)
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=one-go-2       329.7m ± 2%   340.4m ± 2%  +3.23% (p=0.015 n=6)
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=per-scrape-2   336.4m ± 1%   335.2m ± 0%       ~ (p=0.065 n=6)
geomean                                                                   315.0m        310.0m       -1.61%

                                                                        │     main     │               main-v2                │
                                                                        │ readBytes/op │ readBytes/op  vs base                │
Watcher_ReadSegment/data=pr18062/compression=snappy/case=one-go-2          134.2M ± 0%    134.2M ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=pr18062/compression=snappy/case=per-scrape-2      134.2M ± 0%    134.2M ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=main18062/compression=snappy/case=one-go-2        134.2M ± 0%    134.2M ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=main18062/compression=snappy/case=per-scrape-2    134.2M ± 0%    134.2M ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=one-go-2        45.84M ± 0%    45.84M ± 0%       ~ (p=1.000 n=6)
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=per-scrape-2    45.84M ± 0%    45.84M ± 0%       ~ (p=1.000 n=6) ¹
geomean                                                                    93.82M         93.82M       +0.00%
¹ all samples are equal

                                                                        │    main     │               main-v2               │
                                                                        │  reads/op   │  reads/op    vs base                │
Watcher_ReadSegment/data=pr18062/compression=snappy/case=one-go-2          1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=pr18062/compression=snappy/case=per-scrape-2     50.01k ± 0%   50.01k ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=main18062/compression=snappy/case=one-go-2        1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=main18062/compression=snappy/case=per-scrape-2   49.75k ± 0%   49.75k ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=one-go-2        1.000 ± 0%    1.000 ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=per-scrape-2   10.00k ± 0%   10.00k ± 0%       ~ (p=1.000 n=6) ¹
geomean                                                                    170.9         170.9       +0.00%
¹ all samples are equal

                                                                        │       main       │                 main-v2                  │
                                                                        │ sampleAppends/op │ sampleAppends/op  vs base                │
Watcher_ReadSegment/data=pr18062/compression=snappy/case=one-go-2              50.01k ± 0%        50.01k ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=pr18062/compression=snappy/case=per-scrape-2          50.01k ± 0%        50.01k ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=main18062/compression=snappy/case=one-go-2            49.75k ± 0%        49.75k ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=main18062/compression=snappy/case=per-scrape-2        49.75k ± 0%        49.75k ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=one-go-2            10.00k ± 0%        10.00k ± 0%       ~ (p=1.000 n=6) ¹
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=per-scrape-2        10.00k ± 0%        10.00k ± 0%       ~ (p=1.000 n=6) ¹
geomean                                                                        29.19k             29.19k       +0.00%
¹ all samples are equal

                                                                        │      main      │               main-v2               │
                                                                        │      B/op      │     B/op      vs base               │
Watcher_ReadSegment/data=pr18062/compression=snappy/case=one-go-2         1588.20Ki ± 0%   88.21Ki ± 0%  -94.45% (p=0.002 n=6)
Watcher_ReadSegment/data=pr18062/compression=snappy/case=per-scrape-2     53185.5Ki ± 0%   187.0Ki ± 0%  -99.65% (p=0.002 n=6)
Watcher_ReadSegment/data=main18062/compression=snappy/case=one-go-2        834.07Ki ± 0%   77.89Ki ± 0%  -90.66% (p=0.002 n=6)
Watcher_ReadSegment/data=main18062/compression=snappy/case=per-scrape-2   54570.4Ki ± 0%   106.9Ki ± 0%  -99.80% (p=0.002 n=6)
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=one-go-2         442.6Mi ± 0%   442.5Mi ± 0%   -0.02% (p=0.002 n=6)
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=per-scrape-2     454.4Mi ± 0%   442.9Mi ± 0%   -2.52% (p=0.002 n=6)
geomean                                                                     29.82Mi        1.704Mi       -94.29%

                                                                        │      main       │               main-v2               │
                                                                        │    allocs/op    │  allocs/op    vs base               │
Watcher_ReadSegment/data=pr18062/compression=snappy/case=one-go-2             43.000 ± 2%    8.000 ±  0%  -81.40% (p=0.002 n=6)
Watcher_ReadSegment/data=pr18062/compression=snappy/case=per-scrape-2      105103.00 ± 0%    11.00 ±  0%  -99.99% (p=0.002 n=6)
Watcher_ReadSegment/data=main18062/compression=snappy/case=one-go-2           32.000 ± 3%    5.000 ±  0%  -84.38% (p=0.002 n=6)
Watcher_ReadSegment/data=main18062/compression=snappy/case=per-scrape-2   104587.500 ± 0%    8.000 ± 12%  -99.99% (p=0.002 n=6)
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=one-go-2           14.00M ± 0%   14.00M ±  0%   -0.00% (p=0.002 n=6)
Watcher_ReadSegment/data=synth5Rec/compression=snappy/case=per-scrape-2       14.03M ± 0%   14.00M ±  0%   -0.18% (p=0.002 n=6)
geomean   

Does this PR introduce a user-facing change?

[PERF] remote: Optimize WAL watching used for RW sending to reuse internal buffers.

bwplotka added 4 commits March 6, 2026 14:41
Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
@bwplotka changed the title from "tsdb/wlog[PERF]: reuse buffers; move common err wrap in WAL watcher (up to 540x less B/op; 13000x less allocs/op)" to "tsdb/wlog[PERF]: optimize WAL watcher reads (up to 540x less B/op; 13000x less allocs/op)" on Mar 6, 2026
@bwplotka (Member, Author) commented Mar 6, 2026

/prombench main

@prombot (Contributor) commented Mar 6, 2026

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-18250 and main

After the successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

Signed-off-by: bwplotka <bwplotka@gmail.com>
@bwplotka bwplotka requested a review from bboreham March 6, 2026 15:17
Comment thread tsdb/record/buffers.go
@bwplotka (Member, Author) commented Mar 6, 2026

/prombench cancel

@prombot (Contributor) commented Mar 6, 2026

Benchmark cancel is in progress.

@bwplotka (Member, Author) commented Mar 8, 2026

/prombench main

@prombot (Contributor) commented Mar 8, 2026

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-18250 and main

After the successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

@bwplotka (Member, Author) commented Mar 8, 2026

/prombench main

@prombot (Contributor) commented Mar 8, 2026

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-18250 and main

After the successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

@bwplotka (Member, Author) commented Mar 9, 2026

/prombench main

@prombot (Contributor) commented Mar 9, 2026

⏱️ Welcome to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-18250 and main

After the successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help


@bwplotka (Member, Author) commented Mar 9, 2026

/prombench restart main

@prombot (Contributor) commented Mar 9, 2026

⏱️ Welcome (again) to Prometheus Benchmarking Tool. ⏱️

Compared versions: PR-18250 and main

After successful deployment (check status here), the benchmarking results can be viewed at:

Available Commands:

  • To restart benchmark: /prombench restart main
  • To stop benchmark: /prombench cancel
  • To print help: /prombench help

@bwplotka (Member, Author) commented Mar 9, 2026

So far it's a solid ~18% alloc/s improvement:

[prombench screenshot: allocation-rate comparison]

@bwplotka (Member, Author) commented Mar 9, 2026

So far so good IMO

[prombench screenshot: benchmark dashboards]

@bwplotka bwplotka requested a review from krajorama March 9, 2026 16:23
@krajorama (Member) left a comment


LGTM. One thing that might be worth doing: add a note to the WriteTo interface definition saying that implementations receiving record.Ref* parameters must copy the contents rather than keep a reference to the record.Ref* slices directly, as these are reused.
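This review note matters because a WriteTo implementation that retains the passed slice would now see its contents silently overwritten on the next read. A minimal sketch of the required copy-on-receive behavior; queue, its Append method, and RefSample are hypothetical simplifications, not the actual remote-write queue:

```go
package main

import "fmt"

// RefSample is a simplified stand-in for record.RefSample.
type RefSample struct {
	Ref uint64
	T   int64
	V   float64
}

// queue is a hypothetical WriteTo-style consumer. Because the watcher
// reuses the slices it passes in, Append must copy the contents before
// returning rather than keep a reference to the slice.
type queue struct {
	pending []RefSample
}

func (q *queue) Append(samples []RefSample) bool {
	// Copy the elements: the caller will overwrite `samples` on the
	// next segment read.
	q.pending = append(q.pending, samples...)
	return true
}

func main() {
	q := &queue{}
	buf := []RefSample{{Ref: 1}}
	q.Append(buf)
	buf[0].Ref = 99 // caller reuses the buffer
	fmt.Println(q.pending[0].Ref)
}
```

Had Append stored `samples` directly instead of copying its elements, the queued data would change out from under it when the buffer is reused.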

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
@bwplotka bwplotka force-pushed the bwplotka/wal-watcher-optimize branch from 2556b60 to 58930ec Compare March 10, 2026 11:19
@bwplotka (Member, Author) replied:

LGTM, one thing that might be worth doing is adding a note in the type WriteTo interface definition to say that the implementation that receives record.Ref* parameters must copy the contents and not keep a reference to the record.Ref* directly as these are reused.

Addressed.

I'm happy with the results:

[prombench screenshot: final comparison]

cc @bboreham objections?

@bwplotka (Member, Author) commented:
Failures are due to flaky tests #18269

@bboreham (Member) left a comment


Implementation looks clean. I am 2% afraid there will be some memory corruption, but ok to go ahead.

@bwplotka bwplotka merged commit a732020 into main Mar 11, 2026
55 of 57 checks passed
@bwplotka bwplotka deleted the bwplotka/wal-watcher-optimize branch March 11, 2026 09:17
@bwplotka (Member, Author) commented:
/prombench cancel

@prombot (Contributor) commented Mar 11, 2026

Benchmark cancel is in progress.

renovate Bot added a commit to sdwilsh/ansible-playbooks that referenced this pull request Apr 8, 2026
##### [\`v3.11.0\`](https://github.com/prometheus/prometheus/releases/tag/v3.11.0)

- \[CHANGE] Hetzner SD: The `__meta_hetzner_datacenter` label is deprecated for the role `robot` but kept for backward compatibility, use the `__meta_hetzner_robot_datacenter` label instead. For the role `hcloud`, the label is deprecated and will stop working after the 1 July 2026. [#17850](prometheus/prometheus#17850)
- \[CHANGE] Hetzner SD: The `__meta_hetzner_hcloud_datacenter_location` and `__meta_hetzner_hcloud_datacenter_location_network_zone` labels are deprecated, use the `__meta_hetzner_hcloud_location` and `__meta_hetzner_hcloud_location_network_zone` labels instead. [#17850](prometheus/prometheus#17850)
- \[CHANGE] Promtool: Redirect debug output to stderr to avoid interfering with stdout-based tool output. [#18346](prometheus/prometheus#18346)
- \[FEATURE] AWS SD: Add Elasticache Role. [#18099](prometheus/prometheus#18099)
- \[FEATURE] AWS SD: Add RDS Role. [#18206](prometheus/prometheus#18206)
- \[FEATURE] Azure SD: Add support for Azure Workload Identity authentication method. [#17207](prometheus/prometheus#17207)
- \[FEATURE] Discovery: Introduce `prometheus_sd_last_update_timestamp_seconds` metric to track the last time a service discovery update was sent to consumers. [#18194](prometheus/prometheus#18194)
- \[FEATURE] Kubernetes SD: Add support for node role selectors for pod roles. [#18006](prometheus/prometheus#18006)
- \[FEATURE] Kubernetes SD: Introduce pod-based labels for deployment, cronjob, and job controller names: `__meta_kubernetes_pod_deployment_name`, `__meta_kubernetes_pod_cronjob_name` and `__meta_kubernetes_pod_job_name`, respectively. [#17774](prometheus/prometheus#17774)
- \[FEATURE] PromQL: Add `</` and `>/` operators for trimming observations from native histograms. [#17904](prometheus/prometheus#17904)
- \[FEATURE] PromQL: Add experimental `histogram_quantiles` variadic function for computing multiple quantiles at once. [#17285](prometheus/prometheus#17285)
- \[FEATURE] TSDB: Add `storage.tsdb.retention.percentage` configuration to configure the maximum percent of disk usable for TSDB storage. [#18080](prometheus/prometheus#18080)
- \[FEATURE] TSDB: Add an experimental `fast-startup` feature flag that writes a `series_state.json` file to the WAL directory to track active series state across restarts. [#18303](prometheus/prometheus#18303)
- \[FEATURE] TSDB: Add an experimental `st-storage` feature flag. When enabled, Prometheus stores ingested start timestamps (ST, previously called Created Timestamp) from scrape or OTLP in the TSDB and Agent WAL, and exposes them via Remote Write 2. [#18062](prometheus/prometheus#18062)
- \[FEATURE] TSDB: Add an experimental `xor2-encoding` feature flag for the new TSDB block float sample chunk encoding that is optimized for scraped data and allows encoding start timestamps. [#18062](prometheus/prometheus#18062)
- \[ENHANCEMENT] HTTP client: Add AWS `external_id` support for sigv4. [#17916](prometheus/prometheus#17916)
- \[ENHANCEMENT] Kubernetes SD: Deduplicate deprecation warning logs from the Kubernetes API to reduce noise. [#17829](prometheus/prometheus#17829)
- \[ENHANCEMENT] TSDB: Remove old temporary checkpoints when creating a Checkpoint. [#17598](prometheus/prometheus#17598)
- \[ENHANCEMENT] UI: Add autocomplete support for experimental `first_over_time` and `ts_of_first_over_time` PromQL functions. [#18318](prometheus/prometheus#18318)
- \[ENHANCEMENT] Vultr SD: Upgrade govultr library from v2 to v3 for continued security patches and maintenance. [#18347](prometheus/prometheus#18347)
- \[PERF] PromQL: Improve performance and reduce heap allocations in joins (VectorBinop)/And/Or/Unless. [#17159](prometheus/prometheus#17159)
- \[PERF] PromQL: Partially address performance regression in native histogram aggregations due to using `KahanAdd`. [#18252](prometheus/prometheus#18252)
- \[PERF] Remote write: Optimize WAL watching used for RW sending to reuse internal buffers. [#18250](prometheus/prometheus#18250)
- \[PERF] TSDB: Optimize LabelValues intersection performance for matchers. [#18069](prometheus/prometheus#18069)
- \[PERF] UI: Skip restacking on hover in stacked series charts. [#18230](prometheus/prometheus#18230)
- \[BUGFIX] AWS SD: Fix EC2 SD ignoring the configured `endpoint` option, a regression from the AWS SDK v2 migration. [#18133](prometheus/prometheus#18133)
- \[BUGFIX] AWS SD: Fix panic in EC2 SD when DescribeAvailabilityZones returns nil ZoneName or ZoneId. [#18133](prometheus/prometheus#18133)
- \[BUGFIX] Agent: Fix memory leak caused by duplicate SeriesRefs being loaded as active series. [#17538](prometheus/prometheus#17538)
- \[BUGFIX] Alerting: Fix alert state incorrectly resetting to pending when the FOR period is increased in the config file. [#18244](prometheus/prometheus#18244)
- \[BUGFIX] Azure SD: Fix system-assigned managed identity not working when `client_id` is empty. [#18323](prometheus/prometheus#18323)
- \[BUGFIX] Consul SD: Fix filter parameter not being applied to health service endpoint, causing Node and Node.Meta filters to be ignored. [#17349](prometheus/prometheus#17349)
- \[BUGFIX] Kubernetes SD: Fix duplicate targets generated by `*DualStack` EndpointSlices policies. [#18192](prometheus/prometheus#18192)
- \[BUGFIX] OTLP: Fix ErrTooOldSample being returned as HTTP 500 instead of 400 in PRW v2 histogram write paths, preventing infinite client retry loops. [#18084](prometheus/prometheus#18084)
- \[BUGFIX] OTLP: Fix exemplars getting mixed between incorrect parts of a histogram. [#18056](prometheus/prometheus#18056)
- \[BUGFIX] PromQL: Do not skip histogram buckets in queries where histogram trimming is used. [#18263](prometheus/prometheus#18263)
- \[BUGFIX] Remote write: Fix `prometheus_remote_storage_sent_batch_duration_seconds` measuring before the request was sent. [#18214](prometheus/prometheus#18214)
- \[BUGFIX] Rules: Fix alert state restoration when rule labels contain Go template expressions. [#18375](prometheus/prometheus#18375)
- \[BUGFIX] Scrape: Fix panic when parsing bare label names without an equal sign in brace-only metric notation. [#18229](prometheus/prometheus#18229)
- \[BUGFIX] TSDB: Fail early when `use-uncached-io` feature flag is set on unsupported environments. [#18219](prometheus/prometheus#18219)
- \[BUGFIX] TSDB: Fall back to CLI flag values when retention is removed from config file. [#18200](prometheus/prometheus#18200)
- \[BUGFIX] TSDB: Fix memory leaks in buffer pools by clearing reference fields before returning buffers to pools. [#17895](prometheus/prometheus#17895)
- \[BUGFIX] TSDB: Fix missing mmap of histogram chunks during WAL replay. [#18306](prometheus/prometheus#18306)
- \[BUGFIX] TSDB: Fix storage.tsdb.retention.time unit mismatch in file causing retention to be 1e6 times longer than configured. [#18200](prometheus/prometheus#18200)
- \[BUGFIX] Tracing: Fix missing traceID in query log when tracing is enabled, previously only spanID was emitted. [#18189](prometheus/prometheus#18189)
- \[BUGFIX] UI: Fix tooltip Y-offset drift when using multiple graph panels. [#18228](prometheus/prometheus#18228)
- \[BUGFIX] UI: Update retention display in runtime info when config is reloaded. [#18200](prometheus/prometheus#18200)

Development

Successfully merging this pull request may close these issues: feature/start-time: WAL Optimization

4 participants