Handle indices with zero/missing uptime correctly in write-load calculation #136929

nicktindall · 2025-10-22T03:30:45Z

This caused an incident in QA. We will continue to investigate why an index might be missing uptime/write load for all shards, but this change should protect against it if/when it happens again.

Fixes: ES-13286
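As a rough illustration of the failure mode described above, here is a minimal sketch. It assumes the per-index figure is an uptime-weighted average of shard write loads. The class, method, and parameter names are hypothetical and are not taken from LicensedWriteLoadForecaster, and the guard shown is only one possible way to avoid the problem, not necessarily the fix made in this PR.

```java
import java.util.OptionalDouble;

// Hypothetical sketch: not the actual forecaster code.
public final class WriteLoadAverageSketch {

    // Uptime-weighted mean with no guard (assumed shape of the calculation).
    public static double naiveAverage(double[] writeLoads, long[] uptimesMillis) {
        double weightedSum = 0.0;
        long totalUptime = 0L;
        for (int i = 0; i < writeLoads.length; i++) {
            weightedSum += writeLoads[i] * uptimesMillis[i];
            totalUptime += uptimesMillis[i];
        }
        // If every shard reports zero uptime this is 0.0 / 0, which is NaN.
        return weightedSum / totalUptime;
    }

    // Same calculation, but an index with no recorded uptime yields "no value"
    // instead of NaN, so it cannot poison totals computed by the caller.
    public static OptionalDouble guardedAverage(double[] writeLoads, long[] uptimesMillis) {
        long totalUptime = 0L;
        for (long uptime : uptimesMillis) {
            totalUptime += uptime;
        }
        if (totalUptime == 0L) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(naiveAverage(writeLoads, uptimesMillis));
    }
}
```

Returning an empty OptionalDouble rather than 0.0 keeps "no data" distinguishable from "no load"; whether that distinction matters to the real forecaster is an assumption here.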

nicktindall · 2025-10-22T03:47:38Z

.../test/java/org/elasticsearch/xpack/writeloadforecaster/LicensedWriteLoadForecasterTests.java

+        if (someIndicesHadUptime) {
+            assertThat(forecastedWriteLoad.getAsDouble(), not(notANumber()));
+        }
+    }


Could probably have tested this by calling forecastIndexWriteLoad directly (it's exposed for testing). Happy to do that instead if we want to reduce the boilerplate.

elasticsearchmachine · 2025-10-22T03:51:11Z

Hi @nicktindall, I've created a changelog YAML for you.

elasticsearchmachine · 2025-10-22T03:51:35Z

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

ywangd

LGTM

I noticed that a transport version change (#136336) and its revert (#136510) both touched on how uptime is serialized. But given the transport version, it does not seem possible for them to impact serverless. Why it happened remains a puzzle.

ywangd · 2025-10-22T04:24:45Z

...r/src/main/java/org/elasticsearch/xpack/writeloadforecaster/LicensedWriteLoadForecaster.java

            // that index. It should be safe to extrapolate our weighted average out to the
            // maximum uptime observed, based on the assumption that write-load is roughly
            // evenly distributed across shards of a datastream index.
+            assert Double.isNaN(weightedAverageShardWriteLoad) == false : "Invalid average shard write load";


Nit: maybe Double.isFinite instead?

I changed these assertions to assert on the two values we are adding to the overall totals. Please re-check. I think this is a better approach, as it's more agnostic of how they are calculated.

See d59c122
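For context on the Double.isFinite nit above: a NaN-only check still lets infinities through, whereas Double.isFinite rejects both NaN and infinite values. A minimal sketch, with hypothetical class and method names:

```java
// Hypothetical sketch comparing the two checks discussed in the review.
public final class FiniteCheckSketch {

    // Mirrors the form of the original assertion: only NaN is rejected.
    public static boolean passesNaNCheck(double value) {
        return Double.isNaN(value) == false;
    }

    // The suggested alternative: NaN and +/-Infinity are all rejected.
    public static boolean passesFiniteCheck(double value) {
        return Double.isFinite(value);
    }

    public static void main(String[] args) {
        double infinite = 1.0 / 0.0; // floating-point division by zero gives Infinity
        System.out.println(passesNaNCheck(infinite));    // Infinity is not NaN
        System.out.println(passesFiniteCheck(infinite)); // but it is not finite either
    }
}
```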

nicktindall · 2025-10-22T06:40:40Z

💚 All backports created successfully

Status	Branch	Result
✅	9.2



Handle indices with zero uptime correctly in write-load calculation

9051123

Fixes: ES-13286

elasticsearchmachine added the v9.3.0 label Oct 22, 2025

nicktindall added 3 commits October 22, 2025 14:41

Also include missing write loads

488949e

Naming

c21ec13

Tidy

5ec7299


nicktindall requested a review from ywangd October 22, 2025 03:50

nicktindall added the >bug and :Distributed Coordination/Allocation labels Oct 22, 2025

nicktindall changed the title ~~Handle indices with zero uptime correctly in write-load calculation~~ Handle indices with zero/missing uptime correctly in write-load calculation Oct 22, 2025

nicktindall and others added 2 commits October 22, 2025 14:51

Merge branch 'main' into write_load_forecast_handles_zero_uptime

bf278ea

Update docs/changelog/136929.yaml

f2804fc

nicktindall marked this pull request as ready for review October 22, 2025 03:51

elasticsearchmachine added the Team:Distributed Coordination label Oct 22, 2025

ywangd approved these changes Oct 22, 2025


nicktindall added 2 commits October 22, 2025 16:12

Tweak assertions

d59c122

Typo

859543d

nicktindall merged commit 2e340de into elastic:main Oct 22, 2025
34 checks passed

nicktindall deleted the write_load_forecast_handles_zero_uptime branch October 22, 2025 06:34

nicktindall mentioned this pull request Oct 22, 2025

[9.2] Handle indices with zero/missing uptime correctly in write-load calculation (#136929) #136933

Merged

nicktindall added a commit to nicktindall/elasticsearch that referenced this pull request Oct 22, 2025

Handle indices with zero/missing uptime correctly in write-load calculation (elastic#136929)

ddc90e0

Fixes: ES-13286 (cherry picked from commit 2e340de)

nicktindall added a commit that referenced this pull request Oct 22, 2025

Handle indices with zero/missing uptime correctly in write-load calculation (#136929) (#136933)

5f6fa9f

Fixes: ES-13286 (cherry picked from commit 2e340de)
