Optimize parsing of compound format in `MergePolicyConfig` #135643

nielsbauman · 2025-09-29T16:43:18Z

By checking if the setting value ends with a `b`, we can skip the ratio/double parsing. The default value is `1gb`, so this will skip the ratio parsing by default.
I was doing some benchmarking on component templates and noticed the `MergePolicyConfig` constructor sticking out slightly in flamegraphs due to the failing double parsing. This minor optimization is almost not worth opening a PR for, but I figured it doesn't hurt to do.

Since the default value for `index.compound_format` is `1gb`, it makes more sense to try to parse a `ByteSizeValue` first and only after that try to parse a raw double. I was doing some benchmarking on component templates and noticed the `MergePolicyConfig` constructor sticking out slightly in flamegraphs due to the failing double parsing. This minor optimization is almost not worth opening a PR for, but I figured it doesn't hurt to do.
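For readers skimming the thread, the suffix check from the description can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `SuffixCheckSketch`, `looksLikeByteSize`, and `tryParseRatio` are hypothetical names, and the byte-size parsing itself (done by `ByteSizeValue` in the real code) is left out.

```java
/** Hypothetical sketch: decide whether ratio (double) parsing should be attempted at all. */
final class SuffixCheckSketch {

    /** True when the trimmed value ends in 'b' or 'B' (e.g. "1gb"); no Java double literal ends that way. */
    static boolean looksLikeByteSize(String value) {
        String v = value.trim();
        return v.endsWith("b") || v.endsWith("B");
    }

    /** Returns the ratio, or null when the value should be handed to byte-size parsing instead. */
    static Double tryParseRatio(String value) {
        if (looksLikeByteSize(value)) {
            // Skip Double.parseDouble entirely, so no NumberFormatException is created.
            return null;
        }
        try {
            return Double.parseDouble(value.trim());
        } catch (NumberFormatException e) {
            // Not a double either (e.g. "1k"); the caller still has to try byte-size parsing.
            return null;
        }
    }
}
```

With the default `1gb`, `tryParseRatio` returns before any exception is constructed, which is the case the flamegraph pointed at.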

elasticsearchmachine · 2025-09-29T16:43:43Z

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

nielsbauman · 2025-09-29T16:45:09Z

server/src/test/java/org/elasticsearch/index/MergePolicyConfigTests.java

        assertCompoundThreshold(build("false"), 0.0, ByteSizeValue.ofBytes(Long.MAX_VALUE));
        assertCompoundThreshold(build(false), 0.0, ByteSizeValue.ofBytes(Long.MAX_VALUE));
-        assertCompoundThreshold(build(0), 0.0, ByteSizeValue.ofBytes(Long.MAX_VALUE));
+        assertCompoundThreshold(build(0), 1.0, ByteSizeValue.ofBytes(0));


As you can see in this test change, there is a slight change in behavior when the input value is 0. No other test seems to break on this change. If this change does have unwanted side effects, I'm fine with closing this PR and keeping the code as-is.
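To make the behavior change for `0` concrete, here is a small self-contained sketch of the two parse orders. Everything in it is an assumption for illustration: `ParseOrderSketch`, `Threshold`, and `toyParseBytes` are hypothetical, and the toy byte parser only understands plain digits and a `gb` suffix, unlike `ByteSizeValue`.

```java
import java.util.Locale;

/** Hypothetical sketch: the same input can mean different things depending on parse order. */
final class ParseOrderSketch {

    /** Simplified result: a ratio plus a maximum compound-file size in bytes. */
    static final class Threshold {
        final double ratio;
        final long maxBytes;

        Threshold(double ratio, long maxBytes) {
            this.ratio = ratio;
            this.maxBytes = maxBytes;
        }
    }

    /** Toy byte-size parser (an assumption, not ByteSizeValue): "<digits>" or "<digits>gb". */
    static long toyParseBytes(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        if (v.endsWith("gb")) {
            return Long.parseLong(v.substring(0, v.length() - 2)) * (1L << 30);
        }
        return Long.parseLong(v); // throws NumberFormatException for anything else
    }

    /** Previous order: try the ratio first, fall back to bytes. */
    static Threshold ratioFirst(String value) {
        try {
            return new Threshold(Double.parseDouble(value), Long.MAX_VALUE);
        } catch (NumberFormatException e) {
            return new Threshold(1.0, toyParseBytes(value));
        }
    }

    /** Swapped order: try bytes first, fall back to the ratio. */
    static Threshold bytesFirst(String value) {
        try {
            return new Threshold(1.0, toyParseBytes(value));
        } catch (NumberFormatException e) {
            return new Threshold(Double.parseDouble(value), Long.MAX_VALUE);
        }
    }
}
```

In this sketch `ratioFirst("0")` yields a ratio of 0.0 with an unbounded size, while `bytesFirst("0")` yields a ratio of 1.0 with a size of 0 bytes, mirroring the test diff above. Both orders agree on `1gb`.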

lkts · 2025-10-01T18:06:25Z

What even is index.compound_format? I can't find any documentation for it.

nielsbauman · 2025-10-01T18:20:07Z

> What even is `index.compound_format`? I can't find any documentation for it.

I have absolutely no idea. I didn't really look into the setting itself, just at the parsing.

lkts · 2025-10-02T18:01:44Z

I am wondering if it's maybe deprecated and we can remove it completely in the future. In the meantime, would it be better to check `INDEX_COMPOUND_FORMAT_SETTING.exists(indexSettings.getSettings())` inside the `MergePolicyConfig` constructor and, if the setting doesn't exist, skip the parsing and use a known default instance?
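A rough sketch of that guard, using a plain `Map` as a stand-in for the settings object. All names here (`ExistsGuardSketch`, `maxCompoundBytes`, `parseGigabytes`, `parseCalls`) are hypothetical, and the toy parser only handles `<digits>gb`; it is only meant to show the shape of the idea, not the actual `Setting` API.

```java
import java.util.Map;

/** Hypothetical sketch of guarding a setting read with an "exists" check. */
final class ExistsGuardSketch {
    static final String KEY = "index.compound_format";

    /** Pre-parsed default (1gb in bytes), reused whenever the setting is absent. */
    static final long DEFAULT_MAX_BYTES = 1L << 30;

    /** Counts how often the parsing path runs, to make the effect of the guard visible. */
    static int parseCalls = 0;

    /** Toy parser (an assumption): accepts only "<digits>gb". */
    static long parseGigabytes(String value) {
        parseCalls++;
        String v = value.trim();
        if (!v.endsWith("gb")) {
            throw new IllegalArgumentException("unsupported value [" + value + "]");
        }
        return Long.parseLong(v.substring(0, v.length() - 2)) * (1L << 30);
    }

    /** Skips parsing entirely when the setting is not explicitly configured. */
    static long maxCompoundBytes(Map<String, String> settings) {
        if (!settings.containsKey(KEY)) {
            return DEFAULT_MAX_BYTES;
        }
        return parseGigabytes(settings.get(KEY));
    }
}
```

The trade-off is that the default now lives in two places (the setting definition and the pre-parsed instance), which is part of the complexity concern raised later in the thread.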

nielsbauman · 2025-10-02T20:46:41Z

Thanks for the suggestion! I've reverted my original commit and added a9de2dc to implement your suggestion.

lkts

LGTM, but I wonder if we can find anyone familiar with this code to cross-check.

henningandersen

If this was almost not worth opening a PR for, I wonder if the added complexity here is worth it? If it is important, we should capture it in some test or benchmark. If not, we should not do this.

I wonder if a setting infra change could fix this more generically instead? Would be a shame to guard many settings reads like this.

nielsbauman · 2025-10-03T11:34:27Z

@henningandersen the part that was causing it to show up in the flamegraph was the stacktrace creation:

Therefore, this doesn't look like something we can tackle with a generic setting infra change - please correct me if I'm wrong.

My original change was not to guard the setting reads, but to just swap the order of parsing (see 78e5fcc). That doesn't add any extra complexity, but the "right" order depends on the default value, although I don't expect this default value to change much.
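As a small illustration of the point about stack trace creation: every failed `Double.parseDouble` call constructs a `NumberFormatException`, and that exception captures a stack trace. The helper below is hypothetical (`FailedParseSketch` and `capturedFrames` are not part of the PR) and only makes that captured trace observable.

```java
/** Hypothetical sketch: a failed double parse produces an exception carrying a stack trace. */
final class FailedParseSketch {

    /** Returns the number of stack frames captured by the failure, or -1 if parsing succeeds. */
    static int capturedFrames(String value) {
        try {
            Double.parseDouble(value);
            return -1; // parsed fine, no exception was created
        } catch (NumberFormatException e) {
            // The trace was filled in when the exception was constructed.
            return e.getStackTrace().length;
        }
    }
}
```

For `1gb` this returns a positive frame count, whereas a value like `0.5` never reaches the catch block. Avoiding the failing call for the default value is what removes that work from the constructor.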

nielsbauman · 2025-10-13T13:07:53Z

@henningandersen any thoughts on my previous comment?

henningandersen · 2025-10-14T07:54:31Z

I prefer the original version (only saw the new version). Can I suggest to follow this pattern, i.e., skip the double parsing if the value ends in b? That way the default value and order are unrelated.

henningandersen

I think we've introduced a change, which needs addressing.

henningandersen · 2025-10-14T11:23:01Z

server/src/main/java/org/elasticsearch/index/MergePolicyConfig.java

+        try {
+            return new CompoundFileThreshold(Double.parseDouble(noCFSRatio));
+        } catch (NumberFormatException ex) {
+            throw new IllegalArgumentException(


We should still parse it as bytes if it is not parseable as a double. For instance, ByteSizeValue.parseBytesSizeValue trims white space, so " 1gb " would no longer be legal with this change, but was in the past.

I added two test cases locally:

        assertCompoundThreshold(build(" 1gb"), 1.0, ByteSizeValue.ofGb(1));
        assertCompoundThreshold(build(" 0"), 0, ByteSizeValue.ofBytes(Long.MAX_VALUE));

and both pass on my current changes. Looking at the source code of Double.parseDouble, that calls FloatingDecimal.readJavaFormatString under the hood, which trims white spaces too:

in = in.trim(); // don't fool around with white space.

Are there any other examples you can think of that wouldn't be legal with this change? I can invert the if-statement to if (noCFSRatio.endsWith("b") == false && noCFSRatio.endsWith("B") == false); that allows us to do the bytes parsing at the end, if you feel more confident with that change.

I think " 1gb " would fail?

Ah, sorry, I missed the white space at the end of your example. My bad...

I tried " 1gb " and that works as well. It took me a minute to realize why, but I noticed we have the following line:

elasticsearch/server/src/main/java/org/elasticsearch/index/MergePolicyConfig.java, lines 395 to 396 in 5d1a0a6:

        private static CompoundFileThreshold parseCompoundFormat(String noCFSRatio) {
            noCFSRatio = noCFSRatio.trim();
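The whitespace behavior discussed here can be checked in isolation. The sketch below uses hypothetical helper names (`WhitespaceSketch`, `endsWithB`, `endsWithBAfterTrim`). It shows that `Double.parseDouble` tolerates surrounding whitespace on its own, while a suffix check only works on `" 1gb "` once the value has been trimmed.

```java
/** Hypothetical sketch: how whitespace interacts with double parsing and the suffix check. */
final class WhitespaceSketch {

    /** Suffix check on the raw value, without trimming. */
    static boolean endsWithB(String value) {
        return value.endsWith("b") || value.endsWith("B");
    }

    /** Suffix check after trimming, as the early trim() in the method makes possible. */
    static boolean endsWithBAfterTrim(String value) {
        return endsWithB(value.trim());
    }

    /** Double.parseDouble ignores leading and trailing whitespace by itself. */
    static double parsePadded(String value) {
        return Double.parseDouble(value);
    }
}
```

So `parsePadded(" 0")` succeeds without any explicit trimming, but `endsWithB(" 1gb ")` is false and only `endsWithBAfterTrim(" 1gb ")` is true.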

Alright, then "1k" will fail I think? Or "1t"?

Ugh, you're right. Sorry, I'm not sharp on this PR... Fixed in 88c62e0.
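The `1k` / `1t` case is the reason the byte-size parsing has to remain as a fallback when the value does not end in `b`. A minimal sketch of that flow, under stated assumptions: `FallbackSketch`, `Threshold`, and `toyParseBytes` are hypothetical, the toy parser is a simplified stand-in for `ByteSizeValue`, and this is not the code from 88c62e0.

```java
import java.util.Locale;

/** Hypothetical sketch: skip ratio parsing for "...b" values, but always keep the bytes fallback. */
final class FallbackSketch {

    /** Simplified result: a ratio plus a maximum compound-file size in bytes. */
    static final class Threshold {
        final double ratio;
        final long maxBytes;

        Threshold(double ratio, long maxBytes) {
            this.ratio = ratio;
            this.maxBytes = maxBytes;
        }
    }

    // Two-letter units come first so that "gb" is matched before the bare "b".
    private static final String[] UNITS = { "tb", "gb", "mb", "kb", "t", "g", "m", "k", "b" };
    private static final long[] FACTORS = { 1L << 40, 1L << 30, 1L << 20, 1L << 10, 1L << 40, 1L << 30, 1L << 20, 1L << 10, 1L };

    /** Toy byte-size parser (an assumption): "<digits><unit>" with the units listed above. */
    static long toyParseBytes(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        for (int i = 0; i < UNITS.length; i++) {
            if (v.endsWith(UNITS[i])) {
                String digits = v.substring(0, v.length() - UNITS[i].length()).trim();
                return Long.parseLong(digits) * FACTORS[i];
            }
        }
        throw new IllegalArgumentException("missing unit in [" + value + "]");
    }

    static Threshold parse(String raw) {
        String v = raw.trim();
        boolean endsWithB = v.endsWith("b") || v.endsWith("B");
        if (!endsWithB) {
            try {
                return new Threshold(Double.parseDouble(v), Long.MAX_VALUE);
            } catch (NumberFormatException e) {
                // Not a ratio; values such as "1k" or "1t" may still be byte sizes.
            }
        }
        return new Threshold(1.0, toyParseBytes(v));
    }
}
```

Throwing on a failed double parse instead of falling through would reject `1k` and `1t`, which is the regression pointed out above.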

henningandersen

LGTM.

henningandersen · 2025-10-15T13:43:29Z

server/src/main/java/org/elasticsearch/index/MergePolicyConfig.java

-                throw e;
+                return new CompoundFileThreshold(Double.parseDouble(noCFSRatio));
+            } catch (NumberFormatException e) {
+                // ignore, see if it parses as bytes


It would be good to retain this as suppressed if the parsing below fails too, just like the original version kept the byte size parsing as suppressed (fine to swap which is suppressed I think).

Yeah I realized that too, but couldn't think of a clean way to implement that. I added some logic in 86b35e5. Let me know if that matches what you had in mind.

Looks fine to me.
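One way to keep the double-parsing failure around is `Throwable.addSuppressed`. The sketch below is an assumption about the general shape rather than the logic added in 86b35e5: `SuppressedSketch`, `classify`, and `toyParseBytes` are hypothetical names, and the toy byte parser accepts only `<digits>b`.

```java
/** Hypothetical sketch: keep the ratio-parsing failure as a suppressed exception. */
final class SuppressedSketch {

    /** Toy stand-in for byte-size parsing (an assumption): accepts only "<digits>b". */
    static long toyParseBytes(String v) {
        if (!v.endsWith("b")) {
            throw new IllegalArgumentException("missing unit in [" + v + "]");
        }
        return Long.parseLong(v.substring(0, v.length() - 1));
    }

    /** Returns "ratio" or "bytes"; throws if the value is neither. */
    static String classify(String raw) {
        String v = raw.trim();
        try {
            Double.parseDouble(v);
            return "ratio";
        } catch (NumberFormatException ratioFailure) {
            try {
                toyParseBytes(v);
                return "bytes";
            } catch (IllegalArgumentException bytesFailure) {
                IllegalArgumentException ex = new IllegalArgumentException(
                    "cannot parse [" + raw + "] as a ratio or a byte size",
                    bytesFailure
                );
                // Both failures stay visible: bytes as the cause, the ratio attempt as suppressed.
                ex.addSuppressed(ratioFailure);
                throw ex;
            }
        }
    }
}
```

Which of the two failures ends up as the cause and which as suppressed is interchangeable, as noted above. The point is only that neither is silently dropped.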

nielsbauman added the `:Distributed Indexing/Engine` and `>refactoring` labels Sep 29, 2025

elasticsearchmachine added the `Team:Distributed Indexing` and `v9.2.0` labels Sep 29, 2025

nielsbauman commented Sep 29, 2025

Merge branch 'main' into merge-policy-parsing

c69f81a

elasticsearchmachine added v9.3.0 and removed v9.2.0 labels Oct 2, 2025

nielsbauman added 3 commits October 2, 2025 17:32

Merge branch 'main' into merge-policy-parsing

ced8c63

Revert "Swap MergePolicyConfig compound format parsing order"

5f42901

This reverts commit 78e5fcc.

Use default instance instead

a9de2dc

lkts approved these changes Oct 2, 2025

henningandersen reviewed Oct 3, 2025

nielsbauman added 3 commits October 14, 2025 10:15

Revert "Use default instance instead"

6f6a200

This reverts commit a9de2dc.

Merge branch 'main' into merge-policy-parsing

fc11dbd

Check last character to determine parsing

232dcc3

nielsbauman requested a review from henningandersen October 14, 2025 08:53

henningandersen reviewed Oct 14, 2025

nielsbauman added 3 commits October 15, 2025 14:16

Add some more test cases with white spaces

47ab9af

Merge branch 'main' into merge-policy-parsing

f123d96

Merge branch 'main' into merge-policy-parsing

63ca424

Fix bug

88c62e0

henningandersen approved these changes Oct 15, 2025

nielsbauman and others added 2 commits October 15, 2025 15:49

Retain NFE as suppressed

86b35e5

Merge branch 'main' into merge-policy-parsing

a34daff

nielsbauman enabled auto-merge (squash) October 15, 2025 14:40

nielsbauman disabled auto-merge October 15, 2025 14:40

nielsbauman changed the title ~~Swap MergePolicyConfig compound format parsing order~~ Optimize parsing of compound format in MergePolicyConfig Oct 15, 2025

nielsbauman merged commit c09dc8e into elastic:main Oct 15, 2025
34 checks passed

nielsbauman deleted the merge-policy-parsing branch October 15, 2025 16:54
