Add length validation for rename_replacement parameter in snapshot restore request #137859

gmjehovich · 2025-11-11T01:04:54Z

Summary

Adds validation to the rename_replacement parameter in the snapshot restore API to prevent resource exhaustion.

Changes

Added validation in RestoreSnapshotRequest.validate() to limit rename_replacement UTF-8 byte length to MAX_INDEX_NAME_BYTES (255 bytes)
Added integration test testRenameReplacementTooLongRejected() to verify the validation works correctly
Imported MAX_INDEX_NAME_BYTES from MetadataCreateIndexService to maintain consistency with index name constraints
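
As a rough illustration of the check described in the list above, here is a minimal stand-alone sketch. The class name `RenameReplacementLimit` and the `check` method are hypothetical, introduced only for this example; the real check lives in `RestoreSnapshotRequest.validate()`, and the 255 value is taken from the description above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone class for illustration; not the actual Elasticsearch code.
final class RenameReplacementLimit {
    // 255 is the value the PR description gives for MAX_INDEX_NAME_BYTES.
    static final int MAX_INDEX_NAME_BYTES = 255;

    /** Returns an error message if the replacement is too long, or null if it is acceptable. */
    static String check(String renameReplacement) {
        if (renameReplacement == null) {
            return null;
        }
        // Measure the encoded size, not the number of chars: non-ASCII characters take more than one byte.
        int byteLength = renameReplacement.getBytes(StandardCharsets.UTF_8).length;
        if (byteLength > MAX_INDEX_NAME_BYTES) {
            return "rename_replacement UTF-8 byte length ["
                + byteLength
                + "] exceeds maximum allowed length ["
                + MAX_INDEX_NAME_BYTES
                + "] bytes";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(check("a".repeat(255)));       // 255 bytes: accepted
        System.out.println(check("a".repeat(256)));       // 256 bytes: rejected
        System.out.println(check("\u00e9".repeat(128)));  // 128 chars but 256 bytes: rejected
    }
}
```

The last call shows why the limit is expressed in bytes: a string can be under 255 characters and still exceed 255 bytes once encoded as UTF-8.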

Testing

Integration test confirms validation rejects overly long values
Test confirms valid values (≤255 bytes) are still accepted
Verified existing restore functionality is not affected


elasticsearchmachine · 2025-11-12T01:05:02Z

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

elasticsearchmachine · 2025-11-12T01:13:32Z

Hi @gmjehovich, I've created a changelog YAML for you.

ywangd

Looking good. I have a suggestion about whether we should further restrict building unnecessarily long strings for renamed index names.

ywangd · 2025-11-12T01:55:51Z

...in/java/org/elasticsearch/action/admin/cluster/snapshots/restore/RestoreSnapshotRequest.java

+            if (renameReplacement.getBytes(StandardCharsets.UTF_8).length > MAX_INDEX_NAME_BYTES) {
+                validationException = addValidationError(
+                    "rename_replacement UTF-8 byte length ["
+                        + byteLength
+                        + "] exceeds maximum allowed length ["
+                        + MAX_INDEX_NAME_BYTES
+                        + "] bytes",
+                    validationException
+                );
+            }


Because of back-references and escapes, the resulting index name can be shorter than 255 bytes even when renameReplacement is longer than 255 bytes. So we could be rejecting some legal index names in edge cases. I think it might be OK though, since such long index names are very rare.

Hmm, I hadn't considered this, it is a good point.

It seems to me like the easiest way to get around this would be to add a multiplier to make the validation more permissive for back-references/escapes, something like

if (renameReplacement.getBytes(StandardCharsets.UTF_8).length > MAX_INDEX_NAME_BYTES * 2) {

The multiplier itself is kind of arbitrary, but it provides some breathing room for the cases you brought up. Let me know what you think.

Yeah. We could do something along that line. I'd prefer to be a bit more strict, e.g. allowing an extra buffer of 10 chars.
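
To make the edge case in this thread concrete, here is a small sketch using plain `String.replaceAll`. The class and method names are made up for the example. It shows two replacement strings longer than 255 characters that still produce short names: one built from escapes, one from back-references to a group that matches the empty string.

```java
// Hypothetical demo class; it only exercises java.lang.String.replaceAll.
final class ReplacementShrinkDemo {
    /** Length of the name produced by renaming `index` with the given pattern and replacement. */
    static int renamedLength(String index, String pattern, String replacement) {
        return index.replaceAll(pattern, replacement).length();
    }

    public static void main(String[] args) {
        // In a replacement string, a backslash escapes the next character,
        // so each two-char sequence \a contributes a single literal 'a'.
        String escaped = "\\a".repeat(200); // 400 chars
        System.out.println(escaped.length() + " -> " + renamedLength("b", "b", escaped));

        // $1 refers to capture group 1, which matches the empty string for the input "b".
        String backRefs = "$1".repeat(150) + "a"; // 301 chars
        System.out.println(backRefs.length() + " -> " + renamedLength("b", "(x?)b", backRefs));
    }
}
```

Both replacements would be rejected by a strict 255-byte limit even though the renamed index would be valid, which is the trade-off being discussed when choosing a multiplier or a small extra buffer.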

ywangd · 2025-11-12T03:11:45Z

...va/org/elasticsearch/action/admin/cluster/snapshots/restore/RestoreSnapshotRequestTests.java

+        RestoreSnapshotRequest request = new RestoreSnapshotRequest(TimeValue.THIRTY_SECONDS, "repo", "snapshot");
+        request.indices("b".repeat(255));
+        request.renamePattern("b");
+        request.renameReplacement("1".repeat(10_000_000));


We can randomize the length between 256 and 10_000_000.
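
One way to picture that suggestion, sketched with `java.util.Random` so it runs on its own. The class and helper names are hypothetical, and the real test would presumably use the randomization helpers of the Elasticsearch test framework rather than this code.

```java
import java.util.Random;

// Hypothetical helper for illustration; not part of the PR.
final class RandomReplacementLength {
    /** Builds a replacement of '1' characters whose length is drawn uniformly from [minLength, maxLength]. */
    static String randomReplacement(Random random, int minLength, int maxLength) {
        int length = minLength + random.nextInt(maxLength - minLength + 1);
        return "1".repeat(length);
    }

    public static void main(String[] args) {
        String replacement = randomReplacement(new Random(), 256, 10_000_000);
        System.out.println("generated replacement of length " + replacement.length());
    }
}
```

Randomizing the length means the test covers values just above the limit as well as very large ones, instead of a single fixed size.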

ywangd · 2025-11-12T03:41:08Z

server/src/internalClusterTest/java/org/elasticsearch/snapshots/RestoreSnapshotIT.java

+        RestoreSnapshotResponse restoreResponse = client().admin()
+            .cluster()
+            .prepareRestoreSnapshot(TEST_REQUEST_TIMEOUT, repoName, snapshotName)
+            .setIndices(indexName)
+            .setRenamePattern("b+")
+            .setRenameReplacement("a".repeat(255))
+            .setWaitForCompletion(true)
+            .get();


Can we test that a restore with rename pattern b fails as expected, i.e. it does not OOM? It should create a replacement name of "a".repeat(255 * 255), which then fails index name validation.

I think ideally we'd replace the replaceAll call with a manual loop of find and append calls that bails out as soon as the string builder grows too long, similar to how this safeReplace works. That way we avoid building a long string (up to 65,025 characters) unnecessarily, only to abandon it shortly after. The long name also fills the logs. The heap usage is certainly not as bad as, or even close to, that of the unbounded rename replacement. That said, a restore request could contain thousands of indices, and there could be multiple requests running concurrently (up to 50).
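
The 65,025 figure can be reproduced with plain `String.replaceAll`. The sketch below uses a hypothetical class name and is not taken from the PR; it only shows how a 255-character index name and a 255-character replacement combine when every character matches the pattern.

```java
// Hypothetical demo of the intermediate string that an unguarded replaceAll builds.
final class UnboundedRenameDemo {
    static String rename(String index, String renamePattern, String renameReplacement) {
        // replaceAll always builds the full result, however long it turns out to be.
        return index.replaceAll(renamePattern, renameReplacement);
    }

    public static void main(String[] args) {
        String renamed = rename("b".repeat(255), "b", "a".repeat(255));
        // Each of the 255 'b' characters is replaced by 255 'a' characters.
        System.out.println(renamed.length()); // prints 65025
    }
}
```

The result is bounded, but it is built in full before any length check can reject it, which is the cost an early-exit loop would avoid.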

I've added a test that verifies the 255×255 scenario fails with proper index name validation rather than OOM.

Regarding the manual loop implementation similar to safeReplace, I attempted this approach but ran into issues with accurately estimating sizes for back-references ($1, $2, etc.). The conservative estimation (assuming each group could expand to the full match size) would reject valid cases where capture groups only match portions of the text. These false rejections would likely be edge cases (complex regex patterns with multiple capture groups), but they're still valid operations that would be blocked.

For now, I've kept the simpler validation that prevents the unbounded OOM cases. While the 65KB intermediate strings aren't ideal, they're bounded, and I don't believe they'll cause an OOM error (although I haven't actually tested it with 50 concurrent requests). As you noted, the heap usage is nowhere near the unbounded case this PR fixes.

How much of a concern is the current replaceAll implementation for you? The manual loop would be more efficient, but with the tradeoff that some valid rename patterns may be rejected. Happy to implement it if you think the efficiency gain is worth that tradeoff, or it could be explored more in a follow-up PR.

I attempted this approach but ran into issues with accurately estimating sizes for back-references ($1, $2, etc.).
with the tradeoff that some valid rename patterns may be rejected.

I don't quite follow. I was thinking something like the following change to RestoreService.java

@@ -1074,7 +1075,22 @@ public final class RestoreService implements ClusterStateApplier {
                 if (prefix != null) {
                     index = index.substring(prefix.length());
                 }
-                renamedIndex = index.replaceAll(request.renamePattern(), request.renameReplacement());
+                final var matcher = Pattern.compile(request.renamePattern()).matcher(index);
+                var found = matcher.find();
+                if (found) {
+                    final var sb = new StringBuilder();
+                    do {
+                        matcher.appendReplacement(sb, request.renameReplacement());
+                        found = matcher.find();
+                    } while (found && sb.length() <= 255);
+                    if (sb.length() > 255) {
+                        throw new IllegalArgumentException();
+                    }
+                    matcher.appendTail(sb);
+                    renamedIndex = sb.toString();
+                } else {
+                    renamedIndex = index;
+                }
                 if (prefix != null) {
                     renamedIndex = prefix + renamedIndex;
                 }

Does it not work and could reject valid cases?

it could be explored more in a follow-up PR

I am OK for it to be a follow-up if you prefer.

Ahhh I see what you mean now, I think this avoids the issue I was running into... I was trying to copy the logic in safeReplace too closely, which does an estimation before actually making the string and led to that issue I described. I don't think we need to be that careful here, and this approach you provided should work without rejecting valid cases.

I will do some testing on it tomorrow.


ywangd

This looks great. I only had one minor comment. Thanks!

ywangd · 2025-11-14T02:31:35Z

server/src/main/java/org/elasticsearch/snapshots/RestoreService.java

+    private static String safeRenameIndex(String index, String renamePattern, String renameReplacement) {
+        final var matcher = Pattern.compile(renamePattern).matcher(index);
+        var found = matcher.find();
+        if (found) {
+            final var sb = new StringBuilder();
+            do {
+                matcher.appendReplacement(sb, renameReplacement);
+                found = matcher.find();
+            } while (found && sb.length() <= MAX_INDEX_NAME_BYTES);
+
+            if (sb.length() > MAX_INDEX_NAME_BYTES) {
+                throw new IllegalArgumentException("index name would exceed " + MAX_INDEX_NAME_BYTES + " bytes after rename");
+            }
+            matcher.appendTail(sb);
+            return sb.toString();
+        } else {
+            return index;
+        }
+    }


It would be great to have a unit test for this method itself.
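
A rough idea of what such a test could cover, written as a stand-alone program rather than in the Elasticsearch test framework. The method body is copied from the diff above; the class name, the `throwsOnRename` helper, and the chosen cases are assumptions made for this sketch, not the test that was eventually committed.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone harness; the real method is private in RestoreService.
final class SafeRenameIndexSketch {
    static final int MAX_INDEX_NAME_BYTES = 255; // value stated in the PR description

    // Copied from the reviewed diff so the checks below can run on their own.
    static String safeRenameIndex(String index, String renamePattern, String renameReplacement) {
        final var matcher = Pattern.compile(renamePattern).matcher(index);
        var found = matcher.find();
        if (found) {
            final var sb = new StringBuilder();
            do {
                matcher.appendReplacement(sb, renameReplacement);
                found = matcher.find();
            } while (found && sb.length() <= MAX_INDEX_NAME_BYTES);

            if (sb.length() > MAX_INDEX_NAME_BYTES) {
                throw new IllegalArgumentException("index name would exceed " + MAX_INDEX_NAME_BYTES + " bytes after rename");
            }
            matcher.appendTail(sb);
            return sb.toString();
        } else {
            return index;
        }
    }

    /** Returns true if the rename is rejected with an IllegalArgumentException. */
    static boolean throwsOnRename(String index, String renamePattern, String renameReplacement) {
        try {
            safeRenameIndex(index, renamePattern, renameReplacement);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(safeRenameIndex("index-1", "b", "a"));                      // no match: name unchanged
        System.out.println(safeRenameIndex("bbb", "b", "a"));                          // every match replaced
        System.out.println(safeRenameIndex("logs-2025", "logs-(\\d+)", "restored-$1")); // back-reference expanded
        System.out.println(throwsOnRename("b".repeat(255), "b", "a".repeat(255)));     // bails out early
    }
}
```

Two properties of the method as written are worth keeping in mind when choosing cases: `sb.length()` counts UTF-16 chars rather than UTF-8 bytes, and the tail is appended after the length check, so the returned name can still be longer than 255. Based on the earlier discussion, the regular index name validation is expected to reject such names afterwards.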

ywangd

LGTM

Thanks for the iterations!

elasticsearchmachine · 2025-11-18T15:29:35Z

💔 Backport failed

Status	Branch	Result
✅	9.2
❌	8.19	Commit could not be cherrypicked due to conflicts
❌	9.1	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 137859

…store request (elastic#137859) * Add length validation for rename_replacement parameter in snapshot restore * [CI] Auto commit changes from spotless * Update RestoreSnapshotIT.java * Update docs/changelog/137859.yaml * Fixup integ testing # Conflicts: # server/src/internalClusterTest/java/org/elasticsearch/snapshots/RestoreSnapshotIT.java * Fix integration tests * Add new IT test case for long replacement * Add buffer to rename_replacement size check; add safeRenameIndex() * Randomize rename_replacement in testRenameReplacementNameTooLong() * [CI] Auto commit changes from spotless * Add unit test for safeRenameIndex() * fix testRenameReplacementNameTooLong() to use createTestInstance() * [CI] Auto commit changes from spotless --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>

…store request (#137859) (#138244) * Add length validation for rename_replacement parameter in snapshot restore * [CI] Auto commit changes from spotless * Update RestoreSnapshotIT.java * Update docs/changelog/137859.yaml * Fixup integ testing # Conflicts: # server/src/internalClusterTest/java/org/elasticsearch/snapshots/RestoreSnapshotIT.java * Fix integration tests * Add new IT test case for long replacement * Add buffer to rename_replacement size check; add safeRenameIndex() * Randomize rename_replacement in testRenameReplacementNameTooLong() * [CI] Auto commit changes from spotless * Add unit test for safeRenameIndex() * fix testRenameReplacementNameTooLong() to use createTestInstance() * [CI] Auto commit changes from spotless --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>

gmjehovich · 2025-11-19T19:58:26Z

💚 All backports created successfully

Status	Branch	Result
✅	8.19
✅	9.1

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

…store request (elastic#137859) * Add length validation for rename_replacement parameter in snapshot restore * [CI] Auto commit changes from spotless * Update RestoreSnapshotIT.java * Update docs/changelog/137859.yaml * Fixup integ testing # Conflicts: # server/src/internalClusterTest/java/org/elasticsearch/snapshots/RestoreSnapshotIT.java * Fix integration tests * Add new IT test case for long replacement * Add buffer to rename_replacement size check; add safeRenameIndex() * Randomize rename_replacement in testRenameReplacementNameTooLong() * [CI] Auto commit changes from spotless * Add unit test for safeRenameIndex() * fix testRenameReplacementNameTooLong() to use createTestInstance() * [CI] Auto commit changes from spotless --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> (cherry picked from commit 6b59792) # Conflicts: # server/src/internalClusterTest/java/org/elasticsearch/snapshots/RestoreSnapshotIT.java

Add length validation for rename_replacement parameter in snapshot re…

453ca8d

…store

gmjehovich self-assigned this Nov 11, 2025

elasticsearchmachine added the v9.3.0 label Nov 11, 2025

elasticsearchmachine and others added 2 commits November 11, 2025 01:11

[CI] Auto commit changes from spotless

b9123ed

Merge branch 'main' into fix-restore-snapshot-oom

bde9c08

gmjehovich added 3 commits November 11, 2025 13:07

Update RestoreSnapshotIT.java

e69ee31

Merge branch 'main' into fix-restore-snapshot-oom

fe6788c

Merge branch 'main' into fix-restore-snapshot-oom

9daf473

gmjehovich marked this pull request as ready for review November 12, 2025 01:04

gmjehovich added the >bug label Nov 12, 2025

Update docs/changelog/137859.yaml

6571a86

gmjehovich requested a review from a team November 12, 2025 01:13

ywangd reviewed Nov 12, 2025


gmjehovich added 5 commits November 12, 2025 17:47

Fixup integ testing

1aa2589

# Conflicts: # server/src/internalClusterTest/java/org/elasticsearch/snapshots/RestoreSnapshotIT.java

Fix integration tests

1e02937

Add new IT test case for long replacement

dadf20b

Add buffer to rename_replacement size check; add safeRenameIndex()

9ed2e59

Randomize rename_replacement in testRenameReplacementNameTooLong()

bc4012e

gmjehovich force-pushed the fix-restore-snapshot-oom branch from 8428e51 to bc4012e Compare November 13, 2025 23:26

elasticsearchmachine and others added 2 commits November 13, 2025 23:33

[CI] Auto commit changes from spotless

de81fab

Merge branch 'main' into fix-restore-snapshot-oom

942876b

ywangd reviewed Nov 14, 2025


gmjehovich and others added 4 commits November 14, 2025 21:04

Add unit test for safeRenameIndex()

6c3552f

fix testRenameReplacementNameTooLong() to use createTestInstance()

6274e48

[CI] Auto commit changes from spotless

d35e36b

Merge branch 'main' into fix-restore-snapshot-oom

1f2892d

ywangd approved these changes Nov 16, 2025


gmjehovich merged commit 6b59792 into elastic:main Nov 18, 2025
34 checks passed

gmjehovich mentioned this pull request Nov 18, 2025

[9.2] Add length validation for rename_replacement parameter in snapshot restore request (#137859) #138244

Merged

elasticsearchmachine added the backport pending label Nov 18, 2025

gmjehovich mentioned this pull request Nov 19, 2025

[8.19] Add length validation for rename_replacement parameter in snapshot restore request (#137859) #138320

Open

gmjehovich mentioned this pull request Nov 19, 2025

[9.1] Add length validation for rename_replacement parameter in snapshot restore request (#137859) #138323

Open
