Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement repair functionality for aliases colliding with indices bug #91887

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,8 @@

package org.elasticsearch.cluster.metadata;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.Version;
import org.elasticsearch.action.admin.indices.rollover.RolloverInfo;
import org.elasticsearch.action.support.ActiveShardCount;
Expand Down Expand Up @@ -83,6 +85,8 @@

public class IndexMetadata implements Diffable<IndexMetadata>, ToXContentFragment {

private static final Logger logger = LogManager.getLogger(IndexMetadata.class);

public static final ClusterBlock INDEX_READ_ONLY_BLOCK = new ClusterBlock(
5,
"index read-only (api)",
Expand Down Expand Up @@ -1589,7 +1593,7 @@ public IndexMetadata apply(IndexMetadata part) {
builder.stats(stats);
builder.indexWriteLoadForecast(indexWriteLoadForecast);
builder.shardSizeInBytesForecast(shardSizeInBytesForecast);
return builder.build();
return builder.build(true);
}
}

Expand Down Expand Up @@ -1656,7 +1660,7 @@ public static IndexMetadata readFrom(StreamInput in, @Nullable Function<String,
builder.indexWriteLoadForecast(in.readOptionalDouble());
builder.shardSizeInBytesForecast(in.readOptionalLong());
}
return builder.build();
return builder.build(true);
}

/**
Expand Down Expand Up @@ -2012,6 +2016,11 @@ public Builder shardSizeInBytesForecast(Long shardSizeInBytesForecast) {
}

public IndexMetadata build() {
return build(false);
}

// package private for testing
IndexMetadata build(boolean repair) {
/*
* We expect that the metadata has been properly built to set the number of shards and the number of replicas, and do not rely
* on the default values here. Those must have been set upstream.
Expand Down Expand Up @@ -2140,7 +2149,16 @@ public IndexMetadata build() {
var aliasesMap = aliases.build();
for (AliasMetadata alias : aliasesMap.values()) {
if (alias.alias().equals(index)) {
throw new IllegalArgumentException("alias name [" + index + "] self-conflicts with index name");
if (repair && indexCreatedVersion.equals(Version.V_8_5_0)) {
var updatedBuilder = ImmutableOpenMap.builder(aliasesMap);
final var brokenAlias = updatedBuilder.remove(index);
final var fixedAlias = AliasMetadata.newAliasMetadata(brokenAlias, index + "-alias-corrupted-by-8-5");
aliasesMap = updatedBuilder.fPut(fixedAlias.getAlias(), fixedAlias).build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a risk that adding an alias with the same name as the index for an index created in 8.5 will now succeed and just result in the "corrupted" alias name instead?

Can we perhaps find or add an integration test that verifies that when adding aliases we have validation in place that catches this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right ... this is quite annoying. I could obviously add the validation for this at another level as well (not sure where, it seems not entirely trivial to add this check elsewhere in a way that would only catch mutations) but it does add some extra complexity to the whole thing. Is it worth doing that for this corner case? The name collision checks are quite complicated as it is already and checking for this specifically would add yet more complexity at some other layer ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a buildAndRepair method that is used from IndexMetadata.fromXContent, passing down a boolean to a private build method in order to only do the repair from that path?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recon my comment being slightly convoluted, so here is a diff of what I am thinking of:

Index: server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java b/server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java
--- a/server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java	(revision c6565a39e26f6c91f8e6f71f8021e07f8805639d)
+++ b/server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetadata.java	(date 1669302455554)
@@ -2016,6 +2016,10 @@
         }
 
         public IndexMetadata build() {
+            return build(false);
+        }
+
+        private IndexMetadata build(boolean repair) {
             /*
              * We expect that the metadata has been properly built to set the number of shards and the number of replicas, and do not rely
              * on the default values here. Those must have been set upstream.
@@ -2144,7 +2148,7 @@
             var aliasesMap = aliases.build();
             for (AliasMetadata alias : aliasesMap.values()) {
                 if (alias.alias().equals(index)) {
-                    if (indexCreatedVersion.equals(Version.V_8_5_0)) {
+                    if (repair && indexCreatedVersion.equals(Version.V_8_5_0)) {
                         var updatedBuilder = ImmutableOpenMap.builder(aliasesMap);
                         final var brokenAlias = updatedBuilder.remove(index);
                         final var fixedAlias = AliasMetadata.newAliasMetadata(brokenAlias, index + "-alias-corrupted-by-8-5");
@@ -2208,6 +2212,11 @@
             );
         }
 
+        // visible for testing.
+        IndexMetadata buildAndRepair() {
+            return build(true);
+        }
+
         @SuppressWarnings("unchecked")
         public static void toXContent(IndexMetadata indexMetadata, XContentBuilder builder, ToXContent.Params params) throws IOException {
             Metadata.XContentContext context = Metadata.XContentContext.valueOf(
@@ -2476,7 +2485,7 @@
             assert settingsVersion : "settings version should be present for indices created on or after 6.5.0";
             assert indexCreatedVersion(builder.settings).before(Version.V_7_2_0) || aliasesVersion
                 : "aliases version should be present for indices created on or after 7.2.0";
-            return builder.build();
+            return builder.buildAndRepair();
         }
 
         /**

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about that, but the would mean that the fix will only apply with a full cluster restart and not with a rolling restart? Wouldn't that be painful for Cloud?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also build and repair when receiving the index metadata over the wire maybe?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could I guess :) lets do it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to from--x-content, reading from the wire and (not sure this is necessary but I found it weird for it to behave different form normal transport read) diff application now :)

logger.warn("Repaired corrupted alias with the same name as its index for [{}]", index);
break;
} else {
throw new IllegalArgumentException("alias name [" + index + "] self-conflicts with index name");
}
}
}

Expand Down Expand Up @@ -2463,7 +2481,7 @@ public static IndexMetadata fromXContent(XContentParser parser, Map<String, Mapp
assert settingsVersion : "settings version should be present for indices created on or after 6.5.0";
assert indexCreatedVersion(builder.settings).before(Version.V_7_2_0) || aliasesVersion
: "aliases version should be present for indices created on or after 7.2.0";
return builder.build();
return builder.build(true);
}

/**
Expand Down Expand Up @@ -2574,7 +2592,7 @@ public static IndexMetadata legacyFromXContent(XContentParser parser) throws IOE
builder.putMapping(MappingMetadata.EMPTY_MAPPINGS); // just make sure it's not empty so that _source can be read
}

IndexMetadata indexMetadata = builder.build();
IndexMetadata indexMetadata = builder.build(true);
assert indexMetadata.getCreationVersion().isLegacyIndexVersion();
assert indexMetadata.getCompatibilityVersion().isLegacyIndexVersion();
return indexMetadata;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,7 @@
import static org.elasticsearch.cluster.metadata.IndexMetadata.parseIndexNameCounter;
import static org.hamcrest.Matchers.containsString;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.hasKey;
import static org.hamcrest.Matchers.is;

public class IndexMetadataTests extends ESTestCase {
Expand Down Expand Up @@ -497,16 +498,40 @@ public void testLifeCyclePolicyName() {
}

public void testIndexAndAliasWithSameName() {
final IllegalArgumentException iae = expectThrows(
IllegalArgumentException.class,
() -> IndexMetadata.builder("index")
.settings(Settings.builder().put(IndexMetadata.SETTING_VERSION_CREATED, Version.CURRENT))
.numberOfShards(1)
.numberOfReplicas(0)
.putAlias(AliasMetadata.builder("index").build())
.build()
);
assertEquals("alias name [index] self-conflicts with index name", iae.getMessage());
{
final IllegalArgumentException iae = expectThrows(
IllegalArgumentException.class,
() -> IndexMetadata.builder("index")
.settings(Settings.builder().put(IndexMetadata.SETTING_VERSION_CREATED, Version.CURRENT))
.numberOfShards(1)
.numberOfReplicas(0)
.putAlias(AliasMetadata.builder("index").build())
.build(randomBoolean())
);
assertEquals("alias name [index] self-conflicts with index name", iae.getMessage());
}
{
final IllegalArgumentException iae = expectThrows(
IllegalArgumentException.class,
() -> IndexMetadata.builder("index")
.settings(Settings.builder().put(IndexMetadata.SETTING_VERSION_CREATED, Version.V_8_5_0))
.numberOfShards(1)
.numberOfReplicas(0)
.putAlias(AliasMetadata.builder("index").build())
.build(false)
);
assertEquals("alias name [index] self-conflicts with index name", iae.getMessage());
}
}

public void testRepairIndexAndAliasWithSameName() {
final IndexMetadata indexMetadata = IndexMetadata.builder("index")
.settings(Settings.builder().put(IndexMetadata.SETTING_VERSION_CREATED, Version.V_8_5_0))
.numberOfShards(1)
.numberOfReplicas(0)
.putAlias(AliasMetadata.builder("index").build())
.build(true);
assertThat(indexMetadata.getAliases(), hasKey("index-alias-corrupted-by-8-5"));
}

private static Settings indexSettingsWithDataTier(String dataTier) {
Expand Down