Prevent in-place downgrades and invalid upgrades (#41731)
Downgrading an Elasticsearch node to an earlier version is unsupported, because
we do not make any attempt to guarantee that a node can read any of the on-disk
data written by a future version. Yet today we do not actively prevent
downgrades, and sometimes users will attempt to roll back a failed upgrade with
an in-place downgrade and get into an unrecoverable state.

This change adds the current version of the node to the node metadata file, and
checks the version found in this file against the current version at startup.
If the node cannot be sure of its ability to read the on-disk data then it
refuses to start, preserving any on-disk data in its upgraded state.

This change also adds a command-line tool to overwrite the node metadata file
without performing any version checks, to unsafely bypass these checks and
recover the historical and lenient behaviour.
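
The startup check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this commit: versions are modelled as plain integers, and the class name `NodeVersionCheckSketch`, the method `resolveVersionAtStartup`, the `MISSING` sentinel and the explicit `minCompatible` parameter are hypothetical names introduced only for this example.

```java
// Hypothetical sketch of a startup version check; versions are plain ints here.
final class NodeVersionCheckSketch {

    // Assumed sentinel for metadata files written before the version was recorded.
    static final int MISSING = 0;

    /**
     * Returns the version to persist after a successful check, or throws if the
     * on-disk data may not be readable by the current version.
     */
    static int resolveVersionAtStartup(int stored, int current, int minCompatible) {
        if (stored == MISSING) {
            // No version on disk: accept the data and stamp it with the current version.
            return current;
        }
        if (stored < minCompatible) {
            // The data is too old to be upgraded in a single step.
            throw new IllegalStateException(
                "cannot upgrade a node from version [" + stored + "] directly to version [" + current + "]");
        }
        if (stored > current) {
            // The data was written by a newer version: refuse the in-place downgrade.
            throw new IllegalStateException(
                "cannot downgrade a node from version [" + stored + "] to version [" + current + "]");
        }
        return current;
    }
}
```

With the hypothetical values `current = 70100` and `minCompatible = 60800`, a stored version of `70000` is accepted and restamped as `70100`, while a stored version of `70200` is rejected as a downgrade and the on-disk data is left untouched.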
DaveCTurner committed May 21, 2019
1 parent ec63160 commit 7abeaba
Showing 12 changed files with 556 additions and 40 deletions.
63 changes: 60 additions & 3 deletions docs/reference/commands/node-tool.asciidoc
@@ -4,22 +4,23 @@
The `elasticsearch-node` command enables you to perform certain unsafe
operations on a node that are only possible while it is shut down. This command
allows you to adjust the <<modules-node,role>> of a node and may be able to
-recover some data after a disaster.
+recover some data after a disaster or start a node even if it is incompatible
+with the data on disk.

[float]
=== Synopsis

[source,shell]
--------------------------------------------------
-bin/elasticsearch-node repurpose|unsafe-bootstrap|detach-cluster
+bin/elasticsearch-node repurpose|unsafe-bootstrap|detach-cluster|override-version
[--ordinal <Integer>] [-E <KeyValuePair>]
[-h, --help] ([-s, --silent] | [-v, --verbose])
--------------------------------------------------

[float]
=== Description

-This tool has three modes:
+This tool has four modes:

* `elasticsearch-node repurpose` can be used to delete unwanted data from a
node if it used to be a <<data-node,data node>> or a
@@ -36,6 +37,11 @@ This tool has three modes:
cluster bootstrapping was not possible, it also enables you to move nodes
into a brand-new cluster.

* `elasticsearch-node override-version` enables you to start up a node
even if the data in the data path was written by an incompatible version of
{es}. This may sometimes allow you to downgrade to an earlier version of
{es}.

[[node-tool-repurpose]]
[float]
==== Changing the role of a node
@@ -109,6 +115,25 @@ way forward that does not risk data loss, but it may be possible to use the
`elasticsearch-node` tool to construct a new cluster that contains some of the
data from the failed cluster.

[[node-tool-override-version]]
[float]
==== Bypassing version checks

The data that {es} writes to disk is designed to be read by the current version
and a limited set of future versions. It cannot generally be read by older
versions, nor by versions that are more than one major version newer. The data
stored on disk includes the version of the node that wrote it, and {es} checks
that it is compatible with this version when starting up.

In rare circumstances it may be desirable to bypass this check and start up an
{es} node using data that was written by an incompatible version. This may not
work if the format of the stored data has changed, and it is a risky process
because it is possible for the format to change in ways that {es} may
misinterpret, silently leading to data loss.

To bypass this check, you can use the `elasticsearch-node override-version`
tool to overwrite the version number stored in the data path with the current
version, causing {es} to believe that it is compatible with the on-disk data.

[[node-tool-unsafe-bootstrap]]
[float]
@@ -262,6 +287,9 @@ one-node cluster.
`detach-cluster`:: Specifies to unsafely detach this node from its cluster so
it can join a different cluster.

`override-version`:: Overwrites the version number stored in the data path so
that a node can start despite being incompatible with the on-disk data.

`--ordinal <Integer>`:: If there is <<max-local-storage-nodes,more than one
node sharing a data path>> then this specifies which node to target. Defaults
to `0`, meaning to use the first node in the data path.
@@ -423,3 +451,32 @@ Do you want to proceed?
Confirm [y/N] y
Node was successfully detached from the cluster
----

[float]
==== Bypassing version checks

Run the `elasticsearch-node override-version` command to overwrite the version
stored in the data path so that a node can start despite being incompatible
with the data stored in the data path:

[source, txt]
----
node$ ./bin/elasticsearch-node override-version
WARNING: Elasticsearch MUST be stopped before running this tool.
This data path was last written by Elasticsearch version [x.x.x] and may no
longer be compatible with Elasticsearch version [y.y.y]. This tool will bypass
this compatibility check, allowing a version [y.y.y] node to start on this data
path, but a version [y.y.y] node may not be able to read this data or may read
it incorrectly leading to data loss.
You should not use this tool. Instead, continue to use a version [x.x.x] node
on this data path. If necessary, you can use reindex-from-remote to copy the
data from here into an older cluster.
Do you want to proceed?
Confirm [y/N] y
Successfully overwrote this node's metadata to bypass its version compatibility checks.
----
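
As a rough illustration of what overwriting the stored version means, the sketch below models the node metadata as a simple map. The `node_id` and `node_version` keys match the field names used in `NodeMetaData`; the map representation, the class name `OverrideVersionSketch` and the assumption that every other field is carried over unchanged are hypothetical and are not taken from the command's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the override: the node metadata is represented as a map.
final class OverrideVersionSketch {

    /**
     * Returns a copy of the metadata whose stored version is replaced by the
     * current version. No compatibility check is performed, which is what
     * makes the operation unsafe.
     */
    static Map<String, Object> overrideVersion(Map<String, Object> metadata, int currentVersionId) {
        Map<String, Object> rewritten = new LinkedHashMap<>(metadata);
        // Assumption: all other fields, such as node_id, are kept as they were.
        rewritten.put("node_version", currentVersionId);
        return rewritten;
    }
}
```

After such a rewrite a startup check that compares the stored version with the current one would pass, whether or not the rest of the data path is actually readable.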
@@ -44,7 +44,7 @@
public abstract class ElasticsearchNodeCommand extends EnvironmentAwareCommand {
private static final Logger logger = LogManager.getLogger(ElasticsearchNodeCommand.class);
protected final NamedXContentRegistry namedXContentRegistry;
-static final String DELIMITER = "------------------------------------------------------------------------\n";
+protected static final String DELIMITER = "------------------------------------------------------------------------\n";

static final String STOP_WARNING_MSG =
DELIMITER +
@@ -81,9 +81,8 @@ protected void processNodePathsWithLock(Terminal terminal, OptionSet options, En
throw new ElasticsearchException(NO_NODE_FOLDER_FOUND_MSG);
}
processNodePaths(terminal, dataPaths, env);
-} catch (LockObtainFailedException ex) {
-throw new ElasticsearchException(
-FAILED_TO_OBTAIN_NODE_LOCK_MSG + " [" + ex.getMessage() + "]");
+} catch (LockObtainFailedException e) {
+throw new ElasticsearchException(FAILED_TO_OBTAIN_NODE_LOCK_MSG, e);
}
}

@@ -166,6 +165,18 @@ protected void writeNewMetaData(Terminal terminal, Manifest oldManifest, long ne
}
}

protected NodeEnvironment.NodePath[] toNodePaths(Path[] dataPaths) {
return Arrays.stream(dataPaths).map(ElasticsearchNodeCommand::createNodePath).toArray(NodeEnvironment.NodePath[]::new);
}

private static NodeEnvironment.NodePath createNodePath(Path path) {
try {
return new NodeEnvironment.NodePath(path);
} catch (IOException e) {
throw new ElasticsearchException("Unable to investigate path [" + path + "]", e);
}
}

//package-private for testing
OptionParser getParser() {
return parser;
@@ -22,6 +22,7 @@
import org.elasticsearch.cli.MultiCommand;
import org.elasticsearch.cli.Terminal;
import org.elasticsearch.env.NodeRepurposeCommand;
import org.elasticsearch.env.OverrideNodeVersionCommand;

// NodeToolCli does not extend LoggingAwareCommand, because LoggingAwareCommand performs logging initialization
// after LoggingAwareCommand instance is constructed.
@@ -39,6 +40,7 @@ public NodeToolCli() {
subcommands.put("repurpose", new NodeRepurposeCommand());
subcommands.put("unsafe-bootstrap", new UnsafeBootstrapMasterCommand());
subcommands.put("detach-cluster", new DetachClusterCommand());
subcommands.put("override-version", new OverrideNodeVersionCommand());
}

public static void main(String[] args) throws Exception {
11 changes: 8 additions & 3 deletions server/src/main/java/org/elasticsearch/env/NodeEnvironment.java
@@ -31,6 +31,7 @@
import org.apache.lucene.store.NativeFSLockFactory;
import org.apache.lucene.store.SimpleFSDirectory;
import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.Version;
import org.elasticsearch.cluster.metadata.IndexMetaData;
import org.elasticsearch.cluster.node.DiscoveryNode;
import org.elasticsearch.common.CheckedFunction;
@@ -250,7 +251,7 @@ public NodeEnvironment(Settings settings, Environment environment) throws IOExce
sharedDataPath = null;
locks = null;
nodeLockId = -1;
-nodeMetaData = new NodeMetaData(generateNodeId(settings));
+nodeMetaData = new NodeMetaData(generateNodeId(settings), Version.CURRENT);
return;
}
boolean success = false;
@@ -395,7 +396,6 @@ private void maybeLogHeapDetails() {
logger.info("heap size [{}], compressed ordinary object pointers [{}]", maxHeapSize, useCompressedOops);
}


/**
* scans the node paths and loads existing metaData file. If not found a new meta data will be generated
* and persisted into the nodePaths
@@ -405,10 +405,15 @@ private static NodeMetaData loadOrCreateNodeMetaData(Settings settings, Logger l
final Path[] paths = Arrays.stream(nodePaths).map(np -> np.path).toArray(Path[]::new);
NodeMetaData metaData = NodeMetaData.FORMAT.loadLatestState(logger, NamedXContentRegistry.EMPTY, paths);
if (metaData == null) {
-metaData = new NodeMetaData(generateNodeId(settings));
+metaData = new NodeMetaData(generateNodeId(settings), Version.CURRENT);
+} else {
+metaData = metaData.upgradeToCurrentVersion();
}

// we write again to make sure all paths have the latest state file
assert metaData.nodeVersion().equals(Version.CURRENT) : metaData.nodeVersion() + " != " + Version.CURRENT;
NodeMetaData.FORMAT.writeAndCleanup(metaData, paths);

return metaData;
}

72 changes: 56 additions & 16 deletions server/src/main/java/org/elasticsearch/env/NodeMetaData.java
@@ -19,6 +19,7 @@

package org.elasticsearch.env;

import org.elasticsearch.Version;
import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.xcontent.ObjectParser;
import org.elasticsearch.common.xcontent.XContentBuilder;
@@ -31,66 +32,104 @@
import java.util.Objects;

/**
- * Metadata associated with this node. Currently only contains the unique uuid describing this node.
+ * Metadata associated with this node: its persistent node ID and its version.
* The metadata is persisted in the data folder of this node and is reused across restarts.
*/
public final class NodeMetaData {

private static final String NODE_ID_KEY = "node_id";
private static final String NODE_VERSION_KEY = "node_version";

private final String nodeId;

-public NodeMetaData(final String nodeId) {
+private final Version nodeVersion;

+public NodeMetaData(final String nodeId, final Version nodeVersion) {
this.nodeId = Objects.requireNonNull(nodeId);
this.nodeVersion = Objects.requireNonNull(nodeVersion);
}

@Override
public boolean equals(Object o) {
-if (this == o) {
-return true;
-}
-if (o == null || getClass() != o.getClass()) {
-return false;
-}

+if (this == o) return true;
+if (o == null || getClass() != o.getClass()) return false;
NodeMetaData that = (NodeMetaData) o;

-return Objects.equals(this.nodeId, that.nodeId);
+return nodeId.equals(that.nodeId) &&
+nodeVersion.equals(that.nodeVersion);
}

@Override
public int hashCode() {
-return this.nodeId.hashCode();
+return Objects.hash(nodeId, nodeVersion);
}

@Override
public String toString() {
-return "node_id [" + nodeId + "]";
+return "NodeMetaData{" +
+"nodeId='" + nodeId + '\'' +
+", nodeVersion=" + nodeVersion +
+'}';
}

private static ObjectParser<Builder, Void> PARSER = new ObjectParser<>("node_meta_data", Builder::new);

static {
PARSER.declareString(Builder::setNodeId, new ParseField(NODE_ID_KEY));
PARSER.declareInt(Builder::setNodeVersionId, new ParseField(NODE_VERSION_KEY));
}

public String nodeId() {
return nodeId;
}

public Version nodeVersion() {
return nodeVersion;
}

public NodeMetaData upgradeToCurrentVersion() {
if (nodeVersion.equals(Version.V_EMPTY)) {
assert Version.CURRENT.major <= Version.V_7_0_0.major + 1 : "version is required in the node metadata from v9 onwards";
return new NodeMetaData(nodeId, Version.CURRENT);
}

if (nodeVersion.before(Version.CURRENT.minimumIndexCompatibilityVersion())) {
throw new IllegalStateException(
"cannot upgrade a node from version [" + nodeVersion + "] directly to version [" + Version.CURRENT + "]");
}

if (nodeVersion.after(Version.CURRENT)) {
throw new IllegalStateException(
"cannot downgrade a node from version [" + nodeVersion + "] to version [" + Version.CURRENT + "]");
}

return nodeVersion.equals(Version.CURRENT) ? this : new NodeMetaData(nodeId, Version.CURRENT);
}

private static class Builder {
String nodeId;
Version nodeVersion;

public void setNodeId(String nodeId) {
this.nodeId = nodeId;
}

public void setNodeVersionId(int nodeVersionId) {
this.nodeVersion = Version.fromId(nodeVersionId);
}

public NodeMetaData build() {
-return new NodeMetaData(nodeId);
+final Version nodeVersion;
+if (this.nodeVersion == null) {
+assert Version.CURRENT.major <= Version.V_7_0_0.major + 1 : "version is required in the node metadata from v9 onwards";
+nodeVersion = Version.V_EMPTY;
+} else {
+nodeVersion = this.nodeVersion;
+}

+return new NodeMetaData(nodeId, nodeVersion);
}
}


public static final MetaDataStateFormat<NodeMetaData> FORMAT = new MetaDataStateFormat<NodeMetaData>("node-") {

@Override
@@ -103,10 +142,11 @@ protected XContentBuilder newXContentBuilder(XContentType type, OutputStream str
@Override
public void toXContent(XContentBuilder builder, NodeMetaData nodeMetaData) throws IOException {
builder.field(NODE_ID_KEY, nodeMetaData.nodeId);
builder.field(NODE_VERSION_KEY, nodeMetaData.nodeVersion.id);
}

@Override
-public NodeMetaData fromXContent(XContentParser parser) throws IOException {
+public NodeMetaData fromXContent(XContentParser parser) {
return PARSER.apply(parser, null).build();
}
};
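
The handling of metadata files written before this change, which have no `node_version` field, can be pictured with the following sketch. It is only an approximation under stated assumptions: the parsed fields are modelled as a map rather than going through `ObjectParser`, and the class name `LegacyNodeMetaDataSketch`, the method `storedVersionId` and the value `0` chosen for the "empty" sentinel are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch: reading the stored version id from parsed metadata fields.
final class LegacyNodeMetaDataSketch {

    // Assumed id for the "empty" version used when no version was recorded.
    static final int V_EMPTY_ID = 0;

    /**
     * Returns the stored version id, or the empty sentinel if the metadata
     * predates the node_version field.
     */
    static int storedVersionId(Map<String, Object> fields) {
        Object raw = fields.get("node_version");
        if (raw == null) {
            // Legacy file: the caller is expected to treat this as upgradable.
            return V_EMPTY_ID;
        }
        return ((Number) raw).intValue();
    }
}
```

Mapping a missing field to a sentinel, rather than failing the parse, keeps existing data paths readable, and the later upgrade step can then stamp them with the current version.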
@@ -172,10 +172,6 @@ private String toIndexName(NodeEnvironment.NodePath[] nodePaths, String uuid) {
}
}

-private NodeEnvironment.NodePath[] toNodePaths(Path[] dataPaths) {
-return Arrays.stream(dataPaths).map(NodeRepurposeCommand::createNodePath).toArray(NodeEnvironment.NodePath[]::new);
-}

private Set<String> indexUUIDsFor(Set<Path> indexPaths) {
return indexPaths.stream().map(Path::getFileName).map(Path::toString).collect(Collectors.toSet());
}
@@ -226,14 +222,6 @@ private final Set<Path> uniqueParentPaths(Collection<Path>... paths) {
return Arrays.stream(paths).flatMap(Collection::stream).map(Path::getParent).collect(Collectors.toSet());
}

-private static NodeEnvironment.NodePath createNodePath(Path path) {
-try {
-return new NodeEnvironment.NodePath(path);
-} catch (IOException e) {
-throw new ElasticsearchException("Unable to investigate path: " + path + ": " + e.getMessage());
-}
-}

//package-private for testing
OptionParser getParser() {
return parser;