New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use aliases for reopened indices #3897

Merged
merged 15 commits into from Jun 22, 2017

Conversation

Projects
None yet
2 participants
@dennisoelkers
Member

dennisoelkers commented Jun 8, 2017

Description

Motivation and Context

Before this change, a custom index setting (graylog2_reopened) was
used to flag reopened indices to be exempt from retention/rotation.
Starting with Elasticsearch 5.x, custom index settings are not supported
anymore.

This change now creates a special alias (<index name>_reopened)
pointing to the index flagging it as being reopened.

This change is downwards-compatible in that it includes a migration job that creates aliases for every index which is marked as reopened (either by using graylog2_reopened for 2.x or archived.graylog2_reopened for 5.x). There is no automated downgrade step which allows going back.

This change requires #3896 to be merged before.

Fixes #3871.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@dennisoelkers dennisoelkers added this to the 2.3.0 milestone Jun 8, 2017

@dennisoelkers dennisoelkers force-pushed the use-aliases-for-reopened-indices branch 2 times, most recently from f83d658 to 961c72f Jun 12, 2017

@kroepke kroepke self-requested a review Jun 13, 2017

@kroepke kroepke force-pushed the use-aliases-for-reopened-indices branch from 961c72f to bf26b76 Jun 13, 2017

@kroepke

I tested an upgrade from 2.4.2 to 5.4.2 while having a single reopened index from the master version of this code.
That was both with rebased to master and without.

Testing this code and closing/reopening indices both on 2.4.2 and 5.4.2 worked as expected, only the upgrade seems to have a problem.

Apart from the comments, lgtm!

@@ -18,6 +18,7 @@
import com.github.zafarkhaja.semver.Version;
import org.graylog2.indexer.cluster.Node;
import org.junit.Rule;

This comment has been minimized.

@kroepke

kroepke Jun 21, 2017

Member

this import is duplicated

This comment has been minimized.

@dennisoelkers
private JsonObject getIndexSettings(JsonObject indicesJson, String index) {
return Optional.ofNullable(asJsonObject(indicesJson))
.map(indices -> asJsonObject(indices.get(index)))

This comment has been minimized.

@kroepke

kroepke Jun 21, 2017

Member

This always yields an empty map for me when I have a single reopened index, causing the caller to skip the migration.

MongoIndexSet.Factory mongoIndexSetFactory,
Indices indices,
JestClient jestClient) {
this.elasticsearchVersion = node.getVersion().orElseThrow(() -> new ElasticsearchException("Unable to retrieve Elasticsearch version."));

This comment has been minimized.

@kroepke

kroepke Jun 21, 2017

Member

AFAICS this performs IO during injection, which can cause the server startup to fail.
Please move getting the version into the upgrade method.

This comment has been minimized.

@dennisoelkers

dennisoelkers Jun 21, 2017

Member

This is true, but it was done on purpose, so we do not have to pass the version through the whole call chain or look it up every time.

Causing the server startup to fail in case of an error was an unintended side effect though. But, isn't this what we should do in case of a failing migration, because all the rest of the code expects it to succeed and there is no way to retry it besides restarting the server?

This comment has been minimized.

@kroepke

kroepke Jun 21, 2017

Member

Personally I would always avoid IO during injection unless absolutely necessary as in general it is difficult to reason about how often it might be performed. Could be that in this case it is known, but IMHO it is an anti-pattern.
I strongly prefer to pass the version through the call chain, or even putting it into a mutable field in this case, maybe with a small comment explaining why that's ok in this class.

This comment has been minimized.

@kroepke

kroepke Jun 21, 2017

Member

As for the failing: If it is intended to fail, I think we should do that in a more obvious way. I know this is not something in the scope of this PR though.
Since we don't have a good strategy to deal with intermittent connection issues during start, I would vote for ignoring it for now.
None of the migrations can be re-run manually, I think all of them require a restart.

This comment has been minimized.

@dennisoelkers
import static org.graylog2.indexer.gson.GsonUtils.asJsonObject;
import static org.graylog2.indexer.gson.GsonUtils.asString;
public class V20170607164210_MigrateReopenedIndicesToAliases extends Migration {

This comment has been minimized.

@kroepke

kroepke Jun 21, 2017

Member

This class does not log anything, neither failures nor successes of its operation, making it very difficult to debug failures.

Also, while I appreciate the conciseness using streams and maps, they can be very difficult to debug, especially when they transform data which happens to be of an unexpected format, like it appears to be the case below. At the very least these methods should log something about their state when returning default values to aid debugging.

This comment has been minimized.

@dennisoelkers
final JsonElement value = entry.getValue();
if (value.isJsonObject()) {
final JsonObject indexSettingsJson = value.getAsJsonObject();
final JsonObject indexSettings = getIndexSettings(indexSettingsJson, indexName);

This comment has been minimized.

@kroepke

kroepke Jun 21, 2017

Member

Could this be replaced with a JsonPath expression instead of nested function calls?
Essentially we are trying to figure out whether one of settings.index.graylog2_reopened or archived.settings.index.graylog2_reopened exists or not for each index we looked at.

This comment has been minimized.

@dennisoelkers

dennisoelkers added some commits Jun 7, 2017

Making Elasticsearch Version injectable.
This allows us to reuse the mechanism of retrieving the version of the
connected Elasticsearch nodes in other parts of the code besides the
IndexMappingFactory.
Rewriting provider to generic service class.
This prevents I/O being done during injection, which could cause to
erratic and intransparent startup behavior in case of errors.
Marking reopened indices with alias instead of custom setting.
Before this change, a custom index setting ("graylog2_reopened") was
used to flag reopened indices to be exempt from retention/rotation.
Starting with Elasticsearch 5.x, custom index settings are not supported
anymore.

This change now creates a special alias ("<index name>_reopened")
pointing to the index flagging it as being reopened.

Fixes #3871.

@dennisoelkers dennisoelkers force-pushed the use-aliases-for-reopened-indices branch from bf26b76 to 39bd171 Jun 21, 2017

@kroepke

Apart from the comments, there's another change required in the archive code base, which I'll push myself (simple Gson to ObjectMapper change).

final String indexName = entry.getKey();
final JsonNode value = entry.getValue();
final JsonNode indexSettings = value.path("settings").path("index");

This comment has been minimized.

@kroepke

kroepke Jun 22, 2017

Member

Here we don't catch the archived section of the index settings in case we are upgrading.
I think we need to pass in the entire value into checkForReopened or simply inline that code and perform the missing settings check after the version check.

This happens when upgrading from 2.4.2 to 5.4.2 while migrating the reopen check.

This comment has been minimized.

@dennisoelkers
if (elasticsearchVersion.satisfies(">=2.1.0 & <5.0.0")) {
settings = indexSettings;
} else if (elasticsearchVersion.satisfies("^5.0.0")) {
settings = indexSettings.path("archived");

This comment has been minimized.

@kroepke

kroepke Jun 22, 2017

Member

As per the comment above, in case of upgrading to 5.x, the indexSettings here does not contain the archived subkey anymore, so the check comes back as false.

This comment has been minimized.

@dennisoelkers

dennisoelkers Jun 22, 2017

Member

Good catch, thanks!

@kroepke kroepke merged commit b7ab6e9 into master Jun 22, 2017

3 of 4 checks passed

graylog-project/pr Jenkins build graylog-project-pr-snapshot 266 has failed
Details
ci-web-linter Jenkins build graylog-pr-linter-check 1769 has succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@kroepke kroepke deleted the use-aliases-for-reopened-indices branch Jun 22, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment