TemplateUpgraders should be called during rolling restart #25263

Merged
merged 5 commits into from Jun 22, 2017

Conversation

@imotov
Member

commented Jun 15, 2017

In #24379 we added the ability to upgrade templates on full cluster startup. This PR also invokes the same update procedure when a new node first joins the cluster, so that templates can be updated during a rolling cluster restart as well.

Closes #24680

@abeyad, @spinscale could you take a look to make sure this will work for you?

@@ -460,6 +463,7 @@ protected Node(final Environment environment, Collection<Class<? extends Plugin>
b.bind(UsageService.class).toInstance(usageService);
b.bind(NamedWriteableRegistry.class).toInstance(namedWriteableRegistry);
b.bind(MetaDataUpgrader.class).toInstance(metaDataUpgrader);
b.bind(TemplateUpgradeService.class).toInstance(templateUpgradeService);

@rjernst

rjernst Jun 15, 2017

Member

It seems this is only bound in Guice so that tests can get access to it, so they can check if upgrades are in progress. Is this something that should be exposed via an api? At least let's find another way so we can keep additional stuff out of guice.

@imotov

imotov Jun 15, 2017

Author Member

Good point! I rewrote the test to get rid of guice there. I don't think we need an api for that. I was just trying to prevent a possible race condition in the test, but I found another way to do that without guice.

@abeyad
abeyad approved these changes Jun 17, 2017
Contributor

left a comment

I left a few minor nits and a question. Otherwise, LGTM

@@ -431,7 +439,6 @@ public static void toXContent(IndexTemplateMetaData indexTemplateMetaData, XCont
}
builder.endObject();

@abeyad

abeyad Jun 17, 2017

Contributor

nit: extra newline

* Upgrades Templates on behalf of installed {@link Plugin}s when a node joins the cluster
*/
public class TemplateUpgradeService extends AbstractComponent implements ClusterStateListener {
private final Tuple<Map<String, BytesReference>, Set<String>> EMPTY = new Tuple<>(Collections.emptyMap(), Collections.emptySet());

@abeyad

abeyad Jun 17, 2017

Contributor

EMPTY could have a more descriptive name

AtomicInteger updateCount = new AtomicInteger();
// Make sure all templates are recreated correctly
assertBusy(() -> {
logger.info("checking....");

@abeyad

abeyad Jun 17, 2017

Contributor

a more descriptive log message? or is this leftover?

assertThat(removedListener.getAndSet((ActionListener) args[1]), nullValue());
return null;
}).when(mockIndicesAdminClient).deleteTemplate(any(DeleteIndexTemplateRequest.class), any(ActionListener.class));

@abeyad

abeyad Jun 17, 2017

Contributor

nit: extra newline


public class TemplateUpgradeServiceTests extends ESTestCase {

private ClusterService clusterService = new ClusterService(Settings.EMPTY, new ClusterSettings(Settings.EMPTY, ClusterSettings

@abeyad

abeyad Jun 17, 2017

Contributor

final?

deleteTemplateListeners.add((ActionListener) args[1]);
return null;
}).when(mockIndicesAdminClient).deleteTemplate(any(DeleteIndexTemplateRequest.class), any(ActionListener.class));

@abeyad

abeyad Jun 17, 2017

Contributor

nit: extra newline

Collections.emptyList());

service.updateTemplates(additions, deletions);

@abeyad

abeyad Jun 17, 2017

Contributor

Should we add an assert here that putTemplateListeners.size() == additionsCount and deleteTemplateListeners.size() == deletionsCount?

@spinscale

spinscale Jun 20, 2017

Member

nit^2: assertThat(putTemplateListeners, hasSize(additionsCount));

putTemplateListeners.get(i).onFailure(new RuntimeException());
} else {
putTemplateListeners.get(i).onResponse(new PutIndexTemplateResponse(randomBoolean()) {

@abeyad

abeyad Jun 17, 2017

Contributor

It's not clear to me what the purpose of this response listener is. We are not asserting anything when invoking the action listener, so I'm not clear what its role is in the test.

@spinscale

Member

commented Jun 19, 2017

To make sure this is addressed (and I may have just missed this, which I assume): the current cluster state listener implementation of TemplateUpgradeService runs on every node in the cluster whenever there is a change. Does this also mean that when two nodes get the same cluster state change, they both try to store/delete the same templates?

@imotov

Member Author

commented Jun 19, 2017

@spinscale you didn't miss anything. We are changing the way templates are upgraded: they are upgraded when a node with a new version joins, not when it becomes the master, hence we need to run it on every node in the cluster. As a result, you are right, multiple nodes may try to create the same template at the same time. Not sure if we can do much about it given the requirements. Perhaps we can try optimizing MetaDataIndexTemplateService to not cause a cluster state update if the template is exactly the same.
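
A minimal sketch of that last idea, assuming templates can be compared by their serialized source. `TemplateStore`, `putIfChanged` and `stateVersion` are hypothetical names used only for illustration; this is not the MetaDataIndexTemplateService API, and the counter merely stands in for a cluster state update.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the template part of the cluster state.
public class TemplateStore {
    private final Map<String, String> templates = new HashMap<>();
    private int stateVersion = 0; // stands in for "a cluster state update happened"

    /** Stores the template and returns true only if the stored state actually changed. */
    public boolean putIfChanged(String name, String source) {
        if (source.equals(templates.get(name))) {
            return false; // identical template: skip the state update entirely
        }
        templates.put(name, source);
        stateVersion++;
        return true;
    }

    public int stateVersion() {
        return stateVersion;
    }
}
```

With a check like this, several nodes submitting the same template would trigger at most one state change; the remaining requests would become no-ops.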

@imotov

Member Author

commented Jun 19, 2017

@spinscale and I discussed this a bit more and we came up with a scheme that will allow the new nodes to update the templates but will prevent update swarms. Before trying to update the templates:

  1. A node will check if the master has the newest version. If it does and the node is the master, the master will perform the update. Otherwise the node will not do anything.
  2. If the master doesn't have the highest version, then the node with the highest id among all nodes with the highest version will perform the upgrade.
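
The two rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `UpgradeElection`, `NodeInfo` and `shouldLocalNodeUpgrade` are hypothetical names, versions are modelled as plain integers, and the node list is assumed to include the local node.

```java
import java.util.List;

public class UpgradeElection {
    /** Hypothetical stand-in for a cluster node: id, numeric version, master flag. */
    public static final class NodeInfo {
        public final String id;
        public final int version;
        public final boolean master;

        public NodeInfo(String id, int version, boolean master) {
            this.id = id;
            this.version = version;
            this.master = master;
        }
    }

    /** Returns true if the local node is the single node that should upgrade templates. */
    public static boolean shouldLocalNodeUpgrade(NodeInfo local, List<NodeInfo> nodes) {
        int maxVersion = local.version;
        NodeInfo master = null;
        for (NodeInfo n : nodes) {
            maxVersion = Math.max(maxVersion, n.version);
            if (n.master) {
                master = n;
            }
        }
        // Rule 1: if the master already runs the newest version, only the master upgrades.
        if (master != null && master.version == maxVersion) {
            return local.id.equals(master.id);
        }
        // Rule 2: otherwise the node with the highest id among the newest-version nodes upgrades.
        if (local.version != maxVersion) {
            return false;
        }
        for (NodeInfo n : nodes) {
            if (n.version == maxVersion && n.id.compareTo(local.id) > 0) {
                return false;
            }
        }
        return true;
    }
}
```

Because every node evaluates the same rules against the same cluster state, at most one node should decide to perform the upgrade for a given state.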
@imotov

Member Author

commented Jun 20, 2017

@spinscale, @abeyad, I pushed the new logic that ensures that only one node can try to make updates at a time. Could you take another look?

Member

left a comment

LGTM. I keep wondering whether the explicit master node check is really necessary or could just be part of the highest version check, but I don't have a strong opinion, to be honest. Thanks for doing this and coming up with the much better solution.


private final UnaryOperator<Map<String, IndexTemplateMetaData>> indexTemplateMetaDataUpgraders;

public final ClusterService clusterService;

@spinscale

spinscale Jun 20, 2017

Member

intentionally public along with the next two variables?


});
}
}

@spinscale

spinscale Jun 20, 2017

Member

you are calling the listeners here, but not doing any further assertions? i.e. for the updatesInProgress variable?

ExecutorService executorService = mock(ExecutorService.class);
when(threadPool.generic()).thenReturn(executorService);
doAnswer(invocation -> {
Object[] args = invocation.getArguments();

@spinscale

spinscale Jun 20, 2017

Member

you could just use a EsExecutors.newDirectExecutorService()

@spinscale

spinscale Jun 20, 2017

Member

oh... didn't see the updateInvocation counter... nvm

Collections.emptyList());

service.updateTemplates(additions, deletions);

@spinscale

spinscale Jun 20, 2017

Member

nit^2: assertThat(putTemplateListeners, hasSize(additionsCount));

}
}

public void testClientNodeRunsTemplateUpdates() {

@spinscale

spinscale Jun 20, 2017

Member

that should not be doesNotRunTemplateUpgrade I assume

}
for (ObjectCursor<DiscoveryNode> node : nodes.getMasterAndDataNodes().values()) {
if (node.value.getVersion().equals(maxVersion)) {
if (node.value.getId().compareTo(localNode.getId()) > 0) {

@spinscale

spinscale Jun 20, 2017

Member

can be combined to a single if-statement?

@abeyad
abeyad approved these changes Jun 20, 2017
Contributor

left a comment

LGTM

@imotov imotov merged commit e6e5ae6 into elastic:master Jun 22, 2017
2 checks passed
CLA Commit author is a member of Elasticsearch
elasticsearch-ci Build finished.
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Jun 23, 2017
* master:
  testCreateShrinkIndex: removed left over debugging log line that violated linting
  testCreateShrinkIndex should make sure to use the right source stats when testing shrunk target
  [Test] Add unit test for XContentParserUtilsTests.parseStoredFieldsValue (elastic#25288)
  Update percolate-query.asciidoc (elastic#25364)
  Remove remaining `index.mapping.single_type=false` (elastic#25369)
  test: Replace OldIndexBackwardsCompatibilityIT#testOldClusterStates with a full cluster restart qa test
  Fix use of spaces on Windows if JAVA_HOME not set
  ESIndexLevelReplicationTestCase.ReplicationAction#execute should send exceptions to it's listener rather than bubble them up
  testRecoveryAfterPrimaryPromotion shouldn't flush the replica with extra operations
  fix sort and string processor tests around targetField (elastic#25358)
  Ensure `InternalEngineTests.testConcurrentWritesAndCommits` doesn't pile up commits (elastic#25367)
  [TEST] Add debug logging if an unexpected exception is thrown
  Update Painless to Allow Augmentation from Any Class (elastic#25360)
  TemplateUpgraders should be called during rolling restart (elastic#25263)
imotov added a commit that referenced this pull request Jun 23, 2017
@clintongormley clintongormley added v6.0.0-beta1 and removed v6.0.0 labels Jul 25, 2017
5 participants