Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for merging custom meta data in tribe node #21552

Merged
merged 4 commits into from Nov 21, 2016

Conversation

Projects
None yet
3 participants
@areek
Copy link
Contributor

commented Nov 14, 2016

Currently, when any underlying cluster has custom metadata
(via plugin), tribe node does not store custom meta data in its
cluster state. This is because the tribe node has no idea how to
select the appropriate custom metadata from one or many custom
metadata (corresponding to the number of underlying clusters).

This change adds an interface that custom metadata implementations
can extend to add support for merging mulitple custom metadata of
the same type for storing in the tribe state.

Relates to #20544
Supersedes #20791
Closes #9372

@areek

This comment has been minimized.

Copy link
Contributor Author

commented Nov 14, 2016

@bleskes can you take a look?

@areek areek force-pushed the areek:enhancement/tribe-merge-custom-md branch Nov 14, 2016

@javanna

This comment has been minimized.

Copy link
Member

commented Nov 14, 2016

Seems like this PR fixes #9372 ?

@bleskes

This comment has been minimized.

Copy link
Member

commented Nov 15, 2016

FYI @areek is working on allowing custom type instances to be removed as well (which is currently not part of the PR)

@areek areek force-pushed the areek:enhancement/tribe-merge-custom-md branch 2 times, most recently Nov 15, 2016

@areek

This comment has been minimized.

Copy link
Contributor Author

commented Nov 15, 2016

@javanna this does implement a solution for #9372, it adds the merge functionality only for MetaData.Customs.
@bleskes I updated the PR according to our discussion offline, now we support removing custom types as well, can you take a look?

@areek areek force-pushed the areek:enhancement/tribe-merge-custom-md branch Nov 15, 2016

@bleskes
Copy link
Member

left a comment

This looks very good. I left some comments that I think will simplify the code.

interface MergableCustomMetaData<T extends MetaData.Custom> {

/**
* Merges this custom metadata with other, returning either this or <code>other</code> custom metadata

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 17, 2016

Member

nit: can we document that the neither original customs should be changed by this method?

This comment has been minimized.

Copy link
@areek

areek Nov 17, 2016

Author Contributor

added it to the docs

core/src/main/java/org/elasticsearch/tribe/TribeService.java Outdated
@@ -352,8 +375,7 @@ public String describeTasks(List<ClusterChangedEvent> tasks) {
BatchResult.Builder<ClusterChangedEvent> builder = BatchResult.builder();

try {
// we only need to apply the latest cluster state update
accumulator = applyUpdate(accumulator, tasks.get(tasks.size() - 1));
accumulator = applyUpdate(accumulator, tasks);

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 17, 2016

Member

this method became useless I think? the whole try catch can be removed as it's handled by the caller in the ClusterService. I think we should break applyUpdate into several methods each updating a different parts. i.e., updateNodes, updateIndices and now the now updateCustoms

This comment has been minimized.

Copy link
@areek

areek Nov 17, 2016

Author Contributor

Thanks for the suggestion. I decomposed applyUpdate to two methods updateNodes and updateIndicesAndMetaData instead. Currently, MetaData is updated when we update indices and when we update the customs, so it made sense to updateCustoms when we are done updating MetaData for updating indices. I think it would be messy to try to update MetaData in two separate methods from the top-level execute

core/src/main/java/org/elasticsearch/tribe/TribeService.java Outdated
final MetaData.Custom custom;
if (currentTribeName.equals(tribeName)) {
// current tribe node
custom = tribeState.metaData().custom(customMetaDataType);

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 17, 2016

Member

Why do you we need the custom of the tribe node?

This comment has been minimized.

Copy link
@areek

areek Nov 17, 2016

Author Contributor

Apologies for the misleading comment (updated it now). This special cases getting the custom for the tribeClient that triggered the current cluster state update. Because we have the latest cluster state from the clusterChangedEvent for this tribeClient, we use that instead of looking up the state from the tribeClient's cluster service.
I think this is the right thing to do here, though I have tested it out without this special case and it still works. I am worried if the tribeClient's state from the cluster service (which initiated the cluster state update) is outdated w.r.t its latest changes. WDYT @bleskes ?

core/src/main/java/org/elasticsearch/tribe/TribeService.java Outdated
// current tribe node
custom = tribeState.metaData().custom(customMetaDataType);
} else {
ClusterState currentTribeState = tribeNode.injector().getInstance(ClusterService.class).state();

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 17, 2016

Member

nit: can we have this ugliness in a getClusterService(Node node) method? we use it in doStart as well.

This comment has been minimized.

Copy link
@areek

areek Nov 17, 2016

Author Contributor

done

core/src/main/java/org/elasticsearch/tribe/TribeService.java Outdated
.collect(Collectors.toSet());
final List<Node> tribeNodes = TribeService.this.nodes;
Map<String, MetaData.Custom> mergedCustomMetaDataMap = mergeChangedCustomMetaData(changedCustomMetaDataTypeSet,
customMetaDataType -> {

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 17, 2016

Member

can we make this a proper getCurrentCustomsFromNodes method and test it etc?

This comment has been minimized.

Copy link
@areek

areek Nov 17, 2016

Author Contributor

we only get a list of MergableCustomMetaDatas from this for all the tribeClient nodes and if we do the special casing as mentioned here, extracting it to a static function and testing it would be difficult. I think we are better off beefing up the integration tests TribeIT to deal with multiple customs instead?

This comment has been minimized.

Copy link
@areek

areek Nov 17, 2016

Author Contributor

I added tribe integration test for merging multiple custom metadata types

core/src/main/java/org/elasticsearch/tribe/TribeService.java Outdated
// custom md has been removed
clusterStateChanged = true;
metaData.removeCustom(changedCustomMetaDataType);
} else if (mergedCustomMetaData.equals(tribeCustomMetaData) == false) {

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 17, 2016

Member

do we really need this extra protection? I mean the type was already in our changed type list?

This comment has been minimized.

Copy link
@areek

areek Nov 17, 2016

Author Contributor

no we don't, I removed the extra protection

core/src/main/java/org/elasticsearch/tribe/TribeService.java Outdated
List<MetaData.Custom> tribeCustomMetaDataList = customMetaDataByTribeNode.apply(customMetaDataType);
if (tribeCustomMetaDataList.isEmpty() == false) {
final MetaData.Custom mergedCustomMetaData;
if (tribeCustomMetaDataList.size() == 1) {

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 17, 2016

Member

I don't think you need an if else structure here? you can set the first and iterate over the rest? This can be simplified

This comment has been minimized.

Copy link
@areek

areek Nov 17, 2016

Author Contributor

I simplified it as suggested

Add support for merging custom meta data in tribe node
Currently, when any underlying cluster has custom metadata
(via plugin), tribe node does not store custom meta data in its
cluster state. This is because the tribe node has no idea how to
select the appropriate custom metadata from one or many custom
metadata (corresponding to the number of underlying clusters).

This change adds an interface that custom metadata implementations
can extend to add support for merging mulitple custom metadata of
the same type for storing in the tribe state.

Relates to #20544
Supersedes #20791

@areek areek force-pushed the areek:enhancement/tribe-merge-custom-md branch 2 times, most recently Nov 17, 2016

@areek areek force-pushed the areek:enhancement/tribe-merge-custom-md branch to cec9960 Nov 17, 2016

@areek

This comment has been minimized.

Copy link
Contributor Author

commented Nov 17, 2016

Thanks for the feedback @bleskes! I addressed your comments, left a question and added more tests. could you take another look?

@areek areek force-pushed the areek:enhancement/tribe-merge-custom-md branch to 16d6500 Nov 18, 2016

@bleskes
Copy link
Member

left a comment

Thx @areek . I left some more comments. I would be great to have a test that tests batching incoming cluster states as well.

core/src/main/java/org/elasticsearch/tribe/TribeService.java Outdated
} catch (Exception e) {
builder.failures(tasks, e);
ClusterState.Builder newState = ClusterState.builder(accumulator).incrementVersion();
boolean clusterStateChanged = updateNodes(accumulator, tasks, newState);

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 18, 2016

Member

accumulator is not currentstate?

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 18, 2016

Member

thinking about it more - do we even need this accumulator?

This comment has been minimized.

Copy link
@areek

areek Nov 18, 2016

Author Contributor

I cleaned up the code and removed the redundant accumulator

core/src/main/java/org/elasticsearch/tribe/TribeService.java Outdated
for (Node tribeClientNode : tribeClientNodes) {
String currentTribeName = TRIBE_NAME_SETTING.get(tribeClientNode.settings());
final MetaData.Custom custom;
if (currentTribeName.equals(tribeName)) {

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 18, 2016

Member

Answering here - I would prefer keeping the logic the same for all nodes i.e., any custom change is processed based on the latest cluster state of those nodes. No need to special case the current node as I don't see what it buys us but complexity? without it it can be a simple stream

This comment has been minimized.

Copy link
@areek

areek Nov 18, 2016

Author Contributor

okay, I am assuming tribe node receives the cluster changed event only after the change has been applied to the underlying cluster. I removed the special casing for this and converted into a simple stream

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 19, 2016

Member

yes it is - see here vs here.

Might be good to add an assertion the tribe listener to see that the cluster state is already exposed when it's called - this will prevent future trouble :)

core/src/main/java/org/elasticsearch/tribe/TribeService.java Outdated
MetaData.Custom mergedCustomMetaData = mergedCustomMetaDataMap.get(changedCustomMetaDataType);
if (mergedCustomMetaData == null) {
// custom md has been removed
clusterStateChanged = true;

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 18, 2016

Member

this can be out of if now.

core/src/main/java/org/elasticsearch/tribe/TribeService.java Outdated

// pkg-private for testing
static Map<String, MetaData.Custom> mergeChangedCustomMetaData(Set<String> changedCustomMetaDataTypeSet,
Function<String, List<MetaData.Custom>> customMetaDataByTribeNode) {

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 18, 2016

Member

we can make this a function to List<MergableCustomMetaData> - we alreay check with instance of in the implementation of it.

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 18, 2016

Member

if we do so, I think this becomes a simple stream reduce

This comment has been minimized.

Copy link
@areek

areek Nov 18, 2016

Author Contributor

changed it to accept a function returning List<MergableCustomMetaData and doing so did simplify the mergeChangedCustomMetaData function to a stream reduce

core/src/test/java/org/elasticsearch/tribe/TribeServiceTests.java Outdated

public void testMergeMultipleCustomMetaData() {
Map<String, List<MetaData.Custom>> inputMap = new HashMap<>();
inputMap.put(MergableCustomMetaData1.TYPE,

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 18, 2016

Member

I think we need some testing with removed customs? i.e., nulls?

This comment has been minimized.

Copy link
@areek

areek Nov 18, 2016

Author Contributor

the mergeChangedCustomMetaData never gets null values for removed customs, instead they are not supplied by the customMetaDataByTribeNode function at all when a mergable custom metadata has been removed. I added tests for removing customs in TribeIT (testMergingRemovedCustomMetaData and testMergingMultipleCustomMetaData)

@areek areek force-pushed the areek:enhancement/tribe-merge-custom-md branch Nov 18, 2016

@areek areek force-pushed the areek:enhancement/tribe-merge-custom-md branch to ae4c137 Nov 18, 2016

@bleskes
Copy link
Member

left a comment

LGTM. I like the last iteration a lot. Did you see my question about testing bulk processing of incoming states? Maybe I missed it.

}
return tribeCustomMetaDataList;
}
customMetaDataType -> tribeClientNodes.stream()

This comment has been minimized.

Copy link
@bleskes
MetaData.Custom mergedCustomMetaData = mergedCustomMetaDataMap.get(changedCustomMetaDataType);
if (mergedCustomMetaData == null) {
// we ignore merging custom md which doesn't implement MergableCustomMetaData interface
if (currentState.metaData().custom(changedCustomMetaDataType) instanceof MergableCustomMetaData) {

This comment has been minimized.

Copy link
@bleskes

bleskes Nov 19, 2016

Member

I'm wondering if we really need this check - when is it relevant?

This comment has been minimized.

Copy link
@areek

areek Nov 21, 2016

Author Contributor

when a non-mergable custom md changes, it is in changedCustomMetaDataTypeSet but is ignored by mergeChangedCustomMetaData (returns a null for the custom type). So we can delete non-mergable custom md from the tribe cluster, while trying to merge mergable custom md. This check prevents deleting these non-mergable custom md.

@areek

This comment has been minimized.

Copy link
Contributor Author

commented Nov 19, 2016

@bleskes I am working on writing some simple unit tests for TribeNodeClusterStateTaskExecutor (by making it to a static class), I plan on testing the bulk processing of incoming states there

@areek

This comment has been minimized.

Copy link
Contributor Author

commented Nov 21, 2016

I am going to merge this to master and backport it to 5.x branch now. Will add the unit tests (including tsting bulk processing of incoming states) for TribeNodeClusterStateTaskExecutor in a subsequent PR.

@areek areek merged commit 0ccf8a7 into elastic:master Nov 21, 2016

2 checks passed

CLA Commit author is a member of Elasticsearch
Details
elasticsearch-ci Build finished.
Details

areek added a commit that referenced this pull request Nov 21, 2016

Add support for merging custom meta data in tribe node (#21552)
* Add support for merging custom meta data in tribe node

Currently, when any underlying cluster has custom metadata
(via plugin), tribe node does not store custom meta data in its
cluster state. This is because the tribe node has no idea how to
select the appropriate custom metadata from one or many custom
metadata (corresponding to the number of underlying clusters).

This change adds an interface that custom metadata implementations
can extend to add support for merging mulitple custom metadata of
the same type for storing in the tribe state.

Relates to #20544
Supersedes #20791

* Simplify updating tribe state

* Add tests for merging multiple custom metadata types in tribe node

* cleanup merging custom md logic in tribe service
@areek

This comment has been minimized.

Copy link
Contributor Author

commented Nov 21, 2016

Thanks for the reviews, @bleskes! merged to master and backported to 5.x

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Nov 22, 2016

Merge branch 'master' into lazy-resolve
* master: (42 commits)
  Add support for merging custom meta data in tribe node (elastic#21552)
  [DOCS] Show EC2's auto attribute (elastic#21474)
  Add information about the removal of store throttling to the migration guide.
  Add a recommendation against large documents to the docs. (elastic#21652)
  Add indices options tests to search api REST tests (elastic#21701)
  Fixing indentation in geospatial querying example. (elastic#21682)
  Fix typo in filters aggregation docs (elastic#21690)
  Add BWC layer for Exceptions (elastic#21694)
  Add checkstyle rule to forbid empty javadoc comments (elastic#20881)
  Docs: Added offline install link for discovery-file plugin
  remove pointless catch exception in TransportSearchAction (elastic#21689)
  Rename ClusterState#lookupPrototypeSafe to `lookupPrototype` and remove previous "unsafe" unused variant (elastic#21686)
  Use a buffer to do character to byte conversion in StreamOutput#writeString (elastic#21680)
  Fix integer overflows when dealing with templates. (elastic#21628)
  Fix highlighting on a stored keyword field (elastic#21645)
  Set execute permissions for native plugin programs (elastic#21657)
  adjust visibility of DiscoveryNodes.Delta constructor
  Remove unused DiscoveryNodes.Delta constructor
  Remove unused DiscoveryNode#removeDeadMembers public method
  Remove minNodeVersion and corresponding public `getSmallestVersion` getter method from DiscoveryNodes
  ...

areek added a commit to areek/elasticsearch that referenced this pull request Nov 29, 2016

Add unit test for TribeNodeClusterStateTaskExecutor
This commit makes TribeNodeClusterStateTaskExecutor a static class
for unit testing. TribeNodeClusterStateTaskExecutor is responsible
for applying cluster state updates to the tribe node state whenever
an underlying cluster state is updated (i.e. adding/removing indices,
updating nodes and merging custom metadata). The unit tests ensure
the tribe state is properly updating when underlying cluster state
tasks are seen by the tribe node in batches or as a single update.

relates elastic#21552
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.