Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for cluster state diffs #10212

Merged
merged 1 commit into from Apr 27, 2015

Conversation

Projects
None yet
4 participants
@imotov
Copy link
Member

commented Mar 23, 2015

First iteration of cluster state diffs that adds support for diffs to the most frequently changing elements - cluster state, meta data and routing table.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Mar 23, 2015

I will review this today or tomorrow...

@s1monw s1monw self-assigned this Mar 23, 2015

return this.uuid;
}

public String getUuid() {

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 24, 2015

Contributor

can we just use getUuid() no need to have two

@@ -171,7 +200,70 @@ public void handleException(TransportException exp) {
}
}

private void resendFullClusterState(final ClusterState clusterState, final DiscoveryNode node, final AtomicBoolean timedOutWaitingForNodes, final BlockingClusterStatePublishResponseHandler publishResponseHandler) {

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 24, 2015

Contributor

can we have unittests for this stuff I think it's pretty much only tested in integration tests. We should abstract stuff away to make it unittestable!

try {
if (in.readBoolean()) {
clusterState = ClusterState.Builder.readFrom(in, nodesProvider.nodes().localNode());
;

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 24, 2015

Contributor

extra ;

@@ -159,7 +159,7 @@ public void performStateRecovery(final GatewayStateRecoveredListener listener) t
}
ClusterState.Builder builder = ClusterState.builder(clusterName);
builder.metaData(metaDataBuilder);
listener.onSuccess(builder.build());
listener.onSuccess(builder.randomUuid().build());

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 24, 2015

Contributor

I think we should have a randomUUID by default and only if somebody specify one explicitly we should use it. ie. remove the method entirely?

This comment has been minimized.

Copy link
@bleskes
@s1monw

This comment has been minimized.

Copy link
Contributor

commented Mar 24, 2015

I went over this and I have to admit I am in the middle of a jetlag. I really like it. What I miss is extensive unittesting of the individual components. I think there should be way more test code that is not integration tests to make this fly. I think code-wise we are super close.

@@ -591,7 +614,7 @@ public Builder removeCustom(String type) {
}

public ClusterState build() {
return new ClusterState(clusterName, version, metaData, routingTable, nodes, blocks, customs.build());
return new ClusterState(clusterName, version, uuid, metaData, routingTable, nodes, blocks, customs.build());

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 24, 2015

Member

Random thought I want to make sure is not lost :) - does it make sense to assert that uuid is not _na_ by the time we get here?


private class ClusterStateDiff implements Diff {

private final long version;

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 24, 2015

Member

can we maybe name this with something that indicates its the version of the after ClusterState? afterVersion sounds bad. If we use to and from in the ctor we can have toVersion, fromUuid, toUuid etc :)

This comment has been minimized.

Copy link
@imotov

imotov Mar 26, 2015

Author Member

Anything that doesn't start with "previous" is "after". If I will just make toVersion and leave everything else with "to" prefix it will be inconsistent. If I will add "to" prefix everywhere it will be redundant.

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 26, 2015

Member

Indeed redundant but clear (I had to double check because I wasn't aware of the convention, as most people reading this will be). I suggested the to & from to make it shorter so less of an overheard. as said, just a suggestion.

/**
* Returns serializable object representing differences between this and previousState
*/
Diff diffs(T previousState);

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 24, 2015

Member

can we call it diff? it's just a single one

/**
* Reads the {@link org.elasticsearch.cluster.Diff} from StreamInput and applies them to this
*/
T readDiffs(StreamInput in) throws IOException;

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 24, 2015

Member

maybe call this readDiffAndApply? this doesn't really read a diff and return it.

}
}

public ClusterState readDiffs(StreamInput in, DiscoveryNode localNode) throws IOException {

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 24, 2015

Member

do we need the localNode param here? can't we take it from the current state?

for (ObjectObjectCursor<String, T> partIter : after) {
T beforePart = before.get(partIter.key);
if (!partIter.value.equals(beforePart)) {
if (beforePart == null) {

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 24, 2015

Member

I think we can check the beforePart == null out of the if(!..equals) and it will make it cleaner.

}

/**
* Reads Diff object produced by {@link org.elasticsearch.cluster.DiffableUtils#diff(org.elasticsearch.common.collect.ImmutableOpenMap, org.elasticsearch.common.collect.ImmutableOpenMap)} method

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 24, 2015

Member

do we really need all of these org.elasticsearch.cluster paths?


/**
* Reads Diff object produced by {@link org.elasticsearch.cluster.DiffableUtils#diff(org.elasticsearch.common.collect.ImmutableOpenMap, org.elasticsearch.common.collect.ImmutableOpenMap)} method
*

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 25, 2015

Member

can we also add a line saying this variant allows for custom value deserialization based on the map using a KeyedReader? I got confused by why this is useful (until I saw how it is used).

}

/**
* A reader that can deserialize an object

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 25, 2015

Member

can we be slightly more verbose on why this is use full? (different deserialization logic based on key in parent map)

* A reader that can deserialize an object
* @param <T>
*/
public static interface KeyedReader<T> {

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 25, 2015

Member

static is not needed.

@@ -38,6 +38,7 @@
import org.elasticsearch.test.ElasticsearchIntegrationTest;
import org.elasticsearch.test.InternalTestCluster;
import org.elasticsearch.test.TestCluster;
import org.elasticsearch.test.junit.annotations.TestLogging;

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 25, 2015

Member

left over?

* Thrown by {@link org.elasticsearch.cluster.Diffable#readDiffs(org.elasticsearch.common.io.stream.StreamInput, org.elasticsearch.cluster.node.DiscoveryNode)} method
*/
public class IncompatibleClusterStateVersionException extends ElasticsearchException {
public IncompatibleClusterStateVersionException(String msg) {

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 25, 2015

Member

Nit picky: I wonder if we should make this ctor get all the parameters and construct the message locally. Will be easier to read the code and use, imho.


import java.io.IOException;

public abstract class SimpleDiffable<T> implements Diffable<T> {

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 25, 2015

Member

can we add java docs?

return Builder.readFrom(in, localNode);
}

public Diff diffs(DiscoveryNodes previousState) {

This comment has been minimized.

Copy link
@bleskes

bleskes Mar 25, 2015

Member

this is SimpleDiffable no?

@imotov

This comment has been minimized.

Copy link
Member Author

commented Apr 16, 2015

@s1monw, I push another commit. I was thinking about the unittest for diffables and I started to write them, but I quickly found that they are way to close to the code in the diffable itself. So, basically they tend to follow the structure and therefore most likely to repeat the errors. For example, if you forget to send a diff for a field. So, I took a different (and very randomized approach) that was implemented in ClusterStateDiffTests. It tests each diffable in common framework. by applying random changes to the random diffables and making sure they are propagated through diffs correctly. What do you think?

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Apr 16, 2015

@imotov I left a comment regarding BWC - I think we should not think about backporting here at all

/**
* Represents difference between states of cluster state parts
*/
public interface Diff<T> {

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 16, 2015

Contributor

can we somehow make sure that reading and writing code is on the same Interface ie. I think the writeTo and readFrom here on this inteface?

This comment has been minimized.

Copy link
@imotov

imotov Apr 16, 2015

Author Member

The reading code is in constructor. I cannot put it on the interface.

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 24, 2015

Contributor

yeah I remember... ;)

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Apr 23, 2015

I took another look at the two added commits and it looks good to me... I know @bleskes had some concerns about the diff logic somewhere but I don't recall what it was. @bleskes can you take anothter look at it?

return new SimpleCompleteDiff<T>(reader, in);
}

private static class SimpleCompleteDiff<T extends Diffable<T>> implements Diff<T> {

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 24, 2015

Contributor

can we name this CompleteDiff don't use simple please :)


import java.io.IOException;

public interface ImmutableReader<T> {

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 24, 2015

Contributor

can we call this StreamableReader? or StreamReader

ClusterState newNodeSpecificClusterState = null;
synchronized (this) {
// we do the marshaling intentionally, to check it works well...
if (discovery.lastProcessedClusterState != null && clusterChangedEvent.previousState().nodes().nodeExists(discovery.localNode.id())) {

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 24, 2015

Contributor

can you comment what this does? It's hard for me to understand it really

@@ -83,73 +90,43 @@ public void close() {
transportService.removeHandler(ACTION_NAME);

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 24, 2015

Contributor

this class seems to be pretty important do you think we can add a unittest for this that it really does what it should do. We can simply override the methods with assertions that do networking?

This comment has been minimized.

Copy link
@imotov

imotov Apr 24, 2015

Author Member

Isn't it what ClusterStateDiffPublishingTests is doing?

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 24, 2015

Contributor

yeah pretty much - sorry 3k lines added is easy to miss tests


Entry entry = (Entry) o;

if (!name.equals(entry.name)) return false;

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 24, 2015

Contributor

can we get {} around the single line if statments?

This comment has been minimized.

Copy link
@imotov

imotov Apr 24, 2015

Author Member

It's Intellij-generated equals. I don't think we are doing it anywhere else.

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 24, 2015

Contributor

fair enough...

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Apr 24, 2015

I did another review with a fresh brain after good sleep and I think this LGTM except of the couple of comments I left. @imotov what do you think if we add a method to our test framework that after a test fetches the current clusterstate from all the nodes and if their IDs match we compare them and then must be identical? I think this is a pretty fair check and it gives us a much better coverage?

@imotov

This comment has been minimized.

Copy link
Member Author

commented Apr 24, 2015

@s1monw I pushed another revision. I tried adding the check at the and of the test but ran into an issue with some tests. The problem is that not all nodes arrived to the end of the test with the same state, there are typically some nodes that are lagging behind. So, I would need to bring them up to speed somehow in a reliable fashion. Any ideas what would be the best way to do it?

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Apr 24, 2015

Any ideas what would be the best way to do it?

IMO we don't test that all nodes are on the same state, I'd just get all the states that are the same as the master and compare them. If some nodes are lacking that's fine we can just omit them most of the cases should give us something to compare though.

@imotov

This comment has been minimized.

Copy link
Member Author

commented Apr 24, 2015

@s1monw makes sense. I pushed 4eec468

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Apr 24, 2015

@imotov can we compare the bytes instead of the length? using Arrays.equals(master, current)

@imotov

This comment has been minimized.

Copy link
Member Author

commented Apr 24, 2015

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Apr 24, 2015

@imotov I think we should try to do this... I wonder how much effort it really is here. Maybe toXContent helps and we can play some tricks here and deserialize and sort or whatever... This would really be good to have though.

@imotov

This comment has been minimized.

Copy link
Member Author

commented Apr 24, 2015

@s1monw yes, I probably can add an additional check that will use toXContent, parse it into a map, remove local node reference and then compare maps. I will give it a short.

@imotov

This comment has been minimized.

Copy link
Member Author

commented Apr 25, 2015

@s1monw I have added the check that generated JSON is same for all nodes. What do you think?

* Verifies that all nodes that have the same version of the cluster state as master have same cluster state
*/
void ensureClusterStateConsistency() throws IOException {
if (cluster() != null && isInconsistentClusterStateTest(getTestClass()) == false) {

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 25, 2015

Contributor

instead of having an annotation can we just have a method that we can override? I would like to not make this something people should use.

if (size > originalList.size()) {
throw new IllegalArgumentException("Can\'t pick " + size + " random objects from a list of " + originalList.size() + " objects");
}
List<T> list = newArrayList();

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 25, 2015

Contributor

can we just shufffle the list and then return sublist(0, size)?

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Apr 25, 2015

left some minor comments. LGTM otherwise. Feel free to push once you fixed them! Good stuff, another big step.

@@ -54,6 +55,7 @@
/**
*
*/
@InconsistentClusterStateTest(reason = "testBrokenMapping creates unserializable cluster state. It should be fixed in https://github.com/elastic/elasticsearch/pull/8802")

This comment has been minimized.

Copy link
@s1monw

s1monw Apr 25, 2015

Contributor

hmm maybe we should just AwaitsFix the test?

@imotov imotov force-pushed the imotov:manual-diffs branch from ec10fe4 to 8f218e5 Apr 27, 2015

Add support for cluster state diffs
Adds support for calculating and sending diffs instead of full cluster state of the most frequently changing elements - cluster state, meta data and routing table.

Closes #6295

@imotov imotov force-pushed the imotov:manual-diffs branch from 8f218e5 to d746e14 Apr 27, 2015

@imotov imotov merged commit d746e14 into elastic:master Apr 27, 2015

1 check passed

CLA Commit author is a member of Elasticsearch
Details

@clintongormley clintongormley changed the title Add support to cluster state diffs Add support for cluster state diffs Jun 6, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.