Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GSI protocol for multi-clusters #1800

Merged
merged 58 commits into from
Sep 16, 2016

Conversation

sebastianburckhardt
Copy link
Contributor

Implements Global-single-instance registration strategy. This is the next installment of the Geo-Distribution Project #948.

@galvesribeiro
Copy link
Member

galvesribeiro commented May 31, 2016

@sebastianburckhardt great work!

Question... Since every silo has a LocalGrainDirectory and I you added a RemClusterGrainDirectory to it so, does that means that each silo will keep a copy of the whole cluster directory? Would not be better if we have global grains registered as grains inside the local directory and somehow have it marked as a global grain? Just for simplicity... I didn't understood the rationale behind the decision to keep a diff directory that will be hold by all the clusters.

Besides that question, greet work. LGTM.

Thanks! #Closed

@sebastianburckhardt
Copy link
Contributor Author

sebastianburckhardt commented Jun 1, 2016

I know it does not look that way, but there is actually only one grain directory. I agree that it is definitely cleaner that way (consider handoff etc.). RemClusterGrainDirectory is a wrapper around the LocalGrainDirectory that implements the IClusterGrainDirectory functions, which are called from remote clusters. This is the same pattern that already existed for RemGrainDirectory, which is a wrapper around LocalGrainDirectory as well, and implements the IRemoteGrainDirectory functionality. #Closed

@galvesribeiro
Copy link
Member

galvesribeiro commented Jun 1, 2016

Ok got it. Thanks! Great work, LGTM #Closed

}

var precLeft = grain.GetUniformHashCode() ^ clusterLeft.GetHashCode();
var precRight = grain.GetUniformHashCode() ^ clusterRight.GetHashCode();
Copy link
Member

@jdom jdom Jun 11, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should not use String.GetHashCode(). Create and use a uniform hashcode instead, as the value of GetHashCode might be different depending on where it is ran. #Closed

Copy link
Contributor Author

@sebastianburckhardt sebastianburckhardt Jun 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. #Closed

@sebastianburckhardt
Copy link
Contributor Author

sebastianburckhardt commented Jun 22, 2016

Unregistration of global-single-instance grains seems to not have not been thought through completely. Need to fix this. I think we can do the equivalent of what is done within a cluster (send unregistration messages, and if those should get lost for any reason, fall back on removing stale directory entries when detecting NonExistentActivation exceptions). #Closed

@sergeybykov
Copy link
Contributor

sergeybykov commented Jun 23, 2016

@sebastianburckhardt

I think we can do the equivalent of what is done within a cluster (send unregistration messages, and if those should get lost for any reason, fall back on removing stale directory entries when detecting NonExistentActivation exceptions).

I think that's alright. I can't think of a simple better solution. Let's see how it works, iterate if necessary. #Closed

@@ -124,6 +126,22 @@ public bool RemoveActivation(ActivationId act, bool force)
return Instances.Count == 0;
}

public IActivationInfo LookupAndRemoveActivation(ActivationAddress address, bool force)
Copy link
Member

@jdom jdom Jul 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method shares some (non-trivial) logic duplication with the method above (RemoveActivation) although they differ slightly in the case of force==true. Is this by design or should they be doing the same and avoiding duplication instead? #Closed

Copy link
Contributor Author

@sebastianburckhardt sebastianburckhardt Jul 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the case I implemented, I need to know if an activation was removed, and if so, what was its content.
I could not see how to do that without adding a new LookupAndRemove function on the directory.

The existing RemoveActivation does not always do a lookup, and does not return an activation. Instead, it returns a boolean indicating whether there are any activations left after the remove.

#Closed

@sergeybykov
Copy link
Contributor

sergeybykov commented Sep 7, 2016

Looks like we are down to a single remaining design issue - with Cluster ID on the client. I'll try to look if there's a reasonable way to avoid passing it through client config. #Closed

@sergeybykov
Copy link
Contributor

I submitted a PR - sebastianburckhardt#6 against @sebastianburckhardt's branch to eliminate the need to configure cluster ID on the client side.

sergeybykov and others added 2 commits September 9, 2016 10:17
Change initialization of OutsideRuntimeClient to start with a handshake client ID, and update it with cluster ID received from gateway.
…kCycle field can cause NullPointerException).
@sergeybykov
Copy link
Contributor

Shouldn't 84ee457 go as a separate PR, if it fixes a general bug?

@sebastianburckhardt
Copy link
Contributor Author

If you prefer. It's your call. (It's a very small change).

@sergeybykov
Copy link
Contributor

I think it is better to review (and merge if everything is fine) separately. You can leave it in this branch for now for testing. When you finally rebase/squash, it will disappear if it's already merge to master by then.

@sergeybykov
Copy link
Contributor

I got a dozen functional test failures. Need to investigate what's going on there.

@jdom jdom merged commit ed1b4ef into dotnet:master Sep 16, 2016
@jdom
Copy link
Member

jdom commented Sep 16, 2016

Thanks a lot @sebastianburckhardt and everyone involved in this effort. It took 3 months, but this is IN!

@sergeybykov
Copy link
Contributor

It was an epic effort. Thank you all!

@sebastianburckhardt sebastianburckhardt deleted the geo-globalsingleinstance branch September 22, 2016 22:58
@github-actions github-actions bot locked and limited conversation to collaborators Dec 13, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants