Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a strict ordering option #55

Merged
merged 6 commits into from
Apr 11, 2020
Merged

Conversation

johningve
Copy link
Member

This pull request adds a new option: gorums.strict_ordering, that uses a single gRPC stream per node to send all "strict ordering" messages. The strict ordering is achieved due to the fact that gRPC streams preserve the order in which messages are sent on the stream.

The code is mostly based on my previous pull request (#52), but modified such that there is only one gRPC stream per Node instead of one gRPC stream per "strict ordering" method. The reason for this is to avoid the requirement that each "strict ordering" message must specify an "ID" field. Another benefit is that calls across different strict ordering methods is preserved. Let's consider the example below:

service GorumsTest {
	rpc QC(Request) returns (Response) {
		option (gorums.qc) = true;
		option (gorums.strict_ordering) = true;
	}

	rpc UnaryRPC(Request) returns (Response) {
		option (gorums.strict_ordering) = true;
	}
}

message Request {
	uint64 Num = 1;
}

message Response {
	bool InOrder = 1;
}

In order to use strict ordering, the gorums.strict_ordering is set in combination with other gorums options to determine the call type of the method. The QC rpc is a quorum call since it uses the gorums.qc option. The UnaryRPC rpc is really just a regular rpc, but it is also sent over the same gRPC stream as QC, since it also specifies the gorums.strict_ordering option. Note that a "message ID" field is not required unlike the previous version (PR #52), and the rpc do not specify its messages as streams.

Again, its important to note that the order of calls to both QC and UnaryRPC in sequence is preserved.

API

The client API for Quorum Calls should be identical to the regular Quorum Calls. The client API for Unary RPCs is similar to the gRPC way, with the method being exposed through the Nodes. The server API is quite different, as the below example shows:

type testSrv struct {}

// QC is the server-side handler for the 'QC' Quorum Call.
func (s *testSrv) QC(req *Request) *Response {
  ...
}

// UnaryRPC is the server-side handler for the 'UnaryRPC' RPC.
func (s *testSrv) UnaryRPC(req *Request) *Response {
  ...
}

func main() {
  tSrv := &testSrv{}

  srv := NewGorumsServer()

  // each handler is registered with the server
  srv.RegisterQCHandler(tSrv)
  srv.RegisterUnaryRPCHandler(tSrv)
  
  lis, _ := net.Listen(...)

  go srv.Serve(lis)

  ...

  srv.Stop()
}

Implementation details

Under the hood, the strict ordering methods marshal messages into a google.protobuf.Any format and sends it to the Nodes through each Node's sendQ channel. The Node takes these messages in sequence and sends them on the stream. Messages are sent back through a recvQ channel stored in a map for each message ID. The message ID's are part of the gorums message types sent on the stream. The definition of the gorums messages and the "strict ordering stream" is shown below:

service Gorums {
  rpc StrictOrdering(stream Message) returns (stream Message);
}

message Message {
  uint64 ID = 1;
  string URL = 2;
  google.protobuf.Any Data = 3;
}

Some state is stored both in the Manager and Nodes. This state includes the recvQ channels map, the sendQ channels, a counter for the message ID, and some mutexes. This is all quite similar to pr #52, except that the state is shared for all strict ordering methods.

In order to correctly generate these new "calltypes" that rely on a combination of gorums.strict_ordering + other options, I created some internal only options that the plugin uses to detect the correct strict ordering type. These are located in internal/strictordering and are only meant to be used internally by the protoc-gen-gorums program when it decides which templates to use. This is a bit of a hack, but I think that the simplicity of being able to specify gorums.strict_ordering + other options is nicer than creating new options for each "strict ordering" variant.

Other notes

  • Because the StrictOrdering rpc is the only "real" gRPC call that is made when using strict ordering methods, it is not necessary to compile a gRPC file for a service that only uses strict ordering methods.

  • Reconnection has been added to strict ordering methods with this PR, and is configurable through the WithBackoff manager option that takes a gRPC BackoffConfig. This option also sets the backoffconfig for the regular gRPC connections. While reconnection is in progress, all strict ordering rpcs will fail with an error.

This adds a new option 'gorums.strict_ordering' that works
with some call types (currently, quorum calls and unary RPCs).

When the option is used, a separate template is used for the
calltype. The messages are sent to each node over a shared (per node)
stream.
@johningve
Copy link
Member Author

johningve commented Mar 27, 2020

I added some tests to cmd/protoc-gen-gorums/tests/strictordering. Is that the correct place to put them? Also, for some reason VSCode is showing a lot of "undeclared name" errors in the test file, but the tests run just fine. Probably a bug with vscode-go.

@meling
Copy link
Member

meling commented Mar 27, 2020

Yes, this a known bug in gopls, which VSCode use.

Edit: There is a workaround described in the issue above.

About the placement of the tests. My current plan is to keep things that only refer to testing of the plugin and generated code inside cmd/protoc-gen-gorums/tests, while other tests that may involve some form of integration and running of servers and clients, I'm thinking that could be moved to a top level tests folder or examples folder. But I've not written any myself, so I'm still thinking about the best way to make this follow standard Go practices and so forth.


type strictOrderingManager struct {
msgID uint64
recvQ map[uint64]chan *strictOrderingResult
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider if perhaps sync.Map should be used instead, because:
The Map type is optimized for two common use cases: (1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys. In these two cases, use of a Map may significantly reduce lock contention compared to a Go map paired with a separate Mutex or RWMutex.

It would be interesting to do a performance comparison, given the comment from the documentation, which seems to match our use case, since we only ever write to the same key once, but read it many times.

gorums.proto Outdated Show resolved Hide resolved
gorums.proto Outdated Show resolved Hide resolved
// return error if stream is broken
if s.streamBroken {
err := status.Errorf(codes.Unavailable, "stream is down")
s.recvQMut.RLock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this err be recorded?

	s.node.setLastErr(err)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's necessary to record this error since it doesn't add any information about why the node stream was closed. Maybe we could return node.LastErr() instead of this generic error to give specific information about why the node errored? If node.SetLastErr() is called by reconnectStream, then it will probably contain the error that caused the stream to fail. Of course there is a chance that some unrelated error will have taken its place.

gorums.proto Outdated Show resolved Hide resolved
gorums.proto Outdated Show resolved Hide resolved
},
strictordering.E_StrictOrderingRpc: func(m *protogen.Method) bool {
return hasMethodOption(m, gorums.E_StrictOrdering) && !hasMethodOption(m, gorumsCallTypes...)
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a neat way to organize the various checkers that we need; to that end, I'm wondering if we should add more checkers here or in a separate map, e.g.:

	gorums.E_Multicast: func(m *protogen.Method) bool {
		return hasMethodOption(m, gorums.E_Multicast) && m.Desc.IsStreamingClient()
	},
... and so on ...

If we do something like this, maybe we can probably avoid some of the checks in the switch cases in validateMethodExtensions...

Any thoughts?

@johningve johningve force-pushed the strictordering_v2 branch 3 times, most recently from 01542ec to 25b792e Compare April 6, 2020 17:03
gorums.proto Outdated Show resolved Hide resolved
* Compiled proto files with the new protobuf
* Moved StrictOrdering gRPC code into dev
* Renamed and refactored structs
* Added helper methods to receiveQueue
* Made some tweaks to bundle
And changed numStrictOrderingMethods to hasStrictOrderingMethods
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants