
WatchCreateRequest should take a temporary requestId #7036

Closed
Chumper opened this issue Dec 19, 2016 · 26 comments

@Chumper

Chumper commented Dec 19, 2016

Hey guys, I am currently implementing a Scala library for etcd3 that uses the gRPC protocol, following the API reference.

I am currently implementing the Watch contract and have a few questions and suggestions.

Am I correct in assuming that all watches and notifications I want to handle can be done through a single RPC stream call (because I have a bidirectional stream)?

If not, am I supposed to open a new connection for each watch?

If yes and I can handle multiple watches over a single stream, am I correct that the creation of watches needs to happen one after another?
There is no way for me to pair a response with a request: I get a watchId in the response, but I cannot pair it with a request unless only one request is pending at a time, which makes no sense to me.

If I send two requests for watches, there is no way for me to match either response to its request.

I would like to have a temporary watchId just for the create request so I can associate a request with a response.

Like this:

message WatchCreateRequest {
  // key is the key to register for watching.
  bytes key = 1;

  // used to associate the request with a given response after creating a watch
  string tmp_watch_id = 123;
  ...
}

and

message WatchResponse {
  ResponseHeader header = 1;
  // watch_id is the ID of the watcher that corresponds to the response.
  int64 watch_id = 2;
  // created is set to true if the response is for a create watch request.
  // The client should record the watch_id and expect to receive events for
  // the created watcher from the same stream.
  // All events sent to the created watcher will attach with the same watch_id.
  bool created = 3;
  // set when created is true; contains the client-provided tmp_watch_id so the
  // client can associate the response with a request. Note that tmp_watch_id is
  // not stored; use watch_id from then on.
  string tmp_watch_id = 123;
  ...
}

This would make much more sense to me, since clients could then send create requests in parallel.
Or am I completely wrong here?
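
To illustrate the proposal, here is a minimal Python sketch of how a client could pair create responses with pending requests using such a hypothetical tmp_watch_id. The field names and dict-shaped messages are placeholders standing in for the proposed proto messages, not the real etcd API:

```python
import itertools

class WatchClient:
    """Sketch: pairing create responses via a hypothetical tmp_watch_id field."""

    def __init__(self):
        self._tmp_ids = itertools.count(1)
        self._pending = {}   # tmp_watch_id -> callback awaiting creation
        self._watchers = {}  # server-assigned watch_id -> callback

    def create_watch(self, key, callback):
        tmp_id = str(next(self._tmp_ids))
        self._pending[tmp_id] = callback
        return {"key": key, "tmp_watch_id": tmp_id}  # request to send on the stream

    def on_response(self, resp):
        if resp.get("created"):
            # Pair by the echoed tmp_watch_id, then index by the server's watch_id.
            callback = self._pending.pop(resp["tmp_watch_id"])
            self._watchers[resp["watch_id"]] = callback
        else:
            self._watchers[resp["watch_id"]](resp)

client = WatchClient()
events = []
req = client.create_watch(b"foo", events.append)
# Even with several creates in flight, responses would pair unambiguously:
client.on_response({"created": True, "tmp_watch_id": req["tmp_watch_id"], "watch_id": 7})
client.on_response({"watch_id": 7, "events": ["put foo"]})
```

With an echoed temporary id, the client never has to serialize creation; any number of creates can be outstanding at once.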

@xiang90
Contributor

xiang90 commented Dec 19, 2016

Am I correct in assuming that all watches and notifications I want to handle can be done through a single RPC stream call (because I have a bidirectional stream)?

Sort of. But in the Go client we give users the ability to choose. They can create multiple watchers; each watcher establishes a stream with the etcd server. They can watch multiple keys using one watcher.

then am I correct that the creation of watches needs to happen one after another?

If it is easier for you to handle the creation, then yes.

There is no way for me to pair a response with a request: I get a watchId in the response, but I cannot pair it with a request unless only one request is pending at a time, which makes no sense to me.

This is not 100% true. The response also has the key and other metadata attached, which you can use to pair. You can disallow concurrent creation of duplicate watches on the same key; in fact, if the requests are identical, it usually does not matter which response pairs with which.

@Chumper
Author

Chumper commented Dec 19, 2016

Thanks for the explanation, and you are right: for the same key it does not matter.
However, I am still having trouble wrapping my head around my use case:

I let the developer watch a given key (or prefix, but let's stick to a single key) by passing me a function to call when a change happens.

Now there may be two threads that want to watch different keys with different function bodies.
If I issue both WatchCreateRequests at the same time, I expect two WatchResponses back, because etcd will acknowledge each create request by setting the created flag on both responses without delivering any events.

Is that right?

If that is the case, then how can I determine which response belongs to which request?
As far as I can see (and have tested), the WatchResponse does not return the key being watched.

So I don't know how to pair them.

Is the order of the responses guaranteed to be preserved if I send two requests? If so, then I can work with that.

@xiang90
Contributor

xiang90 commented Dec 19, 2016

If that is the case, then how can I determine which response belongs to which request?

Right now you cannot. So do not allow concurrent watch creation. It is usually not a problem, at least in our experience with the Go implementation.

If there is solid reasoning to support it, we can add additional metadata to the create response to help client libraries pair them up.

@Chumper
Author

Chumper commented Dec 19, 2016

I don't know if my use case is a solid enough reason for you to implement additional metadata; I am just looking into how I can offer the best experience to my library's users.

As Scala heavily uses Futures and Promises, it would be great to offer concurrent watch creation as well; otherwise I am forced to block on the creation of each watch (even if only for a very short time).

That's why I would love to either have the key returned in the create response (which would solve all my problems with concurrent creation) or have another way to pair up request and response (e.g. with a tmp_watch_id).

I can work around this issue, but I would love to see it fixed.

Edit: So far everything about the RPC API design makes perfect sense; this is the only thing I have stumbled across.

@xiang90
Contributor

xiang90 commented Dec 19, 2016

@heyitsanthony opinions?

@Chumper
Author

Chumper commented Dec 19, 2016

While we are currently on this topic:

You created separate request objects for creating and deleting, but for the response you chose a single message with boolean flags for created and deleted.

oneof request_union {
  WatchCreateRequest create_request = 1;
  WatchCancelRequest cancel_request = 2;
}

Wouldn't it make sense to also split that into three submessages?

oneof response_union {
  WatchCreateResponse create_response = 1;
  WatchCancelResponse cancel_response = 2;
  WatchEventResponse event_response = 3;
}

This would make the separation clearer and make it easier for you to add additional metadata to the responses.

@Chumper
Author

Chumper commented Dec 21, 2016

Any opinions?

I think that if you offer the client the ability to handle multiple watches over one stream, then you cannot force the client to create watches synchronously.

Otherwise do it like with leases: one request grants a watcher, one cancels a watcher, and a stream is used to tell the client about watch updates.

@xiang90
Contributor

xiang90 commented Dec 21, 2016

@Chumper Anthony is on vacation (I will be on vacation today). Please expect some delay on this discussion. Sorry!

@heyitsanthony
Contributor

As Scala heavily uses Futures and Promises, it would be great to offer concurrent watch creation as well; otherwise I am forced to block on the creation of each watch (even if only for a very short time).

I don't understand this blocking argument. Anything providing a watcher will still need a create response to know whether the watch could be created and the request/response are still ultimately serialized on the tcp stream. The protocol doesn't preclude returning a future that's fulfilled once there's a create response; etcd's clientv3 library handles it by maintaining a queue of create requests, dequeuing and sending create requests once the pending response arrives.
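
The clientv3 queueing approach described here can be sketched as follows. This is illustrative Python, not etcd's actual client code; `send` stands in for writing a WatchCreateRequest to the stream:

```python
import threading
from collections import deque

class SerializedWatchCreator:
    """Only one create request is in flight at a time, so the next create
    response always pairs with the head of the queue (FIFO)."""

    def __init__(self, send):
        self._send = send      # sends one create request for a key
        self._queue = deque()  # (key, on_created) pairs awaiting a response
        self._lock = threading.Lock()

    def create(self, key, on_created):
        with self._lock:
            self._queue.append((key, on_created))
            if len(self._queue) == 1:  # nothing in flight: send immediately
                self._send(key)

    def on_create_response(self, watch_id):
        with self._lock:
            _, on_created = self._queue.popleft()  # head was the in-flight request
            if self._queue:
                self._send(self._queue[0][0])      # dequeue and send the next create
        on_created(watch_id)

sent, created = [], []
c = SerializedWatchCreator(sent.append)
c.create(b"foo", created.append)
c.create(b"bar", created.append)   # queued; "foo" is still pending
c.on_create_response(1)            # pairs with "foo", then sends "bar"
```

The caller can get a future/promise back immediately; only the wire traffic is serialized, which is exactly why registering n watches costs n round trips.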

The bigger problem is that registering n watches will necessarily cost n*RTT. This is a problem if there are many watches registered on a single stream, but it's unclear whether this is a common case.

Wouldn't it make sense to also split that into three submessages?

Sure, but this will break backwards compatibility. Not an option at this point.

Concurrent watch creation can probably be supported by adding a watch id field to the create request similar to the tmp_watch_id thing. If the given watch_id is 0 then etcd uses the old serializing behavior. Otherwise, etcd returns the usual create response if the watcher can be created / return a canceled response if it fails. This should buy full client control over the response tagging (instead of having to ask etcd for a watchid/tag) without introducing a lot of extra protocol details.
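
For reference, this design is roughly what later landed in etcd's rpc.proto: a client-supplied watch_id on WatchCreateRequest itself, with zero meaning "let the server assign one". The sketch below paraphrases that field; check the current rpc.proto for the authoritative field number and comment:

```protobuf
message WatchCreateRequest {
  // key is the key to register for watching.
  bytes key = 1;
  ...
  // If watch_id is provided and non-zero, it will be assigned to this watcher.
  // Since creating a watcher is not synchronous, a client-chosen id lets the
  // client tag multiple concurrent creates on the same stream; a zero
  // watch_id keeps the old serializing behavior.
  int64 watch_id = 7;
}
```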

@heyitsanthony heyitsanthony added this to the v3.2.0-maybe milestone Dec 29, 2016
@Chumper
Author

Chumper commented Dec 29, 2016

The bigger problem is that registering n watches will necessarily cost n*RTT. This is a problem if there are many watches registered on a single stream, but it's unclear whether this is a common case.

That is exactly the point of my question: in my library I would also need to keep a queue of requests and block behind the curtain, i.e. create one watch after another.
To the end developer it still looks like they can create multiple watches, but in fact the client blocks on each creation.

I currently create only one watch, so I don't have a pressing use case for multiple creations.
But if I imagine an application that needs to create, say, 10 watches on startup, the startup time increases.

Sure, but this will break backwards compatibility. Not an option at this point.

Yes, this was just a side question; I understand your point here, even though you could add submessages while staying backwards compatible.

This should buy full client control over the response tagging (instead of having to ask etcd for a watchid/tag) without introducing a lot of extra protocol details.

That's why I suggested a create_id.
The id etcd returns is fine, as it is unique for the etcd instance.
Putting the client in full control of the watch_id is a bad thing, because then the client has to come up with a reasonably good (kind of) UUID for watches.

Instead I would just let the client control the creation-flow id and leave the rest to etcd, which minimizes potential id conflicts since the client only creates them for the request.

You don't even have to store it anywhere: just add a create_id (or whatever) to the request and response and let etcd return the create_id if present. Then the client can match request and response, you wouldn't have to change any logic, and the client would be able to request multiple watches at the same time.

@heyitsanthony
Contributor

heyitsanthony commented Dec 29, 2016

Putting the client in full control of the watch_id is a bad thing, because then the client has to come up with a reasonably good (kind of) UUID for watches.

The watch id is 64-bit. Keep a counter per stream and increment it for every create request; it'll never run out of unique ids.
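
A sketch of that per-stream counter (illustrative; the point is that watch ids only need to be unique within one stream, so two streams may hand out the same ids):

```python
import itertools

class WatchStream:
    """One gRPC Watch stream; watch ids are chosen client-side from a counter."""

    def __init__(self):
        # 64-bit id space: a per-stream counter never runs out in practice.
        self._ids = itertools.count(1)

    def next_watch_id(self):
        return next(self._ids)

a, b = WatchStream(), WatchStream()
# Each stream hands out 1, 2, 3, ... independently; reuse across streams
# is fine because the server scopes watch ids per stream.
ids_a = [a.next_watch_id(), a.next_watch_id()]
ids_b = [b.next_watch_id()]
```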

I don't see the advantage of having a separate create id instead of directly asking for a specific watch id. The client would still be responsible for choosing unique ids to avoid collisions on create_ids, and it lacks the flexibility of being able to choose a watch id (e.g., a client keeps an array and wants to map the indexes to watch ids).

add a create_id or whatever to the request and response and let etcd return the create_id if available

I feel that if the client is going through the trouble of sending something to the server, it'd be nicer to have etcd do more than just relaying the data back.

@Chumper
Author

Chumper commented Dec 29, 2016

The watch id is 64-bit. Keep a counter per stream and increment it for every create request; it'll never run out of unique ids.

So if two different streams choose the same watch_id, will etcd create two different watchers?
I thought the watch_id had to be unique across all watches, but maybe I am wrong here?
Or how do I set the starting id for the watch_ids?

I don't see the advantage of having a separate create id instead of directly asking for a specific watch id. The client would still be responsible for choosing unique ids to avoid collisions on create_ids, and it lacks the flexibility of being able to choose a watch id (e.g., a client keeps an array and wants to map the indexes to watch ids).

Yes, but it would only be responsible for choosing an id for its own creation flow. So the potential conflict zone is limited to the creation flows the client performs on the same stream; there the client can just increment a counter for the temporary ids to reliably match requests and responses.

Maybe I have a wrong assumption here: I am assuming the watch_id MUST be unique across all watchers of the etcd cluster. If that is the case, then letting the client choose the id raises the conflict potential.

If that is not the case and watch_ids only need to be unique per stream, then sure, we can go ahead and just add a new field to the create request.

I feel that if the client is going through the trouble of sending something to the server, it'd be nicer to have etcd do more than just relaying the data back.

Fair enough.

@heyitsanthony
Contributor

So if two different streams choose the same watch_id, will etcd create two different watchers?

Yes. The watch id is a per-stream resource. Two separate streams can watch on different ranges with the same watch id.

@Chumper
Author

Chumper commented Dec 29, 2016

Ok, sorry for the misunderstanding. Then I fully agree with you that adding a watch_id to the request will also solve the problem, as long as the server uses the given id.

@xiang90
Contributor

xiang90 commented Jan 27, 2017

@Chumper Good to see you and @heyitsanthony agree on the same solution. Are you interested in working on this feature?

@Chumper
Author

Chumper commented Jan 31, 2017

I would, but unfortunately I haven't programmed in Go before. I searched for the file, but because I am not familiar with Go and the code structure, I did not find it. So I would love to see it implemented, but I think I am unable to do so myself.

@xiang90
Contributor

xiang90 commented Jan 31, 2017

@Chumper Understood. No problem. Thank you!

@xiang90 xiang90 modified the milestones: v3.3.0, v3.2.0-maybe Mar 31, 2017
@akauppi
Contributor

akauppi commented Jun 6, 2017

@Chumper Thanks for bringing this issue up. I'm also implementing a v3 watch client in Scala, so maybe we can get in touch and track this together. Will watch :) the issue.

@connor4312
Contributor

@xiang90 I'd be more than happy to take this on if no one else is already doing so. @styleex just pointed out that this is potentially buggy behavior in my JS/TS etcd client (I assumed ordering). For now I'll likely implement a queue as Chumper mentioned earlier, but I'd like to fix this at the source 😄

@xiang90
Contributor

xiang90 commented Aug 3, 2017

@connor4312 It would be great if you could work on it. Thank you!

@xiang90
Contributor

xiang90 commented Oct 5, 2017

moving this to 3.4.

@connor4312 it would be great if you still have interest in this one.

@xiang90 xiang90 modified the milestones: v3.3.0, v3.4.0 Oct 5, 2017
@connor4312
Contributor

I'll take a stab at it this weekend

@connor4312
Contributor

I've implemented this in #8662

@xiang90
Contributor

xiang90 commented Jan 5, 2018

@connor4312 @Chumper this is fixed.

@connor4312
Contributor

Sweet!

@Chumper
Author

Chumper commented Jan 6, 2018

Perfect, thanks to @connor4312 for the implementation and @xiang90 @heyitsanthony for the feedback.
