399 - Speculative execution #323
Conversation
I had a few questions/nits. Let's address those, but overall it looks great.
cass_future_wait(close_future);
cass_future_free(close_future);
cass_cluster_free(cluster);
cass_session_free(session);
Should the session be freed before the cluster that presumably owns it?
It doesn't matter. Session copies what it needs from cluster. In general, the driver API uses the "const *" pattern for a parameter when it's going to either copy the object or keep a const reference to it.
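To illustrate, here's a minimal sketch using the public C API (the contact point is a placeholder and error handling is omitted): because cass_session_connect() takes a const CassCluster*, the session copies the settings it needs at connect time, so the cluster can be freed before or after the session.

#include <cassandra.h>

int main() {
  CassCluster* cluster = cass_cluster_new();
  CassSession* session = cass_session_new();
  cass_cluster_set_contact_points(cluster, "127.0.0.1");

  CassFuture* connect_future = cass_session_connect(session, cluster);
  cass_future_wait(connect_future);
  cass_future_free(connect_future);

  /* Either order is safe here: the session holds no reference to the cluster. */
  cass_cluster_free(cluster);
  cass_session_free(session);
  return 0;
}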
Cool.
CassError cass_cluster_set_no_speculative_execution_policy(CassCluster* cluster) {
  cluster->config().set_speculative_execution_policy(
      new cass::NoSpeculativeExecutionPolicy());
Would it make sense to have a singleton NoSpeculativeExecutionPolicy that can be set into any cluster object?
In general, the driver avoids allocating memory during static initialization (actually, we try to avoid any sort of complex static initialization). The main reason is that it prevents us from using a custom allocator to make those allocations when the default allocator is overridden. We've had several users ask for this feature: https://datastax-oss.atlassian.net/browse/CPP-360
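A minimal sketch of the ordering problem, with hypothetical hook names rather than the driver's actual allocator API: a static singleton is allocated during static initialization, before main() has a chance to install the custom allocator.

#include <cstddef>
#include <cstdlib>

typedef void* (*MallocFn)(std::size_t);
static MallocFn g_malloc = std::malloc;  // default until overridden
void set_allocator(MallocFn fn) { g_malloc = fn; }

struct NoSpeculativeExecutionPolicy {};

// Runs before main(): this allocation happens before any application code
// can install custom hooks, so it can never be routed through them.
static NoSpeculativeExecutionPolicy* g_singleton =
    new NoSpeculativeExecutionPolicy();

void* my_malloc(std::size_t size) { return std::malloc(size); }  // stand-in

int main() {
  set_allocator(my_malloc);  // too late for g_singleton
  (void)g_singleton;
  return 0;
}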
A potentially better optimization is to keep returning the same instance of NoSpeculativeExecutionPlan (vs. the policy) since a plan is allocated for every request. I've attempted to do this (using a custom operator new and operator delete) without luck because multiple threads attempt to write the vtable simultaneously. In the end, that tiny allocation doesn't add very much overhead because it is likely cached in the allocator anyway.
Agreed, not a huge deal here. That said, you could create a NoSpeculativeExecutionPlan data member in NoSpeculativeExecutionPolicy's constructor and return that one plan object anytime a user asks for a plan from this policy object. So, you won't have one NoSpeculativeExecutionPlan object in the whole application, but rather O(# NoSpeculativeExecutionPolicy objects), which is still better than O(# active requests). In fact, most of the time you'll only have one cluster object, so # of NoSpeculativeExecutionPolicy is often 1.
That can't be done because the calling code is responsible for deleting the plan and would incorrectly delete the shared instance. We could use a ref-counted object for plans or figure out some custom allocation solution as I attempted before, but I think it's overkill.
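A toy sketch of why that breaks down, with hypothetical simplified interfaces rather than the driver's actual classes: the caller owns and deletes the plan it receives, so handing out a shared member invites a bad delete.

struct SpeculativeExecutionPlan {
  virtual ~SpeculativeExecutionPlan() {}
};

struct NoSpeculativeExecutionPlan : public SpeculativeExecutionPlan {};

struct NoSpeculativeExecutionPolicy {
  NoSpeculativeExecutionPlan plan_;  // the suggested shared member

  // Today's contract: each plan is heap-allocated and the caller deletes it.
  SpeculativeExecutionPlan* new_plan() {
    return new NoSpeculativeExecutionPlan();
  }

  // The suggestion: hand out the member instead.
  SpeculativeExecutionPlan* shared_plan() { return &plan_; }
};

int main() {
  NoSpeculativeExecutionPolicy policy;

  SpeculativeExecutionPlan* plan = policy.new_plan();
  delete plan;  // fine: the caller owns this heap allocation

  SpeculativeExecutionPlan* shared = policy.shared_plan();
  // delete shared;  // undefined behavior: deleting a non-heap member,
                     // and every request would attempt it
  (void)shared;
  return 0;
}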
Agreed; not worth the trouble if it's even remotely non-trivial.
bool IOWorker::execute(const RequestHandler::Ptr& request_handler) {
  request_handler->inc_ref(); // Queue reference
  if (!request_queue_.enqueue(request_handler.get())) {
    request_handler->dec_ref();
I notice several object types in the driver are ref-counted, which makes sense so that objects can be shared indiscriminately and only cleaned up when the last ref is lost. However, imo having the "user level code" be responsible for incrementing and decrementing the ref-count as objects are shared leaves room for developer error. Have you considered passing shared-refs by value and having the increment/decrement logic in the constructor/destructor? I understand there's a cost to that -- constructing and destroying intermediate copies -- but I think implementing it that way also makes you resilient to uncaught exceptions. For example, in the above code, if enqueue could potentially throw an exception, the request-handler ref-count will not be decreased.
Actually, a pass-by-ref when calling methods may be fine most of the time, but store the shared-ptr object by value so that the ref-count increases. So in the above case, enqueue could take a ref, but in its impl it adds a copy to the queue. Something like that.
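A toy sketch of that idea (not the driver's actual types): the smart pointer's constructor/destructor manage the ref-count, enqueue() takes a const reference, and the queue stores a copy by value, so a reference can't leak even if a call unwinds early.

#include <cstdio>
#include <deque>

struct RefCounted {
  int refs;
  RefCounted() : refs(0) {}
  void inc_ref() { ++refs; }
  void dec_ref() { if (--refs == 0) delete this; }
};

struct Ptr {
  RefCounted* p;
  explicit Ptr(RefCounted* ptr = 0) : p(ptr) { if (p) p->inc_ref(); }
  Ptr(const Ptr& o) : p(o.p) { if (p) p->inc_ref(); }
  Ptr& operator=(const Ptr& o) {
    if (o.p) o.p->inc_ref();  // increment first to survive self-assignment
    if (p) p->dec_ref();
    p = o.p;
    return *this;
  }
  ~Ptr() { if (p) p->dec_ref(); }
};

struct Queue {
  std::deque<Ptr> items;
  bool enqueue(const Ptr& handler) {
    items.push_back(handler);  // the stored copy holds its own reference
    return true;
  }
};

int main() {
  Queue queue;
  {
    Ptr handler(new RefCounted());  // refs == 1
    queue.enqueue(handler);         // refs == 2: the queue's copy
  }                                 // local copy destroyed: refs == 1
  std::printf("queued refs: %d\n", queue.items.front().p->refs);
  return 0;  // queue destroyed: refs == 0, object deleted
}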
I do agree with the error-proneness of manually inc/dec'ing ref-counted objects. I think we try to avoid this in most parts of the driver and I've attempted to make improvements in this release. If there are specific cases where you think it can be fixed, let me know. There are a few cases where this can't easily be avoided:
- Returning a reference to the external API. This is because the fact that it's a ref-counted type is lost.
- In async code where the type is erased or it's added to a data structure that doesn't manage lifetime (e.g. an intrusively linked list). There's likely some improvement that could be made here in some places in Connection.
- I've attempted to use MPMCQueue<RequestHandler::Ptr> instead of MPMCQueue<RequestHandler*>. The problem is the queue holds on to request memory until another request overwrites it or until the queue is destroyed (see the sketch below). One solution is to proactively set the item to an empty object, T(), when dequeuing it; however, I am almost certain that would cause data races. The problem with keeping the memory alive is that the size of a request is determined by the integrating application and could be quite large, and there's an expectation that when a request is done its memory will be reclaimed.

The driver does not use exceptions and limits the use of std library methods that throw exceptions.
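A minimal single-threaded sketch of the retention problem described in the last bullet (a plain array stands in for the queue's internal buffer): after a dequeue, the slot's copy of the pointer still pins the request.

#include <cstdio>
#include <memory>

struct Request { /* potentially large, sized by the integrating application */ };

int main() {
  std::shared_ptr<Request> slots[4];  // stand-in for the queue's ring buffer

  std::shared_ptr<Request> req(new Request());
  slots[0] = req;  // enqueue: the slot stores a copy of the pointer

  std::shared_ptr<Request> out = slots[0];  // dequeue: copy the value out
  // In a lock-free MPMC queue we could not safely reset slots[0] here
  // without racing a producer that has already claimed the slot.

  out.reset();
  req.reset();
  std::printf("request still alive: %s\n", slots[0] ? "yes" : "no");  // "yes"
  return 0;
}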
I think I'd like to have a longer discussion about this later; this is a broader question that isn't particularly related to speculative execution, so I'm cool with what you've got here.
while (remaining != 0 && io_worker->request_queue_.dequeue(request_handler)) {
  if (request_handler != NULL) {
while (remaining != 0 && io_worker->request_queue_.dequeue(temp)) {
  RequestHandler::Ptr request_handler(temp);
Since request_handler isn't used beyond this if, you could create it inside the if, and have the if condition be if (temp) { instead.
See above for why the queue uses RequestHandler* instead of RequestHandler::Ptr.
The goal was to get the RequestHandler into a shared ptr as soon as possible to avoid errors (in future changes). Also, the SpeculativeExecution constructor takes a shared ptr.
Gotcha.
request_handler->set_io_worker(io_worker);
request_handler->retry();
request_handler->start_request(io_worker);
SpeculativeExecution::Ptr speculative_execution(new SpeculativeExecution(request_handler,
You decreased the ref-count on request_handler above, but you don't increment it before sending it to the SpeculativeExecution constructor. And the constructor doesn't increase the ref count either. Is this a bug?
It's been added to a shared pointer, which increments and tracks the reference.
Ahh, I see that now.
case CQL_OPCODE_RESULT: {
  ResultResponse* result =
      static_cast<ResultResponse*>(response->response_body().get());
  if (result->kind() == CASS_RESULT_KIND_PREPARED) {
Is there a scenario where we're simply happy with the response and don't retry on the current host or the next? I would've thought that if the result is a PREPARED, then we're actually good, since presumably a prepare request led to this PrepareCallback.
The request has been chained and we're retrying the original request. We've attempted to run a prepared query and the server responded with UNPREPARED, so we've wrapped the original request in a prepare request and are now returning to the original request.
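A toy sketch of that control flow (simulated, with made-up function names rather than the driver's actual callback wiring): an UNPREPARED error triggers a prepare request, and its PREPARED result hands control back to the original request.

#include <cstdio>

enum ResponseKind { UNPREPARED, PREPARED, ROWS };

void execute_original(bool reprepared);

// Stand-in for PrepareCallback: reacts to the prepare request's response.
void on_prepare_response(ResponseKind kind) {
  if (kind == PREPARED) {
    std::printf("re-prepared; retrying the original request\n");
    execute_original(true);
  }
}

void execute_original(bool reprepared) {
  if (!reprepared) {
    std::printf("got UNPREPARED; chaining a prepare request\n");
    on_prepare_response(PREPARED);  // simulated server reply
  } else {
    std::printf("original request succeeded\n");
  }
}

int main() {
  execute_original(false);
  return 0;
}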
Ok.
error->required(),
error->data_present() > 0,
num_retries_));
decision = request_handler_->retry_policy()->on_read_timeout(request(),
I notice that you do an idempotent check for write-timeout below, but not for read-timeout here. Shouldn't the check be here also, since if a statement is not idempotent, we don't want to retry regardless?
Or is it the retry policy's responsibility to account for the idempotency of the request? If that's the case, then the write-timeout handling below should not check for idempotency.
I'm pretty sure that Cassandra read requests are always idempotent. That is, reads don't change data so they can be retried without unintended side effects.
I believe a read-timeout error means a socket read timeout occurred between the coordinator and the node processing the statement, so it can happen for any kind of statement.
"Read_timeout: Timeout exception during a read request."
https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v3.spec#L982
It explicitly says "write request" and "read request". I'm pretty sure it doesn't have to do we a failed acknowledgement from a replica write as that would manifest as a Write_timeout
.
Makes sense.
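Summing up the thread, a toy sketch of the asymmetry (not the driver's actual code): reads are inherently idempotent, so only the write-timeout path needs an idempotence guard before consulting the retry policy.

struct Request {
  bool idempotent;
  bool is_idempotent() const { return idempotent; }
};

enum Decision { RETURN_ERROR, RETRY };

Decision on_read_timeout(const Request&) { return RETRY; }   // policy stub
Decision on_write_timeout(const Request&) { return RETRY; }  // policy stub

Decision handle_read_timeout(const Request& request) {
  // No idempotence check: a read changes no data, so retrying is safe.
  return on_read_timeout(request);
}

Decision handle_write_timeout(const Request& request) {
  // Retrying a non-idempotent write could apply its effects twice.
  if (!request.is_idempotent()) return RETURN_ERROR;
  return on_write_timeout(request);
}

int main() { return 0; }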
request_handler->encoding_cache()));
RequestHandler* temp = NULL;
while (session->request_queue_->dequeue(temp)) {
  RequestHandler::Ptr request_handler(temp);
This could move down into the if, with the if condition changed to if (temp) {.
The goal was to get the RequestHandler into a shared ptr as soon as possible to avoid errors (in future changes). Also, see above.
👍
Ok, I think you've addressed everything I wondered about. Merge at will! :)