Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add AsyncProgressQueueWorker to queue and deliver all events #692

Merged
merged 10 commits into from
Oct 6, 2017

Conversation

mkrufky
Copy link
Collaborator

@mkrufky mkrufky commented Sep 5, 2017

add AsyncProgressQueueWorker: wake the main thread like AsyncProgressWorkerBase, except all events are queued and delivered

There are often cases where we want to handle events initiated by threads other than node threads. Developers are often told to look at AsyncProgressWorker , which allows us to send progress updates from external threads, but this is not necessarily enough. AsyncProgressWorker does what it needs to do, but if it is invoked many times before it wakes the main thread, only the last event will be delivered. This is great for sending "progress" style events, but sometimes we need to make sure that all events get delivered. AsyncProgressWorker doesn't do the job in those cases.

This PR refactors AsyncProgressWorkerBase to separate common elements and adds a new structure, AsyncProgressQueueWorker that behaves like AsyncProgressWorkerBase except all events are queued and delivered.

  • The first patch, AsyncProgressWorkerBase inherits AsyncBareProgressWorker is a no-op change, that simply separates AsyncBareProgressWorker from AsyncProgressWorkerBase.

  • The second patch adds the new functionality: AsyncProgressQueueWorker

  • The third patch adds a test that shows how AsyncProgressWorker can (and will) miss some events.

  • The fourth patch adds tests that show how AsyncProgressQueueWorker does not miss events.

  • The fifth patch adds adds documentation for the new AsyncProgressQueueWorker class.

  • The sixth patch ensures that any possible remaining events in the queue are drained within WorkComplete()

The API of AsyncProgressQueueWorker is exactly the same as the API of AsyncProgressWorkerBase -- it's a drop-in replacement. The only difference is that events will not be missed as a result of multiple wake requests being made before the uv_async_t wakes the main thread. For more info, see the warning at the bottom of the uv_async_t documentation:

libuv will coalesce calls to uv_async_send(), that is, not every call to it will yield an execution of the callback. For example: if uv_async_send() is called 5 times in a row before the callback is called, the callback will only be called once. If uv_async_send() is called again after the callback was called, it will be called again.

@kkoopa
Copy link
Collaborator

kkoopa commented Sep 5, 2017

Overall it looks fine to me. This particular issue has been brought up on several occasions, so it would be good to have a correct example of an implementation with a queue.

@mkrufky mkrufky force-pushed the AsyncWakeRequestor branch 6 times, most recently from 9d3191e to 243565c Compare September 5, 2017 18:26
@mkrufky
Copy link
Collaborator Author

mkrufky commented Sep 5, 2017

Thanks for the feedback! Tests are passing on appveyor and on travis / linux, but I had to work around some different OSX behavior in the tests where the destructor of the class got called before the main thread had a chance to call the first callback.... It's all good now (tested locally on my own mac) but I may have tied up Travis CI for the next few hours with older commits.

@mkrufky
Copy link
Collaborator Author

mkrufky commented Sep 8, 2017

After staring at these templates so much these past few days to make the tests pass, I've decided that AsyncProgressQueueWorker would be a much better name instead of AsyncWakeRequestorBase

I'm trying to separate the AsyncWorker from the AsyncWakeRequestorBase, if I manage to get that to work properly, the resulting class will have a uv_async_t but not have a uv_work_t and THAT class should be called AsyncWakeRequestor ...

I'll be away for the next few days so this will have to wait until I get back.

@mkrufky mkrufky force-pushed the AsyncWakeRequestor branch 2 times, most recently from 2d547e9 to cacd3d5 Compare September 17, 2017 21:26
@mkrufky
Copy link
Collaborator Author

mkrufky commented Sep 18, 2017

I'm not going to handle the separation of the AsyncWorker from the AsyncBareProgressWorker that I mentioned in my previous comment at this time. It's too big a change for this single pull request.

I did go ahead and rename the classes from AsyncWakeRequestorBase to AsyncBareProgressWorker and from AsyncWakeRequestor to AsyncProgressQueueWorker - these new classes are named more in the spirit of the original AsyncProgressWorker.

Tests have been added for both AsyncProgressWorker and AsyncProgressQueueWorker to sleep for 1 ms (instead of 100 ms) after each call to Progress.Send(). I would have preferred having no sleep for those tests, but it can cause unpredictable behavior on some older versions of node and other operating systems, so going with 1 ms minimum. If you compare results across every test, you'll notice that test/js/asyncprogressworker-1ms-test.js sometimes runs all 502 tests, often only 501, sometimes 498, and sometimes even less. This is expected behavior.
test/js/asyncprogressqueueworker-1ms-test.js always runs and passes 502 tests.

Travis CI has been flakey for some reason, but all tests should be passing correctly.

Documentation has been added to explain the differences between AsyncProgressWorker and AsyncProgressQueueWorker, and I think the tests that have been added cover the new features. Please let me know if anything else is required.

@mkrufky mkrufky changed the title wake the main thread like AsyncProgressWorkerBase, except all events are delivered add AsyncProgressQueueWorker based on AsyncProgressWorkerBase but all events are queued and delivered Sep 20, 2017
@mkrufky mkrufky changed the title add AsyncProgressQueueWorker based on AsyncProgressWorkerBase but all events are queued and delivered add AsyncProgressQueueWorker to queue and deliver all events based on AsyncProgressWorkerBase Sep 20, 2017
@mkrufky mkrufky changed the title add AsyncProgressQueueWorker to queue and deliver all events based on AsyncProgressWorkerBase add AsyncProgressQueueWorker to queue and deliver all events Sep 20, 2017
@mkrufky
Copy link
Collaborator Author

mkrufky commented Sep 23, 2017

ping @kkoopa ... I know github's notifications aren't the best so here's a bump.

@kkoopa
Copy link
Collaborator

kkoopa commented Sep 23, 2017

I'll take a look tomorrow and see if I spot anything.

@kkoopa
Copy link
Collaborator

kkoopa commented Sep 23, 2017

As far as I can tell, this looks good. I would however like to ask @bnoordhuis to please take a look regarding the locking and thread safety.

kkoopa

This comment was marked as off-topic.

@kkoopa
Copy link
Collaborator

kkoopa commented Sep 24, 2017 via email

@mkrufky
Copy link
Collaborator Author

mkrufky commented Sep 26, 2017

I just added a patch that makes sure that AsyncProgressQueueWorker::WorkProgress() gets called before AsyncWorker::WorkComplete(), which otherwise could be risky without at least 1 ms sleep, so I also added a test for 0 ms sleep.

baf5168

This might be useful for AsyncProgressWorker as well, but this PR aims to leave the behavior of AsyncProgressWorker untouched. If we decide to "fix" AsyncProgressWorker, then we can just move void WorkProgress() to the AsyncBareProgressWorker

Making sure that WorkProgress() gets called before WorkComplete() is important to AsyncProgressQueueWorker, since it's aim is to deliver every event that is generated from progress.Send()

Otherwise, we run the risk that WorkProgress() may get called after WorkComplete() while this->callback is null and then this->HandleProgressCallback() wont be called with that last event. We can sometimes see that happening if we run test/js/asyncprogressqueueworker-0ms-test.js with this WorkComplete override removed, or if we run a similar 0ms test instead based on asyncprogressworker

@mkrufky mkrufky force-pushed the AsyncWakeRequestor branch 2 times, most recently from de3c75a to 3fa8441 Compare September 26, 2017 12:27
@mkrufky
Copy link
Collaborator Author

mkrufky commented Oct 2, 2017

@bnoordhuis , can you please take a look regarding the locking and thread safety?

@bnoordhuis
Copy link
Member

I'll add it to my TBR (To Be Reviewed) list but at +341 −43 it's a bit too big to review right now.

bnoordhuis

This comment was marked as off-topic.

@kkoopa kkoopa merged commit b29be26 into nodejs:master Oct 6, 2017
@kkoopa
Copy link
Collaborator

kkoopa commented Oct 6, 2017

Awesome! Thank you both for all your efforts.

@mkrufky mkrufky deleted the AsyncWakeRequestor branch October 6, 2017 08:44
@maxbrunsfeld
Copy link

Since upgrading to nan 2.8.0, my module which uses AsyncProgressWorkerBase has started to intermittently crash. I don't yet have a minimal repro for the bug, but I will try to produce one now.

@kkoopa
Copy link
Collaborator

kkoopa commented Nov 16, 2017

For starters, you seem to be missing HandleScopes in all HandleOkCallbacks.
The Finish() method returning a pair of Locals is not really proper, as it does not use an EscapableHandleScope (which would make returning a pair of Locals impossible), but it should be alright as it is currently used. Same goes for the Finish method returning a single Local.

The most likely reason for crashes is the missing HandleScopes in HandleOKCallback. Whenever execution goes from libuv to node, a HandleScope must be set up if any handles are created.

@mkrufky
Copy link
Collaborator Author

mkrufky commented Nov 16, 2017

The only change that comes to mind that affects AsyncProgressWorkerBase in this PR is this: b2cf2fe . Otherwise, the functionality of AsyncProgressWorkerBase remains exactly the same as it always was. kkoopa's answer is likely the problem, but before he posted, I was scrambling for a fix for you to try, and I wonder how the following patch would affect things:

diff --git a/nan.h b/nan.h
index 7c7699f..f1e5338 100644
--- a/nan.h
+++ b/nan.h
@@ -1626,6 +1626,11 @@ class Callback {
   virtual ~AsyncBareProgressWorkerBase() {
   }
 
+  void WorkComplete() {
+    WorkProgress();
+    AsyncWorker::WorkComplete();
+  }
+
   virtual void WorkProgress() = 0;
 
   virtual void Destroy() {
@@ -1807,11 +1812,6 @@ class AsyncProgressQueueWorker : public AsyncBareProgressQueueWorker<T> {
     uv_mutex_destroy(&async_lock);
   }
 
-  void WorkComplete() {
-    WorkProgress();
-    AsyncWorker::WorkComplete();
-  }
-
   void WorkProgress() {
     uv_mutex_lock(&async_lock);
 

If that fixes things, don't rely on it -- I can't promise we'll merge it (although it would be nice to hear if it does) Regardless, please fix your missing HandleScope in HandleOKCallback and hopefully that will eliminate the problem altogether.

@maxbrunsfeld
Copy link

Hmm, adding Nan::HandleScope scope to the top of every HandleOkCallback doesn't seem to affect the problem. Thanks for pointing that out though.

@maxbrunsfeld
Copy link

I am guessing that my problem is related to the fact that I construct an instance of a AsyncProgressWorkerBase subclass on the stack, as a convenient way of implementing a synchronous variant of an async method. The crash seems to occur sometime after calling my loadSync method. Is there a reason why instances of AsyncProgressWorkerBase should need to be heap-allocated?

@maxbrunsfeld
Copy link

maxbrunsfeld commented Nov 16, 2017

I noticed that the async field is no longer heap allocated as of b2cf2fe. Prior to that commit, what code was responsible for deleting that object?

@agnat
Copy link
Collaborator

agnat commented Nov 16, 2017

The post-close delete in 1672 I believe...

@kkoopa
Copy link
Collaborator

kkoopa commented Nov 16, 2017

For the asynchronous case it almost surely needs to be allocated elsewhere than the stack, otherwise it would likely not survive until the worker thread is done. As for using it synchronously, nothing comes straight to mind, but I have never thought of doing that.

@agnat
Copy link
Collaborator

agnat commented Nov 16, 2017

Everything is a AsyncWorker... even a synchronous one.

:-P

@mkrufky
Copy link
Collaborator Author

mkrufky commented Nov 16, 2017

If you don't queue the worker using AsyncQueueWorker then nobody ever calls worker->Destroy()

@agnat
Copy link
Collaborator

agnat commented Nov 16, 2017

Ok, I think I understand what happens...

Your stack allocated worker goes out of scope. This triggers an async close. The async close handler, that runs later, accesses something that is already gone... boom.

@agnat
Copy link
Collaborator

agnat commented Nov 16, 2017

Ok, I don't get it ... @mkrufky is more qualified to spot this anyway ...

@maxbrunsfeld
Copy link

maxbrunsfeld commented Nov 17, 2017

Ok, I'm just going to avoid using my AsyncWorkerBase subclass synchronously. I'll have to pull out its guts into a separate object. I didn't realize that the constructor of that class did non-trivial work.

It might be nice if the uv_async_init happened in a separate initialization method that AsyncQueueWorker would call so that you could construct these worker objects on the stack if desired.

@mkrufky
Copy link
Collaborator Author

mkrufky commented Nov 17, 2017

I think, change TextBufferWrapper::load_sync() as follows:

LoadWorker *worker = new LoadWorker(...)

and obviously, fix all the worker pointers, and then call

  worker->Destroy();

at the end of the function.

(and never delete worker - let worker->Destroy() do it for you)

@maxbrunsfeld
Copy link

maxbrunsfeld commented Nov 17, 2017

@mkrufky Ha! That's a much quicker fix - great suggestion!

@kkoopa
Copy link
Collaborator

kkoopa commented Nov 17, 2017

And the HandleScopes I mentioned in HandleOkCallback should not be necessary after all, since AsyncWorker::WorkComplete makes a HandleScope before calling HandleOKCallback (unless AsyncWorker::WorkComplete has been overridden).

@mkrufky
Copy link
Collaborator Author

mkrufky commented Nov 17, 2017

AsyncWorker::WorkComplete never gets called in this case.

@maxbrunsfeld
Copy link

Ah, thanks for the follow-up @kkoopa.

I actually don't call HandleOkCallback in the synchronous case. I call an underlying method called Finish.

@maxbrunsfeld
Copy link

maxbrunsfeld commented Nov 17, 2017

In general, is this an ok way of creating synchronous variants of async methods? In other words, is creating AsyncProgressWorker instances and calling Destroy on them manually without ever using Nan::AsyncQueueWorker a supported use case that I can rely on?

It is convenient for me, but I don't mind extracting the logic into a separate object if need be. That would just be a little more boilerplate.

@kkoopa
Copy link
Collaborator

kkoopa commented Nov 17, 2017

Please, no. This is the first time I have ever seen this. The AsyncWorker classes are rather finicky (for a number of reasons, the main one being legacy). The standard way is to take the body of AsyncWorker::Execute, put it in a function f and call f from AsyncWorker::Execute. Export two methods: FooSync and FooAsync. Do the async setup in FooAsync, while FooSync simply calls f.

@maxbrunsfeld
Copy link

maxbrunsfeld commented Nov 17, 2017

Yeah, in this case it's a little more complicated because there is a bunch of intermediate state that both Execute and HandleOkCallback need to read and write, but I see what you're saying. Thanks for all of the quick help and clarification.

maxbrunsfeld referenced this pull request in atom/superstring Nov 17, 2017
This was an invalid use of the nan API which became more apparent
w/ nan 2.8
@maxbrunsfeld
Copy link

Maybe Destroy shouldn't be public.

@kkoopa
Copy link
Collaborator

kkoopa commented Nov 17, 2017

It is a virtual method serving part of the role of a destructor, which are public. I am not sure if it could be made protected, but that would constitute a breaking change so it will not happen in the foreseeable future. Again, legacy.

@mkrufky
Copy link
Collaborator Author

mkrufky commented Nov 20, 2017

I think we can safely do it without breaking anything, unless there's other code out there that calls these functions worker->Destroy(), AsyncExecute() & AsyncExecuteComplete() directly, and there shouldn't be, since I believe the only supported invocation is made via AsyncQueueWorker:

diff --git a/nan.h b/nan.h
index 7c7699f..a0b82ee 100644
--- a/nan.h
+++ b/nan.h
@@ -1502,6 +1502,7 @@ class Callback {
 };
 
 /* abstract */ class AsyncWorker {
+ friend class AsyncQueueWorker;
  public:
   explicit AsyncWorker(Callback *callback_)
       : callback(callback_), errmsg_(NULL) {
@@ -1571,14 +1572,14 @@ class Callback {
 
   uv_work_t request;
 
-  virtual void Destroy() {
-      delete this;
-  }
-
  protected:
   Persistent<v8::Object> persistentHandle;
   Callback *callback;
 
+  virtual void Destroy() {
+      delete this;
+  }
+
   virtual void HandleOKCallback() {
     HandleScope scope;
 
@@ -1628,6 +1629,7 @@ class Callback {
 
   virtual void WorkProgress() = 0;
 
+ protected:
   virtual void Destroy() {
       uv_close(reinterpret_cast<uv_handle_t*>(&async), AsyncClose_);
   }
@@ -1856,25 +1858,31 @@ class AsyncProgressQueueWorker : public AsyncBareProgressQueueWorker<T> {
   std::queue<std::pair<T*, size_t> > asyncdata_;
 };
 
-inline void AsyncExecute (uv_work_t* req) {
-  AsyncWorker *worker = static_cast<AsyncWorker*>(req->data);
-  worker->Execute();
-}
+class AsyncQueueWorker {
+ public:
+  explicit AsyncQueueWorker(AsyncWorker* worker) {
+    uv_queue_work(
+        uv_default_loop()
+      , &worker->request
+      , AsyncExecute
+      , reinterpret_cast<uv_after_work_cb>(AsyncExecuteComplete)
+    );
+  }
 
-inline void AsyncExecuteComplete (uv_work_t* req) {
-  AsyncWorker* worker = static_cast<AsyncWorker*>(req->data);
-  worker->WorkComplete();
-  worker->Destroy();
-}
+ private:
+  NAN_DISALLOW_ASSIGN_COPY_MOVE(AsyncQueueWorker)
 
-inline void AsyncQueueWorker (AsyncWorker* worker) {
-  uv_queue_work(
-      uv_default_loop()
-    , &worker->request
-    , AsyncExecute
-    , reinterpret_cast<uv_after_work_cb>(AsyncExecuteComplete)
-  );
-}
+  inline static void AsyncExecute (uv_work_t* req) {
+    AsyncWorker *worker = static_cast<AsyncWorker*>(req->data);
+    worker->Execute();
+  }
+
+  inline static void AsyncExecuteComplete (uv_work_t* req) {
+    AsyncWorker* worker = static_cast<AsyncWorker*>(req->data);
+    worker->WorkComplete();
+    worker->Destroy();
+  }
+};
 
 namespace imp {
 

...passes all the tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants