Fix empty(QueueCpuAsync) returning true even though the last task is still executing #627

Merged

Conversation

@BenjaminW3 (Member) commented Sep 3, 2018

Fixes #621

By the way, the test found that this did not work for QueueCpuSync and QueueCudaRtSync either (when the last task was a callback).

@@ -186,8 +189,7 @@ namespace alpaka
     queue::QueueCpuSync const & queue)
 -> bool
 {
-    alpaka::ignore_unused(queue);
-    return true;
+    return !queue.m_spQueueImpl->m_bCurrentlyExecutingTask;
Contributor:

I am wondering for which case this is useful. If I call enqueue, the task will not return until it is finished, so nobody except the task itself has a chance to check the queue for emptiness. Of course a concurrent thread could test the queue, but in that case shouldn't m_bCurrentlyExecutingTask be an atomic?

@BenjaminW3 (Member, Author):

The task being executed can be a host callback which can check the state (this is exactly what the unit test is doing).
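
For illustration, a minimal standalone sketch (hypothetical code, not alpaka itself) of a sync queue whose last task is a host callback checking the queue state, mirroring the empty() fix in the diff above:

```cpp
// Hypothetical sketch: a sync queue runs each task inline; a host callback
// can query empty() while it is itself the task being executed. With the
// old "return true;" the callback would wrongly see an empty queue.
#include <cassert>

struct SyncQueue
{
    bool m_bCurrentlyExecutingTask = false;

    template<typename TTask>
    void enqueue(TTask task)
    {
        m_bCurrentlyExecutingTask = true;
        task();                            // executed inline, like QueueCpuSync
        m_bCurrentlyExecutingTask = false;
    }

    bool empty() const
    {
        return !m_bCurrentlyExecutingTask; // the fix from this PR
    }
};

int main()
{
    SyncQueue queue;
    queue.enqueue([&queue](){ assert(!queue.empty()); }); // callback sees a busy queue
    assert(queue.empty());                                // idle again afterwards
}
```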

Contributor:

Ah, makes sense. I will just check the changes with my code. 😉


task();
t.join();
@BenjaminW3 (Member, Author):

The code here is not new; it is only copied over from QueueCudaRtAsync. It had been forgotten here when a bug was fixed over there.

theZiz previously approved these changes Sep 4, 2018
@theZiz (Contributor) left a comment:

lgtm, thanks!

@BenjaminW3 (Member, Author):
Hmmm... CI had one single build that was not happy with the QueueCpuAsync unit test. I think I understand what the problem is, but it is really, really hard to solve.

The test is doing wait(queue) and afterwards expects empty(queue) to be true. To understand the failure you have to know how the CPU queue implements waiting: wait(queue) simply enqueues an event and waits for this event. The event itself is only a condition_variable which is notified once the event is "executed". Now there is a very short time window between the event notification and the line where the task is counted as finished. In the end it boils down to the event being a task that does essentially m_conditionVariable.notify_all(); while the task queue does --m_numActiveTasks; afterwards. If the thread is interrupted in between, the event has been notified, the wait for the event/queue finishes, and empty(queue) still returns false.
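
A minimal standalone sketch of this race (hypothetical code modeled on the names above; not the actual alpaka implementation):

```cpp
// Sketch of the race window: the waiter can wake up at (1), between the
// event notification and the task-count decrement at (2), and then observe
// empty(queue) == false even though wait(queue) has already returned.
#include <atomic>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

std::mutex m_mutex;
std::condition_variable m_conditionVariable;
std::atomic<int> m_numActiveTasks{0};
bool eventExecuted = false;

bool empty() { return m_numActiveTasks == 0; }

// Worker thread: "executes" the enqueued event task.
void executeEventTask()
{
    {
        std::lock_guard<std::mutex> lk(m_mutex);
        eventExecuted = true;
    }
    m_conditionVariable.notify_all(); // (1) wait(queue) may return here ...
    --m_numActiveTasks;               // (2) ... before the count drops here.
}

int main()
{
    ++m_numActiveTasks;               // the event task is enqueued
    std::thread worker(executeEventTask);

    // wait(queue): block until the event's condition variable is notified.
    {
        std::unique_lock<std::mutex> lk(m_mutex);
        m_conditionVariable.wait(lk, [](){ return eventExecuted; });
    }
    // Sometimes prints "false": exactly the failing CI expectation.
    std::printf("empty(queue) after wait: %s\n", empty() ? "true" : "false");
    worker.join();
}
```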

@BenjaminW3 (Member, Author):

So in the end I will have to fix the waiting for QueueCpuAsync, either by not using events and doing something different, or by changing the behaviour of the events.
This may take some time. Until then, the current state of this branch already fixes your problem but will not be merged.

@theZiz (Contributor) commented Sep 6, 2018

I understand the problem, but except for the test I think the solution is "less wrong" than the old behaviour. Of course, if you check for emptiness you will get an unexpected result, but at least you can assume the queue is empty in the sense that no user-given task is executed anymore.
Maybe this would even be a solution: flag tasks according to whether they are "real" user tasks or internal tasks like the one used for event waiting. Not incrementing and decrementing the counter for the internal event task of wait(queue) might be a simple fix.

@BenjaminW3 (Member, Author):

I have now changed the events so that they are not signaled ready before they are removed from the queue. I had to try many approaches to find the current solution, but now it is cleaner than the original version.
I might have to restore a gcc-4.9 workaround that I removed for the moment, but CI will tell me.

@BenjaminW3 force-pushed the topic-fix-QueueCpuAsync branch 6 times, most recently from e507323 to 5f0ea8e on September 15, 2018
Signaling the event ready from within the enqueued event function is
wrong because waits for the event may be resolved even though the work
queue still has the event task itself in progress.
Using the future of the work queue is better.
while(enqueueCount > m_LastReadyEnqueueCount)
{
auto future = m_future;
lk.unlock();
Contributor:

Shouldn't that be the other way around? Locking, working, unlocking?

@BenjaminW3 (Member, Author) commented Sep 17, 2018:

No, I think this is correct. At a specific point A in time we want to wait. In this case we lock the event mutex, check whether we need to wait at all, and copy the current future. Then we unlock the event and wait for the future to finish. This waiting can not be done while locked because it might deadlock for multiple reasons, most prominently because the execution of the event itself requires the lock to increase m_LastReadyEnqueueCount.
We wait on a copy of the future because in the meantime the event could have been re-enqueued for a later point in time. This is no problem because after waiting we simply lock the event again; the while condition is then false and we have correctly waited for the event.
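
A sketch of this locking pattern (member names are hypothetical, modeled on the snippet above; not the verbatim alpaka code):

```cpp
// Wait for all enqueues of the event up to "point A" (the snapshot taken
// under the lock), without holding the lock while waiting on the future.
#include <cstdint>
#include <future>
#include <mutex>

struct EventImpl
{
    std::mutex m_mutex;
    std::shared_future<void> m_future;    // replaced on every (re-)enqueue;
                                          // ready once the work queue has
                                          // removed the event task
    std::uint64_t m_enqueueCount = 0;     // incremented on every enqueue
    std::uint64_t m_LastReadyEnqueueCount = 0;

    void wait()
    {
        std::unique_lock<std::mutex> lk(m_mutex);
        auto const enqueueCount = m_enqueueCount;  // point A
        while(enqueueCount > m_LastReadyEnqueueCount)
        {
            auto future = m_future; // copy under the lock: a re-enqueue may
                                    // swap m_future while we are waiting
            lk.unlock();            // waiting while locked would deadlock:
                                    // the event task needs the lock to
                                    // increase m_LastReadyEnqueueCount
            future.wait();
            lk.lock();              // re-check the loop condition
        }
    }
};
```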

Contributor:

You are right, sorry, I mixed up lock and mutex 🤦‍♂️

@theZiz (Contributor) commented Sep 17, 2018

Now for something completely different!
I found the "bug". I still had the topic branch with your first fix attempt and had just merged the new changes into it. Well, as it turns out, that failed. After removing the branch and re-pulling your latest changes, everything works fine, thx! 😆 At least we got a new unit test... 🙄

@BenjaminW3 (Member, Author):

But the new unit test does not compile because of many warnings, and it is not really new because both things are already tested elsewhere. I have pushed some minor changes and reverted the test.

@BenjaminW3 (Member, Author):

@theZiz Please approve if this works for you.

@theZiz (Contributor) commented Sep 18, 2018

> @theZiz Please approve if this works for you.

Works! 👍

@BenjaminW3 (Member, Author):

@theZiz You should be allowed to give a real review approval so that I can merge it.

@theZiz (Contributor) commented Sep 18, 2018

> @theZiz You should be allowed to give a real review approval so that I can merge it.

I doubt it. @ax3l or @psychocoderHPC need to approve the PR.

Edit: And I was wrong...

theZiz previously approved these changes Sep 18, 2018
@psychocoderHPC (Member):

I will check the PR in ~2h

@BenjaminW3 (Member, Author):

@psychocoderHPC Are you done?

@psychocoderHPC (Member):

No, sorry, I am starting now. I was busy with other tasks.

psychocoderHPC previously approved these changes Sep 18, 2018
@psychocoderHPC (Member) left a comment:

The style issue is not important, but the other question about using callbacks is very important.

#endif
{
auto boundTask([=](){return task(args...);});
auto decrementNumActiveTasks = [this](){--m_numActiveTasks;};
Member:

not important but should be: auto decrementNumActiveTasks([this](){--m_numActiveTasks;});

@BenjaminW3 (Member, Author):

Fixed

lock,
[pCallbackSynchronizationData](){
return pCallbackSynchronizationData->notified;
// We start a new std::thread which stores the task to be executed.
Member:

Do we always enqueue CUDA callbacks with alpaka?

Member:

OK, after reviewing the corresponding parts again, I found that this is the code to create a callback, introduced in #373.
The code itself looks like the kernel enqueue code, but kernels are started in the namespace alpaka::exec.

@BenjaminW3 (Member, Author):

I have not changed anything about the cudaStreamAddCallback usage here. The bug I found was that the callback was executed directly within the CUDA callback thread. Within the CUDA callback thread you are not allowed to make CUDA calls, which I did in my test. This had already been fixed in the Async version; I simply copied the fix from over there.
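
For illustration, a sketch of the pattern described above, modeled on the names in the snippets (pCallbackSynchronizationData, notified, t.join()); this is a hedged reconstruction, not the verbatim alpaka code:

```cpp
// The CUDA stream callback must not make CUDA calls itself, so it hands the
// user task to a new std::thread (where CUDA calls are allowed) and blocks
// until that thread has finished, preserving the ordering within the stream.
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <cuda_runtime.h>

struct CallbackSynchronizationData
{
    std::mutex m_mutex;
    std::condition_variable m_event;
    bool notified = false;
    std::function<void()> task;  // the user-given host callback
};

// Registered via cudaStreamAddCallback(stream, cudaRtCallback, pData, 0u).
void CUDART_CB cudaRtCallback(cudaStream_t /*stream*/, cudaError_t /*status*/, void * userData)
{
    auto * pCallbackSynchronizationData =
        static_cast<CallbackSynchronizationData *>(userData);

    // Execute the task in a new thread where CUDA calls are allowed.
    std::thread t(
        [pCallbackSynchronizationData]()
        {
            pCallbackSynchronizationData->task();
            {
                std::lock_guard<std::mutex> lock(pCallbackSynchronizationData->m_mutex);
                pCallbackSynchronizationData->notified = true;
            }
            pCallbackSynchronizationData->m_event.notify_one();
        });

    // Block the CUDA callback thread until the task has finished so that
    // work enqueued after the callback still runs after it.
    std::unique_lock<std::mutex> lock(pCallbackSynchronizationData->m_mutex);
    pCallbackSynchronizationData->m_event.wait(
        lock,
        [pCallbackSynchronizationData](){ return pCallbackSynchronizationData->notified; });
    t.join();
}
```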

@psychocoderHPC (Member) commented Sep 18, 2018

From my side it can be merged; I expect @BenjaminW3 will fix the style issue.

@BenjaminW3 (Member, Author):

Please re-approve

@psychocoderHPC merged commit 3dbf482 into alpaka-group:develop on Sep 19, 2018
@BenjaminW3 deleted the topic-fix-QueueCpuAsync branch on September 19, 2018
psychocoderHPC added a commit to psychocoderHPC/alpaka that referenced this pull request Sep 25, 2018
Fix empty(QueueCpuAsync) returning true even though the last task is still executing

backport of alpaka-group#627

Fixes alpaka-group#621

By the way, the test found that this did not work for QueueCpuSync and QueueCudaRtSync either (when the last task was a callback).