-
Notifications
You must be signed in to change notification settings - Fork 418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
here.runningTasks()
doesn't seem to respect serial
statements
#17562
Comments
Note that this impacts begins and coforalls as well: serial true {
writeln("begin");
sync {
begin writeln(here.runningTasks());
begin writeln(here.runningTasks());
}
writeln("cobegin");
cobegin {
writeln(here.runningTasks());
writeln(here.runningTasks());
}
writeln("coforall");
coforall 1..2 {
writeln(here.runningTasks());
}
} |
But not forall. My 6-core reports it correctly within such a loop. But incorrectly as 4 in the first case you noted. It actually gets your last two examples correct. |
This might be the wrong approach but here is a quick fix diff --git a/modules/internal/ChapelBase.chpl b/modules/internal/ChapelBase.chpl
index e2337b7496..643954f847 100644
--- a/modules/internal/ChapelBase.chpl
+++ b/modules/internal/ChapelBase.chpl
@@ -1196,7 +1196,8 @@ module ChapelBase {
e.taskCnt += 1;
}
}
- if countRunningTasks {
+ var isSerial: bool = __primitive("task_get_serial");
+ if countRunningTasks && !isSerial {
here.runningTaskCntAdd(1); // decrement is in _waitEndCount()
chpl_comm_task_create(); // countRunningTasks is a proxy for "is local"
// here. Comm layers are responsible for the
@@ -1210,7 +1211,8 @@ module ChapelBase {
proc _upEndCount(e: _EndCount, param countRunningTasks=true, numTasks) {
e.i.add(numTasks:int, memoryOrder.release);
- if countRunningTasks {
+ var isSerial: bool = __primitive("task_get_serial");
+ if countRunningTasks && !isSerial {
if numTasks > 1 {
here.runningTaskCntAdd(numTasks:int-1); // decrement is in _waitEndCount()
}
@@ -1265,7 +1267,8 @@ module ChapelBase {
// Wait for all tasks to finish
e.i.waitFor(0, memoryOrder.acquire);
- if countRunningTasks {
+ var isSerial: bool = __primitive("task_get_serial");
+ if countRunningTasks && !isSerial {
const taskDec = if isAtomic(e.taskCnt) then e.taskCnt.read() else e.taskCnt;
// taskDec-1 to adjust for the task that was waiting for others to finish
here.runningTaskCntSub(taskDec-1); // increment is in _upEndCount()
@@ -1290,7 +1293,8 @@ module ChapelBase {
// Wait for all tasks to finish
e.i.waitFor(0, memoryOrder.acquire);
- if countRunningTasks {
+ var isSerial: bool = __primitive("task_get_serial");
+ if countRunningTasks && !isSerial {
if numTasks > 1 {
here.runningTaskCntSub(numTasks:int-1);
} It's possible that the check for the serial state looks in the task local storage and adds some overhead to task launch. I don't know if that's an issue. I also recall something about tracking running tasks in the compiler but I don't remember more details, so we'll want to check if the compiler needs changes as well. |
This is along the lines of what I was thinking, but I haven't had time to think about any downsides to this approach. The compiler code is only for on-stmts and I don't think is impacted by serial statements, but I'll have to check. |
Summary of Problem
While working with @strikeraryu on a recursive parallel algorithm, we're finding that
here.runningTasks()
seems to be giving the wrong value when used within aserial
block. Specifically, it seems to count the number of tasks created rather than the number that are concurrently executing. This is problematic because it means that the idiom we usually advertise for throttling parallelism:doesn't necessarily work as intended because the number of running tasks will always be over-counted.
Steps to Reproduce
Source Code:
Here's a simple program that demonstrates this:
On my 4-core system, I'm seeing
here.runningTasks()
always print as 4, and the other three writeln()s seem to always appear in sequence. Whereas if I remove thecobegin
, I see the number of running tasks as 1, and if I remove theserial
, I see the output emerge in random orders as expected. This suggests to me that the serial is working as intended, but that the accounting for the number of running tasks is incorrect.Compile command:
chpl testit.chpl
./testit
Configuration Information
chpl --version
:chpl version 1.25.0 pre-release (d54a88a62f)
$CHPL_HOME/util/printchplenv --anonymize
:The text was updated successfully, but these errors were encountered: