Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task assignment to current Pool has unintended consequences #2914

Closed
biddisco opened this issue Sep 24, 2017 · 10 comments

Comments

@biddisco
Copy link
Contributor

commented Sep 24, 2017

The changes made in the recent PR #2891 and specifically the commit 0c4b3b5 has the unintended side effect that in our matrix code that uses one pool for communication and another for computation, that tasks made ready by the completion of the communication event are then placed on the mpi pool - where they perform poorly due to a there being only a few threads on that pool.
Several fixes are possible

  • Add a flag to the mpi or communication pool to in effect say "this pool does not accept default tasks" so that only tasks placed on the pool by an executor are allowed and others spawned as sub tasks - or continuations will go to the default pools - this can be done (for example) by extending the scheduler flags at creation time.

  • force the user to add an executor to the continuations made to tasks that go onto a special pool - if they don't want those continuations to also go onto the special pools. This is reasonable since the user has already extended their code so request a special pool - so annotating the continuations with executors is not unthinkable - however, I prefer a solution like the first at the moment.

@hkaiser

This comment has been minimized.

Copy link
Member

commented Sep 24, 2017

FWIW, by default, continuations should run on the same pool as the tasks they depend on.

@biddisco

This comment has been minimized.

Copy link
Contributor Author

commented Sep 24, 2017

Yes. agreed. But in some cases, flagging that pool with a 'no default tasks' bit so that instead the continuations go onto the default pool would be useful. imho. Unless we go for option 2 above - or another solution. The reason I prefer this solution is basically because if you have to annotate every continuation with an executor and then change one thing - you have to go through all the places and change it again - adding a 'no default tasks' flag to a pool makes it much easier.

@hkaiser

This comment has been minimized.

Copy link
Member

commented Sep 24, 2017

If the default behavior is not what is required, simply explicitly pass an executor representing the pool where things should run. I don't think we should introduce yet another way to control where/how things are scheduled to run.

@biddisco

This comment has been minimized.

Copy link
Contributor Author

commented Sep 28, 2017

In the particular use case, we have a large number of matrix operations that are performed and generated futures, to which continuations are attached - certain of the continuations involve sending data to other nodes or receiving data from others and the task returned from the receive triggers a new wave of matrix operations. Unfortunately - the futures are part of a list with indexing based on block numbers {i,j} of the matrix sub tiles which is overwritten on each iteration with new futures.
The problem is that calculating the index of the tasks that are continuations of the comms tasks is a non trivial operation and adding code to ensure that the continuations run on the matrix pool is a PITA. If I simply add a flag to the mpi pool to say "no default tasks" then everything will just drop into place.

@hkaiser

This comment has been minimized.

Copy link
Member

commented Sep 28, 2017

So all you need is a special pool type and wrap it into an executor - yes?

@biddisco

This comment has been minimized.

Copy link
Contributor Author

commented Sep 28, 2017

I'm not sure I understand - we already have a special pool and a customized_pool_executor - the problem is that I don't know which tasks are which when the futures come ready and the continuation are attached - currently most of the futures come from the default pool - and continuations go to the default pool - but occasionally a matrix block arrives from the network (comms task) and that is done on the mpi pool, but the future looks just like all the rest and continuations attached to it should go onto the default pool - not the mpi pool.

All I need to do is to add a check in here that says "if parent pool doesn't allow default tasks then use default pool"

{
detail::thread_pool_base *pool = nullptr;
if (get_self_ptr())
{
auto tid = get_self_id();
pool = tid->get_scheduler_base()->get_parent_pool();
}
else
{
pool = &default_pool();
}
pool->create_thread(data, id, initial_state, run_now, ec);
}
///////////////////////////////////////////////////////////////////////////
void threadmanager::register_work(
thread_init_data& data, thread_state_enum initial_state, error_code& ec)
{
detail::thread_pool_base *pool = nullptr;
if (get_self_ptr())
{
auto tid = get_self_id();
pool = tid->get_scheduler_base()->get_parent_pool();
}
else
{
pool = &default_pool();
}
pool->create_work(data, initial_state, ec);
}

Does that not seem simple enough?

@hkaiser

This comment has been minimized.

Copy link
Member

commented Sep 29, 2017

Yes, I understand? What I'm wondering about is whether this wouldn't be a 'constant' property of the pool? I.e. whether a certain pool in your scenario wouldn't always return 'that it doesn't allow default tasks'. If that's true, then the above could be rewritten as:

    {
        detail::thread_pool_base *pool = nullptr;
        if (get_self_ptr())
        {
            auto tid = get_self_id();
            pool = tid->get_scheduler_base()->get_parent_pool_if_default_tasks_are_allowed();
        }
        if (pool == nullptr)
        {
            pool = &default_pool();
        }
        pool->create_thread(data, id, initial_state, run_now, ec);
    }

Is that what you're looking for?

@biddisco

This comment has been minimized.

Copy link
Contributor Author

commented Sep 29, 2017

yes, that's basically it, but first what I'm doing is upgrading the cholesky code to use a custom_pool_executor on the default pool for our matrix tasks, instead of a default_executor. This should solve the problem without requiring any changes inside hpx.

It makes me wonder if now the default_executor should be simply replaced by a typedef into a custom_pool_executor on the default pool. I will reserve my thoughts on that until the changes I make turn out to be ok...

@biddisco

This comment has been minimized.

Copy link
Contributor Author

commented Sep 29, 2017

NB. above comment is not quite right because - we forgot to add priortiy and stacksize options to the custom_pool_executor, so I'm fixing that in hpx - so there are some changes needed, but not to the code we spoke about before.

@biddisco

This comment has been minimized.

Copy link
Contributor Author

commented Oct 4, 2017

This was resolved by #2921

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.