
Implement parallel::partition_copy. #2716

Merged
merged 14 commits into master on Jul 10, 2017

Conversation

3 participants
@taeguk
Member

taeguk commented Jun 25, 2017

Checklist

  • Implementation of partition_copy.
  • Unit tests for partition_copy.
  • Benchmark codes for partition_copy.
  • Benchmark using policy.executor() instead of hpx::launch::sync in scan_partitioner.
  • Adapting iterator requirements (#2733).
  • Adapt to Ranges TS (#1668).
  • Adapt to master branch.

Issue List

Note

2017/6/25

I implemented partition_copy with reference to copy_if.
The behavior is correct, but the benchmark results are very bad.
I must find a new, more efficient approach and re-implement partition_copy.

2017/7/4

The bad performance is due to #2325.
Using policy.executor() instead of hpx::launch::sync in scan_partitioner yields good performance. (https://gist.github.com/taeguk/6abe03f9b4cb878872d2bb634cae65b0)

@hkaiser hkaiser added this to the 1.1.0 milestone Jun 25, 2017

@hkaiser hkaiser added this to Work in progress in Standard Algorithms Jun 25, 2017

@hkaiser hkaiser referenced this pull request Jun 25, 2017

Open

Implement N4409 on top of HPX #1141

40 of 41 tasks complete
@hkaiser

Member

hkaiser commented Jun 25, 2017

@taeguk: invoking the predicate twice for each element is not allowed, see here: http://en.cppreference.com/w/cpp/algorithm/partition_copy

Complexity

Exactly distance(first, last) applications of p.
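To make the complexity requirement concrete, here is a minimal sequential sketch (the name seq_partition_copy is hypothetical; this is not the HPX implementation) that applies the predicate exactly distance(first, last) times:

```cpp
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

// Minimal sequential sketch (hypothetical, not the HPX implementation):
// the predicate is applied exactly once per element, which satisfies the
// required complexity of exactly distance(first, last) applications of p.
template <typename InIter, typename OutIter1, typename OutIter2, typename Pred>
std::pair<OutIter1, OutIter2>
seq_partition_copy(InIter first, InIter last,
                   OutIter1 dest_true, OutIter2 dest_false, Pred pred)
{
    for (; first != last; ++first)
    {
        if (pred(*first))            // single application of pred per element
            *dest_true++ = *first;
        else
            *dest_false++ = *first;
    }
    return std::make_pair(dest_true, dest_false);
}
```

Parallelizing this is what makes the single-application rule hard: a chunk cannot know its output offsets without the predicate results of the preceding chunks.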

@taeguk

Member

taeguk commented Jun 25, 2017

@hkaiser You're right, I did not know that.
Still, it does not seem good to allocate additional memory, but unfortunately I see no way to avoid it.
It would be nice to reuse the memory of the container pointed to by the output iterator, but that does not seem feasible given how general the algorithm interface is.

@hkaiser

Member

hkaiser commented Jun 25, 2017

Yes, I agree. I think the only viable solution is to allocate that additional array of Booleans as you had it in the first place.
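A sketch of the flag-array idea under discussion (assumed shape with a hypothetical name, not the actual HPX code): evaluate the predicate once per element and record the result in an auxiliary Boolean array, then do the copies driven by the stored flags. This keeps the predicate count at exactly distance(first, last) at the cost of extra memory.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Sketch of the flag-array approach (assumed shape, not the actual HPX
// code): pass 1 applies pred exactly once per element and stores the
// outcome; pass 2 copies each element to the destination chosen by its
// stored flag. In a parallel version, pass 1 would run per chunk and a
// scan over per-chunk counts would give each chunk its output offsets.
template <typename FwdIter, typename OutIter1, typename OutIter2, typename Pred>
std::pair<OutIter1, OutIter2>
flagged_partition_copy(FwdIter first, FwdIter last,
                       OutIter1 dest_true, OutIter2 dest_false, Pred pred)
{
    std::vector<bool> flags;                 // the additional Boolean array
    for (FwdIter it = first; it != last; ++it)
        flags.push_back(pred(*it));          // exactly one application each

    std::size_t i = 0;
    for (FwdIter it = first; it != last; ++it, ++i)
    {
        if (flags[i])
            *dest_true++ = *it;
        else
            *dest_false++ = *it;
    }
    return std::make_pair(dest_true, dest_false);
}
```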

hpx/parallel/algorithms/partition.hpp (outdated)
{
while (first != last)
{
if (hpx::util::invoke(pred, hpx::util::invoke(proj, *first)))

@Naios

Naios Jun 25, 2017

Contributor

You should perfectly forward callable objects, since they can be overloaded on r-value references:

struct my_callable {
  void operator()()&& {
    // ...
  }
};

@hkaiser

hkaiser Jun 25, 2017

Member

@Naios: Not sure if all of our compilers support that. We should add a feature test if we want to use this. Also, if we do, this could be applied in many places.

@hkaiser

hkaiser Jun 25, 2017

Member

@Naios: To clarify, I meant the operator()() &&

@hkaiser

hkaiser Jun 25, 2017

Member

@Naios: also, in this context, perfect forwarding wouldn't work as pred (and proj) shouldn't be invalidated (as it might be, if moved); pred will be used more than once.

@Naios

Naios Jun 25, 2017

Contributor

@hkaiser I guess all required compiler versions should support it:

  • GCC since 4.8 or 4.9
  • Clang since 3.3 or 3.4
  • MSVC since 2015 (19.0)

Either way, supporting perfect forwarding for callable objects is a safe choice for the future.

// EDIT Regarding the second comment: yes, that's right.
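A small illustration of the two points above (the counting_pred name is hypothetical): a call operator can be overloaded on the value category of the callable object itself, but a predicate that the algorithm must invoke repeatedly has to be kept as an l-value rather than forwarded and potentially moved.

```cpp
#include <cassert>

// Hypothetical example: operator() overloaded on the value category of the
// callable object. Perfect forwarding would select the && overload for a
// temporary, but a predicate that must be applied more than once (as in
// partition_copy) has to stay an l-value so its state is not invalidated.
struct counting_pred
{
    int calls = 0;

    bool operator()(int x) &      // l-value overload: safe to call repeatedly
    {
        ++calls;
        return x > 0;
    }

    bool operator()(int x) &&     // r-value overload: one-shot invocation
    {
        return x > 0;
    }
};
```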

@hkaiser

Member

hkaiser commented Jun 26, 2017

I'd suggest keeping the work on the scan_partitioner separate from the partition_copy algorithm implementation.

@taeguk

Member

taeguk commented Jul 2, 2017

@hkaiser Sorry, this is my mistake. This commit is just a backup and will be removed.

@hkaiser

Member

hkaiser commented Jul 8, 2017

@taeguk: could we merge this step by step? Could we merge the current state of partition_copy?

Wrt the bad performance: I'd suggest making the change from hpx::launch::sync to policy.executor() and watching the regression tests (rostam.cct.lsu.edu).

@taeguk

Member

taeguk commented Jul 8, 2017

@hkaiser I had already done such a test. With policy.executor(), I could get acceptable performance.
I have some commits that are not pushed yet, and I have to fix some tiny things like inspection errors.

I stopped the progress of this PR because of #2733.
My plan was to add some commits adapting this PR to #2733 once #2733 is merged.
I am waiting for it to be merged.

Do you want me to finish this PR without adapting it to #2733?

@hkaiser

Member

hkaiser commented Jul 8, 2017

@taeguk ok, understood. I applied the first batch of name changes to #2733. I will try to apply the rest asap. In the meantime it might be helpful if you added your review to #2733 as our policy is to have at least one review before merging.

@taeguk taeguk changed the title from [Working] Implement parallel::partition_copy. to Implement parallel::partition_copy. Jul 8, 2017

@taeguk

Member

taeguk commented Jul 8, 2017

@hkaiser This is ready to be merged if you are willing to adapt this PR to #2733 yourself.
Otherwise, if you want me to do that, I will push more commits here after #2733 is merged.

@taeguk

Member

taeguk commented Jul 8, 2017

@hkaiser Sorry, I have one thing left to do for the Ranges TS. I will add partition_copy to container_algorithms/partition.hpp.

I have a question. In your opinion, which is better: one PR covering both the implementation of a parallel algorithm and its adaptation to the Ranges TS, or splitting that into two PRs?

@taeguk taeguk changed the title from Implement parallel::partition_copy. to [Working] Implement parallel::partition_copy. Jul 8, 2017

@taeguk taeguk changed the title from [Working] Implement parallel::partition_copy. to Implement parallel::partition_copy. Jul 9, 2017

@taeguk

Member

taeguk commented Jul 9, 2017

@hkaiser Finally, this is ready to be merged!

@hkaiser

LGTM, thanks a lot!

@hkaiser

Member

hkaiser commented Jul 10, 2017

> Benchmark results are inconsistent. (They are different whenever I benchmark.) (https://gist.github.com/taeguk/e40b628fefe025aa3ffc1335bbeed4ee)

@taeguk: Have you tried whether you get repeatable results when forcing the same seed? The different results could be caused by a different amount of work needed to perform the partitioning.
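The seeding suggestion can be illustrated with a small sketch (make_input is a hypothetical helper, not the benchmark from the gist): generating the input with a fixed seed makes the data, and therefore the partitioning work, identical across runs.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical input generator illustrating the suggestion above: a fixed
// seed makes std::mt19937 produce the same sequence every run, so the
// benchmark operates on identical data each time.
std::vector<int> make_input(std::size_t n, unsigned seed)
{
    std::mt19937 gen(seed);                          // fixed seed -> repeatable
    std::uniform_int_distribution<int> dist(0, 1000);
    std::vector<int> v(n);
    for (auto& x : v)
        x = dist(gen);
    return v;
}
```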

@taeguk

Member

taeguk commented Jul 10, 2017

@hkaiser Oh, I got almost the same results when forcing the same seed! Great :)

@hkaiser hkaiser merged commit 11bce88 into STEllAR-GROUP:master Jul 10, 2017

2 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
@hkaiser

Member

hkaiser commented Jul 10, 2017

@taeguk: Thanks a lot for your work on this. That's much appreciated! Please get in contact with aserio on IRC so he can send you a STE||AR-group t-shirt ;)

@hkaiser hkaiser referenced this pull request Jul 11, 2017

Open

Adapt all parallel algorithms to Ranges TS #1668

21 of 38 tasks complete

@hkaiser hkaiser moved this from Work in progress to Merged to master in Standard Algorithms Jul 21, 2017
