Simd Find #6302

Johan511 · 2023-07-16T12:17:10Z

Uses the simd-helpers from #6286 to add vectorization to STL algorithms.

hkaiser · 2023-08-05T20:04:45Z

libs/core/algorithms/include/hpx/parallel/algorithms/detail/find.hpp

@@ -34,8 +34,7 @@ namespace hpx::parallel::detail {
            sequential_find_t<ExPolicy>, Iterator first, Sentinel last,
            T const& value, Proj proj = Proj())
        {
-            return util::loop_pred<
-                std::decay_t<hpx::execution::sequenced_policy>>(
+            return util::loop_pred<ExPolicy>(


Could you add a static_assert verifying that ExPolicy is actually a sequential policy?

The change was made so that sequential_find_t can accept both sequenced and unsequenced execution policies. Would you rather I use static_assert(is_seq || is_unseq)?
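
A minimal sketch of the check proposed in this exchange, assuming the assertion should admit both kinds of policy. The policy tags and the `is_seq_or_unseq_v` trait are hypothetical stand-ins written for this illustration; they are not the HPX policy types or traits, which the real code would use instead.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for execution policy tags (not the HPX types).
struct sequenced_policy {};
struct unsequenced_policy {};
struct parallel_policy {};

// Hypothetical trait: true only for the sequenced and unsequenced tags.
template <typename ExPolicy>
inline constexpr bool is_seq_or_unseq_v =
    std::is_same_v<std::decay_t<ExPolicy>, sequenced_policy> ||
    std::is_same_v<std::decay_t<ExPolicy>, unsequenced_policy>;

// Sketch of a find that rejects any other policy at compile time.
template <typename ExPolicy, typename Iter, typename T>
Iter sequential_find_sketch(ExPolicy&&, Iter first, Iter last, T const& value)
{
    static_assert(is_seq_or_unseq_v<ExPolicy>,
        "sequential_find_sketch requires a sequenced or unsequenced policy");

    // Plain linear scan; stops at the first match or at last.
    for (; first != last; ++first)
    {
        if (*first == value)
            break;
    }
    return first;
}
```

Calling `sequential_find_sketch(parallel_policy{}, ...)` would fail to compile with the message above, which is the behaviour the reviewer asks for.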

hkaiser · 2023-08-05T20:05:48Z

libs/core/algorithms/include/hpx/parallel/unseq/simd_helpers.hpp

+    /*
+        Compiler and Hardware should also support vector operations for IterDiff,
+        else we see slower performance when compared to sequential version
+    */


What does this comment mean? Should that requirement be somehow enforced?

As discussed previously, changing data types can break vectorization, and I wanted to leave a note about that for the future. Performance regression tests might not be the best way to catch this, as they often show a lot of variance.
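
To make the concern concrete, here is a hedged sketch of the kind of loop such a helper might contain. `first_index_blocked` and its block size are hypothetical and are not the code in simd_helpers.hpp. The inner loop has no early exit and accumulates into a plain integer, a shape compilers can usually vectorize; whether they actually do depends on the element and index types and on the target hardware, which is the fragility the comment under review tries to document.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: index of the first element equal to value, or n.
template <typename T>
std::size_t first_index_blocked(T const* data, std::size_t n, T const& value)
{
    constexpr std::size_t block = 16;    // assumed block size
    std::size_t i = 0;

    for (; i + block <= n; i += block)
    {
        // Branch-free scan of one block: no early exit inside this loop.
        std::int32_t found = 0;
        for (std::size_t j = 0; j != block; ++j)
            found |= static_cast<std::int32_t>(data[i + j] == value);

        if (found != 0)
            break;    // the match lies somewhere in [i, i + block)
    }

    // Scalar scan: locates the match inside the flagged block, or
    // handles the tail that did not fill a whole block.
    for (; i != n; ++i)
    {
        if (data[i] == value)
            return i;
    }
    return n;
}
```

Inspecting the compiler's vectorization report for the inner loop is one way to see whether a change of types has silently turned it back into scalar code.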

Well, a comment doesn't help with this, does it? I think setting up performance tests would be the better option. @Pansysk75 can help with that.

@hkaiser Do you mean automated ones that report on GH? He asked me about this, but I discouraged him from doing that for now. It can certainly be done, though, if that's what you have in mind.
@Johan511 There are ways of getting around that variance, so that (hopefully) won't be a problem.

Well, I thought we could add it to our performance tests.

Added it in a separate PR.
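
The thread does not say how the variance would be handled. One common approach, shown here purely as an assumption, is to repeat the measurement and report the median instead of a single timing. The names `median` and `median_runtime_seconds` are hypothetical and are not part of the HPX performance test setup.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// Median of the samples (upper median for an even count).
// Takes the vector by value so the caller's data is left untouched.
inline double median(std::vector<double> samples)
{
    assert(!samples.empty());
    auto const mid = static_cast<std::ptrdiff_t>(samples.size() / 2);
    std::nth_element(samples.begin(), samples.begin() + mid, samples.end());
    return samples[static_cast<std::size_t>(mid)];
}

// Runs f() repeatedly and returns the median wall-clock time in seconds.
template <typename F>
double median_runtime_seconds(F&& f, std::size_t repetitions)
{
    std::vector<double> samples;
    samples.reserve(repetitions);

    for (std::size_t i = 0; i != repetitions; ++i)
    {
        auto const start = std::chrono::steady_clock::now();
        f();
        auto const stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double>(stop - start).count());
    }
    return median(std::move(samples));
}
```

The median is less sensitive than the mean to occasional slow runs caused by other activity on the machine, which is why it is often preferred for noisy benchmarks.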

libs/core/algorithms/include/hpx/parallel/util/loop.hpp

libs/core/algorithms/tests/unit/algorithms/util/test_simd_helpers.cpp

Johan511 · 2023-08-08T17:56:37Z

Speedups observed for seq vs unseq

hkaiser · 2023-08-08T20:44:19Z

Speedups observed for seq vs unseq

Nice!

StellarBot · 2023-08-08T21:33:49Z

Performance test report

HPX Performance

Comparison

BENCHMARK	FORK_JOIN_EXECUTOR	PARALLEL_EXECUTOR	SCHEDULER_EXECUTOR
For Each	(=)	??	-

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-08T21:21:49+00:00
HPX Commit	`dcb5415`	`f524d6e`
Clustername	rostam	rostam
Datetime	2023-05-10T14:50:18.616050-05:00	2023-08-08T16:30:27.710089-05:00
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile

Comparison

BENCHMARK	NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch	=

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-08T21:21:49+00:00
HPX Commit	`dcb5415`	`f524d6e`
Clustername	rostam	rostam
Datetime	2023-05-10T14:52:35.047119-05:00	2023-08-08T16:32:40.990850-05:00
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile

Comparison

BENCHMARK	FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR	PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR	SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add	(=)	(=)	(=)
Stream Benchmark - Scale	(=)	(=)	(=)
Stream Benchmark - Triad	(=)	(=)	(=)
Stream Benchmark - Copy	(=)	(=)	(=)

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-08T21:21:49+00:00
HPX Commit	`dcb5415`	`f524d6e`
Clustername	rostam	rostam
Datetime	2023-05-10T14:52:52.237641-05:00	2023-08-08T16:32:57.921975-05:00
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile

Explanation of Symbols

Symbol	MEANING
=	No performance change (confidence interval within ±1%)
(=)	Probably no performance change (confidence interval within ±2%)
(+)/(-)	Very small performance improvement/degradation (≤1%)
+/-	Small performance improvement/degradation (≤5%)
++/--	Large performance improvement/degradation (≤10%)
+++/---	Very large performance improvement/degradation (>10%)
?	Probably no change, but quite large uncertainty (confidence interval within ±5%)
??	Unclear result, very large uncertainty (±10%)
???	Something unexpected…

StellarBot · 2023-08-08T21:42:58Z

Performance test report

HPX Performance

Comparison

BENCHMARK	FORK_JOIN_EXECUTOR	PARALLEL_EXECUTOR	SCHEDULER_EXECUTOR
For Each	(=)	??	-

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-08T21:33:54+00:00
HPX Commit	`dcb5415`	`f663f52`
Datetime	2023-05-10T14:50:18.616050-05:00	2023-08-08T16:40:03.129538-05:00
Clustername	rostam	rostam
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Envfile
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1

Comparison

BENCHMARK	NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch	(=)

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-08T21:33:54+00:00
HPX Commit	`dcb5415`	`f663f52`
Datetime	2023-05-10T14:52:35.047119-05:00	2023-08-08T16:42:16.273814-05:00
Clustername	rostam	rostam
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Envfile
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1

Comparison

BENCHMARK	FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR	PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR	SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add	(=)	(=)	(=)
Stream Benchmark - Scale	(=)	(=)	(=)
Stream Benchmark - Triad	(=)	(=)	(=)
Stream Benchmark - Copy	(=)	(=)	(=)

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-08T21:33:54+00:00
HPX Commit	`dcb5415`	`f663f52`
Datetime	2023-05-10T14:52:52.237641-05:00	2023-08-08T16:42:33.204088-05:00
Clustername	rostam	rostam
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Envfile
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1

StellarBot · 2023-08-09T06:43:31Z

Performance test report

HPX Performance

Comparison

BENCHMARK	FORK_JOIN_EXECUTOR	PARALLEL_EXECUTOR	SCHEDULER_EXECUTOR
For Each	(=)	??	-

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-09T06:33:19+00:00
HPX Commit	`dcb5415`	`17aba85`
Clustername	rostam	rostam
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Datetime	2023-05-10T14:50:18.616050-05:00	2023-08-09T01:40:37.750853-05:00
Envfile

Comparison

BENCHMARK	NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch	(=)

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-09T06:33:19+00:00
HPX Commit	`dcb5415`	`17aba85`
Clustername	rostam	rostam
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Datetime	2023-05-10T14:52:35.047119-05:00	2023-08-09T01:42:50.859516-05:00
Envfile

Comparison

BENCHMARK	FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR	PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR	SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add	(=)	(=)	(=)
Stream Benchmark - Scale	(=)	(=)	(=)
Stream Benchmark - Triad	=	(=)	(=)
Stream Benchmark - Copy	(=)	(=)	(=)

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-09T06:33:19+00:00
HPX Commit	`dcb5415`	`17aba85`
Clustername	rostam	rostam
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu
Datetime	2023-05-10T14:52:52.237641-05:00	2023-08-09T01:43:07.824650-05:00
Envfile

hkaiser · 2023-08-09T14:16:50Z

@Johan511 could you please fix the reported clang-format issues as well?

Johan511 · 2023-08-09T16:18:03Z

@Johan511 could you please fix the reported clang-format issues as well?

One loop is yet to be vectorized; once that's done, vectorization should be enabled for the parallel case too. We can merge after that.

StellarBot · 2023-08-11T12:23:20Z

Performance test report

HPX Performance

Comparison

BENCHMARK	FORK_JOIN_EXECUTOR	PARALLEL_EXECUTOR	SCHEDULER_EXECUTOR
For Each	(=)	??	-

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-11T12:14:49+00:00
HPX Commit	`dcb5415`	`5cccf76`
Datetime	2023-05-10T14:50:18.616050-05:00	2023-08-11T07:20:22.167797-05:00
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Clustername	rostam	rostam
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu

Comparison

BENCHMARK	NO-EXECUTOR
Future Overhead - Create Thread Hierarchical - Latch	(=)

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-11T12:14:49+00:00
HPX Commit	`dcb5415`	`5cccf76`
Datetime	2023-05-10T14:52:35.047119-05:00	2023-08-11T07:22:34.867171-05:00
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Clustername	rostam	rostam
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu

Comparison

BENCHMARK	FORK_JOIN_EXECUTOR_DEFAULT_FORK_JOIN_POLICY_ALLOCATOR	PARALLEL_EXECUTOR_DEFAULT_PARALLEL_POLICY_ALLOCATOR	SCHEDULER_EXECUTOR_DEFAULT_SCHEDULER_EXECUTOR_ALLOCATOR
Stream Benchmark - Add	(=)	(=)	(=)
Stream Benchmark - Scale	(=)	(=)	=
Stream Benchmark - Triad	=	(=)	(=)
Stream Benchmark - Copy	(=)	-	(=)

Info

Property	Before	After
HPX Datetime	2023-05-10T12:07:53+00:00	2023-08-11T12:14:49+00:00
HPX Commit	`dcb5415`	`5cccf76`
Datetime	2023-05-10T14:52:52.237641-05:00	2023-08-11T07:22:51.879951-05:00
Compiler	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1	/opt/apps/llvm/13.0.1/bin/clang++ 13.0.1
Envfile
Clustername	rostam	rostam
Hostname	medusa08.rostam.cct.lsu.edu	medusa08.rostam.cct.lsu.edu

Johan511 · 2023-08-12T17:51:28Z

@hkaiser The unseq_first_n function has changed slightly: its predicate now takes an iterator as input instead of a value. This does not break vectorization.
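
A hedged sketch of what an iterator-taking predicate looks like in use. `first_n_sketch` and `find_value_sketch` are hypothetical names for this illustration; they do not reproduce the signature or the vectorized implementation of the real unseq_first_n.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the first position in [first, first + n)
// for which pred(position) is true, or first + n if there is none.
// The predicate receives the iterator itself, not the dereferenced value.
template <typename Iter, typename Pred>
Iter first_n_sketch(Iter first, std::size_t n, Pred pred)
{
    for (std::size_t i = 0; i != n; ++i, ++first)
    {
        if (pred(first))
            return first;
    }
    return first;
}

// A find built on top of it: the predicate dereferences the iterator.
template <typename Iter, typename T>
Iter find_value_sketch(Iter first, std::size_t n, T const& value)
{
    return first_n_sketch(
        first, n, [&value](Iter it) { return *it == value; });
}
```

One thing this shape permits, offered only as an illustration, is a predicate that looks at neighbouring elements, for example `[](auto it) { return *it == *(it + 1); }` over the first `n - 1` positions of a random-access range.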

Signed-off-by: Hari Hara Naveen S <harihara.sn@gmail.com>
fixed expolicy not accepting par_unseq
Signed-off-by: Hari Hara Naveen S <harihara.sn@gmail.com>
cleanup changes
Signed-off-by: Hari Hara Naveen S <harihara.sn@gmail.com>
fixing missing concept header
Signed-off-by: Hari Hara Naveen S <harihara.sn@gmail.com>

codacy-production · 2024-02-08T23:49:16Z

Coverage summary from Codacy

Coverage variation	Diff coverage
✅ -0.10%	✅ 80.00%

Coverage variation details

	Coverable lines	Covered lines	Coverage
Common ancestor commit (`17bdd0d`)	206733	176202	85.23%
Head commit (`d8096f0`)	190774 (-15959)	162418 (-13784)	85.14% (-0.10%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

	Coverable lines	Covered lines	Diff coverage
Pull request (#6302)	10	8	80.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%
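
The two definitions above can be checked against the numbers in this report with a few lines of arithmetic. The function names are hypothetical and exist only for this worked example.

```cpp
#include <cassert>
#include <cmath>

// Coverage as a percentage: covered lines / coverable lines * 100.
inline double coverage_percent(long covered, long coverable)
{
    return 100.0 * static_cast<double>(covered) /
        static_cast<double>(coverable);
}

// Coverage variation: coverage of the head commit minus coverage of
// the common ancestor commit, in percentage points.
inline double coverage_variation(long head_covered, long head_coverable,
    long base_covered, long base_coverable)
{
    return coverage_percent(head_covered, head_coverable) -
        coverage_percent(base_covered, base_coverable);
}
```

With the figures from the tables, `coverage_percent(176202, 206733)` is about 85.23 and `coverage_percent(162418, 190774)` is about 85.14. Their unrounded difference is roughly -0.095, which rounds to the reported -0.10%. The diff coverage is `coverage_percent(8, 10)`, that is 80.00%.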

Johan511 requested review from aurianer, msimberg, biddisco and hkaiser as code owners July 16, 2023 12:17

hkaiser added type: enhancement type: compatibility issue category: algorithms labels Jul 16, 2023

hkaiser added this to the 1.10.0 milestone Jul 16, 2023

Johan511 changed the title ~~Simd first~~ Simd Find Jul 17, 2023

hkaiser reviewed Aug 5, 2023

Johan511 force-pushed the simd-first branch from be275f0 to 215ae4f Compare August 12, 2023 10:53

Johan511 mentioned this pull request Aug 15, 2023

simd std::remove #6322

Open

msimberg removed request for msimberg, biddisco and aurianer November 1, 2023 08:35

Hari Hara Naveen S and others added 6 commits February 8, 2024 14:11

unseq find

7289d0b

Signed-off-by: Hari Hara Naveen S <johan511@rostam1.rostam.cct.lsu.edu>

predicate parameter in simd-helpers first take iterators

09b39e6

Signed-off-by: Hari Hara Naveen S <johan511@rostam1.rostam.cct.lsu.edu> changing test according to change made to helper Signed-off-by: Hari Hara Naveen S <harihara.sn@gmail.com>

adding par_unseq to find

c6a6932

Signed-off-by: Hari Hara Naveen S <harihara.sn@gmail.com> unseq Signed-off-by: Hari Hara Naveen S <harihara.sn@gmail.com>

removed ExPolicy as template for loop_pred (tag_invoke is being used …

cbc7cf6

…for function call deduction) Signed-off-by: Hari Hara Naveen S <harihara.sn@gmail.com>

false changed to 0

7f69717

Signed-off-by: Hari Hara Naveen S <harihara.sn@gmail.com>

Johan511 force-pushed the simd-first branch from 629c4a6 to d8096f0 Compare February 8, 2024 20:28

hkaiser modified the milestones: 1.10.0, 1.11.0 May 3, 2024
