Switch on SLP vectorize. #6275

stuartarchibald · 2020-09-23T08:52:33Z

As title.

stuartarchibald · 2020-09-23T08:53:03Z

Seeing what impact this has on testing run times.

stuartarchibald · 2020-09-23T09:35:14Z

Command: time ./runtests.py --random=0.1

Mainline does this:

Ran 995 tests in 636.697s

OK (skipped=19, expected failures=1)

real    10m42.166s
user    11m33.653s
sys     0m7.492s

This branch does this:

Ran 995 tests in 632.633s

OK (skipped=19, expected failures=1)

real    10m37.993s
user    11m38.904s
sys     0m7.392s

Seems like the impact on the test suite run time is small, potentially zero.
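To put a number on that, here is a quick sketch of the relative difference between the two "Ran 995 tests" timings above. The helper name `relative_change` is only for illustration, and a single randomised run per branch is a small sample, so treat the result as indicative rather than conclusive.

```python
def relative_change(baseline, candidate):
    """Return the fractional change of candidate relative to baseline."""
    return (candidate - baseline) / baseline


mainline_s = 636.697  # "Ran 995 tests in 636.697s" on mainline
branch_s = 632.633    # "Ran 995 tests in 632.633s" on this branch

change = relative_change(mainline_s, branch_s)
print(f"test run time changed by {change:+.2%}")  # prints: test run time changed by -0.64%
```

A difference of well under 1% is plausibly within run-to-run noise.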

stuartarchibald · 2020-09-23T09:40:06Z

Mainline:

In [1]: %cpaste                                                                                                                 
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:from numba import njit
import numpy as np

@njit
def foo(a1, a2, b1, b2, A):
    for x in range(0, len(A), 4):
        Z = A[x:x + 4]
        Z[0] = a1*(a1 + b1)
        Z[1] = a2*(a2 + b2)
        Z[2] = a1*(a1 + b1)
        Z[3] = a2*(a2 + b2)

foo(1.2, 3.4, 5.6, 7.8, np.empty(4000))::::::::::::
:<EOF>

In [2]: G = np.empty(40000)                                                                                                     

In [3]: %timeit foo(1.2, 3.4, 5.6, 7.8, G)                                                                                      
12.9 µs ± 2.75 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

This branch:

In [1]: %cpaste                                                                                                                 
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:from numba import njit
import numpy as np

@njit
def foo(a1, a2, b1, b2, A):
    for x in range(0, len(A), 4):
        Z = A[x:x + 4]
        Z[0] = a1*(a1 + b1)
        Z[1] = a2*(a2 + b2)
        Z[2] = a1*(a1 + b1)
        Z[3] = a2*(a2 + b2)

foo(1.2, 3.4, 5.6, 7.8, np.empty(4000))::::::::::::
:<EOF>

In [2]: G = np.empty(40000)                                                                                                     

In [3]: %timeit foo(1.2, 3.4, 5.6, 7.8, G)                                                                                      
10.3 µs ± 22.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
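For reference, the two `%timeit` means above work out as follows. This is plain arithmetic on the reported figures, not a new measurement, and the variable names are illustrative.

```python
mainline_us = 12.9  # mean per-loop time on mainline, in microseconds
branch_us = 10.3    # mean per-loop time on this branch, in microseconds

speedup = mainline_us / branch_us           # how many times faster the branch is
time_saved = 1 - branch_us / mainline_us    # fraction of per-call time removed

print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.0%}")
# prints: speedup: 1.25x, time saved: 20%
```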

sklam · 2020-10-13T17:11:32Z

numba/core/codegen.py

@@ -714,7 +714,8 @@ def _pass_manager_builder(self):
        missed...
        """
        pmb = lp.create_pass_manager_builder(
-            opt=config.OPT, loop_vectorize=config.LOOP_VECTORIZE)
+            opt=config.OPT, loop_vectorize=config.LOOP_VECTORIZE,
+            slp_vectorize=True)


sklam · 2020-10-13

Shouldn't this also get a config flag, e.g. config.SLP_VECTORIZE?

stuartarchibald · 2020-10-13

Can do, but I really want to move away from having a specific config flag for each feature in the optimiser. I think this should be in the hands of the user, but it shouldn't be spelled out one flag at a time like this. For 0.52, shall we add a flag so it's easy to debug/turn off, on the understanding that this is the last one like it?

sklam · 2020-10-13

OK, make it the last flag.
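As a rough illustration of what "config based" could look like, here is a minimal, self-contained sketch of an environment-driven on/off flag feeding the keyword arguments from the diff above. The helper names `read_int_flag` and `pass_manager_kwargs` are hypothetical, the environment variable name `NUMBA_SLP_VECTORIZE` is assumed from the `config.SLP_VECTORIZE` suggestion, and none of this is the code that was actually committed.

```python
import os


def read_int_flag(name, default, environ=None):
    """Read an integer flag from the environment (hypothetical helper).

    Falls back to `default` when the variable is unset or not an integer.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        return default


# Assumed variable name; on by default so the optimisation is opt-out.
SLP_VECTORIZE = read_int_flag("NUMBA_SLP_VECTORIZE", default=1)


def pass_manager_kwargs(opt, loop_vectorize, slp_vectorize):
    """Build the keyword arguments shown in the diff (hypothetical helper)."""
    return {
        "opt": opt,
        "loop_vectorize": bool(loop_vectorize),
        # Previously hard-coded to True; now driven by the flag.
        "slp_vectorize": bool(slp_vectorize),
    }


print(pass_manager_kwargs(opt=3, loop_vectorize=1, slp_vectorize=SLP_VECTORIZE))
```

With a flag like this, setting the variable to 0 before starting Python would switch the feature off for debugging without a code change.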


sklam · 2020-10-14T13:42:10Z

/azp run

azure-pipelines · 2020-10-14T13:42:19Z

Azure Pipelines successfully started running 1 pipeline(s).

Switch on SLP vectorize.

0ca18bf

As title.

stuartarchibald added the 2 - In Progress label Sep 23, 2020

stuartarchibald added this to the PR Backlog milestone Sep 23, 2020

stuartarchibald added 3 - Ready for Review and removed 2 - In Progress labels Sep 23, 2020

stuartarchibald modified the milestones: PR Backlog, Numba 0.52 RC Oct 12, 2020

sklam reviewed Oct 13, 2020


sklam added the 4 - Waiting on author label and removed the 3 - Ready for Review label Oct 13, 2020

stuartarchibald added 4 commits October 13, 2020 20:52

Make SLP vectorize config based, update docs.

b019a6c

As title.

Merge remote-tracking branch 'upstream/master' into wip/slp

8cb3ab9

Switch off SLP vectorize in cheap module optimisation run.

20eb8cd

As title.

Add test for SLP being on

0140f78

stuartarchibald added the 4 - Waiting on reviewer label and removed the 4 - Waiting on author label Oct 14, 2020

sklam approved these changes Oct 14, 2020


sklam added the 4 - Waiting on CI and 5 - Ready to merge labels and removed the 4 - Waiting on reviewer and 4 - Waiting on CI labels Oct 14, 2020

sklam merged commit 9ceac80 into numba:master Oct 14, 2020
