
Split optimisation passes. #6335

Merged

Conversation

stuartarchibald
Contributor

This splits up the module-level optimisation passes as follows:

  1. Runs a cheap pass to inline across the module, in an attempt
     to bring as many refops into the same function as possible.
  2. Runs the reference count pruning pass.
  3. Runs the full optimisation suite; this should discover many
     more opportunities for optimisation as a result of the inlining
     and refop pruning.

Closes #5033
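The three-phase structure above can be sketched with a toy pipeline. This is a hypothetical illustration only: Numba's real implementation drives LLVM pass managers, and the op names and helper functions here are invented to show why the ordering (inline, then prune refops, then fully optimise) matters.

```python
# Toy sketch of the three-phase pipeline (hypothetical; op names and
# helpers are invented for illustration, not Numba's actual code).
# A "module" is a dict mapping function names to lists of ops.

def inline_calls(module):
    """Phase 1: cheap inlining -- splice callee bodies into callers so
    that matching incref/decref pairs end up in the same function."""
    out = {}
    for name, ops in module.items():
        body = []
        for op in ops:
            if op[0] == "call" and op[1] in module:
                body.extend(module[op[1]])  # splice in the callee's ops
            else:
                body.append(op)
        out[name] = body
    return out

def prune_refops(module):
    """Phase 2: drop balanced incref/decref pairs on the same value."""
    out = {}
    for name, ops in module.items():
        body = []
        for op in ops:
            if (op[0] == "decref" and body
                    and body[-1] == ("incref", op[1])):
                body.pop()  # incref directly followed by decref: drop both
            else:
                body.append(op)
        out[name] = body
    return out

def full_optimise(module):
    """Phase 3: stand-in for the full suite -- here it just removes
    no-ops left behind by the earlier phases."""
    return {name: [op for op in ops if op[0] != "nop"]
            for name, ops in module.items()}

# The caller's refops on "y" only become prunable once "callee" is
# inlined between them; the leftover nop then falls to phase 3.
module = {
    "callee": [("incref", "x"), ("decref", "x")],
    "caller": [("nop",), ("incref", "y"), ("call", "callee"),
               ("decref", "y")],
}
module = full_optimise(prune_refops(inline_calls(module)))
# both functions optimise away to empty bodies
```

Run in the opposite order, the prune would miss the caller's pair entirely, which is the motivation given in the description for inlining first.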

Comment on lines 1045 to 1046
self._mpm_cheap = self._module_pass_manager(loop_vectorize=False,
                                            opt=2)
Member


Just curious... why O2 and not O1?

Also, do we need to override the inlining_threshold? e.g. the cheap and full runs have different thresholds.

Contributor Author


Was thinking that producing more optimised code might let the refpruner run quicker and also permit more inlining if the complexity is reduced. Turns out in some checks @esc did that O2 massively increases compile time, whereas O1 increases it a small bit, with both cases leading to huge performance gains, so I think O1 is probably the way to go for now. RE the inlining threshold, I've been thinking lately that it'd be a good idea to put more of these "trade-off" options into the hands of users: some will want to optimise something as much as possible regardless of the compilation cost, others will want to optimise for short compilation times, and others may be in between!

Contributor Author


4cf27a9 moves to O1

@esc esc mentioned this pull request Oct 12, 2020
As title; results in runtime performance similar to O2 but at a much
cheaper compile-time cost.
@stuartarchibald stuartarchibald added 4 - Waiting on reviewer Waiting for reviewer to respond to author and removed 3 - Ready for Review labels Oct 12, 2020
@stuartarchibald
Contributor Author

Build fail is due to #6356

@sklam
Member

sklam commented Oct 13, 2020

@stuartarchibald, just some minor suggestions to simplify the code a little.

@stuartarchibald
Contributor Author

@stuartarchibald, just some minor suggestions to simplify the code a little.

Thanks, done in 449d0f5

Member

@sklam sklam left a comment


LGTM

@sklam sklam added 5 - Ready to merge Review and testing done, is ready to merge and removed 4 - Waiting on reviewer Waiting for reviewer to respond to author labels Oct 13, 2020
@stuartarchibald stuartarchibald merged commit 790373e into numba:master Oct 13, 2020
Successfully merging this pull request may close these issues.

SIMD vectorization failure with nditer