Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor registry init. #6948

Merged
merged 16 commits into from Apr 29, 2021
Merged

Conversation

stuartarchibald
Copy link
Contributor

This moves the implementations of core supported features out of
the base context and into explicit targets. This is to ensure that
targets only carry implementations of things they actually
support.

This moves the implementations of core supported features out of
the base context and into explicit targets. This is to ensure that
targets only carry implementations of things they actually
support.
@gmarkall
Copy link
Member

from numba import cuda
import numpy as np


@cuda.jit
def f(x):
    s = slice(1, 10, 2)
    x[0] = s.start
    x[1] = s.stop
    x[2] = s.step


x = np.zeros(3, dtype=np.int64)

f[1, 1](x)

print(x)

prints

[ 1 10  2]

with this PR even though the slicing registry is not installed on the CUDA target - how is it working now? (Not that it shouldn't be, of course!)

@stuartarchibald
Copy link
Contributor Author

RE slicing. Seems like a lot of registries are being populated as a side effect of import. In the case of slicing it's stencil from numba's __init__ which triggers an import chain that reaches it.

self.assertFalse(unexpected, "some modules unexpectedly imported")

def test_no_impl_import(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmarkall thoughts on having a test like this for the CUDA target? Essentially something to make sure that from numba import cuda / from numba.cuda import jit doesn't trigger import of modules that are known to have registration side effects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I'll have an experiment with adding a test like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that from numba import cuda ends up importing numba.cpython.cmathimpl and numba.cpython.mathimpl.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... though, this seems to be quite easily fixable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test and fix added in 0611f05 - what do you think @stuartarchibald ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gmarkall, the patch looks good. Are you happy for me to include it in this branch?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for confirming :)

@stuartarchibald stuartarchibald added Effort - long Long size effort needed and removed Effort - short Short size effort needed labels Apr 20, 2021
@stuartarchibald
Copy link
Contributor Author

RE slicing. Seems like a lot of registries are being populated as a side effect of import. In the case of slicing it's stencil from numba's __init__ which triggers an import chain that reaches it.

The rest of the patches on this PR are to deal with this! They make it so that modules that register lowering implementations are not imported at import time, or at decoration time, but only once compilation starts. The side effect of this is that nothing (not even the NRT) is jit compiled on import, this speeds up import times by about 40% locally. These patches also move all the lowering registrations out of the base context and onto the specific targets, such that e.g. the CPU just registers things it needs and CUDA just registers things it needs.

@sklam
Copy link
Member

sklam commented Apr 21, 2021

@stuartarchibald, the only way I can think of to avoid the move for typed.Dict and typed.List is to use PEP562 to redirect the import. I tried it at sklam@8230acb. Is it valid?

@stuartarchibald
Copy link
Contributor Author

@stuartarchibald, the only way I can think of to avoid the move for typed.Dict and typed.List is to use PEP562 to redirect the import. I tried it at sklam@8230acb. Is it valid?

Thanks @sklam, I think this approach may well work, it's similar to previous shims. As discussed OOB, first Python 3.6 removal is required, will open a new PR for that.

gmarkall and others added 4 commits April 27, 2021 12:33
This also required a small fix to prevent the registration of mathimpl
and cmathimpl when `cuda` was imported.
Conflicts:
	numba/cuda/target.py
@stuartarchibald
Copy link
Contributor Author

@gmarkall @sklam both your patches are now on this branch, many thanks.

Comment on lines +59 to +60
# Initialize typed containers
import numba.typed
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sklam Do you recall why this is here? Could it go into the CPU context?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is needed to register the typed containers for cases where they are used by implicitly; e.g. d = {}

Copy link
Member

@sklam sklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good. It's good to see the reduction in import time and that the import side-effects are better controlled.

I profile the import time again from python -X importtime -c "import numba". Here is a descending list of cumulative import time:

       self |  cumulative | package
       0.2% |       26.3% | numba.core.config
       0.1% |       21.6% | llvmlite.binding
       0.1% |       21.2% | numba.core.decorators
       0.1% |       20.9% | numba.stencils.stencil
       0.1% |       19.7% | numba.core.types
       0.1% |       19.6% | llvmlite.binding.dylib
       1.1% |       19.0% | llvmlite.binding.ffi
       7.0% |       17.8% | pkg_resources
       5.3% |       17.2% | numpy
       0.1% |       14.7% | numba.core.registry
       0.1% |       14.5% | numba.core.dispatcher
       0.7% |       14.1% | numba.core.compiler
       0.1% |        6.4% | numba.core.callconv
       1.3% |        6.3% | numba.core.base
       0.0% |        5.9% | numba.testing
       0.2% |        5.6% | numba.core.ir_utils

If we are interested in further importtime reduction, we should try using PEP562 in numba/__init__.py to defer more of the import

@sklam sklam added 5 - Ready to merge Review and testing done, is ready to merge and removed 3 - Ready for Review labels Apr 28, 2021
@stuartarchibald
Copy link
Contributor Author

Thanks for the review @sklam.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to merge Review and testing done, is ready to merge Effort - long Long size effort needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants