Improving context stack #6814
Adjusts CUDA target options. # Conflicts: # numba/cuda/compiler.py
This will need a CUDA smoketest when done.
stuartarchibald@4666c77 hacks through the flags for
There are known CUDA errors due to changes to the error messages. To be fixed later.
Looks promising. I added a print of the function name and the flags in the lowerer and it is showing the right things for:

    from numba import njit

    @njit(fastmath=True)
    def foo(x, y):
        a = x + y
        return a

    x, y = 1, 2
    r = foo(x, y)
    print(r)
Thanks for the patch @sklam, looks good, just a few minor things to resolve.
    flags.set('no_cfunc_wrapper')
    if cstk:
        tls_flags = cstk.top()
        if tls_flags.is_set("nrt") and tls_flags.nrt:
Are both of these conditions necessary?
`tls_flags.nrt` by itself can be reading the default. Here, it is checking that "nrt" is not default and its value is `True`.
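The difference between "explicitly set" and "reading the default" can be sketched as below. This is only an illustration, not Numba's implementation: `SketchFlags`, its `_defaults` table, and the helper `explicitly_enabled` are hypothetical names, and the default of `True` for `nrt` is an assumption chosen purely to make the difference visible.

```python
class SketchFlags:
    """Hypothetical flags container that remembers which options were set."""

    # Assumed defaults for illustration only (not Numba's actual defaults).
    _defaults = {"nrt": True, "fastmath": False}

    def __init__(self):
        self._values = {}  # holds only the explicitly set options

    def set(self, name, value=True):
        if name not in self._defaults:
            raise NameError(f"unknown flag: {name}")
        self._values[name] = value

    def is_set(self, name):
        # True only if the option was set explicitly, whatever its value.
        return name in self._values

    def __getattr__(self, name):
        # Invoked only when normal attribute lookup fails; falls back to
        # the default when the option was never set.
        defaults = type(self)._defaults
        if name in defaults:
            return self._values.get(name, defaults[name])
        raise AttributeError(name)


def explicitly_enabled(flags, name):
    # Mirrors the shape of the reviewed condition:
    # the option is not at its default AND its value is truthy.
    return flags.is_set(name) and getattr(flags, name)


flags = SketchFlags()
print(flags.nrt, explicitly_enabled(flags, "nrt"))
flags.set("nrt")
print(flags.nrt, explicitly_enabled(flags, "nrt"))
```

In this sketch, reading `flags.nrt` alone cannot tell the two states apart, which is why the second check is kept.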
    class _MetaTargetConfig(type):
        def __init__(cls, name, bases, dct):
Docstr? Particularly explaining what this is going to do.
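For context on the pattern being reviewed: a metaclass `__init__` with this signature usually processes the class body once, at class creation time. The sketch below shows one way such a metaclass could collect declared options and expose them as properties. It is not taken from the patch; `Option`, `_SketchMeta`, `SketchConfig`, and `DemoFlags` are all hypothetical names.

```python
class Option:
    """Hypothetical marker for a declared option with a default value."""

    def __init__(self, default, doc=""):
        self.default = default
        self.doc = doc


def _make_property(name):
    # Build a property that reads the explicit value or falls back to
    # the declared default, and records explicit assignments.
    def getter(self):
        return self._values.get(name, type(self).options[name].default)

    def setter(self, value):
        self._values[name] = value

    return property(getter, setter)


class _SketchMeta(type):
    """Collects Option attributes from the class body at class creation."""

    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        options = {}
        # Start from options already collected on the base classes.
        for base in reversed(cls.__mro__[1:]):
            options.update(getattr(base, "options", {}))
        # Add options declared directly in this class body.
        for key, val in dct.items():
            if isinstance(val, Option):
                options[key] = val
        cls.options = options
        # Replace each declaration with a property on the class.
        for key in options:
            setattr(cls, key, _make_property(key))


class SketchConfig(metaclass=_SketchMeta):
    def __init__(self):
        self._values = {}

    def is_set(self, name):
        return name in self._values


class DemoFlags(SketchConfig):
    nrt = Option(default=False, doc="hypothetical runtime switch")
    fastmath = Option(default=False, doc="hypothetical fastmath switch")


flags = DemoFlags()
flags.fastmath = True
print(sorted(DemoFlags.options), flags.nrt, flags.fastmath)
```

A docstring on the real metaclass would ideally state which of these steps it performs, which is the point of the review comment.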
I've also checked that this does transmit the flags correctly, applying this patch: stuartarchibald@2f63896 will get the flags to the right place and also disables the

    from numba import njit, types
    import numpy as np
    import operator

    def foo(n):
        return operator.add(n, 3.14)

    # no fast
    jfoo = njit(foo)
    jfoo.compile((types.float64,),)
    print(jfoo.inspect_llvm(jfoo.signatures[0]))

with and without
Co-authored-by: stuartarchibald <stuartarchibald@users.noreply.github.com>
Mainly cleanups and adding docstrings.
Buildfarm ID:
Both CPU and CUDA tests are running through the farm.
CUDA tests are failing on all platforms with:
Started new CUDA smoketest:
Thanks for the fixes, change set from ec69a18 onwards is checked, few minor things to resolve else looks good.
Co-authored-by: stuartarchibald <stuartarchibald@users.noreply.github.com>
Thanks for the fixes!
This passed on the 5/6 of the CUDA test matrix that ran (the rest suffered an intermittent unrelated problem) and given the nature of the patch/previous failures, that any part of the matrix passed is sufficient confirmation of the fixes. Thanks again for all your work on this, it should unlock a load more optimisation opportunities :) Marking ready to merge...!
Azure CI built 24b16a6 twice. The first run was all green: https://dev.azure.com/numba/numba/_build/results?buildId=8387&view=results. The second run got stuck in an internet traffic jam.
Based on #6762.
Improves the context stack implementation in #6762 and refactors the target options and flags.
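As a rough picture of what a per-thread context stack looks like, here is a minimal sketch. It is not the implementation from this PR or from #6762: `SketchContextStack` and its `enter` method are hypothetical, and only `top()` and the truthiness check (`if cstk:`) are modelled on the snippet quoted in the review.

```python
import threading
from contextlib import contextmanager


class SketchContextStack:
    """Hypothetical thread-local stack of the flags currently in effect."""

    _tls = threading.local()  # each thread sees its own attributes

    def _stack(self):
        # Lazily create the list for the calling thread.
        try:
            return self._tls.stack
        except AttributeError:
            self._tls.stack = []
            return self._tls.stack

    def __bool__(self):
        # Allows `if cstk:` to mean "a compilation context is active".
        return bool(self._stack())

    def top(self):
        # Flags of the innermost active context; IndexError if none.
        return self._stack()[-1]

    @contextmanager
    def enter(self, flags):
        stack = self._stack()
        stack.append(flags)
        try:
            yield
        finally:
            # Always restore the previous context, even on error.
            stack.pop()


cstk = SketchContextStack()
with cstk.enter({"fastmath": True}):
    with cstk.enter({"fastmath": False}):
        print(cstk.top())
    print(cstk.top())
print(bool(cstk))
```

Keeping the stack in thread-local storage means a nested compilation can look up the caller's flags without passing them through every call, while other threads are unaffected.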