Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving context stack #6814

Merged
merged 24 commits into from Mar 29, 2021
Merged

Improving context stack #6814

merged 24 commits into from Mar 29, 2021

Conversation

sklam
Copy link
Member

@sklam sklam commented Mar 11, 2021

based on #6762

Improves the context stack implementation in #6762 and refactor the target options and flags.

@sklam
Copy link
Member Author

sklam commented Mar 24, 2021

This will need CUDA smoketest when done

@stuartarchibald
Copy link
Contributor

stuartarchibald@4666c77 hacks through the flags for operator.add.

@sklam
Copy link
Member Author

sklam commented Mar 24, 2021

There're known cuda error due to changes to error msg. To be fixed later.

@sklam
Copy link
Member Author

sklam commented Mar 24, 2021

@stuartarchibald,

stuartarchibald/numba@4666c77 hacks through the flags for operator.add.

looks promising. I added a print of the function name and the flags in the lowerer and it showing the right things:

----- jit_wrapper__built_in_function_add_ Flags(enable_looplift=True, enable_pyobject=False, enable_pyobject_looplift=True, debuginfo=False, boundscheck=False, no_cpython_wrapper=True, nrt=True, fastmath=FastMathOptions({'fast'}))
----- foo Flags(enable_looplift=True, enable_pyobject=False, enable_pyobject_looplift=True, debuginfo=False, boundscheck=False, nrt=True, fastmath=FastMathOptions({'fast'}))

for

from numba import njit

@njit(fastmath=True)
def foo(x, y):
  a = x + y
  return a


x, y = 1, 2
r = foo(x, y)
print(r)

@sklam sklam added the needs initial review This PR needs an initial review to check the code change is well formed, documented, efficient etc. label Mar 24, 2021
@sklam sklam marked this pull request as ready for review March 25, 2021 19:46
Copy link
Contributor

@stuartarchibald stuartarchibald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @sklam, looks good, just a few minor things to resolve.

numba/core/base.py Outdated Show resolved Hide resolved
flags.set('no_cfunc_wrapper')
if cstk:
tls_flags = cstk.top()
if tls_flags.is_set("nrt") and tls_flags.nrt:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are both of these conditions necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tls_flags.nrt by itself can be reading the default. Here, it is checking that "nrt" is not default and its value is True.

numba/core/ccallback.py Show resolved Hide resolved
numba/core/targetconfig.py Show resolved Hide resolved
numba/core/targetconfig.py Show resolved Hide resolved
numba/core/targetconfig.py Show resolved Hide resolved


class _MetaTargetConfig(type):
def __init__(cls, name, bases, dct):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstr? Particularly explaining what this is going to do.

numba/core/targetconfig.py Outdated Show resolved Hide resolved
numba/core/targetconfig.py Outdated Show resolved Hide resolved
numba/core/utils.py Show resolved Hide resolved
@stuartarchibald
Copy link
Contributor

I've also checked that this does transmit the flags correctly, applying this patch: stuartarchibald@2f63896 will get the flags to the right place and also disables the fastmath visitor. Then running something like:

from numba import njit, types
import numpy as np
import operator

def foo(n):
    return operator.add(n, 3.14)

# no fast
jfoo = njit(foo)
jfoo.compile((types.float64,),)
print(jfoo.inspect_llvm(jfoo.signatures[0]))

with and without fastmath=True in the jit decorator demonstrates the presence (or not) of a fadd fast in the LLVM IR.

sklam and others added 2 commits March 26, 2021 11:13
Co-authored-by: stuartarchibald <stuartarchibald@users.noreply.github.com>
Mainly cleanups and adding docstrings.
@stuartarchibald
Copy link
Contributor

Buildfarm ID: numba_smoketest_cpu_yaml_13.

@stuartarchibald
Copy link
Contributor

Both CPU and CUDA tests are running through the farm.

@stuartarchibald
Copy link
Contributor

CUDA tests are failing on all platforms with:

ERROR: test_target_cuda_unrecognized_arg (numba.cuda.tests.cudapy.test_vectorize_decor.TestVectorizeUnrecognizedArg)
<traceback>
KeyError: "cuda vectorize target does not support option: 'nonexistent'"

@sklam sklam added 4 - Waiting on reviewer Waiting for reviewer to respond to author and removed 4 - Waiting on CI Review etc done, waiting for CI to finish labels Mar 26, 2021
@sklam
Copy link
Member Author

sklam commented Mar 29, 2021

started new cuda smoketest: numba_smoketest_cuda_yaml_35

Copy link
Contributor

@stuartarchibald stuartarchibald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes, change set from ec69a18 onwards is checked, few minor things to resolve else looks good.

numba/tests/npyufunc/test_vectorize_decor.py Outdated Show resolved Hide resolved
sklam and others added 3 commits March 29, 2021 09:35
Co-authored-by: stuartarchibald <stuartarchibald@users.noreply.github.com>
Copy link
Contributor

@stuartarchibald stuartarchibald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes!

@stuartarchibald stuartarchibald added 4 - Waiting on CI Review etc done, waiting for CI to finish and removed 4 - Waiting on reviewer Waiting for reviewer to respond to author labels Mar 29, 2021
@stuartarchibald
Copy link
Contributor

started new cuda smoketest: numba_smoketest_cuda_yaml_35

This passed on the 5/6 of the CUDA test matrix that ran (the rest suffered an intermittent unrelated problem) and given the nature of the patch/previous failures, that any part of the matrix passed is sufficient confirmation of the fixes. Thanks again for all your work on this, it should unlock a load more optimisation opportunities :) Marking ready to merge...!

@stuartarchibald stuartarchibald added 5 - Ready to merge Review and testing done, is ready to merge BuildFarm Passed For PRs that have been through the buildfarm and passed Effort - long Long size effort needed and removed 4 - Waiting on CI Review etc done, waiting for CI to finish Pending BuildFarm For PRs that have been reviewed but pending a push through our buildfarm labels Mar 29, 2021
@sklam
Copy link
Member Author

sklam commented Mar 29, 2021

Azure CI build 24b16a6 twice. The first time is all green https://dev.azure.com/numba/numba/_build/results?buildId=8387&view=results. The second time it got stuck in internet traffic jam.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to merge Review and testing done, is ready to merge BuildFarm Passed For PRs that have been through the buildfarm and passed Effort - long Long size effort needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants