
MAINT: Message value of NPY_FEATURE_VERSION used #24861

Closed
wants to merge 2 commits

Conversation

jakirkham
Contributor

To provide a bit more visibility to how extensions using NumPy are built, message the value of NPY_FEATURE_VERSION used during the build so that this can be more easily seen by builders trying to validate build behavior.

@jakirkham
Contributor Author

It might also be useful to bake this value into binaries for validating the binaries themselves. One option to do this might be the approach outlined in this SO answer. Maybe there are better ways to accomplish this.

@charris charris changed the title Message value of NPY_FEATURE_VERSION used MAINT: Message value of NPY_FEATURE_VERSION used Oct 4, 2023
@jakirkham
Contributor Author

cc @seberg (in case you have thoughts on this)

Member

@seberg seberg left a comment


Thanks, good idea. I noticed that MSVC needs parentheses; however, I think the string concatenation doesn't actually work here. We could stringify it via some macros, although unfortunately I don't think the feature version has values that are convenient to understand.

Maybe the best approach is to just hardcode the 1.19 API version (since it's defined just above) and, for the other one, stress that it was set to the user-provided NPY_TARGET_VERSION (we can still stringify it, although it's unfortunately not easy to read).

jakirkham and others added 2 commits October 13, 2023 10:10
To provide a bit more visibility to how extensions using NumPy are built, message the value of `NPY_FEATURE_VERSION` used during the build so that this can be more easily seen by builders trying to validate build behavior.
Unfortunately, clang (at least on macOS) really insists on a string literal,
so the macro expansion has to be done explicitly.
@seberg
Member

seberg commented Oct 13, 2023

I pushed a fix. Now running with it, I see two things:

  1. This gets hit surprisingly often when building NumPy itself as well. I don't mind that too much, although it unfortunately looks a bit like a warning (and with -Wall it likely is one), which may be a problem.
  2. We have a surprising number of places where this header gets included that don't set the internal-build define.

I still like the idea, but the fact that it shows up looking like a warning (in the NumPy build, probably an actual warning with -Wall) makes me hesitate a bit now.

I had talked with @jakirkham; he would like to be able to see/check whether things are compiled correctly. Two ideas came up:

  • Try to make sure the binary includes the info in a way that can be found (it's there somewhere, but...).
  • I think we could probably hack together a small tool that runs a bit of Python code with a current NumPy version but monkeypatches NumPy to pretend it is an older version. You could then basically try:/except: until the import works.

Member

@rgommers rgommers left a comment


I agree that info like this is nice to have when reading build logs of downstream packages. However, using a pragma to emit messages from C code directly can't be right. That's not only true in this instance; this should never be done for something that is informational rather than a warning.

It also seems to me that you do not want to see this message once per build target. Once per extension module is far too much - these kinds of messages are incredibly noisy whenever Cython does something like this. In the end, the Python package being built has a single ABI constraint baked in: the minimum numpy version that it, as a whole, is compatible with.

If you look at NumPy's own build log, this info should not be there at all, I think, since it's not relevant there. In downstream packages, it belongs in the configure-stage output, not the compile output.

That minimum numpy version is already needed for the package, because it needs to write out the correct constraint in its runtime dependencies. At present I think almost no package gets that right (I haven't even thought about doing this for SciPy yet). Isn't that the more important thing to figure out? Then the right info is much more easily accessible.

It might also be useful to bake this value into binaries for validating the binaries themselves.

Isn't this present already? There should be an ABI version check that makes use of this.

@seberg
Member

seberg commented Oct 14, 2023

Isn't this present already?

Well, clearly it's baked in somewhere, but the only clear way to get at it is to attempt an import and see the failure. (A thought: you could fake the NumPy version to be 0; the error message will then print out which version you compiled for, although it's the hex version, unfortunately.)

So, I think the ask/question is whether we could bake it in so that it is easy to check a conda recipe (or wheel) for whether it was compiled with the correct support range, without necessarily having to run the full test suite against the oldest NumPy version.

@rgommers
Member

Since extension modules don't have to have public symbols besides PyInit, a failed import when trying with version 0, and parsing the message from that, may be a good option? If there were a simple separate utility for that, it would also work on existing extension modules, not just future ones.
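The message-parsing part of such a utility could look roughly like this in Python. Note the error wording here is an assumption modeled on NumPy's current ABI-mismatch ImportError text; it is not a stable interface:

```python
import re

# Assumed message shape, modeled on NumPy's current ABI-mismatch
# ImportError; the exact wording is not a stable interface.
_VERSION_PAT = re.compile(r"API version (0x[0-9a-fA-F]+)")

def api_version_from_error(message):
    """Return the hex C API version mentioned in an import error, or None."""
    match = _VERSION_PAT.search(message)
    return match.group(1) if match else None

# Simulated failure text, standing in for a real failed import:
sample = ("module compiled against API version 0xe but this version "
          "of numpy is 0xd")
print(api_version_from_error(sample))  # -> 0xe
```

The real tool would wrap an actual import attempt (with the runtime version faked to 0) in try:/except: and feed the exception text through a parser like this.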

@seberg
Member

seberg commented Oct 14, 2023

Yeah, that could be a "challenge accepted" for someone; I'm not feeling it right now. One problem is that library X may import SciPy first, so you may need to target the import to test, or inspect the frame, to make it reliable.

Since extension modules don't have to have public symbols besides PyInit

We could inject public/hidden symbols in principle wherever _import_array and the table are defined. They might have only a partially fixed name, and I am not sure how well probing for them would actually work in practice.

@rgommers
Member

rgommers commented Oct 14, 2023

The other thing that would be useful is to query the compatible version at build time. From https://numpy.org/devdocs/dev/depending_on_numpy.html#adding-a-dependency-on-numpy: "By default, NumPy will expose an API that is backwards compatible with the oldest NumPy version that supports the currently oldest compatible Python version."

That can of course be hardcoded in the downstream package, but it's annoying logic to maintain. I think there may not be a better way than reading NPY_FEATURE_VERSION from numpyconfig.h during the downstream package's build.

That same doc already says: "This define must be consistent for each extension module (use of import_array()) and also applies to the umath module." So it is necessary anyway to handle inserting this define in a single place. For SciPy we already have similar logic to deal with NPY_NO_DEPRECATED_API. I'll see if I can add something that deals with the feature version and prints a message like this in the build log:

Targeted NumPy C API version: 1.19.0

I think it's not hard to do. Making the wording of that line useful and not confusing is a little tricky - I'd be happy to hear better ideas.

@rgommers
Member

Okay, I got that mostly working in the SciPy build log now. With numpy 1.25.2 it shows:

Fetching value of define "NPY_FEATURE_VERSION" : 0x0000000d
Message: Targeted NumPy C API version:  1.19.0

and with numpy 1.24.3:

Fetching value of define "NPY_FEATURE_VERSION" :
Fetching value of define "NPY_API_VERSION" : 0x00000010
Message: Targeted NumPy C API version:  1.23.0

The code to implement that:

# This is the minimum supported NumPy version from a binary compatibility
# perspective. If we ever need to set NPY_TARGET_VERSION, it should be defined
# here, taken into account for the message below, and added to all extension
# modules. `NPY_FEATURE_VERSION` first became available in NumPy 1.25.0.
_np_feature_version = cc.get_define('NPY_FEATURE_VERSION',
  include_directories: inc_np, prefix: '#include "numpy/numpyconfig.h"'
)
if _np_feature_version == ''
  # We're dealing with numpy <1.25.0 here. So get API version instead
  _np_feature_version = cc.get_define('NPY_API_VERSION',
    include_directories: inc_np, prefix: '#include "numpy/numpyconfig.h"'
  )
endif
_np_target_version =  {
  '0x0000000d': '1.19.0',
  '0x0000000e': '1.20.0',  # also 1.21.0 (no API additions there)
  '0x0000000f': '1.22.0',
  '0x00000010': '1.23.0',  # also 1.24.0
}.get(_np_feature_version)
message('Targeted NumPy C API version: ', _np_target_version)

The annoyance here is of course the translation of hex API numbers to version strings. We can't know what the hex numbers for future numpy releases will be. To avoid that problem, we need a string equivalent like:

#define NPY_FEATURE_VERSION_STRING

If we had that, I think the above code works - and it will reduce to a single cc.get_define('NPY_FEATURE_VERSION_STRING', ...) call once a project has dropped support for numpy <2.0.

Does this address the use case, and if so should we add such a define?

@h-vetinari
Contributor

h-vetinari commented Oct 17, 2023

Does this address the use case, and if so should we add such a define?

I don't think it's been laid out explicitly, but the use case in conda-forge is that we need to be able to set a global baseline for things to work in general. If there were no numpy 2.0 coming up, we'd likely just keep stepping through numpy versions according to NEP 29 and accepting their defaults (even though that'd be considerably more conservative than what our metadata claims; e.g. building against numpy 1.25 generates a numpy >=1.25 constraint from the POV of conda-forge, even though vanilla 1.25 defaults to NPY_FEATURE_VERSION=1.19).

However, to avoid duplicating our build matrix between numpy 1 & 2, it's very tempting to use the NPY_FEATURE_VERSION mechanism. The downside is that this exposes us to wrong metadata if upstream projects override it somewhere in their build scripts to a higher feature level and we don't adapt our metadata accordingly. That's why there's a desire to be able to deduce/query the NPY_FEATURE_VERSION of a given extension module from the compiled binary, so that we could automate this check and raise an error in such cases.

Realistically, something showing up in the log is likely not good enough for that.

@jakirkham
Contributor Author

Right, I was alluding above to the need for some ability to inspect binaries:

#24861 (comment)

Maybe there are better ways than what I proposed there

@jakirkham
Contributor Author

Given that context, Ralf, how should we proceed?

@rgommers
Member

Given the discussion above, I think the way forward is:

  1. Someone should try to implement a small standalone tool along the lines of Sebastian's suggestion: initialize with version 0 and then check the error message for the actual version. This addresses the "binary inspection" need.
  2. Consider adding NPY_FEATURE_VERSION_STRING (and maybe also NPY_API_VERSION_STRING), as I suggested. That will address the need to easily include version numbers in build logs of downstream packages.

And close this PR - turning it into an issue with feature requests.
