Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Add CMake module for compiling Cython code #5486

Open
vyasr opened this issue Jun 14, 2023 · 24 comments
Open

[ENH] Add CMake module for compiling Cython code #5486

vyasr opened this issue Jun 14, 2023 · 24 comments
Assignees

Comments

@vyasr
Copy link
Contributor

vyasr commented Jun 14, 2023

Is your feature request related to a problem? Please describe.

Currently the recommended method for building Cython modules is using setuptools via setup.py. While this is still completely valid, the general trend of the Python packaging ecosystem is to move away from this pattern. This is true in multiple respects:

I do not know if Cython has any plans for supporting setup.cfg files (I wasn't able to find anything in the issue tracker), nor do I have enough knowledge of Cython internals to propose a plan for how Cython could be integrated into a setup.cfg-based build. For projects leveraging tools like meson or CMake, though, it there is a fairly should be possible to chart a reasonable path forward here.

Describe the solution you'd like.

The general pattern for Python builds using an underlying build system is to use cython to transpile Cython code into C/C++ and then to manage compilation of the resulting source file using the build generator (as opposed to using cythonize) since that allows users to leverage the full features of the build system to control the compilation. I do not personally make use of Meson, but I do have experience with CMake and scikit-build. To help facilitate the transition for Cython users using more extensive CMake-based builds, I would like for Cython to provide a CMake module that (at minimum) defines a function that can be used to run the cython CLI and generate code with it. This function would provide an interface to all the command line flags for the cython CLI and handle any necessary preprocessing. The resulting module could then be used by e.g. scikit-build users or those manually handling CMake-based compilation.

If the developers are interested in this feature, I am happy to contribute it in a PR.

Describe alternatives you've considered.

scikit-build currently provides a UseCython.cmake module for this purpose. However, the successor package scikit-build-core does not. I originally considered porting the functionality there, but the Cython project itself seems like a more appropriate home for this logic. Moreover, putting this code into Cython would allow other builders to also take advantage of it instead of relying on scikit-build-core.

Additional context

The one caveat with putting this module into Cython is that if Cython is installed via pip the module will go into the Python site-packages prefix. This location will not be on CMake's module search path by default, unlike if it was installed directly into the primary $PREFIX directly (e.g. a system /usr/lib directory equivalent, or perhaps a conda env prefix). That means that users would either need to query Python's CLI using a CMake custom command to get the site directory, or CMake would need to add support for this directly. Some discussion of that is in the CMake issue I originally opened regarding Cython support.

@henryiii
Copy link

henryiii commented Jun 14, 2023

FYI, a good example of supporting this, without requiring CMake when building Cython, is at https://github.com/wjakob/nanobind/blob/master/cmake/nanobind-config.cmake (and an example that does require CMake at build time can be found in pybind11, but I think this is better). By supplying a cmake/cython-config.cmake file in the PyPI package, find_package(cython) will work out-of-the-box in scikit-build-core (and therefore eventually in scikit-build classic too). (As mentioned above, there's a bit of querying that you'd have to do if you are using CMake directly; that can be simplified a little by providing a helper function that produces the location of the cmake module; and I'm hoping to improve support for this in CMake in the future.)

If you are interested in this, I could either help develop or eventually contribute the file.

@h-vetinari
Copy link
Contributor

Meson also added native support for Cython: https://mesonbuild.com/Cython.html

@vyasr
Copy link
Contributor Author

vyasr commented Jun 22, 2023

@henryiii I don't quite follow your last comment. What do you mean by

without requiring CMake when building Cython

How does the nanobind example avoid CMake? It is a CMake config module so I'm not understanding what step in the process is CMake-free. The Cython transpiling itself?

As for the rest, I think the discussion on https://gitlab.kitware.com/cmake/cmake/-/issues/24982 is indicative of the helper function for the location of the CMake module by using the extra package paths provided by FindPython, yes?

@henryiii
Copy link

How does the nanobind example avoid CMake?

I might not have been clear - what I meant is nanobind itself doesn't use CMake to make SDists and wheels of nanobind. This is not like pybind11, which actually runs CMake when making an SDist in order to produce a config file.

If Cython added a cmake file similar to nanobind's, it would literally just be adding one file (plus any config necessary to make sure it's including in SDist/wheel), and it would just start working with scikit-build-core (and I think py-build-cmake too). The discussion in the linked issue is about how you'd also use it from pure CMake too, which should be pretty easy.

@vyasr
Copy link
Contributor Author

vyasr commented Jun 22, 2023

Ah OK then yes, I agree. I've already started drafting a sample cython-config.cmake module for my own use that would essentially expose the cython CLI invocation via CMake. Happy to contribute that either here or in CMake proper if that is determined to be the correct home (although based on current discussions I think Cython is a better home for it). Then it's just a matter of adding an example of how to add the necessary site-packages dir to the search path using vanilla CMake as you mentioned, Marc showed us the necessary CMake vars in the other discussion.

@vyasr
Copy link
Contributor Author

vyasr commented Jul 11, 2023

@henryiii you can see my first pass here. It's a fairly direct translation of the cython CLI to CMake. I didn't want to do any of the extra magic that scikit-build classic's UseCython.cmake did since with a proper CMake setup that should be handled by the caller using target_include_directories, but let me know if you disagree with that approach (and of course with any other bits here). If you think this looks solid we can see what the Cython devs think about me upstreaming this into Cython (along with some documentation).

@eli-schwartz
Copy link
Contributor

What is the advantage of copying the cython CLI as properties compared to, say, a cython_args array of arguments to invoke cython with?

@vyasr
Copy link
Contributor Author

vyasr commented Jul 11, 2023

That's a good question. I think there are tradeoffs here.
Pros:

  • Copying the CLI allows us to control the argument list a little more cleanly, e.g. we can prevent users from specifying an output directory (which this module should probably manage directly.
  • It simplifies internal parsing of some pieces like determining what file extensions to use, and it makes it easier to validate inputs.
  • It allows us to have different defaults if there are cases where it may make sense to use a different default in the context of CMake. For example, it might make sense to set the --gdb flag based on CMAKE_BUILD_TYPE.

Cons:

  • It requires keeping a set of mappings up-to-date. If cython adds or changes options, the cython-config.cmake will have to change as well. This is additional maintenance that Cython devs may want to avoid.
  • It requires users to learn a second set of arguments instead of directly using the ones they're familiar with. I suspect that this argument is partially moot, however, since I think most users of Cython go the setup.py route and don't invoke Cython directly anyway.

There are probably other considerations as well. Copying the CLI was my first thought, but I could be convinced to go either way depending on how others feel about it.

@eli-schwartz
Copy link
Contributor

But don't all of those pros also apply to *.cpp files where you can specify target_compile_options(target PRIVATE -o foobar.cpp.o)?

But no one actually does that. They are far more likely to specify the std, in which case... last wins.

Although cmake does have a special property for defines, and also properties for things like -fvisibility=hidden or -fvisibility-inlines-hidden so presumably it is sometimes worth it...

...

The reason I bring this up is because meson (which I'm a developer for) has first-class builtin support for cython, and we offer 3 basic tunables for this:

  • cython_language={c|cpp}
  • cython_version={2|3}
  • cython_args=<array-of-string>

We do not technically need the second option at all, but originally we unconditionally specified -3, and it's a wise default so we don't want to force everyone to set it in cython_args. The first option is crucial to select the output filename. Nothing else is really our responsibility, so we felt little motivation to offer distinctive target options, especially because it's important to ultimately give users the escape hatch of providing their own CXXFLAGS cython flags without updating their meson version to track updates to cython.

@vyasr
Copy link
Contributor Author

vyasr commented Jul 11, 2023

But don't all of those pros also apply to *.cpp files where you can specify target_compile_options(target PRIVATE -o foobar.cpp.o)?

But no one actually does that. They are far more likely to specify the std, in which case... last wins.

I'm not sure I understand what you mean here. Are you saying that CMake allows arbitrary e.g. CMAKE_CXX_FLAGS rather than providing a direct mapping of all possible options? If so, I agree and that's a good argument for why we should avoid the mapping. I'm not sure how that's related to "specifying the std" though, are you referring to flags like -std=c++17?

I'd be fine with stripping it down to the bare minimum of the language and an arbitrary set of cython_args like you suggest. The Python version (2 vs. 3) is also a reasonable default to include.

@eli-schwartz
Copy link
Contributor

eli-schwartz commented Jul 11, 2023

I'm not sure how that's related to "specifying the std" though, are you referring to flags like -std=c++17?

Yes, which has a soft conflict with the use of CXX_STANDARD. That particular option (and a few others) could be specified either via a cmake property or via an arbitrary flag -- what happens if a project does both? Presumably, you get both, and then the compiler just uses whichever one comes last.

@robertmaynard
Copy link

But don't all of those pros also apply to *.cpp files where you can specify target_compile_options(target PRIVATE -o foobar.cpp.o)?

That won't break CMake :) . The template layout for the compilation line that CMake uses places the CMake computed -o as the last item on the compilation line.

Yes, which has a soft conflict with the use of CXX_STANDARD. That particular option (and a few others) could be specified either via a cmake property or via an arbitrary flag -- what happens if a project does both? Presumably, you get both, and then the compiler just uses whichever one comes last.

The very rough order of template line for C++ compilation in CMake in left to right:

  • CMAKE_CXX_FLAGS
  • computed target properties e.g. CXX_STANDARD
  • user properties e.g. target_compile_options
  • CMake's computed -o <output_file>

@eli-schwartz
Copy link
Contributor

eli-schwartz commented Jul 11, 2023

That won't break CMake :) . The template layout for the compilation line that CMake uses places the CMake computed -o as the last item on the compilation line.

And the same rule can easily apply to cython integrations. ;) No matter how conceptually broken it is for a CMakeLists.txt to try doing that, required options can always be placed in the override position to avoid the breakage.

@vyasr
Copy link
Contributor Author

vyasr commented Jul 11, 2023

All right, I've updated the code with the stripped down approach you suggested. @eli-schwartz does this match your expectations? It looks like Meson's cython_args support is documented here, but Meson's approach is different enough that my changes are the equivalent for CMake AFAICT.

It would be possible to support something like CMAKE_CYTHON_ARGS as well to mirror e.g. CMAKE_CXX_ARGS, but I'd rather not do that via a third-party config file. It would be different if we were hoping to add this config directly into CMake, but if it's living in a third-party package like Cython then I think it would be confusing to name it in a way that suggests a CMake built-in. We could introduce the variable with a different name, but I don't think the benefits outweigh the costs (more complex and difficult to understand resolution order as Robert illustrated for CXX flags above).

@eli-schwartz
Copy link
Contributor

That looks like what I was thinking of (with the caveat that I'm not the world's biggest expert on cmake custom functions).

I agree that consuming the CMAKE_* namespace would be out of scope for something not shipping with upstream cmake, yeah.

@vyasr
Copy link
Contributor Author

vyasr commented Jul 11, 2023

Cool thanks. We'll see what @henryiii has to say, if he likes this then I think it'll be in good shape for the Cython devs to weigh in.

@henryiii
Copy link

On flight to SciPy, so might be a bit delayed looking at this. If anyone is at SciPy we can discuss there a bit.

@vyasr
Copy link
Contributor Author

vyasr commented Jul 12, 2023

Thanks for the heads up. No rush, just ping here whenever you have the chance to look. I am unfortunately not at SciPy this year, maybe next year!

@jcfr
Copy link

jcfr commented Jul 15, 2023

@henryiii, @thewtex and myself are now attending the SciPy sprints and will be working on this 🚀

We will consolidate existing work and propose a pull request to cython.

Depending on our finding, we may also create a scikit-build/cython-cmake project allowing us to (1) decouple scikit-build release from update to this new module and (2) deprecate the package to promote the use of the one that will be available in cython proper.

@vyasr
Copy link
Contributor Author

vyasr commented Jul 15, 2023

Sounds great! Please keep me in the loop as well. If my existing prototype is close enough to what you envision, I'm happy to do the work myself and save you the trouble of recreating the wheel. I'm on board with the scikit-build -> Cython gradual progression and can push changes to either or both as you see fit.

@scoder
Copy link
Contributor

scoder commented Jul 17, 2023

attending the SciPy sprints and will be working on this

Great. I'll push the final Cython release soon (it's actually tagged already), but given that this is an entirely new feature, I'll be happy to include it in 3.0.1 or whenever you deem it ready.

@jcfr
Copy link

jcfr commented Jul 27, 2023

I'll be happy to include it in 3.0.1 or whenever you deem it ready.

Thanks for accommodating, we will follow up shortly.

@vyasr
Copy link
Contributor Author

vyasr commented Aug 10, 2023

Please assign this issue to me so that I don't lose track of it. Thanks!

@vyasr
Copy link
Contributor Author

vyasr commented Nov 27, 2023

I've put together a working version of cython-cmake now: https://github.com/vyasr/cython-cmake/. I'll discuss with @henryiii and @jcfr how we can get this migrated to shared ownership in the scikit-build org: scikit-build/cython-cmake#2. Once that's done we can start putting it into production and then iterate as needed. Once we're happy with the code and it's being used successfully in a few places I think we can look into migrating the CMake module into Cython proper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants