Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cython support issue with generated pyx when touching cimported pxd: ninja needs to be run twice to fully build the project? #13212

Open
lesteve opened this issue May 14, 2024 · 10 comments

Comments

@lesteve
Copy link
Contributor

lesteve commented May 14, 2024

Describe the bug
When touching a pxd cimported in a generated Cython pyx files, ninja needs to be run twice to fully build the project. This was originally seen in scikit-learn see scikit-learn/scikit-learn#28837. Seems like something maybe similar/related was reported in mesonbuild/meson-python#589 where @dnicolodi investigated and mentioned a work-around. In our case we don't have generated pxi but generated pyx.

To Reproduce
A simpler reproducer is available in https://github.com/lesteve/meson-partial-build

Here is a summary to give the gist of it

Package layout:

package
├── __init__.py
├── lib
│   ├── __init__.py
│   ├── lib.pyx
│   ├── lib.pyx.in
│   └── meson.build
├── meson.build
└── utils
    ├── __init__.py
    ├── meson.build
    └── utils.pxd

package/lib/meson.build

lib_cython_tree = [fs.copyfile('__init__.py')]

# In scikit-learn this would actually use tempita on a .pyx.tp file, but
# simpler to reproduce with only copyfile
lib_pyx = fs.copyfile('lib.pyx.in', 'lib.pyx')

py.extension_module(
  'lib',
  lib_pyx,
  lib_cython_tree,
  root_cython_tree,
  utils_cython_tree,
  cython_args: cython_args,
  install: true,
  subdir: 'package/lib',
)

package/utils/meson.build

utils_cython_tree = [
  fs.copyfile('__init__.py'),
  fs.copyfile('utils.pxd')
]

Initial Setup + build:

meson setup build
ninja -C build

First build:

touch package/utils/utils.pxd
# This does not rebuild package/lib/lib shared library
ninja -C build -d explain -v

Output:

ninja: Entering directory `build'
ninja explain: output meson-test-prereq of phony edge with no inputs doesn't exist
ninja explain: meson-test-prereq is dirty
ninja explain: output meson-benchmark-prereq of phony edge with no inputs doesn't exist
ninja explain: meson-benchmark-prereq is dirty
ninja explain: restat of output package/utils/utils.pxd older than most recent input ../package/utils/utils.pxd (1713451993361700259 vs 1713452006018287455)
ninja explain: package/utils/utils.pxd is dirty
[1/1] /home/lesteve/micromamba/envs/scikit-learn-dev/bin/meson --internal copy ../package/utils/utils.pxd package/utils/utils.pxd

Second build still do things:

# The second time this builds package/lib/lib shared library
ninja -C build -d explain -v

Output:

ninja: Entering directory `build'
ninja explain: output meson-test-prereq of phony edge with no inputs doesn't exist
ninja explain: meson-test-prereq is dirty
ninja explain: output meson-benchmark-prereq of phony edge with no inputs doesn't exist
ninja explain: meson-benchmark-prereq is dirty
ninja explain: restat of output package/lib/lib.cpython-312-x86_64-linux-gnu.so.p/package/lib/lib.pyx.c older than most recent input /home/lesteve/dev/meson-partial-build/build/package/utils/utils.pxd (1713452000454989082 vs 1713452006018287455)
ninja explain: package/lib/lib.cpython-312-x86_64-linux-gnu.so.p/package/lib/lib.pyx.c is dirty
ninja explain: package/lib/lib.cpython-312-x86_64-linux-gnu.so.p/package/lib/lib.pyx.c is dirty
ninja explain: package/lib/lib.cpython-312-x86_64-linux-gnu.so.p/meson-generated_package_lib_lib.pyx.c.o is dirty
ninja explain: package/lib/lib.cpython-312-x86_64-linux-gnu.so is dirty
[1/3] cython -M --fast-fail -3 --include-dir /home/lesteve/dev/meson-partial-build/build package/lib/lib.pyx -o package/lib/lib.cpython-312-x86_64-linux-gnu.so.p/package/lib/lib.pyx.c
[2/3] ccache cc -Ipackage/lib/lib.cpython-312-x86_64-linux-gnu.so.p -Ipackage/lib -I../package/lib -Ipackage -Ipackage/utils -I/home/lesteve/micromamba/envs/scikit-learn-dev/include/python3.12 -fvisibility=hidden -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -std=c99 -O2 -g -fPIC -MD -MQ package/lib/lib.cpython-312-x86_64-linux-gnu.so.p/meson-generated_package_lib_lib.pyx.c.o -MF package/lib/lib.cpython-312-x86_64-linux-gnu.so.p/meson-generated_package_lib_lib.pyx.c.o.d -o package/lib/lib.cpython-312-x86_64-linux-gnu.so.p/meson-generated_package_lib_lib.pyx.c.o -c package/lib/lib.cpython-312-x86_64-linux-gnu.so.p/package/lib/lib.pyx.c
[3/3] cc  -o package/lib/lib.cpython-312-x86_64-linux-gnu.so package/lib/lib.cpython-312-x86_64-linux-gnu.so.p/meson-generated_package_lib_lib.pyx.c.o -Wl,--as-needed -Wl,--allow-shlib-undefined -shared -fPIC

Expected behavior
After touching utils.pxd the project is fully built in one go. Calling ninja again says "No work to do"

system parameters

  • Is this a cross build or just a plain native build (for the same computer)? native build
  • what operating system (e.g. MacOS Catalina, Windows 10, CentOS 8.0, Ubuntu 18.04, etc.) ArchLinux
  • what Python version are you using: 3.12
  • what meson --version: 1.4.0
  • what ninja --version if it's a Ninja build: 1.11.1
@dnicolodi
Copy link
Member

I am not sure whether the solution I described in mesonbuild/meson-python#589 is a work-around or the intended way of coding the dependency. The solution is taken directly from the Meson's test for Cython support, thus I tend to think that it is the intended way for thing to work. Why do you think it is a work-around?

I have the impression that the issue reported here is due to the inclusion of a .py file in the sources but I need to look further into it.

@dnicolodi
Copy link
Member

The __init__.py file in the test package linked in the issue is empty. Why is it passed along with the Cython source code to py.extension_module()?

@lesteve
Copy link
Contributor Author

lesteve commented May 15, 2024

Thanks for taking a look!

Why is it passed along with the Cython source code to py.extension_module()?

This is needed for Cython cimports to work, you generally need __init__.py and cimported .pxd copied to the build directory, so that Cython detects this as a package and can cimport the relevant files. Maybe scikit-learn/scikit-learn#28040 (comment) has a tiny bit more context about this point. You also need the utils __init__.py to be copied before lib is built and I believe adding the __init__.py and .pxd to the py.extension_module achieves this through an order-only dependency in the build.ninja.

I am not sure whether the solution I described in mesonbuild/meson-python#589 is a work-around or the intended way of coding the dependency

I don't know either. If there was a similar work-around for my stand-alone example repo, it would be fine. At the same time, as a naive user, it feels quite convoluted to have to create a dependency from files rather than pass directly files to the py.extension_module but I still have plenty to learn about Meson, so I am probably missing some subtleties here ...

Edit: I should have mentioned that mesonbuild/meson-python#589 may be slightly different and happens with generated .pxi where this issue happens with generated .pyx. Not a Cython expert, but I would say that .pxi are bit more weird/special than .pyx.

@lesteve
Copy link
Contributor Author

lesteve commented May 29, 2024

cc @rgommers mostly for a FYI, not sure whether there is a better person to ping in the Meson team for Cython-specific things.

Summary: in some cases when updating a .pxd you need to run ninja twice to fully build the project. I have a standalone repo reproducing the issue https://github.com/lesteve/meson-partial-build. This was originally seen in scikit-learn scikit-learn/scikit-learn#28837.

@rgommers
Copy link
Contributor

The reason I think has to do with depfile support in Cython not knowing about files that are copied with fs.copyfile:

$ cd build/package/lib/lib.cpython-311-x86_64-linux-gnu.so.p/package/lib/
$ cat lib.pyx.c.dep
package/lib/lib.cpython-311-x86_64-linux-gnu.so.p/package/lib/lib.pyx.c: \
  /home/rgommers/code/tmp/meson-partial-build/build/package/utils/utils.pxd \
  package/lib/lib.pyx

That explains why it works when invoking ninja twice: the first time the copyfile is executed, and the second time around the build/package/utils/utils.pxd (which is tracked in the depfile) has changed, so now the extension module rebuilds.

If you're already running Tempita as a Meson generator, the easiest fix is probably to use the depends: keyword to that generator and make that depend on the .pxd files. Something like:
https://github.com/scipy/scipy/blob/74c13e4406ae7a95b4509cb436e8fee5b34b2868/scipy/special/meson.build#L119-L122

it feels quite convoluted to have to create a dependency from files rather than pass directly files to the py.extension_module but I still have plenty to learn about Meson, so I am probably missing some subtleties here ...

Yes, we'd like it to work automatically whenever possible. It's already pretty solid after we added depfile support to Cython. But this templating use case is the exception I believe. I don't think it comes up for other languages, because there isn't a need to manually go copy header files around.

The missing link is in here somewhere (from build/build.ninja):

build package/utils/utils.pxd: CUSTOM_COMMAND ../package/utils/utils.pxd
 COMMAND = /home/rgommers/mambaforge/envs/scipy-dev/bin/meson --internal copy ../package/utils/utils.pxd package/utils/utils.pxd
 description = Copying$ file$ package/utils/utils.pxd

build package/lib/__init__.py: CUSTOM_COMMAND ../package/lib/__init__.py
 COMMAND = /home/rgommers/mambaforge/envs/scipy-dev/bin/meson --internal copy ../package/lib/__init__.py package/lib/__init__.py
 description = Copying$ file$ package/lib/__init__.py

build package/lib/lib.pyx: CUSTOM_COMMAND ../package/lib/lib.pyx.in
 COMMAND = /home/rgommers/mambaforge/envs/scipy-dev/bin/meson --internal copy ../package/lib/lib.pyx.in package/lib/lib.pyx
 description = Copying$ file$ package/lib/lib.pyx
build package/lib/lib.cpython-311-x86_64-linux-gnu.so.p/meson-generated_package_lib_lib.pyx.c.o: c_COMPILER package/lib/lib.cpython-311-x86_64-linux-gnu.so.p/package/lib/lib.pyx.c || package/__init__.py package/lib/__init__.py package/utils/__init__.py package/utils/utils.pxd
 DEPFILE = package/lib/lib.cpython-311-x86_64-linux-gnu.so.p/meson-generated_package_lib_lib.pyx.c.o.d
 DEPFILE_UNQUOTED = package/lib/lib.cpython-311-x86_64-linux-gnu.so.p/meson-generated_package_lib_lib.pyx.c.o.d

cc @rgommers mostly for a FYI, not sure whether there is a better person to ping in the Meson team for Cython-specific things.

Thanks for the ping. There's a few folks that try to be helpful with Cython stuff, I'm one of them. I think I have a decent feel for all the ways to shoot oneself in the foot with Cython's path-dependent behavior by now, so feel free to ping me indeed:)

@eli-schwartz
Copy link
Member

eli-schwartz commented May 29, 2024

package/utils/utils.pxd is a builddir relative path to a fs.copyfile() output, hardcoded as a header dependency rule in build.ninja

/home/rgommers/code/tmp/meson-partial-build/build/package/utils/utils.pxd is an absolute path to the same exact file, inferred via a depfile as a header dependency rule.

The first rule guarantees that fs.copyfile() is correctly run before cythoning that file. The issue is apparently that the depfile makes ninja think it now depends on another instance of that file, ninja doesn't dedupe and realize both are the same path, and ninja erroneously thinks the cythoned file is out of date?

This is either a ninja bug or a cython depfile support bug depending on how you look at it. Possibly it's a "both sides are buggy" bug.

@lesteve
Copy link
Contributor Author

lesteve commented May 29, 2024

Thanks a lot for your inputs!

If you're already running Tempita as a Meson generator, the easiest fix is probably to use the depends: keyword to that generator and make that depend on the .pxd files. Something like:
scipy/scipy@74c13e4/scipy/special/meson.build#L119-L122

I'll have a closer look at using a Meson generator, it looks like indeed a reasonable work-around.

@rgommers
Copy link
Contributor

The issue is apparently that the depfile makes ninja think it now depends on another instance of that file, ninja doesn't dedupe and realize both are the same path, and ninja erroneously thinks the cythoned file is out of date?

Good points. I verified that if I edit the depfile by hand, changing the absolute path to a relative one, the rebuild does work as expected. So if we can avoid Cython writing absolute paths, things should start working as intended it looks like.

@lesteve
Copy link
Contributor Author

lesteve commented May 30, 2024

FWIW, by looking a bit further, it looks like another work-around is to use the depends argument in the tempita custom_target

commit 3e13e9fff6bbefc9c68510dbe2653f4d79196d72 (HEAD -> fix, origin/fix)
Author: Loïc Estève <loic.esteve@ymail.com>
Date:   Thu May 30 14:34:29 2024 +0200

    Fix by adding depends to custom_target

diff --git a/package/lib/meson.build b/package/lib/meson.build
index 961ebc8..83651b5 100644
--- a/package/lib/meson.build
+++ b/package/lib/meson.build
@@ -5,14 +5,12 @@ lib_pyx = custom_target(
   output: 'lib.pyx',
   input: 'lib.pyx.in',
   command: [py, tempita, '@INPUT@', '-o', '@OUTDIR@'],
+  depends: [lib_cython_tree, root_cython_tree, utils_cython_tree]
 )
 
 py.extension_module(
   'lib',
   lib_pyx,
-  lib_cython_tree,
-  root_cython_tree,
-  utils_cython_tree,
   cython_args: cython_args,
   install: true,
   subdir: 'package/lib',

This is probably not as exact as the cython generator (the dependencies are actually indeed for the .pyx -> c step and not the pyx.in -> pyx step) but it seems to have the same result and it's probably a bit easier given how scikit-learn meson.build are currently written.

@rgommers
Copy link
Contributor

@lesteve yes definitely, that is the right workaround for now. It also was what I meant with the suggestion in my first comment ("... the easiest fix is probably to use the depends: keyword to that generator and make that depend on the .pxd files ...").

The depfile discussion is the structural fix longer-term, but it needs a Cython fix and then a Cython release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants