Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BLD: Setup meson builds #49115

Merged
merged 129 commits into from May 9, 2023
Merged

Conversation

lithomas1
Copy link
Member

@lithomas1 lithomas1 commented Oct 15, 2022

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

This can be merged before or after #49114. The meson build system is not going to work properly, anyways, until the PR enabling it as the default for PEP517 builds is merged anyways.

pandas can be compiled with meson now. Building pandas via pyproject.toml will be implemented in the future.

@lithomas1 lithomas1 added the Build Library building on various platforms label Oct 15, 2022
pandas/meson.build Show resolved Hide resolved
@@ -0,0 +1,75 @@
# TODO: can this be removed, I wish meson copied .pyx source to the build dir automatically
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this copy actually required? Seems to somewhat blur the lines of the purpose of an in source vs out of source build if we have to copy the source over

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT the ones that are passed to the extension_module() definitions below are the original sources, so it's not clear to me either, what this is doing.

In general I would not expect this to be needed unless somehow cython is attempting to automatically figure out information traversing multiple files. For example includes. But there do not seem to be any generated sources in this directory anyway?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .pxi files are generated in the build directory, but the other sources are in the source directory. I don't think cython has an option to pass an include directory for .pxi files, so we have to copy the source files that depend on the .pxi files over.

If a source doesn't have a dependency on a .pxi file, we don't copy it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What confused me mostly is that you don't seem to be using the copied .pxd anyway? Meson is getting told to run the cython tool on a .pyx in the source directory, not the build directory, as far as I can tell.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think cython has an option to pass an include directory for .pxi files

Right, and this is another thing that may make sense to add to cython upstream, there's a couple other options that already got added while SciPy was porting over.

EDIT: Hold on, this should already exist...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What confused me mostly is that you don't seem to be using the copied .pxd anyway? Meson is getting told to run the cython tool on a .pyx in the source directory, not the build directory, as far as I can tell.

cython_sources is a dict mapping source file -> configure_file output(build directory location).

The copied pxd would be used if we did something like cython_sources['algos.pyx'].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is too crazy but would it make sense to just have any imports of the generated templates assume those templates come from the build folder rather than the source?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would mean patching Cython unfortunately. I don't think the include dir command line option works, for pxi files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was afraid of that... I wonder why not though. Is this a cython bug? If not, it's definitely a cython feature request.

meson.build Outdated
project(
'pandas',
'c', 'cpp', 'cython',
version: '1.6.0.dev0',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this version interact with some of the other work you've done to get versioneer to work?

Copy link
Member Author

@lithomas1 lithomas1 Oct 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

versioneer determines the __version__ attribute. The version here is currently only used be meson-python(incorrectly) to determine the wheel filename(we are working on getting meson-python to use the __version__ tag as well).

In the future, this'll probably only be a fallback version, if everything fails to detect the pandas version properly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still a bit wary of using meson-python coming from the angle that it is relatively immature. Sorry if I lost the answer to this question but what is the advantage of using meson-python versus just creating a simple setuptools hook to invoke meson as needed? The latter wouldn't require any changes to versioneer or other setuptools-based tooling @rgommers

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From lithomas1#19 (comment):

@WillAyd it is possible, but I think that opens a whole new can of worms for no real reason. For example, you build the sdist with setuptools while you're building the wheel with a mix Meson and setuptools, leading to possible inconsistencies. And you'll be making cross-compiling worse rather than better.

If you want a gradual transition, it'd be better to merge this PR first, but still default to setuptools in pyproject.toml. That allows devs to start using Meson, without all contributors and users who are pip-installing from GitHub getting it straight away. And then it's a patch of a few lines to switch that default later.

In addition, I think it's optimistic to think that you can write a hook like that which won't have bugs or surprising behavior.

I am still a bit wary of using meson-python coming from the angle that it is relatively immature.

That's understandable. I'm sure there's a couple of bugs left to discover. That said, it's now been in SciPy main since last December, and in the SciPy 1.9.x release series. So hopefully the most glaring defects have been found by now. And meson and meson-python are quite actively maintained, so any painful bugs can hopefully be remedied quickly.

You also have a number of months till the next release, right? Can always just try it, and in case you're not happy right before the next release, it's literally a one-line patch to revert back to setuptools for one more release cycle.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with sticking to meson-python right now.

IMO, meson-python is pretty close to being feature-complete for our use case. We're just missing FFY00/meson-python#177 and the versioning issue(we can always rename the wheel by hand in the meantime).

@lithomas1 lithomas1 marked this pull request as ready for review October 18, 2022 13:37
pandas/meson.build Outdated Show resolved Hide resolved
@lithomas1 lithomas1 changed the title BLD: Add meson.build files BLD: Setup and enable meson builds via pyproject.toml Oct 23, 2022
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. Do we have to make changes to setup.py and/or CI still?

@lithomas1 lithomas1 changed the title BLD: Setup and enable meson builds via pyproject.toml BLD: Setup meson builds Oct 24, 2022
@lithomas1
Copy link
Member Author

Looks pretty good. Do we have to make changes to setup.py and/or CI still?

Sorry, it looks like I updated the title too eagerly. It looks like the pyproject.toml changes will have to go in with CI changes.

This would turn on meson builds for anyone doing a pip install from main. This is probably fine, since the meson builds pass all the tests anyways.

I don't think there will be any changes needed for setup.py(we'll keep it as a fallback, until we're ready to adopt meson).

@lithomas1 lithomas1 added this to the 2.0 milestone Oct 24, 2022
@WillAyd
Copy link
Member

WillAyd commented Oct 24, 2022

Is it a lot of effort to make this work for the development process as well? I think I'd prefer we use it consistently all the way through rather than just on install, as there are likely some build differences versus setuptools that we should catch in CI

@eli-schwartz
Copy link
Contributor

There's a couple options here:

  • merge meson.build, developers can use something like SciPy's dev.py to build, install with Meson so that it is importable, and test it, but it cannot be installed with pip or provide pip metadata
  • update pyproject.toml to use meson, and only meson, affects pip install, people can still invoke python setup.py bdist_wheel if they want to build with setuptools
  • delete setup.py, use only meson

Option 2 is the sensible approach for fallbacks, when porting to a new build system. Critically, it relies on setup.py natively supporting this, because...

... pip doesn't allow you to select a backend. Pip doesn't even allow you to ignore pyproject.toml and try the default setuptools (which it would do if you simply delete pyproject.toml before running pip). To add to the awkwardness, setuptools is busy deprecating and hoping to remove all support for a CLI, and the ecosystem around PyPA, pip, and PEP 517 encourages build backends to not provide a frontend, but use tools such as pip or build, which run whatever backend is in pyproject.toml.

Supporting multiple build systems during a migration is not something the python ecosystem has tried to take into account, and it cannot be generically expected to work. You have to pray that the build system ignores the hint to rely on pip, and provides its own frontend. Or you could edit pyproject.toml, either by hand or with a small python script, every time you want to switch which one you are trying.

@WillAyd
Copy link
Member

WillAyd commented Oct 24, 2022

Thanks for the input @eli-schwartz . Still a bit unclear to me but is the issue with doing the Option 3 of "only meson" that it doesn't work when doing a pip install?

@eli-schwartz
Copy link
Contributor

Option 3 means you better hope no one reports a bug with the port, because you cannot just tell them to "keep using setuptools for now, and we'll make sure to fix it in the next release". :D

@WillAyd
Copy link
Member

WillAyd commented Oct 24, 2022

If an issue pops up we can just pin or exclude certain versions of meson no? I think there are a lot of disadvantages to maintaining both setuptools and meson as a build backend from a code complexity perspective

@lithomas1
Copy link
Member Author

If an issue pops up we can just pin or exclude certain versions of meson no? I think there are a lot of disadvantages to maintaining both setuptools and meson as a build backend from a code complexity perspective

I think the main problem is that meson/meson-python hasn't cut a release with the necessary fixes to build pandas properly.

As a side note, There is also a nasty issue affecting rebuilds of pandas where the pxi dependencies are not generated(mesonbuild/meson#8961). I have an idea in the works for this though(@eli-schwartz can you comment maybe on my proposed fixes?)

You can basically develop pandas using meson right now though. If you're on my meson-poc branch, you would just need to pip install git+https://github.com/mesonbuild/meson.git@master and pip install git+https://github.com/mesonbuild/meson-python.git@main. Then you can do pip install --no-build-isolation . (editable installs won't work, since pandas is not built inplace anymore)

If you want to run tests, it would be pytest pandas ... --import-mode=importlib. (This is necessary, since Python will try to look in the current directory for pandas, and it won't find the inplace extension modules(again meson doesn't build inplace) and will fail).

@lithomas1
Copy link
Member Author

FWIW, I'm not too worried about maintaining the setuptools config. It doesn't really change much, and I guess the only issue would be if someone were to rename/remove/add a Cython module.

@WillAyd
Copy link
Member

WillAyd commented Oct 25, 2022

I think the main problem is that meson/meson-python hasn't cut a release with the necessary fixes to build pandas properly.

Do you know how far out this is? I think should just wait until that becomes possible

Then you can do pip install --no-build-isolation . (editable installs won't work, since pandas is not built inplace anymore)

Is this always going to be a limitation?

@eli-schwartz
Copy link
Contributor

Do you know how far out this is? I think should just wait until that becomes possible

I can answer for the Meson part. We're going to start rc1 for the next version of Meson pretty soon. The release manager was hoping to do it today, but there was a testsuite failure that was only present in Debian packaging, and not in CI (a test case assumed there was a network available) so that needs to be fixed first.

Is this always going to be a limitation?

There is an open ticket for meson-python to implement PEP 660 editable installs: mesonbuild/meson-python#47

AFAIK the only holdup is that no one has put in the work.

@WillAyd
Copy link
Member

WillAyd commented May 4, 2023

Just tried this locally but wasn't able to get it to work. I first ran python -m pip install -ve . --config-settings builddir="debug" --config-settings setup-args="-Ddebug=true" then when I try to import pandas I get:

>>> import pandas
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1140, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1080, in _find_spec
  File "/home/willayd/mambaforge/envs/pandas-dev/lib/python3.11/site-packages/_pandas_editable_loader.py", line 268, in find_spec
    tree = self.rebuild()
           ^^^^^^^^^^^^^^
  File "/home/willayd/mambaforge/envs/pandas-dev/lib/python3.11/site-packages/_pandas_editable_loader.py", line 309, in rebuild
    subprocess.run(self._build_cmd, cwd=self._build_path, env=env, stdout=stdout, check=True)
  File "/home/willayd/mambaforge/envs/pandas-dev/lib/python3.11/subprocess.py", line 548, in run
    with Popen(*popenargs, **kwargs) as process:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/willayd/mambaforge/envs/pandas-dev/lib/python3.11/subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/home/willayd/mambaforge/envs/pandas-dev/lib/python3.11/subprocess.py", line 1917, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-build-env-w10tgoog/overlay/bin/ninja'

Am I doing something wrong?

@lithomas1
Copy link
Member Author

Can you try with --no-build-isolation also?
(Make sure that you have all dependencies installed and up to date with this, though.)

I might be totally off here, but it seems like meson-python is referring to the temporary ninja (which is deleted after build because of build isolation) when trying to rebuild.

@WillAyd
Copy link
Member

WillAyd commented May 4, 2023

Ah nice. Sorry I see that in the docs not sure how I missed it. User error

@WillAyd
Copy link
Member

WillAyd commented May 4, 2023

Only other thing I've noticed is that the performance seems to be lagging. Here are my local results for the two ways of building a debug version from scratch:

>>> time python -m pip install -ve . --no-build-isolation --config-settings builddir="debug" --config-settings setup-args="-Ddebug=true"
real	2m34.515s
user	19m58.334s
sys	0m28.947s

>>> python setup.py clean --all
>>> time python setup.py build_ext --inplace -j8 --with-debugging-symbols
real	0m44.110s
user	4m24.354s
sys	0m10.020s

These aren't completely apples to apples but the entire cython/compile/link steps feel a lot slower with meson

@WillAyd
Copy link
Member

WillAyd commented May 4, 2023

That said, I don't think performance is a blocker. We can continue to refine, especially since this doesn't preclude us from using setuptools. I think we can move forward if we get a windows test from @Dr-Irv

@eli-schwartz
Copy link
Contributor

These aren't completely apples to apples but the entire cython/compile/link steps feel a lot slower with meson

Maybe the cython step specifically? Meson runs cython as a command-line program, once per file, while setup.py imports Cython.Build.cythonize and calls that without forking.

@WillAyd
Copy link
Member

WillAyd commented May 4, 2023

Chatted with @lithomas1 offline and the performance issue traced back to using the wrong settings. Changing --config-settings setup-args="-Ddebug=true" to -config-settings=setup-args="-Dbuildtype=debug" made a huge difference; now it is even faster than setuptools

@WillAyd
Copy link
Member

WillAyd commented May 8, 2023

Can anyone else from @pandas-dev/pandas-core take a look? This one is pretty tough to keep in sync with master, so would be good to not let hang around. Would be ideal for a Windows user to give approval too

# where provided Python is not compiled in debug mode
'buildtype=release',
# TODO: Reactivate werror, some warnings on Windows
#'werror=true',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never add -Werror as a default flag in any project, that's a no no - instead put it in a CI job. I suggest deleting the above two lines.

@@ -2,13 +2,15 @@
# Minimum requirements for the build system to execute.
# See https://github.com/scipy/scipy/pull/12940 for the AIX issue.
requires = [
"setuptools>=61.0.0",
"meson-python==0.13.1",
"meson[ninja]==1.0.1",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[ninja] should be removed here; meson-python already adds that if it's missing (but prefers the system version if installed).

Also, these pins are typically not what you want in main, only in a release branch. Or better, an upper bound and a comment that it's for safety and not because of known incompatibilities - that way packagers know that a looser bound or no bound is fine if they are otherwise running into conflicts.

@rgommers
Copy link
Contributor

rgommers commented May 8, 2023

This is looking quite good, and most of the remaining comments are minor/cleanups. There inevitably will be something that will not work 100% right at some point given how large this PR is, but it seems pretty much good to go. I'd suggest addressing the final comments and merging once CI is green, or even merging it now and addressing the rest in a follow-up PR. It's unlikely that there are structural issues left, and more unlikely that any further review is going to catch them now.

@Dr-Irv
Copy link
Contributor

Dr-Irv commented May 8, 2023

Can anyone else from @pandas-dev/pandas-core take a look? This one is pretty tough to keep in sync with master, so would be good to not let hang around. Would be ideal for a Windows user to give approval too

I'm going to do a Windows test now

@Dr-Irv
Copy link
Contributor

Dr-Irv commented May 8, 2023

Can anyone else from @pandas-dev/pandas-core take a look? This one is pretty tough to keep in sync with master, so would be good to not let hang around. Would be ideal for a Windows user to give approval too

I'm going to do a Windows test now

I can confirm that the builds worked fine on Windows. I followed the docs instructions, and it just worked. I did tests from running python from the command line as well as pytest, and it all seemed pretty smooth.

@WillAyd
Copy link
Member

WillAyd commented May 8, 2023

Right on. Unless any objections I think @lithomas1 can merge after conflict fixup + green

@lithomas1
Copy link
Member Author

@fangchenli
Can you take another look?
(No rush, but I can't merge this myself with a "Changes Requested" review.)

@WillAyd WillAyd merged commit 7304396 into pandas-dev:main May 9, 2023
64 checks passed
@WillAyd
Copy link
Member

WillAyd commented May 9, 2023

Great work @lithomas1 . Very exciting stuff

@lithomas1
Copy link
Member Author

Thanks everyone for helping me out with this.

This was referenced May 19, 2023
Rylie-W pushed a commit to Rylie-W/pandas that referenced this pull request May 19, 2023
@WillAyd WillAyd added the Docs label May 24, 2023
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms Docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet