[Python] Create tools to enable optional components (like Gandiva, Flight) to be built and deployed as separate Python packages #24688

Open
asfimport opened this issue Apr 19, 2020 · 1 comment

Comments

@asfimport
Collaborator

Our current monolithic approach to Python packaging isn't likely to be sustainable long-term.

At a high level, I would propose a structure like this:

pip install pyarrow  # core package containing libarrow, libarrow_python, and any other common bundled C++ library dependencies

pip install pyarrow-flight  # installs pyarrow, pyarrow_flight

pip install pyarrow-gandiva # installs pyarrow, pyarrow_gandiva

We can maintain the semantic appearance of a single pyarrow package by having thin API modules that would look like

CONTENTS OF pyarrow/flight.py

from pyarrow_flight import *
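
One wrinkle with a bare star import is that a user who has only installed the core package would get a plain ModuleNotFoundError from `import pyarrow.flight`. A minimal sketch of a friendlier shim follows. The helper names `load_optional` and `reexport` are hypothetical and not part of pyarrow; `pyarrow_flight` and `pyarrow-flight` are the names proposed above.

```python
import importlib


def load_optional(module_name, pip_name):
    """Import an optional component, or fail with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Only translate the error when the optional package itself is
        # missing, not when one of its own imports fails.
        if exc.name != module_name:
            raise
        raise ImportError(
            f"{module_name} is not installed; try `pip install {pip_name}`"
        ) from exc


def reexport(module, namespace):
    """Copy the public names of `module` into `namespace`, like `import *`."""
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in vars(module) if not n.startswith("_")]
    for name in names:
        namespace[name] = getattr(module, name)
    return names
```

Under these assumptions, pyarrow/flight.py would reduce to a single line: `reexport(load_optional("pyarrow_flight", "pyarrow-flight"), globals())`.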

Obviously, this is more difficult to build and package:

  • CMake and setup.py files must be refactored a bit so that we can reuse code between the parent and child packages

  • Separate conda and wheel packages must be produced. With conda this seems more straightforward, but since the child wheels depend on the parent core wheel, the build process seems more complicated.
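
To make the parent/child dependency concrete, here is a rough sketch of metadata that a shared setup helper could generate for each child package. The function `child_package_metadata` is hypothetical, and pinning the child to the exact version of the core wheel is an assumption on my part, not something settled in this issue.

```python
def child_package_metadata(component, version):
    """Build setup() keyword arguments for one optional component."""
    return {
        # Distribution name, as used by `pip install`.
        "name": f"pyarrow-{component}",
        "version": version,
        # Import name, as used by the thin API module in pyarrow.
        "packages": [f"pyarrow_{component}"],
        # Assumption: pin the core wheel exactly, since the child's
        # extension modules would link against the libarrow it ships.
        "install_requires": [f"pyarrow=={version}"],
    }
```

A child setup.py could then call `setuptools.setup(**child_package_metadata("flight", version))`, with `version` taken from the same source as the core package.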

In any case, I don't think these challenges are insurmountable. This will have several benefits:

  • Smaller installation footprint for simple use cases (though note we are STILL duplicating shared libraries in the wheels, which is quite bad)

  • Less developer anxiety about expanding the scope of what Python code is shipped from apache/arrow. If in 5 years we are shipping 5 different Python wheels with each Apache Arrow release, that sounds completely fine to me.
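
The duplicated shared libraries mentioned above can be spotted with a quick check over a set of built wheels. This is only a sketch using the standard library: `shared_libs` and `duplicated_libs` are hypothetical helpers, and comparing by file name is a simplification, since tools that repair wheels may rename bundled libraries.

```python
import re
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Matches names such as libfoo.so, libfoo.so.1600, libfoo.dylib, foo.dll
_SHARED_LIB = re.compile(r"\.(so(\.\d+)*|dylib|dll)$")


def shared_libs(wheel_path):
    """Return the file names of all shared libraries inside one wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return {
            PurePosixPath(name).name
            for name in wheel.namelist()
            if _SHARED_LIB.search(name)
        }


def duplicated_libs(wheel_paths):
    """Map each library name found in more than one wheel to those wheels."""
    owners = defaultdict(list)
    for path in wheel_paths:
        for lib in sorted(shared_libs(path)):
            owners[lib].append(str(path))
    return {lib: paths for lib, paths in owners.items() if len(paths) > 1}
```

Running `duplicated_libs` over the core wheel and a child wheel would list every library that both of them bundle.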

Reporter: Wes McKinney / @wesm

PRs and other links:

Note: This issue was originally created as ARROW-8518. Please see the migration documentation for further details.

@amol-
Member

amol- commented Feb 1, 2024

See https://github.com/amol-/consolidatewheels, which was created for the purpose of splitting wheels in Arrow. It still requires real-world testing on more complex cases than what https://github.com/amol-/wheeldeps can verify.
