fix(python): exclude mimalloc from Python wheel to avoid macOS segfault #1769
Open
andygrove wants to merge 1 commit into
On macOS, libmimalloc installs a static constructor that registers
itself as a malloc zone, so every malloc in the process (including
Python's PyObject_Malloc) is intercepted. The Python extension wheel
unintentionally linked libmimalloc through two paths:
  pyballista -> ballista (default = standalone) -> ballista-executor
    (default features include mimalloc)
  pyballista -> datafusion-python
    (default features include mimalloc)
The first allocation Python attempts after loading the extension
recurses inside mi_heap_malloc_zero_aligned_at, blowing the main
thread stack (lldb shows >100k frames of the same function before
EXC_BAD_ACCESS). Linux is unaffected because libmimalloc does not
auto-interpose there; the linked code is dead unless declared as
#[global_allocator], which the wheel never does.
Cut both paths:
* ballista/client: depend on ballista-executor and ballista-scheduler
  with default-features = false. The standalone feature only needs
  the in-process constructors and (optionally) arrow-ipc-optimizations.
* ballista/executor: move mimalloc from default features into the
  build-binary feature, since the global allocator is only installed
  in src/bin/main.rs.
* python: depend on datafusion-python with default-features = false.
After the fix, `cargo tree -p pyballista --invert mimalloc` reports
no match, and a locally-built cp310-abi3 wheel constructs and starts
BallistaScheduler/BallistaExecutor on macOS arm64 without crashing.
The ballista-executor binary still pulls in mimalloc via its default
features (build-binary).
Member
Author
@kevinjqliu fyi - thanks for catching this!
milenkovicm
approved these changes
May 25, 2026
Contributor
milenkovicm
left a comment
Makes sense, thanks @andygrove
kevinjqliu
approved these changes
May 25, 2026
 datafusion = { version = "53", features = ["avro"] }
 datafusion-proto = { version = "53" }
-datafusion-python = { version = "53" }
+datafusion-python = { version = "53", default-features = false }
Contributor
Interesting that datafusion-python doesn't run into this issue while still using mimalloc.
This is a good temporary fix, but we might want to explore other options too. Someone might want to turn on default-features later on.
Which issue does this PR close?
Closes #.
Rationale for this change
The published `ballista==53.0.0` cp310-abi3 macOS wheel on TestPyPI segfaults the moment Python constructs a `BallistaScheduler` or `BallistaExecutor`.
The same exact API works on Linux (manylinux x86_64 wheel via docker).
Root cause
LLDB shows ~104,500 frames of the same symbol at offset `+696` before `_PyType_call`. Disassembly at that offset is a `bl` back to the function's own entry (a self-call), and the binary's strings include `mimalloc: warning:`, `aligned allocation request is too large (size %zu, alignment %zu)`, etc. The crash is inside `libmimalloc` recursing during the first `malloc` Python issues after the extension loads.
On macOS, `libmimalloc` installs a static constructor that registers itself as a malloc zone, so every `malloc` in the process (including Python's `PyObject_Malloc`) is intercepted. On Linux that auto-interposition does not happen, so the linked mimalloc code is dead unless explicitly declared as `#[global_allocator]`, which the Python wheel never does. Hence the platform asymmetry.
`mimalloc` was reaching the wheel through two independent paths:
- `pyballista -> ballista (default = standalone) -> ballista-executor` (default features include mimalloc)
- `pyballista -> datafusion-python` (default features include mimalloc)
Cutting either alone is not enough; both must be cut.
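To illustrate the Linux half of that asymmetry: in Rust, an allocator takes over the program's Rust allocations only when a `#[global_allocator]` static names it. The sketch below is not Ballista or mimalloc code. It uses a hypothetical `CountingAllocator` wrapper around the standard `System` allocator (both the wrapper and `allocation_count` are names invented for illustration) to show that mechanism with the standard library alone. It does not model the macOS malloc-zone registration, which happens in C outside of Rust's allocator hook.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: counts allocations, then defers to the system allocator.
struct CountingAllocator;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Without this attribute the wrapper above would be linked but never called.
#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Number of allocations routed through `CountingAllocator` so far.
fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}

fn main() {
    let before = allocation_count();
    // `black_box` keeps the optimizer from eliding the heap allocation.
    let boxed = std::hint::black_box(Box::new([0u8; 64]));
    let after = allocation_count();
    println!("allocations observed: {}", after - before);
    drop(boxed);
}
```

Removing the `#[global_allocator]` static leaves `CountingAllocator` compiled in but unused, which is the situation the Linux wheel was in with mimalloc.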
What changes are included in this PR?
- `ballista/client/Cargo.toml`: depend on `ballista-executor` and `ballista-scheduler` with `default-features = false`. The `standalone` feature only needs the in-process constructors. `arrow-ipc-optimizations` is re-enabled on the executor dep so the in-process executor keeps its IPC read perf optimization.
- `ballista/executor/Cargo.toml`: move `mimalloc` from the crate's default feature set into the `build-binary` feature, since `#[global_allocator]` is only set in `src/bin/main.rs`. The `ballista-executor` binary still pulls in mimalloc via `cargo build` defaults.
- `python/Cargo.toml`: depend on `datafusion-python` with `default-features = false`.
- `python/Cargo.lock`: regenerated.
After the change, `cargo tree -p pyballista --invert mimalloc` returns no match. A locally rebuilt cp310-abi3 macOS wheel constructs and starts both `BallistaScheduler` and `BallistaExecutor`, runs `CREATE EXTERNAL TABLE` + `SELECT COUNT(*)` end-to-end, and exits cleanly. `cargo check` passes for default, `--no-default-features`, `ballista` with `--features standalone`, and the `ballista-executor` binary build.
Are there any user-facing changes?
No public API change. The Python wheel will no longer link `libmimalloc`, so memory allocation in the Python process is unchanged from system `malloc`. Users running `ballista-executor` as a binary continue to use mimalloc as the global allocator.
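One way a binary-only allocator can be kept out of library consumers is to gate the `#[global_allocator]` declaration behind a Cargo feature. The sketch below assumes a feature named `mimalloc` and the `mimalloc` crate's `MiMalloc` type. The actual contents of `src/bin/main.rs` are not shown in this PR and may differ, and `allocator_name` is a hypothetical helper added only so the effect is observable.

```rust
// Assumption: the crate declares an optional `mimalloc` dependency and a
// feature of the same name. With the feature off, this item is compiled out
// and the `mimalloc` crate is never referenced.
#[cfg(feature = "mimalloc")]
#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

/// Hypothetical helper: reports which allocator this build selected.
fn allocator_name() -> &'static str {
    if cfg!(feature = "mimalloc") {
        "mimalloc"
    } else {
        "system"
    }
}

fn main() {
    let buffer: Vec<u8> = vec![0; 1024];
    println!(
        "allocated {} bytes with the {} allocator",
        buffer.len(),
        allocator_name()
    );
}
```

Built without the feature, as a library dependency with `default-features = false` would be, the program falls back to the system allocator.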