fix(python): ignore ballista-namespaced cluster_config keys locally #1613
Merged
andygrove merged 1 commit into apache:main on Apr 28, 2026
The local DataFusion SessionConfig built inside BallistaSessionContext does not register the BallistaConfig extension, so applying any ballista.* key panics with 'Could not find config namespace ballista'. These keys are only meaningful on the scheduler-side session and were never intended to be set on the local config. Skip ballista.* keys in the local apply loop. They are still preserved on the context and forwarded to the scheduler via cluster_config.
milenkovicm
approved these changes
Apr 28, 2026
Which issue does this PR close?
Closes #1607
Rationale for this change
BallistaSessionContext accepts a cluster_config dict that is intended to be forwarded to the scheduler-side session (e.g. ballista.shuffle.sort_based.enabled). The constructor also applies every entry to the local DataFusion SessionConfig so that DataFusion-namespaced keys (e.g. datafusion.execution.target_partitions) are honored during local table registration and planning.
The problem is that the local SessionConfig built inside BallistaSessionContext does not have the BallistaConfig extension registered, so any ballista.* key fails with the error 'Could not find config namespace ballista'.
This makes the documented cluster_config parameter effectively unusable for any Ballista-specific tuning from Python.
What changes are included in this PR?
python/python/ballista/extension.py: skip ballista.* keys in the local-apply loop. The full dict is still stored on the context and forwarded to the scheduler via create_ballista_data_frame, so scheduler-side behavior is unchanged.
python/python/tests/test_context.py: new test_cluster_config_accepts_ballista_namespaced_keys covering both a DataFusion key and a Ballista key in the same cluster_config and asserting that context construction and a trivial query both succeed.
Are there any user-facing changes?
Yes, but only in the sense that a previously panicking call now works. No public API shape changes.
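For readers who want to see the shape of the fix without opening the diff, here is a minimal sketch of the skip-in-the-local-loop idea. It is not the code from extension.py: LocalConfigStandIn, apply_local_config, and BALLISTA_PREFIX are hypothetical names introduced for illustration, and the stand-in class only imitates a config object that rejects namespaces it does not know about.

```python
from typing import Dict, Mapping

# Assumption: the scheduler-only namespace is identified by this key prefix.
BALLISTA_PREFIX = "ballista."


class LocalConfigStandIn:
    """Hypothetical stand-in for the local DataFusion SessionConfig.

    It accepts only the "datafusion" namespace, imitating a config that
    does not have the BallistaConfig extension registered.
    """

    def __init__(self) -> None:
        self.values: Dict[str, str] = {}

    def set(self, key: str, value: str) -> "LocalConfigStandIn":
        namespace = key.split(".", 1)[0]
        if namespace != "datafusion":
            raise ValueError(f"Could not find config namespace {namespace}")
        self.values[key] = value
        return self


def apply_local_config(config, cluster_config: Mapping[str, str]):
    """Apply only the entries that the local config can understand."""
    for key, value in cluster_config.items():
        if key.startswith(BALLISTA_PREFIX):
            # Scheduler-only key: skip it locally. The caller's dict is not
            # modified, so the full mapping can still be forwarded.
            continue
        config = config.set(key, value)
    return config


cluster_config = {
    "datafusion.execution.target_partitions": "4",
    "ballista.shuffle.sort_based.enabled": "true",
}
local = apply_local_config(LocalConfigStandIn(), cluster_config)
print(sorted(local.values))
```

The point of the design is that filtering happens only at the local-apply step: the original cluster_config mapping is left intact, so whatever forwards it to the scheduler still sees the ballista.* entries.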