Ballista 53.0.0 blog post #188

Merged: andygrove merged 3 commits into main from site/ballista-53 on May 25, 2026

@andygrove andygrove commented May 24, 2026

Summary

Staged version: https://datafusion.staged.apache.org/blog/2026/05/24/datafusion-ballista-53.0.0/

  • Adds release blog post for Apache DataFusion Ballista 53.0.0
  • Covers changes since 43.0.0 (production deployment, shuffle subsystem, REST API, Python interface, Spark/Substrait)
  • Covers 53.0.0-specific highlights (TUI, experimental AQE, EXPLAIN ANALYZE, sort-merge join default, sort-based shuffle default)
  • Covers active work and roadmap based on open issues and EPICs
  • Notes that Python wheels in 53.0.0 still report 52.0.0 and that 53-line wheels will land with 53.1.0 on PyPI

Tracking issue: apache/datafusion-ballista#1762

Test plan

  • Render locally with make and inspect formatting / image references
  • Confirm link to the previous Ballista 43.0.0 post resolves

@andygrove andygrove marked this pull request as draft May 24, 2026 14:27
- Per-executor system and process metrics are reported, and Prometheus metrics integration is available
behind a feature flag.

### A new Python interface

Contributor

Perhaps we should mention the work with the datafusion-python team to improve this integration?

Contributor

milenkovicm left a comment

A few minor comments; good overview.
I'm very happy with this release.

[Apache DataFusion]: https://datafusion.apache.org

The last Ballista blog post covered [43.0.0], released in January 2025. In the year and a bit since, the
project has quietly shipped a release for every DataFusion release: 44, 45, 46, 47, 48, 49, 50, 51, 52, and

Contributor

Do we need to mention every release by number? Each of them is dear to my heart, but it might be a bit excessive 😀

Spark-compatible SQL semantics on a Ballista cluster.

The scheduler also has a Substrait surface: `SubstraitSchedulerClient` accepts Substrait logical plans, and
the deprecated SQL-string submission path has been removed. This is an important step toward decoupling

Contributor

SQL string submission was a very small piece of functionality to mention here; maybe we do not need it.

Plan rendering, including a graph view, is available directly from the TUI:

<img
src="/blog/images/datafusion-ballista-53.0.0/tui-job-plan-graph-popup.png"

Contributor

I believe an image with the text plan might be better than the current graph one.

Member Author

Can you point me to an appropriate image to use?

Member Author

Keeping the graph image in this revision — happy to add a text-plan screenshot in a follow-up if you have one you'd like to use.

Contributor

we have

but it's very busy.

We can create two

[image] [image]

Member Author

Swapped in the two text-plan screenshots you posted (logical + physical) in place of the graph image. Thanks for grabbing them.

## Thank You

This release is the result of work from many contributors over the past 16 months. Thanks especially to
Marko Milenković, Martin Grigorov, Daniel Tu, Alexander Domenti, Mete Genez, Saj, Harrison Crosse, and

Contributor

Mete Genez => Metehan Yildirim
Saj => Sajeevan Achuthan


This release is the result of work from many contributors over the past 16 months. Thanks especially to
Marko Milenković, Martin Grigorov, Daniel Tu, Alexander Domenti, Mete Genez, Saj, Harrison Crosse, and
many others whose contributions are visible in the [changelog]. Thanks also to the broader DataFusion

Contributor

You've forgotten to put your name, Andy

Member Author

Claude doesn't value my contributions to this release, apparently!

Contributor

kevinjqliu left a comment

LGTM!

maybe mention the new logo! it looks great

Comment thread content/blog/2026-05-24-datafusion-ballista-53.0.0.md Outdated

- **S3 object store support** has been added to both the executor and scheduler binaries, including
credentials derived from the standard AWS environment, instance metadata, and explicit configuration.
- **Docker images** for the scheduler and executor are now published on each release, making Docker Compose

### REST API and observability

The scheduler's REST API has grown from a small status surface to the primary control plane for inspecting

Comment on lines +122 to +125
The release process has also been extended so that future Ballista releases will publish Python wheels to
[PyPI] as `ballista`. Note that the Python bindings included in **53.0.0 still report version 52.0.0**
because the version bump landed shortly after the 53.0.0 release candidate was tagged. Wheels matching
the 53 line will be published with **53.1.0**, which is expected to follow shortly.

Contributor

Is this still the case? The TestPyPI package reports 53.0.0:

uv run --python 3.10 --with "ballista==53.0.0" \
    --index-url https://test.pypi.org/simple/ \
    --extra-index-url https://pypi.org/simple/ \
    --index-strategy unsafe-best-match \
    python -u -c "import ballista; print(ballista.__version__)"

Member Author

Yeah, this was fixed in 53.0.0-rc1-testpypi3 but was incorrect in 53.0.0-rc1. Maybe I should just skip this note.

alt="Ballista TUI plan graph popup"
/>

A web rendering of the TUI is in development.

Member

We are very optimistic here! 😄
The web version of the TUI is functional but it is very ugly! Ratzilla produces a ton of empty spans (`<span>&nbsp;</span>`) to render empty "pixels".

While writing the above I realized that the WebTUI could look much better if it was rendered in a `<canvas>` HTML element, and I went to suggest this to the Ratzilla project, but obviously someone else had already realized this before me! 😄

[image]

Now I just need to make a different banner for the WebTUI build!

Co-authored-by: Kevin Liu <kevinjqliu@users.noreply.github.com>
@andygrove andygrove marked this pull request as ready for review May 25, 2026 14:09

@andygrove
Member Author

> LGTM!
>
> maybe mention the new logo! it looks great

We already announced the new logo in the previous blog post: https://datafusion.apache.org/blog/output/2025/02/02/datafusion-ballista-43.0.0/

- Tighten release-numbers paragraph
- Link Docker images and REST API mentions to the user guide
- Acknowledge collaboration with the datafusion-python team
- Drop wheel-version 52.0.0 caveat (wheels now report 53.0.0)
- Remove mention of the deprecated SQL-string submission path
- Replace TUI graph image with logical and physical text-plan screenshots
- Fix contributor names and add Andy Grove to the Thank You section

@andygrove
Member Author

Thanks for the reviews @milenkovicm @kevinjqliu @martin-g. I will go ahead and merge. Feel free to create follow-on PRs if you spot any issues once this is live.

@andygrove
Member Author

@milenkovicm I need an approval before I can merge - could you oblige?

@milenkovicm
Contributor

Thanks a lot @andygrove

@andygrove andygrove merged commit c3f829f into main May 25, 2026
4 checks passed
@andygrove andygrove deleted the site/ballista-53 branch May 25, 2026 14:19

@andygrove
Member Author

Post is live at https://datafusion.apache.org/blog/output/2026/05/24/datafusion-ballista-53.0.0/

4 participants