v0.1.0
First release of spark-connect, a pure-Ruby client for Apache Spark Connect.
Highlights:
- PySpark-style DataFrame API (select/filter/join/group_by/agg/window/SQL/...)
- Column expressions and a broad function library (SparkConnect::F)
- DataFrameReader/Writer (CSV, JSON, Parquet, ORC, JDBC, tables) + v2 writer
- Catalog, runtime config, observations, full Spark SQL type system
- Apache Arrow result decoding over a resilient gRPC client
- Targets the Spark Connect 4.0 protocol (works with 3.5+ servers)
Documentation: https://hyukjinkwon.github.io/spark-connect-ruby/
See CHANGELOG.md for the full list.
Full Changelog: https://github.com/HyukjinKwon/spark-connect-ruby/commits/v0.1.0