v0.1.0

@HyukjinKwon released this 10 Jun 09:35
· 29 commits to main since this release

First release of spark-connect, a pure-Ruby client for Apache Spark Connect.

Highlights:

  • PySpark-style DataFrame API (select/filter/join/group_by/agg/window/SQL/...)
  • Column expressions and a broad function library (SparkConnect::F)
  • DataFrameReader/Writer (CSV, JSON, Parquet, ORC, JDBC, tables) + v2 writer
  • Catalog, runtime config, observations, full Spark SQL type system
  • Apache Arrow result decoding over a resilient gRPC client
  • Targets the Spark Connect 4.0 protocol (works with 3.5+ servers)

Documentation: https://hyukjinkwon.github.io/spark-connect-ruby/

See CHANGELOG.md for the full list of changes.

Full Changelog: https://github.com/HyukjinKwon/spark-connect-ruby/commits/v0.1.0