Rumble 1.9.0 "Ficus Bonsai" beta

Pre-release
@ghislainfourny released this 28 Oct 15:41
  • Left-outer equi-joins with let clauses: if you have two large tabular datasets, Rumble can quickly nest one into the other with just a few lines of code.
  • Inner equi-joins and generic joins with where clauses are detected.
  • Renamed --result-size to --materialization-size to avoid confusion, and added more hints about using --output-path to get the complete output of a parallel query.
  • New CLI options --output-format and --output-format-option:* for writing structured output in formats other than JSON (Parquet, CSV...).
  • New CLI option --number-of-output-partitions to repartition the output as desired.
  • New function local-text-file() to read a file as a sequence of string items, but without Spark parallelism (streaming instead). This makes Rumble faster for smaller files.
  • Performance improvements for FLWOR queries on structured data (Avro, Parquet, structured JSON, CSV...).
  • Performance improvement for queries that do not use parallelism at all.
  • Stability improvement for json-doc(), which will now also work after json-file() has been used.
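
As an illustration of the left-outer equi-join with a let clause mentioned above, here is a minimal sketch. The file names (customers.json, orders.json) and the field names (id, cid, name) are hypothetical, and the exact query shapes that the join detection recognizes are not spelled out in these notes, so treat this as one plausible form rather than the definitive pattern.

```jsoniq
(: Hypothetical inputs: customers.json and orders.json are JSON Lines files. :)
(: For each customer, collect the matching orders; customers without any    :)
(: order are kept with an empty array, hence "left-outer".                  :)
for $customer in json-file("customers.json")
let $orders := (
  for $order in json-file("orders.json")
  where $order.cid eq $customer.id  (: the equality condition of the equi-join :)
  return $order
)
return {
  "customer": $customer.name,
  "orders": [ $orders ]
}
```

For large inputs, such a query would presumably be run with --output-path so that the complete result is written out instead of being cut off at the materialization size.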