Rumble 1.9.0 "Ficus Bonsai" beta
Pre-release
ghislainfourny released this 28 Oct 15:41
- Left-outer equi-joins with let clauses: if you have two large tabular datasets, Rumble can nest one into the other with just a few lines of code, and fast.
- Inner equi-joins and generic joins with where clauses are detected.
- Renamed --result-size to --materialization-size to avoid confusion, and added more hints about using --output-path to get the complete output of a parallel query.
- New CLI options --output-format and --output-format-option:* for writing structured output in formats other than JSON (Parquet, CSV...).
- New CLI option --number-of-output-partitions to repartition the output as desired.
- New function local-text-file() to read a file as a sequence of string items without Spark parallelism (the file is streamed instead). This makes Rumble faster for smaller files.
- Performance improvements for FLWOR queries on structured data (Avro, Parquet, structured JSON, CSV...).
- Performance improvement for queries that do not use parallelism at all.
- Stability improvement for json-doc(), which will now also work after json-file() has been used.
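
To make the first item more concrete, here is a minimal Python sketch of what "nesting one dataset into the other" means for a left-outer equi-join. It only illustrates the semantics; it is not Rumble's implementation and not JSONiq syntax. The function name, field names, and sample data are all hypothetical.

```python
from collections import defaultdict


def nest_left_outer(left, right, left_key, right_key, into):
    """Nest matching right rows into each left row (left-outer equi-join).

    Hypothetical helper for illustration only; not part of Rumble.
    """
    # Index the right side by its join key once, so that each left row
    # needs only a dictionary lookup instead of a scan of the right side.
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)

    # Left-outer: every left row is kept; rows without a match get [].
    return [{**row, into: index.get(row[left_key], [])} for row in left]


# Hypothetical sample data.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
orders = [
    {"customer_id": 1, "item": "pen"},
    {"customer_id": 1, "item": "ink"},
]

nested = nest_left_outer(customers, orders, "id", "customer_id", "orders")
```

In this sketch the customer with id 1 ends up with both orders nested under "orders", while the customer with id 2 is kept with an empty list, which is the left-outer behaviour described above.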