Release Sparksoniq 0.9.5 "Larch" · RumbleDB/rumble

New alpha release of Sparksoniq, a JSONiq engine for querying large-scale JSON datasets stored on HDFS, with Spark under the hood.

New:

Many bugfixes
All FLWOR clauses are now supported locally, that is, when parallelize() or json-file() is not used. Locally means without invoking Spark transformations. Local FLWOR expressions can execute on the client, but also within a transformation triggered by a non-local FLWOR.
Local FLWOR expressions can be fully nested. All queries in the tutorial now work, and you can use and abuse let clauses.
Pushdowns: navigation expressions such as json-file("file.json").foo[].bar[[2]].foobar now work on Spark.
Significant improvements in memory footprint: some queries are no longer materialized in memory (e.g., a filtering query with a where clause, or a count).
Significant improvements in performance: a file of 16,000,000 objects was successfully tested for count, filtering, grouping and ordering with a local Spark execution on a single laptop. Performance also improved on bigger datasets on clusters.
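
As an illustration of the local FLWOR support described above, here is a minimal sketch of a query that uses neither parallelize() nor json-file(), so it should run without invoking any Spark transformation. It nests one FLWOR inside a let clause and combines let, where, and order by. The input sequence and the variable names ($x, $square, $others) are made up for this example, and the query is written against the JSONiq specification rather than taken from the tutorial.

```jsoniq
(: Local FLWOR: the input is an in-memory sequence, not a Spark-backed one. :)
for $x in (1, 2, 3, 4, 5)
let $square := $x * $x
(: A nested FLWOR bound by a let clause: every other item of the sequence. :)
let $others := (
  for $y in (1, 2, 3, 4, 5)
  where $y ne $x
  return $y
)
where $square gt 4
order by $square descending
return { "x": $x, "square": $square, "others": count($others) }
```

A non-local FLWOR can in turn contain a local one. The sketch below reuses the file name and field names from the pushdown example above, which are placeholders; it assumes each object in file.json has a foo array whose members carry a bar array. The outer for clause iterates over json-file(), while the inner FLWOR is an ordinary local expression evaluated for each object.

```jsoniq
(: Outer FLWOR is non-local because it iterates over json-file(). :)
for $o in json-file("file.json")
(: Inner FLWOR is local: it only navigates within the current object $o. :)
let $values := (
  for $b in $o.foo[]
  return $b.bar[[2]].foobar
)
where count($values) gt 0
return $values
```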

The jar file with ANTLR 4.7 is to be used with Spark 2.3+. For older Spark versions (2.0+), use the jar file with ANTLR 4.5.3.
