Releases: apache/datasketches-java

Apache Release 6.0.0

27 Apr 01:41
  • New: quantiles T-Digest sketch
  • New: BloomFilter
  • New: Exact and Bounded Sampling Proportional to Size (EB-PPS) sketch
  • Added Weighted Inputs to quantiles KllFloatsSketch, KllDoublesSketch and KllItemsSketch
  • Added Vector Inputs to quantiles KllFloatsSketch and KllDoublesSketch
  • Enhanced quantiles Sorted Views for KLL and Classic quantiles sketches.
  • Enhanced Partitioning API.
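
To make the "Weighted Inputs" item concrete, here is a minimal sketch of what a weighted quantile means, computed exactly with the standard library only. It does not use the DataSketches API: the class `WeightedQuantileDemo` and its method are hypothetical names, and the semantics shown (an item with weight w counts as if it had been inserted w times, with an inclusive rank convention) are an assumption about the intent of the feature, not a description of the sketch's internals.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of the DataSketches library.
final class WeightedQuantileDemo {

  // Exact reference computation: returns the smallest value whose cumulative
  // weight reaches rank * totalWeight (inclusive rank convention).
  static double weightedQuantile(double[] values, long[] weights, double rank) {
    if (values.length == 0 || values.length != weights.length) {
      throw new IllegalArgumentException("values and weights must be non-empty and equal length");
    }
    // Sort indices by value so each weight stays attached to its value.
    Integer[] order = new Integer[values.length];
    for (int i = 0; i < order.length; i++) { order[i] = i; }
    Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));

    long total = 0;
    for (long w : weights) { total += w; }
    double target = rank * total;

    long cumulative = 0;
    for (int idx : order) {
      cumulative += weights[idx];
      if (cumulative >= target) { return values[idx]; }
    }
    return values[order[order.length - 1]];
  }

  public static void main(String[] args) {
    double[] values = {10.0, 20.0, 30.0};
    // Weight 8 on the value 30.0 pulls the median up to 30.0.
    System.out.println(weightedQuantile(values, new long[] {1, 1, 8}, 0.5));
    // With equal weights the median is the middle value, 20.0.
    System.out.println(weightedQuantile(values, new long[] {1, 1, 1}, 0.5));
  }
}
```

A sketch that accepts weights directly avoids the cost of repeating the update call w times, which matters when weights are large.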

5.0.2

26 Mar 22:21
ba29502

This is a PATCH release. No new functionality has been introduced. There are a number of changes stemming from two issues:

  • Issue 527: Properly use the comparator for sorting level 0 in the KllItemsSketch
  • A new version of SpotBugs created a number of potential security warnings around Finalizer Attacks. Having done our best to look into the matter, we do not believe sketches are meaningfully vulnerable -- any data in the sketches is already available via reflection and there are no methods with special conditional access. Regardless, we felt that good code hygiene meant that we should prioritize fixing any issues found.
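
As a rough illustration of the kind of problem Issue 527 describes, the following standard-library sketch shows how sorting a buffer of generic items by natural order can disagree with the caller-supplied comparator. The class `ComparatorSortDemo` and its method are hypothetical and are not taken from the KllItemsSketch source; the example only assumes that a generic sketch must order its items with the comparator it was given.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration; not code from KllItemsSketch.
final class ComparatorSortDemo {

  // Returns a sorted copy of the buffer using the supplied comparator,
  // leaving the original buffer untouched.
  static <T> T[] sortedCopy(T[] buffer, Comparator<? super T> comparator) {
    T[] copy = Arrays.copyOf(buffer, buffer.length);
    Arrays.sort(copy, comparator);
    return copy;
  }

  public static void main(String[] args) {
    String[] buffer = {"cherry", "Banana", "apple"};

    // Natural String order puts uppercase letters before lowercase ones.
    System.out.println(Arrays.toString(
        sortedCopy(buffer, Comparator.<String>naturalOrder())));
    // prints [Banana, apple, cherry]

    // The ordering the caller actually asked for.
    System.out.println(Arrays.toString(
        sortedCopy(buffer, String.CASE_INSENSITIVE_ORDER)));
    // prints [apple, Banana, cherry]
  }
}
```

If one part of a data structure sorts with the first ordering while the rest assumes the second, later merges and rank queries can see items out of order.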

Apache Release 5.0.1

03 Jan 00:37

5.0.1 fixed two issues:

  • PR 482: The HLL Union::toString() method, which prints out a simple diagnostic summary of the sketch, might change the internal state of the union. This was not intended.
  • PR 485: The KllItemsSketch<Boolean> was not serializing and deserializing the min and max values properly. This only affects the specific generic case of <Boolean>. This is a rather bizarre use case for quantiles -- but nonetheless it is fixed! :)

Apache Release 5.0.0

09 Dec 01:23
  • A new Example Partitioner Tool is usable in its own right for partitioning medium-sized data sets of up to about 1E9 items. The same algorithm could also be used in a parallel environment to partition data sets many orders of magnitude larger.
  • Extensive internal cleanup and a few API improvements, for example for consistency across the different quantile sketches. These API changes, although relatively minor, were the reason for moving to a major release.
  • Fixed an integer overflow bug caught by Karan Kumar (via Druid), where very large partitioning datasets using the classic quantiles DoublesSketch::getPartitionBoundaries() would fail.
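
For readers unfamiliar with this class of bug, here is a generic sketch of how index arithmetic on very large item counts can overflow a 32-bit `int`. It is not the actual code from DoublesSketch::getPartitionBoundaries(); the class `PartitionIndexDemo`, its methods, and the formula are hypothetical and chosen only to show the failure mode and the usual remedy of widening to `long`.

```java
// Hypothetical illustration; not the library's partitioning code.
final class PartitionIndexDemo {

  // Overflow-prone form: the product is computed in 32-bit int arithmetic
  // and wraps once numItems * partition exceeds Integer.MAX_VALUE.
  static int boundaryIndexInt(int numItems, int partition, int numPartitions) {
    return (numItems * partition) / numPartitions;
  }

  // Safer form: numItems is a long, so the multiplication is done in 64 bits.
  static long boundaryIndexLong(long numItems, int partition, int numPartitions) {
    return (numItems * partition) / numPartitions;
  }

  public static void main(String[] args) {
    int numItems = 1_000_000_000; // about 1E9 items

    // 3E9 does not fit in an int, so this prints a negative index.
    System.out.println(boundaryIndexInt(numItems, 3, 4));

    // prints 750000000
    System.out.println(boundaryIndexLong(numItems, 3, 4));
  }
}
```

Where an `int` result is genuinely required, `Math.multiplyExact` is another option, since it throws an `ArithmeticException` instead of wrapping silently.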

Apache DataSketches 4.2.0

07 Sep 18:48
  • Added a generic KLL quantile sketch.

Apache Release 4.1.0

14 Jun 17:00
  • This is a minor release that primarily fixed a problem where the Java KLL sketches could not read KLL images produced by C++.
  • In addition, it includes a number of code improvements that fix issues found by SpotBugs and CodeQL.
  • Documentation improvements, both in internal documentation and in the Javadocs.

Apache Release 4.0.0

18 May 19:55

Major new features and enhancements

- Quantile Sketches
    - The major APIs for all the quantile sketches now derive from interfaces common to all the quantile sketches. This makes it much easier for the user to move from one quantile sketch to another with only very minor API changes.
    - All the quantile sketches now have a "SortedView", which is iterable and makes analysis of the quantile distribution even easier.
- HLL Sketches
    - Major speed performance improvements for HLL union/merge operations.
    - Major improvements to the HLL Javadocs.
- Theta Sketches
    - The Theta sketch has been enhanced with an optional compress operation that makes the serialized theta sketch smaller.
- TestNG has been updated to version 7.5.1 (works with Java 8), which includes the Zip Slip Vulnerability fix.

Apache Release 3.3.0

06 Jun 20:56

This release removes support for allocating off-heap memory in Java 9 and Java 10.

Only LTS Java versions are now fully supported. Since Java 9 and Java 10 no
longer receive security patches, this release removes off-heap memory support
for these versions.

This applies to DataSketches-Memory when allocating memory-mapped files,
direct Memory, or direct ByteBuffers in these runtime versions.
Support for allocating Memory both on-heap and off-heap remains unaffected
for Java 8, Java 11, Java 12 and Java 13.

Apache Release 3.2.0

27 Apr 19:36
  • This release extends the quantiles KLL sketch to include both double and float types, as well as fully updatable operation off-heap in contiguous memory.

  • This release also adds two new methods to Theta Union, based on the discussion in Druid issue #12261:

    • getCurrentBytes(): This behaves similarly to the Theta Sketch method of the same name.
    • getMaxUnionBytes(): This behaves similarly to the static Sketches method of the same name, but it is non-static and returns the maximum stored bytes for this union given its nominal entries configuration.

Apache Release 3.1.0

26 Jan 23:41

This is primarily a maintenance release.
Highlights:

  • Fixed "Mikhail's Issue #368". This was a specific Theta set-operation corner case bug.
    • In addition, developed a standard model for all Theta and Tuple set-operation corner cases, implemented across all corner cases for Java (and C++).
  • Improvements in the clarity of the Javadocs, primarily for the Theta and Tuple sketches.
  • Fixed some parameter leakage cases related to LGTM warnings.
  • Improved variable naming to enhance code clarity.
  • Fixed warnings discovered by the latest SpotBugs.
  • Significant improvements and cleanup of ArrayOfDoubles Tuple Sketch code.