Releases: apache/datasketches-java

PriorToRefactor

20 Jan 18:00
f6c44f9
Pre-release

This is a fallback tag that marks the code as it was before the package refactoring from com.yahoo... to org.apache... It is not a release.

0.13.4 May 14, 2019: Fix Theta Direct Union Bug, alternative path.

14 May 19:26

Two different code paths could trigger the Direct Union bug. This release fixes the alternative, seldom-used path, which was not caught in the previous release.

0.13.3 May 8, 2019: Fix Theta Direct Union Bug

09 May 00:01

This release fixes a nasty bug that occurred when merging estimating sketches into a Direct Union.
Three lines of code were accidentally deleted between 0.13.0 and 0.13.1.
This bug existed only in releases 0.13.1 and 0.13.2.

0.13.2 Apr 25, 2019: Frequent Distinct Tuples Sketch

26 Apr 02:21

0.13.1 Apr 2, 2019: Fix Direct DoublesUnion Quantiles Bug

02 Apr 20:00
  • Bug fix for Quantiles Sketches

    • Environment: Using DoublesUnion in Direct (off-heap) mode.
    • Symptom 1: quantiles are out-of-order: q(0.99) < q(0.98)
    • Symptom 2: garbage values amongst otherwise normal quantile values: q(0.99) = 100, q(0.98) = 1E100, q(0.97) = 90.
  • Bug fix for Theta Sketches

    • Environment: using Union in Direct (off-heap) mode
    • Symptom: getEstimate() returns NaN.
      It requires an unusual set of circumstances to actually observe this.
  • Logic change for Theta Sketches

    • Empty sketches do not affect unions and can be ignored
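Both quantiles symptoms listed above break the same invariant: quantiles queried at ascending ranks must come back in non-decreasing order. A caller who suspects an affected release could apply a check along the following lines to the returned values. This is a minimal sketch on a plain array; the class and method names are hypothetical and nothing here calls the DataSketches API.

```java
// Hypothetical helper, not part of DataSketches.
final class QuantileSanityCheck {

  // Returns true when every value is a real number and the sequence never decreases.
  // The input is assumed to hold quantiles queried at ascending ranks,
  // for example q(0.97), q(0.98), q(0.99).
  static boolean isNonDecreasing(double[] quantiles) {
    for (int i = 0; i < quantiles.length; i++) {
      if (Double.isNaN(quantiles[i])) {
        return false; // NaN never compares as smaller, so test for it explicitly
      }
      if (i > 0 && quantiles[i] < quantiles[i - 1]) {
        return false; // out-of-order pair, as in q(0.99) < q(0.98)
      }
    }
    return true;
  }
}
```

With the values from Symptom 2 in rank order (90, 1E100, 100), the check fails because 100 follows 1E100, so a single garbage value is caught by the ordering test as well.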

0.13.0 Mar 14, 2019: Added new CPC Sketch

14 Mar 21:04
  • Added new CPC Sketch. This new sketch has better accuracy per unit of stored space than the HLL sketch.
  • Added a high-performance thread-safe version of Theta UpdateSketch for use in applications that require very high throughput
  • Added API calls for easier understanding of error in the Frequent Items sketches
  • Added more general ceiling and floor powers of X functions to sketches.Util.*
  • Optimized serialization of single item KLL sketches
  • Minor change to HLL API: getIterator() is renamed to iterator(), following Java convention.
  • Added xxHash() and a faster version of MurmurHash3 (v2).
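To make the "ceiling and floor powers of X" item concrete: the ceiling power of a base for a number n is the smallest power of that base that is at least n, and the floor power is the largest power that is at most n. The following is a minimal sketch of that idea for long integers. The class and method names are hypothetical and are not the signatures found in sketches.Util.

```java
// Hypothetical illustration, not the sketches.Util implementation.
final class PowerBounds {

  // Smallest power of base that is >= n. Assumes base >= 2 and n >= 1.
  static long ceilingPowerOf(long base, long n) {
    checkArgs(base, n);
    long p = 1;
    while (p < n) {
      if (p > Long.MAX_VALUE / base) {
        throw new ArithmeticException("result does not fit in a long");
      }
      p *= base;
    }
    return p;
  }

  // Largest power of base that is <= n. Assumes base >= 2 and n >= 1.
  static long floorPowerOf(long base, long n) {
    checkArgs(base, n);
    long p = 1;
    // p <= n / base is equivalent to p * base <= n, without risking overflow
    while (p <= n / base) {
      p *= base;
    }
    return p;
  }

  private static void checkArgs(long base, long n) {
    if (base < 2 || n < 1) {
      throw new IllegalArgumentException("requires base >= 2 and n >= 1");
    }
  }
}
```

For example, with base 2 and n = 5 the floor power is 4 and the ceiling power is 8; when n is already an exact power, both functions return n.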

0.12.0 Aug 7, 2018: Update POM to Memory 0.12.0, improves performance.

08 Aug 01:42
  • Updated to Memory 0.12.0, which will improve performance
  • Fixed handling of min and max values in KLL sketch merge
  • Minor API changes

0.11.1 Apr 20, 2018: Quantiles, KLL, Tuple, Fixes & Improvements

20 Apr 22:22
  • Quantiles sketch
    • fixed issue #195
    • added DoublesUnion.heapify() and DoublesUnion.wrap() methods
    • deprecated DoublesUnionBuilder.heapify() and DoublesUnionBuilder.wrap() methods
  • KLL sketch
    • methods to obtain rank error for both single-sided and double-sided queries
    • methods to compute parameter k given a target rank error
    • Javadoc improvements
  • Tuple sketch
    • added Filter

0.11.0 Mar 15, 2018: KLL quantiles sketch, tuple sketch API change and more

16 Mar 02:20
  • New KLL sketch: KllFloatsSketch:
    • This is a new quantiles sketch with better accuracy per stored bit than the original quantiles DoublesSketch. If you choose K for the KLL sketch so that its accuracy matches that of the DoublesSketch, K will be larger, but the space required will be much smaller. This sketch is tuned specifically for minimal space usage (near the theoretical optimum) and uses floats rather than doubles. On update, the new KLL sketch is a little faster than the original DoublesSketch, but it may be slower on merge. The KLL sketch currently has no generic version and no off-heap capability, both of which the DoublesSketch provides. Refer to the Javadocs for a link to the KLL theoretical paper.
  • Tuple:
    • generic sketch API change
      • removed the convention of requiring static methods with a certain signature; these methods are now based on a more visible API
      • added SummaryDeserializer
      • The need to serialize factories has been removed
      • removed getSummaries() method - use iterator instead
  • Theta:
    • added new SingleItemSketch - fast way to create sketches with a single input item
  • Original quantiles sketch enhancements:
    • added getRank() - faster than getCDF() with one split point
    • empty sketch returns null from getQuantiles(), getPMF() and getCDF()
    • empty sketch returns NaN from getQuantile(), getMinValue() and getMaxValue()
    • added Kolmogorov-Smirnov statistic between two quantiles sketches
    • fixed sorting using comparator in generic ItemsSketch
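For readers unfamiliar with the Kolmogorov-Smirnov statistic listed above: it is the largest absolute gap between two cumulative distribution functions. The sketch below computes it exactly on two small raw arrays rather than on quantiles sketches, so it only illustrates the definition. The class and method names are hypothetical and are not part of the DataSketches API.

```java
// Hypothetical illustration on raw data, not the DataSketches implementation.
final class KsStatisticDemo {

  // Exact empirical CDF: fraction of values in the sorted array that are <= x.
  static double ecdf(double[] sorted, double x) {
    int lo = 0;
    int hi = sorted.length;
    while (lo < hi) { // binary search for the first element greater than x
      int mid = (lo + hi) >>> 1;
      if (sorted[mid] <= x) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return (double) lo / sorted.length;
  }

  // Largest absolute difference between the two empirical CDFs.
  // Both inputs must be non-empty; they are copied, so callers' arrays are not reordered.
  static double ksStatistic(double[] a, double[] b) {
    if (a.length == 0 || b.length == 0) {
      throw new IllegalArgumentException("both samples must be non-empty");
    }
    double[] sa = a.clone();
    double[] sb = b.clone();
    java.util.Arrays.sort(sa);
    java.util.Arrays.sort(sb);
    double d = 0.0;
    // The CDFs are step functions, so the maximum gap occurs at a data point.
    for (double x : sa) {
      d = Math.max(d, Math.abs(ecdf(sa, x) - ecdf(sb, x)));
    }
    for (double x : sb) {
      d = Math.max(d, Math.abs(ecdf(sa, x) - ecdf(sb, x)));
    }
    return d;
  }
}
```

For the samples {1, 2, 3, 4} and {3, 4, 5, 6} the statistic is 0.5: at x = 2 half of the first sample has been seen and none of the second. A sketch-based version would compare approximate CDFs instead, so its result carries the rank error of both sketches.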

0.10.3 Oct 26, 2017: Theta backward compatibility

27 Oct 00:21

Theta sketch: As part of the resize factor serialization fix in version 0.10.2, a validation check was added. That check made it impossible to deserialize an UpdateSketch or Union that had been serialized using sketches-core-0.8.4 and above. This release addresses the issue.