Releases: apache/datasketches-java
PriorToRefactor
This is a fall-back tag prior to refactoring the code from com.yahoo... to org.apache... This is not a release.
0.13.4 May 14, 2019: Fix Theta Direct Union Bug, alternative path.
There were two different code paths that could trigger the Direct Union Bug. This release fixes the alternative, seldom-used path, which was not caught in the previous release.
0.13.3 May 8, 2019: Fix Theta Direct Union Bug
This release fixes a nasty bug that occurred when merging estimating sketches into a Direct Union.
Three lines of code were accidentally deleted between 0.13.0 and 0.13.1.
This bug existed only in releases 0.13.1 and 0.13.2.
0.13.2 Apr 25, 2019: Frequent Distinct Tuples Sketch
- Released the new Frequent Distinct Tuples Sketch.
- Minor updates to javadocs
- Minor bug fixes.
0.13.1 Apr 2, 2019: Fix Direct DoublesUnion Quantiles Bug
- Bug fix for Quantiles Sketches
  - Environment: Using DoublesUnion in Direct (off-heap) mode.
  - Symptom 1: quantiles are out-of-order: q(0.99) < q(0.98)
  - Symptom 2: garbage values amongst otherwise normal quantile values: q(0.99) = 100, q(0.98) = 1E100, q(0.97) = 90.
- Bug fix for Theta Sketches
  - Environment: using Union in Direct (off-heap) mode
  - Symptom: getEstimate() returns NaN. It requires an unusual set of circumstances to actually observe this.
- Logic change for Theta Sketches
  - Empty sketches do not affect unions and can be ignored
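
Both Quantiles symptoms listed above appear as an ordering violation when quantiles are requested for ascending ranks, so a caller could screen for them with a simple monotonicity check. The following is a minimal sketch using only the Java standard library. The class and method names (`QuantileSanityCheck`, `isNonDecreasing`) are hypothetical and not part of the library, and the input array stands in for whatever a quantiles query returned.

```java
/** Hypothetical helper, not part of the library. */
final class QuantileSanityCheck {
  private QuantileSanityCheck() {}

  /**
   * Returns true if the values, assumed to be quantiles for ascending ranks,
   * contain no NaN and never decrease from one rank to the next.
   */
  static boolean isNonDecreasing(double[] quantiles) {
    for (int i = 0; i < quantiles.length; i++) {
      if (Double.isNaN(quantiles[i])) {
        return false; // NaN is never a valid quantile here
      }
      if (i > 0 && quantiles[i] < quantiles[i - 1]) {
        return false; // e.g. q(0.99) < q(0.98)
      }
    }
    return true;
  }
}
```

With the values from Symptom 2, ordered by rank as `{90, 1E100, 100}`, the check fails because 100 follows 1E100. This is a necessary condition only: a garbage value at the highest requested rank would not be detected.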
0.13.0 Mar 14, 2019: Added new CPC Sketch
- Added new CPC Sketch. This new sketch has better accuracy per unit of stored space than the HLL sketch.
- Added a high-performance thread-safe version of Theta UpdateSketch for use in applications that require very high throughput
- Added API calls for easier understanding of error in the Frequent Items sketches
- Added more general ceiling and floor powers of X functions to sketches.Util.*
- Optimized serialization of single item KLL sketches
- Minor changes to HLL API: getIterator() is renamed to iterator(), following Java convention.
- Added xxHash() and faster version of MurmurHash3 (v2).
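
The ceiling and floor "powers of X" mentioned above generalize the usual power-of-2 rounding to an arbitrary base. The following is a minimal integer-only sketch of that idea, not the library's implementation. `PowerOfBase`, `ceilingPowerOf`, and `floorPowerOf` are hypothetical names chosen for illustration, and the actual signatures in sketches.Util may differ.

```java
/** Hypothetical helper illustrating ceiling/floor powers of an arbitrary base. */
final class PowerOfBase {
  private PowerOfBase() {}

  /** Smallest power of base that is greater than or equal to n. */
  static long ceilingPowerOf(long base, long n) {
    checkArgs(base, n);
    long p = 1;
    while (p < n) {
      // multiplyExact throws ArithmeticException instead of silently overflowing
      p = Math.multiplyExact(p, base);
    }
    return p;
  }

  /** Largest power of base that is less than or equal to n. */
  static long floorPowerOf(long base, long n) {
    checkArgs(base, n);
    long p = 1;
    // p <= n / base guarantees p * base <= n, so the product cannot overflow
    while (p <= n / base) {
      p *= base;
    }
    return p;
  }

  private static void checkArgs(long base, long n) {
    if (base < 2 || n < 1) {
      throw new IllegalArgumentException("require base >= 2 and n >= 1");
    }
  }
}
```

For example, `PowerOfBase.ceilingPowerOf(2, 5)` yields 8 and `PowerOfBase.floorPowerOf(10, 99)` yields 10.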
0.12.0 Aug 7, 2018: Update POM to Memory 0.12.0, improves performance.
- Updated to Memory 0.12.0, which will improve performance
- Fixed handling of min and max values in KLL sketch merge
- Minor API changes
0.11.1 Apr 20, 2018: Quantiles, KLL, Tuple, Fixes & Improvements
- Quantiles sketch
  - fixed issue #195
  - added DoublesUnion.heapify() and DoublesUnion.wrap() methods
  - deprecated DoublesUnionBuilder.heapify() and DoublesUnionBuilder.wrap() methods
- KLL sketch
  - methods to obtain rank error for both single-sided and double-sided queries
  - methods to compute parameter k given a target rank error
  - Javadoc improvements
- Tuple sketch
  - added Filter
0.11.0 Mar 15, 2018: KLL quantiles sketch, tuple sketch API change and more
- New KLL sketch: KllFloatsSketch
  - This is a new quantiles sketch with better accuracy per stored bit than the original quantiles DoublesSketch. If you select a value of K for the KLL sketch so that it matches the accuracy of the DoublesSketch, the K will be larger, but the space required will be much smaller. This sketch is specifically tuned for the smallest possible space usage (near the theoretical optimum) and uses floats rather than doubles. On update this new KLL sketch is a little faster than the original DoublesSketch, but may be slower on merge. Also, this KLL sketch currently does not have a generic version (as does the DoublesSketch), nor does it provide off-heap capability like the DoublesSketch. Refer to the javadocs for a link to the KLL theoretical paper.
- Tuple:
  - generic sketch API change
    - removed the convention to require static methods with a certain signature; these methods are now based on a more visible API
    - added SummaryDeserializer
    - The need to serialize factories has been removed
    - removed getSummaries() method - use iterator instead
- Theta:
  - added new SingleItemSketch
    - fast way to create sketches with a single input item
- Original quantiles sketch enhancements:
  - added getRank() - faster than getCDF() with one split point
  - empty sketch returns null from getQuantiles(), getPMF() and getCDF()
  - empty sketch returns NaN from getQuantile(), getMinValue() and getMaxValue()
  - Kolmogorov-Smirnov Statistic between two quantiles sketches
  - fixed sorting using comparator in generic ItemsSketch
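
For readers unfamiliar with the Kolmogorov-Smirnov statistic mentioned above, it is the largest absolute difference between two cumulative distribution functions. The following is a minimal sketch that computes it exactly from two raw samples using only the Java standard library. It is an illustration of the statistic itself, not the library's sketch-based implementation, which would presumably work from the approximate CDFs of the two sketches. The names `KsStatisticExample` and `ksStatistic` are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper computing the two-sample Kolmogorov-Smirnov statistic on raw data. */
final class KsStatisticExample {
  private KsStatisticExample() {}

  /** Returns max over x of |F_a(x) - F_b(x)| for the empirical CDFs of a and b. */
  static double ksStatistic(double[] a, double[] b) {
    if (a.length == 0 || b.length == 0) {
      throw new IllegalArgumentException("both samples must be non-empty");
    }
    double[] x = a.clone();
    double[] y = b.clone();
    Arrays.sort(x);
    Arrays.sort(y);
    int i = 0;
    int j = 0;
    double maxDiff = 0.0;
    // Walk both sorted samples; once one is exhausted its CDF is 1.0 and the
    // gap can only shrink, so the maximum has already been observed.
    while (i < x.length && j < y.length) {
      double v = Math.min(x[i], y[j]);
      while (i < x.length && x[i] <= v) { i++; } // i = count of x values <= v
      while (j < y.length && y[j] <= v) { j++; } // j = count of y values <= v
      double diff = Math.abs((double) i / x.length - (double) j / y.length);
      maxDiff = Math.max(maxDiff, diff);
    }
    return maxDiff;
  }
}
```

Two identical samples give 0.0, and two samples with no overlap in range give 1.0.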
0.10.3 Oct 26, 2017: Theta backward compatibility
Theta sketch: As part of the resize factor serialization fix in version 0.10.2, a validation check was added that made it impossible to deserialize an UpdateSketch or Union serialized with sketches-core-0.8.4 and above. This release addresses the issue.