When serializing a sketch we write a "compact" version of it, and deserializing reads that same representation back.
For List mode, this means the array of values is exactly as long as the list's current length, with no spare slots.
The update code appends a value by scanning the array until it reaches a SENTINEL slot (value 0) and writing the value there.
Because a compact list contains no SENTINEL slots, an update after deserialization is never able to add an element.
When the scan reaches the end of the array, we do not panic; we exit silently and the value is dropped.
After updating, we check whether the list needs to be upgraded to a set.
Minimal reproducing test:
#[test]
fn hll_serialize_deserialize_then_update() {
    const LG_K: u8 = 11;
    let mut sketch = HllSketch::new(LG_K, HllType::Hll4);
    sketch.update(1);

    // Serialize / deserialize round-trip
    let bytes = sketch.serialize();
    let mut sketch = HllSketch::deserialize(&bytes).unwrap();

    // This update is silently dropped when the list was restored in compact form
    sketch.update(2);

    // The sketch should have observed 2 distinct values
    let est = sketch.estimate();
    assert!(
        (est - 2.0).abs() < 0.1,
        "expected estimate close to 2, got {est}"
    );
}
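The failure mode described above can be illustrated in isolation with a small model of the list. This is only a sketch under stated assumptions, not the crate's actual code: `CouponList`, `CAPACITY`, `from_compact_buggy`, and `from_compact_padded` are hypothetical names, and `update` returns a `bool` purely so the silent drop becomes observable. Padding the array back to full capacity on deserialization is one possible fix, not necessarily the one adopted.

```rust
// Hypothetical stand-in for the sketch's List mode; 0 marks an empty slot.
// Coupons are assumed to be non-zero.
const EMPTY: u32 = 0;
const CAPACITY: usize = 8;

#[allow(dead_code)]
#[derive(Debug, Clone)]
struct CouponList {
    slots: Vec<u32>,
    len: usize,
}

#[allow(dead_code)]
impl CouponList {
    fn new() -> Self {
        CouponList { slots: vec![EMPTY; CAPACITY], len: 0 }
    }

    /// Compact form: only the occupied slots, as serialization would write them.
    fn to_compact(&self) -> Vec<u32> {
        self.slots[..self.len].to_vec()
    }

    /// Mirrors the reported bug: the backing array is exactly `len` long,
    /// so it contains no empty slot for a later update to land in.
    fn from_compact_buggy(values: &[u32]) -> Self {
        CouponList { slots: values.to_vec(), len: values.len() }
    }

    /// One possible fix: pad the array back up to full capacity.
    fn from_compact_padded(values: &[u32]) -> Self {
        let mut slots = values.to_vec();
        slots.resize(CAPACITY.max(values.len()), EMPTY);
        CouponList { slots, len: values.len() }
    }

    /// Scans for the coupon or the first empty slot.
    /// Returns false when the scan runs off the end without inserting.
    fn update(&mut self, coupon: u32) -> bool {
        for slot in self.slots.iter_mut() {
            if *slot == coupon {
                return true; // already present
            }
            if *slot == EMPTY {
                *slot = coupon;
                self.len += 1;
                return true;
            }
        }
        false
    }
}
```

With this model, a list restored through `from_compact_buggy` rejects every new coupon, while one restored through `from_compact_padded` accepts updates until it reaches `CAPACITY`, at which point the real sketch would upgrade the list to a set.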