bug upgrading a deserialized hll sketch #115

@fulmicoton

When serializing and deserializing a sketch, we create a "compact" version of the data sketch.
For List mode, this means the array of values is exactly as long as the list itself, with no spare slots.

The update code, for its part, attempts to append the value by scanning through the list of values
until it reaches a SENTINEL value (0), and inserts the new value there.

Because a compact list contains no SENTINEL slot, we are never able to add an element to a deserialized list.

When the end of the list is reached, we do not panic; the update exits silently and the value is dropped.
After updating, we check whether we need to upgrade the list to a set.
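
To make the failure mode concrete, here is a minimal, self-contained sketch of the mechanism described above. It is not the library's code: `CouponList`, `with_capacity`, `from_compact`, and `round_trip_then_update` are hypothetical names invented for illustration, and coupons are assumed to be non-zero so that 0 can act as the SENTINEL.

```rust
/// Hypothetical marker for an unused slot (value 0, as in the description).
const SENTINEL: u32 = 0;

/// Hypothetical stand-in for the List-mode storage; not the real type.
struct CouponList {
    slots: Vec<u32>,
}

impl CouponList {
    /// Freshly created list: every spare slot is pre-filled with SENTINEL.
    fn with_capacity(capacity: usize) -> Self {
        CouponList { slots: vec![SENTINEL; capacity] }
    }

    /// "Compact" list as rebuilt after a round-trip: one slot per stored
    /// value and no trailing SENTINEL slots.
    fn from_compact(values: &[u32]) -> Self {
        CouponList { slots: values.to_vec() }
    }

    /// Scan for the first SENTINEL slot and store the coupon there.
    /// Returns false when the end of the list is reached without finding
    /// a free slot, i.e. the coupon is dropped.
    fn update(&mut self, coupon: u32) -> bool {
        for slot in self.slots.iter_mut() {
            if *slot == coupon {
                return true; // already present
            }
            if *slot == SENTINEL {
                *slot = coupon;
                return true;
            }
        }
        false
    }

    /// Number of stored (non-SENTINEL) coupons.
    fn len(&self) -> usize {
        self.slots.iter().filter(|&&s| s != SENTINEL).count()
    }
}

/// Mimics the round-trip: store one coupon, compact, rebuild, update again.
/// Returns whether the second update was stored, and the resulting length.
fn round_trip_then_update() -> (bool, usize) {
    let mut original = CouponList::with_capacity(8);
    original.update(1);

    // Compaction keeps only the occupied slots.
    let compact: Vec<u32> = original
        .slots
        .iter()
        .copied()
        .filter(|&s| s != SENTINEL)
        .collect();

    let mut restored = CouponList::from_compact(&compact);
    let inserted = restored.update(2);
    (inserted, restored.len())
}
```

Under these assumptions, `round_trip_then_update()` returns `(false, 1)`: the second coupon is lost without any error. One possible remedy, also an assumption rather than the project's chosen fix, would be to re-pad the slot array with SENTINEL values on deserialization, or to grow it when the scan reaches the end.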

Minimal reproducing test.

    #[test]
    fn hll_serialize_deserialize_then_update() {
        const LG_K: u8 = 11;
        let mut sketch = HllSketch::new(LG_K, HllType::Hll4);

        sketch.update(1);

        // Serialize / Deserialize round-trip
        let bytes = sketch.serialize();
        let mut sketch = HllSketch::deserialize(&bytes).unwrap();
        sketch.update(2);

        // The sketch should have observed 2 distinct coupons
        let est = sketch.estimate();
        assert!(
            (est - 2.0).abs() < 0.1,
            "expected estimate close to 2, got {est}"
        );
    }
