@freakyzoidberg
Member

Switch the Murmur3 dependency to https://github.com/twmb/murmur3, as it yields better performance than the current one.

pkg: github.com/apache/datasketches-go/hll
                                  │      before.bench      │             after.bench             │
                                  │         sec/op         │   sec/op     vs base                │
HLLDataSketch/lgK4_HLL4_uint-10               13.365n ± 1%   9.237n ± 2%  -30.88% (p=0.000 n=10)
HLLDataSketch/lgK16_HLL4_uint-10              13.535n ± 1%   9.496n ± 1%  -29.84% (p=0.000 n=10)
HLLDataSketch/lgK21_HLL4_uint-10               18.96n ± 1%   13.04n ± 1%  -31.25% (p=0.000 n=10)
HLLDataSketch/lgK4_HLL6_uint-10               13.545n ± 1%   9.361n ± 0%  -30.89% (p=0.000 n=10)
HLLDataSketch/lgK16_HLL6_uint-10              13.560n ± 1%   9.504n ± 1%  -29.91% (p=0.000 n=10)
HLLDataSketch/lgK21_HLL6_uint-10               18.55n ± 0%   12.76n ± 1%  -31.23% (p=0.000 n=10)
HLLDataSketch/lgK4_HLL8_uint-10               12.595n ± 1%   8.440n ± 1%  -32.99% (p=0.000 n=10)
HLLDataSketch/lgK16_HLL8_uint-10              12.670n ± 1%   8.460n ± 1%  -33.23% (p=0.000 n=10)
HLLDataSketch/lgK21_HLL8_uint-10               17.34n ± 1%   11.28n ± 1%  -34.97% (p=0.000 n=10)
HLLDataSketch/lgK4_HLL4_slice-10              13.445n ± 0%   9.716n ± 1%  -27.74% (p=0.000 n=10)
HLLDataSketch/lgK16_HLL4_slice-10             13.670n ± 1%   9.731n ± 1%  -28.82% (p=0.000 n=10)
HLLDataSketch/lgK21_HLL4_slice-10              19.09n ± 0%   13.32n ± 3%  -30.21% (p=0.000 n=10)
HLLDataSketch/lgK4_HLL6_slice-10              13.570n ± 0%   9.803n ± 1%  -27.76% (p=0.000 n=10)
HLLDataSketch/lgK16_HLL6_slice-10             13.770n ± 1%   9.826n ± 0%  -28.64% (p=0.000 n=10)
HLLDataSketch/lgK21_HLL6_slice-10              18.85n ± 1%   12.81n ± 1%  -32.06% (p=0.000 n=10)
HLLDataSketch/lgK4_HLL8_slice-10              12.765n ± 1%   8.835n ± 8%  -30.79% (p=0.000 n=10)
HLLDataSketch/lgK16_HLL8_slice-10             12.880n ± 1%   8.870n ± 1%  -31.13% (p=0.000 n=10)
HLLDataSketch/lgK21_HLL8_slice-10              17.69n ± 1%   11.54n ± 1%  -34.72% (p=0.000 n=10)
HLLDataSketch/lgK4_HLL8_union-10               26.76n ± 1%   22.68n ± 1%  -15.21% (p=0.000 n=10)
HLLDataSketch/lgK16_HLL8_union-10              26.60n ± 1%   22.68n ± 1%  -14.76% (p=0.000 n=10)
HLLDataSketch/lgK21_HLL8_union-10              26.50n ± 1%   22.68n ± 0%  -14.43% (p=0.000 n=10)
geomean                                        16.10n        11.45n       -28.87%

@AlexanderSaydakov
Contributor

Have you tested compatibility with C++ and Java? Even the slightest difference in the hash function can make sketches incompatible (disjoint sets despite the same input).

@freakyzoidberg
Member Author

freakyzoidberg commented Dec 19, 2023

Checked a few hash outputs for equality.

I am actually thinking of checking the serialised output test data back in, to test for possible changes like this.

Anyhow, in this case, the crc32 checksums of the generated files match between both implementations:

❯ sdiff  <(crc32 ./go_generated_files/*) <(crc32 ./go_generated_files_old/*)
422a5893	./go_generated_files/hll4_n0_go.sk            | 422a5893	./go_generated_files_old/hll4_n0_go.sk
dcb6ae94	./go_generated_files/hll4_n1000000_go.sk      | dcb6ae94	./go_generated_files_old/hll4_n1000000_go.sk
415450fd	./go_generated_files/hll4_n100000_go.sk       | 415450fd	./go_generated_files_old/hll4_n100000_go.sk
9942a074	./go_generated_files/hll4_n10000_go.sk        | 9942a074	./go_generated_files_old/hll4_n10000_go.sk
44466b15	./go_generated_files/hll4_n1000_go.sk         | 44466b15	./go_generated_files_old/hll4_n1000_go.sk
cee02cc2	./go_generated_files/hll4_n100_go.sk          | cee02cc2	./go_generated_files_old/hll4_n100_go.sk
f6ac7c4e	./go_generated_files/hll4_n10_go.sk           | f6ac7c4e	./go_generated_files_old/hll4_n10_go.sk
f39fce1b	./go_generated_files/hll4_n1_go.sk            | f39fce1b	./go_generated_files_old/hll4_n1_go.sk
45479c8a	./go_generated_files/hll6_n0_go.sk            | 45479c8a	./go_generated_files_old/hll6_n0_go.sk
8f48e75f	./go_generated_files/hll6_n1000000_go.sk      | 8f48e75f	./go_generated_files_old/hll6_n1000000_go.sk
cf60fc77	./go_generated_files/hll6_n100000_go.sk       | cf60fc77	./go_generated_files_old/hll6_n100000_go.sk
2a06d1cb	./go_generated_files/hll6_n10000_go.sk        | 2a06d1cb	./go_generated_files_old/hll6_n10000_go.sk
96f9cd30	./go_generated_files/hll6_n1000_go.sk         | 96f9cd30	./go_generated_files_old/hll6_n1000_go.sk
b37ef4b7	./go_generated_files/hll6_n100_go.sk          | b37ef4b7	./go_generated_files_old/hll6_n100_go.sk
1567d7bd	./go_generated_files/hll6_n10_go.sk           | 1567d7bd	./go_generated_files_old/hll6_n10_go.sk
061f68db	./go_generated_files/hll6_n1_go.sk            | 061f68db	./go_generated_files_old/hll6_n1_go.sk
4cf1d0a1	./go_generated_files/hll8_n0_go.sk            | 4cf1d0a1	./go_generated_files_old/hll8_n0_go.sk
151be68f	./go_generated_files/hll8_n1000000_go.sk      | 151be68f	./go_generated_files_old/hll8_n1000000_go.sk
c1170afe	./go_generated_files/hll8_n100000_go.sk       | c1170afe	./go_generated_files_old/hll8_n100000_go.sk
3c8bcc28	./go_generated_files/hll8_n10000_go.sk        | 3c8bcc28	./go_generated_files_old/hll8_n10000_go.sk
935c33e5	./go_generated_files/hll8_n1000_go.sk         | 935c33e5	./go_generated_files_old/hll8_n1000_go.sk
35dd9c28	./go_generated_files/hll8_n100_go.sk          | 35dd9c28	./go_generated_files_old/hll8_n100_go.sk
ea4a2de9	./go_generated_files/hll8_n10_go.sk           | ea4a2de9	./go_generated_files_old/hll8_n10_go.sk
c3ef85da	./go_generated_files/hll8_n1_go.sk            | c3ef85da	./go_generated_files_old/hll8_n1_go.sk
❯ sdiff  <(crc32 ./go_generated_files/*) <(crc32 ./java_generated_files/*)
422a5893	./go_generated_files/hll4_n0_go.sk            | 422a5893	./java_generated_files/hll4_n0_java.sk
dcb6ae94	./go_generated_files/hll4_n1000000_go.sk      | dcb6ae94	./java_generated_files/hll4_n1000000_java.sk
415450fd	./go_generated_files/hll4_n100000_go.sk       | 415450fd	./java_generated_files/hll4_n100000_java.sk
9942a074	./go_generated_files/hll4_n10000_go.sk        | 9942a074	./java_generated_files/hll4_n10000_java.sk
44466b15	./go_generated_files/hll4_n1000_go.sk         | 44466b15	./java_generated_files/hll4_n1000_java.sk
cee02cc2	./go_generated_files/hll4_n100_go.sk          | cee02cc2	./java_generated_files/hll4_n100_java.sk
f6ac7c4e	./go_generated_files/hll4_n10_go.sk           | f6ac7c4e	./java_generated_files/hll4_n10_java.sk
f39fce1b	./go_generated_files/hll4_n1_go.sk            | f39fce1b	./java_generated_files/hll4_n1_java.sk
45479c8a	./go_generated_files/hll6_n0_go.sk            | 45479c8a	./java_generated_files/hll6_n0_java.sk
8f48e75f	./go_generated_files/hll6_n1000000_go.sk      | 8f48e75f	./java_generated_files/hll6_n1000000_java.sk
cf60fc77	./go_generated_files/hll6_n100000_go.sk       | cf60fc77	./java_generated_files/hll6_n100000_java.sk
2a06d1cb	./go_generated_files/hll6_n10000_go.sk        | 2a06d1cb	./java_generated_files/hll6_n10000_java.sk
96f9cd30	./go_generated_files/hll6_n1000_go.sk         | 96f9cd30	./java_generated_files/hll6_n1000_java.sk
b37ef4b7	./go_generated_files/hll6_n100_go.sk          | b37ef4b7	./java_generated_files/hll6_n100_java.sk
1567d7bd	./go_generated_files/hll6_n10_go.sk           | 1567d7bd	./java_generated_files/hll6_n10_java.sk
061f68db	./go_generated_files/hll6_n1_go.sk            | 061f68db	./java_generated_files/hll6_n1_java.sk
4cf1d0a1	./go_generated_files/hll8_n0_go.sk            | 4cf1d0a1	./java_generated_files/hll8_n0_java.sk
151be68f	./go_generated_files/hll8_n1000000_go.sk      | 151be68f	./java_generated_files/hll8_n1000000_java.sk
c1170afe	./go_generated_files/hll8_n100000_go.sk       | c1170afe	./java_generated_files/hll8_n100000_java.sk
3c8bcc28	./go_generated_files/hll8_n10000_go.sk        | 3c8bcc28	./java_generated_files/hll8_n10000_java.sk
935c33e5	./go_generated_files/hll8_n1000_go.sk         | 935c33e5	./java_generated_files/hll8_n1000_java.sk
35dd9c28	./go_generated_files/hll8_n100_go.sk          | 35dd9c28	./java_generated_files/hll8_n100_java.sk
ea4a2de9	./go_generated_files/hll8_n10_go.sk           | ea4a2de9	./java_generated_files/hll8_n10_java.sk
c3ef85da	./go_generated_files/hll8_n1_go.sk            | c3ef85da	./java_generated_files/hll8_n1_java.sk

@AlexanderSaydakov AlexanderSaydakov merged commit 284f40b into apache:main Dec 20, 2023
@freakyzoidberg freakyzoidberg deleted the murmur3-switch branch December 31, 2023 10:25