Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poor performance of IsSuperSet #148

Closed
nemars opened this issue Dec 8, 2023 · 1 comment
Closed

Poor performance of IsSuperSet #148

nemars opened this issue Dec 8, 2023 · 1 comment

Comments

@nemars
Copy link
Contributor

nemars commented Dec 8, 2023

Hi, I just started using this package and it is great. My use case relies heavily on Complement and IsSuperSet with ~30k bits. I noticed that IsSuperSet is the only method in the "family" (Difference, Union, Intersection, Complement) not utilizing bitwise operations on set []uint64 so I forked the repo and changed the implementation.

Is there a specific reason for that? If not, I'm proposing a following solution. I'm not sure about the contributions policy, so I didn't send a PR but I could if that's OK with you.

I added some tests to make sure I'm not breaking anything, and ran some benchmarks to make sure I'm not making performance worse. Looks like it is significantly worse (+75%) only in case of all zeros (both sets are just 10k zero bits). See details below:

go install golang.org/x/perf/cmd/benchstat@latest
go test -bench=BenchmarkIsSuperSet -count 6 | Tee-Object -FilePath old.txt
go test -bench=BenchmarkIsSuperSet -count 6 | Tee-Object -FilePath new.txt
benchstat old.txt new.txt
goos: windows
goarch: amd64
pkg: github.com/bits-and-blooms/bitset
cpu: Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz
                                                    │    old.txt     │               new.txt               │
                                                    │     sec/op     │    sec/op     vs base               │
IsSuperSet/equal,_len=1-6                                8.977n ± 1%    5.468n ± 2%  -39.08% (p=0.002 n=6)
IsSuperSet/equal,_len=1,_strict-6                        6.611n ± 0%    5.981n ± 1%   -9.52% (p=0.002 n=6)
IsSuperSet/equal,_len=10-6                              22.235n ± 1%    5.499n ± 1%  -75.27% (p=0.002 n=6)
IsSuperSet/equal,_len=10,_strict-6                       6.569n ± 1%    5.993n ± 0%   -8.78% (p=0.002 n=6)
IsSuperSet/equal,_len=100-6                            163.450n ± 2%    6.001n ± 0%  -96.33% (p=0.002 n=6)
IsSuperSet/equal,_len=100,_strict-6                      8.414n ± 1%    7.869n ± 1%   -6.48% (p=0.002 n=6)
IsSuperSet/equal,_len=1000-6                           1756.50n ± 0%    14.12n ± 1%  -99.20% (p=0.002 n=6)
IsSuperSet/equal,_len=1000,_strict-6                     23.67n ± 1%    23.64n ± 1%        ~ (p=0.937 n=6)
IsSuperSet/equal,_len=10000-6                          17295.0n ± 1%    103.2n ± 2%  -99.40% (p=0.002 n=6)
IsSuperSet/equal,_len=10000,_strict-6                    201.9n ± 0%    202.2n ± 2%        ~ (p=0.290 n=6)
IsSuperSet/equal,_len=100000-6                        171296.5n ± 0%    940.3n ± 1%  -99.45% (p=0.002 n=6)
IsSuperSet/equal,_len=100000,_strict-6                   1.853µ ± 0%    1.855µ ± 1%        ~ (p=0.394 n=6)
IsSuperSet/equal,_density=0.00-6                         59.94n ± 1%   105.15n ± 2%  +75.41% (p=0.002 n=6)
IsSuperSet/equal,_density=0.00,_strict-6                 201.8n ± 1%    203.7n ± 1%   +0.94% (p=0.045 n=6)
IsSuperSet/equal,_density=0.05-6                        2483.5n ± 2%    103.1n ± 1%  -95.85% (p=0.002 n=6)
IsSuperSet/equal,_density=0.05,_strict-6                 201.2n ± 0%    202.1n ± 1%   +0.47% (p=0.002 n=6)
IsSuperSet/equal,_density=0.20-6                        7662.0n ± 0%    104.0n ± 1%  -98.64% (p=0.002 n=6)
IsSuperSet/equal,_density=0.20,_strict-6                 201.4n ± 0%    204.9n ± 1%   +1.74% (p=0.002 n=6)
IsSuperSet/equal,_density=0.80-6                       26343.0n ± 0%    103.3n ± 2%  -99.61% (p=0.002 n=6)
IsSuperSet/equal,_density=0.80,_strict-6                 201.5n ± 0%    202.7n ± 0%   +0.60% (p=0.006 n=6)
IsSuperSet/equal,_density=0.95-6                       31173.5n ± 0%    103.6n ± 1%  -99.67% (p=0.002 n=6)
IsSuperSet/equal,_density=0.95,_strict-6                 202.0n ± 1%    202.9n ± 1%        ~ (p=0.143 n=6)
IsSuperSet/equal,_density=1.00-6                       32853.0n ± 1%    103.5n ± 3%  -99.68% (p=0.002 n=6)
IsSuperSet/equal,_density=1.00,_strict-6                 201.4n ± 0%    204.0n ± 1%   +1.34% (p=0.002 n=6)
IsSuperSet/subset,_len=10000,_diff=0-6                   4.731n ± 0%    3.919n ± 1%  -17.17% (p=0.002 n=6)
IsSuperSet/subset,_len=10000,_diff=0,_strict-6           201.3n ± 0%    204.0n ± 1%   +1.37% (p=0.002 n=6)
IsSuperSet/subset,_len=10000,_diff=100-6               179.250n ± 0%    4.441n ± 1%  -97.52% (p=0.002 n=6)
IsSuperSet/subset,_len=10000,_diff=100,_strict-6         201.2n ± 0%    202.7n ± 1%   +0.72% (p=0.002 n=6)
IsSuperSet/subset,_len=10000,_diff=1000-6              1755.50n ± 0%    13.07n ± 6%  -99.26% (p=0.002 n=6)
IsSuperSet/subset,_len=10000,_diff=1000,_strict-6        203.1n ± 1%    203.7n ± 1%        ~ (p=0.457 n=6)
IsSuperSet/subset,_len=10000,_diff=9999-6              17310.5n ± 1%    106.0n ± 1%  -99.39% (p=0.002 n=6)
IsSuperSet/subset,_len=10000,_diff=9999,_strict-6        201.4n ± 0%    202.3n ± 1%        ~ (p=0.065 n=6)
IsSuperSet/superset,_len=10000,_diff=0-6               17289.0n ± 0%    102.5n ± 1%  -99.41% (p=0.002 n=6)
IsSuperSet/superset,_len=10000,_diff=0,_strict-6       17572.5n ± 2%    312.4n ± 4%  -98.22% (p=0.002 n=6)
IsSuperSet/superset,_len=10000,_diff=100-6             17287.0n ± 0%    103.0n ± 2%  -99.40% (p=0.002 n=6)
IsSuperSet/superset,_len=10000,_diff=100,_strict-6     17521.0n ± 0%    308.6n ± 2%  -98.24% (p=0.002 n=6)
IsSuperSet/superset,_len=10000,_diff=1000-6            17369.0n ± 2%    104.5n ± 2%  -99.40% (p=0.002 n=6)
IsSuperSet/superset,_len=10000,_diff=1000,_strict-6    17815.0n ± 1%    309.8n ± 1%  -98.26% (p=0.002 n=6)
IsSuperSet/superset,_len=10000,_diff=9999-6            17294.0n ± 1%    102.8n ± 1%  -99.41% (p=0.002 n=6)
IsSuperSet/superset,_len=10000,_diff=9999,_strict-6    17499.0n ± 1%    308.9n ± 5%  -98.23% (p=0.002 n=6)
geomean                                                  823.2n         76.60n       -90.69%
@lemire
Copy link
Member

lemire commented Dec 8, 2023

Please do issue a PR. There is no need to argue, we'll take it!!! :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants