Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specialize BitArray equality to improve performance. #21

Open
modulovalue opened this issue Dec 23, 2023 · 2 comments
Open

Specialize BitArray equality to improve performance. #21

modulovalue opened this issue Dec 23, 2023 · 2 comments

Comments

@modulovalue
Copy link

I've noticed that BitArray equality could be specialized to be more efficient when two BitArray instances are being compared.

An is check would probably be fine (i.e. #18 (comment)) which could be used to introduce a fast path that uses for loops over the underlying buffers instead of having to use iterators (as is currently being done in the base BitSet implementation).

Furthermore, I feel like inlining the fold in hashCode might give us some easy gains, too. I'm not sure about the overhead of fold and the closure... JIT might be able to deal with it just fine, but I have doubts about AOT.

@modulovalue
Copy link
Author

modulovalue commented Dec 23, 2023

Here's what I came up with:

// Inside of BitArray
  // We specialize equality here for performance.
  @override
  bool operator ==(Object other) {
    if (other is BitArray) {
      if (identical(this, other)) {
        return true;
      }
      if (length == other.length) {
        final aData = _data;
        final bData = other._data;
        if (aData.length == bData.length) {
          for (int i = 0; i < aData.length; i++) {
            if (aData[i] != bData[i]) {
              return false;
            }
          }
          return true;
        } else {
          throw Exception("Invalid State?");
        }
      } else {
        return super == other;
      }
    } else {
      return super == other;
    }
  }

  // We specialize hashCode here to use an explicit for-loop for performance.
  @override
  int get hashCode {
    int total = 0;
    for (int i = 0; i < _data.length; i++) {
      total ^= _data[i].hashCode;
    }
    return total ^ length.hashCode;
  }

Here are some measurements with that change applied:

JIT
Measure: run: took: 8434.35000 ms
Measure: run: took: 8449.63900 ms
Measure: run: took: 8339.67400 ms
Measure: run: took: 8414.71000 ms
Measure: run: took: 8630.52200 ms
...
AOT
Measure: run: took: 10304.25000 ms
Measure: run: took: 10292.99600 ms
...

without:

JIT
Measure: run: took: 9160.16500 ms
Measure: run: took: 9042.79700 ms
Measure: run: took: 9231.17700 ms
Measure: run: took: 9020.62700 ms
Measure: run: took: 9085.06600 ms
...
AOT
Measure: run: took: 15854.65300 ms
Measure: run: took: 16027.64400 ms
...

@isoos what do you think? I'm not entirely sure that I've completely understood the invariants that you are trying to maintain (e.g., is the throw Exception("Invalid State?") case really never supposed to happen? That's what I'm assuming here.)

@isoos
Copy link
Owner

isoos commented Dec 23, 2023

I think a bit of defensive programming never hurts, and throwing e.g. StateError would be fine if the internals do not match.
It is sometimes easier to follow if you put the negative case earlier, e.g.

if (aData.length != bData.length) {
  throw StateError();
}
// follow comparison

another note: I think equals operator should return true if it is the same object type and internals, not when it represents the same abstract bitset. If we want to have a semantic check that would allow mixing different bit sets and check their set bits, it should be a separate method, not the operator ==. With that, we don't need to call super in the method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants