Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Add a BitArray.Contains method #72999

Closed
Tracked by #77391
eiriktsarpalis opened this issue Jul 28, 2022 · 20 comments · Fixed by #81527
Closed
Tracked by #77391

[API Proposal]: Add a BitArray.Contains method #72999

eiriktsarpalis opened this issue Jul 28, 2022 · 20 comments · Fixed by #81527
Labels
api-approved API was approved in API review, it can be implemented area-System.Collections help wanted [up-for-grabs] Good issue for external contributors
Milestone

Comments

@eiriktsarpalis
Copy link
Member

eiriktsarpalis commented Jul 28, 2022

Background and motivation

The BitArray class has been optimized to take advantage of hardware vectorization but it doesn't appear to have an efficient way to check if all flags within the array are either set or unset. The only suggested workaround I could find is to use Linq, i.e.

input.Cast<bool>().Contains(true);

which is far from optimal. The type could really benefit from a dedicated Contains implementation that uses vectorization.

API Proposal

namespace System.Collections;

public class BitArray
{
    public bool Contains(bool value);
}

API Usage

BitArray value;
BitArray mask;

value.And(mask).Contains(true);

Alternative Designs

No response

Risks

No response

@eiriktsarpalis eiriktsarpalis added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jul 28, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 28, 2022
@ghost
Copy link

ghost commented Jul 28, 2022

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

The BitArray class has been optimized to take advantage of hardware vectorization but it doesn't appear to have an efficient way to check if all flags within the array are set or unset. The only suggested workaround I could find is to use Linq, i.e.

input.Cast<bool>().Contains(true);

which is far from optimal. The type could really benefit from a dedicated Contains implementation that uses vectorization.

API Proposal

namespace System.Collections;

public class BitArray
{
    public bool Contains(bool value);
}

API Usage

BitArray value;
BitArray mask;

value.And(mask).Contains(true);

Alternative Designs

No response

Risks

No response

Author: eiriktsarpalis
Assignees: -
Labels:

api-suggestion, area-System.Collections

Milestone: -

@eiriktsarpalis
Copy link
Member Author

cc @krwq @stephentoub

@krwq
Copy link
Member

krwq commented Jul 28, 2022

I'd only recommend renaming Contains into AllBitsEqual, otherwise LGTM

@eiriktsarpalis
Copy link
Member Author

Exposing an All method is equivalent to an Any/Contains method, since you can express it as

public AllBitsEqual(bool value) => !Contains(!value);

and vice versa. I feel Contains is more appropriate since this is a collection type but we could expose both variants.

@danmoseley
Copy link
Member

danmoseley commented Jul 28, 2022

I had the same expectation as @krwq. (not that means much)

What do other platforms/implementations name it?

Also, if I was translating an algorithm would I generally be thinking in terms of "are any bits set" (in which case Contains might be what I look for) or "are all bits false"?

@Tornhoof
Copy link
Contributor

Tornhoof commented Jul 28, 2022

What do other platforms/implementations name it?

Java only implements IsEmpty() -> https://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html#isEmpty()

The other way round is not implemented, there is a discussion around that here: https://stackoverflow.com/questions/36308666/check-if-all-bits-in-bitset-are-set-to-true

@EgorBo
Copy link
Member

EgorBo commented Jul 28, 2022

Alternatives:

bool AllBitsSet { get; }
bool AllBitsUnset => !AllBitsSet;

@tannergooding
Copy link
Member

tannergooding commented Jul 28, 2022

Alternatives:

We wouldn't want this as a property. Properties are typically used for things that are "as cheap as a field read" and not for some potentially expensive algorithm that needs to check every element in the collection.

since you can express it as

While you can express it as this, I personally find comparing against a negative much harder to read/understand.

Not only can ! in if (!value.Contains(...)) get easily missed or misread, but you have to remember to read this as "if value does not contain" which while still not difficult, it is much clearer to me when it is if (value.DoesNotContain(...)) or if (value.AllBitsEqual(false)) (I've seen a good number of cases where dev's actually prefer if (cond == false) over if (!cond) due to the increased legibility, particularly for devs with reading troubles).

There are several other cases where I really wish we had both positive and negative APIs for binary cases (e.g. rather than IsLittleEndian and !IsLittleEndian it should've been IsLittleEndian and IsBigEndian) and since this is strictly a binary case (and always will be), I think it might be beneficial to expose it as such.

@eiriktsarpalis eiriktsarpalis removed the untriaged New issue has not been triaged by the area owner label Jul 28, 2022
@eiriktsarpalis eiriktsarpalis added this to the 8.0.0 milestone Jul 28, 2022
@eiriktsarpalis
Copy link
Member Author

eiriktsarpalis commented Jul 28, 2022

bool AllBitsUnset => !AllBitsSet;

There are several other cases where I really wish we had both positive and negative APIs for binary cases

FWIW AllBitsEqual is not a direct negation of Contains, one is expressible in terms of the other using De Morgan transformations. But I think that further proves the point that both existential and universal quantification are useful as first-class APIs here.

If we do ship both variants, I think the names should complement each other. Following existing .NET terminology that could look as follows:

public bool Any(bool value);
public bool All(bool value);

@danmoseley
Copy link
Member

danmoseley commented Jul 28, 2022

golang bit array doesn't seem to have this
rust has all() and none()
c++ bitset has all() and none()
Python has all() but for none, you have to negate any()
as noted above Java has isEmpty() but no easy way to check all set.

@sakno
Copy link
Contributor

sakno commented Jul 28, 2022

LeadingZeroCount() and/or TrailingZeroCount() can do the same but reusable for other purposes.

@tannergooding
Copy link
Member

FWIW AllBitsEqual is not a direct negation of Contains

Right. I meant for the case people are requesting of array.All(true), we shouldn't need to represent it as !array.Contains(false) nor as !array.Contains(true) for array.All(false). It makes it immensely harder to read/etc.

Contains() (or Any()) is the "minimum" API and then there is a common/requested scenario of All() (but possibly with a better/longer name).

@danmoseley
Copy link
Member

LeadingZeroCount() and/or TrailingZeroCount() can do the same but reusable for other purposes.

none() can be implemented as LeadingZeroCount() == Length; but how do you implement all() with those?

@sakno
Copy link
Contributor

sakno commented Jul 28, 2022

none() can be implemented as LeadingZeroCount() == Length; but how do you implement all() with those?

Yes, you're right. Probably, these methods are subjects to different proposal.

@tannergooding
Copy link
Member

You'd need something like BitArray.PopCount(bitArray) == bitArray.Length).

But, that's also not strictly speaking the best or optimal implementation. Any/All/None can "early exit" where-as PopCount must process the entire array. Likewise LeadingZeroCount and TrailingZeroCount must process up to the first set bit from opposite sides of the array.

@eiriktsarpalis eiriktsarpalis added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Oct 7, 2022
@eiriktsarpalis eiriktsarpalis self-assigned this Oct 7, 2022
@bartonjs
Copy link
Member

bartonjs commented Oct 18, 2022

Video

  • The proposed API and the proposed reason are logical opposites... "has all bits set" is !arr.Contains(false). So we came up with more direct questions.
  • Count == 0 almost certainly wants both of these methods to return false.
namespace System.Collections
{
    public partial class BitArray
    {
        public bool HasAllSet();
        public bool HasAnySet();
    }
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Oct 18, 2022
@eiriktsarpalis
Copy link
Member Author

Count == 0 almost certainly wants both of these methods to return false.

Can you elaborate? Generally speaking "All" quantifiers return true on empty sets.

@layomia layomia self-assigned this Oct 18, 2022
@bartonjs
Copy link
Member

Heh, should've left it as my original "needs careful attention". Enumerable.All is the only "all" I know of in .NET, and it returns true on empty, so I guess true makes sense here, too.

The trouble is, with an empty set, All is the same as divide by zero to a mathematician: it means "this isn't answerable in a general context, but you might have a specific interpretation to fall back on" (so, had I been the author if it, the LINQ All it would have thrown on empty). (if (blockingIssues.HasAllSet()) { panic(); } clearly thinks that blockingIssues is non-empty)

@eiriktsarpalis
Copy link
Member Author

See https://en.wikipedia.org/wiki/Universal_quantification#The_empty_set

By convention/definition, "forall" over empty is true and "exists" over empty is false. This makes sense if you look at inductive definitions of such quantifiers.

@stephentoub
Copy link
Member

Count == 0 almost certainly wants both of these methods to return false.

Can you elaborate? Generally speaking "All" quantifiers return true on empty sets.

HasAllSet should return true for an empty collection. That's what Enumerable.All does, that's what List<T>.TrueForAll does, that's what Array.TrueForAll does, etc.

@krwq krwq modified the milestones: 8.0.0, Future Jan 30, 2023
@krwq krwq added the help wanted [up-for-grabs] Good issue for external contributors label Jan 30, 2023
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 2, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 13, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Mar 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Collections help wanted [up-for-grabs] Good issue for external contributors
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants