Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance BinarySearch to accept a Func<T, int> instead of value and comparer #63074

Open
GSPP opened this issue Dec 22, 2021 · 6 comments
Open
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Collections
Milestone

Comments

@GSPP
Copy link

GSPP commented Dec 22, 2021

The most common expression of binary search looks like this:

int BinarySearch<T>(T[] array, T value)

The search key is of the same type as the array elements. Multiple times, though, I have wanted to search by some other type. A motivating example is finding the stock price by date in a sorted array. Like this:

record DateAndPrice(DateTime Date, decimal Price);

DateAndPrice[] pricesSortedByDate = ...;

var index = Array.BinarySearch(pricesSortedByDate, DateTime.UtcNow.Date, /* something more, see below */);

The array elements are of type PriceTimeSeriesPoint but the search key is just the date. Fabricating a new PriceTimeSeriesPoint with a dummy price is dirty (quite a few drawbacks such as performance and potential impossibility due to API shape).

Binary search, at its core, does not require any lookup value at all. All that is needed is a Func<T, int>! No value and no comparer. This delegate allows for the bisection algorithm to work.

I, therefore, suggest that all binary search API surfaces in the framework are enhanced with an overload of the following form:

+    public int BinarySearch<T>(T[] array, Func<T, int> comparison)

With this, we can say:

var index = Array.BinarySearch(
     pricesSortedByDate,
     DateTime.UtcNow.Date,
     (priceAndDate, dateToFind) => priceAndDate.Date.CompareTo(dateToFind)); //This is the key idea

An extension of this idea would be to pass through a state parameter so that this lambda can be made static (saves allocations).


I believe that the relevant places to modify are

  • Array
  • List
  • ImmutableArray
  • ImmutableList
  • ImmutableListBuilder
  • ArrayList, if we still care
  • MemoryExtensions (for spans)

Alternative designs do not come to mind short of doing nothing. The risk would be adding a method that has too little use to justify its existence. I personally have needed this multiple times (I have it in my personal utility library).

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Collections untriaged New issue has not been triaged by the area owner labels Dec 22, 2021
@ghost
Copy link

ghost commented Dec 22, 2021

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

The most common expression of binary search looks like this:

int BinarySearch<T>(T[] array, T value)

The search key is of the same type as the array elements. Multiple times, though, I have wanted to search by some other type. A motivating example is finding the stock price by date in a sorted array. Like this:

record DateAndPrice(DateTime Date, decimal Price);

DateAndPrice[] pricesSortedByDate = ...;

var index = Array.BinarySearch(pricesSortedByDate, DateTime.UtcNow.Date, /* something more, see below */);

The array elements are of type PriceTimeSeriesPoint but the search key is just the date. Fabricating a new PriceTimeSeriesPoint with a dummy price is dirty (quite a few drawbacks such as performance and potential impossibility due to API shape).

Binary search, at its core, does not require any lookup value at all. All that is needed is a Func<T, int>! No value and no comparer. This delegate allows for the bisection algorithm to work.

I, therefore, suggest that all binary search API surfaces in the framework are enhanced with an overload of the following form:

+    public int BinarySearch<T>(T[] array, Func<T, int> comparison)

With this, we can say:

var index = Array.BinarySearch(
     pricesSortedByDate,
     DateTime.UtcNow.Date,
     (priceAndDate, dateToFind) => priceAndDate.Date.CompareTo(dateToFind)); //This is the key

An extension of this idea would be to pass through a state parameter so that this lambda can be made static (saves allocations).


I believe that the relevant places to modify are

  • Array
  • List
  • ImmutableArray
  • ImmutableList
  • ImmutableListBuilder
  • ArrayList, if we still care
  • MemoryExtensions (for spans)

Alternative designs do not come to mind short of doing nothing. The risk would be adding a method that has too little use to justify its existence. I personally have needed this multiple times.

Author: GSPP
Assignees: -
Labels:

area-System.Collections, untriaged

Milestone: -

@GSPP
Copy link
Author

GSPP commented Dec 22, 2021

I have just found this near duplicate: BinarySearch based on values of other types #50537

The core idea is the same but the API is more flexible in this issue. The other issue still searches in terms of a value and compare, while this issue uses the maximally general form of a Func<T, int>.

@krwq krwq added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Dec 22, 2021
@krwq
Copy link
Member

krwq commented Dec 22, 2021

I personally like the idea as proposed without adding extra state parameters. Can you find exact list of APIs where we would potentially add it for completeness before we submit it for review?

@krwq krwq added this to the Future milestone Dec 22, 2021
@krwq krwq removed the untriaged New issue has not been triaged by the area owner label Dec 22, 2021
@zmj
Copy link

zmj commented Dec 22, 2021

There's an extension method that covers this use case: https://docs.microsoft.com/en-us/dotnet/api/system.memoryextensions.binarysearch
public static int BinarySearch<T,TComparer> (this ReadOnlySpan<T> span, T value, TComparer comparer) where TComparer : System.Collections.Generic.IComparer<T>;

@GSPP
Copy link
Author

GSPP commented Dec 23, 2021

@zmj That's an interesting overload that I have not seen before. It appears that its purpose is to allow for a struct comparer. It does not allow a search key that deviates in type, though. I don't believe it satisfies what this proposal is about.


@krwq OK, I'll take a look.

Without the state parameter, any search invocation will cause an allocation of a new closure and delegate. For my personal uses, that has always been fine. It's just a point to discuss.

The state version looks a lot like what the other linked issue proposes. It's this:

public int BinarySearch<T, TOther>(T[] array, TOther value, Func<T, TOther, int> comparison)

And actually, the sample invocation is exactly what I posted in my request.

var index = Array.BinarySearch(
     pricesSortedByDate,
     DateTime.UtcNow.Date,
     (priceAndDate, dateToFind) => priceAndDate.Date.CompareTo(dateToFind));

Yes, this means that my original sample code was mistaken. I accidentally used the state version. I personally find that to be rather clean code, and therefore I advocate to add that as well. The non-state version is this:

var dateToFind = DateTime.UtcNow.Date;

var index = Array.BinarySearch(
     pricesSortedByDate,
     priceAndDate => priceAndDate.Date.CompareTo(dateToFind)); 

Which style do we like better? I tend to think, we should add both.


I believe this to be a complete list of all the places this needs to be added:

public class Array
{
    public int BinarySearch(Func<T, int> comparison)
    public int BinarySearch<TOther>(TOther value, Func<T, TOther, int> comparison)
}

public class List<T>
{
    public int BinarySearch(Func<T, int> comparison)
    public int BinarySearch<TOther>(TOther value, Func<T, TOther, int> comparison)
	
    //These methods search a subrange of the list
    public int BinarySearch(int index, int count, Func<T, int> comparison)
    public int BinarySearch<TOther>(int index, int count, TOther value, Func<T, TOther, int> comparison)
}

public class ImmutableList<T> //Exactly the same as List<T>
{
    public int BinarySearch(Func<T, int> comparison)
    public int BinarySearch<TOther>(TOther value, Func<T, TOther, int> comparison)
	
    //These methods search a subrange of the list
    public int BinarySearch(int index, int count, Func<T, int> comparison)
    public int BinarySearch<TOther>(int index, int count, TOther value, Func<T, TOther, int> comparison)
}

public class ImmutableList<T>.Builder //Exactly the same as List<T>
{
    public int BinarySearch(Func<T, int> comparison)
    public int BinarySearch<TOther>(TOther value, Func<T, TOther, int> comparison)
	
    //These methods search a subrange of the list
    public int BinarySearch(int index, int count, Func<T, int> comparison)
    public int BinarySearch<TOther>(int index, int count, TOther value, Func<T, TOther, int> comparison)
}

public static class ImmutableArray
{
    public int BinarySearch<T>(this ImmutableArray<T> array, Func<T, int> comparison)
    public int BinarySearch<T, TOther>(this ImmutableArray<T> array, TOther value, Func<T, TOther, int> comparison)
	
    //These methods search a subrange of the list
    public int BinarySearch<T>(this ImmutableArray<T> array, int index, int count, Func<T, int> comparison)
    public int BinarySearch<T, TOther>(this ImmutableArray<T> array, int index, int count, TOther value, Func<T, TOther, int> comparison)
}

public static class MemoryExtensions
{
    public int BinarySearch<T>(this Span<T> span, Func<T, int> comparison)
    public int BinarySearch<T, TOther>(this Span<T> span, TOther value, Func<T, TOther, int> comparison)
	
    public int BinarySearch<T>(this ReadOnlySpan<T> span, Func<T, int> comparison)
    public int BinarySearch<T, TOther>(this ReadOnlySpan<T> span, TOther value, Func<T, TOther, int> comparison)
}

I tried to be very diligent here but chances are I messed up some detail. I left out ArrayList. For Array, I did not add subrange searching because that's covered with spans.

@GSPP
Copy link
Author

GSPP commented Dec 23, 2021

And while we are at it, we might ponder "Binary search for IList and IReadOnlyList" #15804. The framework is missing a strategy for how generic collection algorithms can be exposed for interfaces. Binary search on lists is one instance of that general issue. There are various ways of doing this and none seems to check all the boxes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Collections
Projects
None yet
Development

No branches or pull requests

3 participants