Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linear Search #23203

Open
csbrown opened this issue Feb 13, 2023 · 7 comments
Open

Linear Search #23203

csbrown opened this issue Feb 13, 2023 · 7 comments

Comments

@csbrown
Copy link

csbrown commented Feb 13, 2023

There is a specific implementation for binary search. Can there also be one for linear search? There are use cases where the first element where [condition] might frequently be very close to the front of the array, but existing methods for executing a linear search in numpy require operations on the entire array. There seems to be some demand for this (see also linked questions). I can be convinced to work on this, but I've never contributed, and would appreciate some tips on where to start with this.

Related: #1835

@charris
Copy link
Member

charris commented Feb 13, 2023

There is a method that starts by doubling the interval starting with the last bound until it finds an interval containing the value, then bisects that interval. It is commonly used when looking to interpolate a piecewise function at many values in order. Not sure where the original came from, I first saw it in Numerical Recipes long ago.

@seberg
Copy link
Member

seberg commented Feb 13, 2023

I think we are talking of different things. One issue is to not work with the full array, the other about how to search within a chunk of data (or the full array).

I think this request basically a duplicate of gh-8513 and maybe another issue or two.

@MatteoRaso
Copy link
Contributor

Can there also be one for linear search?

Wouldn't that just be a while loop? I'm not sure if adding a function to alias a single loop is a good idea.

@charris
Copy link
Member

charris commented Feb 14, 2023

One issue is to not work with the full array

What I mentioned works for that, but does require a sorted array to be searched. It starts at the beginning, or last found object, and tries intervals of increasing size until it finds one containing the wanted element, then bisects that interval. It is practically linear time if the items searched for are also in order.

@MatteoRaso
Copy link
Contributor

What I mentioned works for that, but does require a sorted array to be searched. It starts at the beginning, or last found object, and tries intervals of increasing size until it finds one containing the wanted element, then bisects that interval. It is practically linear time if the items searched for are also in order.

This sounds similar to a binary search, which is already implemented. On top of that, you need to account for the time it takes to sort the array, which can be non-trivial. What the author is describing is a search that checks each element sequentially, which really just requires a while loop.

@rkern
Copy link
Member

rkern commented Feb 14, 2023

Most of numpy is "just a for loop", but implemented in C for performance and expressiveness.

I have definitely had cases where I want to find the first or last occurrence of something in an unsorted array, and a linear search would be perfect for the task. Think of trim_zeros(), only just returning the found indices for further processing (and application to associated arrays) rather than doing the trimming. I think I've written it by hand in Cython, multiple times.

@charris
Copy link
Member

charris commented Feb 14, 2023

This sounds similar to a binary search

It is, but faster for specific applications. For instance, if you have a piece-wise polynomial and want for evaluate it on a dense set of points, each segment might be reused many times and won't need searching to find.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants