Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: slices: add functions MinNFunc and MaxNFunc #67376

Open
jech opened this issue May 15, 2024 · 3 comments
Open

proposal: slices: add functions MinNFunc and MaxNFunc #67376

jech opened this issue May 15, 2024 · 3 comments
Labels
Milestone

Comments

@jech
Copy link

jech commented May 15, 2024

Proposal Details

I propose the addition of a function

func MinNFunc[S ~[]E, E any](n int, x S, cmp func(a, b E) int) S

which returns the n smallest elements of x in increasing order. If len(x) < n, then MinNFunc returns x sorted.

MinNFunc(n, x, cmp) is equivalent to

    y := SortFunc(x, cmp)
    if len(y) < n {
        return y
    }
    return y[:n]

but it avoids an allocation of size O(len(x)).

For symmetry, there could be a function MaxNFunc which returns the n largest elements of x in decreasing order. MaxNFunc(n, x, cmp) is equivalent to

MinNFunc(n, x, func(a, b E) int {
    return -cmp(a, b)
})
@jech jech added the Proposal label May 15, 2024
@gopherbot gopherbot added this to the Proposal milestone May 15, 2024
@seankhliao
Copy link
Member

this sounds very specialized
does it have wide usage elsewhere?
https://go.dev/doc/faq#x_in_std

@earthboundkid
Copy link
Contributor

earthboundkid commented May 15, 2024

I think this is the wrong proposal. Instead container/heap should be made generic, and you can just pop the first N items off a heap. See #47632.

@jech
Copy link
Author

jech commented May 16, 2024

this sounds very specialized
does it have wide usage elsewhere?

I am not aware of it being included in the stdlib of any popular language, granted. However, it is a simple, well-defined operation that I often find useful and that is not completely trivial to implement efficiently, but that can reuse a lot of the machinery in pdqsort.

As an additional data point, if you perform a web search for "k largest elements", it appears to be a fairly popular operation, but perhaps not for the right reasons (it would seem it's a standard interview question, for some reason).

Instead container/heap should be made generic, and you can just pop the first N items off a heap. See #47632.

If you mean that MinNFunc can be implemented using a priority queue (and therefore using a binary heap) in O(n log k) time, then you're absolutely right. However, doing it efficiently is not completely trivial (the naive implementation copies every element, even those that end up being discarded.)

If you mean that I should be using a priority queue instead of computing the prioritary elements on demand, then I cannot do that: the priority changes dynamically (it depends on the network peers, don't ask), and there's no obvious way to determine when the priorities have changed sufficiently to justify rebuilding the priority queue.

I agree that it'd be nice to have a generic priority queue, but that's orthogonal to this proposal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

4 participants