Skip to content

proposal: cmp: add CompareBy function #71238

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
coady opened this issue Jan 12, 2025 · 13 comments
Open

proposal: cmp: add CompareBy function #71238

coady opened this issue Jan 12, 2025 · 13 comments
Labels
LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool Proposal
Milestone

Comments

@coady
Copy link

coady commented Jan 12, 2025

Proposal Details

An addition to the cmp package to support a common use case in ordered comparison.

func CompareBy[K cmp.Ordered, V any](key func(V) K) func(V, V) int {
	return func(a, b V) int { return cmp.Compare(key(a), key(b)) }
}

The majority of custom comparison functions apply the same transformation to each element. Analogous to Python's key functions, and would integrate well with Reverse proposed in #65632.

@coady coady added the Proposal label Jan 12, 2025
@gopherbot gopherbot added this to the Proposal milestone Jan 12, 2025
@gabyhelp
Copy link

@gabyhelp gabyhelp added the LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool label Jan 12, 2025
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jan 13, 2025
@ianlancetaylor
Copy link
Member

It would help to show a couple of examples from the standard library or elsewhere where existing code could be changed to use the proposed function. Would the result look better? Thanks.

@krishpranav
Copy link

krishpranav commented Jan 13, 2025

Message to Maintainer:

Hello 👋,

I would like to submit a proposal for integrating the new CompareBy function into the cmp package. This feature aims to provide a flexible and efficient way to compare ordered values by applying a custom key function, similar to Python’s key function. It simplifies sorting, comparisons, and data handling by allowing developers to define how values should be compared. 🔄

Key Details:

  • The CompareBy function enhances Go's cmp package by allowing custom key functions for comparisons. 🔑
  • It integrates seamlessly with existing features like Reverse for even more advanced use cases. 🔁
  • This change solves a common problem where developers need to apply a transformation to each element before comparison, which is currently not possible with the standard cmp.Compare function. 🧩

Relevant PR:

I’ve included test cases to ensure the new function works as expected, and the results are passing successfully.
✅ I’m confident this feature will significantly improve the flexibility of Go’s cmp package for ordered comparisons. 🚀

Looking forward to your feedback! 🙏💬

Thanks :)

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/642395 mentions this issue: 🚀 cmp: Unleash the Power of Custom Comparisons with the CompareBy Function! 🔥💥

@earthboundkid
Copy link
Contributor

From the proposed doc string:

sort.Slice(people, cmp.CompareBy(func(p Person) int { return p.Age }))

Presumably that was supposed to be slices.SortFunc. I don't think it looks better than slices.SortFunc(people, func(a, b Person) int { return cmp.Compare(a.Age, b.Age) }). It also lets you do a reversal by doing cmp.Compare(b.Age, a.Age) or chain comparisons with cmp.Or.

@dils2k
Copy link

dils2k commented Jan 13, 2025

I don't quite understand the use cases. In #71241 a use case that you mentioned is sorting an array of structs. But is the code below not what you are trying to do?

type Person struct {
	age int
}

ppl := []Person{{age: 2}, {age: 1}}
slices.SortFunc(ppl, func(a, b Person) int {
	return cmp.Compare(a.age, b.age)
})

fmt.Println(ppl)

@jimmyfrasche
Copy link
Member

It would be nice if there were a slices.SortBy that took a key func directly so you could do

slices.SortBy(ppl, func(p Person) int { return p.age })

That would be very handy. I'm not sure of other places where a key func would be useful like that, though.

@coady
Copy link
Author

coady commented Jan 13, 2025

Example from SortFunc.

names := []string{"Bob", "alice", "VERA"}
slices.SortFunc(names, func(a, b string) int { return cmp.Compare(strings.ToLower(a), strings.ToLower(b)) })
slices.SortFunc(names, CompareBy(func(n string) string { return strings.ToLower(n) }))
slices.SortFunc(names, CompareBy(strings.ToLower))

The readability improvement will be proportional to how much logic is duplicated, so simple attribute access is the least compelling. Relatedly, the more logic there is the more likely the function or method will already exist - or should - and can be in-lined.

type Person struct {
	children map[string]int // name: age
}

func (p Person) NumChildren() int {
	return len(p.children)
}

func (p Person) ChildAge(name string) int {
	return p.children[name]
}

slices.SortFunc(names, func(a, b string) int { return cmp.Compare(p.ChildAge(a), p.ChildAge(b)) })
slices.SortFunc(names, CompareBy(func(n string) int { return p.ChildAge(n) }))
slices.SortFunc(names, CompareBy(p.ChildAge))

slices.SortFunc(ppl, func(a, b Person) int { return cmp.Compare(a.NumChildren(), b.NumChildren()) })
slices.SortFunc(ppl, CompareBy(func(p Person) int { return p.NumChildren() }))
slices.SortFunc(ppl, CompareBy(Person.NumChildren))

I'm not opposed to SortBy either; just proposed the most general version. The use cases in slices are: CompareFunc, IsSortedFunc, MaxFunc, MinFunc, Sort{Stable}Func, Sorted{Stable}Func. I think the new SortedFunc is the most useful now.

@adonovan
Copy link
Member

adonovan commented Jan 14, 2025

Applying the key function to every pair of V elements on demand at comparison time (O(n log n) calls, each computing two elements) may be significantly less efficient than using the "decorate, sort, undecorate" approach, which calls the key function exactly once for each V element---especially if the key function allocates memory.

@coady
Copy link
Author

coady commented Jan 15, 2025

Applying the key function to every pair of V elements on demand at comparison time (O(n log n) calls, each computing two elements) may be significantly less efficient than using the "decorate, sort, undecorate" approach...

Which happens now with the binary Func. That could be another point for SortedBy, with caching of the sort keys. Python does that for sorting, as well as for min and max.

@krishpranav
Copy link

I get the concerns about performance with CompareBy. The idea was to make sorting more readable for simple cases, but I see how the repeated key function calls could be less efficient. I agree that SortFunc or SortBy might still be better for more complex or performance-critical situations.

The SortBy idea sounds great too, and I’m open to refining this based on your thoughts. Appreciate the input!

@coady
Copy link
Author

coady commented Jan 24, 2025

Here's a rundown of the binary *Func functions in slices that could use a key function. Ones that wouldn't benefit from key caching.

  • BinarySearch - already uses a different type for target, so cmp is only called once per element
  • Compact and Equal - compare parallel slices, so eq is only called once per element
  • IsSorted - calls cmp twice per element, but since it only returns a bool, it could be more generally optimized by applying a MapFunc utility first
  • Sort{Stable} - in-place isn't particularly relevant if a key needs to be cached

Which leaves these 4, where the limitation of a key function could be justified by improving performance.

func CompactBy[K comparable, V any](seq iter.Seq[V], key func(V) K) iter.Seq2[K, []V]
func MaxBy[K cmp.Ordered, V any](seq iter.Seq[V], key func(V) K) []V
func MinBy[K cmp.Ordered, V any](seq iter.Seq[V], key func(V) K) []V
func SortedBy[K cmp.Ordered, V any](seq iter.Seq[V], key func(V) K) iter.Seq[V] {

So some basic trade-offs.

  1. CompareBy is more general, but has use case limitations, no performance improvements, and readability varies.
  2. The 4 examples are more specific, with the same use case limitations, but would offer more performance and readability.

@jimmyfrasche
Copy link
Member

SortBy/SortStableBy could do the whole decorate-sort-undecorate dance behind the scenes or create a slice of the results of the key func and sort that while also rearranging the original slice appropriately or just keeping track of the permutation and applying it at the end. I'm sure there's a lot of things it could do that would be less efficient than using a custom comparison but would only need to be done once in std and could be reasonably optimized. I wouldn't be surprised if there were literature on efficient implementations though I'm not aware of any offhand. The docs would probably have to say "hey if it's REALLY important write a custom cmp" but for many uses I'm sure it would be fine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool Proposal
Projects
Status: Incoming
9 participants