proposal: container/set: new package to provide a generic set type #69230
Related Issues and Documentation |
Thank you gaby! I'll try compiling a small history of set-related issues:
January 2014: #7088. Proposed adding sets to the language itself (instead of adding them to the standard library). The closing comment, by https://github.com/ianlancetaylor:
There was also some discussion about the performance trade-offs between sets based on maps and sets based on slices.
November 2017: #22812. Proposed an unordered set type with "a simplified range operator for sets (with only one assigned variable)". Today this is covered by range over func/iterators. It was closed as a duplicate of #7088 but also of #15292, a general proposal for adding generics to Go.
July 2021: #47331. Almost all of this proposal is based on that discussion. This was done after generics were introduced in Go but before range over func/iterators (#61405). My understanding is that it was closed because it depended on the outcome of the iteration protocol, as said by https://github.com/fzipp in #61884 (comment):
August 2021: #47963. Not directly related to sets but a more general question on what to do with the existing container packages following the adoption of generics.
August 2023: #61884. Closed for being a duplicate of #47331. |
Yes, it's finally time to add this! 😄 🎉 I suggest calling the iterator |
I suggest also adding set.From which takes an iter.Seq. This would let you easily consume a map or a slice, etc. |
Thanks for the proposal. I was actually getting very close to making my own proposal. Here is mine, for comparison. The main thing I am still pondering is whether functions that return a set should return a
// Package set defines a Set type.
package set
import (
	"iter"
	"maps"
)
// A Set is a set of elements of some comparable type.
// The zero value of a Set is ready to use.
// A Set contains a map, and has similar performance characteristics,
// copying behavior, and similar handling of floating-point NaN values.
type Set[E comparable] struct {
	m map[E]struct{}
}
// Of returns a new [Set] containing the listed elements.
func Of[E comparable](v ...E) Set[E] {
	s := Set[E]{m: make(map[E]struct{})}
	for _, w := range v {
		s.m[w] = struct{}{}
	}
	return s
}
// Add adds an element to s.
// It reports whether the element was not present before.
func (s *Set[E]) Add(v E) bool {
	if s.m == nil {
		s.m = make(map[E]struct{})
	}
	ln := len(s.m)
	s.m[v] = struct{}{}
	return len(s.m) > ln
}
// AddSeq adds the values from seq to s.
func (s *Set[E]) AddSeq(seq iter.Seq[E]) {
	for v := range seq {
		s.Add(v)
	}
}
// Delete removes an element from s.
// It reports whether the element was in the set.
func (s *Set[E]) Delete(v E) bool {
	ln := len(s.m)
	delete(s.m, v)
	return len(s.m) < ln
}
// DeleteSeq deletes the elements in seq from s.
// Elements that are not present are ignored.
func (s *Set[E]) DeleteSeq(seq iter.Seq[E]) {
	for v := range seq {
		delete(s.m, v)
	}
}
// DeleteFunc deletes the elements in s for which del returns true.
func (s *Set[E]) DeleteFunc(del func(E) bool) {
	for v := range s.m {
		if del(v) {
			delete(s.m, v)
		}
	}
}
// Contains reports whether s contains an element.
func (s *Set[E]) Contains(v E) bool {
	_, ok := s.m[v]
	return ok
}
// ContainsAny reports whether any of the elements in seq are in s.
// It stops reading from seq after finding an element that is in s.
func (s *Set[E]) ContainsAny(seq iter.Seq[E]) bool {
	for v := range seq {
		if _, ok := s.m[v]; ok {
			return true
		}
	}
	return false
}
// ContainsAll reports whether all of the elements in seq are in s.
// It stops reading from seq after finding an element that is not in s.
func (s *Set[E]) ContainsAll(seq iter.Seq[E]) bool {
	for v := range seq {
		if _, ok := s.m[v]; !ok {
			return false
		}
	}
	return true
}
// All returns an iterator over all the elements of s.
// The elements are returned in an unpredictable order.
func (s *Set[E]) All() iter.Seq[E] {
	return func(yield func(E) bool) {
		for v := range s.m {
			if !yield(v) {
				return
			}
		}
	}
}
// Equal reports whether s and s2 contain the same elements.
func (s *Set[E]) Equal(s2 *Set[E]) bool {
	if len(s.m) != len(s2.m) {
		return false
	}
	for v := range s2.m {
		if _, ok := s.m[v]; !ok {
			return false
		}
	}
	return true
}
// Clear removes all elements from s, leaving it empty.
func (s *Set[E]) Clear() {
	clear(s.m)
}
// Clone returns a copy of s. This is a shallow clone:
// the new elements are set using ordinary assignment.
func (s *Set[E]) Clone() Set[E] {
	return Set[E]{m: maps.Clone(s.m)}
}
// Len returns the number of elements in s.
func (s *Set[E]) Len() int {
	return len(s.m)
}
// Collect collects elements from seq into a new [Set].
func Collect[E comparable](seq iter.Seq[E]) Set[E] {
	var r Set[E]
	for v := range seq {
		r.Add(v)
	}
	return r
}
// Union returns a new [Set] containing the union of two sets.
func Union[E comparable](s1, s2 *Set[E]) Set[E] {
	var r Set[E]
	for v := range s1.m {
		r.Add(v)
	}
	for v := range s2.m {
		r.Add(v)
	}
	return r
}
// Intersection returns a new [Set] containing the intersection of two sets.
func Intersection[E comparable](s1, s2 *Set[E]) Set[E] {
	var r Set[E]
	ma, mb := s1.m, s2.m
	if len(ma) > len(mb) {
		// Loop through the shorter set.
		mb, ma = ma, mb
	}
	for v := range ma {
		if _, ok := mb[v]; ok {
			r.Add(v)
		}
	}
	return r
}
// Difference returns a new [Set] containing the elements of s1
// that are not present in s2.
func Difference[E comparable](s1, s2 *Set[E]) Set[E] {
	var r Set[E]
	for v := range s1.m {
		if _, ok := s2.m[v]; !ok {
			r.Add(v)
		}
	}
	return r
}
// SymmetricDifference returns a new [Set] containing the elements
// that are in either s1 or s2, but not both.
func SymmetricDifference[E comparable](s1, s2 *Set[E]) Set[E] {
	var r Set[E]
	// Elements of s1 that are not in s2.
	for v := range s1.m {
		if _, ok := s2.m[v]; !ok {
			r.Add(v)
		}
	}
	// Elements of s2 that are not in s1.
	for v := range s2.m {
		if _, ok := s1.m[v]; !ok {
			r.Add(v)
		}
	}
	return r
} |
@earthboundkid Our current convention is that the |
@ianlancetaylor A map is internally a pointer, so I believe it should return a EDIT: I missed that methods like Add initialize the map, so not every method. |
@mateusz834 Right, with the approach I'm outlining some methods have to use a pointer receiver. And it's generally easier to understand a type if all receivers are either pointers or values. |
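To illustrate the receiver point, here is a small sketch of why a method that lazily allocates the map needs a pointer receiver when the zero value must be usable. The addByValue method is hypothetical and exists only to show the failure mode; it is not part of any proposed API.

```go
package main

import "fmt"

// Set mirrors the struct-wrapping-a-map layout from the sketch in this thread.
type Set[E comparable] struct {
	m map[E]struct{}
}

// Add uses a pointer receiver so it can allocate the map on first use
// and have that allocation stick.
func (s *Set[E]) Add(v E) {
	if s.m == nil {
		s.m = make(map[E]struct{})
	}
	s.m[v] = struct{}{}
}

// addByValue is a hypothetical value-receiver variant. The map is
// allocated on a copy of the Set, so the caller never sees it.
func (s Set[E]) addByValue(v E) {
	if s.m == nil {
		s.m = make(map[E]struct{})
	}
	s.m[v] = struct{}{}
}

// Len returns the number of elements; the length of a nil map is 0.
func (s *Set[E]) Len() int { return len(s.m) }

func main() {
	var a, b Set[int] // zero values, no map allocated yet
	a.Add(1)
	b.addByValue(1)
	fmt.Println(a.Len(), b.Len()) // 1 0
}
```

Read-only methods such as Contains or Len would work with either receiver kind, which is where the consistency argument comes in.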
Missing predicates: Subset, ProperSubset, Disjoint. |
@ianlancetaylor For what it's worth, I do like your API way more. |
Can we avoid the stutter in |
You will rarely write set.Set as an end user because it mostly starts with set.Of or set.Collect. It seems like you’d only need to write it in struct definitions. |
How about Disjoint Union? |
A list of predicates found in other Go libraries and languages:
I didn't include idioms, which makes it look like the proposal is missing "empty", but for containers in Go it's |
One other set-related function I've used in some of my many hand-written set implementations over the years is Cartesian product, which I expect would have a signature something like this:
func CartesianProduct[ElemA, ElemB comparable](a Set[ElemA], b Set[ElemB]) Set[struct { a ElemA; b ElemB }]
However, I do already have mixed feelings about whether it's common enough to deserve to be in stdlib. Compared to the other set operations described I've needed it far less frequently, but I have wanted it more than once.
If this does seem like something worth implementing then I suppose it's another potential use-case for tuple structs (#63221) and/or for variadic generics (#56462, #66651). I suppose that's probably a good argument against offering this initially so that those broader related proposals have some chance to settle first. |
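For what it's worth, a Cartesian product can be sketched today without tuple structs by introducing a named pair type. Everything below is an assumption for illustration: the Pair type, its field names, and the simplified Set stand-in are not part of the proposal.

```go
package main

import "fmt"

// Set is a simplified, hypothetical stand-in for the proposed type.
type Set[E comparable] struct {
	m map[E]struct{}
}

// Of returns a Set containing the listed elements.
func Of[E comparable](v ...E) Set[E] {
	s := Set[E]{m: make(map[E]struct{}, len(v))}
	for _, w := range v {
		s.m[w] = struct{}{}
	}
	return s
}

// Len returns the number of elements in s.
func (s Set[E]) Len() int { return len(s.m) }

// Contains reports whether v is in s.
func (s Set[E]) Contains(v E) bool {
	_, ok := s.m[v]
	return ok
}

// Pair is a hypothetical named replacement for the anonymous struct
// in the signature above. It is comparable because both fields are.
type Pair[A, B comparable] struct {
	First  A
	Second B
}

// CartesianProduct returns every (x, y) combination with x from a and y from b.
func CartesianProduct[A, B comparable](a Set[A], b Set[B]) Set[Pair[A, B]] {
	r := Set[Pair[A, B]]{m: make(map[Pair[A, B]]struct{}, len(a.m)*len(b.m))}
	for x := range a.m {
		for y := range b.m {
			r.m[Pair[A, B]{First: x, Second: y}] = struct{}{}
		}
	}
	return r
}

func main() {
	p := CartesianProduct(Of(1, 2), Of("a", "b"))
	fmt.Println(p.Len())                                               // 4
	fmt.Println(p.Contains(Pair[int, string]{First: 2, Second: "a"})) // true
}
```

The result size is the product of the input sizes, so this grows quickly; an iterator-returning variant might be a better fit for large inputs.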
Thanks for the proposal. I strongly believe that a language such as Go, which takes pride in its standard library, should have more well-designed generic data structures. To add to the ideas presented here, a constructor function such as |
Is there a reason why some common set functions are missing from this proposal? For example: empty, subset, properSubset, superset, properSuperset, and disjoint. |
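The predicates mentioned in this thread are short to write on top of a map-backed set. The following is only a sketch: the function names, the free-function style, and the simplified Set stand-in are assumptions, not something the proposal specifies.

```go
package main

import "fmt"

// Set is a simplified, hypothetical stand-in for the proposed type.
type Set[E comparable] struct {
	m map[E]struct{}
}

// Of returns a Set containing the listed elements.
func Of[E comparable](v ...E) Set[E] {
	s := Set[E]{m: make(map[E]struct{}, len(v))}
	for _, w := range v {
		s.m[w] = struct{}{}
	}
	return s
}

// Subset reports whether every element of a is also in b.
func Subset[E comparable](a, b Set[E]) bool {
	if len(a.m) > len(b.m) {
		return false // a has more elements than b, so it cannot fit inside b
	}
	for v := range a.m {
		if _, ok := b.m[v]; !ok {
			return false
		}
	}
	return true
}

// ProperSubset reports whether a is a subset of b and b has extra elements.
func ProperSubset[E comparable](a, b Set[E]) bool {
	return len(a.m) < len(b.m) && Subset(a, b)
}

// Disjoint reports whether a and b have no elements in common.
func Disjoint[E comparable](a, b Set[E]) bool {
	ma, mb := a.m, b.m
	if len(ma) > len(mb) {
		ma, mb = mb, ma // loop through the smaller set
	}
	for v := range ma {
		if _, ok := mb[v]; ok {
			return false
		}
	}
	return true
}

func main() {
	small, big := Of(1, 2), Of(1, 2, 3)
	fmt.Println(Subset(small, big))        // true
	fmt.Println(ProperSubset(big, big))    // false
	fmt.Println(Disjoint(small, Of(7, 8))) // true
}
```

Superset and proper superset fall out by swapping the arguments, and an emptiness check is a comparison of the length with zero.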
Some functions are missing:
// Diff compares s with s2 and returns three sets: added (in s2 only),
// removed (in s only), and remained (in both).
// For example:
// s1 = {a1, a3, a5, a7}
// s2 = {a3, a4, a5, a6}
// added = {a4, a6}
// removed = {a1, a7}
// remained = {a3, a5}
func (s Set[T]) Diff(s2 Set[T]) (added, removed, remained Set[T]) {
	removed = newSet[T](len(s))
	added = newSet[T](len(s2))
	remained = newSet[T](len(s))
	for key := range s {
		if s2.Contains(key) {
			remained[key] = struct{}{}
		} else {
			removed[key] = struct{}{}
		}
	}
	for key := range s2 {
		if !s.Contains(key) {
			added[key] = struct{}{}
		}
	}
	return added, removed, remained
}
// DiffVary compares s with s2 and returns two sets: added (in s2 only)
// and removed (in s only).
// For example:
// s1 = {a1, a3, a5, a7}
// s2 = {a3, a4, a5, a6}
// added = {a4, a6}
// removed = {a1, a7}
func (s Set[T]) DiffVary(s2 Set[T]) (added, removed Set[T]) {
	removed = newSet[T](len(s))
	added = newSet[T](len(s2))
	for key := range s {
		if !s2.Contains(key) {
			removed[key] = struct{}{}
		}
	}
	for key := range s2 {
		if !s.Contains(key) {
			added[key] = struct{}{}
		}
	}
	return added, removed
} |
@thinkgos It seems like those functions are just combinations of functions already included in the proposal. 🤔
func (s Set[T]) Diff(s2 Set[T]) (added, removed, remained Set[T]) {
	return s2.Difference(s), s.Difference(s2), Intersection(s, s2)
}
func (s Set[T]) DiffVary(s2 Set[T]) (added, removed Set[T]) {
	return s2.Difference(s), s.Difference(s2)
}
Admittedly your It doesn't seem to me that these new additions are really adding enough to justify their presence, particularly given that it seems confusing to have both |
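A self-contained version of that composition might look like the sketch below. It assumes a plain map-based Set and free functions named Difference and Intersection, which are stand-ins rather than the proposal's exact API, and it reuses the example sets from the earlier comment.

```go
package main

import (
	"fmt"
	"maps"
	"slices"
)

// Set is a map-based set, a hypothetical stand-in for illustration.
type Set[T comparable] map[T]struct{}

// Of returns a Set containing the listed elements.
func Of[T comparable](v ...T) Set[T] {
	s := make(Set[T], len(v))
	for _, w := range v {
		s[w] = struct{}{}
	}
	return s
}

// Difference returns the elements of a that are not in b.
func Difference[T comparable](a, b Set[T]) Set[T] {
	r := Set[T]{}
	for v := range a {
		if _, ok := b[v]; !ok {
			r[v] = struct{}{}
		}
	}
	return r
}

// Intersection returns the elements present in both a and b.
func Intersection[T comparable](a, b Set[T]) Set[T] {
	r := Set[T]{}
	for v := range a {
		if _, ok := b[v]; ok {
			r[v] = struct{}{}
		}
	}
	return r
}

// Diff composes the two operations above: added is s2 minus s,
// removed is s minus s2, and remained is what both share.
func Diff[T comparable](s, s2 Set[T]) (added, removed, remained Set[T]) {
	return Difference(s2, s), Difference(s, s2), Intersection(s, s2)
}

func main() {
	s1 := Of("a1", "a3", "a5", "a7")
	s2 := Of("a3", "a4", "a5", "a6")
	added, removed, remained := Diff(s1, s2)
	fmt.Println(slices.Sorted(maps.Keys(added)))    // [a4 a6]
	fmt.Println(slices.Sorted(maps.Keys(removed)))  // [a1 a7]
	fmt.Println(slices.Sorted(maps.Keys(remained))) // [a3 a5]
}
```

The composed form walks the inputs three times instead of two, which is the efficiency trade-off against a hand-written single-purpose function.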
@apparentlymart Yes! I agree! |
I've implemented a Set library before using type definition Some nice side effects of this choice:
Some not-so-nice side effects:
Additionally, one of the methods I've implemented in my library is |
This proposal is entirely based on the initial proposal and following discussion at #47331 and https://go.dev/blog/range-functions. There was a proposal to reopen this discussion 2 weeks ago (#69033) but from what I understand proposals must include what's actually proposed. I apologize if this is not the proper way to do things.
Proposal details
This is a partial copy of the API proposed at #47331, with the doc comment modified following #47331 (comment), and a new All function that comes from https://go.dev/blog/range-functions.