sync: add Map.Len method? #20680

Open
F21 opened this Issue Jun 15, 2017 · 28 comments

F21 commented Jun 15, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

1.9-beta1

What operating system and processor architecture are you using (go env)?

Windows 10 64-bit

set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\Work
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0
set CXX=g++
set CGO_ENABLED=1
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config

It would be really useful to have a Length() method on sync.Map that can return the number of items in a map. Currently, I need to do something like this, which is quite tedious and not as readable:

length := 0

myMap.Range(func(_, _ interface{}) bool {
	length++ // count each key visited by Range
	return true // keep iterating
})

Contributor

randall77 commented Jun 15, 2017

bcmills Jun 15, 2017

Member

What's the use-case?

F21 Jun 15, 2017

I am using goroutines to process data and store the results in 2 separate maps. Once this is done, I need to compare the lengths of these 2 maps as a quick sanity check and to decide which branch to take for further processing.

Previously I was using github.com/orcaman/concurrent-map and there was a Count() method to do this.

@bradfitz bradfitz changed the title from Length() for sync.Map to sync: add Map.Length method? Jun 15, 2017

@bradfitz bradfitz added this to the Go1.9Maybe milestone Jun 15, 2017

bradfitz Jun 15, 2017

Member

This would normally be a Go 1.10 thing, but since this type is new in Go 1.9, I'll let @bcmills decide.

The implementation might be hairy enough to warrant Go 1.10 anyway, especially if the representation needs to change and other code gets modified.

cespare Jun 15, 2017

Contributor

Isn't the number you get out going to be a best-effort guess anyway? (Like len(channel)). Seems like an easy user workaround would be to maintain a parallel atomic int64 count.

(I don't know if that'd also be a reasonable internal implementation or not.)

F21 Jun 15, 2017

In my case, I am retrieving the length of the map after all goroutines have finished.

bcmills Jun 15, 2017

Member

sync.Map is optimized for long-lived, mostly-write workloads, for which a Len method would either be misleading (under-counting keys) or inefficient (introducing cache contention on the length counter).

The Range workaround at least has the benefit of appearing to be as expensive as it actually is.

I'm not opposed to the idea of adding a Len method, but I think we would need to show that the use-case is common enough to merit an API with such subtleties. At the very least, I don't think we should add it in 1.9.

@bradfitz bradfitz modified the milestones: Go1.10, Go1.9Maybe Jun 15, 2017

@rsc rsc changed the title from sync: add Map.Length method? to sync: add Map.Len method? Jun 26, 2017

AlexStocks Jul 6, 2017

@bcmills Don't you think it's very ridiculous that Go has a built-in func len for map while sync.Map does not have a corresponding method func (Map) Len() int to get its size?

bcmills Jul 6, 2017

Member

@AlexStocks No, I don't think it's "very [ridiculous]". Concurrent data structures are not the same as unsynchronized data structures. The built-in map type doesn't have (and doesn't need) a LoadOrStore, for example.

We should decide the API of each type based on its own tradeoffs. Consistency is a benefit, but there are costs to weigh it against.

rfyiamcool Jul 13, 2017

I think a length feature for sync.Map should be required in Go 1.9. Of course, this is just a suggestion.
Don't range over all entries. Extend a field as an atomic counter.

bcmills Jul 16, 2017

Member

@rfyiamcool

> Extend a field as an atomic counter

Delete calls (and Store calls with previously-deleted keys) on disjoint keys in the read-only part of the map do not contend in the current implementation. An atomic counter would reintroduce contention for those calls.

We didn't omit Len just to be stubborn. It really is a subtle problem.

bcmills Jul 17, 2017

Member

#21035 has more detail on some of the optimizations that might complicate an efficient implementation of Len.

GoLangsam Jul 26, 2017

@bcmills May I suggest adding a remark to the source comments, something like
"As of now, Len() is intentionally not implemented/provided due to ..." with a brief and concrete rationale.

This would ease understanding and adjust the expectations of future users (who are much more likely to read source comments than old issues).

(This should also apply to other methods suggested elsewhere, such as UpdateOrStore, that are currently not reasonably implementable.)

maj-o Sep 7, 2017

I looked at the source code of Range() and I think (knowing that the result could be wrong, see the remark on Range() and above) this would be the solution for Len():

return len(read.m) instead of the for loop in line 328

bcmills Sep 7, 2017

Member

@maj-o Promoting the read map makes Len an O(N) operation if interleaved with Store. By convention, Len methods on types in the Go standard library are O(1).

protosam Sep 15, 2017

I believe that having a .Count() or .Len(), or even just adding a count integer, would be a good addition to syncmap.Map. Personally, I'm using sync.Map to track connections in an asynchronous p2p API that is heavily built around propagating network writes. The count is necessary for artificial connection limits, and I had to switch from maps because of race conditions in goroutines.

Hope this shows some use-cases for you. I can think of various situations where this simple functionality would reduce code by many lines.

cznic Sep 15, 2017

Contributor

> The count is necessary for artificial connection limits, and I had to switch from maps because of race conditions in goroutines.

I think that does not justify adding a method to sync.Map, because it's easy to make your own derived type that additionally keeps the count/len information if you think such information is useful.
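
A derived type along those lines might look like the sketch below. `CountedMap` is a made-up name, not part of the sync package. The sketch relies on `sync.Map.Swap` (added in Go 1.20) and `sync.Map.LoadAndDelete` (added in Go 1.15), both of which are newer than this thread, to find out whether a key was actually inserted or removed.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// CountedMap is a hypothetical wrapper, not part of the sync package.
// It embeds sync.Map and tracks the number of keys in a separate counter.
type CountedMap struct {
	sync.Map
	n int64
}

// Store shadows sync.Map.Store. Swap (Go 1.20+) reports whether the key
// was already present, so the counter only grows for new keys.
func (c *CountedMap) Store(key, value interface{}) {
	if _, loaded := c.Map.Swap(key, value); !loaded {
		atomic.AddInt64(&c.n, 1)
	}
}

// Delete shadows sync.Map.Delete. LoadAndDelete (Go 1.15+) reports
// whether a key was actually removed.
func (c *CountedMap) Delete(key interface{}) {
	if _, loaded := c.Map.LoadAndDelete(key); loaded {
		atomic.AddInt64(&c.n, -1)
	}
}

// Len returns the counter. While writers are running it can briefly lag
// behind the map contents.
func (c *CountedMap) Len() int {
	return int(atomic.LoadInt64(&c.n))
}

func main() {
	var m CountedMap
	m.Store("a", 1)
	m.Store("b", 2)
	m.Store("a", 3)      // overwrite, not a new key
	m.Delete("b")
	m.Delete("missing")  // no effect on the count
	fmt.Println(m.Len()) // prints 1
}
```

Because sync.Map is embedded, its other mutating methods (LoadOrStore, LoadAndDelete, Swap and so on) remain reachable and would bypass the counter; a stricter version would hold the map in an unexported field. Every Store and Delete now touches the same counter, which is the contention cost discussed above.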

maj-o Sep 16, 2017

If dirty has length of N.
But first: nobody is interested in dirty. If this blocks you, leave it out: O(1).
Second: we can assume that dirty is roughly constant and small compared to the total amount: O(1) + const = O(1).
I know that in the worst case (the map is empty and dirty is full) we could have O(N). This can be left out or ignored, because in real life it does not matter. In real life the map is bigger than dirty and the amount of dirty is constant. So, if it hurts you, comment the dirty loop out or leave it, because it does not matter in real life. And there it is O(1).

bcmills Sep 19, 2017

Member

@maj-o

> If dirty has length of N.

That assumes that the internal representation of sync.Map always includes a single dirty map. Nothing in the API or documentation guarantees that to be the case, and in fact some optimizations (such as #21035) may require the opposite.

dhui Oct 4, 2017

My use case for an O(1) Len() func is to pre-allocate a slice of data (an optimization) to hold a filtered subset of the map values, populated by a call to Range(). FYI, my primary use for sync.Map is concurrent O(1) lookups.

So for my use case, Len() doesn't need to be consistent, since the worst case is an under-allocated slice (resulting in inconsistent performance) or an over-allocated slice (resulting in a bit more memory consumed).

protosam Oct 5, 2017

Hey guys,

I just want to clarify something here. What would you say is the scope of the requirements for adding such functionality?

Like, to make a decision to add this utility function a part of the API, are you looking for a large quantity of developer need/want for the feature?

Or are we just aiming to decide how it should work?

In the former situation, I believe that a count feature would simply be expected by developers, considering that the map type can be used with len().

In the latter situation if it's something that we all want added, but comes down to deciding how it should operate, I think that leads to two potential situations:

1 - a somewhat ballpark count of the map would be necessary for an application. This could be non blocking and the result just needs to be close to the current state.

2 - a blocking function that needs to be timed for an absolute accurate count at the given moment.

A utility function could be made for both scenarios.

I think it is best to provide at least the boilerplate functionality for developers, to help evade bad practices due to naivety when using the API. Best practices with concurrency are not obvious to new developers and devs trying to adopt Go. I do foresee newbies at least reading the godoc and being able to choose based on need.
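
The two situations above can be illustrated with a sketch that sidesteps sync.Map altogether and guards a built-in map with a sync.RWMutex. The type `LockedMap` and its methods are made up for illustration: `Len` blocks on the lock and returns an exact size, while `ApproxLen` never blocks and may be slightly stale.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// LockedMap is a hypothetical type: a built-in map guarded by a RWMutex.
type LockedMap struct {
	mu     sync.RWMutex
	m      map[string]int
	approx int64 // mirrors len(m); updated while mu is held
}

func NewLockedMap() *LockedMap {
	return &LockedMap{m: make(map[string]int)}
}

func (l *LockedMap) Store(k string, v int) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.m[k] = v
	atomic.StoreInt64(&l.approx, int64(len(l.m)))
}

func (l *LockedMap) Delete(k string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.m, k)
	atomic.StoreInt64(&l.approx, int64(len(l.m)))
}

// Len blocks until no writer holds the lock and returns the exact size.
func (l *LockedMap) Len() int {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return len(l.m)
}

// ApproxLen never takes the lock; the value may be slightly stale.
func (l *LockedMap) ApproxLen() int {
	return int(atomic.LoadInt64(&l.approx))
}

func main() {
	lm := NewLockedMap()
	lm.Store("a", 1)
	lm.Store("b", 2)
	lm.Delete("a")
	fmt.Println(lm.Len(), lm.ApproxLen()) // prints 1 1
}
```

The trade-off is that every operation now goes through one lock, which gives up the contention-avoiding behavior that sync.Map is designed for.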

bcmills Oct 9, 2017

Member

> are you looking for a large quantity of developer need/want for the feature?

Yes, and use-cases that it would (or would not) improve.

> Or are we just aiming to decide how it should work?

Not at this time.

> I think it is best […] to help evade bad practices due to [naivety] when using the API. Best practices with concurrency are not obvious to new developers and devs trying to adopt Go.

I agree that it is important to structure the API to encourage best practices. One best practice for using a concurrent data structure is to avoid depending on global properties of it (such as size), because computing those global properties can incur a surprising cost. That is perhaps the strongest reason why sync.Map does not have a Len method today.

This comment was marked as off-topic.

henrylee2cn Oct 13, 2017

From #22247

func (m *Map) Len() int

The following is a reference:

https://github.com/henrylee2cn/goutil/blob/master/map.go#L541

This comment was marked as off-topic.

ychen11 Mar 7, 2018

So what's the status of this issue after this argument? Personally, I just see a million developers trying to persuade an arrogant golang maintainer that their requirement needs to be noticed.
I feel bad about this issue; it is typical of Golang.

This comment was marked as off-topic.

josharian Mar 7, 2018

Contributor

@ychen11 please be polite

@golang golang deleted a comment from c-Monster Mar 7, 2018

@golang golang locked as too heated and limited conversation to collaborators Mar 7, 2018

@golang golang unlocked this conversation Jul 13, 2018

bcmills Jul 13, 2018

Member

Unlocking to allow for further comments regarding concrete examples and usage statistics.
(Ideally, please reference experience reports addressing concrete problems and the available workarounds.)

“Me too” comments that do not address the technical considerations up-thread are off topic and will be hidden or deleted.

protosam Jul 19, 2018

To be fair, I don't think many projects even need sync.Map at this time. Still, I have 4 separate projects in which I am clumsily iterating my maps to get counts. I would like to see something more standardized, planned out, and discussed, and eventually implemented as a feature. That way I don't have to second-guess later whether my code is bad.

@bcmills, regarding your comment from Oct 9, 2017, I can definitely see what you're getting at. In this particular case, considering what sync.Map is, I think not providing a global way to track length is a bit of an oversight.

Hypothetically, the applications that will use (or can benefit from) sync.Map are likely to be any concurrent tasks that want managed access to a map. In server-side implementations where the server is delegating information between its clients, it is quite common to want a count of things. If someone wanted to use this to concurrently push data sources into memory, a resulting count of things may be what the application is after.

Something I've been considering for a while is to copy sync.Map, add an integer variable and a function that returns its value, and then add to or subtract from it every time Store() or Delete() completes successfully.

This is starting to look more appealing than iterating the entire map to confirm the count. I'm in a situation where the time it takes to count the entire map is slowing things down noticeably, and I think that spending the small amount of compute time this plan requires will alleviate my problem.

Zalgo2462 Jul 19, 2018

Hello, I am currently developing an application which matches up pairs of individual records in a stream of input data. I have hash-partitioned the data stream such that each of my goroutines will use disjoint sets of keys in my sync.Map. However, sometimes a record won't have a corresponding match in the data stream. Over time, these records accumulate in the map. I'd like to routinely get an estimate of the map's size to trigger an eviction policy carried out by Range. This keeps the RAM usage of the program fairly constant without degrading performance.

Currently I am looking at implementing a solution similar to @protosam's and maintaining a count myself using a type derived from sync.Map.

I'm not sure what I'm asking for should be called Len, but it would certainly be useful.
