
sync: new version of sync.Map #47643

Open
puzpuzpuz opened this issue Aug 11, 2021 · 7 comments
Labels
NeedsInvestigation Performance
Milestone

Comments

@puzpuzpuz puzpuzpuz commented Aug 11, 2021

Preface

Many projects written in Go have modules that use either sync.Map or a plain map protected by a mutex. Those that use sync.Map do not necessarily fit the use cases suggested by the docs:

The Map type is optimized for two common use cases: (1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys. In these two cases, use of a Map may significantly reduce lock contention compared to a Go map paired with a separate Mutex or RWMutex.

Such projects, and even those that do conform to the above use cases, would benefit from better sync.Map performance. This issue describes a way to improve the existing sync.Map or add a new concurrent map implementation.

Proposal

Proposed data structure may be found here: https://github.com/puzpuzpuz/xsync#map

This map uses a modified version of the Cache-Line Hash Table (CLHT) data structure. CLHT is built around the idea of organizing the hash table into cache-line-sized buckets, so that on all modern CPUs update operations complete with at most one cache-line transfer. Get operations involve no writes to memory and take no mutexes or any other kind of lock. xsync.Map makes some modifications to the CLHT algorithm.

I won't go into all the details of the original algorithm, but if you're familiar with Java's ConcurrentHashMap, it's somewhat similar, yet organizes the table in a CPU-friendly way. On 64-bit builds the bucket layout looks like the following:

| bucket mutex | keys array            | values array          | pointer to next bucket |
| 8 bytes      | 24 bytes (3 pointers) | 24 bytes (3 pointers) | 8 bytes                |
|<-                         one cache line (64 bytes)                                ->|

More details and considerations may be found here.

In general, I'm under the impression that the algorithm fits nicely into Go thanks to the flexible control over the memory layout of structs, as well as the presence of garbage collection.

Compatibility aspects

The most significant behavioral difference from sync.Map is that nil values are not supported. That restriction could be kept, or interface{} values pointing to a special "nil" struct could be used internally in the Map to mark nil values. Alternatively, the restriction could be placed on user code, which would mean a breaking change if done in sync.Map.
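The internal "nil" sentinel mentioned above could be sketched roughly as follows. This is a hedged illustration of the idea, not actual sync.Map or xsync.Map code; the names wrap/unwrap/nilValue are hypothetical:

```go
package main

import "fmt"

// nilValue is a hypothetical private sentinel that stands in for a
// user-supplied nil value, so the underlying table never has to store
// nil itself.
var nilValue = new(struct{})

// wrap converts a user value into its internal representation.
func wrap(v interface{}) interface{} {
	if v == nil {
		return nilValue
	}
	return v
}

// unwrap converts the internal representation back to the user value.
func unwrap(v interface{}) interface{} {
	if v == nilValue {
		return nil
	}
	return v
}

func main() {
	stored := wrap(nil)         // internal representation is non-nil
	fmt.Println(stored != nil)  // true
	fmt.Println(unwrap(stored)) // <nil>
}
```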

Resize is also different since xsync.Map is grow-only, but that could be changed if necessary.

Benchmarks

The following image shows the results of a benchmark run on a cloud VM (GCP, e2-highcpu-32, Intel Xeon, Haswell gen, Go 1.16.5, Ubuntu 20.04 64-bit). The scenario assumes 1M pre-warmed entries with 99% Gets, 0.5% Stores, and 0.5% Deletes, and can be considered read-heavy, which is beneficial for sync.Map.

[chart: map-99-reads — throughput comparison for the 99%-reads scenario]

It may be seen that sync.Map (BenchmarkMapStandard_WarmUp) doesn't scale in this scenario, while xsync.Map (BenchmarkMap_WarmUp) does. More measurements can be found here.

If you have an idea for a better benchmark, I'm happy to run it and share the measurements. So far, I haven't found a scenario where xsync.Map would show a regression compared with sync.Map.
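The described operation mix could be reproduced with a benchmark along these lines. This is an illustrative sketch against the standard sync.Map, not the benchmark actually used for the chart, and the table is deliberately smaller than the 1M entries from the run above to keep the sketch quick:

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"testing"
)

// BenchmarkSyncMapReadHeavy sketches the scenario from the issue:
// a pre-warmed table, then 99% Loads, 0.5% Stores and 0.5% Deletes
// per operation, run from parallel goroutines.
func BenchmarkSyncMapReadHeavy(b *testing.B) {
	const entries = 1 << 16 // the issue uses 1M; reduced to keep the sketch quick
	var m sync.Map
	for i := 0; i < entries; i++ {
		m.Store(i, i)
	}
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		r := rand.New(rand.NewSource(rand.Int63()))
		for pb.Next() {
			k := r.Intn(entries)
			switch c := r.Intn(1000); {
			case c < 990: // 99% reads
				m.Load(k)
			case c < 995: // 0.5% writes
				m.Store(k, k)
			default: // 0.5% deletes
				m.Delete(k)
			}
		}
	})
}

func main() {
	res := testing.Benchmark(BenchmarkSyncMapReadHeavy)
	fmt.Println(res.N > 0)
}
```

Swapping sync.Map for another implementation with the same operation mix would give a directly comparable number.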

Summary

It's not necessary that xsync.Map or its algorithm be used for the next version of sync.Map; any other viable option may be considered. But having an efficient concurrent map in the Go standard library would certainly benefit all users.

@carlmjohnson carlmjohnson (Contributor) commented Aug 11, 2021

If another sync.Map is added, I expect users will demand a generic version. Can this be made to work with the generics proposal? If so, maybe it could go into the proposed generics maps package as maps.Sync or some other name.

@bcmills bcmills (Member) commented Aug 11, 2021

I would prefer to improve (replace the implementation of) the existing sync.Map and/or add a generic Map type than to add a second (non-generic) sync.Map type parallel to the first.

If the proposed implementation satisfies all of the existing (documented) invariants of the sync.Map API, then I think it would be fine to swap out the implementation. I do think it's important to preserve the existing behavior for the nil key, though.

We also need to be careful that Store operations concurrent with Range calls continue to work; see #46399.

I assume that this would also address #21035 and #21032?

@bcmills bcmills added this to the Backlog milestone Aug 11, 2021
@bcmills bcmills added the NeedsInvestigation label Aug 11, 2021
@ianlancetaylor ianlancetaylor (Contributor) commented Aug 11, 2021

Can you clarify what it means to say that nil values are not supported? I can think of a couple of possibilities. For example, perhaps you could show a small piece of code that works with the current sync.Map but would stop working under this proposal. Thanks.

@puzpuzpuz puzpuzpuz (Author) commented Aug 11, 2021

Trying to answer all the questions raised so far. Please let me know if something is still unclear. For example, I could go into more detail on how atomic snapshots in readers work.

If another sync.Map is added, I expect users will demand a generic version. Can this be made to work with the generics proposal? If so, maybe it could go into the proposed generics maps package as maps.Sync or some other name.

I believe it's possible to swap the current sync.Map for the new implementation without breaking the API. I left some ideas on how to approach this in the "Compatibility aspects" section above. Moreover, it would be possible to relax some things described as hints to users, like the list of use cases or the O(N) comment on the Range method.

So, yes, a replacement for sync.Map seems to be the best option to me.

As for generics, I'm not familiar with the design, so I can't tell whether xsync.Map is a good match for them. But even if it's not, that doesn't mean another algorithm (say, one similar to CHM in Java) won't be a good fit.

We also need to be careful that Store operations concurrent with Range calls continue to work; see #46399.

Yes, both nested and concurrent Stores in Range should work fine.

I assume that this would also address #21035 and #21032?

It would certainly address #21035, since writers won't block any reader (at least not in the general sense, with a lock/mutex; the atomic snapshot code in readers may need to spin until it succeeds), nor writers that want to modify a different hash table bucket.

As for #21032, I'd say it should also be resolved, since readers don't need to acquire any mutex (and, in general, don't issue any writes to shared memory). Because of this, readers should scale linearly with the number of cores.

Can you clarify what it means to say that nil values are not supported? I can think of a couple of possibilities. For example, perhaps you could show a small piece of code that works with the current sync.Map but would stop working under this proposal. Thanks.

I just mean that my library panics when a nil value is provided. So m.Store("foo", nil) would panic with xsync.Map, which is not the case with sync.Map.

But as I mentioned above, it's possible to work around that restriction and support nil values. This would mean allocating an intermediate interface{} struct, but I don't think it can be considered as something impacting users. Other than that, the number of allocations in xsync.Map's main operations is the same as in sync.Map, but with a lower allocation rate.

@puzpuzpuz puzpuzpuz (Author) commented Aug 11, 2021

One more thing to mention. I haven't verified it yet, but there is a way to make scans more efficient than they are now. At the moment they do an equality check for each key in the scanned buckets, but that could be improved. By storing the MSBs of a key's hash code in tagged pointers to the key/value, it would be possible to skip entries known to have a different hash code. See puzpuzpuz/xsync#2
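The tag-matching part of that idea can be sketched without the unsafe pointer tagging itself. The sketch below, with hypothetical names, only shows the filter logic: equal tags are necessary but not sufficient for equal keys, so a tag mismatch lets a scan skip an entry without dereferencing the key pointer for a full comparison:

```go
package main

import (
	"fmt"
	"hash/maphash"
)

// topHashByte returns the most significant byte of a key's hash code.
// In a tagged-pointer scheme (hypothetical here, not the actual xsync
// implementation) this byte would be packed into spare bits of the
// entry pointer.
func topHashByte(h uint64) uint8 {
	return uint8(h >> 56)
}

func main() {
	seed := maphash.MakeSeed()
	hash := func(s string) uint64 {
		var mh maphash.Hash
		mh.SetSeed(seed)
		mh.WriteString(s)
		return mh.Sum64()
	}
	probe, stored := hash("foo"), hash("bar")
	// A mismatch proves the stored key cannot equal the probe key,
	// so the scan moves on without touching the key itself.
	if topHashByte(probe) != topHashByte(stored) {
		fmt.Println("skip entry without comparing keys")
	} else {
		fmt.Println("tags collide; fall back to full key comparison")
	}
}
```

With an 8-bit tag, roughly 255 out of 256 non-matching entries would be skipped without a key dereference.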

@puzpuzpuz puzpuzpuz (Author) commented Aug 12, 2021

If there is consensus on replacing sync.Map's internals with the proposed implementation, I'm happy to open a PR and start working on integration with the runtime (the first step would be hash code and equality calculation for interface{} keys). I will have some questions along the way, so it would be nice to have someone available to answer them.

In the meanwhile, I'm going to resolve both nil values and grow-only resize limitations in my library.

@puzpuzpuz puzpuzpuz (Author) commented Aug 14, 2021

In the meanwhile, I'm going to resolve both nil values and grow-only resize limitations in my library.

Update: both limitations have been resolved. See puzpuzpuz/xsync#6 and puzpuzpuz/xsync#11
