New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: recognize map-copy with range idiom #26951

Open
martisch opened this Issue Aug 13, 2018 · 2 comments

Comments

Projects
None yet
3 participants
@martisch
Member

martisch commented Aug 13, 2018

As @mvdan notes on the map clearing idiom issue #20138 there are more cases of the map copying pattern than there are cases of the map clearing pattern in the std library:

gogrep '~ $m2 = $_; $*_; for $k, $v := range $(m1 is(map)) { $m2[$k] = $v }' std cmd

map copy pattern:

m2 := make(map[K]V)
for k, v := range m1 {
    m2[k] = v
}

This pattern and variations of it could be detected and made more efficient by avoiding (some) iteration overhead and allocating the new map to a sufficient size.

Implementation Notes

Copying the backing array directly with memcpy (as analog to memclr in the map clearing pattern optimization) however would need to keep the internal hash seed of the two maps equal. Given that there are no guarantees what the iteration order over maps is (not even that it can not be the same) from a language spec perspective that change seems backwards compatible. There is the possibility of leaking state (the seed) to other parts of a go implemented system and by learning something about collisions in the copied map now being able to infer collisions in the original map.

Another implementation caveat is that any values in overflow buckets will need to be copied too. In the map clear case they are currently ignored for clearing as the pointers to them will be removed and the overflow buckets are garbage collected. For a map copy however all overflow buckets will need to be processed too. The new overflow buckets could be bulk allocated. Depending on the complexity and performance gain the backing array and overflow buckets could be allocated in one allocation together.

A further aspect is that if the map is growing that growth would either need to finish first before copying the new bucket array (could be a lot of work) or the state of map growth with old and new bucket array would have to be copied too.

Note that map make + map copy issue is similar to slice make + slice copy/append #26252 and some of the infrastructure to detect the patterns could be reused for both optimizations. There is the possibility to make even more general compiler pass that detects making and overwriting more complex structures (that can not already be easily handled in ssa phase) to avoid unneeded memclr in the future. Slices and Maps directly seem like common enough and requested cases to specialize first in absence of a more general and likely more complex solution.

Map Copying A Special Case Of Map Merging

A more general optimization to map patterns could be to detect any merge (where copy is a special case of m2 being empty):

for k, v := range m1 {
    m2[k] = v
}

While this is easier to detect as a pattern (no compile time reasoning that m2 is empty needed) it is unclear if there are large performance gains that could be made in the non-copy case.

@josharian @khr @mvdan

@martisch martisch added this to the Unplanned milestone Aug 13, 2018

@josharian

This comment has been minimized.

Contributor

josharian commented Aug 13, 2018

Thanks for working this through. Given the complexity of implementation (as you note), I'm inclined to say we shouldn't do this, at least not with our current map implementation.

@valyala

This comment has been minimized.

Contributor

valyala commented Aug 17, 2018

A further aspect is that if the map is growing that growth would either need to finish first before copying the new bucket array (could be a lot of work)

Map copying loop already performs comparable amount of work to map growth, so it should be OK to finish map growth before map copying. This should be simpler than trying to copy map growth state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment