runtime: clear() is slow for maps with big capacity and small number of items #70617

Open · valyala opened this issue Nov 29, 2024 · 7 comments
Labels: compiler/runtime, NeedsInvestigation, Performance
Milestone: Backlog

valyala (Contributor) commented Nov 29, 2024

The issue

The following pattern is frequently used to avoid excess memory allocations by re-using a map:

func f() {
  m := make(map[string]int)

  for {
    addSomeItemsToMap(m)
    useMap(m)

    // clear the map for subsequent re-use
    clear(m)
  }
}

It turns out that clear(m) performance is proportional to the number of buckets in m. The number of buckets can grow significantly inside addSomeItemsToMap(). After that, clear(m) slows down significantly (and permanently), even if only a few items are added to the map on subsequent iterations.

See https://philpearl.github.io/post/map_clearing_and_size/ for more details.
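
The blog post linked above measures this effect; a standalone repro along the same lines (illustrative, not taken from the issue) makes the asymmetry visible. The two timed clear calls below remove the same ten items, but the first one still walks all the buckets left over from the earlier growth:

package main

import (
  "fmt"
  "time"
)

func main() {
  m := make(map[string]int)

  // Grow the map: this permanently increases its bucket count.
  for i := 0; i < 1_000_000; i++ {
    m[fmt.Sprint(i)] = i
  }
  clear(m)

  // Subsequent iterations only ever hold a handful of items...
  for i := 0; i < 10; i++ {
    m[fmt.Sprint(i)] = i
  }
  start := time.Now()
  clear(m) // ...yet clear still walks every bucket from the growth above.
  fmt.Println("clear of re-used map:", time.Since(start))

  // A fresh map holding the same ten items clears far faster.
  small := make(map[string]int)
  for i := 0; i < 10; i++ {
    small[fmt.Sprint(i)] = i
  }
  start = time.Now()
  clear(small)
  fmt.Println("clear of fresh map:  ", time.Since(start))
}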

The solution

The Go runtime must be able to switch between an algorithm that unconditionally clears all buckets in m and one that clears only the buckets containing at least one item, depending on the ratio between the number of items in the map and the number of buckets. This should improve the performance of clear(m) in the pattern above when each iteration can store a widely different number of items in m.
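
A minimal sketch of that idea on a toy bucket layout, assuming an occupancy bitmap is maintained on insert; the real runtime representation, and any threshold, would differ:

package sparseclear

import "math/bits"

type entry struct {
  key string
  val int
}

// toyMap models a map as a bucket array plus an occupancy bitmap
// (bit i set => bucket i holds at least one item). This is a sketch
// of the proposal, not the runtime's actual layout.
type toyMap struct {
  count   int
  buckets [][]entry
  used    []uint64
}

// clearAll mirrors today's behavior: touch every bucket unconditionally.
func (m *toyMap) clearAll() {
  for i := range m.buckets {
    m.buckets[i] = m.buckets[i][:0]
  }
  clear(m.used)
  m.count = 0
}

// clearSparse touches only occupied buckets, skipping 64 empty buckets
// per bitmap word.
func (m *toyMap) clearSparse() {
  for w, word := range m.used {
    for word != 0 {
      i := w*64 + bits.TrailingZeros64(word)
      m.buckets[i] = m.buckets[i][:0]
      word &= word - 1 // drop the lowest set bit
    }
    m.used[w] = 0
  }
  m.count = 0
}

// Clear switches between the two algorithms based on the item/bucket
// ratio, as proposed above. The factor of 8 is a made-up threshold.
func (m *toyMap) Clear() {
  if m.count*8 < len(m.buckets) {
    m.clearSparse()
  } else {
    m.clearAll()
  }
}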

gopherbot added the compiler/runtime label Nov 29, 2024

randall77 (Contributor) commented

This is another case where I think map shrinking (#20135) is probably the right solution.
Or at least, it would help a lot, and maybe enough. And it helps lots of other cases.

mknyszek added this to the Backlog milestone Dec 2, 2024
mknyszek (Contributor) commented Dec 2, 2024

Wouldn't shrinking the map partially defeat the original optimization though?

Not saying we shouldn't shrink maps; it's just that the particular example the OP gave seems like a trade-off between clear performance and map insert performance (because shrinking forces the map to grow again later). It may also be that just shrinking maps is better overall.
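
That trade-off is visible even in a user-level workaround for the original pattern (a sketch with made-up names and thresholds, not a proposal for the runtime): replacing an over-grown map restores fast clears, but pays for re-growth on the next large iteration.

// reusableMap tracks a high-water mark as a rough proxy for the map's
// bucket count, which Go does not expose.
type reusableMap struct {
  m    map[string]int
  peak int
}

// reset either clears the map in place or, when the map's past peak
// dwarfs the current working set, drops it and allocates a fresh one.
// The factor of 16 is arbitrary.
func (r *reusableMap) reset() {
  n := len(r.m)
  if n > r.peak {
    r.peak = n
  }
  if r.m == nil || r.peak > 16*max(n, 1) {
    r.m = make(map[string]int, n)
    r.peak = n
    return
  }
  clear(r.m)
}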

mknyszek (Contributor) commented Dec 2, 2024

As an additional note, I wonder if this is any better or worse with the Swiss map implementation in Go 1.24.

mknyszek added the NeedsInvestigation label Dec 2, 2024
thepudds (Contributor) commented Dec 2, 2024

FWIW, Keith commented on a related issue regarding clear performance for large maps under the new Swiss map implementation in #52157 (comment).

thepudds (Contributor) commented Dec 2, 2024

> Wouldn't shrinking the map partially defeat the original optimization though?

One approach could be that the backing storage would not shrink immediately. For example, there is this comment from Keith in #54454 (comment), including:

> We probably don't want to shrink if people do the map clear idiom + reinsert a new set of stuff. Perhaps we only start shrinking after 2N operations, so that N delete + N add involves no shrinking. There's definitely a tradeoff here between responsiveness when you use fewer entries and extra work that needs to be done to regrow.

(That is an older comment that predates the clear builtin, I think, but presumably something similar could apply.)
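
A sketch of that heuristic (types and thresholds are hypothetical, not from the runtime): after the entry count peaks at N, require 2N further operations before shrinking, so that N deletes followed by N re-inserts never trigger one.

// shrinkTracker decides when a map's backing storage may shrink.
type shrinkTracker struct {
  peak int // high-water mark of the entry count
  ops  int // inserts/deletes observed since the peak was set
}

// record is called on every insert or delete with the current entry
// count; it reports whether a shrink is now worthwhile.
func (t *shrinkTracker) record(count int) bool {
  if count >= t.peak {
    t.peak = count
    t.ops = 0
    return false
  }
  t.ops++
  // Shrink only after 2*peak operations, and only once the map is
  // much emptier than its peak (the factor of 4 is made up).
  return t.ops >= 2*t.peak && count*4 < t.peak
}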

prattmic (Member) commented Dec 3, 2024

In https://go.dev/cl/627156, I made swissmap iteration use the map metadata to skip runs of empty slots, which significantly speeds up iteration over large but sparse maps (-38% vs old maps).

The same idea could be applied to clear to address this issue. clear will still take longer on large maps, but the increase should be less extreme.
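
The metadata trick can be sketched on a simplified Swiss-table group layout (this mimics the idea, not the actual swissmap code): each group carries one control byte per slot, so a single 8-byte load reveals whether the whole group is empty and can be skipped during clear.

package main

import (
  "encoding/binary"
  "fmt"
)

// In this toy layout, 0x80 marks an empty slot and any other value is
// a hash fragment for a full slot; eight empty control bytes read as
// one 64-bit word equal to allEmpty.
const (
  emptyCtrl        = 0x80
  allEmpty  uint64 = 0x8080808080808080
)

type group struct {
  ctrl  [8]byte
  slots [8]int
}

// clearGroups resets only groups holding at least one full slot.
// Sparse tables are mostly skipped at one load per group, which is
// the source of the speedup described above.
func clearGroups(groups []group) {
  for i := range groups {
    if binary.LittleEndian.Uint64(groups[i].ctrl[:]) == allEmpty {
      continue // every slot empty: nothing to clear
    }
    for j := range groups[i].ctrl {
      groups[i].ctrl[j] = emptyCtrl
    }
    groups[i].slots = [8]int{}
  }
}

func main() {
  gs := make([]group, 1024)
  for i := range gs {
    gs[i].ctrl = [8]byte{emptyCtrl, emptyCtrl, emptyCtrl, emptyCtrl,
      emptyCtrl, emptyCtrl, emptyCtrl, emptyCtrl}
  }
  gs[7].ctrl[3], gs[7].slots[3] = 0x1f, 42 // one full slot
  clearGroups(gs)
  fmt.Println(gs[7].slots[3]) // 0
}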
