proposal: runtime/pprof: add new WithLabels* function that requires fewer allocations #33701
Comments
Change https://golang.org/cl/188499 mentions this issue.
Just to summarize here to avoid needing to load external links: right now the API is pprof.Labels, which builds a LabelSet from a variadic list of key-value strings, and pprof.WithLabels, which attaches that LabelSet to a context.
This requires making a slice to pass to Labels, and that slice escapes (and is variable sized) so it must be heap allocated. The proposed interface in the CL is to add WithLabelsFromMapper(context.Context, LabelMapper), where LabelMapper is an interface with Len and Map methods.
So you have to make a LabelMapper and then the pprof package calls its Len and Map methods to retrieve the labels. But if you already have the key-value pairs in your own data structure, you avoid allocating the converted slice. They still get copied into the context in some form, though, so you've cut the allocations by at most 50%, not 100%. Is there a simpler or cleaner API? Is Len really necessary?
I do not think the proposal claims anywhere to bring the cost down to zero allocations. I understand that describing an opportunity to save two allocations, when in the worst case this saves only one of them, could be read as misleading. That was not my intention. Note that there are usually at least 3 allocations involved.
The usual case I have seen should therefore save at least 2 of 3 allocations, since most users do not use []string as their native representation of key-value pairs that can be passed as-is to pprof.Labels.
Len makes it possible to pre-size the internally created map so that the initial allocation usually holds all items from the parent and child contexts, which is another optimisation. Profiling shows that some map growth happens in this function, which seems to account for ~30% of the time in WithLabels.
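The effect of a size hint can be checked with a small experiment. The following is only a sketch under assumptions: mergeGrow, mergePresized, and the labels helper are hypothetical, and the maps stand in for whatever WithLabels builds internally. The reported counts depend on the Go version and map size, so no particular numbers are claimed.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps results reachable so the maps are heap allocated.
var sink map[string]string

// mergeGrow copies parent and child labels into a map without a size hint,
// so the map may have to grow while it is being filled.
func mergeGrow(parent, child map[string]string) map[string]string {
	m := make(map[string]string)
	for k, v := range parent {
		m[k] = v
	}
	for k, v := range child {
		m[k] = v
	}
	return m
}

// mergePresized does the same work but sizes the map up front.
func mergePresized(parent, child map[string]string) map[string]string {
	m := make(map[string]string, len(parent)+len(child))
	for k, v := range parent {
		m[k] = v
	}
	for k, v := range child {
		m[k] = v
	}
	return m
}

// labels builds n hypothetical labels with the given key prefix.
func labels(prefix string, n int) map[string]string {
	m := make(map[string]string, n)
	for i := 0; i < n; i++ {
		m[fmt.Sprintf("%s%d", prefix, i)] = "v"
	}
	return m
}

func main() {
	parent, child := labels("p", 8), labels("c", 8)
	grow := testing.AllocsPerRun(1000, func() { sink = mergeGrow(parent, child) })
	pre := testing.AllocsPerRun(1000, func() { sink = mergePresized(parent, child) })
	fmt.Printf("no hint: %.0f allocs/op, with hint: %.0f allocs/op\n", grow, pre)
}
```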
Ping to experts. Would be nice if this could be resolved before go1.14 enters the freeze period.
I think @matloob knows the history of this API design, and the plan to optimize LabelSet and WithLabels, better. My impression was that it was originally designed to support census, and that it was uncommon to see a large number of new tags (labels) added at once (usually one or two additional tags per Do call, except when a server is creating a new context based on tags from the wire). Maybe the trend has changed now?
Ping @pjweinbgo @matloob for any thoughts. Needing to make a Do does make this more complex.
I don't think that we need to have a variant of Do. WithLabels is meant to be a lower level interface than Do, and WithLabelsFromMapper looks like just a more efficient replacement that an alternative to Do should use. I think it would be valid for OpenCensus to have its own variant of Do that calls WithLabelsFromMapper instead. As long as the OpenCensus variant of Do sets and unsets labels the same way Do does, code that uses pprof's Do and code that uses OpenCensus's Do should be compatible. The way I thought about Do originally was that we'd make Do available for users who weren't using census or something similar, and other libraries would provide their own Dos that set the labels on the context and called WithLabels at the same time. It looks like this change provides another, more efficient alternative to WithLabels, so it seems like it fits in fine.
This sense of "map" - meaning apply a function to a data structure - is not one we've used much in Go to date (strings.Map is the exception). I'm also bothered by needing both Len and Map. Would it really be so bad if there was only Map? Then the argument could be a plain function instead of an interface. Right now the CL uses it as a hint to allocate a map, in which case it really doesn't matter, but if a different implementation used it to allocate a slice and then blindly filled in increasing indexes, that would be a problem. I'm also a little confused about the avoidance of LabelSet. Should we instead be looking at an alternate LabelSet constructor, like LabelSetFromFunc (or a better name)? Then the result could be passed to both Do and WithLabels.
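One possible reading of that suggestion is sketched below. The thread does not give a signature for LabelSetFromFunc, so the callback shape here is an assumption, and label and labelSet are local stand-ins rather than the real runtime/pprof types. The constructor takes a plain function and needs no Len method.

```go
package main

import "fmt"

// label and labelSet are local stand-ins for the unexported internals of
// runtime/pprof; they exist only so this sketch is self-contained.
type label struct{ key, value string }

type labelSet struct{ list []label }

// labelSetFromFunc is a HYPOTHETICAL constructor. The caller passes a
// function that reports each key/value pair by calling add.
func labelSetFromFunc(each func(add func(key, value string))) labelSet {
	var set labelSet
	each(func(key, value string) {
		set.list = append(set.list, label{key, value})
	})
	return set
}

func main() {
	// The caller iterates its own representation; no []string is built.
	tags := map[string]string{"method": "GET", "route": "/users"}
	set := labelSetFromFunc(func(add func(key, value string)) {
		for k, v := range tags {
			add(k, v)
		}
	})
	fmt.Println(len(set.list))
}
```

Without a length hint the slice grows by appending, which is the trade-off raised in the question about dropping Len.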
Any comments about trying to use an alternate LabelSet constructor instead of all new API?
I have not gotten around to profiling or testing that yet. Putting better naming aside, a possible approach is:
to keep
Ping @martisch - any profiling or testing of this alternate approach?
The approach of adding a new helper In addition, I removed the mapper closure in the new approach, as it caused an allocation that would regress the List case in the number of allocations. This has the downside of exposing the internal map[string]string structure. As that is not ideal, I would first need to investigate whether we can stack allocate and keep that implementation detail hidden behind a closure again, if that is important. code:
Updated cl/188499 with new code.
@martisch, thanks for confirming that we can use an alternate LabelSet constructor instead of having to change other parts of the API. It would still be good to find a way to avoid exposing the map[string]string. In the long run I expect that internal detail might change too. Unless it is OK for the LabelSet constructor to be passed a dummy map and copy those values out.
Since we are in the freeze for go1.14, I have 6 months to spend some time figuring out an interface (and a potential generic compiler optimization, if needed) that does not cause an additional allocation and assumes only a string type for key and value, with no other implementation details.
runtime/pprof.Labels is used in conjunction with runtime/pprof.WithLabels to set pprof labels in a context for performance profiling.
go/src/runtime/pprof/label.go (line 59 in c485506)
Adding information for fine-grained, on-demand profiling of already running binaries should ideally be very efficient, so that it can always stay enabled with minimal overhead. The current API could be made more efficient by requiring fewer heap allocations. Pprof information sourced from contexts added by census frameworks is used in large Go deployments on every RPC request, so small performance gains add up to a larger resource saving across many servers.
The current runtime/pprof API requires census frameworks such as OpenCensus to first convert their internal representation of key and value tag pairs (in a slice or map) to a slice of strings for input to runtime/pprof.Labels.
https://github.com/census-instrumentation/opencensus-go/blob/df6e2001952312404b06f5f6f03fcb4aec1648e5/tag/profile_19.go#L24
This requires at least one heap allocation for a variable number of labels. Then, internally, the Labels function constructs a LabelSet data structure, which requires another allocation (the case where this uses more than one allocation will be improved with cl/181517). All in all this makes two heap allocations per context creation with pprof labels, which can potentially be avoided.
I propose to extend runtime/pprof with an API that takes, for example, a mapping/iteration interface, such that census frameworks can implement that interface on their internal tag representations (e.g. maps and slices with custom types). runtime/pprof can then source the labels to be set in a new runtime/pprof.WithLabels* function without first requiring conversion between multiple internal and external data structures.
cl/188499 is a quick prototype as an example of how this could look. Other ways of designing an interface that reduces allocations are possible. Note that the LabelSet struct cannot be changed to an interface itself (which seems the cleaner approach) because that would not be backwards compatible.
/cc @aclements @randall77 @matloob