Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: runtime/pprof: add new WithLabels* function that requires fewer allocations #33701

Open
martisch opened this issue Aug 17, 2019 · 15 comments

Comments

@martisch
Copy link
Member

@martisch martisch commented Aug 17, 2019

runtime/pprof.Labels is used in conjunction with runtime/pprof.WithLabels to set pprof labels in a context for performance profiling.

func Labels(args ...string) LabelSet {

Adding information for fine grained on demand profiling of already running binaries should idealy be very efficient so it can always stay enabled with minimal overhead. The current API could be made more efficient by requiring fewer heap allocations. Pprof information sourced from contexts added by census frameworks is used in large go deployments on every RPC request and thereby small performance gains add up to a larger resource saving across many servers.

The current runtime/pprof API requires census frameworks such as OpenCensus to first convert their internal representation of key and value tag pairs (in a slice or map) to a slice of strings for input to runtime/pprof.Labels.

https://github.com/census-instrumentation/opencensus-go/blob/df6e2001952312404b06f5f6f03fcb4aec1648e5/tag/profile_19.go#L24

This requires at least one heap allocation for a variable amount of labels. Then internaly the Labels functions constructs a LabelSet data structure which requires another allocation (the case where this uses more than one allocation will be improved with cl/181517 ). All in all this makes two heap allocations per context creation with pprof labels which can potentially be avoided.

I propose to extend runtime/pprof to have an API that takes e.g. a mapping/iteration interface such that census frameworks can implement that interface on their internal tag representations (e.g. maps and slices with custom types) and runtime/pprof can then source the labels to be set in a new runtime/pprof.WithLabels* function without first requiring conversion between multiple internal and external data structures.

cl/188499 is a quick prototype as an example how this could look like. Different other ways of making an interface that can be used are possible to reduce allocations. Note that the LabelSet struct cant be changed to an interface itself (which seems the cleaner approach) due being not API backwards compatible.

/cc @aclements @randall77 @matloob

@martisch martisch added the Proposal label Aug 17, 2019
@martisch martisch added this to the Go1.14 milestone Aug 17, 2019
@gopherbot

This comment has been minimized.

Copy link

@gopherbot gopherbot commented Aug 19, 2019

Change https://golang.org/cl/188499 mentions this issue: runtime/pprof: add interface to reduce number of allocation when setting labels

@bcmills bcmills changed the title proposal: add new WithLabels* function to runtime/pprof that requires fewer allocations proposal: runtime/pprof: add new WithLabels* function that requires fewer allocations Aug 19, 2019
@andybons andybons modified the milestones: Go1.14, Proposal Sep 9, 2019
@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Sep 12, 2019

Just to summarize here to avoid needing to load external links, right now the API is:

type LabelSet struct {
	// Has unexported fields.
}

func Labels(args ...string) LabelSet

func WithLabels(ctx context.Context, labels LabelSet) context.Context

This requires making a slice to pass to Labels, and that slice escapes (and is variable sized) so it must be heap allocated.

The proposed interface in the CL is to add WithLabelsFromMapper(context.Context, LabelMapper), where LabelMapper is:

// LabelMapper is a set of Len() number of labels that can apply a function to all of the contained labels.
type LabelMapper interface {
	Len() int
	Map(f func(key, value string))
}

So you have to make a LabelMapper and then the context package calls its Len and Map methods to retrieve the labels. But if you already have the key-value pairs in your own data structure, you avoid allocating the converted slice.

They still get copied into the context in some form, though, so you've cut the allocations by at most 50%, not 100%.

Is there a simpler or cleaner API? Is Len really necessary?

/cc @hyangah @pjweinbgo @matloob

@martisch

This comment has been minimized.

Copy link
Member Author

@martisch martisch commented Sep 13, 2019

They still get copied into the context in some form, though, so you've cut the allocations by at most 50%, not 100%.

I do not think the proposal anywhere claims to lower down the usage to no allocations. I understand that saying there is opportunity to potentially save two allocations when this does only one of them in the worst case can lead to infer that this was misleading. That was not my intention. Note that there are usually at least 3 allocations involved:

  1. from internal census representation (e.g. a map or slice of structs) to a []string to pass to pprof.Labels
  2. another in pprof.Labels to construct a new slice for the LabelsSet
  3. in WithLabels to create a map to hold the labels

The usual case I have seen is therefore should saving at least 2 of 3 allocations since most uses do not use []string as their native representation of key value pairs that can be passed as is to pprof.Labels.

Is there a simpler or cleaner API? Is Len really necessary?

The Len allows to pre-size the internally created map to usually hold all items from parent and child context with the initial allocation as another optimisation. Profiling shows that some map growth is made in this function which seems to account for ~30% of the time in WithLabels.

for _, label := range labels.list {
	childLabels[label.key] = label.value
}
@martisch

This comment has been minimized.

Copy link
Member Author

@martisch martisch commented Sep 19, 2019

ping to experts. Would be nice if this could be resolved before go1.14 enters the freeze period.
/cc @hyangah @pjweinbgo @matloob

@hyangah

This comment has been minimized.

Copy link
Contributor

@hyangah hyangah commented Sep 19, 2019

I think @matloob knows the history of this API design and the plan to optimize LabelsSet and WithLabels better.

My impression was - it was originally designed to support census and it was uncommon to see a large number of new tags (labels) added at once. (usually one or two additional tags added per Do call except when a server is creating the new context based on the tags from wire). Maybe the trend is changed now?

The runtime/pprof.Do is what the package recommends for users to use, so this needs a variance of Do that works with LabelMapper.

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Sep 25, 2019

Ping @pjweinbgo @matloob for any thoughts. Needing to make a Do does make this more complex.

@matloob

This comment has been minimized.

Copy link
Contributor

@matloob matloob commented Sep 27, 2019

I don't think that we need to have a variant of Do. WithLabels is meant to be a lower level interface than Do, and the WithLabelsFromMapper looks like just a more efficient replacement that an alternative to Do should use. I think it would be valid for OpenCensus to have its own variant of Do that calls WithLabelsFromMapper instead. As long as the OpenCensus variant of Do sets and unsets labels the same way Do does, code that uses pprof's Do and code that uses OpenCensus's Do should be compatible.

The way I thought about Do originally was that we'd make Do available for users who weren't using census or something similar, and other libraries would provide their own Dos that set the labels on the context and called WithLabels at the same time. It looks like this change provides another more efficient alternative to WtihLabels, so it seems like it fits in fine.

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Oct 2, 2019

This sense of "map" - meaning apply a function to a data structure - is not one we've used much in Go to date (strings.Map is the exception). I'm also bothered by needing both Len and Map. Would it really be so bad if there was only Map? Then the argument could be a plain function instead of an interface. Right now the CL uses it as a hint to allocate a map, in which case it really doesn't matter, but if a different implementation used it to allocate a slice and then blindly filled in increasing indexes, that would be a problem.

I'm also a little confused about the avoidance of LabelSet. Should we instead be looking at an alternate LabelSet constructor, like LabelSetFromFunc (or a better name)? Then the result could be passed to both Do and WithLabels.

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Oct 9, 2019

Any comments about trying to use an alternate LabelSet constructor instead of all new API?

@martisch

This comment has been minimized.

Copy link
Member Author

@martisch martisch commented Oct 9, 2019

I have not gotten around to profiling or testing that yet. Putting better naming aside a possible approach is:

type LabelSet struct {
  mapper LabelMapper
}

func LabelsFromMapper(lm LabelMapper) LabelSet {
  return LabelSet{mapper: lm}
}

func Labels(args ...string) LabelSet {
  return LabelSet{mapper: listMapper(args)}
}

type listMapper []string

func (l listMapper) Len() int { return len(l) }
func (l listMapper) Map(f func(key, value string)) {
  for i := 0; i+1 < len(l); i += 2 {
   f(l[i], l[i+1])
  }
}

to keep LabelSet struct backwards compatible. The inner workings of WithLabels can then be replaced with that of WithLabelsFromMapper from https://golang.org/cl/188499 using the mapper field of LabelSet.

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Oct 30, 2019

ping @martisch - any profiling or testing of this alternate approach?

@martisch

This comment has been minimized.

Copy link
Member Author

@martisch martisch commented Nov 6, 2019

The approach of adding a new helper LabelsFromAdder (name to be decided), using WithLabels with a new internal implementation and changing LabelSet struct seems equally performant.

In addition I removed the mapper closure in the new approach as that caused an allocation which would regress the List case in the number of allocations but has the downside of exposing the internal map[string]string structure. As that is not ideal I would need to first investigate if we can stack allocate and keep that implementation detail hidden with a closure again if that is important.

code:

type LabelSet struct {
	adder LabelAdder
}

type LabelAdder interface {
	AddLabels(map[string]string) // should have a better name
}

type listLabels []string

func (l listLabels) AddLabels(m map[string]string) {
	for i := 0; i+1 < len(l); i += 2 {
		m[l[i]] = l[i+1]
	}
}

func WithLabels(ctx context.Context, labelset LabelSet) context.Context {
	parentLabels := labelValue(ctx)
	childLabels := make(labelMap)
	for k, v := range parentLabels {
		childLabels[k] = v
	}
	labelset.adder.AddLabels(childLabels)
	return context.WithValue(ctx, labelContextKey{}, &childLabels)
}

func LabelsFromAdder(labels LabelAdder) LabelSet {
	return LabelSet{adder: labels}
}

func Labels(args ...string) LabelSet {
	return LabelSet{adder: listLabels(args)}
}

name                                           time/op
ContextLabels/WithLabels/Labels/List           297ns ± 2%
ContextLabels/WithLabels/Labels/Map            366ns ± 3%
ContextLabels/WithLabels/LabelsFromAdder/List  208ns ± 2%
ContextLabels/WithLabels/LabelsFromAdder/Map   264ns ± 2%

name                                           alloc/op
ContextLabels/WithLabels/Labels/List            520B ± 0%
ContextLabels/WithLabels/Labels/Map             520B ± 0%
ContextLabels/WithLabels/LabelsFromAdder/List   392B ± 0%
ContextLabels/WithLabels/LabelsFromAdder/Map    392B ± 0%

name                                           allocs/op
ContextLabels/WithLabels/Labels/List            6.00 ± 0%
ContextLabels/WithLabels/Labels/Map             6.00 ± 0%
ContextLabels/WithLabels/LabelsFromAdder/List   4.00 ± 0%
ContextLabels/WithLabels/LabelsFromAdder/Map    4.00 ± 0%

Updated cl/188499 with new code.

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Nov 6, 2019

@martisch, thanks for confirming that we can use an alternate LabelSet constructor instead of having to change other parts of the API.

It would still be good to find a way to avoid exposing the map[string]string. In the long run I expect that internal detail might change too. Unless it is OK for the LabelSet constructor to be passed a dummy map and copy those values out.

@martisch

This comment has been minimized.

Copy link
Member Author

@martisch martisch commented Nov 7, 2019

It would still be good to find a way to avoid exposing the map[string]string.
I agree.

Since we are in freeze for go1.14 I have 6months to spend some time figuring out an interface (and potential generic compiler optimization if needed) that does not cause an additional allocation and just assumes a string type for key and value but no other implementation details.

@rsc

This comment has been minimized.

Copy link
Contributor

@rsc rsc commented Nov 20, 2019

Marking this on hold until @martisch's update. @martisch, please remove the hold label once you have an update. No hurry. Thanks.

@rsc rsc added this to Active in Proposals Nov 27, 2019
@rsc rsc moved this from Active to Hold in Proposals Nov 27, 2019
@rsc rsc added the Proposal-Hold label Dec 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
6 participants
You can’t perform that action at this time.