Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wavefront/serializer: improve performance by ~30% #5842

Merged

Conversation

charlievieth
Copy link
Contributor

This PR improves the performance and memory consumption of the Wavefront serializer by roughly 30%.

benchmark                 old ns/op     new ns/op     delta
BenchmarkSerialize-16     1168          778           -33.39%

benchmark                 old allocs     new allocs     delta
BenchmarkSerialize-16     16             8              -50.00%

benchmark                 old bytes     new bytes     delta
BenchmarkSerialize-16     1116          800           -28.32%

Required for all PRs:

  • [ x ] Signed CLA.
  • [ x ] Associated README.md updated.
  • [ x ] Has appropriate unit tests.

This PR improves the performance and memory consumption of the Wavefront
serializer by roughly 30%.

```
benchmark                 old ns/op     new ns/op     delta
BenchmarkSerialize-16     1168          778           -33.39%

benchmark                 old allocs     new allocs     delta
BenchmarkSerialize-16     16             8              -50.00%

benchmark                 old bytes     new bytes     delta
BenchmarkSerialize-16     1116          800           -28.32%
```
@glinton glinton added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label May 13, 2019
@charlievieth
Copy link
Contributor Author

charlievieth commented May 13, 2019

Thanks for approving!

@@ -39,6 +41,12 @@ var tagValueReplacer = strings.NewReplacer("\"", "\\\"", "*", "-")

var pathReplacer = strings.NewReplacer("_", ".")

var bufPool = sync.Pool{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could add a reused buffer to the serializer, it isn't required for a Serializer to be thread safe.

This is a notable difference from the Parser interface which is currently required to be thread-safe, though I do think this should be changed in the future. I will update the documentation for these interfaces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think there is value to using a sync.Pool that is global to the package. The sync.Pool will pin the current goroutine to a pool, which should improve locality of reference. Additionally, it does not make the WavefrontSerializer unsafe to use concurrently and is dynamic in size.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't any reason for it to be dynamic though because it will never need more than a single buffer. Another issue with the pool is that you can't return the buffer to it unless you copy the data out of the buffer, I believe the current use in this pull request is unsafe even in a single threaded environment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another issue with the pool is that you can't return the buffer to it unless you copy the data out of the buffer

We do that. The contents of pooled buffers are copied to the out buffer (note: out will never alias buf, example). This code is thread-safe.

Personally, I would prefer to not add a mutable field to the WavefrontSerializer, but I understand the argument for simplicity and will make the change.

What are your thoughts on adding a mutex to protect the WavefrontSerializer's scratch buffer? From you comment and a quick scan of the code it appears there is no need for it to be thread-safe, but if that were to change it would be beneficial to have it as a guard against this code being overlooked during a refactor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed this, but this pool was added by accident the one we actually use is pbFree. That said, I'm still removing the pools in favor of a shared buffer - as requested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't add anything to make it thread safe. I agree it there is a risk of code sneaking in that use these wrong, especially with the way the SerializerOutput interface works (sending in a single instance), but I'd like to move the other way with parsers/serializers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I made WavefrontSerializer thread-safe.

metricSeparator := "."
const metricSeparator = "."
var out []byte
buf := pbFree.Get().(*buffer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably makes sense to just create a single bytes.Buffer and then pass this into the various functions, the out slice would be removed and then there is no reason to append. I don't see a reason we need both of these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expect something from me tomorrow or Friday as I'm pretty short on bandwidth right now.

@danielnelson
Copy link
Contributor

@puckpuck Can you review? I assume any changes would also need to go into the wavefront-sdk.

Note this removes use of `sync.Pool` and since this is no longer
thread-safe the parallel benchmark is also removed.
}

metricValue, buildError := buildValue(value, metric.Metric)
metricValue, buildError := buildValue(value, name)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always ignore this error - is there any point returning it from buildValue()? This was the case before this PR was made.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be used to determine if the value should be skipped or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically you log on error from the caller, but we log inside the buildValue function instead to avoid noisy log outputs when the value is of type string (which is expected to not work).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the insight, I just changed this to return a bool instead of an error since we used the error as a bool (nil / not-nil) and this saves an alloc.

*b = append(*b, c)
}

func (b *buffer) WriteUnit64(val uint64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

function name has a typo in it, which can be misleading (Unit vs Uint)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂ thanks - fixed

return *b
}

// Use a fast and simple buffer for constructing statsd messages
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this comment is a left over from something else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kinda, and good catch - will remove.

Charlie Vieth added 4 commits May 24, 2019 16:54
Since the buffer we use is now a field of the WavefrontSerializer we
need to protect it against concurrent access with a mutex.
The error it previously returned was treated as a bool - so we should
just return a bool instead.
@danielnelson danielnelson added this to the 1.12.0 milestone Jun 12, 2019
@danielnelson danielnelson added the performance problems with decreased performance or enhancements that improve performance label Jun 12, 2019
@danielnelson danielnelson merged commit 1a647fb into influxdata:master Jun 12, 2019
hwaastad pushed a commit to hwaastad/telegraf that referenced this pull request Jun 13, 2019
bitcharmer pushed a commit to bitcharmer/telegraf that referenced this pull request Oct 18, 2019
athoune pushed a commit to bearstech/telegraf that referenced this pull request Apr 17, 2020
idohalevi pushed a commit to idohalevi/telegraf that referenced this pull request Sep 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin performance problems with decreased performance or enhancements that improve performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants