Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

attempt to correct invalid metric names #108

Merged
merged 3 commits into from Oct 21, 2017

Conversation

wyattwalter
Copy link
Contributor

Attempt to correct an invalid name for a metric. Today it tries to send along to upstream StatsD server as-is which will reject the metric.

@wyattwalter
Copy link
Contributor Author

Ping @jstorimer or @wvanbergen for review.

@wvanbergen
Copy link
Contributor

wvanbergen commented Sep 22, 2017

I am a bit worried about adding a gsub call to every metric we send. This code gets executed in many hot paths for us so we are vigilant about adding expensive operations here.

  • Maybe tr works well enough for this use case (similar to how we have done this for tag normalization)? tr is a lot faster than gsub
  • Can you post some before and after benchmarks using Benchmark.ips?

@wyattwalter
Copy link
Contributor Author

Thanks for pointing that out! It definitely is much faster using tr.

I did only a simple benchmark with a single name and the difference was pretty clear. There's definitely a cost to doing this in terms of performance.

with original PR change using gsub:

Warming up --------------------------------------
      normalize name    31.410k i/100ms
Calculating -------------------------------------
      normalize name    371.142k (± 4.8%) i/s -      1.853M in   5.004860s

using tr:

Warming up --------------------------------------
      normalize name   154.744k i/100ms
Calculating -------------------------------------
      normalize name      2.721M (± 6.2%) i/s -     13.617M in   5.024451s

simply returning name:

Warming up --------------------------------------
      normalize name   254.654k i/100ms
Calculating -------------------------------------
      normalize name      8.543M (± 6.9%) i/s -     42.527M in   5.003753s

# @return [String]
def self.normalize_name(name)
name.tr(':|', '_')
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on making this method a private instance method? I don't see how it could be useful to anyone to use it outside of the Metric class.

Maybe normalize_tags could be made a private instance method at some point, but that would break compatibility with previous versions.

@maximebedard
Copy link

Datadog's implementation seems to be stripping even more invalid characters https://github.com/DataDog/dogstatsd-ruby/blob/master/lib/datadog/statsd.rb#L396-L398. Not sure if that's specific to their implementation or to the spec though as I can't find a reliable one anywhere

Copy link
Contributor

@wvanbergen wvanbergen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with this now if you can address @maximebedard's questions.

@wyattwalter
Copy link
Contributor Author

Thanks for the feedback! I made the changes that I think you were recommending. Since this is no longer a static method and the need for benchmarking was temporary, I also removed the benchmark file (it didn't work after change).

@wyattwalter
Copy link
Contributor Author

I'm also unclear on the spec since DataDog has implemented their own protocol. My change also added the @ as an invalid character, but behaves slightly differently when a :: would be encountered. It would be a __ instead. The DataDog implementation relies on a gsub to consolidate the two chars to one.

@wvanbergen
Copy link
Contributor

I think two underscores are acceptable to not have to take the performance hit of gsub. If people feel that strongly about their metric names, they should be formatting them properly themselves. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants