statsd source not supporting Dogstatsd distribution type of metrics #2603

Closed
11 tasks done
szibis opened this issue May 14, 2020 · 9 comments
Assignees
Labels
have: should We should have this feature, but is not required. It is medium priority. provider: datadog Anything `datadog` service provider related sink: datadog_metrics Anything `datadog_metrics` sink related source: statsd Anything `statsd` source related type: enhancement A value-adding code change that enhances its existing functionality.

Comments

@szibis
Contributor

szibis commented May 14, 2020

https://docs.datadoghq.com/developers/metrics/dogstatsd_metrics_submission/?tab=python#distribution

https://docs.datadoghq.com/developers/metrics/types/?tab=distribution#definition

All tested on latest nightly.

A simple example script in Python:

from datadog import initialize, statsd
import time
import random

options = {
    'statsd_host':'127.0.0.1',
    'statsd_port':8125
}

initialize(**options)

tags_list=["environment:dev"]

@statsd.timed('ss.test_metric.timer', tags=tags_list)
def my_function():
  time.sleep(random.randint(0, 10))

i = 0
while(1):
  i += 1
  statsd.set('ss.test_metric.set', i, tags=tags_list)

  my_function()

  statsd.increment('ss.test_metric.increment', tags=tags_list)
  statsd.decrement('ss.test_metric.decrement', tags=tags_list)

  statsd.gauge('ss.test_metric.gauge', i, tags=tags_list)

  statsd.increment('ss.test_metric.count_rate', sample_rate=0.5, tags=tags_list)

  statsd.distribution('ss.test_metric.distribution', random.randint(0, 20), tags=tags_list)

  statsd.histogram('ss.test_metric.histogram', random.randint(0, 20), tags=tags_list)

  time.sleep(2)

For the distribution metric, this produces the following errors in the Vector logs for the statsd source:

May 14 14:05:26 ip-10-105-195-187 vector[30693]: May 14 14:05:26.721 ERROR source{name=statsd type=statsd}: vector::sources::statsd: Statsd parse error: UnknownMetricType("d")
May 14 14:05:36 ip-10-105-195-187 vector[30693]: May 14 14:05:36.728 ERROR source{name=statsd type=statsd}: vector::sources::statsd: Statsd parse error: UnknownMetricType("d")
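
To make the error concrete, here is a minimal sketch of a DogStatsD line parser that accepts the `d` type code. It is an illustration only, not Vector's parser: `METRIC_TYPES`, `ParsedMetric`, and `parse_dogstatsd_line` are hypothetical names, and the sketch assumes the usual `name:value|type|@rate|#tags` datagram layout.

```python
from dataclasses import dataclass, field

# Assumed mapping of DogStatsD type codes to metric kinds. "d" is the
# distribution code that triggered the UnknownMetricType error above.
METRIC_TYPES = {
    "c": "counter",
    "g": "gauge",
    "ms": "timer",
    "h": "histogram",
    "s": "set",
    "d": "distribution",
}


@dataclass
class ParsedMetric:
    name: str
    value: str  # kept as text because set values need not be numeric
    kind: str
    sample_rate: float = 1.0
    tags: list = field(default_factory=list)


def parse_dogstatsd_line(line: str) -> ParsedMetric:
    """Parse one `name:value|type[|@rate][|#tag1,tag2]` datagram line."""
    head, *sections = line.strip().split("|")
    name, sep, value = head.partition(":")
    if not sep or not sections:
        raise ValueError(f"malformed packet: {line!r}")
    type_code = sections[0]
    if type_code not in METRIC_TYPES:
        raise ValueError(f"unknown metric type: {type_code!r}")
    metric = ParsedMetric(name=name, value=value, kind=METRIC_TYPES[type_code])
    # Optional sections: "@<rate>" is the sample rate, "#a,b" the tag list.
    for section in sections[1:]:
        if section.startswith("@"):
            metric.sample_rate = float(section[1:])
        elif section.startswith("#"):
            metric.tags = section[1:].split(",")
    return metric
```

With this sketch, a line such as `ss.test_metric.distribution:7|d|#environment:dev` parses as a distribution, whereas a parser lacking the `"d"` entry would reject it the way the log above shows.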

Implementation

The Datadog distribution metric type will be supported by expanding Vector's distribution metric to support a summary statistic in addition to the histogram statistic. In sinks, this statistic differs from histogram in only two ways:

While we accept it through:

Originally I thought we could reduce the amount of work by making summary differ from histogram only in the datadog_metrics sink, but that would make this feature inconsistent.
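
As a rough sketch of the idea, a distribution metric could carry a statistic field that records whether it should be treated as a histogram or as a summary. Everything here is hypothetical and only illustrates the shape of the change: `StatisticKind`, `Distribution`, `STATISTIC_BY_TYPE`, and `distribution_from_statsd` are assumed names, and the mapping of `h` to histogram and `d` to summary is an assumption drawn from this issue, not from Vector's code.

```python
from dataclasses import dataclass
from enum import Enum


class StatisticKind(Enum):
    """Hypothetical tag describing how sinks should treat a distribution."""
    HISTOGRAM = "histogram"
    SUMMARY = "summary"


@dataclass
class Distribution:
    name: str
    samples: list
    statistic: StatisticKind


# Assumed mapping from statsd type code to statistic kind.
STATISTIC_BY_TYPE = {
    "h": StatisticKind.HISTOGRAM,
    "d": StatisticKind.SUMMARY,
}


def distribution_from_statsd(name: str, value: float, type_code: str) -> Distribution:
    """Build a single-sample distribution tagged with its statistic kind."""
    try:
        statistic = STATISTIC_BY_TYPE[type_code]
    except KeyError:
        raise ValueError(f"not a distribution type: {type_code!r}") from None
    return Distribution(name=name, samples=[value], statistic=statistic)
```

Carrying the kind on the metric itself, rather than special-casing one sink, is what keeps the behavior consistent across sinks.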

Alternatives

Support the metric only in the datadog_metrics sink and the statsd source/sink, and default to histogram behavior everywhere else. In the documentation we would only add that the statsd source/sink supports the `d` type, which is passed through to the datadog_metrics sink.

Todo

Support of Dogstatsd distribution type of metrics

@binarylogic binarylogic added source: statsd Anything `statsd` source related type: enhancement A value-adding code change that enhances its existing functionality. have: should We should have this feature, but is not required. It is medium priority. labels May 14, 2020
@binarylogic
Contributor

Thanks @szibis, we're prioritizing this. @ktff let us know if you need more detail around this.

@ktff
Contributor

ktff commented Aug 7, 2020

There is one more small improvement that can be made for aws_cloudwatch_metrics and datadog_metrics. At the moment they send the aggregates defined for histogram, but they could be changed to send the aggregates defined by https://docs.datadoghq.com/developers/metrics/types/?tab=distribution#definition. This also depends on how far we would like to integrate this metric.

@binarylogic
Contributor

@ktff thanks.

but they can be changed to be as defined by docs.datadoghq.com/developers/metrics/types/?tab=distribution#definition

Right. Isn't that the problem though? According to @lukesteensen in #2913 (comment), we can't send this data to DataDog. They are using an undocumented API for sending sketch data types. We'll need to reach out to them to get information on how to do this.

@binarylogic binarylogic added provider: datadog Anything `datadog` service provider related sink: datadog_metrics Anything `datadog_metrics` sink related labels Aug 7, 2020
@ktff
Contributor

ktff commented Aug 9, 2020

@binarylogic by changing the aggregates that we send. That mostly means adding more percentile aggregations than the histogram does. But it seems better to wait for #3130 than to add this.
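
To illustrate what "adding more percentile aggregations" could look like, here is a small sketch that computes aggregates from raw samples, with the percentile list as the only difference between the two kinds. The names `aggregate`, `HISTOGRAM_PERCENTILES`, and `SUMMARY_PERCENTILES` are hypothetical, and the percentile sets are illustrative assumptions, not the values Vector or Datadog actually use. The nearest-rank percentile method is also just one simple choice.

```python
import math

# Illustrative assumptions only: which percentiles each kind reports.
HISTOGRAM_PERCENTILES = (95,)
SUMMARY_PERCENTILES = (50, 75, 90, 95, 99)


def percentile(ordered, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def aggregate(samples, percentiles):
    """Return count, sum, min, max, avg, and the requested percentiles."""
    if not samples:
        raise ValueError("cannot aggregate an empty sample set")
    ordered = sorted(samples)
    total = sum(ordered)
    result = {
        "count": len(ordered),
        "sum": total,
        "min": ordered[0],
        "max": ordered[-1],
        "avg": total / len(ordered),
    }
    for p in percentiles:
        result[f"p{p}"] = percentile(ordered, p)
    return result
```

Calling `aggregate(samples, SUMMARY_PERCENTILES)` instead of `aggregate(samples, HISTOGRAM_PERCENTILES)` is then the whole difference on the sink side in this sketch.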

@ktff
Contributor

ktff commented Sep 6, 2020

@jamtur01 @binarylogic the opening comment has been changed to reflect the current state and direction.

@binarylogic
Contributor

Thanks. The checklist looks good. I assume you’ll continue work on those? Are you blocked on any of that?

@ktff
Contributor

ktff commented Sep 9, 2020

I assume you’ll continue work on those?

Yes, and

Are you blocked on any of that?

no.

bors bot pushed a commit that referenced this issue Oct 3, 2020
Ref #2603

Second part of documentation updates. First part https://github.com/timberio/vector-website/pull/128


@ktff
Contributor

ktff commented Oct 4, 2020

This will be done with the merge of timberio/vector-website#128

@ktff ktff closed this as completed Oct 11, 2020
mengesb pushed a commit to jacobbraaten/vector that referenced this issue Dec 9, 2020
…ev#4301)

Ref vectordotdev#2603

Second part of documentation updates. First part https://github.com/timberio/vector-website/pull/128


Signed-off-by: Brian Menges <brian.menges@anaplan.com>