Add tags support in relayed metrics #14

szibis · 2016-12-28T22:12:03Z

This PR adds support for the statsd metric tags used by Datadog: https://help.datadoghq.com/hc/en-us/articles/204312749-Getting-started-with-tags

It adds a new option, metrics-tags, which appends tags to the end of each metric, just as the prefix is added at the beginning.
Sending a metric with tags:

echo "test.service.counter:1|c|@1.000000|#foo:bar,test" | nc -w 1 -u 127.0.0.1 9125

with statsrelay running as in this example:

statsrelay -sendproto=TCP -b=127.0.0.1 -metrics-prefix=prefix -metrics-tags=relaytag,relay:tag 10.1.1.1:8125

This will produce:

prefix.test.service.counter:1|c|@1.000000|#foo:bar,test,relaytag,relay:tag
statsrelay.statsProcessed:1|c

If a metric carries no tags but tags are defined in statsrelay, statsrelay appends its own |#tags section to the metric.
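
The tagging behaviour described above can be sketched in Go as follows. This is a minimal illustration, not the code from this PR: the helper name addPrefixAndTags is hypothetical, and it assumes that the prefix is joined with a dot and that any |# tag section comes last on the metric line, as in the example output.

```go
package main

import (
	"fmt"
	"strings"
)

// addPrefixAndTags is a hypothetical helper, not the PR's actual code.
// It prepends prefix (joined with a dot) and appends extraTags to one
// statsd line, reusing an existing "|#" tag section when there is one.
// Assumption: the tag section, if present, is the last field of the line.
func addPrefixAndTags(metric, prefix, extraTags string) string {
	if prefix != "" {
		metric = prefix + "." + metric
	}
	if extraTags == "" {
		return metric
	}
	if strings.Contains(metric, "|#") {
		// The metric already carries tags: extend the list.
		return metric + "," + extraTags
	}
	// No tags yet: open a new tag section.
	return metric + "|#" + extraTags
}

func main() {
	tags := "relaytag,relay:tag"
	fmt.Println(addPrefixAndTags("test.service.counter:1|c|@1.000000|#foo:bar,test", "prefix", tags))
	fmt.Println(addPrefixAndTags("test.service.gauge:5|g", "prefix", tags))
}
```

The first call covers a metric that already has tags, the second a metric without any.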

This feature should be very helpful for managing metrics in Datadog and in any system that supports tags, for example an InfluxDB backend.
We can keep metric keys short and define the dimensions in tags.

szibis · 2017-01-12T19:37:10Z

@jjneely can you take a look?

jjneely · 2017-02-15T22:00:03Z

statsrelay.go

@@ -142,7 +169,7 @@ func sendPacket(buff []byte, target string, sendproto string, TCPtimeout time.Du
 			break
 		}
 		conn.Write(buff)
-		conn.Close()
+		defer conn.Close()


This moves the Close() routine effectively to a go routine -- I imagine this has some speed benefits?
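
For reference, a Go defer statement does not start a goroutine: it schedules the call to run on the same goroutine when the enclosing function returns. The sketch below only records the order of events; sendAndClose is a hypothetical stand-in, not the sendPacket function from this diff.

```go
package main

import "fmt"

// sendAndClose is a hypothetical stand-in for a function that writes to a
// connection; it only records the order in which its steps run.
func sendAndClose(events *[]string) {
	// The deferred call is queued here but runs when sendAndClose returns.
	defer func() { *events = append(*events, "close") }()
	*events = append(*events, "write")
	*events = append(*events, "return")
}

func main() {
	var events []string
	sendAndClose(&events)
	fmt.Println(events) // the "close" entry comes last
}
```

If the deferred call sits inside a loop, each iteration queues another call, and all of them run only when the function returns.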

szibis · 2017-03-10

Yes. We need better testing overall. Right now I am running tests on a higher-traffic client, mostly with flame graphs and pprof, to see what takes the most CPU. This is only a small change to improve things before the bigger changes.
The main problem in statsrelay right now is the TCP dial.
We open a new connection, send one small packet (of MTU-defined size), and then close the connection. Deferring the close should help, but the final solution is to build buffering and send data over one connection in configured batches of packets, then reconnect.
We need buffers (or a similar structure) per hash ring endpoint, or rehashing if an endpoint does not return before a timeout expires.
This makes it possible to have a failure buffer for connection I/O errors that stores data until the hash ring endpoint returns, with some buffer limit, dropping the oldest data when the buffer is full.

I would like to implement this, but in the near future rather than now, because right now I need to focus on a different part of the monitoring infrastructure.
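
The failure buffer idea could look roughly like the sketch below. It is only an illustration under assumptions: packetBuffer and its methods are hypothetical names, the limit counts packets rather than bytes, and there is no locking, which a real relay would need.

```go
package main

import "fmt"

// packetBuffer is a hypothetical bounded FIFO for one hash ring endpoint.
type packetBuffer struct {
	limit   int      // maximum number of packets kept
	packets [][]byte // oldest packet first
}

// push stores p; when the buffer is full, the oldest packet is dropped.
func (b *packetBuffer) push(p []byte) {
	if b.limit <= 0 {
		return
	}
	if len(b.packets) >= b.limit {
		// Shift everything left by one, discarding the oldest entry.
		copy(b.packets, b.packets[1:])
		b.packets = b.packets[:len(b.packets)-1]
	}
	b.packets = append(b.packets, p)
}

// drain returns all buffered packets in arrival order and empties the buffer.
func (b *packetBuffer) drain() [][]byte {
	out := b.packets
	b.packets = nil
	return out
}

func main() {
	// One buffer per endpoint, keyed by its address.
	buffers := map[string]*packetBuffer{
		"10.1.1.1:8125": &packetBuffer{limit: 2},
	}
	b := buffers["10.1.1.1:8125"]
	for _, m := range []string{"a:1|c", "b:1|c", "c:1|c"} {
		b.push([]byte(m))
	}
	// Only the two newest packets are left to resend.
	for _, p := range b.drain() {
		fmt.Println(string(p))
	}
}
```

When the endpoint comes back, the relay would drain the buffer and resend the packets in order.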

jjneely · 2017-03-10

I think we should leave the TCP connection open and re-open it on errors. That's what I've looked at doing, but I don't yet use this code path myself. ;-)

I hesitate to do much buffering. It could definitely help with better detection of healthy backends, but there is little we can do to prevent time skew.
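
Keeping the connection open and re-opening it on errors might be sketched like this. Everything here is an assumption for illustration: relayConn, tcpDialer, and the in-memory flakyConn are hypothetical names, and the retry policy (one re-dial per write) is just one possible choice.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// relayConn is a hypothetical wrapper that keeps one connection open
// and re-dials only after a write error.
type relayConn struct {
	dial func() (io.WriteCloser, error)
	conn io.WriteCloser
}

// tcpDialer returns a dial function for a real TCP backend.
func tcpDialer(target string, timeout time.Duration) func() (io.WriteCloser, error) {
	return func() (io.WriteCloser, error) {
		c, err := net.DialTimeout("tcp", target, timeout)
		if err != nil {
			return nil, err
		}
		return c, nil
	}
}

// Write sends buff on the open connection, dialing first if needed.
// On a write error it closes the connection, re-dials, and retries once.
func (r *relayConn) Write(buff []byte) error {
	var lastErr error
	for attempt := 0; attempt < 2; attempt++ {
		if r.conn == nil {
			c, err := r.dial()
			if err != nil {
				return err
			}
			r.conn = c
		}
		if _, err := r.conn.Write(buff); err != nil {
			lastErr = err
			r.conn.Close()
			r.conn = nil
			continue
		}
		return nil
	}
	return lastErr
}

// flakyConn is an in-memory stand-in for a backend: writes fail while
// *failures is positive, then succeed and are recorded in *sent.
type flakyConn struct {
	failures *int
	sent     *[]string
}

func (f flakyConn) Write(p []byte) (int, error) {
	if *f.failures > 0 {
		*f.failures--
		return 0, errors.New("simulated write error")
	}
	*f.sent = append(*f.sent, string(p))
	return len(p), nil
}

func (f flakyConn) Close() error { return nil }

// newFlakyDialer counts dials and hands out flakyConn values.
func newFlakyDialer(failures, dials *int, sent *[]string) func() (io.WriteCloser, error) {
	return func() (io.WriteCloser, error) {
		*dials++
		return flakyConn{failures: failures, sent: sent}, nil
	}
}

func main() {
	failures, dials := 1, 0
	var sent []string
	r := &relayConn{dial: newFlakyDialer(&failures, &dials, &sent)}
	// For a real backend: r := &relayConn{dial: tcpDialer("10.1.1.1:8125", time.Second)}
	for _, m := range []string{"a:1|c", "b:1|c"} {
		if err := r.Write([]byte(m)); err != nil {
			fmt.Println("write failed:", err)
		}
	}
	fmt.Println("dials:", dials, "sent:", sent)
}
```

The dial function is injected so that the reconnect logic can be exercised without a network; the second write reuses the connection opened during the first.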

szibis · 2017-03-12

A simple improvement here, before a better solution: #18

jjneely · 2017-02-15T22:02:15Z

Sorry this has taken me so long. Life....work....

These changes look great, and it looks like there are several changes that will help keep statsrelay fast as well. Much appreciated.

szibis added 2 commits December 28, 2016 22:56

Add tags support in relayed metrics

99bcde2

Readme update

d9ce192

Better strings concatenate and defer tcp conn in dial

0a33c28

jjneely reviewed Feb 15, 2017

jjneely merged commit c8adb78 into jjneely:master Feb 15, 2017
