-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tags support in relayed metrics #14
Conversation
@jjneely can you take a look ? |
@@ -142,7 +169,7 @@ func sendPacket(buff []byte, target string, sendproto string, TCPtimeout time.Du | |||
break | |||
} | |||
conn.Write(buff) | |||
conn.Close() | |||
defer conn.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This moves the Close() routine effectively to a go routine -- I imagine this has some speed benefits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need some overall better testing and right now i do some tests on higher traffic client most with flamegraphs and pprof to look what takes most of CPU and this is only little change to make it better before bigger changes.
Main problem in statsrelay right now is dial on TCP.
We are making new connection sending small packet (mtu defined size) and then closing connection. Moving closing to go-routine should help, but final solution is to build buffering and sending data on one connection in configured batches of packets and then reconnect.
We need buffers/or structure per hashring endpoint or rehashing if enpoint return time expire.
This makes possible to have fail buffer when connection i/o errors and store data until hash ring endpoint return with some buffer limits and remove last old when no space in buffer.
I would like to implement this but in near future because now i need to focus on different part of monitoring infrastructure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should leave the TCP connection open, re-open on errors. That's what I've looked at doing, but I don't yet use this code path myself. ;-)
I hesitate to do much buffering, it could definitely assist with better detection of healthy backends, but there is little we can do to prevent time skew.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simple improve here, before better solution #18
Sorry this has taken me so long. Life....work.... These changes look great, and it looks like there are several changes that will help keep statsrelay fast as well. Much appreciated. |
This PR adds support for statsd metrics tags supported by Datadog - https://help.datadoghq.com/hc/en-us/articles/204312749-Getting-started-with-tags
Adds new option metrics-tags which is used to add tags at the end of each metric just like prefix on the beginning.
Sending metic with tags:
and statsrelay running with example:
This will produce:
If no tag in metric and defined in statsrelay then it will add statsrelay |#tags to every metric.
This feature should be very helpful with DataDog metrics management and any system supporting tags like InfluxDB backend for example.
We can manage short key in metrics with dimensions defined in tags.