Improved statistics on client requests for dnsdist #6766

Closed
johnhtodd opened this issue Jul 3, 2018 · 2 comments

@johnhtodd

Program: dnsdist

Issue type: feature request

Description:

dnsdist now has many methods, ports, IP addresses, and encryption protocols tied to its front-end, but relatively few counters that help in understanding how traffic arrives on each of them. More of this data could probably be exposed in the Carbon stats for import into a TSDB or other monitoring systems.

  • If there are multiple IP addresses bound to dnsdist, it is useful to understand the statistics about each bound IP address. This prevents having to run separate dnsdist instances just for the sake of collecting separate result pools.

  • Each statistic type should also be broken down by protocol (TCP, UDP).

  • TLS? DNSCrypt? DoH? While it may be obvious from the port and protocol used, perhaps that is also worth tagging.

  • Things like "number of transactions per TCP session" (avg/min/max?) need to be kept to evaluate the quantity of traffic flowing across the average TCP session (regardless of encryption status)

  • Counters for each type of encryption supported are probably required for TLS or DoH (how many sockets were opened with each of the available encryption methods?)

  • Timers for each type of TCP session would also be useful (min/avg/max) so that premature closures could be monitored, which may signal DoS attacks or faults.

  • Closure types for TCP sessions would be useful: counters for "natural" closes requested by one end or the other, as opposed to timeouts. A shift between the two can indicate DoS attacks or faults, and may also help determine when a system is reaching kernel or other limits.

This is not a complete list of statistics on the front-end of dnsdist, but it can serve as a starting point for further discussion.
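To make the list above more concrete, here is a minimal sketch of the kind of bookkeeping being requested, written in Python purely for illustration. It is not dnsdist code: the key layout (listening address, transport, encryption label), the class names, and the closure labels are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SessionStats:
    """Running count, sum, min and max for one measured quantity."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def record(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def average(self) -> float:
        return self.total / self.count if self.count else 0.0


@dataclass
class FrontendStats:
    """Hypothetical per-frontend record; field names are illustrative."""
    queries_per_session: SessionStats = field(default_factory=SessionStats)
    session_seconds: SessionStats = field(default_factory=SessionStats)
    closures: dict = field(default_factory=lambda: defaultdict(int))


# Hypothetical key: (listening address, transport, encryption label).
frontends = defaultdict(FrontendStats)


def record_session(address, transport, encryption, queries, seconds, closure):
    """Fold one finished TCP session into the stats of its frontend."""
    stats = frontends[(address, transport, encryption)]
    stats.queries_per_session.record(queries)
    stats.session_seconds.record(seconds)
    stats.closures[closure] += 1


# Two made-up sessions on the same DoT listener.
record_session("1.2.3.4:853", "tcp", "tls", queries=3, seconds=1.5, closure="client_close")
record_session("1.2.3.4:853", "tcp", "tls", queries=1, seconds=30.0, closure="timeout")

tls = frontends[("1.2.3.4:853", "tcp", "tls")]
print(tls.queries_per_session.average, tls.session_seconds.maximum, dict(tls.closures))
```

Keeping min/avg/max as running values means no per-session history has to be stored, which is probably what a long-running process would want.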

Use case:

The goal is to be able to answer questions like "How many TLS sessions did we get on 1.2.3.4:853 yesterday?" or "How many transactions do we get through DoH sessions?" These questions will matter more as encrypted traffic increases and front-end complexity becomes harder to debug without clearer stats.
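As a rough illustration of the first question, the sketch below reads samples in the Carbon plaintext format (`path value timestamp`) and computes how much a cumulative counter grew over one UTC day. The metric path and the sample values are invented for this example; the names dnsdist actually emits may differ, and counter resets (for example after a restart) are not handled.

```python
from datetime import datetime, timezone

# Hypothetical metric path, invented for this example.
METRIC = "dnsdist.host1.main.frontends.1_2_3_4_853.tls.sessions"

# Made-up samples in Carbon plaintext format: "<path> <value> <unix timestamp>".
SAMPLE = """\
dnsdist.host1.main.frontends.1_2_3_4_853.tls.sessions 100 1530576000
dnsdist.host1.main.frontends.1_2_3_4_853.tls.sessions 160 1530619200
dnsdist.host1.main.frontends.1_2_3_4_853.tls.sessions 250 1530662399
dnsdist.host1.main.frontends.1_2_3_4_853.udp.queries 9000 1530662399
"""


def counter_increase(lines, metric, start, end):
    """Return last-minus-first value of a cumulative counter in [start, end).

    Assumes the counter only grows inside the window (no resets).
    """
    samples = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or parts[0] != metric:
            continue  # skip malformed lines and other metrics
        value, timestamp = float(parts[1]), int(parts[2])
        if start <= timestamp < end:
            samples.append((timestamp, value))
    if len(samples) < 2:
        return 0.0
    samples.sort()
    return samples[-1][1] - samples[0][1]


DAY_START = datetime(2018, 7, 3, tzinfo=timezone.utc).timestamp()
DAY_END = DAY_START + 86400

sessions = counter_increase(SAMPLE.splitlines(), METRIC, DAY_START, DAY_END)
print(sessions)
```

In practice a TSDB query would do this subtraction, but it only works if the counter exists per listening address and per encryption type, which is the point of this request.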

Related

Related ticket about back-end statistics reporting:
#6255

@rgacogne
Member

#7559 added a few additional metrics for TCP and DoT, notably:

  • per-frontend (listening address, port and protocol): current connections count, average queries per connection, average duration of a connection, termination cause (died while reading the query from the client, while sending the response, because of a timeout while speaking to the client, because the downstream server was down);
  • per-backend: current connections count, average queries per connection, average duration of a connection, termination cause (died while sending the query to the backend, while reading the response, because of a read timeout or a write timeout).

It also added the type of a frontend (UDP, TCP, UDP DNSCrypt, TCP DNSCrypt, DoT) to the console and API outputs.

@rgacogne
Member

rgacogne commented Aug 9, 2019

I'm closing this issue since I believe we have everything you asked for in 1.4.0.

@rgacogne rgacogne closed this as completed Aug 9, 2019