
Timeout per Query? #12

Closed
Mario-Hofstaetter opened this issue Feb 21, 2020 · 3 comments

Comments

@Mario-Hofstaetter

Hi, I have a question since documentation is sparse:

in the example there is a

"MillisecondTimeout": 4000

for the whole metrics.json file. Is this a TOTAL SUM for the scraping of all queries, or does it apply to each of them?
Is it also possible to define a timeout on a per-query basis?

There may be queries that are fast (and should fail fast) and ones that take longer than the default timeout.

Or should you start more than one exporter process with different configurations?

Thanks in advance, BR Mario

@DanielOliver
Owner

Good question that I should have clarified before. That's a timeout that starts when a Prometheus instance calls the metrics endpoint, which is what triggers each query to run and be measured. Without a Prometheus instance scraping the metrics endpoint, no queries run at all.

var results = Task.WhenAll(_metrics.Select(x => x.MeasureWithConnection(_logger, _sqlConnectionString, _millisecondTimeout)).ToArray()).ConfigureAwait(false).GetAwaiter().GetResult();

As far as resources allow, every query is run in parallel, and each must finish before that timeout elapses; otherwise, each individual query that times out contributes to a built-in "timed out queries" gauge metric that simply counts how many queries are timing out.
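The behavior described above can be sketched as follows. The exporter itself is written in C#; this Python sketch (with made-up query names and durations) only illustrates the mechanism: all queries start in parallel, each races the shared timeout, and timeouts are merely counted rather than failing the scrape.

```python
import asyncio

async def run_query(name, duration):
    # Stand-in for executing a SQL query; sleeps to simulate query latency.
    await asyncio.sleep(duration)
    return name, 42  # placeholder metric value

async def scrape(queries, timeout_s):
    """Run all queries in parallel; count the ones that exceed the timeout."""
    timed_out = 0
    results = {}

    async def guarded(name, duration):
        nonlocal timed_out
        try:
            name, value = await asyncio.wait_for(run_query(name, duration), timeout_s)
            results[name] = value
        except asyncio.TimeoutError:
            timed_out += 1  # would feed the "timed out queries" gauge

    await asyncio.gather(*(guarded(n, d) for n, d in queries))
    return results, timed_out

# Two fast queries and one that exceeds the 0.3 s budget:
results, timed_out = asyncio.run(
    scrape([("a", 0.01), ("b", 0.01), ("slow", 1.0)], timeout_s=0.3))
```

After this scrape, `results` holds values for the two fast queries and `timed_out` is 1.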

There is currently limited support for setting an individual query timeout. Each query may also have its own timeout set, but the timeout that is actually used is the minimum of the global timeout and the per-query timeout.

var timeout = Math.Min(defaultMillisecondTimeout, query.MillisecondTimeout ?? 100_000_000);
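That selection rule from the C# line above can be restated as a small Python sketch (the function name is mine, for illustration): the effective timeout is the minimum of the global timeout and the per-query timeout, with a very large fallback when no per-query value is set.

```python
def effective_timeout(global_ms, query_ms=None):
    # Mirrors: Math.Min(defaultMillisecondTimeout, query.MillisecondTimeout ?? 100_000_000)
    return min(global_ms, query_ms if query_ms is not None else 100_000_000)

effective_timeout(4000)        # no per-query value -> 4000
effective_timeout(4000, 2000)  # per-query 2000 wins -> 2000
effective_timeout(4000, 9000)  # global cap still applies -> 4000
```

Note a consequence of taking the minimum: a per-query timeout can only shorten the global timeout, never extend it.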

{
    "Queries": [
        {
            "Name": "mssql_deadlocks",
            "Query": "SELECT cntr_value FROM sys.dm_os_performance_counters where counter_name = 'Number of Deadlocks/sec' AND instance_name = '_Total'",
            "Description": "Number of lock requests per second that resulted in a deadlock since last restart",
            "Columns": [
                {
                    "Name": "cntr_value",
                    "Label": "mssql_deadlocks",
                    "Usage": "Gauge",
                    "DefaultValue": 0
                }
            ],
            "MillisecondTimeout": 2000
        }
    ],
    "MillisecondTimeout": 4000
}

The reason I insist on a hard limit on query times is that I don't want monitoring to be a big performance hit on my SQL Server instances, AND because I tend to set aggressively small timeouts on Prometheus scrape targets to keep things as light as possible.
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config

@DanielOliver
Owner

And to the last part of your question: you could always start multiple exporter processes, but that seems like a lot of overhead. I'd rather find a solution that doesn't require running many instances.

@Mario-Hofstaetter
Author

Thank you for your answers.
What would you recommend as a possible approach if some queries take more time?

The first thing that came to my mind: offloading the query to e.g. SQL Server Agent and caching the results in temporary tables, which can then be scraped quickly.

Probably I should make sure my queries don't take that darn long, though :-/
