@maxenglander
Collaborator

Currently the exporter runs queries without any time bounds, so a poorly behaved query can run forever. A connection limit on the exporter role should keep queries from piling up without bound, but query pile-up is still a concern. This change adds a scrape timeout (the --scrape.timeout flag used below).

Validated by creating this file:

❯ cat timeout_test_queries.yaml
pg_long_query:
  query: "SELECT 5 as value FROM (SELECT pg_sleep(5)) as t"
  metrics:
    - value:
        usage: "GAUGE"
        description: "5 second query"

Running with a --scrape.timeout of 10s:

❯ ./postgres_exporter \
          --extend.query-path=timeout_test_queries.yaml \
          --scrape.timeout=10s \
          --log.level=info

cURL-ing:

❯ curl http://localhost:9187/metrics | grep long_query
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 28414    0 28414    0     0   5645      0 --:--:--  0:00:05 --:--:--  5646
# HELP pg_long_query_value 5 second query
# TYPE pg_long_query_value gauge
pg_long_query_value{server="127.0.0.1:5432"} 5
100 97364    0 97364    0     0  19341      0 --:--:--  0:00:05 --:--:-- 25461

Repeating with a --scrape.timeout of 2s, the output no longer includes pg_long_query_value:

❯ curl http://localhost:9187/metrics | grep long_query
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 93091    0 93091    0     0  46352      0 --:--:--  0:00:02 --:--:-- 46360

But it does include partial results from the queries that finished in time:

❯ curl http://localhost:9187/metrics | grep pg_settings | wc -l
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 93192    0 93192    0     0  46313      0 --:--:--  0:00:02 --:--:-- 46318
     810

Signed-off-by: Max Englander <max@planetscale.com>
excludeDatabases = kingpin.Flag("exclude-databases", "A list of databases to remove when autoDiscoverDatabases is enabled (DEPRECATED)").Default("").Envar("PG_EXPORTER_EXCLUDE_DATABASES").String()
includeDatabases = kingpin.Flag("include-databases", "A list of databases to include when autoDiscoverDatabases is enabled (DEPRECATED)").Default("").Envar("PG_EXPORTER_INCLUDE_DATABASES").String()
metricPrefix = kingpin.Flag("metric-prefix", "A metric prefix can be used to have non-default (not \"pg\") prefixes for each of the metrics").Default("pg").Envar("PG_EXPORTER_METRIC_PREFIX").String()
scrapeTimeout = kingpin.Flag("scrape.timeout", "Maximum time for a scrape to complete before timing out (0 = no timeout)").Default("0").Envar("PG_EXPORTER_SCRAPE_TIMEOUT").Duration()
Member

@frouioui frouioui Sep 6, 2025

I see that other flags use either dashes or dots; I'm not sure I understand the convention.

Collaborator Author

Ah, Claude suggested this. I'll stick with the convention.

@frouioui
Member

frouioui commented Sep 6, 2025

What is the plan to deploy this? Would we upstream it directly, or start building our own images from this fork?

Signed-off-by: Max Englander <max@planetscale.com>
@maxenglander maxenglander merged commit 29e305b into main Sep 9, 2025

2 participants