mattbostock
Contributor

Fixes #60.

Use the pg_settings view to retrieve runtime variables:
https://www.postgresql.org/docs/current/static/view-pg-settings.html

This replaces the use of SHOW to retrieve runtime variables.

In PostgreSQL 9.6, this adds 339 metrics, which use the short_desc
field as a description.

Only runtime variables with a vartype of real or integer are
currently supported.
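
As a rough illustration of the approach described above, the view could be read with Go's `database/sql` roughly as follows. This is a minimal sketch, not the code in this PR: the `pgSetting` struct, the `settingsQuery` constant and the `querySettings` function are hypothetical names, and the `COALESCE` on `unit` is an assumption to cope with unitless settings.

```go
package main

import (
	"database/sql"
	"fmt"
)

// pgSetting holds the pg_settings columns this sketch cares about.
// (Hypothetical type, not taken from the PR.)
type pgSetting struct {
	name, setting, unit, shortDesc, vartype string
}

// settingsQuery selects only numeric runtime variables. unit is NULL for
// unitless settings, so it is coalesced to an empty string.
const settingsQuery = `SELECT name, setting, COALESCE(unit, ''), short_desc, vartype
FROM pg_settings WHERE vartype IN ('integer', 'real')`

// querySettings runs settingsQuery on an already-open connection pool.
func querySettings(db *sql.DB) ([]pgSetting, error) {
	rows, err := db.Query(settingsQuery)
	if err != nil {
		return nil, fmt.Errorf("querying pg_settings: %w", err)
	}
	defer rows.Close()

	var settings []pgSetting
	for rows.Next() {
		var s pgSetting
		if err := rows.Scan(&s.name, &s.setting, &s.unit, &s.shortDesc, &s.vartype); err != nil {
			return nil, fmt.Errorf("scanning pg_settings row: %w", err)
		}
		settings = append(settings, s)
	}
	return settings, rows.Err()
}

func main() {
	// Opening a *sql.DB needs a PostgreSQL driver, which is left out of
	// this sketch, so only the query text is printed here.
	fmt.Println(settingsQuery)
}
```

Filtering on `vartype` in the query keeps string and enum settings, which have no natural numeric value, out of the result set.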

Example metrics:

    # HELP pg_settings_archive_timeout Forces a switch to the next xlog file if a new file has not been started within N seconds.
    # TYPE pg_settings_archive_timeout gauge
    pg_settings_archive_timeout 0
    # HELP pg_settings_authentication_timeout Sets the maximum allowed time to complete client authentication.
    # TYPE pg_settings_authentication_timeout gauge
    pg_settings_authentication_timeout 60
    # HELP pg_settings_autovacuum_analyze_scale_factor Number of tuple inserts, updates, or deletes prior to analyze as a fraction of reltuples.
    # TYPE pg_settings_autovacuum_analyze_scale_factor gauge
    pg_settings_autovacuum_analyze_scale_factor 0.1
    # HELP pg_settings_autovacuum_analyze_threshold Minimum number of tuple inserts, updates, or deletes prior to analyze.
    # TYPE pg_settings_autovacuum_analyze_threshold gauge
    pg_settings_autovacuum_analyze_threshold 50

Tries to use the unit column from the pg_settings view to determine
which unit a metric is using, and normalise the metrics accordingly to
match Prometheus conventions:

https://prometheus.io/docs/practices/naming/

Settings with units of 8kB are not normalised because most of those
metrics are counting pages rather than bytes, e.g.:

    # HELP pg_settings_checkpoint_flush_after Number of pages after which previously performed writes are flushed to disk.
    # TYPE pg_settings_checkpoint_flush_after gauge
    pg_settings_checkpoint_flush_after 0

    # select * from pg_settings WHERE name = 'checkpoint_flush_after';
    -[ RECORD 1 ]---+-----------------------------------------------------------------------------
    name            | checkpoint_flush_after
    setting         | 0
    unit            | 8kB
    category        | Write-Ahead Log / Checkpoints
    short_desc      | Number of pages after which previously performed writes are flushed to disk.
    extra_desc      |
    context         | sighup
    vartype         | integer
    source          | default
    min_val         | 0
    max_val         | 256
    enumvals        |
    boot_val        | 0
    reset_val       | 0
    sourcefile      |
    sourceline      |
    pending_restart | f
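
The unit handling described above can be sketched in Go as follows. This is an illustration under stated assumptions rather than the code in this PR: the function name `normaliseUnit`, the exact set of units handled, and the use of 1024-based multipliers for memory units are all assumptions. Sentinel values such as -1 are not special-cased in this sketch.

```go
package main

import "fmt"

// normaliseUnit converts a pg_settings value to Prometheus base units and
// returns the metric-name suffix to append ("" when nothing was converted).
// The list of units is an assumption made for this sketch.
func normaliseUnit(val float64, unit string) (float64, string, error) {
	switch unit {
	case "", "8kB":
		// Unitless, or a page count that is deliberately left untouched.
		return val, "", nil
	case "ms":
		return val / 1000, "seconds", nil
	case "s":
		return val, "seconds", nil
	case "min":
		return val * 60, "seconds", nil
	case "h":
		return val * 60 * 60, "seconds", nil
	case "d":
		return val * 60 * 60 * 24, "seconds", nil
	case "kB":
		return val * 1024, "bytes", nil
	case "MB":
		return val * 1024 * 1024, "bytes", nil
	case "GB":
		return val * 1024 * 1024 * 1024, "bytes", nil
	default:
		// Refuse to guess: an unknown unit is reported to the caller.
		return 0, "", fmt.Errorf("unknown unit for runtime variable: %q", unit)
	}
}

func main() {
	for _, unit := range []string{"ms", "min", "kB", "8kB"} {
		val, suffix, err := normaliseUnit(20, unit)
		fmt.Println(unit, val, suffix, err)
	}
}
```

Returning an error for unrecognised units, instead of passing the raw value through, makes a new unit in a later PostgreSQL release visible rather than silently exporting a value in the wrong scale.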

@mattbostock
Contributor Author

mattbostock commented Mar 27, 2017

I'm interested in feedback on this PR - this change adds a lot of metrics (at least 339 on my test setup), which I'm not sure is desired.

I'd also like to look at adding some tests for this PR before it's considered for merging.

@mattbostock mattbostock changed the title Add all integer/real runtime variables [WIP] Add all integer/real runtime variables Mar 27, 2017
@mattbostock
Contributor Author

/cc @prymitive @bobrik @ferringb

@mattbostock
Contributor Author

Rebased to fix the integration test.

@wrouesnel
Contributor

Test-wise, I think we'd want to figure out how to assert that the unit inference works from version to version. That might mean changing how I run the tests now, since the best way would be a like-for-like comparison across PostgreSQL versions where one unit matches.

Adding 300+ metrics which don't change much should be okay for Prometheus since unchanging metrics yield a pointer update in the backend and no additional storage use.

At the very least Prometheus should be arbitrating what it wants to accept, not us.

    switch unit {
    case "":
    case "min":
        suffix = "minimum"

Not minute?

Contributor Author

Good catch.

@mattbostock
Contributor Author

@wrouesnel: Prometheus can compress these kinds of metrics efficiently; however, the overhead of indexing the extra timeseries and retaining them in memory is worth considering IMHO.

I think it's probably fine - these are useful metrics to have and we could always put them behind a feature flag in future if users complained.

@mattbostock
Contributor Author

I miscounted the number of metrics - I've included bool types and the number of metrics is 189 using PostgreSQL 9.6.

@mattbostock
Contributor Author

Sorry for the delay on this, picking it up again.

@mattbostock mattbostock force-pushed the pg_settings branch 2 times, most recently from 2cf60d6 to 6fb45dc Compare April 12, 2017 14:32
@mattbostock
Contributor Author

Updated to add tests, please take a look.

@mattbostock mattbostock force-pushed the pg_settings branch 2 times, most recently from cc9a506 to 51aefbb Compare April 12, 2017 14:40
@coveralls

Coverage Status

Coverage increased (+0.4%) to 52.159% when pulling 51aefbb on mattbostock:pg_settings into 0b003b2 on wrouesnel:master.

@mattbostock mattbostock changed the title [WIP] Add all integer/real runtime variables Add all integer/real runtime variables Apr 12, 2017
@wrouesnel
Contributor

Hopefully I'll have time in the next few days. Thanks for the great work on this!

Use the `pg_settings` view to retrieve runtime variables:
https://www.postgresql.org/docs/current/static/view-pg-settings.html

This replaces the use of `SHOW` to retrieve runtime variables.

In PostgreSQL 9.6, this adds 189 metrics, which use the `short_desc`
field as a description.

Only runtime variables with a `vartype` of `bool`, `real`, or `integer`
are currently supported.

Example metrics:

    # HELP pg_settings_allow_system_table_mods Allows modifications of the structure of system tables.
    # TYPE pg_settings_allow_system_table_mods gauge
    pg_settings_allow_system_table_mods 0
    # HELP pg_settings_archive_timeout_seconds Forces a switch to the next xlog file if a new file has not been started within N seconds. [Units converted to seconds.]
    # TYPE pg_settings_archive_timeout_seconds gauge
    pg_settings_archive_timeout_seconds 0
    # HELP pg_settings_array_nulls Enable input of NULL elements in arrays.
    # TYPE pg_settings_array_nulls gauge
    pg_settings_array_nulls 1
    # HELP pg_settings_authentication_timeout_seconds Sets the maximum allowed time to complete client authentication. [Units converted to seconds.]
    # TYPE pg_settings_authentication_timeout_seconds gauge
    pg_settings_authentication_timeout_seconds 60
    # HELP pg_settings_autovacuum Starts the autovacuum subprocess.
    # TYPE pg_settings_autovacuum gauge
    pg_settings_autovacuum 1
    # HELP pg_settings_autovacuum_analyze_scale_factor Number of tuple inserts, updates, or deletes prior to analyze as a fraction of reltuples.
    # TYPE pg_settings_autovacuum_analyze_scale_factor gauge
    pg_settings_autovacuum_analyze_scale_factor 0.1
    # HELP pg_settings_autovacuum_analyze_threshold Minimum number of tuple inserts, updates, or deletes prior to analyze.
    # TYPE pg_settings_autovacuum_analyze_threshold gauge
    pg_settings_autovacuum_analyze_threshold 50
    # HELP pg_settings_autovacuum_freeze_max_age Age at which to autovacuum a table to prevent transaction ID wraparound.
    # TYPE pg_settings_autovacuum_freeze_max_age gauge
    pg_settings_autovacuum_freeze_max_age 2e+08
    # HELP pg_settings_autovacuum_max_workers Sets the maximum number of simultaneously running autovacuum worker processes.
    # TYPE pg_settings_autovacuum_max_workers gauge
    pg_settings_autovacuum_max_workers 3
    # HELP pg_settings_autovacuum_multixact_freeze_max_age Multixact age at which to autovacuum a table to prevent multixact wraparound.
    # TYPE pg_settings_autovacuum_multixact_freeze_max_age gauge
    pg_settings_autovacuum_multixact_freeze_max_age 4e+08
    # HELP pg_settings_autovacuum_naptime_seconds Time to sleep between autovacuum runs. [Units converted to seconds.]
    # TYPE pg_settings_autovacuum_naptime_seconds gauge
    pg_settings_autovacuum_naptime_seconds 60
    # HELP pg_settings_autovacuum_vacuum_cost_delay_seconds Vacuum cost delay in milliseconds, for autovacuum. [Units converted to seconds.]
    # TYPE pg_settings_autovacuum_vacuum_cost_delay_seconds gauge
    pg_settings_autovacuum_vacuum_cost_delay_seconds 0.02
    # HELP pg_settings_autovacuum_vacuum_cost_limit Vacuum cost amount available before napping, for autovacuum.
    # TYPE pg_settings_autovacuum_vacuum_cost_limit gauge
    pg_settings_autovacuum_vacuum_cost_limit -1
    # HELP pg_settings_autovacuum_vacuum_scale_factor Number of tuple updates or deletes prior to vacuum as a fraction of reltuples.
    # TYPE pg_settings_autovacuum_vacuum_scale_factor gauge
    pg_settings_autovacuum_vacuum_scale_factor 0.2
    # HELP pg_settings_autovacuum_vacuum_threshold Minimum number of tuple updates or deletes prior to vacuum.
    # TYPE pg_settings_autovacuum_vacuum_threshold gauge
    pg_settings_autovacuum_vacuum_threshold 50
    # HELP pg_settings_autovacuum_work_mem_bytes Sets the maximum memory to be used by each autovacuum worker process. [Units converted to bytes.]
    # TYPE pg_settings_autovacuum_work_mem_bytes gauge
    pg_settings_autovacuum_work_mem_bytes -1
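
The naming pattern visible in the example metrics above (a `pg_settings_` prefix, an optional unit suffix, and a note appended to the help text when units were converted) could be produced along these lines. This is a sketch inferred from the sample output only: the helpers `metricName`, `helpText` and `parseValue` are hypothetical, and mapping `on`/`off` to 1/0 for `bool` settings is an assumption based on the gauge values shown.

```go
package main

import (
	"fmt"
	"strconv"
)

// metricName builds the exported name, appending the unit suffix if any.
func metricName(setting, suffix string) string {
	name := "pg_settings_" + setting
	if suffix != "" {
		name += "_" + suffix
	}
	return name
}

// helpText appends a note to short_desc when the value was converted.
func helpText(shortDesc, suffix string) string {
	if suffix == "" {
		return shortDesc
	}
	return fmt.Sprintf("%s [Units converted to %s.]", shortDesc, suffix)
}

// parseValue turns the textual setting into a gauge value. bool settings
// are assumed to be reported as "on" or "off".
func parseValue(setting, vartype string) (float64, error) {
	if vartype == "bool" {
		switch setting {
		case "on":
			return 1, nil
		case "off":
			return 0, nil
		}
		return 0, fmt.Errorf("unexpected bool setting: %q", setting)
	}
	// integer and real settings both fit in a float64 gauge.
	return strconv.ParseFloat(setting, 64)
}

func main() {
	fmt.Println(metricName("autovacuum_naptime", "seconds"))
	fmt.Println(helpText("Time to sleep between autovacuum runs.", "seconds"))
	fmt.Println(parseValue("on", "bool"))
}
```
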
@coveralls

Coverage Status

Coverage increased (+0.4%) to 51.815% when pulling f9df40d on mattbostock:pg_settings into 53b9d9c on wrouesnel:master.

@wrouesnel
Contributor

I spent some time augmenting the integration tests so I could inspect the result of this. In future I definitely need to have CI send comments automatically for me. Across all versions:

Removed Metrics

pg_runtime_variable_max_connections
pg_runtime_variable_max_files_per_process
pg_runtime_variable_max_function_args
pg_runtime_variable_max_identifier_length
pg_runtime_variable_max_index_keys
pg_runtime_variable_max_locks_per_transaction
pg_runtime_variable_max_pred_locks_per_transaction
pg_runtime_variable_max_prepared_transactions
pg_runtime_variable_max_standby_archive_delay_milliseconds
pg_runtime_variable_max_standby_streaming_delay_milliseconds
pg_runtime_variable_max_wal_senders

Added Metrics

pg_settings_allow_system_table_mods
pg_settings_archive_mode
pg_settings_archive_timeout_seconds
pg_settings_array_nulls
pg_settings_authentication_timeout_seconds
pg_settings_autovacuum
pg_settings_autovacuum_analyze_scale_factor
pg_settings_autovacuum_analyze_threshold
pg_settings_autovacuum_freeze_max_age
pg_settings_autovacuum_max_workers
pg_settings_autovacuum_multixact_freeze_max_age
pg_settings_autovacuum_naptime_seconds
pg_settings_autovacuum_vacuum_cost_delay_seconds
pg_settings_autovacuum_vacuum_cost_limit
pg_settings_autovacuum_vacuum_scale_factor
pg_settings_autovacuum_vacuum_threshold
pg_settings_autovacuum_work_mem_bytes
pg_settings_backend_flush_after_bytes
pg_settings_bgwriter_delay_seconds
pg_settings_bgwriter_flush_after_bytes
pg_settings_bgwriter_lru_maxpages
pg_settings_bgwriter_lru_multiplier
pg_settings_block_size
pg_settings_bonjour
pg_settings_check_function_bodies
pg_settings_checkpoint_completion_target
pg_settings_checkpoint_flush_after_bytes
pg_settings_checkpoint_segments
pg_settings_checkpoint_timeout_seconds
pg_settings_checkpoint_warning_seconds
pg_settings_commit_delay
pg_settings_commit_siblings
pg_settings_cpu_index_tuple_cost
pg_settings_cpu_operator_cost
pg_settings_cpu_tuple_cost
pg_settings_cursor_tuple_fraction
pg_settings_data_checksums
pg_settings_db_user_namespace
pg_settings_deadlock_timeout_seconds
pg_settings_debug_assertions
pg_settings_debug_pretty_print
pg_settings_debug_print_parse
pg_settings_debug_print_plan
pg_settings_debug_print_rewritten
pg_settings_default_statistics_target
pg_settings_default_transaction_deferrable
pg_settings_default_transaction_read_only
pg_settings_default_with_oids
pg_settings_effective_cache_size_bytes
pg_settings_effective_io_concurrency
pg_settings_enable_bitmapscan
pg_settings_enable_hashagg
pg_settings_enable_hashjoin
pg_settings_enable_indexonlyscan
pg_settings_enable_indexscan
pg_settings_enable_material
pg_settings_enable_mergejoin
pg_settings_enable_nestloop
pg_settings_enable_seqscan
pg_settings_enable_sort
pg_settings_enable_tidscan
pg_settings_escape_string_warning
pg_settings_exit_on_error
pg_settings_extra_float_digits
pg_settings_from_collapse_limit
pg_settings_fsync
pg_settings_full_page_writes
pg_settings_geqo
pg_settings_geqo_effort
pg_settings_geqo_generations
pg_settings_geqo_pool_size
pg_settings_geqo_seed
pg_settings_geqo_selection_bias
pg_settings_geqo_threshold
pg_settings_gin_fuzzy_search_limit
pg_settings_gin_pending_list_limit_bytes
pg_settings_hot_standby
pg_settings_hot_standby_feedback
pg_settings_idle_in_transaction_session_timeout_seconds
pg_settings_ignore_checksum_failure
pg_settings_ignore_system_indexes
pg_settings_integer_datetimes
pg_settings_join_collapse_limit
pg_settings_krb_caseins_users
pg_settings_lock_timeout_seconds
pg_settings_lo_compat_privileges
pg_settings_log_autovacuum_min_duration_seconds
pg_settings_log_checkpoints
pg_settings_log_connections
pg_settings_log_disconnections
pg_settings_log_duration
pg_settings_log_executor_stats
pg_settings_log_file_mode
pg_settings_logging_collector
pg_settings_log_hostname
pg_settings_log_lock_waits
pg_settings_log_min_duration_statement_seconds
pg_settings_log_parser_stats
pg_settings_log_planner_stats
pg_settings_log_replication_commands
pg_settings_log_rotation_age_seconds
pg_settings_log_rotation_size_bytes
pg_settings_log_statement_stats
pg_settings_log_temp_files_bytes
pg_settings_log_truncate_on_rotation
pg_settings_maintenance_work_mem_bytes
pg_settings_max_connections
pg_settings_max_files_per_process
pg_settings_max_function_args
pg_settings_max_identifier_length
pg_settings_max_index_keys
pg_settings_max_locks_per_transaction
pg_settings_max_parallel_workers_per_gather
pg_settings_max_pred_locks_per_transaction
pg_settings_max_prepared_transactions
pg_settings_max_replication_slots
pg_settings_max_stack_depth_bytes
pg_settings_max_standby_archive_delay_seconds
pg_settings_max_standby_streaming_delay_seconds
pg_settings_max_wal_senders
pg_settings_max_wal_size_bytes
pg_settings_max_worker_processes
pg_settings_min_parallel_relation_size_bytes
pg_settings_min_wal_size_bytes
pg_settings_old_snapshot_threshold_seconds
pg_settings_operator_precedence_warning
pg_settings_parallel_setup_cost
pg_settings_parallel_tuple_cost
pg_settings_password_encryption
pg_settings_port
pg_settings_post_auth_delay_seconds
pg_settings_pre_auth_delay_seconds
pg_settings_quote_all_identifiers
pg_settings_random_page_cost
pg_settings_replacement_sort_tuples
pg_settings_replication_timeout_seconds
pg_settings_restart_after_crash
pg_settings_row_security
pg_settings_segment_size_bytes
pg_settings_seq_page_cost
pg_settings_server_version_num
pg_settings_shared_buffers_bytes
pg_settings_silent_mode
pg_settings_sql_inheritance
pg_settings_ssl
pg_settings_ssl_prefer_server_ciphers
pg_settings_ssl_renegotiation_limit_bytes
pg_settings_standard_conforming_strings
pg_settings_statement_timeout_seconds
pg_settings_superuser_reserved_connections
pg_settings_synchronize_seqscans
pg_settings_syslog_sequence_numbers
pg_settings_syslog_split_messages
pg_settings_tcp_keepalives_count
pg_settings_tcp_keepalives_idle_seconds
pg_settings_tcp_keepalives_interval_seconds
pg_settings_temp_buffers_bytes
pg_settings_temp_file_limit_bytes
pg_settings_trace_notify
pg_settings_trace_sort
pg_settings_track_activities
pg_settings_track_activity_query_size
pg_settings_track_commit_timestamp
pg_settings_track_counts
pg_settings_track_io_timing
pg_settings_transaction_deferrable
pg_settings_transaction_read_only
pg_settings_transform_null_equals
pg_settings_unix_socket_permissions
pg_settings_update_process_title
pg_settings_vacuum_cost_delay_seconds
pg_settings_vacuum_cost_limit
pg_settings_vacuum_cost_page_dirty
pg_settings_vacuum_cost_page_hit
pg_settings_vacuum_cost_page_miss
pg_settings_vacuum_defer_cleanup_age
pg_settings_vacuum_freeze_min_age
pg_settings_vacuum_freeze_table_age
pg_settings_vacuum_multixact_freeze_min_age
pg_settings_vacuum_multixact_freeze_table_age
pg_settings_wal_block_size
pg_settings_wal_buffers_bytes
pg_settings_wal_compression
pg_settings_wal_keep_segments
pg_settings_wal_log_hints
pg_settings_wal_receiver_status_interval_seconds
pg_settings_wal_receiver_timeout_seconds
pg_settings_wal_retrieve_retry_interval_seconds
pg_settings_wal_segment_size_bytes
pg_settings_wal_sender_delay_seconds
pg_settings_wal_sender_timeout_seconds
pg_settings_wal_writer_delay_seconds
pg_settings_wal_writer_flush_after_bytes
pg_settings_work_mem_bytes
pg_settings_zero_damaged_pages

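Looking at the short names, two metrics are not retained between the versions:

max_standby_archive_delay_milliseconds
max_standby_streaming_delay_milliseconds

This looks to be because the new metrics are converted to seconds instead (`pg_settings_max_standby_archive_delay_seconds` and `pg_settings_max_standby_streaming_delay_seconds`).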

So nothing's lost and I'll add this data to release notes.
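
A comparison like the one above could be automated with a small set difference over metric names. The following is only a sketch of that idea, not part of the exporter or its test suite: `diffMetrics` is a hypothetical helper, and the sample input in `main` is made up for illustration (only the two `max_connections` names come from the lists above).

```go
package main

import (
	"fmt"
	"sort"
)

// diffMetrics compares two sets of metric names and returns, in sorted order,
// the names present only in before (removed) and only in after (added).
func diffMetrics(before, after []string) (removed, added []string) {
	inBefore := make(map[string]bool, len(before))
	for _, name := range before {
		inBefore[name] = true
	}
	inAfter := make(map[string]bool, len(after))
	for _, name := range after {
		inAfter[name] = true
	}
	for name := range inBefore {
		if !inAfter[name] {
			removed = append(removed, name)
		}
	}
	for name := range inAfter {
		if !inBefore[name] {
			added = append(added, name)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(removed)
	sort.Strings(added)
	return removed, added
}

func main() {
	// Hypothetical scrape results from before and after the change.
	before := []string{"example_unchanged_metric", "pg_runtime_variable_max_connections"}
	after := []string{"example_unchanged_metric", "pg_settings_max_connections"}

	removed, added := diffMetrics(before, after)
	fmt.Println("Removed Metrics:", removed)
	fmt.Println("Added Metrics:", added)
}
```
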

@wrouesnel
Contributor

Closed by 98ba566

I ended up doing some rebasing and touch-ups on the commit messages, but your authorship is preserved. Great work - I'll think on it for another few days and cut a new release then.

@wrouesnel wrouesnel closed this Apr 13, 2017
@coveralls

Coverage Status

Coverage remained the same at 51.815% when pulling d0f1e37 on mattbostock:pg_settings into 98ba566 on wrouesnel:master.

@mattbostock
Contributor Author

Thanks!

@mattbostock mattbostock deleted the pg_settings branch April 20, 2017 08:55