pg_sampletolog is a PostgreSQL extension which allows to sample statements and/or transactions to logs. It add in PostgreSQL (from 9.4 to 11) same kind of statement sampling added in PostgreSQL 12 (currently not released).
pg_sampletolog allows to:
- Log a sample of statements
- Log a sample of transactions
- Log before or after execution (in order to be compatible with pgreplay)
- Log all
MODstatements, same as log_statement
- Log statement's queryid if pg_stat_statements is installed
As it is a recent release, do not use in production. But feedbacks are welcome ;)
The module can be built using the standard PGXS infrastructure. For this to work, the
pg_config program must be available in your $PATH. Instruction to install follows:
git clone https://github.com/anayrat/pg_sampletolog.git cd pg_sampletolog make make install
Extension must me loaded either:
- In local session with
session_preload_libraries. To be loaded at connection start.
New settings should be visible in
select name from pg_settings where name like 'pg_sampletolog%'; name ---------------------------------------- pg_sampletolog.disable_log_duration pg_sampletolog.log_before_execution pg_sampletolog.log_level pg_sampletolog.log_statement pg_sampletolog.statement_sample_rate pg_sampletolog.transaction_sample_rate (6 rows)
pg_sampletolog is controlled by specifing GUC settings. So you can apply them in your
postgresql.conf or change them inside the session with
By default, pg_sampletolog is disabled, you have to change theses settings (default values in parenthesis):
- pg_sampletolog.log_level (
LOG): Control log severity level. Refer to PostgreSQL documentation for their meaning : Message Severity Levels
- pg_sampletolog.statement_sample_rate (
0): Control fraction of statements to report in log.
- pg_sampletolog.transaction_sample_rate (
0): Control fraction of transactions to report in log.
none): Allow to log all DDL and/or MOD statements.
- pg_sampletolog.log_before_execution (
false): Control to log statements before or after execution.
- pg_sampletolog.disable_log_duration (
false): Disable duration reporting (mainly used for tests).
- Log only 10% of statements:
pg_sampletolog.statement_sample_rate = 0.1
pg_sampelog will log arround 10% of statements. It will do computation for each statement by calling
random() function whose cost is negligible. Message output will look like (depending of your
2019-01-27 12:50:39.361 CET  LOG: Sampled query duration: 0.014 ms - SELECT 1;
- Log only 10 of transactions:
pg_sampletolog.transaction_sample_rate = 0.1
Same as previous, with the difference that, when a transaction starts, pg_sampletolog will choose if the transaction will be logged. When a transaction is logged, all statements of the transaction are logged.
Message output will look like (depending of your
Transaction : BEGIN; SELECT 1; SELECT 1; COMMIT; 2019-01-27 12:51:40.562 CET  LOG: Sampled transaction duration: 0.008 ms - SELECT 1; 2019-01-27 12:51:40.562 CET  LOG: Sampled transaction duration: 0.005 ms - SELECT 1;
- Log all DDL statements:
pg_sampletolog.log_statement = 'ddl':
pg_sampletolog will log all DDL statements, it is similar to
log_statement = ddl parameter in Postgres. It could be useful if you want a sample of read queries but catch all ddl changes.
2019-01-27 12:53:47.564 CET  LOG: Sampled ddl CREATE TABLE t1(c1 int);
- Log all data-modifying statements:
pg_sampletolog.log_statement = 'mod':
pg_sampletolog will log all data-modifying statements (including ddl), it is similar to
log_statement = mod parameter in Postgres. It could be useful if you want a sample of read queries but catch all write changes.
2019-01-27 12:59:54.043 CET  LOG: Sampled query duration: 0.246 ms - INSERT INTO t1 VALUES(1); 2019-01-27 13:00:16.468 CET  LOG: Sampled ddl CREATE INDEX ON t1(c1);
- Log before execution:
pg_sampletolog.log_before_execution = on
By default, pg_sampletolog, will log after query is executed. It could be useful if you want to use results with pgreplay.
pg_sampletolog works on primary, but also on standby. If you call an SQL function containing several queries, pg_sampletolog will only log function call.
With a simple test with pgbench and
log_min_duration = 0:
pgbench -c8 -j2 -t 30000 -P5 bench starting vacuum...end. progress: 5.0 s, 10618.2 tps, lat 0.750 ms stddev 0.170 progress: 10.0 s, 10690.9 tps, lat 0.746 ms stddev 0.168 progress: 15.0 s, 10596.4 tps, lat 0.752 ms stddev 0.149 progress: 20.0 s, 9547.6 tps, lat 0.836 ms stddev 0.176 transaction type: <builtin: TPC-B (sort of)> scaling factor: 100 query mode: simple number of clients: 8 number of threads: 2 number of transactions per client: 30000 number of transactions actually processed: 240000/240000 latency average = 0.774 ms latency stddev = 0.175 ms tps = 9836.617690 (including connections establishing) tps = 9837.334350 (excluding connections establishing) pgbench -c8 -j2 -t 30000 -P5 bench 6.96s user 12.32s system 78% cpu 24.448 total wc -l logfile 1680009
Same test with
pg_sampletolog.statement_sample_rate = 0.1
wc -l logfile 119859
We got less than 10% lines of first test because pg_sampletolog don't log
Bonus if you have
pg_stat_statements installed, pg_sampletolog will report queryid:
LOG: Sampled query duration: 0.032 ms - queryid = 5892081150081336634 - UPDATE pgbench_tellers SET tbalance = tbalance + -3337 WHERE tid = 326;
It is useful to get query's parameters from a query you identified in pg_stat_statements' view.
- PREPARE statements are not supported.
Tests are in
sql/pg_sampletolog.sql and results in
make installcheck to run them, you should see:
/tmp/pgbuild/lib/postgresql/pgxs/src/makefiles/../../src/test/regress/pg_regress --inputdir=./ --bindir='/tmp/pgbuild/bin' --dbname=contrib_regression pg_sampletolog (using postmaster on Unix socket, port 5432) ============== dropping database "contrib_regression" ============== DROP DATABASE ============== creating database "contrib_regression" ============== CREATE DATABASE ALTER DATABASE ============== running regression test queries ============== test pg_sampletolog ... ok ===================== All 1 tests passed. =====================
Regex to remove prefix added by
pg_sampletolog and replace it by
statement: (same prefix as log_statement):
sed -r 's/Sampled (query|ddl|transaction) - (Duration: [0-9]*\.[0-9]* ms)?( - )?(queryid = -?[0-9]*)?( - )?/statement: /g' inputlog.csv > output.log
- Add extension to PGXN
- Add debian packages
- Add centos packges
- Log statements by their queryid
pg_sampletolog is free software distributed under the PostgreSQL license.
Copyright (c) 2019, Adrien NAYRAT