digoal
2022-11-21
PostgreSQL , prepared statement , 参数 , 执行计划 , plan_cache_mode , generic plan , custom plan , auto
PostgreSQL prepared statement在日志中通常体现为, 参数用位置变量替代:
LOG: duration: 0.012 ms execute stmt: SELECT oid FROM pg_class WHERE relname = $1
DETAIL: parameters: $1 = 'pg_proc'
如果要查看用位置变量替代的SQL的执行计划? 如何? 直接explain会报错:
EXPLAIN SELECT oid FROM pg_class WHERE relname = $1;
ERROR: there is no parameter $1
LINE 1: EXPLAIN SELECT oid FROM pg_class WHERE relname = $1;
^
PostgreSQL优化器12开始可以通过plan_cache_mode参数来控制prepared statement的执行计划选择模式
plan_cache_mode={force_generic_plan|force_custom_plan|auto};
- force_generic_plan : prepared statement 强制用generic_plan , 使用通用执行计划, 每次执行时bind,execute
- force_custom_plan : 强制用custom_plan, 每次执行时都传入变量, parser, plan, execute. 特别适合不在乎parser,plan代价的低并发, 复杂SQL场景, 例如ware house.
- auto : 前5次custom plan, 存储custom plan的avg cost, 然后使用generic_plan . 但是使用前会根据bind value使用generic plan计算一下cost, 如果比custom plan avg cost大, 就用custom plan, 并且累计统计custom plan avg cost.
auto模式:
postgres=# PREPARE stmt(unknown) AS SELECT oid FROM pg_class WHERE relname = $1;
PREPARE
postgres=# set plan_cache_mode=auto;
postgres=# EXPLAIN EXECUTE stmt(NULL);
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
(2 rows)
postgres=# EXPLAIN EXECUTE stmt(NULL);
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
(2 rows)
postgres=# EXPLAIN EXECUTE stmt(NULL);
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
(2 rows)
postgres=# EXPLAIN EXECUTE stmt(NULL);
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
(2 rows)
postgres=# EXPLAIN EXECUTE stmt(NULL);
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
(2 rows)
-- 五次custom plan之后, 变成了generic plan
postgres=# EXPLAIN EXECUTE stmt(NULL);
QUERY PLAN
-------------------------------------------------------------------------------------------
Index Scan using pg_class_relname_nsp_index on pg_class (cost=0.28..2.49 rows=1 width=4)
Index Cond: (relname = $1)
(2 rows)
generic plan可以理解为适合大多数输入值都可以获得好的执行效率的执行计划. 如果数据分布倾斜较大, generic plan可能就不适合某些输入value.
例子:
create unlogged table tbl (id int, info text);
insert into tbl select 1,'test' from generate_series(1,10000000); -- 1000万条id=1
insert into tbl values (2,'abc'); -- id=2只有一条
create index idx_tbl_1 on tbl(id);
analyze tbl;
SET plan_cache_mode = force_generic_plan;
这个例子适合大多数id value的generic plan就是全表扫描.
PREPARE p(unknown) AS SELECT * FROM tbl WHERE id = $1;
EXPLAIN EXECUTE p(2);
QUERY PLAN
---------------------------------------------------------------
Seq Scan on tbl (cost=0.00..179055.01 rows=10000001 width=9)
Filter: (id = $1)
(2 rows)
而custom plan会根据实际输入变量, 选择最佳计划:
postgres=# SET plan_cache_mode = force_custom_plan;
SET
postgres=# EXPLAIN EXECUTE p(2);
QUERY PLAN
---------------------------------------------------------------------
Index Scan using idx_tbl_1 on tbl (cost=0.43..1.55 rows=1 width=9)
Index Cond: (id = 2)
(2 rows)
postgres=# EXPLAIN EXECUTE p(1);
QUERY PLAN
---------------------------------------------------------------
Seq Scan on tbl (cost=0.00..179055.01 rows=10000001 width=9)
Filter: (id = 1)
(2 rows)
如果要查看用位置变量替代的SQL的generic plan执行计划?
- 使用unknown类型替代变量类型
- 输入null
- 使用
SET plan_cache_mode = force_generic_plan;
PREPARE stmt(unknown) AS SELECT oid FROM pg_class WHERE relname = $1;
SET plan_cache_mode = force_generic_plan;
EXPLAIN EXECUTE stmt(NULL);
QUERY PLAN
═══════════════════════════════════════════════════════════════════════════════════════════
Index Scan using pg_class_relname_nsp_index on pg_class (cost=0.28..8.29 rows=1 width=4)
Index Cond: (relname = $1)
(2 rows)
DEALLOCATE stmt;
OR
postgres=# SET plan_cache_mode = force_generic_plan;
SET
postgres=# PREPARE p2(unknown, unknown) AS SELECT * FROM tbl WHERE id = $1 and info=$2;
PREPARE
postgres=# explain execute p2 (null,null);
QUERY PLAN
---------------------------------------------------------------
Seq Scan on tbl (cost=0.00..204055.01 rows=10000001 width=9)
Filter: ((id = $1) AND (info = $2))
(2 rows)
注意, 本文提供的方法, 目的是得到prepared statement的generic plan. 而不是custom plan, 因为custom plan需要实际的输入条件才有意义.
得到custom plan就是另一个话题了, 例如过去某些SQL的抖动, 是不是执行计划不对引起的, 那可以用auto_explain插件.
https://github.com/cybertec-postgresql/generic-plan
https://www.cybertec-postgresql.com/en/explain-that-parameterized-statement/