Skip to content

Latest commit

 

History

History
323 lines (190 loc) · 11.4 KB

20190304_01.md

File metadata and controls

323 lines (190 loc) · 11.4 KB

PostgreSQL 谁堵塞了谁(锁等待检测)- pg_blocking_pids

作者

digoal

日期

2019-03-04

标签

PostgreSQL , 锁等待 , 队列


背景

1 "被害人"

1、找到"被害人",获取被锁堵塞的PID

select distinct pid from pg_locks where not granted;  

2、找到"嫌疑人",获取被锁堵塞的PID是被哪些PID堵塞的

postgres=# select * from pg_blocking_pids(53920);  
 pg_blocking_pids   
------------------  
 {53868}  
(1 row)  

3、找到"被害人" 受侵害的证据

被锁堵塞的PID,当前的会话内容

postgres=# select * from pg_stat_activity where pid=53920;  
-[ RECORD 1 ]----+------------------------------  
datid            | 13285  
datname          | postgres  
pid              | 53920  
usesysid         | 10  
usename          | postgres  
application_name | psql  
client_addr      |   
client_hostname  |   
client_port      | -1  
backend_start    | 2019-03-04 10:20:29.124634+08  
xact_start       | 2019-03-04 10:20:30.962902+08  
query_start      | 2019-03-04 10:20:30.962902+08  
state_change     | 2019-03-04 10:20:30.962905+08  
wait_event_type  | Lock  
wait_event       | relation  
state            | active  
backend_xid      | 1286297005  
backend_xmin     | 1286297004  
query            | drop table a;  
backend_type     | client backend  

被锁堵塞的PID,当前的锁等待内容

postgres=# select * from pg_locks where pid=53920 and not granted;  
 locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |        mode         | granted | fastpath   
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------------+---------+----------  
 relation |    13285 |  1907887 |      |       |            |               |         |       |          | 5/1358301          | 53920 | AccessExclusiveLock | f       | f  
(1 row)  

2 "嫌疑人"

1、找到"嫌疑人"当前的状态,(注意,有可能当前会话内容看不出侵害动作)

堵塞这个PID的PIDs,当前的会话内容

postgres=# select * from pg_stat_activity where pid= any (pg_blocking_pids(53920));  
-[ RECORD 1 ]----+------------------------------  
datid            | 13285  
datname          | postgres  
pid              | 53868  
usesysid         | 10  
usename          | postgres  
application_name | psql  
client_addr      |   
client_hostname  |   
client_port      | -1  
backend_start    | 2019-03-04 10:20:21.377909+08  
xact_start       | 2019-03-04 10:20:23.832489+08  
query_start      | 2019-03-04 10:20:25.529063+08  
state_change     | 2019-03-04 10:20:25.53116+08  
wait_event_type  | Client  
wait_event       | ClientRead  
state            | idle in transaction  
backend_xid      | 1286297004  
backend_xmin     |   
query            | truncate a;  
backend_type     | client backend  

如果当前状态没有找到具体是哪条SQL干的坏事,则需要从审计日志中查找。

2、找到"嫌疑人"的"犯罪"证据:

堵塞这个PID的PIDs,当前的锁内容

postgres=# select * from pg_locks where pid=any (pg_blocking_pids(53920)) order by pid;  
   locktype    | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |        mode         | granted | fastpath   
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------------+---------+----------  
 virtualxid    |          |          |      |       | 4/1372747  |               |         |       |          | 4/1372747          | 53868 | ExclusiveLock       | t       | t  
 relation      |    13285 |  1907887 |      |       |            |               |         |       |          | 4/1372747          | 53868 | ShareLock           | t       | f  
 relation      |    13285 |  1907887 |      |       |            |               |         |       |          | 4/1372747          | 53868 | AccessExclusiveLock | t       | f  
 transactionid |          |          |      |       |            |    1286297004 |         |       |          | 4/1372747          | 53868 | ExclusiveLock       | t       | f  
(4 rows)  

3 当场对峙

1、"被害人" 对13285.1907887对象需要如下锁

 relation |    13285 |  1907887 |      |       |            |               |         |       |          | 5/1358301          | 53920 | AccessExclusiveLock | f       | f  

2、"嫌疑人" 对13285.1907887对象已持有如下锁

 relation      |    13285 |  1907887 |      |       |            |               |         |       |          | 4/1372747          | 53868 | ShareLock           | t       | f  
 relation      |    13285 |  1907887 |      |       |            |               |         |       |          | 4/1372747          | 53868 | AccessExclusiveLock | t       | f  

两者冲突。

也可以用递归查询加速排查

pg_blocking_pids(被堵塞PID)

create table t_lock (id int primary key, info text, ts timestamp);  
insert into t_lock values (1,'test',now());  

session 1

db1=> begin isolation level repeatable read ;  
BEGIN  
db1=*> select * from t_lock;  
 id | info |             ts               
----+------+----------------------------  
  1 | test | 2023-11-11 02:42:58.803888  
(1 row)  

session 2

db1=> begin;  
BEGIN  
db1=*> truncate table t_lock;  

session 3

db1=> select pg_backend_pid();  
 pg_backend_pid   
----------------  
            367  
(1 row)  
  
  
db1=> select * from t_lock;  

session 4

with recursive tmp as (  
  select 1 as id, 367 as blocked_pid, conflict_origin_own_lock_pid from unnest(pg_blocking_pids(367)) as conflict_origin_own_lock_pid  
  union all  
  select id+1, conflict_origin_own_lock_pid as blocked_pid, unnest(pg_blocking_pids(conflict_origin_own_lock_pid)) as conflict_origin_own_lock_pid from tmp   
     -- where pg_blocking_pids(conflict_origin_own_lock_pid) <> '{}'::int[]   
)   
select tmp.id, pg_locks.pid, pg_locks.locktype, pg_locks.mode, pg_locks.granted, pg_stat_activity.query,   
  tmp.conflict_origin_own_lock_pid   
  from tmp, pg_locks, pg_stat_activity   
  where tmp.blocked_pid=pg_locks.pid   
      -- or tmp.conflict_origin_own_lock_pid=pg_locks.pid   
      and pg_locks.pid = pg_stat_activity.pid  
  order by id;   
  
  
 id | pid |   locktype    |        mode         | granted |         query          | conflict_origin_own_lock_pid   
----+-----+---------------+---------------------+---------+------------------------+------------------------------  
  1 | 367 | virtualxid    | ExclusiveLock       | t       | select * from t_lock;  |                          358  
  1 | 367 | relation      | AccessShareLock     | f       | select * from t_lock;  |                          358  
  2 | 358 | relation      | AccessExclusiveLock | f       | truncate table t_lock; |                          344  
  2 | 358 | virtualxid    | ExclusiveLock       | t       | truncate table t_lock; |                          344  
  2 | 358 | transactionid | ExclusiveLock       | t       | truncate table t_lock; |                          344  
(5 rows)  

找到了源头是344, 最后查一下这条在干什么?

select pg_locks.pid, pg_locks.locktype, pg_locks.mode, pg_locks.granted, pg_stat_activity.query  
from pg_locks, pg_stat_activity   
where pg_locks.pid = pg_stat_activity.pid  
and pg_locks.pid=344;   
  
  
 pid |  locktype  |      mode       | granted |         query           
-----+------------+-----------------+---------+-----------------------  
 344 | relation   | AccessShareLock | t       | select * from t_lock;  
 344 | virtualxid | ExclusiveLock   | t       | select * from t_lock;  
 344 | relation   | AccessShareLock | t       | select * from t_lock;  
(3 rows)  

参考

《PostgreSQL 谁堵塞了谁(锁等待检测)- pg_blocking_pids, pg_safe_snapshot_blocking_pids》

《PostgreSQL 锁等待排查实践 - 珍藏级 - process xxx1 acquired RowExclusiveLock on relation xxx2 of database xxx3 after xxx4 ms at xxx》

《PostgreSQL 商用版本EPAS(阿里云ppas(Oracle 兼容版)) NUMA 架构spin锁等待优化》

《PostgreSQL 锁等待监控 珍藏级SQL - 谁堵塞了谁》

《PostgreSQL 锁等待跟踪》

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

digoal's wechat