Skip to content

Latest commit

 

History

History
247 lines (182 loc) · 10.4 KB

20200817_01.md

File metadata and controls

247 lines (182 loc) · 10.4 KB

PostgreSQL 20200819当天代码 - 14 对比 13 高并发性能优化 数据对比 - get snapshot improve

作者

digoal

日期

2020-08-17

标签

PostgreSQL , snapshot , 高并发性能优化


背景

https://www.citusdata.com/blog/2020/10/25/improving-postgres-connection-scalability-snapshots/

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=73487a60fc1063ba4b5178b69aee4ee210c182c4

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5788e258bb26495fab65ff3aa486268d1c50b123

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=941697c3c1ae5d6ee153065adb96e1e63ee11224

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1f51c17c68d05c28d5b9294d8013cb9e7e653160

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=dc7420c2c9274a283779ec19718d2d16323640c0

《PostgreSQL 14 GetSnapshotData 高并发优化, 以及 64位xid避免xid wrap优化》

对比数据来自 20200817 当天的PG 13和14代码版本.

https://www.postgresql.org/ftp/snapshot/

测试case

./configure --prefix=/home/digoal/pg? --enable-debug    
gmake world    
gmake install-world    
initdb -D $PGDATA -U postgres -E UTF8 --lc-collate=C  --lc-ctype=en_US.utf8    
sysctl -w vm.nr_hugepages=8700    
    
postgresql.conf     
    
listen_addresses = '0.0.0.0'    
port = 41921    
max_connections = 10000    
superuser_reserved_connections = 3    
unix_socket_directories = '/tmp,.'    
tcp_keepalives_idle = 60    
tcp_keepalives_interval = 10    
tcp_keepalives_count = 10    
shared_buffers = 16GB    
huge_pages = on    
maintenance_work_mem = 1GB    
dynamic_shared_memory_type = posix    
vacuum_cost_delay = 0    
bgwriter_delay = 10ms    
bgwriter_lru_maxpages = 1000    
bgwriter_lru_multiplier = 10.0    
effective_io_concurrency = 0    
max_worker_processes = 128    
max_parallel_workers_per_gather = 0    
wal_level = minimal    
synchronous_commit = off    
full_page_writes = on    
wal_compression = on    
wal_buffers = 16MB    
wal_writer_delay = 10ms    
max_wal_size = 64GB    
min_wal_size = 8GB    
max_wal_senders = 0    
random_page_cost = 1.1    
effective_cache_size = 256GB    
log_destination = 'csvlog'    
logging_collector = on    
log_truncate_on_rotation = on    
log_checkpoints = on    
log_timezone = 'Asia/Shanghai'    
log_autovacuum_min_duration = 0    
autovacuum_vacuum_cost_delay = 0ms    
datestyle = 'iso, mdy'    
timezone = 'Asia/Shanghai'    
lc_messages = 'en_US.utf8'    
lc_monetary = 'en_US.utf8'    
lc_numeric = 'en_US.utf8'    
lc_time = 'en_US.utf8'    
default_text_search_config = 'pg_catalog.english'    
pgbench -i -s 1000     
    
pgbench -M prepared -n -r -P 5 -c ? -j ? -T 120 -S    
    
pgbench -M prepared -n -r -P 5 -c ? -j ? -T 120    

对比

52C 104线程 384GB机器. PG 14 tpcb 1亿数据量 select only 200万qps. 峰值, 2000连接后, 性能提升20%以上.

Architecture:          x86_64    
CPU op-mode(s):        32-bit, 64-bit    
Byte Order:            Little Endian    
CPU(s):                104    
On-line CPU(s) list:   0-103    
Thread(s) per core:    2    
Core(s) per socket:    26    
Socket(s):             2    
NUMA node(s):          2    
Vendor ID:             GenuineIntel    
CPU family:            6    
Model:                 85    
Model name:            Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz    
Stepping:              7    
CPU MHz:               1665.649    
CPU max MHz:           2500.0000    
CPU min MHz:           1200.0000    
BogoMIPS:              5000.00    
Hypervisor vendor:     KVM    
Virtualization type:   full    
L1d cache:             32K    
L1i cache:             32K    
L2 cache:              1024K    
L3 cache:              36608K    
NUMA node0 CPU(s):     0-51    
NUMA node1 CPU(s):     52-103    
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 ida arat avx512_vnni    

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=623a9ba79bbdd11c5eccb30b8bd5c446130e521c

snapshot scalability: cache snapshots using a xact completion counter.
author	Andres Freund <andres@anarazel.de>	
Tue, 18 Aug 2020 12:07:10 +0800 (21:07 -0700)
committer	Andres Freund <andres@anarazel.de>	
Tue, 18 Aug 2020 12:08:30 +0800 (21:08 -0700)
commit	623a9ba79bbdd11c5eccb30b8bd5c446130e521c
tree	ed9f9fc1fd58e33a7c7d1fe5c037e08f64d6ddcd	tree | snapshot
parent	51300b45db95b6fd29f88534ab0739fdc9df1699	commit | diff
snapshot scalability: cache snapshots using a xact completion counter.

Previous commits made it faster/more scalable to compute snapshots. But not
building a snapshot is still faster. Now that GetSnapshotData() does not
maintain RecentGlobal* anymore, that is actually not too hard:

This commit introduces xactCompletionCount, which tracks the number of
top-level transactions with xids (i.e. which may have modified the database)
that completed in some form since the start of the server.

We can avoid rebuilding the snapshot's contents whenever the current
xactCompletionCount is the same as it was when the snapshot was
originally built.  Currently this check happens while holding
ProcArrayLock. While it's likely possible to perform the check without
acquiring ProcArrayLock, it seems better to do that separately /
later, some careful analysis is required. Even with the lock this is a
significant win on its own.

On a smaller two socket machine this gains another ~1.03x, on a larger
machine the effect is roughly double (earlier patch version tested
though).  If we were able to safely avoid the lock there'd be another
significant gain on top of that.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de

测试仅针对getsnapshot损耗在整体损耗中占比较高的场景.

模型 连接数 PG 13 tps PG 14 tps 14性能提升比率
只读 1亿 -c 52 -j 52 -T 120 52 1234936 1582846 28.2%
只读 1亿 -c 104 -j 104 -T 120 104 1611162 2062788 28%
只读 1亿 -c 208 -j 104 -T 120 208 1803801 2045995 13.4%
只读 1亿 -c 512 -j 64 -T 120 512 1713561 1839027 7.3%
只读 1亿 -c 2048 -j 64 -T 360 2048 1279550 1686426 31.8%
只读 1亿 -c 5000 -j 50 -T 360 5000 1164421 1595496 37%

PostgreSQL 14 GetSnapshotData 的优化代码还没有完全合并到master, 后期数据应该更好,

https://www.postgresql.org/message-id/flat/20200301083601.ews6hz5dduc3w2se%40alap3.anarazel.de

以下是作者自己分支的测试数据    
    
conns   tps master              tps pgxact-split        
1       26842.492845            26524.194821        
10      246923.158682           249224.782661        
50      695956.539704           709833.746374        
100     1054727.043139          1903616.306028        
200     964795.282957           1949200.338012        
300     906029.377539           1927881.231478        
400     845696.690912           1911065.369776        
500     812295.222497           1926237.255856        
600     888030.104213           1903047.236273        
700     866896.532490           1886537.202142        
800     863407.341506           1883768.592610        
900     871386.608563           1874638.012128        
1000    887668.277133           1876402.391502        
1500    860051.361395           1815103.564241        
2000    890900.098657           1775435.271018        
3000    874184.980039           1653953.817997        
4000    845023.080703           1582582.316043        
5000    817100.195728           1512260.802371        

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

digoal's wechat