digoal
2017-07-13
PostgreSQL , Oracle
从功能、性能、SQL兼容性几个方面,输出一些Oracle业务适合用PostgreSQL替代的一些指标。
如果业务中用到了这些特性,则非常适合使用PostgreSQL。
IP地址、GIS、数组、范围、全文检索、大对象、字节流、比特流、枚举、几何、自定义复合、UUID、XML、JSON、货币、字符串、数值、时间、加密数据类型
全文检索、模糊查询、相似查询
connect by、多维分析(grouping, grouping sets, rollup, cube)、多表JOIN、窗口查询(over partition by ())、聚合函数
parallel
hash hint
left join
right join
outer join
merge join
hash agg
group agg
merge sort
skip scan
单表过亿
PostgreSQL 内置多种索引接口(hash, btree, gin, gist, sp-gist, brin, bloom)
primary key, unique key, check, not null, default value
sequence
create index idx on table (reverse(column));
create index idx on table (exp...);
create index idx on table (func());
create index idx on table (col) where ...;
小事务和分析型事务并存。(OLTP和OLAP并存)
PostgreSQL通过多核并行、JIT、算子复用等技术,加速分析事务。
create index idx on table using gin (col);
例如全文检索、数组、JSON、大量重复值的列。
select /*+ parallel(t, 16) */ select xxxxx from xxx where xxx;
HLL 估值类型,快速统计唯一值,唯一值的合并,唯一值求差集等
单机(32 CORE, SSD, 512GB内存)
SF=100,100GB 裸数据,测试结果如下,完全满足这个量级的增量统计的需求。
2017-07-13 20:04:29 [1499947469] : running queries defined in TPC-H benchmark
2017-07-13 20:04:29 [1499947469] : running query 1
2017-07-13 20:04:29 [1499947469] : run explain
2017-07-13 20:04:29 [1499947469] : run the query on background
2017-07-13 20:04:47 [1499947487] : query 1 finished OK (17 seconds)
2017-07-13 20:04:47 [1499947487] : running query 2
2017-07-13 20:04:47 [1499947487] : run explain
2017-07-13 20:04:47 [1499947487] : run the query on background
2017-07-13 20:08:13 [1499947693] : query 2 finished OK (206 seconds)
2017-07-13 20:08:13 [1499947693] : running query 3
2017-07-13 20:08:13 [1499947693] : run explain
2017-07-13 20:08:14 [1499947694] : run the query on background
2017-07-13 20:08:55 [1499947735] : query 3 finished OK (41 seconds)
2017-07-13 20:08:55 [1499947735] : running query 4
2017-07-13 20:08:55 [1499947735] : run explain
2017-07-13 20:08:55 [1499947735] : run the query on background
2017-07-13 20:09:02 [1499947742] : query 4 finished OK (6 seconds)
2017-07-13 20:09:02 [1499947742] : running query 5
2017-07-13 20:09:02 [1499947742] : run explain
2017-07-13 20:09:02 [1499947742] : run the query on background
2017-07-13 20:09:16 [1499947756] : query 5 finished OK (14 seconds)
2017-07-13 20:09:16 [1499947756] : running query 6
2017-07-13 20:09:16 [1499947756] : run explain
2017-07-13 20:09:16 [1499947756] : run the query on background
2017-07-13 20:09:21 [1499947761] : query 6 finished OK (4 seconds)
2017-07-13 20:09:21 [1499947761] : running query 7
2017-07-13 20:09:21 [1499947761] : run explain
2017-07-13 20:09:21 [1499947761] : run the query on background
2017-07-13 20:10:06 [1499947806] : query 7 finished OK (35 seconds)
2017-07-13 20:10:06 [1499947806] : running query 8
2017-07-13 20:10:06 [1499947806] : run explain
2017-07-13 20:10:06 [1499947806] : run the query on background
2017-07-13 20:10:38 [1499947838] : query 8 finished OK (31 seconds)
2017-07-13 20:10:38 [1499947838] : running query 9
2017-07-13 20:10:38 [1499947838] : run explain
2017-07-13 20:10:38 [1499947838] : run the query on background
2017-07-13 20:11:32 [1499947892] : query 9 finished OK (54 seconds)
2017-07-13 20:11:32 [1499947892] : running query 10
2017-07-13 20:11:32 [1499947892] : run explain
2017-07-13 20:11:32 [1499947892] : run the query on background
2017-07-13 20:11:49 [1499947909] : query 10 finished OK (16 seconds)
2017-07-13 20:11:49 [1499947909] : running query 11
2017-07-13 20:11:49 [1499947909] : run explain
2017-07-13 20:11:49 [1499947909] : run the query on background
2017-07-13 20:11:56 [1499947916] : query 11 finished OK (7 seconds)
2017-07-13 20:11:56 [1499947916] : running query 12
2017-07-13 20:11:56 [1499947916] : run explain
2017-07-13 20:11:56 [1499947916] : run the query on background
2017-07-13 20:13:37 [1499948017] : query 12 finished OK (100 seconds)
2017-07-13 20:13:37 [1499948017] : running query 13
2017-07-13 20:13:37 [1499948017] : run explain
2017-07-13 20:13:37 [1499948017] : run the query on background
2017-07-13 20:17:11 [1499948231] : query 13 finished OK (213 seconds)
2017-07-13 20:17:11 [1499948231] : running query 14
2017-07-13 20:17:11 [1499948231] : run explain
2017-07-13 20:17:11 [1499948231] : run the query on background
2017-07-13 20:17:15 [1499948235] : query 14 finished OK (4 seconds)
2017-07-13 20:17:15 [1499948235] : running query 15
2017-07-13 20:17:15 [1499948235] : run explain
2017-07-13 20:17:15 [1499948235] : run the query on background
2017-07-13 20:17:40 [1499948260] : query 15 finished OK (25 seconds)
2017-07-13 20:17:40 [1499948260] : running query 16
2017-07-13 20:17:40 [1499948260] : run explain
2017-07-13 20:17:40 [1499948260] : run the query on background
2017-07-13 20:18:41 [1499948321] : query 16 finished OK (60 seconds)
2017-07-13 20:18:41 [1499948321] : running query 17
2017-07-13 20:18:41 [1499948321] : run explain
2017-07-13 20:18:41 [1499948321] : run the query on background
2017-07-13 20:27:55 [1499948875] : query 17 finished OK (552 seconds)
2017-07-13 20:27:55 [1499948875] : running query 18
2017-07-13 20:27:55 [1499948875] : run explain
2017-07-13 20:27:55 [1499948875] : run the query on background
2017-07-13 20:49:57 [1499950197] : query 18 finished OK (1317 seconds)
2017-07-13 20:49:57 [1499950197] : running query 19
2017-07-13 20:49:57 [1499950197] : run explain
2017-07-13 20:49:57 [1499950197] : run the query on background
2017-07-13 20:50:09 [1499950209] : query 19 finished OK (11 seconds)
2017-07-13 20:50:09 [1499950209] : running query 20
2017-07-13 20:50:09 [1499950209] : run explain
2017-07-13 20:50:09 [1499950209] : run the query on background
2017-07-13 20:56:43 [1499950603] : query 20 finished OK (393 seconds)
2017-07-13 20:56:43 [1499950603] : running query 21
2017-07-13 20:56:43 [1499950603] : run explain
2017-07-13 20:56:43 [1499950603] : run the query on background
2017-07-13 20:58:19 [1499950699] : query 21 finished OK (95 seconds)
2017-07-13 20:58:19 [1499950699] : running query 22
2017-07-13 20:58:19 [1499950699] : run explain
2017-07-13 20:58:19 [1499950699] : run the query on background
2017-07-13 21:00:43 [1499950843] : query 22 finished OK (143 seconds)
2017-07-13 21:00:43 [1499950843] : finished TPC-H benchmark
3000仓库、256客户端。84.5万 tpmC。
100亿位置信息,近邻查询。
tps: 7.4万/s
rt: 0.848毫秒
前后模糊(like '%????%')
1亿数据量,前后模糊,0.2毫秒。
10亿随机值,返回2万条匹配记录,26毫秒。
《PostgreSQL 全文检索加速 快到没有朋友 - RUM索引接口(潘多拉魔盒)》
2张1亿记录,10张1000万记录,1张1000记录的表进行JOIN,聚合查询。
23毫秒。
c 1000万
d 1000
e 1亿
postgres=# explain (analyze,verbose,timing,costs,buffers)
select count(t1.*) from
e t1 join e t2 on (t1.id=t2.id and t1.id<=1000)
join c t3 on (t1.id=t3.id)
join c t4 on (t1.id=t4.id)
join c t5 on (t1.id=t5.id)
join c t6 on (t1.id=t6.id)
join c t7 on (t1.id=t7.id)
join c t8 on (t1.id=t8.id)
join c t9 on (t1.id=t9.id)
join c t10 on (t1.id=t10.id)
join c t11 on (t1.id=t11.id)
join c t12 on (t1.id=t12.id)
join d t13 on (t1.id=t13.id) ;
Aggregate (cost=3234.08..3234.09 rows=1 width=8) (actual time=23.665..23.665 rows=1 loops=1)
Output: count(t1.*)
Buffers: shared hit=48059
-> Nested Loop (cost=5.76..3234.08 rows=1 width=28) (actual time=0.083..23.553 rows=1000 loops=1)
Output: t1.*
Join Filter: (t1.id = t13.id)
Buffers: shared hit=48059
............
Planning time: 7.943 ms
Execution time: 23.782 ms
(116 rows)
单表8亿记录,avg,count,sum,min,max维度聚合查询。
32个并行度
5.3秒
postgres=# select count(*),sum(id),avg(id),min(id),max(id) from e;
count | sum | avg | min | max
-----------+-------------------+-----------------------+-----+-----------
800000000 | 40000000400000000 | 50000000.500000000000 | 1 | 100000000
(1 row)
Time: 5316.490 ms (00:05.316)
并行写入,500万条记录/s 或 每秒1.8GB/s。
《PostgreSQL 如何潇洒的处理每天上百TB的数据增量》
简单来说,PostgreSQL是Oracle的最佳替代产品,而且还有额外惊喜,参考应用案例一文。
您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.