一款兼容PostgreSQL协议的时序数据库 - QuestDB

作者

digoal

日期

2020-04-12

背景

What is QuestDB

QuestDB is an open-source NewSQL relational database designed to process time-series data, faster. Our approach comes from low-latency trading; QuestDB’s stack is engineered from scratch, zero-GC Java and dependency-free.

QuestDB ingests data via HTTP, PostgreSQL wire protocol, Influx line protocol or directly from Java. Reading data is done using SQL via HTTP, PostgreSQL wire protocol or via Java API. The whole database and console fits in a 3.5Mb package.

https://github.com/questdb/questdb

https://www.questdb.io/blog/2020/04/02/using-simd-to-aggregate-billions-of-rows-per-second

Summary

We have implemented SIMD-based vector execution of queries, such as select sum(value) from table. This is ~100x faster than non-vector based execution. This is just the beginning as we will introduce vectors to more operations going forward. Try our first implementation in this release - stay tuned for more features in the upcoming releases!

questdb 支持simd向量指令, 以及瓦片结构, 所以快.

PostgreSQL 同样有类似的插件: 与之类似的PG插件有vops.

questdb 宣传的对比数据中, PG是原生的, 并没有使用vops, 所以看起来questdb比pg快很多, 实际上pg使用vops后, 性能会大幅度提升.

《PostgreSQL VOPS 向量计算 + DBLINK异步并行 - 单实例 10亿聚合计算跑进2秒》

《PostgreSQL 向量化执行插件(瓦片式实现-vops) 10x提速OLAP》

https://github.com/postgrespro/vops

Vertical storage
Traditional query executor (like Postgres executor) deals with single row of data at each moment of time. If it has to evaluate expression (x+y) then it fetches value of "x", then value of "y", performs operation "+" and returns the result value to the upper node. In contrast vectorized executor is able to process in one operation multiple values. In this case "x" and "y" represent not just a single scalar value, but vector of values and result is also a vector of values. In vector execution model interpretation and function call overhead is divided by size of vector. The price of performing function call is the same, but as far as function proceeds N values instead of just one, this overhead become less critical.

What is the optimal size for the vector? From the explanation above it is clear that the larger vector is, the less per-row overhead we have. So we can form vector from all values of the correspondent table attribute. It is so called vertical data model or columnar store. Unlike classical "horizontal" data model where the unit of storing data is row (tuple), here we have vertical columns. Columnar store has the following main advantages:

Reduce size of fetched data: only columns used in query need to be fetched
Better compression: storing all values of the same attribute together makes it possible to much better and faster compress them, for example using delta encoding.
Minimize interpretation overhead: each operation is perform not for single value, but for set of values
Use CPU vector instruction (SIMD) to process data

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

20200412_03.md

20200412_03.md

一款兼容PostgreSQL协议的时序数据库 - QuestDB

作者

日期

标签

背景

PostgreSQL 许愿链接

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

PolarDB 学习图谱: 训练营、培训认证、在线互动实验、解决方案、生态合作、写心得拿奖品

购买PolarDB云服务折扣活动进行中, 55元起

About 德哥

Files

20200412_03.md

Latest commit

History

20200412_03.md

File metadata and controls

一款兼容PostgreSQL协议的时序数据库 - QuestDB

作者

日期

标签

背景

PostgreSQL 许愿链接

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

PolarDB 学习图谱: 训练营、培训认证、在线互动实验、解决方案、生态合作、写心得拿奖品

购买PolarDB云服务折扣活动进行中, 55元起

About 德哥