PostgreSQL Approximate Algorithm Library - DataSketches

Author

digoal

Date

2020-03-24

Tags

PostgreSQL, approximation, DataSketches


Background

PostgreSQL module with approximate algorithms based on DataSketches/sketches-core-cpp

https://github.com/apache/incubator-datasketches-postgresql

It provides the following approximate algorithms:

  • CPC (Compressed Probabilistic Counting) sketch - very compact (smaller than HLL when serialized) distinct-counting sketch
  • Theta sketch - distinct counting with set operations (intersection, a-not-b)
  • HLL sketch - very compact distinct-counting sketch based on the HyperLogLog algorithm
  • KLL float quantiles sketch - for estimating distributions: quantile, rank, PMF (histogram), CDF
  • Frequent strings sketch - capture the heaviest items (strings) by count or by some other weight
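
To make the Theta sketch entry more concrete, here is a minimal pure-Python illustration of the general idea behind that family of sketches: hash every item to a number in [0, 1), retain only the hashes below a threshold theta, and estimate the distinct count as (retained hashes) / theta. This is a sketch under stated assumptions, not the DataSketches implementation and not the extension's SQL interface. The names `ThetaLikeSketch`, `hash01`, `union`, and `intersection` are hypothetical, and SHA-256 is used here only as a convenient deterministic hash.

```python
import hashlib


def hash01(item):
    """Map an item to a pseudo-uniform float in [0, 1) (hypothetical helper)."""
    digest = hashlib.sha256(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class ThetaLikeSketch:
    """Toy theta-style sketch: keeps every distinct hash below theta."""

    def __init__(self, k=2048):
        self.k = k            # nominal number of retained hashes
        self.theta = 1.0      # sampling threshold; 1.0 means "keep everything"
        self.hashes = set()   # distinct hashes strictly below theta

    def update(self, item):
        h = hash01(item)
        if h < self.theta:
            self.hashes.add(h)
            # Rebuild lazily, once the set has grown to twice the nominal size.
            if len(self.hashes) >= 2 * self.k:
                self._rebuild()

    def _rebuild(self):
        # Keep the k smallest hashes; theta becomes the (k+1)-th smallest.
        if len(self.hashes) > self.k:
            ordered = sorted(self.hashes)
            self.theta = ordered[self.k]
            self.hashes = set(ordered[:self.k])

    def estimate(self):
        # Exact while theta == 1.0, approximate once sampling has started.
        return len(self.hashes) / self.theta


def union(a, b):
    """Sketch of A ∪ B: merge retained hashes under the smaller theta."""
    out = ThetaLikeSketch(min(a.k, b.k))
    out.theta = min(a.theta, b.theta)
    out.hashes = {h for h in a.hashes | b.hashes if h < out.theta}
    out._rebuild()
    return out


def intersection(a, b):
    """Sketch of A ∩ B: hashes present in both inputs under the smaller theta."""
    out = ThetaLikeSketch(min(a.k, b.k))
    out.theta = min(a.theta, b.theta)
    out.hashes = {h for h in a.hashes & b.hashes if h < out.theta}
    return out


if __name__ == "__main__":
    a, b = ThetaLikeSketch(), ThetaLikeSketch()
    for i in range(100_000):
        a.update(i)
    for i in range(50_000, 150_000):
        b.update(i)
    # True values are 100000, 150000 and 50000; the printed values are estimates.
    print(round(a.estimate()))
    print(round(union(a, b).estimate()))
    print(round(intersection(a, b).estimate()))
```

Because both inputs use the same hash function, a set operation reduces to an ordinary set operation on the retained hashes under the smaller of the two thresholds. Memory stays bounded by roughly 2k hashes per sketch regardless of how many rows are fed in, and the estimation error shrinks as k grows.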
