digoal
2022-08-08
PostgreSQL , 分区表 , 缓存 , 写入优化 , list , range , 16 , PARTITION_CACHED_FIND_THRESHOLD
PostgreSQL 16, 分区查找优化:
- 在同一个会话里面, 插入某个分区表, 如果连续16条记录的数据都写入到了某一个分区, 那么会把这个分区number缓存入会话的rel cache中. 接下来写入这个分区表时, 根据记录的值直接去判断是否属于缓存的分区number, 如果是, 就直接写入这个分区, 如果不是再走原有的分区查找流程(二分法查找).
这个优化方法适合在同一个会话中, 有连续写入分区表的场景, 而且连续写入的属于同一分区的情况.
例如:
- 按list分区的SaaS类业务、IOT业务数据. 某个会话写入的就是某个传感器的批量数据(每个传感器必须是连续的一批进来, 不能错开)、某些客户(saas的客户ID区分)的批量数据(某个APPID的数据必须连续进来, 不能错开).
- 按range分区的时序类数据. 例如按天按月等分区, 写入的数据基本上都是连续进入某一个特定分区.
这个优化方法减少了二分法查找分区的开销, 提升分区表的写入性能.
不适用的场景:
- default分区不缓存
- list null分区值不缓存
- hash分区不加速(可能由于hash查找本身就比较快?)
+/*
+ * The number of times the same partition must be found in a row before we
+ * switch from a binary search for the given values to just checking if the
+ * values belong to the last found partition. This must be above 0.
+ */
+#define PARTITION_CACHED_FIND_THRESHOLD 16
Have ExecFindPartition cache the last found partition
author David Rowley <drowley@postgresql.org>
Mon, 1 Aug 2022 21:55:27 +0000 (09:55 +1200)
committer David Rowley <drowley@postgresql.org>
Mon, 1 Aug 2022 21:55:27 +0000 (09:55 +1200)
commit 3592e0ff98b130c428c1d8091fc61767f7608732
tree ac997f9570f773793a25062559cfee98ac0a6528 tree
parent 83f1793d6096deb419e16cc38201484a6f6e9f48 commit | diff
Have ExecFindPartition cache the last found partition
Here we add code which detects when ExecFindPartition() continually finds
the same partition and add a caching layer to improve partition lookup
performance for such cases.
Both RANGE and LIST partitioned tables traditionally require a binary
search for the set of Datums that a partition needs to be found for. This
binary search is commonly visible in profiles when bulk loading into a
partitioned table. Here we aim to reduce the overhead of bulk-loading
into partitioned tables for cases where many consecutive tuples belong to
the same partition and make the performance of this operation closer to
what it is with a traditional non-partitioned table.
When we find the same partition 16 times in a row, the next search will
result in us simply just checking if the current set of values belongs to
the last found partition. For LIST partitioning we record the index into
the PartitionBoundInfo's datum array. This allows us to check if the
current Datum is the same as the Datum that was last looked up. This
means if any given LIST partition supports storing multiple different
Datum values, then the caching only works when we find the same value as
we did the last time. For RANGE partitioning we simply check if the given
Datums are in the same range as the previously found partition.
We store the details of the cached partition in PartitionDesc (i.e.
relcache) so that the cached values are maintained over multiple
statements.
No caching is done for HASH partitions. The majority of the cost in HASH
partition lookups are in the hashing function(s), which would also have to
be executed if we were to try to do caching for HASH partitioned tables.
Since most of the cost is already incurred, we just don't bother. We also
don't do any caching for LIST partitions when we continually find the
values being looked up belong to the DEFAULT partition. We've no
corresponding index in the PartitionBoundInfo's datum array for this case.
We also don't cache when we find the given values match to a LIST
partitioned table's NULL partition. This is so cheap that there's no
point in doing any caching for this. We also don't cache for a RANGE
partitioned table's DEFAULT partition.
There have been a number of different patches submitted to improve
partition lookups. Hou, Zhijie submitted a patch to detect when the value
belonging to the partition key column(s) were constant and added code to
cache the partition in that case. Amit Langote then implemented an idea
suggested by me to remember the last found partition and start to check if
the current values work for that partition. The final patch here was
written by me and was done by taking many of the ideas I liked from the
patches in the thread and redesigning other aspects.
Discussion: https://postgr.es/m/OS0PR01MB571649B27E912EA6CC4EEF03942D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Author: Amit Langote, Hou Zhijie, David Rowley
Reviewed-by: Amit Langote, Hou Zhijie