Skip to content

Latest commit

 

History

History
176 lines (110 loc) · 5.78 KB

20211005_01.md

File metadata and controls

176 lines (110 loc) · 5.78 KB

PostgreSQL 数组或JSON内容的模糊匹配索引插件: parray_gin

作者

digoal

日期

2021-10-05

标签

PostgreSQL , parray_gin , 索引


背景

https://pgxn.org/dist/parray_gin/1.3.3/

结合了pg_trgm, 将数组或JSON中的value打散成token后进行索引构建. 实现数组或JSON元素级别的模糊匹配.

用法如下:

https://api.pgxn.org/src/parray_gin/parray_gin-1.3.3/doc/parray_gin.md

parray_gin extension

Installing

	CREATE EXTENSION parray_gin;  

Extension is compatible with PostgreSQL 9.1 through 14.
Description

Extension parray_gin provides GIN index and operator support for arrays with
partial match.

Usage

Extension contains operator class named parray_gin_ops for using GIN index
with the text arrays. Matching can be strict (array items must be equal)
or partial (array items of query may contain like expressions).
Surely operators can be used separately from the index.

Index can be created for the table with the following commands:

	-- test table, column `val` needs to be indexed  
	create table test_table(id bigserial, val text[]);  
	-- create the index  
	create index test_tags_idx on test_table using gin (val parray_gin_ops);  
	-- select using index  
	select * from test_table where val @> array['must','contain'];  
	-- select using index  
	select * from test_table where val @@> array['what%like%'];  

GIN index can be used with three operators: @>, <@, @@>, <@@.

Developers of an extension succesfully used GIN index on JSON arrays extracted
from JSON text fields using json_accessors extension.

GIN index is based on trigram decomposition. Trigrams implementation from
pg_trgm contrib module is used.
Indexed keys are splitted to trigrams which are stored as GIN keys.
Query is splitted to trigrams too and carefully checked against GIN keys.
Query can contain like expressions which could slow down an index a little.
Trigram index can fetch rows with false positive so provided array matching
operators recheck fetched rows for sure.

Interface

Operators

@> (text[], text[]) -> bool

Strict array contains. Returns true if LHS array contains all items from
the RHS array.

Sample index search:

	$ select * from test_table;  
	{star,wars}  
	{long,time,ago,in}  
	{a,galaxy,far}  
	{far,away}  
	  
	-- must contain any item from right side, strict matched  
	$ select * from test_table where val @> array['far'];  
	{a,galaxy,far}  
	{far,away}  

<@ (text[], text[]) -> bool

Strict array contained. Returns true if LHS array is contained by the
RHS array.

Sample index search:

	-- must contain all items from right side, partial matched  
	$ select * from test_table where val <@ array['galaxy','ago','vader'];  
	{long,time,ago,in}  
	{a,galaxy,far}  

@@> (text[], text[]) -> bool

Partial array contains. Returns true if LHS array contains all items from
the RHS array,
matched partially (i.e. 'foobar' ~~ 'foo%' or 'foobar' ~~ '%oo%)

Sample index search:

	-- must contain any item from right side, partially matched  
	$ select * from test_table where val @@> array['%ar%'];  
	{star,wars}  

<@@ (text[], text[]) -> bool

Partial array contained by. Returns true if LHS array is contained by all
items from the RHS array, matched partially (i.e. foobar contains oobar).
Inversion of the @@>.

Sample index search:

	-- must contain all items from right side, partially matched  
	$ select * from test_table where val <@@ array['%ar%','vader'];  
	{star,wars}  

Operator class

operator class parray_gin_ops

GIN-capable operator class. Support indexing strategies based on
these operators.

Author

Developed and maintaned by Eugene Seliverstov

Copyright and License

You can use any code from this project under the terms of
PostgreSQL License.

Please consult with the COPYING for license information.

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

digoal's wechat