Skip to content

Latest commit

 

History

History
135 lines (86 loc) · 6.73 KB

20240428_01.md

File metadata and controls

135 lines (86 loc) · 6.73 KB

DB吐槽大会,第89期 - 数组or多值列不支持外键约束

作者

digoal

日期

2024-04-28

标签

PostgreSQL , PolarDB , DuckDB , array , 数组 , tsvector , jsonb , json , vector , 外键约束


背景

视频回放

数组or多值列不支持外键约束

外键约束用于约束跨表数据的强关联关系, 例如有一张用户表, 每个用户一条记录, 另有一张用户喜好表(表示这个用户关注了哪些其他用户), 可以设计为每个点对点的关系一条记录, 也可以使用一对多的设计.

用户表

create table users (  
  uid int primary key,   
  info text,   
  ts timestamp    
);    

一对一的用户关系表, 可以使用foreign key约束, 通过on delete, on update 来配置 no action|RESTRICT|CASCADE|SET NULL|SET DEFAULT 行为.

create table user_like_list1 (  
  uid int primary key,   
  friends int references users(uid)  
);    
  
create index on user_like_list1 (friends);   
postgres=# \d+ user_like_list1  
                                     Table "public.user_like_list1"  
 Column  |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description   
---------+---------+-----------+----------+---------+---------+-------------+--------------+-------------  
 uid     | integer |           | not null |         | plain   |             |              |   
 friends | integer |           |          |         | plain   |             |              |   
Indexes:  
    "user_like_list1_pkey" PRIMARY KEY, btree (uid)  
    "user_like_list1_friends_idx" btree (friends)  
Foreign-key constraints:  
    "user_like_list1_friends_fkey" FOREIGN KEY (friends) REFERENCES users(uid)  
Access method: heap  

使用一对一的关系表, 在查询某个用户关注了哪些其他用户时, 需要返回很多条记录, 数据也可能散落在很多数据块上, 性能不太好, 可以看我之前的一些文章的分析和性能优化:

《PostgreSQL index include - 类聚簇表与应用(append only, IoT时空轨迹, 离散多行扫描与返回)》

《重新发现PostgreSQL之美 - 8 轨迹业务IO杀手克星index include(覆盖索引)》

《PostgreSQL IoT,车联网 - 实时轨迹、行程实践 2 - (含index only scan类聚簇表效果)》

《PostgreSQL IoT,车联网 - 实时轨迹、行程实践 1》

一对多的关系查询起来性能就会好很多, 因为被关注的多个用户数据聚集到了1个数组里面, 但是问题来了, 如果用户注销(delete)或者某些原因变更了uid, 那么这个是无法通过外键来进行强约束的.

create table user_like_list (  
  uid int primary key,   
  friends int[] references users (uid)   
);    
  
ERROR:  42804: foreign key constraint "user_like_list_friends_fkey" cannot be implemented  
DETAIL:  Key columns "friends" and "uid" are of incompatible types: integer[] and integer.  
LOCATION:  ATAddForeignKeyConstraint, tablecmds.c:9220  

为了解决这个问题, 需要内核的支持, 或者在pk字段上使用check约束检查值是否在关联表的多值字段内存在, 为了保障性能需要在多值字段上添加gin索引.

外键相关参考信息:

https://www.postgresql.org/docs/current/sql-createtable.html

  REFERENCES reftable [ ( refcolumn ) ] [ MATCH FULL | MATCH PARTIAL | MATCH SIMPLE ]  
    [ ON DELETE referential_action ] [ ON UPDATE referential_action ] }  

The ON DELETE clause specifies the action to perform when a referenced row in the referenced table is being deleted.

Likewise, the ON UPDATE clause specifies the action to perform when a referenced column in the referenced table is being updated to a new value. If the row is updated, but the referenced column is not actually changed, no action is done.

Referential actions other than the NO ACTION check cannot be deferred, even if the constraint is declared deferrable. There are the following possible actions for each clause:

NO ACTION
Produce an error indicating that the deletion or update would create a foreign key constraint violation. If the constraint is deferred, this error will be produced at constraint check time if there still exist any referencing rows. This is the default action.

RESTRICT
Produce an error indicating that the deletion or update would create a foreign key constraint violation. This is the same as NO ACTION except that the check is not deferrable.

CASCADE
Delete any rows referencing the deleted row, or update the values of the referencing column(s) to the new values of the referenced columns, respectively.

SET NULL [ ( column_name [, ... ] ) ]
Set all of the referencing columns, or a specified subset of the referencing columns, to null. A subset of columns can only be specified for ON DELETE actions.

SET DEFAULT [ ( column_name [, ... ] ) ]
Set all of the referencing columns, or a specified subset of the referencing columns, to their default values. A subset of columns can only be specified for ON DELETE actions. (There must be a row in the referenced table matching the default values, if they are not null, or the operation will fail.)

If the referenced column(s) are changed frequently, it might be wise to add an index to the referencing column(s) so that referential actions associated with the foreign key constraint can be performed more efficiently.

digoal's wechat