Search before asking
Description
当前colocate的限制:
- 多租户部署时副本冗余:例如租户A需要JOIN租户B的表,租户C需要JOIN租户B的表,但是租户A不需要JOIN租户C的表,理论上我们只需要在租户A/C的机器上增加租户B的表的副本,租户A/C就可以在自己的机器上做ColocateJoin,但是当前的Colocate实现要求租户A/B/C的表必须在A/B/C三个租户的机器上都放置副本,导致副本冗余。
- 分桶数限制:当前colocate的实现要求两个表的分桶数必须相同才能做ColocateJoin,当JOIN的表的数量级差别比较大时,难以设置统一的分桶数。
方案:
- 支持租户粒度的colcoate,避免副本冗余问题。同时,单个租户的colocate失效不影响其他租户。
例如租户A/B/C上的3个表如下,t1与t2、t3与t2可以支持colocate join,没有副本冗余问题。
CREATE TABLE `t1` (
`id` bigint NOT NULL,
`m1` bigint NULL
) ENGINE=OLAP
duplicate KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY hash(id) BUCKETS 2
PROPERTIES (
"replication_allocation" = "tag.location.a: 1",
"colocate_group"="tag.location.a:coloc1"
);
CREATE TABLE `t2` (
`id` bigint NOT NULL,
`m2` bigint NULL
) ENGINE=OLAP
duplicate KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY hash(id) BUCKETS 2
PROPERTIES (
"replication_allocation" = "tag.location.a: 1,tag.location.b: 1,tag.location.c: 1",
"colocate_group"="tag.location.a:coloc1,tag.location.c:coloc3"
);
CREATE TABLE `t3` (
`id` bigint NOT NULL,
`m3` bigint NULL
) ENGINE=OLAP
duplicate KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY hash(id) BUCKETS 2
PROPERTIES (
"replication_allocation" = "tag.location.c: 1",
"colocate_group"="tag.location.c:coloc3"
);
- 支持分桶数为倍数关系的colocate
例如下面的t4和t5的分桶数是t1的倍数,理论上t1/t4/t5也可以支持colocate join。(由于历史原因,我们内部的实现使用了两个表属性)
CREATE TABLE `t4` (
`id` bigint NOT NULL,
`m4` bigint NULL
) ENGINE=OLAP
duplicate KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY hash(id) BUCKETS 4
PROPERTIES (
"replication_allocation" = "tag.location.a: 1",
"colocate_slave"="tag.location.a:coloc1"
);
CREATE TABLE `t5` (
`id` bigint NOT NULL,
`m5` bigint NULL
) ENGINE=OLAP
duplicate KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY hash(id) BUCKETS 6
PROPERTIES (
"replication_allocation" = "tag.location.a: 1",
"colocate_slave"="tag.location.a:coloc1"
);
Use case
No response
Related issues
No response
Are you willing to submit PR?
Code of Conduct
Search before asking
Description
当前colocate的限制:
方案:
例如租户A/B/C上的3个表如下,t1与t2、t3与t2可以支持colocate join,没有副本冗余问题。
例如下面的t4和t5的分桶数是t1的倍数,理论上t1/t4/t5也可以支持colocate join。(由于历史原因,我们内部的实现使用了两个表属性)
Use case
No response
Related issues
No response
Are you willing to submit PR?
Code of Conduct