
Update DimTableSegmentAssignment to include both OFFLINE and REALTIME servers #6530

Merged

Conversation

@cbalci (Contributor) commented Feb 2, 2021

Description

Fixes a bug with the DimTableSegmentAssignment logic where we skipped REALTIME servers when distributing segments.
We want dimension tables to be distributed to REALTIME servers as well as OFFLINE servers to enable "Realtime Fact Table to Dimension Table Joins".
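For illustration only, a minimal sketch of the intended placement behavior (not the code merged in this PR): the tenant's OFFLINE-tagged and REALTIME-tagged servers are combined, and every dimension table segment is mapped onto all of them. In Pinot the two server lists would come from `HelixHelper.getInstancesWithTag` with the corresponding tenant tags, as in the snippet quoted later in this conversation; the class and method names below are made up for the sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch of broadcast placement for dimension tables: every segment is assigned to
// every server of the tenant, regardless of whether the server carries the OFFLINE
// or the REALTIME tag. Class/method names are illustrative, not Pinot internals.
public final class DimTableBroadcastSketch {

  static Map<String, Map<String, String>> assignAll(List<String> segments,
      List<String> offlineTaggedServers, List<String> realtimeTaggedServers) {
    Map<String, Map<String, String>> assignment = new TreeMap<>();
    for (String segment : segments) {
      Map<String, String> instanceStateMap = new TreeMap<>();
      // Union of both tag groups: OFFLINE servers ...
      for (String server : offlineTaggedServers) {
        instanceStateMap.put(server, "ONLINE");
      }
      // ... and REALTIME servers, which the previous logic skipped.
      for (String server : realtimeTaggedServers) {
        instanceStateMap.put(server, "ONLINE");
      }
      assignment.put(segment, instanceStateMap);
    }
    return assignment;
  }
}
```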

Testing

Unit tests are updated accordingly.

@Jackie-Jiang (Contributor) left a comment

LGTM.
(Not related to this PR) For easier management of the cluster, I would recommend having a separate tenant for the dimension table so that dimension table can be shared by multiple tenants. You can configure the server with multiple tags, and the server can belong to multiple tenants.

@yupeng9 (Contributor) commented Feb 3, 2021

> LGTM.
> (Not related to this PR) For easier management of the cluster, I would recommend having a separate tenant for the dimension table so that dimension table can be shared by multiple tenants. You can configure the server with multiple tags, and the server can belong to multiple tenants.

Since this is the first time we have queries across tables, I think it's a good time to discuss the policy. There are two options: join tables within the tenant, and join tables across tenants. Personally I prefer a default constraint that the tables to join are within the same tenant for better isolation. But given the broadcast join nature, the dimension table is in fact copied to all tenants. Nevertheless, I feel it's good to have this high-level consideration.

@yupeng9 (Contributor) left a comment

LGTM

Diff excerpt under review:

```java
  }

  @Override
  public Map<String, Map<String, String>> rebalanceTable(Map<String, Map<String, String>> currentAssignment,
      Map<InstancePartitionsType, InstancePartitions> instancePartitionsMap, @Nullable List<Tier> sortedTiers,
      @Nullable Map<String, InstancePartitions> tierInstancePartitionsMap, Configuration config) {
    String serverTag = TagNameUtils.extractOfflineServerTag(_tenantConfig);
    List<String> instances = HelixHelper.getInstancesWithTag(_helixManager, serverTag);

```

Review comment on the diff (Contributor):

empty line

Reply (Contributor Author):

Removed.

@siddharthteotia (Contributor) commented

> LGTM.
> (Not related to this PR) For easier management of the cluster, I would recommend having a separate tenant for the dimension table so that dimension table can be shared by multiple tenants. You can configure the server with multiple tags, and the server can belong to multiple tenants.

> Since this is the first time we have queries across tables, I think it's a good time to discuss the policy. There are two options: join tables within the tenant, and join tables across tenants. Personally I prefer a default constraint that the tables to join are within the same tenant for better isolation. But given the broadcast join nature, the dimension table is in fact copied to all tenants. Nevertheless, I feel it's good to have this high-level consideration.

Currently there is no broadcast happening right?

I thought the design was to have full dimension table colocated with fact table servers so that each server does a local join with part of fact table and full dimension table.

For the eventual implementation of fact-dimension using broadcast, I don't think we should have cross tenant joins since the data movement will take longer. We can probably allow it but the recommendation should probably be joins within the same tenant. Joins across tenant essentially become a cross cluster join which is likely to be expensive (but depends on the topology I guess)

@cbalci (Contributor, Author) commented Feb 3, 2021

Thanks for the reviews @Jackie-Jiang , @yupeng9 .

> Since this is the first time we have queries across tables, I think it's a good time to discuss the policy. There are two options: join tables within the tenant, and join tables across tenants. Personally I prefer a default constraint that the tables to join are within the same tenant for better isolation. But given the broadcast join nature, the dimension table is in fact copied to all tenants. Nevertheless, I feel it's good to have this high-level consideration.

@yupeng9, small correction on the last part, currently dimension tables are not copied to all tenants, but all servers in a single tenant.

I also agree with your opinion here to default to restricting joins to in-tenant only. Otherwise, besides isolation, unnecessary memory usage will be another issue since dim tables are loaded into heap directly. I think @Jackie-Jiang's tag based cross-tenant solution could be useful for cases where folks have set up different tenants for REALTIME and OFFLINE servers of the same table and would like to share a dimension table.

@siddharthteotia

> Currently there is no broadcast happening right?

You're right, there is no data movement at query time. Dimension tables are distributed only at segment placement time, according to the segment assignment policy as updated in this PR.

@yupeng9 (Contributor) commented Feb 3, 2021

> Thanks for the reviews @Jackie-Jiang , @yupeng9 .
>
> > Since this is the first time we have queries across tables, I think it's a good time to discuss the policy. There are two options: join tables within the tenant, and join tables across tenants. Personally I prefer a default constraint that the tables to join are within the same tenant for better isolation. But given the broadcast join nature, the dimension table is in fact copied to all tenants. Nevertheless, I feel it's good to have this high-level consideration.
>
> @yupeng9, small correction on the last part, currently dimension tables are not copied to all tenants, but all servers in a single tenant.

Yes, that's right: all servers of the same tenant.

> I also agree with your opinion here to default to restricting joins to in-tenant only. Otherwise, besides isolation, unnecessary memory usage will be another issue since dim tables are loaded into heap directly. I think @Jackie-Jiang's tag based cross-tenant solution could be useful for cases where folks have set up different tenants for REALTIME and OFFLINE servers of the same table and would like to share a dimension table.
>
> @siddharthteotia
>
> > Currently there is no broadcast happening right?
>
> You're right, there is no data movement at query time. Dimension tables are distributed only at segment placement time, according to the segment assignment policy as updated in this PR.

@yupeng9 (Contributor) commented Feb 3, 2021

> > LGTM.
> > (Not related to this PR) For easier management of the cluster, I would recommend having a separate tenant for the dimension table so that dimension table can be shared by multiple tenants. You can configure the server with multiple tags, and the server can belong to multiple tenants.
> >
> > Since this is the first time we have queries across tables, I think it's a good time to discuss the policy. There are two options: join tables within the tenant, and join tables across tenants. Personally I prefer a default constraint that the tables to join are within the same tenant for better isolation. But given the broadcast join nature, the dimension table is in fact copied to all tenants. Nevertheless, I feel it's good to have this high-level consideration.
>
> Currently there is no broadcast happening right?

I think we are on the same page: the dimension table is distributed (i.e. broadcasted) to all servers, though not at the query time.

> I thought the design was to have full dimension table colocated with fact table servers so that each server does a local join with part of fact table and full dimension table.
>
> For the eventual implementation of fact-dimension using broadcast, I don't think we should have cross tenant joins since the data movement will take longer. We can probably allow it but the recommendation should probably be joins within the same tenant. Joins across tenant essentially become a cross cluster join which is likely to be expensive (but depends on the topology I guess)

@siddharthteotia (Contributor) commented

> LGTM.
> (Not related to this PR) For easier management of the cluster, I would recommend having a separate tenant for the dimension table so that dimension table can be shared by multiple tenants. You can configure the server with multiple tags, and the server can belong to multiple tenants.

@Jackie-Jiang , the current design is based on pre-placement of the dimension table with fact table servers, so there is no broadcast at query time. With this setup, how can we have the dimension table in a separate tenant?

On a related note, we can possibly have the dimension table assigned to multiple tags/tenants. So for joining a hybrid fact table with an offline dimension table, the dim table will be present in both the offline tenant and realtime tenant servers of the fact table. So, there will be one tenant-local join on the offline side and one tenant-local join on the realtime side. I believe this is what you mean by saying the dimension table can be shared by multiple tenants?

@Jackie-Jiang (Contributor) commented

@siddharthteotia if we have 5 tables on different tenants, all of them need to join with the same dimension table. If we always colocate them using the same tenant, we need to have 5 copies of the dimension table stored in the deep store. If we use a separate tenant for the dimension table, we only need to store one copy. Each server can be configured with multiple tenants, so that both the fact table and dimension table are still colocated.
This is out of the scope of this PR, just how to manage the cluster.
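As a rough illustration of the multi-tag idea (hypothetical server and tag names, not a Pinot API): a server carries one tag per tenant it belongs to, so the same instances can serve both a fact table's tenant and a shared dimension tenant, keeping the tables colocated while only one copy of the dimension table is stored.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative only: shows how servers tagged with both a fact tenant and a shared
// dimension tenant keep the tables colocated. Tags and instance names are hypothetical.
public final class MultiTenantColocationSketch {

  public static void main(String[] args) {
    // Each server may carry several tenant tags.
    Map<String, Set<String>> serverTags = Map.of(
        "Server_host1_8098", Set.of("factTenantA_OFFLINE", "dimTenant_OFFLINE"),
        "Server_host2_8098", Set.of("factTenantB_OFFLINE", "dimTenant_OFFLINE"),
        "Server_host3_8098", Set.of("factTenantA_REALTIME", "dimTenant_OFFLINE"));

    // Servers hosting the shared dimension table are simply those carrying the dim tenant tag.
    List<String> dimServers = serverTags.entrySet().stream()
        .filter(e -> e.getValue().contains("dimTenant_OFFLINE"))
        .map(Map.Entry::getKey)
        .sorted()
        .collect(Collectors.toList());

    // Every fact-table server above also appears here, so each local join still sees a
    // full copy of the dimension table, while deep store keeps only one copy of its segments.
    System.out.println("Dimension table hosted on: " + dimServers);
  }
}
```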

@Jackie-Jiang merged commit 2be1520 into apache:master Feb 3, 2021
@yupeng9 (Contributor) commented Feb 3, 2021

> @siddharthteotia if we have 5 tables on different tenants, all of them need to join with the same dimension table. If we always colocate them using the same tenant, we need to have 5 copies of the dimension table stored in the deep store. If we use a separate tenant for the dimension table, we only need to store one copy. Each server can be configured with multiple tenants, so that both the fact table and dimension table are still colocated.
> This is out of the scope of this PR, just how to manage the cluster.

Agreed that it's outside the scope of this PR. But for better cluster management, I suggest we make 5 copies (though they can be soft copies) for better isolation. In particular, I suggest we consider using tenant.table to resolve a table in the future, so that we can support tables with the same name in different namespaces (i.e. tenants).
