Multi-tenant remote tables

### Company or project name

ClickHouse

### Use case

Let's suppose we have many ClickHouse instances of different customers, and all of them have some small tables of the same structure - for example, system tables. We don't want to store these tables on every customer cluster. Instead we want to use a single remote table on a shared cluster. This will save cost on storage due to better compression, lower resource usage on customer clusters, and combining with asynchronous inserts, lower the number of requests to S3.

### Describe the solution you'd like

Create a new table engine `MultiTenantRemote`. It will accept the cluster name, remote database, and table name, like a `Distributed` table. It will use two parameters from the server configuration: `customer_id` and `customer_secret`. On INSERT, it will add two columns to the block: `customer_id LowCardinality(String), customer_secret LowCardinality(String)` and perform an INSERT into the remote table. On SELECT, it will query the remote server and send the `customer_id` and `customer_secret` as query parameters.

Restricting access to the data has to be implemented on the remote server side. It can be done in the following way:

The table has `customer_id` as the first column of the sorting key for fast queries.

It has a row policy that restricts selects to `customer_id = {customer_id:String} AND customer_secret = {customer_secret:String}`.

A more sophisticated row policy condition can support time-based key rotation. The `customer_secret` will contain the cutoff time and HMAC, using the key that is only known to the remote server. The row policy validates the HMAC, extracts the cutoff time, and filters records using this cutoff time. So that the access to the newer data will be restricted to newer keys, which will be periodically rotated.

The table can have a constraint that allows only legitimate inserts. This can be implemented by adding HMAC to the `customer_secret`, which is checked on the remote server side, using the key that is only known to the remote server.

Creating the tables on the remote server and configuring the remote server are out of scope of the implementation - it has to be done manually or with a script.

The tables cannot be altered.

### Describe alternatives you've considered

_No response_

### Additional context

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi-tenant remote tables #91632

Company or project name

Use case

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Multi-tenant remote tables #91632

Description

Company or project name

Use case

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions