feat(clickhouse-driver): allow to enable compression #9341

Merged
merged 3 commits into from
Mar 25, 2025
26 changes: 15 additions & 11 deletions docs/pages/product/configuration/data-sources/clickhouse.mdx
@@ -31,16 +31,17 @@ CUBEJS_DB_PASS=**********

 ## Environment Variables

-| Environment Variable | Description | Possible Values | Required |
-| ------------------------------- | ----------------------------------------------------------------------------------- | ------------------------- | :------: |
-| `CUBEJS_DB_HOST` | The host URL for a database | A valid database host URL | ✅ |
-| `CUBEJS_DB_PORT` | The port for the database connection | A valid port number | ❌ |
-| `CUBEJS_DB_NAME` | The name of the database to connect to | A valid database name | ✅ |
-| `CUBEJS_DB_USER` | The username used to connect to the database | A valid database username | ✅ |
-| `CUBEJS_DB_PASS` | The password used to connect to the database | A valid database password | ✅ |
-| `CUBEJS_DB_CLICKHOUSE_READONLY` | Whether the ClickHouse user has read-only access or not | `true`, `false` | ❌ |
-| `CUBEJS_DB_MAX_POOL` | The maximum number of concurrent database connections to pool. Default is `20` | A valid number | ❌ |
-| `CUBEJS_CONCURRENCY` | The number of [concurrent queries][ref-data-source-concurrency] to the data source | A valid number | ❌ |
+| Environment Variable | Description | Possible Values | Required |
+| ---------------------------------- | ----------------------------------------------------------------------------------- | ------------------------- | :------: |
+| `CUBEJS_DB_HOST` | The host URL for a database | A valid database host URL | ✅ |
+| `CUBEJS_DB_PORT` | The port for the database connection | A valid port number | ❌ |
+| `CUBEJS_DB_NAME` | The name of the database to connect to | A valid database name | ✅ |
+| `CUBEJS_DB_USER` | The username used to connect to the database | A valid database username | ✅ |
+| `CUBEJS_DB_PASS` | The password used to connect to the database | A valid database password | ✅ |
+| `CUBEJS_DB_CLICKHOUSE_READONLY` | Whether the ClickHouse user has read-only access or not | `true`, `false` | ❌ |
+| `CUBEJS_DB_CLICKHOUSE_COMPRESSION` | Whether the ClickHouse client has compression enabled or not | `true`, `false` | ❌ |
+| `CUBEJS_DB_MAX_POOL` | The maximum number of concurrent database connections to pool. Default is `20` | A valid number | ❌ |
+| `CUBEJS_CONCURRENCY` | The number of [concurrent queries][ref-data-source-concurrency] to the data source | A valid number | ❌ |

 [ref-data-source-concurrency]: /product/configuration/concurrency#data-source-concurrency

@@ -130,6 +131,9 @@ You can connect to a ClickHouse database when your user's permissions are
 [restricted][clickhouse-readonly] to read-only, by setting
 `CUBEJS_DB_CLICKHOUSE_READONLY` to `true`.

+You can connect to a ClickHouse database with compression enabled, by setting
+`CUBEJS_DB_CLICKHOUSE_COMPRESSION` to `true`.
+
 [clickhouse]: https://clickhouse.tech/
 [clickhouse-docs-users]:
   https://clickhouse.tech/docs/en/operations/settings/settings-users/
@@ -144,4 +148,4 @@ You can connect to a ClickHouse database when your user's permissions are
 [self-preaggs-batching]: #batching
 [ref-preaggs]: /product/caching/using-pre-aggregations
 [ref-preaggs-indexes]: /reference/data-model/pre-aggregations#indexes
-[ref-preaggs-rollup-join]: /reference/data-model/pre-aggregations#rollup_join
+[ref-preaggs-rollup-join]: /reference/data-model/pre-aggregations#rollup_join
8 changes: 8 additions & 0 deletions docs/pages/reference/configuration/environment-variables.mdx
@@ -217,6 +217,14 @@ Whether the ClickHouse user has read-only access or not.
 | --------------- | ---------------------- | --------------------- |
 | `true`, `false` | N/A | N/A |

+## `CUBEJS_DB_CLICKHOUSE_COMPRESSION`
+
+Whether the ClickHouse client has compression enabled or not.
+
+| Possible Values | Default in Development | Default in Production |
+| --------------- | ---------------------- | --------------------- |
+| `true`, `false` | `false` | `false` |
+
 ## `CUBEJS_DB_DATABRICKS_ACCEPT_POLICY`

 To accept the license terms for the Databricks JDBC driver, this must be set to
19 changes: 16 additions & 3 deletions packages/cubejs-backend-shared/src/env.ts
@@ -1164,9 +1164,22 @@ const variables: Record<string, (...args: any) => any> = {
   }: {
     dataSource: string,
   }) => (
-    process.env[
-      keyByDataSource('CUBEJS_DB_CLICKHOUSE_READONLY', dataSource)
-    ]
+    get(keyByDataSource('CUBEJS_DB_CLICKHOUSE_READONLY', dataSource))
+      .default('false')
+      .asBool()
   ),
+
+  /**
+   * ClickHouse compression flag.
+   */
+  clickhouseCompression: ({
+    dataSource
+  }: {
+    dataSource: string,
+  }) => (
+    get(keyByDataSource('CUBEJS_DB_CLICKHOUSE_COMPRESSION', dataSource))
+      .default('false')
+      .asBool()
+  ),

   /** ****************************************************************
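
Reviewer note: both getters in this file now go through `get(...).default('false').asBool()`, so an unset variable resolves to `false` and an unrecognized value throws instead of being passed through as a raw string. The sketch below mimics that behavior without any dependency. `parseBoolFlag` is a hypothetical helper written only for illustration, not the `env-var` implementation, and the set of accepted literals is an assumption taken from the error message asserted in the tests of this PR.

```typescript
// Hypothetical stand-in for get(name).default('false').asBool().
// Assumption: accepted literals are those named in the asserted error message.
function parseBoolFlag(
  name: string,
  env: Record<string, string | undefined>,
  fallback: string = 'false',
): boolean {
  const raw = env[name];
  // An unset variable falls back to the default, mirroring .default('false').
  const value = (raw === undefined ? fallback : raw).toLowerCase();

  if (value === 'true' || value === '1') {
    return true;
  }
  if (value === 'false' || value === '0') {
    return false;
  }

  // Anything else is rejected rather than silently coerced.
  throw new Error(
    `env-var: "${name}" should be either "true", "false", "TRUE", "FALSE", 1, or 0`,
  );
}
```

Under these assumptions, `parseBoolFlag('CUBEJS_DB_CLICKHOUSE_COMPRESSION', {})` yields `false`, while a value such as `'wrong'` throws. That matches the reason the read-only tests in this PR switch from string expectations to boolean ones.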
79 changes: 67 additions & 12 deletions packages/cubejs-backend-shared/test/db_env_multi.test.ts
@@ -1511,34 +1511,89 @@ describe('Multiple datasources', () => {
   });

   test('getEnv("clickhouseReadOnly")', () => {
-    process.env.CUBEJS_DB_CLICKHOUSE_READONLY = 'default1';
-    process.env.CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_READONLY = 'postgres1';
-    process.env.CUBEJS_DS_WRONG_DB_CLICKHOUSE_READONLY = 'wrong1';
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toEqual('default1');
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toEqual('postgres1');
+    process.env.CUBEJS_DB_CLICKHOUSE_READONLY = 'true';
+    process.env.CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_READONLY = 'true';
+    process.env.CUBEJS_DS_WRONG_DB_CLICKHOUSE_READONLY = 'true';
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toEqual(true);
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toEqual(true);
     expect(() => getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toThrow(
       'The wrong data source is missing in the declared CUBEJS_DATASOURCES.'
     );

-    process.env.CUBEJS_DB_CLICKHOUSE_READONLY = 'default2';
-    process.env.CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_READONLY = 'postgres2';
-    process.env.CUBEJS_DS_WRONG_DB_CLICKHOUSE_READONLY = 'wrong2';
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toEqual('default2');
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toEqual('postgres2');
+    process.env.CUBEJS_DB_CLICKHOUSE_READONLY = 'false';
+    process.env.CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_READONLY = 'false';
+    process.env.CUBEJS_DS_WRONG_DB_CLICKHOUSE_READONLY = 'false';
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toEqual(false);
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toEqual(false);
     expect(() => getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toThrow(
       'The wrong data source is missing in the declared CUBEJS_DATASOURCES.'
     );

+    process.env.CUBEJS_DB_CLICKHOUSE_READONLY = 'wrong';
+    process.env.CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_READONLY = 'wrong';
+    process.env.CUBEJS_DS_WRONG_DB_CLICKHOUSE_READONLY = 'wrong';
+    expect(() => getEnv('clickhouseReadOnly', { dataSource: 'default' })).toThrow(
+      'env-var: "CUBEJS_DB_CLICKHOUSE_READONLY" should be either "true", "false", "TRUE", "FALSE", 1, or 0'
+    );
+    expect(() => getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toThrow(
+      'env-var: "CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_READONLY" should be either "true", "false", "TRUE", "FALSE", 1, or 0'
+    );
+    expect(() => getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toThrow(
+      'The wrong data source is missing in the declared CUBEJS_DATASOURCES.'
+    );
+
     delete process.env.CUBEJS_DB_CLICKHOUSE_READONLY;
     delete process.env.CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_READONLY;
     delete process.env.CUBEJS_DS_WRONG_DB_CLICKHOUSE_READONLY;
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toBeUndefined();
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toBeUndefined();
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toEqual(false);
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toEqual(false);
     expect(() => getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toThrow(
       'The wrong data source is missing in the declared CUBEJS_DATASOURCES.'
     );
   });

+  test('getEnv("clickhouseCompression")', () => {
+    process.env.CUBEJS_DB_CLICKHOUSE_COMPRESSION = 'true';
+    process.env.CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_COMPRESSION = 'true';
+    process.env.CUBEJS_DS_WRONG_DB_CLICKHOUSE_COMPRESSION = 'true';
+    expect(getEnv('clickhouseCompression', { dataSource: 'default' })).toEqual(true);
+    expect(getEnv('clickhouseCompression', { dataSource: 'postgres' })).toEqual(true);
+    expect(() => getEnv('clickhouseCompression', { dataSource: 'wrong' })).toThrow(
+      'The wrong data source is missing in the declared CUBEJS_DATASOURCES.'
+    );
+
+    process.env.CUBEJS_DB_CLICKHOUSE_COMPRESSION = 'false';
+    process.env.CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_COMPRESSION = 'false';
+    process.env.CUBEJS_DS_WRONG_DB_CLICKHOUSE_COMPRESSION = 'false';
+    expect(getEnv('clickhouseCompression', { dataSource: 'default' })).toEqual(false);
+    expect(getEnv('clickhouseCompression', { dataSource: 'postgres' })).toEqual(false);
+    expect(() => getEnv('clickhouseCompression', { dataSource: 'wrong' })).toThrow(
+      'The wrong data source is missing in the declared CUBEJS_DATASOURCES.'
+    );
+
+    process.env.CUBEJS_DB_CLICKHOUSE_COMPRESSION = 'wrong';
+    process.env.CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_COMPRESSION = 'wrong';
+    process.env.CUBEJS_DS_WRONG_DB_CLICKHOUSE_COMPRESSION = 'wrong';
+    expect(() => getEnv('clickhouseCompression', { dataSource: 'default' })).toThrow(
+      'env-var: "CUBEJS_DB_CLICKHOUSE_COMPRESSION" should be either "true", "false", "TRUE", "FALSE", 1, or 0'
+    );
+    expect(() => getEnv('clickhouseCompression', { dataSource: 'postgres' })).toThrow(
+      'env-var: "CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_COMPRESSION" should be either "true", "false", "TRUE", "FALSE", 1, or 0'
+    );
+    expect(() => getEnv('clickhouseCompression', { dataSource: 'wrong' })).toThrow(
+      'The wrong data source is missing in the declared CUBEJS_DATASOURCES.'
+    );
+
+    delete process.env.CUBEJS_DB_CLICKHOUSE_COMPRESSION;
+    delete process.env.CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_COMPRESSION;
+    delete process.env.CUBEJS_DS_WRONG_DB_CLICKHOUSE_COMPRESSION;
+    expect(getEnv('clickhouseCompression', { dataSource: 'default' })).toEqual(false);
+    expect(getEnv('clickhouseCompression', { dataSource: 'postgres' })).toEqual(false);
+    expect(() => getEnv('clickhouseCompression', { dataSource: 'wrong' })).toThrow(
+      'The wrong data source is missing in the declared CUBEJS_DATASOURCES.'
+    );
+  });
+
   test('getEnv("elasticApiId")', () => {
     process.env.CUBEJS_DB_ELASTIC_APIKEY_ID = 'default1';
     process.env.CUBEJS_DS_POSTGRES_DB_ELASTIC_APIKEY_ID = 'postgres1';
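
Reviewer note: the multi-datasource tests above exercise three naming cases: the `default` data source reads the base key, a declared data source such as `postgres` reads a `CUBEJS_DS_<NAME>_`-prefixed key, and an undeclared data source throws. The sketch below reproduces only that observable behavior. `resolveEnvKey` and its `declared` parameter are hypothetical and inferred from the test expectations; the real `keyByDataSource` helper is not shown in this diff and may differ.

```typescript
// Hypothetical reconstruction of the key naming seen in the tests.
// Assumption: `declared` stands in for the parsed CUBEJS_DATASOURCES list.
function resolveEnvKey(
  baseKey: string,
  dataSource: string,
  declared?: string[],
): string {
  // Single data source mode: every data source name reads the base key.
  if (declared === undefined || declared.length === 0) {
    return baseKey;
  }

  // Multi data source mode: undeclared names are rejected.
  if (declared.indexOf(dataSource) === -1) {
    throw new Error(
      `The ${dataSource} data source is missing in the declared CUBEJS_DATASOURCES.`,
    );
  }

  if (dataSource === 'default') {
    return baseKey;
  }

  // CUBEJS_DB_X becomes CUBEJS_DS_<NAME>_DB_X for a named data source.
  return baseKey.replace(/^CUBEJS_/, `CUBEJS_DS_${dataSource.toUpperCase()}_`);
}
```

With `declared` set to `['default', 'postgres']`, the `postgres` data source resolves `CUBEJS_DB_CLICKHOUSE_COMPRESSION` to `CUBEJS_DS_POSTGRES_DB_CLICKHOUSE_COMPRESSION`, which is the variable name that appears in the asserted error messages.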
63 changes: 51 additions & 12 deletions packages/cubejs-backend-shared/test/db_env_single.test.ts
@@ -959,20 +959,59 @@ describe('Single datasources', () => {
   });

   test('getEnv("clickhouseReadOnly")', () => {
-    process.env.CUBEJS_DB_CLICKHOUSE_READONLY = 'default1';
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toEqual('default1');
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toEqual('default1');
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toEqual('default1');
-
-    process.env.CUBEJS_DB_CLICKHOUSE_READONLY = 'default2';
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toEqual('default2');
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toEqual('default2');
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toEqual('default2');
+    process.env.CUBEJS_DB_CLICKHOUSE_READONLY = 'true';
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toEqual(true);
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toEqual(true);
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toEqual(true);
+
+    process.env.CUBEJS_DB_CLICKHOUSE_READONLY = 'false';
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toEqual(false);
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toEqual(false);
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toEqual(false);
+
+    process.env.CUBEJS_DB_CLICKHOUSE_READONLY = 'wrong';
+    expect(() => getEnv('clickhouseReadOnly', { dataSource: 'default' })).toThrow(
+      'env-var: "CUBEJS_DB_CLICKHOUSE_READONLY" should be either "true", "false", "TRUE", "FALSE", 1, or 0'
+    );
+    expect(() => getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toThrow(
+      'env-var: "CUBEJS_DB_CLICKHOUSE_READONLY" should be either "true", "false", "TRUE", "FALSE", 1, or 0'
+    );
+    expect(() => getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toThrow(
+      'env-var: "CUBEJS_DB_CLICKHOUSE_READONLY" should be either "true", "false", "TRUE", "FALSE", 1, or 0'
+    );

     delete process.env.CUBEJS_DB_CLICKHOUSE_READONLY;
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toBeUndefined();
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toBeUndefined();
-    expect(getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toBeUndefined();
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'default' })).toEqual(false);
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'postgres' })).toEqual(false);
+    expect(getEnv('clickhouseReadOnly', { dataSource: 'wrong' })).toEqual(false);
   });

+  test('getEnv("clickhouseCompression")', () => {
+    process.env.CUBEJS_DB_CLICKHOUSE_COMPRESSION = 'true';
+    expect(getEnv('clickhouseCompression', { dataSource: 'default' })).toEqual(true);
+    expect(getEnv('clickhouseCompression', { dataSource: 'postgres' })).toEqual(true);
+    expect(getEnv('clickhouseCompression', { dataSource: 'wrong' })).toEqual(true);
+
+    process.env.CUBEJS_DB_CLICKHOUSE_COMPRESSION = 'false';
+    expect(getEnv('clickhouseCompression', { dataSource: 'default' })).toEqual(false);
+    expect(getEnv('clickhouseCompression', { dataSource: 'postgres' })).toEqual(false);
+    expect(getEnv('clickhouseCompression', { dataSource: 'wrong' })).toEqual(false);
+
+    process.env.CUBEJS_DB_CLICKHOUSE_COMPRESSION = 'wrong';
+    expect(() => getEnv('clickhouseCompression', { dataSource: 'default' })).toThrow(
+      'env-var: "CUBEJS_DB_CLICKHOUSE_COMPRESSION" should be either "true", "false", "TRUE", "FALSE", 1, or 0'
+    );
+    expect(() => getEnv('clickhouseCompression', { dataSource: 'postgres' })).toThrow(
+      'env-var: "CUBEJS_DB_CLICKHOUSE_COMPRESSION" should be either "true", "false", "TRUE", "FALSE", 1, or 0'
+    );
+    expect(() => getEnv('clickhouseCompression', { dataSource: 'wrong' })).toThrow(
+      'env-var: "CUBEJS_DB_CLICKHOUSE_COMPRESSION" should be either "true", "false", "TRUE", "FALSE", 1, or 0'
+    );
+
+    delete process.env.CUBEJS_DB_CLICKHOUSE_COMPRESSION;
+    expect(getEnv('clickhouseCompression', { dataSource: 'default' })).toEqual(false);
+    expect(getEnv('clickhouseCompression', { dataSource: 'postgres' })).toEqual(false);
+    expect(getEnv('clickhouseCompression', { dataSource: 'wrong' })).toEqual(false);
+  });
+
   test('getEnv("elasticApiId")', () => {
10 changes: 8 additions & 2 deletions packages/cubejs-clickhouse-driver/src/ClickHouseDriver.ts
@@ -111,6 +111,7 @@ type ClickHouseDriverConfig = {
   database: string,
   requestTimeout: number,
   exportBucket: ClickhouseDriverExportAWS | null,
+  compression: { response?: boolean; request?: boolean },
   clickhouseSettings: ClickHouseSettings,
 };

@@ -150,8 +151,7 @@ export class ClickHouseDriver extends BaseDriver implements DriverInterface {
     const database = config.database ?? (getEnv('dbName', { dataSource }) as string) ?? 'default';

     // TODO this is a bit inconsistent with readOnly
-    this.readOnlyMode =
-      getEnv('clickhouseReadOnly', { dataSource }) === 'true';
+    this.readOnlyMode = getEnv('clickhouseReadOnly', { dataSource });

     // Expect that getEnv('dbQueryTimeout') will always return a value
     const requestTimeoutEnv: number = getEnv('dbQueryTimeout', { dataSource }) * 1000;
@@ -165,6 +165,11 @@ export class ClickHouseDriver extends BaseDriver implements DriverInterface {
       exportBucket: this.getExportBucket(dataSource),
       readOnly: !!config.readOnly,
       requestTimeout,
+      compression: {
+        // Response compression can't be enabled for a user with readonly=1, as ClickHouse will not allow settings modifications for such user.
+        response: this.readOnlyMode ? false : getEnv('clickhouseCompression', { dataSource }),
+        request: getEnv('clickhouseCompression', { dataSource }),
+      },
       clickhouseSettings: {
         // If ClickHouse user's permissions are restricted with "readonly = 1",
         // change settings queries are not allowed. Thus, "join_use_nulls" setting
@@ -224,6 +229,7 @@ export class ClickHouseDriver extends BaseDriver implements DriverInterface {
       username: this.config.username,
       password: this.config.password,
       database: this.config.database,
+      compression: this.config.compression,
       clickhouse_settings: this.config.clickhouseSettings,
       request_timeout: this.config.requestTimeout,
       max_open_connections: maxPoolSize,
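
Reviewer note: the driver change derives two flags from one environment variable. Request compression follows `CUBEJS_DB_CLICKHOUSE_COMPRESSION` directly, while response compression is forced off in read-only mode because, per the comment in the diff, ClickHouse does not allow settings modifications for a `readonly=1` user. The sketch below isolates that rule as a pure function; `buildCompression` and `CompressionConfig` are hypothetical names used for illustration and are not part of the driver.

```typescript
// Shape taken from the `compression` field added to ClickHouseDriverConfig.
type CompressionConfig = { response?: boolean; request?: boolean };

// Hypothetical helper isolating the rule from the constructor hunk above.
function buildCompression(
  readOnlyMode: boolean,
  compressionEnabled: boolean,
): CompressionConfig {
  return {
    // Response compression is disabled for read-only users regardless of the flag.
    response: readOnlyMode ? false : compressionEnabled,
    // Request compression follows the flag as-is.
    request: compressionEnabled,
  };
}
```

One consequence worth noting: with both `CUBEJS_DB_CLICKHOUSE_READONLY` and `CUBEJS_DB_CLICKHOUSE_COMPRESSION` set to `true`, only request bodies are compressed.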