Add support for creating table with MergeTree engine. #7135
ClickHouseTableProperties.getPartitionBy(tableProperties).ifPresent(value -> tableOptions.add("PARTITION BY " + value));
ClickHouseTableProperties.getSampleBy(tableProperties).ifPresent(value -> tableOptions.add("SAMPLE BY " + value));
}
else {
We should validate that order by, primary key, partition by, and sample by are not present if the engine is not MergeTree.
Otherwise we silently ignore table properties explicitly requested by the user.
The other option would be to pass everything to ClickHouse and let ClickHouse do the validation.
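The first option (validating in the connector) could look roughly like the sketch below. It is only an illustration: the class name, the method, the use of IllegalArgumentException, and the property names other than partition_by are assumptions, not the connector's actual API.

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names here are assumptions, not the connector's API.
final class EnginePropertyValidator
{
    // Properties that only make sense for MergeTree tables (names assumed)
    private static final List<String> MERGE_TREE_ONLY =
            List.of("order_by", "primary_key", "partition_by", "sample_by");

    private EnginePropertyValidator() {}

    // Rejects MergeTree-only properties when another engine is requested,
    // instead of silently dropping them.
    static void validate(String engine, Map<String, Object> tableProperties)
    {
        if (engine.equalsIgnoreCase("MergeTree")) {
            return;
        }
        for (String property : MERGE_TREE_ONLY) {
            if (tableProperties.containsKey(property)) {
                throw new IllegalArgumentException(
                        String.format("Table property '%s' is not supported by engine %s", property, engine));
            }
        }
    }
}
```

With this in place, a Log table that asks for partition_by fails early with a message naming the offending property, while a MergeTree table passes through unchanged.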
I mean that I will add the required property (order by) and the optional properties (if present) for the MergeTree engine.
For the Log engine, none of the above properties are required, so I just ignore them.
4ea7e9c
to
55b9193
Compare
Some comments.
A question about testing.
If ClickHouse passes the testAddColumn and testDropColumn defined in the superclass, then instead of overriding them, add new methods with more descriptive names (e.g. testDropColumn -> testDropMergeTreeColumn, testAddColumn -> testAddMergeTreeColumn).
If the superclass tests fail, I'd suggest throwing a SkipException with the reason. This way, if the base test class is ever improved in the future, all connectors will get the improved test coverage for free.
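A rough sketch of that testing pattern follows. Everything in it is hypothetical: the class names are made up, SkipException is a local stand-in for the test framework's skip exception so the sketch compiles on its own, and which inherited test passes or fails is chosen arbitrarily for illustration.

```java
// Local stand-in for the test framework's skip exception (assumption),
// defined here only so the sketch is self-contained.
class SkipException extends RuntimeException
{
    SkipException(String reason)
    {
        super(reason);
    }
}

// Hypothetical base class shared by all connector tests
class BaseConnectorTestSketch
{
    public void testAddColumn()
    {
        // generic ADD COLUMN checks would live here
    }

    public void testDropColumn()
    {
        // generic DROP COLUMN checks would live here
    }
}

class ClickHouseConnectorTestSketch extends BaseConnectorTestSketch
{
    // Assumed case 1: the inherited testAddColumn passes, so it is NOT overridden.
    // Engine-specific coverage gets its own descriptive name instead.
    public void testAddMergeTreeColumn()
    {
        // MergeTree-specific ADD COLUMN checks would live here
    }

    // Assumed case 2: the inherited testDropColumn cannot pass, so it is
    // skipped with a reason rather than replaced by a different test body.
    @Override
    public void testDropColumn()
    {
        throw new SkipException("placeholder: reason the inherited test cannot run here");
    }
}
```

The inherited testAddColumn keeps running unchanged, so any later improvement to the base class applies to this connector automatically.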
7e9cee7
to
60c7c38
Compare
false),
new PropertyMetadata<>(
ORDER_BY_PROPERTY,
"columns to be the sorting key. Required",
"Required for tables using MergeTree engine property."
But I think it's OK to drop "Required" and "Optional" here and just describe this in docs.
These descriptions of the table properties come from the official documentation.
OK, I will change them.
ClickHouseTableProperties.getPrimaryKey(tableProperties).ifPresent(value -> tableOptions.add("PRIMARY KEY " + value));
ClickHouseTableProperties.getPartitionBy(tableProperties).ifPresent(value -> tableOptions.add("PARTITION BY " + value));
ClickHouseTableProperties.getSampleBy(tableProperties).ifPresent(value -> tableOptions.add("SAMPLE BY " + value));
}
Relating to #7135 (comment),
what happens if I issue
CREATE TABLE clickhouse.default.t(a int)
WITH (partition_by = ARRAY['x'])
?
I would expect some kind of failure (either from the connector or from ClickHouse), because there is no column x.
Then, what happens if I issue
CREATE TABLE clickhouse.default.t(a int, x int)
WITH (engine = 'LOG', partition_by = ARRAY['x'])
?
I would expect some kind of failure (either from the connector or from ClickHouse), because I asked for the table to be partitioned, but it cannot be.
CREATE TABLE clickhouse.default.t(a int)
WITH (partition_by = ARRAY['x'])
CREATE TABLE clickhouse.default.t(a int, x int)
WITH (engine = 'LOG', partition_by = ARRAY['x'])
Yes, both of the above SQL statements currently succeed, because with the default table engine, Log, all other properties are ignored.
We can pass all table properties regardless of the table engine, in which case the failure will come from the ClickHouse side.
Shall we take this approach?
Executing the following SQL in ClickHouse directly
create table default.t (a int) engine = Log order by a;
prints the following error:
ClickHouse exception, code: 36, host: 10.60.242.112, port: 18123; Code: 36, e.displayText() = DB::Exception:
Engine Log doesn't support PARTITION_BY, PRIMARY_KEY, ORDER_BY or SAMPLE_BY clauses.
Currently only the following engines have support for the feature: [MergeTree, ReplicatedVersionedCollapsingMergeTree,
ReplacingMergeTree, ReplicatedSummingMergeTree, ReplicatedAggregatingMergeTree, ReplicatedCollapsingMergeTree, ReplicatedGraphiteMergeTree, ReplicatedMergeTree,
ReplicatedReplacingMergeTree, VersionedCollapsingMergeTree, SummingMergeTree,
GraphiteMergeTree, CollapsingMergeTree, AggregatingMergeTree] (version 20.3.5.21 (official build))
Modified code:
@Override
protected String createTableSql(RemoteTableName remoteTableName, List<String> columns, ConnectorTableMetadata tableMetadata)
{
    ImmutableList.Builder<String> tableOptions = ImmutableList.builder();
    Map<String, Object> tableProperties = tableMetadata.getProperties();
    ClickHouseEngineType engine = ClickHouseTableProperties.getEngine(tableProperties);
    tableOptions.add("ENGINE = " + engine.getEngineType());
    if (engine == ClickHouseEngineType.MERGETREE && formatProperty(ClickHouseTableProperties.getOrderBy(tableProperties)).isEmpty()) {
        // order_by property is required
        throw new TrinoException(INVALID_TABLE_PROPERTY, format("The property of %s is required for table engine %s", ClickHouseTableProperties.ORDER_BY_PROPERTY, engine.getEngineType()));
    }
    formatProperty(ClickHouseTableProperties.getOrderBy(tableProperties)).ifPresent(value -> tableOptions.add("ORDER BY " + value));
    formatProperty(ClickHouseTableProperties.getPrimaryKey(tableProperties)).ifPresent(value -> tableOptions.add("PRIMARY KEY " + value));
    formatProperty(ClickHouseTableProperties.getPartitionBy(tableProperties)).ifPresent(value -> tableOptions.add("PARTITION BY " + value));
    ClickHouseTableProperties.getSampleBy(tableProperties).ifPresent(value -> tableOptions.add("SAMPLE BY " + value));
    return format("CREATE TABLE %s (%s) %s", quoted(remoteTableName), join(", ", columns), join(" ", tableOptions.build()));
}
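For readers who want to try the idea outside the connector, here is a self-contained sketch of the same approach: require order_by for MergeTree, then pass every supplied property through and leave the rest of the validation to ClickHouse. The class name, the plain Map<String, List<String>> property model, the property names other than partition_by, and the behaviour of formatProperty (one column as-is, several columns as a parenthesized tuple) are all assumptions; the PR does not show the real formatProperty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only; not the connector's implementation.
final class CreateTableSqlSketch
{
    private CreateTableSqlSketch() {}

    // Assumed behaviour: no columns -> empty, one column -> "a", several -> "(a, b)"
    static Optional<String> formatProperty(List<String> columns)
    {
        if (columns == null || columns.isEmpty()) {
            return Optional.empty();
        }
        if (columns.size() == 1) {
            return Optional.of(columns.get(0));
        }
        return Optional.of("(" + String.join(", ", columns) + ")");
    }

    static String createTableSql(String tableName, List<String> columnDefinitions, String engine, Map<String, List<String>> properties)
    {
        List<String> tableOptions = new ArrayList<>();
        tableOptions.add("ENGINE = " + engine);

        Optional<String> orderBy = formatProperty(properties.get("order_by"));
        if (engine.equals("MergeTree") && orderBy.isEmpty()) {
            // order_by is required for MergeTree
            throw new IllegalArgumentException("The property order_by is required for table engine " + engine);
        }
        // Everything else is passed through regardless of engine;
        // ClickHouse rejects clauses the engine does not support.
        orderBy.ifPresent(value -> tableOptions.add("ORDER BY " + value));
        formatProperty(properties.get("primary_key")).ifPresent(value -> tableOptions.add("PRIMARY KEY " + value));
        formatProperty(properties.get("partition_by")).ifPresent(value -> tableOptions.add("PARTITION BY " + value));

        return String.format("CREATE TABLE %s (%s) %s", tableName, String.join(", ", columnDefinitions), String.join(" ", tableOptions));
    }
}
```

In this sketch a Log table with partition_by still produces a PARTITION BY clause, so the statement reaches ClickHouse and fails there with the "Engine Log doesn't support ..." error quoted earlier, rather than being silently accepted.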
60c7c38
to
1496799
Compare
The build status seems related
1496799
to
bfc29cd
Compare
@electrum @kokosing @martint @losipiuk cc @trinodb/maintainers I have a question: how should we model table properties that are arbitrary expressions?
Those expressions can be column names, in which case the user could expect the connector to apply proper quoting. We could reduce the risk of confusion by naming the table properties informatively, eg I am therefore inclined to continue with the approach presented here, using shorter table property names, eg Thoughts?
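To make the quoting concern concrete, here is a small sketch contrasting the two interpretations of a property value. The class and method names are hypothetical, and double-quote identifier quoting is used only for illustration; names containing a double quote are rejected so the sketch does not have to assume any escaping rule.

```java
// Illustrative sketch only; neither method is part of the connector.
final class PropertyExpressionSketch
{
    private PropertyExpressionSketch() {}

    // Interpretation 1: the value is a column name, so the connector quotes it.
    static String asColumnName(String value)
    {
        if (value.contains("\"")) {
            throw new IllegalArgumentException("Column names containing a double quote are not handled in this sketch");
        }
        return "\"" + value + "\"";
    }

    // Interpretation 2: the value is an arbitrary expression, passed through verbatim.
    static String asExpression(String value)
    {
        return value;
    }
}
```

For a plain column such as x both interpretations work, but for a value such as toYYYYMM(d) the first one yields a quoted identifier that no longer means a function call, which is why the two cases cannot share one silent rule.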
@findepi
2021-03-06
1. Changes engine property to enum class
2. Validates required property
2021-03-08
1. Add test cases for table properties
2021-03-09
1. Move ClickHouseTableProperties#formatProperty to ClickHouse
2. Add all passed table properties regardless of table engine
3. More tests
2021-03-10
1. Fixed expected message for test
bfc29cd
to
e6317e7
Compare
Merged, thanks!
Support for create table with MergeTree engine in ClickHouse connector #7130