[Improve] [Clickhouse-V2] Clickhouse Support Int128,Int256 Type #3067

Merged

Conversation

FWLamb
Contributor

@FWLamb FWLamb commented Oct 11, 2022

Purpose of this pull request

Clickhouse Support Int128,Int256 Type #3057

Check list

@TyrantLucifer
Member

Your pull request changes the core module without a proposal. Please refer to the Coding Guide and update your PR. I think the Int128 and Int256 types can use the Long data type.

@Hisoka-X PTAL.

@FWLamb FWLamb marked this pull request as draft October 11, 2022 14:56
@FWLamb
Contributor Author

FWLamb commented Oct 12, 2022

@TyrantLucifer #3071

@FWLamb FWLamb marked this pull request as ready for review October 12, 2022 04:39
@FWLamb FWLamb marked this pull request as draft October 14, 2022 01:27
@FWLamb FWLamb force-pushed the clickhouse_source_support_int128_int256 branch from 551920b to 8df4d26 Compare October 14, 2022 01:40
@FWLamb FWLamb marked this pull request as ready for review October 14, 2022 01:41
@FWLamb
Contributor Author

FWLamb commented Oct 14, 2022

I found that this can be achieved with the String data type, and it passes the E2E test. @Hisoka-X PTAL.

@Hisoka-X
Member

We should support it on both the source side and the sink side, so we should test reading Int128 from a ClickHouse source and writing it into a ClickHouse table.

@FWLamb
Contributor Author

FWLamb commented Oct 14, 2022

> We should support it on both the source side and the sink side, so we should test reading Int128 from a ClickHouse source and writing it into a ClickHouse table.

Yes, I have already tested this.
The following is the E2E test case.

source_table = """
create table if not exists default.source_table(
    id Int64,
    c_map Map(String, Int32),
    c_array_string Array(String),
    c_array_short Array(Int16),
    c_array_int Array(Int32),
    c_array_long Array(Int64),
    c_array_float Array(Float32),
    c_array_double Array(Float64),
    c_string String,
    c_boolean Boolean,
    c_int8 Int8,
    c_int16 Int16,
    c_int32 Int32,
    c_int64 Int64,
    c_float32 Float32,
    c_float64 Float64,
    c_decimal Decimal(9,4),
    c_date Date,
    c_datetime DateTime64,
    c_nullable Nullable(Int32),
    c_lowcardinality LowCardinality(String),
    c_nested Nested
    (
        int UInt32,
        double Int64,
        string String
    ),
    c_int128 Int128,
    c_int256 Int256,
    c_uint128 UInt128,
    c_uint256 UInt256
)engine=Memory;
"""

sink_table = """
create table if not exists default.sink_table(
    id Int64,
    c_map Map(String, Int32),
    c_array_string Array(String),
    c_array_short Array(Int16),
    c_array_int Array(Int32),
    c_array_long Array(Int64),
    c_array_float Array(Float32),
    c_array_double Array(Float64),
    c_string String,
    c_boolean Boolean,
    c_int8 Int8,
    c_int16 Int16,
    c_int32 Int32,
    c_int64 Int64,
    c_float32 Float32,
    c_float64 Float64,
    c_decimal Decimal(9,4),
    c_date Date,
    c_datetime DateTime64,
    c_nullable Nullable(Int32),
    c_lowcardinality LowCardinality(String),
    c_nested Nested
    (
        int UInt32,
        double Int64,
        string String
    ),
    c_int128 Int128,
    c_int256 Int256,
    c_uint128 UInt128,
    c_uint256 UInt256
)engine=Memory;
"""

env {
  execution.parallelism = 1
  job.mode = "BATCH"
}

source {
  Clickhouse {
    host = "clickhouse:8123"
    database = "default"
    sql = "select * from source_table"
    username = "default"
    password = ""
    result_table_name = "source_table"
  }
}

transform {
}

sink {
  Clickhouse {
    host = "clickhouse:8123"
    database = "default"
    table = "sink_table"
    fields = [
      "id",
      "c_map",
      "c_array_string",
      "c_array_short",
      "c_array_int",
      "c_array_long",
      "c_array_float",
      "c_array_double",
      "c_string",
      "c_boolean",
      "c_int8",
      "c_int16",
      "c_int32",
      "c_int64",
      "c_float32",
      "c_float64",
      "c_decimal",
      "c_date",
      "c_datetime",
      "c_nullable",
      "c_lowcardinality",
      "c_nested.int",
      "c_nested.double",
      "c_nested.string",
      "c_int128",
      "c_int256",
      "c_uint128",
      "c_uint256"
    ]
    username = "default"
    password = ""
  }
}

@Hisoka-X
Member

I have a question: why not use decimal to implement Int128 and Int256?

@FWLamb
Contributor Author

FWLamb commented Oct 14, 2022

Because decimal supports at most 38 digits in Spark. Int128 and Int256 values can exceed 38 digits, and Spark then reports an exception.
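
To make the digit-count argument concrete, here is a small sketch that counts the decimal digits of the largest value of each wide ClickHouse integer type and compares them with the 38-digit decimal limit mentioned for Spark. The helper names (`max_value`, `digit_count`, `digits_by_type`) are hypothetical and used only for illustration; they are not part of SeaTunnel or Spark.

```python
from typing import Dict

# Decimal precision limit in Spark, as cited in the comment above.
SPARK_MAX_DECIMAL_PRECISION = 38

# Bit width and signedness of the ClickHouse types discussed in this PR.
WIDE_TYPES = {
    "Int128": (128, True),
    "UInt128": (128, False),
    "Int256": (256, True),
    "UInt256": (256, False),
}


def max_value(bits: int, signed: bool) -> int:
    """Largest value representable by an integer of the given width."""
    return 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1


def digit_count(value: int) -> int:
    """Number of decimal digits in the value, ignoring the sign."""
    return len(str(abs(value)))


def digits_by_type() -> Dict[str, int]:
    """Decimal digits needed for the largest value of each wide type."""
    return {
        name: digit_count(max_value(bits, signed))
        for name, (bits, signed) in WIDE_TYPES.items()
    }


if __name__ == "__main__":
    for name, digits in digits_by_type().items():
        fits = digits <= SPARK_MAX_DECIMAL_PRECISION
        print(f"{name}: {digits} digits, fits in 38-digit decimal: {fits}")
```

Every one of the four types needs more than 38 digits for its largest value, so a 38-digit decimal cannot cover the full range of any of them.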

@FWLamb
Contributor Author

FWLamb commented Oct 14, 2022

Using the String type in Spark and the Decimal type elsewhere works. That made me wonder whether I could use the String type everywhere; I tested it afterwards and found that it works.
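
As an illustration of why a string representation is lossless for these types, the following sketch round-trips wide integers through their decimal text form, with a range check for the target type. It is a minimal sketch under stated assumptions, not the code merged in this PR; `bounds`, `to_string`, and `from_string` are hypothetical names.

```python
from typing import Tuple


def bounds(bits: int, signed: bool) -> Tuple[int, int]:
    """Inclusive (min, max) range of an integer type of the given width."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def check_range(value: int, bits: int, signed: bool) -> None:
    """Raise ValueError if the value does not fit the target type."""
    lo, hi = bounds(bits, signed)
    if not lo <= value <= hi:
        kind = "Int" if signed else "UInt"
        raise ValueError(f"{value} is out of range for {kind}{bits}")


def to_string(value: int, bits: int, signed: bool) -> str:
    """Encode a wide integer as decimal text after validating its range."""
    check_range(value, bits, signed)
    return str(value)


def from_string(text: str, bits: int, signed: bool) -> int:
    """Decode decimal text back into an integer and validate its range."""
    value = int(text)
    check_range(value, bits, signed)
    return value


if __name__ == "__main__":
    largest_int256 = 2 ** 255 - 1
    text = to_string(largest_int256, 256, True)
    assert from_string(text, 256, True) == largest_int256
    print(f"Int256 max as text has {len(text)} characters")
```

Decimal text has no precision ceiling, so the value survives the trip unchanged; the trade-off is that numeric operations on the column need an explicit conversion first.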

Member

@EricJoy2048 EricJoy2048 left a comment

+1

@EricJoy2048 EricJoy2048 merged commit e118cce into apache:dev Oct 16, 2022