[Feature][Connector-V2][Kafka]Kafka source supports data deserializat… #4364

Merged


521daichen
Contributor

@521daichen 521daichen commented Mar 17, 2023

…ion failure skipping #4361

[Feature][Connector-V2][Kafka]Kafka source supports data deserialization failure skipping #4361

Purpose of this pull request

Check list

@521daichen 521daichen force-pushed the 521daichen/kafka-source-support-error-handle branch 2 times, most recently from a4c25ba to e9e7601 Compare March 17, 2023 09:44
@TyrantLucifer
Member

Please add an e2e test case to verify this pull request. Thanks.

@521daichen
Contributor Author

Please add an e2e test case to verify this pull request. Thanks.

The function configuration has been added to the kafka-e2e project, and the local docker verification has been completed.

Thanks.

@TyrantLucifer
Member

Please add an e2e test case to verify this pull request. Thanks.

The function configuration has been added to the kafka-e2e project, and the local docker verification has been completed.

Thanks.

BTW, SeaTunnel uses the spotless plugin to enforce code style, so before you submit a pull request you should run mvn spotless:apply to format your code.

@521daichen
Contributor Author

Please add an e2e test case to verify this pull request. Thanks.

Thanks a lot for the reminder.
I ran mvn spotless:check and mvn spotless:apply locally and both returned successfully, but it is not clear why CI did not pass.

[INFO] SeaTunnel : E2E : Connector V2 : DataHub ........... SUCCESS [ 0.015 s]
[INFO] SeaTunnel : E2E : Connector V2 : Mongodb ........... SUCCESS [ 0.015 s]
[INFO] SeaTunnel : E2E : Connector V2 : Hbase ............. SUCCESS [ 0.014 s]
[INFO] connector-maxcompute-e2e ........................... SUCCESS [ 0.014 s]
[INFO] SeaTunnel : E2E : Engine : ......................... SUCCESS [ 0.013 s]
[INFO] SeaTunnel : E2E : Engine : Base .................... SUCCESS [ 0.127 s]
[INFO] SeaTunnel : E2E : Engine : Console ................. SUCCESS [ 0.016 s]
[INFO] SeaTunnel : E2E : Transforms V2 .................... SUCCESS [ 0.015 s]
[INFO] SeaTunnel : E2E : Transforms V2 : Common ........... SUCCESS [ 0.015 s]
[INFO] SeaTunnel : E2E : Transforms V2 : Part 1 ........... SUCCESS [ 0.015 s]
[INFO] SeaTunnel : E2E : Transforms V2 : Part 2 ........... SUCCESS [ 0.016 s]
[INFO] SeaTunnel : Dist ................................... SUCCESS [ 0.013 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.417 s
[INFO] Finished at: 2023-03-17T22:41:47+08:00
[INFO] ------------------------------------------------------------------------
➜ incubator-seatunnel git:(521daichen/kafka-source-support-error-handle) mvn spotless:apply

@521daichen 521daichen force-pushed the 521daichen/kafka-source-support-error-handle branch from e9e7601 to 22f4cfa Compare March 17, 2023 15:26
521daichen and others added 6 commits March 18, 2023 00:15
…ion failure skipping apache#4361

[Feature][Connector-V2][Kafka]Kafka source supports data deserialization failure skipping apache#4361
1. change log level
2. add kafka source changeLog
add changelog
add e2e
@TyrantLucifer TyrantLucifer force-pushed the 521daichen/kafka-source-support-error-handle branch from 22f4cfa to 9183d7f Compare March 17, 2023 16:16
Member

@TyrantLucifer TyrantLucifer left a comment

Add an e2e case that does the following:

  1. Send messages in different formats to Kafka with format_error_handle_way set to fail
  2. Send messages in different formats to Kafka with format_error_handle_way set to skip

You can refer to the existing KafkaIT case.

521daichen and others added 2 commits March 19, 2023 12:54
…he/seatunnel/connectors/seatunnel/kafka/source/KafkaSource.java

Co-authored-by: Tyrantlucifer <tyrantlucifer@apache.org>
@laglangyue
Contributor

Maybe we need the engine to collect the dirty data for this; it is a very important feature of ETL software. This PR can handle deserialization failures for Kafka data, but can we get more detail about the reason for the failure? Sometimes we also need options such as treating null as empty, or substituting 'default data', to handle dirty or illegal columns.
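
For readers following this thread, the suggestion above could look roughly like the sketch below. It is not taken from the SeaTunnel code base: the class DirtyRecordSketch, the DirtyRecord type, and the parseAll helper are all hypothetical names, and the parser is a stand-in for a real deserializer. The idea is only to show a failed record being kept together with the reason it failed, and optionally replaced by a default value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch, not SeaTunnel code.
class DirtyRecordSketch {

    /** A record that could not be parsed, plus the reason reported by the parser. */
    static final class DirtyRecord {
        final String raw;
        final String reason;

        DirtyRecord(String raw, String reason) {
            this.raw = raw;
            this.reason = reason;
        }
    }

    /**
     * Parses every input. On failure the raw input and the exception message
     * are appended to dirty; if defaultValue is non-null it is emitted in
     * place of the bad record, otherwise the record is dropped.
     */
    static <T> List<T> parseAll(
            List<String> raws,
            Function<String, T> parser,
            T defaultValue,
            List<DirtyRecord> dirty) {
        List<T> parsed = new ArrayList<>();
        for (String raw : raws) {
            try {
                parsed.add(parser.apply(raw));
            } catch (RuntimeException e) {
                dirty.add(new DirtyRecord(raw, e.getMessage()));
                if (defaultValue != null) {
                    parsed.add(defaultValue);
                }
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        List<DirtyRecord> dirty = new ArrayList<>();
        List<Integer> values = DirtyRecordSketch.<Integer>parseAll(
                Arrays.asList("1", "abc", "3"), s -> Integer.parseInt(s), 0, dirty);
        System.out.println(values); // prints [1, 0, 3]
        for (DirtyRecord record : dirty) {
            System.out.println(record.raw + " -> " + record.reason);
        }
    }
}
```

Whether such collection belongs in the connector or in the engine is the open question raised in the comment; the sketch takes no position on that.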

1. format_error_handle_way = fail
The data is invalid, so an exception will be thrown.
2. format_error_handle_way = skip
The data is invalid and will be skipped.
@521daichen
Contributor Author

ok. I have added e2e

1. format_error_handle_way = fail
The data is invalid, so an exception will be thrown.
2. format_error_handle_way = skip
The data is invalid and will be skipped.
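
The two modes listed above can be sketched as follows. This is only an illustration of the described behavior, not the code merged in this PR: the HandleWay enum, the toy "key=value" deserializer, and the readAll helper are assumptions made for the example, and only the option name format_error_handle_way and its fail and skip values come from the discussion.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not SeaTunnel code.
class FormatErrorHandleSketch {

    /** Mirrors the two values of format_error_handle_way discussed above. */
    enum HandleWay { FAIL, SKIP }

    /** Toy deserializer (assumption): a valid record looks like "key=value". */
    static String[] deserialize(byte[] payload) {
        String text = new String(payload, StandardCharsets.UTF_8);
        int idx = text.indexOf('=');
        if (idx <= 0) {
            throw new IllegalArgumentException("Invalid record: " + text);
        }
        return new String[] {text.substring(0, idx), text.substring(idx + 1)};
    }

    /** Deserializes all messages, either failing fast or skipping bad ones. */
    static List<String[]> readAll(List<byte[]> messages, HandleWay way) {
        List<String[]> rows = new ArrayList<>();
        for (byte[] message : messages) {
            try {
                rows.add(deserialize(message));
            } catch (IllegalArgumentException e) {
                if (way == HandleWay.FAIL) {
                    throw e; // fail: surface the error and stop reading
                }
                // skip: drop the invalid record and continue with the next one
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<byte[]> messages = Arrays.asList(
                "a=1".getBytes(StandardCharsets.UTF_8),
                "broken".getBytes(StandardCharsets.UTF_8),
                "b=2".getBytes(StandardCharsets.UTF_8));

        System.out.println(readAll(messages, HandleWay.SKIP).size()); // prints 2

        try {
            readAll(messages, HandleWay.FAIL);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints Invalid record: broken
        }
    }
}
```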

 unify exception
Member

@TyrantLucifer TyrantLucifer left a comment

LGTM, let's wait for CI/CD.

Member

@TyrantLucifer TyrantLucifer left a comment

The e2e test cases need to be fixed.

@521daichen
Contributor Author

fixed

TyrantLucifer
TyrantLucifer previously approved these changes Mar 23, 2023
Co-authored-by: Eric <gaojun2048@gmail.com>
EricJoy2048
EricJoy2048 previously approved these changes Mar 24, 2023
Member

@EricJoy2048 EricJoy2048 left a comment

LGTM

fix code-style
@TyrantLucifer TyrantLucifer merged commit e1ed22b into apache:dev Mar 24, 2023
MonsterChenzhuo pushed a commit to MonsterChenzhuo/incubator-seatunnel that referenced this pull request Apr 19, 2023
…tion failure skipping (apache#4364)

* [Feature][Connector-V2][Kafka]Kafka source supports data deserialization failure skipping apache#4361

[Feature][Connector-V2][Kafka]Kafka source supports data deserialization failure skipping apache#4361

* change log level and add changelog

1. change log level
2. add kafka source changeLog

* add changelog

add changelog

* add e2e

add e2e

* add e2e case

* [Feature][Connector-V2][Kafka] Fix code style

* Update seatunnel-connectors-v2/connector-kafka/src/main/java/org/apache/seatunnel/connectors/seatunnel/kafka/source/KafkaSource.java

Co-authored-by: Tyrantlucifer <tyrantlucifer@apache.org>

* fix code-review

* add e2e case for format_error_handle_way

1. format_error_handle_way = fail
The data is invalid, so an exception will be thrown.
2. format_error_handle_way = skip
The data is invalid and will be skipped.

* unify exception

 unify exception

* change e2e config

* fix e2e test case

* Update docs/en/connector-v2/source/kafka.md

Co-authored-by: Eric <gaojun2048@gmail.com>

* Update kafka.md

fix code-style

---------

Co-authored-by: tyrantlucifer <tyrantlucifer@gmail.com>
Co-authored-by: Tyrantlucifer <tyrantlucifer@apache.org>
Co-authored-by: Eric <gaojun2048@gmail.com>
ic4y pushed a commit to ic4y/incubator-seatunnel that referenced this pull request May 22, 2023
…tion failure skipping (apache#4364)
