[BitSail][Connector] Support Assert sink connector #282

Merged 5 commits on Dec 30, 2022


@liuxiaocs7 (Contributor) commented Dec 26, 2022

Signed-off-by:

Pre-Checklist

Note: Please complete ALL items in the following checklist.

  • I have read through the CONTRIBUTING.md documentation.
  • My code has the necessary comments and documentation (if needed).
  • I have added relevant tests.

Purpose

Some description about what this PR wants to do.

Approaches

We currently support a print sink and use it in some tests, but printing the output is not sufficient to verify test results.
So we implement an AssertSink to verify test results more accurately.

Related Issues

Close #153

New Behavior (screenshots if needed)

N/A

@garyli1019 garyli1019 self-assigned this Dec 28, 2022
@garyli1019 (Collaborator) left a comment

Thanks for your contribution, @liuxiaocs7. I left some comments.

@@ -0,0 +1,9 @@
{
"name": "bitsail-connector-unified-assert",
Collaborator

Should we remove "unified" here?

Contributor Author

I noticed that connector V1 modules are uniformly named bitsail-connector-unified-xxx, as in the screenshot, so the same naming convention is used here.

[image]

Collaborator

ok I see

import java.util.List;
import java.util.Map;

public class AssertRuleExecutor {
Collaborator

Can we add a UT for this class?

Contributor Author

sure, i'll add it


@Override
public void close() throws IOException {
if (rowRules != null) {
Collaborator

I didn't get this. Why do we do this at close?

Contributor Author

rowRules limits the number of rows processed by the engine: we need to ensure that the row count falls within the range [min_row, max_row]. In the write method we increment a counter after handling each row, and in the close method we finally compare that counter against the range.
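
To illustrate the count-then-check pattern described above, here is a minimal sketch. It is not the code from this PR: the class name `RowCountCheckingWriter`, its constructor, and `getRowCount` are hypothetical, and the real writer also applies per-column rules that are omitted here.

```java
import java.io.Closeable;

// Hypothetical sketch: none of these names come from the BitSail code base.
class RowCountCheckingWriter implements Closeable {
  private final long minRow;
  private final long maxRow;
  private long rowCount = 0;

  RowCountCheckingWriter(long minRow, long maxRow) {
    if (minRow > maxRow) {
      throw new IllegalArgumentException("minRow must not exceed maxRow");
    }
    this.minRow = minRow;
    this.maxRow = maxRow;
  }

  // Called once per record; any per-row rules would run here before counting.
  void write(Object row) {
    rowCount++;
  }

  long getRowCount() {
    return rowCount;
  }

  // The total is only known once all input has been written,
  // so the [minRow, maxRow] check has to wait until close().
  @Override
  public void close() {
    if (rowCount < minRow || rowCount > maxRow) {
      throw new IllegalStateException(
          "row count " + rowCount + " is outside [" + minRow + ", " + maxRow + "]");
    }
  }

  public static void main(String[] args) {
    RowCountCheckingWriter writer = new RowCountCheckingWriter(2, 3);
    writer.write("a");
    writer.write("b");
    writer.close(); // passes: 2 rows is within [2, 3]
  }
}
```

Note that the counter is local to one writer instance, so the check is only meaningful when a single writer sees every row.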

Collaborator

So the row rule requires a parallelism of one, right? It looks like this counter only counts the rows handled by this writer.

Contributor Author

Thanks for pointing this out. I think so: if we want to limit the row count, the parallelism must be one. Can we set it through the writer_parallelism_num conf, or do you have any other suggestions? Thx.

Collaborator

If the assert sink only supports single parallelism, we should hard-code the parallelism to one to ensure a correct result. The related class is ParallelismComputable.
If we want to support multiple parallelism, I believe we should somehow aggregate the counts in the coordinator. We could make multi-parallelism support a follow-up task for future work.
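
As a rough illustration of the aggregation idea mentioned above, the sketch below sums per-writer counts before applying the range check. It is plain Java and deliberately does not use the BitSail writer, coordinator, or ParallelismComputable APIs; `RowCountAggregationSketch`, `totalRows`, and `withinRange` are hypothetical names, and how the counts would actually reach a coordinator is left open.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; it does not use the real BitSail writer or coordinator APIs.
final class RowCountAggregationSketch {
  private RowCountAggregationSketch() {}

  // Sums the local counts reported by each parallel writer.
  static long totalRows(List<Long> perWriterCounts) {
    long total = 0;
    for (long count : perWriterCounts) {
      total += count;
    }
    return total;
  }

  // Applies the [minRow, maxRow] rule to the aggregated total, not to any single writer.
  static boolean withinRange(List<Long> perWriterCounts, long minRow, long maxRow) {
    long total = totalRows(perWriterCounts);
    return total >= minRow && total <= maxRow;
  }

  public static void main(String[] args) {
    List<Long> counts = Arrays.asList(40L, 35L, 25L);
    // Aggregated: 100 rows, inside [90, 110].
    System.out.println(withinRange(counts, 90, 110)); // prints "true"
    // One writer checked in isolation: 40 rows, outside [90, 110].
    System.out.println(withinRange(Arrays.asList(40L), 90, 110)); // prints "false"
  }
}
```

The second call shows why a per-writer check gives the wrong answer once parallelism exceeds one: each writer sees only its share of the rows.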

Contributor Author

Thanks so much, got it. I'll look at ParallelismComputable first.

import java.util.List;
import java.util.Map;

public class AssertRuleParser implements Serializable {
Collaborator

IMO a UT is necessary for this class. Can we add one, please?

Contributor Author

Sure. By the way, what does IMO mean? :)

Collaborator

It stands for "in my opinion" :)

Contributor Author

Got it, thx.

@liuxiaocs7 (Contributor Author)

@garyli1019 The CI failed, and the error message shows a failure to download a Maven dependency, which may be caused by network problems.

Error: Failed to execute goal on project bitsail-connector-rocketmq: Could not resolve dependencies for project com.bytedance.bitsail:bitsail-connector-rocketmq:jar:0.1.0-SNAPSHOT: Failed to collect dependencies at org.apache.rocketmq:rocketmq-client:jar:4.9.2: Failed to read artifact descriptor for org.apache.rocketmq:rocketmq-client:jar:4.9.2: Could not transfer artifact org.apache.rocketmq:rocketmq-client:pom:4.9.2 from/to central (https://repo.maven.apache.org/maven2): transfer failed for https://repo.maven.apache.org/maven2/org/apache/rocketmq/rocketmq-client/4.9.2/rocketmq-client-4.9.2.pom: Connection timed out (Read failed) -> [Help 1]

@garyli1019 (Collaborator) left a comment

LGTM~ Thanks for your contribution, @liuxiaocs7. Would you please make a follow-up PR to add documentation for this assert sink?

@garyli1019 garyli1019 merged commit 034d80d into bytedance:master Dec 30, 2022
@liuxiaocs7 (Contributor Author)

> LGTM~ Thanks for your contribution @liuxiaocs7 , would you please make a follow up PR to add the documentation about this assert sink?

Sure, I'd love to.
