Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Functions] Support KEY_BASED batch builder for Java based functions and sources #11706

Conversation

lhotari
Copy link
Member

@lhotari lhotari commented Aug 18, 2021

Motivation

#8523 added support for configuring KEY_BASED batch builder for functions and sources. However, the implemention missed some parts for Java based functions and sources. Support for Go was added in #8561 and for Python in #8540 .

The required code went missing as part of reverting #8434 changes in #10843 . The KEY_BASED logic had been added to the PulsarCluster class after #8434 had been merged.

Modifications

  • Set producerBuilder.batcherBuilder(BatcherBuilder.KEY_BASED) when batch-builder is configured as KEY_BASED.
  • Include batchBuilder in ProducerSpec -> ProducerConfig.ProducerConfigBuilder conversion
  • Fix issue where --batch-builder KEY_BASED passed to pulsar-admin sources create command didn't do anything.

@lhotari lhotari added type/bug The PR fixed a bug or issue reported a bug doc-not-needed Your PR changes do not impact docs release/2.7.4 release/2.8.2 labels Aug 18, 2021
@lhotari lhotari added this to the 2.9.0 milestone Aug 18, 2021
@lhotari lhotari self-assigned this Aug 18, 2021
@lhotari lhotari requested a review from freeznet August 18, 2021 19:44
@lhotari lhotari requested a review from eolivelli August 19, 2021 04:49
@lhotari
Copy link
Member Author

lhotari commented Aug 19, 2021

There's a remaining gotcha with sources. For sources, the --batch-builder KEY_BASED argument isn't used. It is required to use --producer-config "{\"batchBuilder\": \"KEY_BASED\"}". for example:

./bin/pulsar-admin sources create --tenant public --namespace default --name datagen --destination-topic-name datagen -a builtin://data-generator --source-config "{\"sleepBetweenMessages\": 1000}" --producer-config "{\"batchBuilder\": \"KEY_BASED\"}"

I wonder if this is a bug or a feature.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change makes sense to me.
As usual we should add some test case that will save us from future regressions.

I am sorry but we need a end to end integration test for this

@lhotari
Copy link
Member Author

lhotari commented Aug 19, 2021

The change makes sense to me.
As usual we should add some test case that will save us from future regressions.

I am sorry but we need a end to end integration test for this

I completely agree. The changes in the PR now fix the various issues and it's time to think about the test coverage to prevent regressions.

@lhotari
Copy link
Member Author

lhotari commented Aug 19, 2021

There's a remaining gotcha with sources. For sources, the --batch-builder KEY_BASED argument isn't used. It is required to use --producer-config "{"batchBuilder": "KEY_BASED"}".

I also fixed this problem as part of this PR.

lhotari added a commit to lhotari/pulsar that referenced this pull request Aug 19, 2021
… (backports apache#11706)

- backports apache#11706 to branch-2.7

- include batchBuilder in ProducerSpec -> ProducerConfig.ProducerConfigBuilder conversion
  since it was missing

- support setting batch builder with "--batch-builder KEY_BASED" argument
@lhotari
Copy link
Member Author

lhotari commented Aug 19, 2021

I have created a separate PR for backporting the required changes to branch-2.7: #11710 . The code in branch-2.7 is slightly different since #8523 changes haven't been reverted in branch-2.7 .

@sijie
Copy link
Member

sijie commented Aug 19, 2021

@nlu90 Can you review this PR?

@lhotari
Copy link
Member Author

lhotari commented Aug 23, 2021

@codelipenghui @nlu90 @wolfstudy Please review the changes.

lhotari added a commit to datastax/pulsar that referenced this pull request Aug 24, 2021
@lhotari lhotari force-pushed the lh-fix-batch-builder-for-function-context-producers branch from 9ae27bc to 1a1f10a Compare August 24, 2021 06:51
@lhotari lhotari requested a review from hangc0276 August 24, 2021 11:39
@lhotari lhotari requested a review from zymap August 24, 2021 11:39
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@freeznet freeznet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1

Copy link
Member

@nlu90 nlu90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lhotari lhotari merged commit b923af1 into apache:master Aug 25, 2021
lhotari added a commit that referenced this pull request Aug 26, 2021
… (backports #11706) (#11710)

- backports #11706 to branch-2.7

- include batchBuilder in ProducerSpec -> ProducerConfig.ProducerConfigBuilder conversion
  since it was missing

- support setting batch builder with "--batch-builder KEY_BASED" argument
codelipenghui pushed a commit that referenced this pull request Sep 9, 2021
…and sources (#11706)

* [Functions] Support KEY_BASED batch builder for Java based functions and sources

* Include batchBuilder in ProducerSpec -> ProducerConfig.ProducerConfigBuilder conversion

* Support setting batch builder for sources with "--batch-builder KEY_BASED" argument

(cherry picked from commit b923af1)
@codelipenghui codelipenghui added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Sep 9, 2021
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
…and sources (apache#11706)

* [Functions] Support KEY_BASED batch builder for Java based functions and sources

* Include batchBuilder in ProducerSpec -> ProducerConfig.ProducerConfigBuilder conversion

* Support setting batch builder for sources with "--batch-builder KEY_BASED" argument
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-picked/branch-2.8 Archived: 2.8 is end of life doc-not-needed Your PR changes do not impact docs release/2.8.2 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants