[FLINK-28870][Connector/Pulsar] Improve the Pulsar source performance when meeting small data rates. #15
What is the purpose of the change
When the Pulsar source consumes a topic with a low data rate, e.g. 2 msg/s, there are long periods during which no messages are consumed.
This is caused by the defaults of the PulsarSourceOptions.PULSAR_MAX_FETCH_TIME and PulsarSourceOptions.PULSAR_MAX_FETCH_RECORDS options. The Pulsar source keeps pulling messages until one of these limits is reached, for example until 100 messages have been fetched or 10 seconds have elapsed.
This change adds a new fetch time option for the Pulsar source. If a single fetch exceeds this time (100 ms) without receiving a message, the source considers the current topic to have no pending messages and stops fetching. This avoids hanging on low incoming message rates.
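The interaction of the three limits described above can be sketched in plain Java. This is a minimal simulation and not the connector's code: the class `FetchLoopSketch`, the method `fetch`, and its parameter names are hypothetical, and a `BlockingQueue` stands in for the Pulsar consumer. The parameters `maxRecords`, `maxFetchMillis`, and `singleFetchMillis` are assumed to play the roles of PULSAR_MAX_FETCH_RECORDS, PULSAR_MAX_FETCH_TIME, and PULSAR_SINGLE_FETCH_TIME.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the split reader's fetch loop; not Flink code.
class FetchLoopSketch {

    // Collects messages until one of three conditions holds:
    // maxRecords collected, maxFetchMillis elapsed overall,
    // or a single poll waited singleFetchMillis without a message.
    static List<String> fetch(
            BlockingQueue<String> topic,
            int maxRecords,
            long maxFetchMillis,
            long singleFetchMillis) throws InterruptedException {
        List<String> batch = new ArrayList<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxFetchMillis);

        while (batch.size() < maxRecords) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                break; // overall fetch time exceeded
            }
            // Never wait longer than the single-fetch limit or the remaining budget.
            long wait = Math.min(remaining, TimeUnit.MILLISECONDS.toNanos(singleFetchMillis));
            String message = topic.poll(wait, TimeUnit.NANOSECONDS);
            if (message == null) {
                break; // nothing arrived in time: treat the topic as idle
            }
            batch.add(message);
        }
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> topic = new LinkedBlockingQueue<>();
        topic.add("m1");
        topic.add("m2");
        // With only two messages available, the loop returns after roughly one
        // single-fetch timeout instead of waiting for the full 10 s budget.
        List<String> batch = fetch(topic, 100, 10_000L, 100L);
        System.out.println("fetched " + batch.size() + " messages");
    }
}
```

Without the single-fetch limit, the same loop would block until either 100 messages had arrived or the 10 second budget had run out, which is the stall described above for low-rate topics.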
Brief change log
- Add the PulsarSourceOptions.PULSAR_SINGLE_FETCH_TIME option.
- Use PulsarSourceOptions.PULSAR_SINGLE_FETCH_TIME to bound each fetch in the Pulsar source.
Verifying this change
This change is already covered by existing tests, such as PulsarPartitionSplitReaderTest.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (no)
Documentation