Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support partitioned topics in the Reader #7518

Merged
merged 10 commits into from Nov 17, 2020
Merged

Conversation

315157973
Copy link
Contributor

@315157973 315157973 commented Jul 12, 2020

Fixes #3643 #7265

Motivation

Support partitioned topics in the Reader

Modifications

class relationship:
PulsarClientImpl -> MultiTopicsReaderImpl -> MultiTopicsConsumerImpl -> ConsumerImpl

PulsarClientImpl support build MultiTopicsReader
MultiTopicsReader wraps one MultiTopicsConsumerImpl
MultiTopicsConsumerImpl contains multiple consumerImpl

Make MultiTopicsConsumerImpl support seek by messageId

seek by time:
All consumerImpl seek by time, Reader can get all the messages returned by each partition

seek by message:
1)When seek by latest/earliest,all partitions seek by latest/earliest,Reader can get all the messages returned by each partition
2)When the messageId contains explicit partition information, it will only receive messages from this partition, and will not receive messages from other partitions.To avoid other partitions returning messages in inclusive mode, I make other consumer seek to latest and add a ignoredConsumersSet to ignore messages from other partitions .

Verifying this change

unit tests:
TopicReaderTest
MultiTopicsReaderTest

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API: (yes / no)
  • The schema: (yes / no / don't know)
  • The default values of configurations: (yes / no)
  • The wire protocol: (yes / no)
  • The rest endpoints: (yes / no)
  • The admin cli options: (yes / no)
  • Anything that affects deployment: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

@sollecitom
Copy link

Any progress on this? :)

@sijie
Copy link
Member

sijie commented Sep 21, 2020

@sollecitom : @jiazhai @codelipenghui will review this pull request today

@jiazhai
Copy link
Member

jiazhai commented Sep 21, 2020

/pulsarbot run-failure-checks

Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The topics of the multiple topics consumer may belong to different partitioned topics, looks the PR assume topics of the multiple topics consumer belong to a partitioned topic, please double-check.

@315157973
Copy link
Contributor Author

There are still some details of this requirement that need to be discussed:
Suppose there is a partitioned topic that contains 3 partitions.

  1. If a MessageId belongs to partition-1, when the reader sets startMessageId, how should the messages of the other two partitions be processed? earliest, latest...?
  2. If the user first seeks (earliest) and then seeks a certain MessageId, how should the messages of the other two partitions be processed? Forbid the other 2 partitions to return messages? Or continue to let the other 2 partitions return messages?

Thanks for @codelipenghui 's suggestion:
In this PR, for MultiTopicReader, we only support seek earliest and latest , and do not support seek with a specific MessageId

@315157973
Copy link
Contributor Author

/pulsarbot run-failure-checks

4 similar comments
@315157973
Copy link
Contributor Author

/pulsarbot run-failure-checks

@315157973
Copy link
Contributor Author

/pulsarbot run-failure-checks

@315157973
Copy link
Contributor Author

/pulsarbot run-failure-checks

@315157973
Copy link
Contributor Author

/pulsarbot run-failure-checks

@315157973
Copy link
Contributor Author

/pulsarbot run-failure-checks

@slinstaedt
Copy link

Seeking to a specific MessageId seems to be mandatory though. How should one resume reading otherwise?

Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me.

@codelipenghui
Copy link
Contributor

@slinstaedt We will support the reader to seek to the earliest position or the latest position first. Seek to a specific position is confusing in the partitioned reader, we can try to support later since there is not a good idea yet.

@codelipenghui
Copy link
Contributor

@sijie @jiazhai Please help review this PR, thanks.

@sijie sijie added doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. doc-required Your PR changes impact docs and you will update later. labels Nov 12, 2020
@sijie
Copy link
Member

sijie commented Nov 12, 2020

@315157973 Can you rebase it to the latest master?

@315157973
Copy link
Contributor Author

/pulsarbot run-failure-checks

@codelipenghui codelipenghui merged commit 12ef7c9 into apache:master Nov 17, 2020
flowchartsman pushed a commit to flowchartsman/pulsar that referenced this pull request Nov 17, 2020
Fixes apache#3643 apache#7265


### Motivation
Support partitioned topics in the Reader

### Modifications
class relationship:
PulsarClientImpl -> MultiTopicsReaderImpl -> MultiTopicsConsumerImpl -> ConsumerImpl

PulsarClientImpl support build MultiTopicsReader
MultiTopicsReader wraps one MultiTopicsConsumerImpl 
MultiTopicsConsumerImpl contains multiple consumerImpl

Make MultiTopicsConsumerImpl support seek by messageId

seek by time:
All consumerImpl seek by time, Reader can get all the messages returned by each partition

seek by message:
1)When seek by latest/earliest,all partitions seek by latest/earliest,Reader can get all the messages returned by each partition
2)When the messageId contains explicit partition information, it will only receive messages from this partition, and will not receive messages from other partitions.To avoid other partitions returning messages in inclusive mode, I make other consumer seek to latest and add a ignoredConsumersSet to ignore messages from other partitions .
@315157973 315157973 deleted the reader branch November 28, 2020 03:05
@Anonymitaet Anonymitaet added doc-complete Your PR changes impact docs and the related docs have been already added. and removed doc-required Your PR changes impact docs and you will update later. labels Feb 22, 2022
RobertIndie added a commit to RobertIndie/pulsar that referenced this pull request Nov 15, 2022
Motivation
We already added the seek support for multi-topics consumer in apache#7518. But the note for seek method haven't been updated.

Modification
* Update the doc for seek method in the consumer.

Signed-off-by: Zike Yang <zike@apache.org>
RobertIndie added a commit that referenced this pull request Nov 18, 2022
### Motivation

We already added the seek support for multi-topics consumer in #7518. But the note for seek method hasn't been updated.

### Modifications

* Update the doc for seek method in the consumer.
lifepuzzlefun pushed a commit to lifepuzzlefun/pulsar that referenced this pull request Dec 9, 2022
### Motivation

We already added the seek support for multi-topics consumer in apache#7518. But the note for seek method hasn't been updated.

### Modifications

* Update the doc for seek method in the consumer.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/client doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. doc-complete Your PR changes impact docs and the related docs have been already added.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[client] Support seek at partitioned topic
7 participants