Option to use message index instead of record sequence for partitioning #470
Conversation
@nicoloboschi: Thanks for your contribution. For this PR, do we need to update docs?
Can you explain this scenario a bit more? Why would only the last message be persisted if all messages in a batch share the same recordSequence?
Sure. Let's take this scenario: we set batchSize=2. When the third record is flushed, the resulting output filename is the same, because its recordSequence is the same as the first record's. So the previous object is overwritten in S3.
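The overwrite scenario can be simulated with a short sketch. The `objectKey` helper and `run` loop below are hypothetical illustrations, not the connector's actual code; they only mimic a partitioner that names objects after the first record's recordSequence in each flushed batch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the batchSize=2 overwrite scenario described above.
public class OverwriteSketch {

    // Hypothetical naming scheme: the object key is derived from the
    // recordSequence of the first record in the flushed batch.
    static String objectKey(long recordSequence) {
        return "public/default/topic/" + recordSequence + ".json";
    }

    // Simulates the sink: flush every `batchSize` records into a map
    // that stands in for the S3 bucket (key = object name).
    static Map<String, List<Long>> run(long[] recordSequences, int batchSize) {
        Map<String, List<Long>> bucket = new HashMap<>();
        List<Long> pending = new ArrayList<>();
        for (long seq : recordSequences) {
            pending.add(seq);
            if (pending.size() == batchSize) {
                bucket.put(objectKey(pending.get(0)), new ArrayList<>(pending));
                pending.clear();
            }
        }
        if (!pending.isEmpty()) {
            // Same recordSequence -> same key -> the previous object is overwritten.
            bucket.put(objectKey(pending.get(0)), new ArrayList<>(pending));
        }
        return bucket;
    }

    public static void main(String[] args) {
        // Three entries of a single Pulsar batch message all share recordSequence 42.
        Map<String, List<Long>> bucket = run(new long[]{42, 42, 42}, 2);
        System.out.println(bucket); // only the last flush (one record) survives
    }
}
```

Running the sketch leaves a single object containing only the last record, which is the data loss described above.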
Thanks @nicoloboschi
The use of a message index for the partitioner is okay, it gives the user another option. So overall the feature LGTM. But this PR is not solving the root cause here, which is that the cloud storage connector does not handle the file-overwrite case. When a target file already exists, the connector should either load the existing content, append the new content, and overwrite the object, or save under a different file name.
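The "use another file name" alternative mentioned above could be sketched like this (hypothetical helper, not connector code): probe the already-written keys and append a numeric suffix until the key is free.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the "use another file name" idea: instead of
// overwriting an existing object, suffix the key until it is unused.
public class UniqueKeySketch {

    static String uniqueKey(Set<String> existingKeys, String baseKey) {
        String candidate = baseKey;
        int suffix = 1;
        while (existingKeys.contains(candidate)) {
            candidate = baseKey + "-" + suffix++;
        }
        return candidate;
    }

    public static void main(String[] args) {
        Set<String> existing = new TreeSet<>(Set.of("topic/42.json"));
        System.out.println(uniqueKey(existing, "topic/42.json")); // topic/42.json-1
        System.out.println(uniqueKey(existing, "topic/43.json")); // topic/43.json
    }
}
```

In a real connector the existence check would be a listing or HEAD request against the store rather than an in-memory set, which is part of why the thread below calls the fix non-trivial.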
src/main/java/org/apache/pulsar/io/jcloud/partitioner/AbstractPartitioner.java
That's true. Moreover, the index approach is not enabled by default. But the solution is not that simple.
This will violate the batchSize parameter
That would make sense. A time-based solution could be implemented as a new partitioner type. However, neither the time-based approach nor the index-based one guarantees the absence of duplicates in every case. I think it's fine to merge this pull anyway.
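The time-based partitioner idea could be sketched as follows. This is a hypothetical class (the connector's own partitioners may differ): the object path is derived from the record's publish time rather than its recordSequence, so separate flushes get distinct time-stamped keys, while two flushes in the same formatted instant would still collide, which matches the duplicate/collision caveat above.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of a time-based partitioner path scheme.
public class TimeBasedPathSketch {

    // Millisecond-resolution UTC path: yyyy-MM-dd/HH-mm-ss-SSS
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd/HH-mm-ss-SSS").withZone(ZoneOffset.UTC);

    static String partitionPath(String topic, long publishTimeMillis) {
        return topic + "/" + FORMAT.format(Instant.ofEpochMilli(publishTimeMillis));
    }

    public static void main(String[] args) {
        System.out.println(partitionPath("public/default/topic", 0L));
        // -> public/default/topic/1970-01-01/00-00-00-000
    }
}
```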
    } else {
        LOGGER.debug("Found message {} with hasIndex=true but index is empty, using recordSequence",
                message.getMessageId());
    }
} else {
    LOGGER.debug("partitionerUseIndexAsOffset configured to true but no index found on the message {}, "
            + "perhaps the broker didn't expose the metadata, using recordSequence",
            message.getMessageId());
}
I don't think it's a good idea to implicitly fall back to the recordSequence if the user has configured the partitionerUseIndexAsOffset property. At the very least this should be a WARN log, or we should throw an exception here, IMO.
I understand your concern.
The problem is that whoever installs the sink may not know whether the broker will put the index in the record metadata.
In my first implementation this was at WARN level: 15659aa#diff-ed19908cc0ff954ebf1f795a0e3fd708a3b1be501cda66c22ea1906d771da90cR91
This is the same behavior (except for the special handling of batch messages) as in the Kafka Connect adaptor in Apache Pulsar: https://github.com/apache/pulsar/blob/4c22159f5a972e7a92382b9a90c6c70c43c5d166/pulsar-io/kafka-connect-adaptor/src/main/java/org/apache/pulsar/io/kafka/connect/KafkaConnectSink.java#L312-L359
To improve usability we could:
- Explicitly state the behavior in the parameter doc
- Set those logs to WARN
Another (more complex) option would be to add a flag "noIndexAction" that controls whether to throw an error or accept records without an index when partitionerUseIndexAsOffset is true.
However, I would avoid adding another boilerplate option.
Yeah, let's not add another flag :) I think moving the log to WARN and extending the parameter description is a good tradeoff.
Could you resolve the doc conflicts and add the new parameter to the Azure provider config properties too? Then it's good to merge IMO :)
2db6fa4 to afcf914
Thank you! LGTM
…titioning (streamnative#470) (cherry picked from commit a7672fb)
Motivation
Currently all the partitioners use the recordSequence field to partition messages. That field doesn't handle batch messages and, consequently, objects could be lost on the target storage: the record sequence has the same value for all the entries in a batch message, so only the last entry of the batch will be persisted on the storage.
Modifications
Add a new option partitionerUseIndexAsOffset (default to false for compatibility) to use the message's index for partitioning. The index is exposed only if the brokers expose the broker entry metadata (see https://github.com/apache/pulsar/wiki/PIP-70%3A-Introduce-lightweight-broker-entry-metadata). If the index is not available on a message, the record sequence will be used.
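A sink configuration enabling the new option might look like the fragment below. Only partitionerUseIndexAsOffset is the option added by this PR; the surrounding keys are illustrative placeholders, not the connector's full schema.

```yaml
# Hypothetical sink config fragment; only partitionerUseIndexAsOffset
# comes from this PR, the other keys are placeholders.
configs:
  provider: "aws-s3"
  bucket: "my-bucket"
  partitionerUseIndexAsOffset: true
```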
Documentation