Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move offset validation logic to consumer classes #13015

Merged
merged 9 commits into from
May 23, 2024

Conversation

KKcorps
Copy link
Contributor

@KKcorps KKcorps commented Apr 26, 2024

Depends on - #12806

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no guarantee we can always detect gap with start offset and batch first offset, especially after we introduce the exclusive start offset concept.
Seems like we still want to report gap within the general handling flow. In that case it should be easier to add an API hasGap() into MessageBatch, which can be filled by the stream consumer

@KKcorps
Copy link
Contributor Author

KKcorps commented Apr 29, 2024

There is no guarantee we can always detect gap with start offset and batch first offset, especially after we introduce the exclusive start offset concept. Seems like we still want to report gap within the general handling flow. In that case it should be easier to add an API hasGap() into MessageBatch, which can be filled by the stream consumer

That makes sense since we already have the information in the consumer from which offset did we actually wanted to consume the data vs from which offset we actually did OR any other criteria which consumer wants to implement

@KKcorps KKcorps marked this pull request as ready for review May 22, 2024 02:58
@KKcorps KKcorps changed the title [WIP] Move offset validation logic to consumer classes" Move offset validation logic to consumer classes" May 22, 2024
@Jackie-Jiang Jackie-Jiang changed the title Move offset validation logic to consumer classes" Move offset validation logic to consumer classes May 22, 2024
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise

*/
private void validateStartOffset(StreamPartitionMsgOffset startOffset, StreamPartitionMsgOffset batchFirstOffset) {
if (batchFirstOffset.compareTo(startOffset) > 0) {
private void reportDataLoss(MessageBatch messageBatch) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the java doc for this method

_serverMetrics.addMeteredTableValue(_tableStreamName, ServerMeter.STREAM_DATA_LOSS, 1L);
String message =
"startOffset(" + startOffset + ") is older than topic's beginning offset(" + batchFirstOffset + ")";
"Message loss detected in stream partition: " + _partitionGroupId + " for table: " + _tableNameWithType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use String.format() for readability

@@ -66,7 +66,7 @@ public synchronized KinesisMessageBatch fetchMessages(StreamPartitionMsgOffset s
KinesisPartitionGroupOffset startOffset = (KinesisPartitionGroupOffset) startMsgOffset;
String shardId = startOffset.getShardId();
String startSequenceNumber = startOffset.getSequenceNumber();

boolean isMissingMessages = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this

@@ -96,6 +96,10 @@ default boolean isEndOfPartitionGroup() {
return false;
}

default boolean hasDataLoss() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some javadoc for this method

@codecov-commenter
Copy link

codecov-commenter commented May 22, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 5 lines in your changes are missing coverage. Please review.

Project coverage is 62.20%. Comparing base (59551e4) to head (41da47c).
Report is 485 commits behind head on master.

Files Patch % Lines
...a/manager/realtime/RealtimeSegmentDataManager.java 20.00% 3 Missing and 1 partial ⚠️
...pinot/plugin/stream/kafka20/KafkaMessageBatch.java 66.66% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13015      +/-   ##
============================================
+ Coverage     61.75%   62.20%   +0.45%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2531      +95     
  Lines        133233   138570    +5337     
  Branches      20636    21448     +812     
============================================
+ Hits          82274    86197    +3923     
- Misses        44911    45926    +1015     
- Partials       6048     6447     +399     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 62.18% <50.00%> (+0.47%) ⬆️
java-21 35.11% <20.00%> (-26.52%) ⬇️
skip-bytebuffers-false 62.20% <50.00%> (+0.45%) ⬆️
skip-bytebuffers-true <0.01% <0.00%> (-27.73%) ⬇️
temurin 62.20% <50.00%> (+0.45%) ⬆️
unittests 62.20% <50.00%> (+0.45%) ⬆️
unittests1 46.67% <33.33%> (-0.22%) ⬇️
unittests2 27.90% <30.00%> (+0.17%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@KKcorps KKcorps merged commit 29c560f into apache:master May 23, 2024
19 checks passed
gortiz pushed a commit to gortiz/pinot that referenced this pull request Jun 14, 2024
* Enhance Kinesis consumer

* Simplify the handling

* Address comments

* Move offset validation logic to consumer classes

* Add missing message interface to message batch

* fix linting

* remove unused interface

* Cleanup and refactoring

* lint fixes

---------

Co-authored-by: Xiaotian (Jackie) Jiang <jackie.jxt@gmail.com>
Co-authored-by: Kartik Khare <kharekartik@kartiks-macbook-pro.tail8a064.ts.net>
Co-authored-by: Kartik Khare <kharekartik@Kartiks-MacBook-Pro.local>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants