KAFKA-7703: position() may return a wrong offset after seekToEnd #6407

Merged 7 commits on May 29, 2019

Conversation

viktorsomogyi
Contributor

@viktorsomogyi viktorsomogyi commented Mar 8, 2019

When a poll call that resets the offsets to the beginning is followed by seekToEnd and then position, the "reset to earliest" issued by poll can override the "reset to latest" initiated by seekToEnd through a subtle race:

  1. Both requests have been issued and both responses have returned to the client (the listOffsetResponse has been received).
  2. In Fetcher.resetOffsetIfNeeded(TopicPartition, Long, OffsetData), the thread scheduler may run the heartbeat thread first with the "reset to earliest" call, which sets the position in SubscriptionState to the earliest offset.
  3. The scheduler then resumes the application thread with the "reset to latest" call, which is discarded because the "reset to earliest" has already set a position, and it is the wrong one.
  4. The blocking position call returns the earliest offset instead of the expected latest offset.

The fix makes the TopicPartitionState in SubscriptionState synchronized and tracks the timestamp of the requested reset. This lets us decide precisely whether an incoming offset reset is the one we want, by comparing the timestamp recorded when the partition was marked for reset with the one actually used in the seek. As a result, only the most recently initiated offset reset takes effect. Synchronization also makes this check atomic, which guards against similar bugs.
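
To make the idea concrete, here is a minimal sketch of how a stale reset can be rejected while holding a lock. It is not the code from this PR: `PartitionStateSketch`, `ResetStrategy`, `requestOffsetReset`, and `maybeCompleteReset` are hypothetical names, and the sketch compares the requested strategy rather than a timestamp, purely for illustration.

```java
/** Demo entry point; replays the interleaving from the list above on one thread. */
final class ResetRaceSketch {
    public static void main(String[] args) {
        PartitionStateSketch state = new PartitionStateSketch();
        state.requestOffsetReset(ResetStrategy.EARLIEST); // reset started by poll
        state.requestOffsetReset(ResetStrategy.LATEST);   // seekToEnd supersedes it

        // Both responses have arrived; the stale "earliest" one is processed first.
        boolean staleApplied = state.maybeCompleteReset(ResetStrategy.EARLIEST, 0L);
        boolean freshApplied = state.maybeCompleteReset(ResetStrategy.LATEST, 42L);

        // Prints "false true 42": the stale response is dropped, the fresh one wins.
        System.out.println(staleApplied + " " + freshApplied + " " + state.position());
    }
}

/** Hypothetical reset strategies, named after the "earliest"/"latest" wording above. */
enum ResetStrategy { EARLIEST, LATEST }

/** Hypothetical, simplified stand-in for the per-partition state (an assumption). */
final class PartitionStateSketch {
    private ResetStrategy pendingReset; // null when no reset is awaited
    private Long position;              // null until a position is known

    /** Marks the partition for reset; the most recent request replaces older ones. */
    synchronized void requestOffsetReset(ResetStrategy strategy) {
        pendingReset = strategy;
        position = null;
    }

    /**
     * Applies an offset from a response only if it answers the reset that is
     * still pending. The check and the update happen under the same lock.
     */
    synchronized boolean maybeCompleteReset(ResetStrategy requested, long offset) {
        if (pendingReset != requested) {
            return false; // stale or unexpected response: ignore it
        }
        position = offset;
        pendingReset = null;
        return true;
    }

    synchronized Long position() {
        return position;
    }
}
```

Because the check and the write share one lock, the order in which the two responses are processed no longer matters in this sketch: only the response that matches the pending request can set the position.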

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@viktorsomogyi viktorsomogyi force-pushed the kafka-consumer-seek-bug branch 3 times, most recently from ed52b04 to 10acb07 Compare March 9, 2019 15:19
@ijuma
Contributor

ijuma commented Mar 9, 2019

Cc @hachikuji @rajinisivaram

@hachikuji hachikuji self-assigned this Mar 15, 2019
Contributor

@hachikuji hachikuji left a comment

Thanks, good find. Left a couple comments.

@viktorsomogyi
Contributor Author

retest this please

Contributor

@hachikuji hachikuji left a comment

@viktorsomogyi Thanks for the update. I left one suggestion to consider. Let me know what you think.

@viktorsomogyi viktorsomogyi force-pushed the kafka-consumer-seek-bug branch 2 times, most recently from 808ab74 to 31eecaa Compare April 12, 2019 14:58
boolean match = this.subscription.contains(topicPartition.topic());
if (!match) {
log.info("Assigned partition {} for non-subscribed topic; subscription is {}", topicPartition, this.subscription);
synchronized (SubscriptionState.this) {
Contributor Author

SpotBugs reports an error for this, but I don't think it's accurate, since we synchronize the whole method and this predicate is executed within it. Should I add this to the excludes?

Contributor

Actually I'm not sure why we need this additional block if we are already holding the lock. The predicate wouldn't be used after this method returns.

Contributor Author

[image]
Basically, this is why I added it; I think SpotBugs detects this incorrectly. I hope I can add a SpotBugs exclusion for this method.
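
For readers following the exchange, the sketch below illustrates why an extra `synchronized` block inside an already synchronized method is redundant: Java's intrinsic locks are reentrant, and a predicate invoked before the method returns runs with the lock already held. `PredicateUnderLockSketch` and its members are hypothetical and are not the SubscriptionState code under review.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical class; it only demonstrates lock reentrancy inside a predicate. */
final class PredicateUnderLockSketch {
    private final Set<String> subscription = Collections.singleton("orders");

    /** Returns true if the topic matches and the lock was held inside the predicate. */
    synchronized boolean lockHeldInsidePredicate(String topic) {
        boolean[] held = new boolean[1];
        Predicate<String> subscribed = t -> {
            // In a lambda, 'this' refers to the enclosing instance.
            held[0] = Thread.holdsLock(this);
            synchronized (this) { // same thread, same monitor: re-entered without blocking
                return subscription.contains(t);
            }
        };
        boolean match = subscribed.test(topic); // evaluated before the method returns
        return match && held[0];
    }

    public static void main(String[] args) {
        PredicateUnderLockSketch sketch = new PredicateUnderLockSketch();
        System.out.println(sketch.lockHeldInsidePredicate("orders")); // true
        System.out.println(Thread.holdsLock(sketch));                  // false
    }
}
```

The inner block changes nothing at runtime here; whether a static analyzer still flags the field access inside the lambda is a separate question from whether the access is actually protected.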

Contributor

@hachikuji hachikuji left a comment

Thanks for the updates. Left a few more comments.

return !hasValidPosition() && !awaitingReset();
}

- private boolean isPaused() {
+ private synchronized boolean isPaused() {
Contributor

Do we still need synchronization here if SubscriptionState is synchronized?

Contributor Author

Not every method in SubscriptionState is synchronized. When I don't need to modify the assignment, I synchronize only until I get the TopicPartitionState and then lock that object, to avoid locking the whole collection when it isn't necessary.
If you look at isPaused there:

public boolean isPaused(TopicPartition tp) {
    TopicPartitionState assignedOrNull = assignedStateOrNull(tp);
    return assignedOrNull != null && assignedOrNull.isPaused();
}

the first line only gets the state from the collection, releases the SubscriptionState lock, and then locks only the given TopicPartitionState.
When I write to the collection, for instance in assignFromUser, I lock the whole method, as I think we should keep writes consistent.
If you think this is unnecessary or too complicated, I can just lock SubscriptionState every time.

Contributor

Yeah, I have a slight preference to just lock SubscriptionState every time since it is the simplest option. I don't think contention is a major problem since there's only the heartbeat thread which is sleeping most of the time. Unless there's some reason to think the cost of lock acquisition itself is a concern.

Contributor Author

@viktorsomogyi viktorsomogyi Apr 25, 2019

Ok. I don't think lock acquisition would be expensive in this case.
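
The option agreed on in this thread, locking the outer object on every access, can be sketched roughly as follows. The class and method names are hypothetical simplifications (partitions are plain strings), not the actual SubscriptionState API; the point is only that the per-partition state needs no lock of its own once every accessor synchronizes on the owner.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of a state holder guarded by a single lock. */
final class CoarseLockSketch {
    /** Plain mutable holder; it relies on the outer lock instead of its own. */
    private static final class PartitionState {
        boolean paused;
    }

    private final Map<String, PartitionState> assignment = new HashMap<>();

    synchronized void assign(String partition) {
        assignment.putIfAbsent(partition, new PartitionState());
    }

    synchronized void pause(String partition) {
        PartitionState state = assignment.get(partition);
        if (state != null) {
            state.paused = true;
        }
    }

    /** Lookup and read happen under one lock, so no second lock is needed. */
    synchronized boolean isPaused(String partition) {
        PartitionState state = assignment.get(partition);
        return state != null && state.paused;
    }

    public static void main(String[] args) throws InterruptedException {
        CoarseLockSketch subscriptions = new CoarseLockSketch();
        subscriptions.assign("orders-0");

        // A second thread stands in for any background thread touching the state.
        Thread background = new Thread(() -> subscriptions.pause("orders-0"));
        background.start();
        background.join();

        System.out.println(subscriptions.isPaused("orders-0")); // true
        System.out.println(subscriptions.isPaused("orders-1")); // false (not assigned)
    }
}
```

The trade-off is the one discussed above: a single lock is easier to reason about, at the cost of serializing all access, which matters little when the second thread is idle most of the time.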

@viktorsomogyi viktorsomogyi force-pushed the kafka-consumer-seek-bug branch 3 times, most recently from 3aeaed9 to 9705b17 Compare April 25, 2019 14:04
Contributor

@hachikuji hachikuji left a comment

Sorry for the delay. This is looking good. I had just a few additional comments.

@hachikuji
Contributor

retest this please

@hachikuji
Contributor

Compilation failing:

16:14:09 > Task :clients:compileJava
16:14:09 /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12@2/clients/src/main/java/org/apache/kafka/clients/consumer/internals/SubscriptionState.java:108: error: partitionStateValues() is not public in PartitionStates; cannot be accessed from outside package
16:14:09             ", assignment=" + assignment.partitionStateValues() + " (id=" + assignmentId + ")}";
16:14:09                                         ^
16:14:09   where S is a type-variable:
16:14:09     S extends Object declared in class PartitionStates
16:14:11 1 error
16:14:12 

@viktorsomogyi
Contributor Author

Fixed the compilation error (it was due to a conflict with another commit) but the test failures seem suspicious. Please hold on for now.

Contributor

@hachikuji hachikuji left a comment

Thanks, just a couple more comments.

- public void resetGroupSubscription() {
- this.groupSubscription.retainAll(subscription);
+ synchronized void resetGroupSubscription() {
+ groupSubscription = new HashSet<>(groupSubscription);
Contributor

Since we also treat subscription as immutable once created, could we change this to groupSubscription = subscription?

Contributor Author

Makes sense, will do.

- this.subscription = topicsToSubscribe;
- this.groupSubscription.addAll(topicsToSubscribe);
+ subscription = topicsToSubscribe;
+ groupSubscription = topicsToSubscribe;
Contributor

This logic looks a little different. The "group subscription" is a little hard to understand. It's intended to be the union of the subscriptions of all consumers in the group. When we change the local subscription, we should still keep the subscription from the rest of the group. Perhaps separately we can consider how to simplify this bookkeeping.

Contributor Author

Thanks for pointing this out. I'll fix this.
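
One way to picture the bookkeeping described in the two threads above is the sketch below. It is an illustration under assumptions, not the merged implementation: `GroupSubscriptionSketch`, `changeSubscription`, `groupSubscribe`, and the private `union` helper are hypothetical, and both sets are treated as immutable snapshots that are replaced rather than mutated.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of local vs. group subscription bookkeeping. */
final class GroupSubscriptionSketch {
    private Set<String> subscription = Collections.emptySet();
    private Set<String> groupSubscription = Collections.emptySet();

    /** Replaces the local subscription but keeps topics known from the rest of the group. */
    synchronized void changeSubscription(Set<String> topicsToSubscribe) {
        subscription = Collections.unmodifiableSet(new HashSet<>(topicsToSubscribe));
        groupSubscription = union(groupSubscription, subscription);
    }

    /** Adds topics that other group members subscribe to. */
    synchronized void groupSubscribe(Set<String> topics) {
        groupSubscription = union(groupSubscription, topics);
    }

    /** Safe to share the reference because the set is never mutated after creation. */
    synchronized void resetGroupSubscription() {
        groupSubscription = subscription;
    }

    synchronized Set<String> subscription() {
        return subscription;
    }

    synchronized Set<String> groupSubscription() {
        return groupSubscription;
    }

    private static Set<String> union(Set<String> left, Set<String> right) {
        Set<String> merged = new HashSet<>(left);
        merged.addAll(right);
        return Collections.unmodifiableSet(merged);
    }

    public static void main(String[] args) {
        GroupSubscriptionSketch state = new GroupSubscriptionSketch();
        state.changeSubscription(Collections.singleton("a"));
        state.groupSubscribe(Collections.singleton("b"));
        state.changeSubscription(Collections.singleton("c"));
        System.out.println(new TreeSet<>(state.groupSubscription())); // [a, b, c]
        state.resetGroupSubscription();
        System.out.println(new TreeSet<>(state.groupSubscription())); // [c]
    }
}
```

Assigning `groupSubscription = topicsToSubscribe` directly in `changeSubscription` would have dropped "a" and "b" in this example, which is the behavior change flagged in the review.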

Contributor

@hachikuji hachikuji left a comment

LGTM. Thanks @viktorsomogyi .

@hachikuji
Contributor

I will go ahead and merge. The failing test is a known flake: https://issues.apache.org/jira/browse/KAFKA-8122.

@hachikuji hachikuji merged commit e82e2e7 into apache:trunk May 29, 2019
hachikuji pushed a commit that referenced this pull request May 29, 2019
When a poll call that resets the offsets to the beginning is followed by seekToEnd and then position, the "reset to earliest" issued by poll can override the "reset to latest" initiated by seekToEnd through a subtle race:

1. Both requests have been issued and both responses have returned to the client (the listOffsetResponse has been received).
2. In Fetcher.resetOffsetIfNeeded(TopicPartition, Long, OffsetData), the thread scheduler may run the heartbeat thread first with the "reset to earliest" call, which sets the position in SubscriptionState to the earliest offset.
3. The scheduler then resumes the application thread with the "reset to latest" call, which is discarded because the "reset to earliest" has already set a position, and it is the wrong one.
4. The blocking position call returns the earliest offset instead of the expected latest offset.

The fix makes SubscriptionState synchronized so that we can verify that the reset is expected while holding the lock.

Reviewers: Jason Gustafson <jason@confluent.io>
@viktorsomogyi
Contributor Author

viktorsomogyi commented May 30, 2019

@hachikuji thanks a lot for reviewing this!

@viktorsomogyi viktorsomogyi deleted the kafka-consumer-seek-bug branch May 30, 2019 11:32
pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
…che#6407)
