Add epoch for connection handler to handle create producer timeout. #5571

Merged
merged 6 commits into from Nov 19, 2019

Conversation

codelipenghui
Contributor

Fixes #5535

Motivation

Currently, if a create-producer request times out, the producer's connection handler reconnects to the broker later. But if the broker has already completed the previous create-producer request, the reconnection fails with "producer with name xxx is already connected".

This PR therefore introduces an epoch for the connection handler and adds a field named isGeneratedName to the producer to handle the problem above.

This PR only handles the generated producer name scenario. Many users have hit errors such as #5535, so we need to fix the generated producer name scenario first.

For the scenario of a user-specified producer name, we can discuss later and find a simple approach to handle it. I'll leave my idea here: use the producer id and producer name together as the identity of a producer. The producer name is used for the EO producer, and the producer id can be used when the producer reconnects. However, this approach depends on a globally unique producer id generator.

Modifications

If the new producer has a generated name and its epoch is greater than that of the existing producer, the new producer overwrites the old one, so the reconnecting producer is created successfully.
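
The overwrite rule can be sketched as a small predicate. This is a minimal illustration, not the code from this PR: `ProducerInfo` and `OverwriteRule` are hypothetical names, and the sketch assumes that both the existing and the new producer must have generated names and that the epoch grows with each reconnect.

```java
// Hypothetical stand-in for the broker-side producer state; not a Pulsar class.
final class ProducerInfo {
    final String name;
    final boolean generatedName; // true when the name was generated, not user specified
    final long epoch;            // assumed to grow with each reconnect of the same producer

    ProducerInfo(String name, boolean generatedName, long epoch) {
        this.name = name;
        this.generatedName = generatedName;
        this.epoch = epoch;
    }
}

final class OverwriteRule {
    private OverwriteRule() {}

    // A reconnecting producer may replace the registered one only when both use
    // generated names and the newcomer carries a strictly greater epoch.
    static boolean canOverwrite(ProducerInfo existing, ProducerInfo incoming) {
        return existing.name.equals(incoming.name)
                && existing.generatedName
                && incoming.generatedName
                && incoming.epoch > existing.epoch;
    }
}
```

Under this rule, a retry after a timed-out create request (same generated name, greater epoch) is accepted, while a second producer with a user-specified name is still rejected.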

Verifying this change

Add unit tests to simulate producer timeout and reconnection

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

  • Does this pull request introduce a new feature? (no)

@codelipenghui
Contributor Author

@wolfstudy if this PR can be completed before 2.4.2 is cut, please consider including it, thanks.

@sijie sijie added this to the 2.4.2 milestone Nov 6, 2019
@sijie
Member

sijie commented Nov 6, 2019

@wolfstudy @codelipenghui This issue has been reported by many users, so let's include it in 2.4.2.

@codelipenghui
Contributor Author

run java8 tests

Member

@sijie sijie left a comment

@codelipenghui the idea looks good to me. left a few comments. PTAL

/**
* @return the name of producer is generated or user specified
*/
boolean isGeneratedName();
Member

do we need to expose this to producer api? I think this is an implementation detail, which should be hidden in the implementation.

Contributor Author

ok, currently no need to expose to the interface

import org.apache.pulsar.client.api.PulsarClientException;
import org.apache.pulsar.client.impl.HandlerState.State;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

-class ConnectionHandler {
+public class ConnectionHandler {
Member

nit: add @VisibleForTesting ?

Contributor Author

Ok, let me try

@@ -57,6 +57,7 @@

private String topicName = null;
private String producerName = null;
private boolean isGeneratedName = true;
Member

do we really need to add another field to ProducerConfigurationData?


// Indicate the name of the producer is generated or not(user specified)
// Use default false here is in order to be forward compatible with the client
optional bool is_generated_name = 9 [default = false];
Member

is_generated_name => user_provided_producer_name ?

throw new NamingException(
"Producer with name '" + producer.getProducerName() + "' is already connected to topic");
boolean canOverwrite = false;
for (Producer existProducer : producers.values()) {
Member

Can we avoid iterating over the producers set? I.e. can you change the hash set to a hash map?
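
To illustrate the suggestion, here is a minimal sketch comparing a scan over a set with a keyed lookup in a map. It uses the JDK's `ConcurrentHashMap` rather than Pulsar's own collections, and `NamedProducer` and `ProducerLookup` are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a producer; only the name matters here.
final class NamedProducer {
    final String name;

    NamedProducer(String name) {
        this.name = name;
    }
}

final class ProducerLookup {
    private ProducerLookup() {}

    // Set-based storage: finding a producer by name needs a scan over every entry.
    static NamedProducer findInSet(Set<NamedProducer> producers, String name) {
        for (NamedProducer p : producers) {
            if (p.name.equals(name)) {
                return p;
            }
        }
        return null;
    }

    // Map keyed by producer name: a single hash lookup, no iteration.
    static NamedProducer findInMap(ConcurrentHashMap<String, NamedProducer> producers, String name) {
        return producers.get(name);
    }
}
```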

@codelipenghui
Contributor Author

@sijie I have addressed your comments, please take a look again.

@@ -139,7 +140,9 @@ public String toString() {

@Override
public ConcurrentOpenHashSet<Producer> getProducers() {
-return producers;
+ConcurrentOpenHashSet<Producer> result = new ConcurrentOpenHashSet<>(16, 1);
Member

oops. I didn't realize that there is a getProducers here. It is used in a lot of places. hence it might be a performance problem if we changed to a hash map.

Contributor Author

I have checked where the getProducers() method is used.
[screenshot of getProducers() usages]
It looks OK: only two places use it, the backlog-exceeded handling and the Prometheus metrics aggregator; the others are unit tests.

Contributor Author

@sijie I will try changing the getProducers method to return a map.

@codelipenghui
Contributor Author

run java8 tests

@codelipenghui
Contributor Author

run java8 tests

canOverwrite = true;
}
if (canOverwrite) {
producers.put(producer.getProducerName(), producer);
Member

use producers.replace(producer.getProducerName(), existingProducer, producer) to make sure one can successfully add the producer.

canOverwrite = true;
}
if (canOverwrite) {
producers.put(producer.getProducerName(), producer);
Member

use replace and check the return result

oldProducer.close();
canOverwrite = true;
}
if (canOverwrite) {
Member

nit: this can be simplified with

if (canOverwrite && !producers.replace(newProducer.getProducerName(), oldProducer, newProducer)) {
      throw new BrokerServiceException.NamingException(
                    "Producer with name '" + newProducer.getProducerName() + "' is already connected 
}

Contributor Author

Maybe it can be simplified to

if (!canOverwrite || !producers.replace(newProducer.getProducerName(), oldProducer, newProducer)) {
    throw new BrokerServiceException.NamingException(
                    "Producer with name '" + newProducer.getProducerName() + "' is already connected");
}

Right?
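
For reference, the combined check can be sketched as follows, assuming the producers are held in a `java.util.concurrent.ConcurrentHashMap` (the broker's own map type may differ). `ReplaceSketch`, `registerOrFail`, and the use of `IllegalStateException` in place of `BrokerServiceException.NamingException` are stand-ins for illustration. `replace(key, oldValue, newValue)` succeeds only if the key is still mapped to `oldValue`, so a concurrent change surfaces as a failure instead of being silently overwritten.

```java
import java.util.concurrent.ConcurrentHashMap;

final class ReplaceSketch {
    private ReplaceSketch() {}

    // Swaps oldProducer for newProducer under the given name, or throws.
    static <P> void registerOrFail(ConcurrentHashMap<String, P> producers, String name,
                                   P oldProducer, P newProducer, boolean canOverwrite) {
        // Reject when overwriting is not allowed, or when the atomic swap fails
        // because the map no longer holds oldProducer under this name.
        if (!canOverwrite || !producers.replace(name, oldProducer, newProducer)) {
            throw new IllegalStateException(
                    "Producer with name '" + name + "' is already connected");
        }
    }
}
```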

@sijie sijie requested a review from rdhabalia November 8, 2019 07:16
@sijie
Member

sijie commented Nov 8, 2019

@merlimat @rdhabalia PTAL

@codelipenghui
Contributor Author

run java8 tests

@codelipenghui
Contributor Author

run cpp tests
run java8 tests

@codelipenghui
Contributor Author

run java8 tests

@codelipenghui
Contributor Author

run java8 tests

@codelipenghui
Contributor Author

run java8 tests

@codelipenghui
Contributor Author

run java8 tests

@sijie
Member

sijie commented Nov 11, 2019

@merlimat @rdhabalia can you review this pull request?

@sijie
Member

sijie commented Nov 11, 2019

@codelipenghui can you also create github issues for tracking adding similar fixes for other language clients?

@codelipenghui
Contributor Author

@sijie I have added a tracking task, #5606, for the other language clients to catch up.

@sijie
Member

sijie commented Nov 12, 2019

@merlimat @rdhabalia can you please review this?

@wolfstudy wolfstudy modified the milestones: 2.4.2, 2.4.3 Nov 13, 2019
@wolfstudy
Member

@codelipenghui I will change the milestone to 2.4.3, so we can cut 2.4.2 now and, if needed, 2.4.3 in a few weeks.

@jiazhai jiazhai merged commit 75c7229 into apache:master Nov 19, 2019
@jiazhai jiazhai modified the milestones: 2.4.3, 2.4.2 Nov 19, 2019
@wolfstudy wolfstudy modified the milestones: 2.4.2, 2.5.0 Nov 19, 2019
@wolfstudy
Member

@codelipenghui @jiazhai This PR changes the proto file, so I will move the milestone to 2.5.0.

jiazhai pushed a commit that referenced this pull request Nov 21, 2019
…5701)

Fixes #5698

Motivation
#5571 handles producers with generated names. The replicator producer is created by the broker, and there is only one such producer per replicated topic.

So we can handle it simply by treating the replicator producer as a generated-name producer.
@codelipenghui codelipenghui deleted the issue-5535 branch March 4, 2021 03:07
Development

Successfully merging this pull request may close these issues.

failed to add producer to topic after connection closed
4 participants