Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix create partition of an exist topic doesn't throw RestException #9342

Merged

Conversation

BewareMyPower
Copy link
Contributor

@BewareMyPower BewareMyPower commented Jan 27, 2021

Motivation

Currently creating a partition of an existed partitioned topic was created doesn't throw any exception. However it should be an invalid behavior. The reason is that when a non partitioned topic was created, it only checks whether the number of partitions is positive. However, no matter the topic doesn't exist or the topic is an existed partition, the number of partitions is 0. This PR is to distinguish these two cases and throw a RestException when the non-partitioned topic is an existed partition.

Modifications

  • When creating a non-partition topic that is an existed partition, throw a RestException and add a test to verify it.
  • Add a test provider to ensure backward compatibility that users can still create missed partitions by creating non-partitioned topics of the missed partition.
  • Fix the functions worker startup. Before this PR it always tries to create the metadata topics like public/functions/assignment no matter if it exists. Here we ignore the ConflictException.

Verifying this change

  • Make sure that the change passes the CI checks.

This change added tests and can be verified as follows:

  • Added PersistentTopics#testCreateExistedPartition
  • Added another test provider to PartitionCreationTest#testCreateMissedPartitions
  • Add tests for creating existed topics to AdminApiTeststestPersistentTopicCreation.

@BewareMyPower
Copy link
Contributor Author

@hangc0276 PTAL, this fix will expose the wrong repeated metadata initialization code of KoP.

@BewareMyPower
Copy link
Contributor Author

I've changed another way to fix the issue and the backward compatibility could be retained, PTAL again, @zymap

@zymap
Copy link
Member

zymap commented Jan 28, 2021

By the way, when a AdminResource was created with a full topic name, the internal topicName would be initialized with a wrong name. eg for persistent://public/default/my-topic, the internal topicName is persistent://public /default/persistent://public/default/my-topic. This wrong behavior should be fixed.

I am curious about this, when will this behavior happen? As I know, the rest API is defined as tenant/namespace/topicname. So that means this case will happen when you name your topic name as persistent://public /default/topic explicitly.

Currently creating a partition of an existed partitioned topic was created doesn't throw any exception. However it should be an invalid behavior. The reason is that when a non partitioned topic was created, it only checks whether the number of partitions is positive. However, no matter the topic doesn't exist or the topic is an existed partition, the number of partitions is 0. This PR is to distinguish these two cases and throw a RestException when the non-partitioned topic is an existed partition.

About this issue, I am using master code to test and it shows I can't create a non-partitioned topic with a name that contains the -partition-

 bin/pulsar-admin topics create persistent://public/default/tttest-partition-0
08:46:49.633 [AsyncHttpClient-7-1] WARN  org.apache.pulsar.client.admin.internal.BaseResource - [http://localhost:8080/admin/v2/persistent/public/default/tttest-partition-0] Failed to perform http put request: javax.ws.rs.ClientErrorException: HTTP 412 Precondition Failed
Can't create topic tttest-partition-0 with "-partition-" followed by numeric value if there isn't a partitioned topic tttest created.

Reason: Can't create topic tttest-partition-0 with "-partition-" followed by numeric value if there isn't a partitioned topic tttest created.

So could you please give me more information about this case?

@BewareMyPower
Copy link
Contributor Author

BewareMyPower commented Jan 28, 2021

@zymap You're right for the 1st issue. It only affects my tests code:

        final String partitionName = TopicName.get(topicName).getPartition(0).toString();
        try {
            persistentTopics.createNonPartitionedTopic(testTenant, testNamespace, partitionName, false);

I should use the local name in tests instead of in the validateTopicName.

For the 2nd issue, the intention for creating non-partitioned topic with -partition- suffix is to create missed partitions. So you should first create a partitioned topic.

Here's an example:

# 1. A partition was somehow deleted or not created successfully
$ ./bin/pulsar-admin topics create-partitioned-topic xyz -p 3
$ ./bin/pulsar-admin topics delete xyz-partition-0
$ ./bin/pulsar-admin topics list public/default              
"persistent://public/default/xyz-partition-2"
"persistent://public/default/xyz-partition-1"
# 2. Then we can create a non-partitioned topic to create missed partitions
$ ./bin/pulsar-admin topics create xyz-partition-0           
$ ./bin/pulsar-admin topics list public/default   
"persistent://public/default/xyz-partition-2"
"persistent://public/default/xyz-partition-0"
"persistent://public/default/xyz-partition-1"
# 3. However, if you create a existed partition, there's no error, this is what I want to fix
$ ./bin/pulsar-admin topics create xyz-partition-2

The rule is from #5148.

@BewareMyPower BewareMyPower changed the title Fix create partition of a exist topic doesn't throw RestException [WIP] Fix create partition of a exist topic doesn't throw RestException Jan 28, 2021
@BewareMyPower
Copy link
Contributor Author

The test is now broken, I'll fix it first.

@BewareMyPower BewareMyPower changed the title [WIP] Fix create partition of a exist topic doesn't throw RestException Fix create partition of a exist topic doesn't throw RestException Jan 28, 2021
@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

1 similar comment
@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

Copy link
Member

@zymap zymap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for your explanation!

Left a minor comment.

@BewareMyPower BewareMyPower force-pushed the bewaremypower/fix-create-existed-topic branch from d61cc34 to 0603fe8 Compare January 30, 2021 03:31
@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

2 similar comments
@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

@BewareMyPower BewareMyPower changed the title Fix create partition of a exist topic doesn't throw RestException Fix create partition of an exist topic doesn't throw RestException Feb 1, 2021
@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

@BewareMyPower BewareMyPower force-pushed the bewaremypower/fix-create-existed-topic branch from 0603fe8 to 14727c1 Compare February 1, 2021 08:14
@jiazhai
Copy link
Member

jiazhai commented Feb 1, 2021

/pulsarbot run-failure-checks

@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

@BewareMyPower
Copy link
Contributor Author

It looks like the BrokerServiceAutoSubscriptionCreationTest is affected by this PR. I'll fix it soon.

@BewareMyPower BewareMyPower force-pushed the bewaremypower/fix-create-existed-topic branch from 14727c1 to 4f0f350 Compare February 2, 2021 02:26
@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

2 similar comments
@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

@BewareMyPower
Copy link
Contributor Author

The failed tests may be caused by this PR, I'm trying to fix them.

@BewareMyPower
Copy link
Contributor Author

I ran the functions integration tests in my local environment and got the similar error:

19:57:06.676 [docker-java-stream-1645855842:org.apache.pulsar.tests.integration.utils.DockerUtils$2@221] INFO org.apache.pulsar.tests.integration.utils.DockerUtils - DOCKER.exec(PulsarFunctionsProcessTest-apuqq-pulsar-broker-1:tail -f /var/log/pulsar/broker.log): STDOUT: 11:57:06.237 [pulsar-web-43-7] ERROR org.apache.pulsar.broker.admin.impl.PersistentTopicsBase - [null] Topic persistent://public/functions/assignments already exists
11:57:06.246 [pulsar-web-43-7] INFO org.eclipse.jetty.server.RequestLog - 172.28.0.10 - - [03/Feb/2021:11:57:06 +0000] "PUT /admin/v2/persistent/public/functions/assignments?authoritative=true HTTP/1.1" 409 38 "-" "Pulsar-Java-v2.8.0-SNAPSHOT" 18

It looks like the functions worker container tries to create existed public/functions/assignments, which leads to the failure of test setup.

@BewareMyPower BewareMyPower force-pushed the bewaremypower/fix-create-existed-topic branch from 4f0f350 to 7e91eb2 Compare February 3, 2021 12:28
@BewareMyPower
Copy link
Contributor Author

BewareMyPower commented Feb 3, 2021

There're still 3 tests that failed.

[ERROR] Run 17: PulsarFunctionsProcessTest.testSerdeFunction » ServerSideError HTTP 502 Bad Ga...
[ERROR] Run 19: PulsarFunctionsProcessTest.testSlidingCountWindowTest » ServerSideError HTTP 5...
[ERROR] Run 21: PulsarFunctionsProcessTest.testTumblingCountWindowTest » ServerSideError HTTP ...

These tests may be flaky. I've rerun the tests and testSerdeFunction passed. They both failed with HTTP 502 like:

23:34:50.701 [AsyncHttpClient-211-1:org.apache.pulsar.client.admin.internal.BaseResource$1@128] WARN org.apache.pulsar.client.admin.internal.BaseResource - [http://localhost:32942/admin/v2/persistent/public/default/test-sliding-count-window-PROCESS-input-brtzenuf] Failed to perform http put request: javax.ws.rs.ServerErrorException: HTTP 502 Bad Gateway
23:34:51.579 [AsyncHttpClient-217-1:org.apache.pulsar.client.admin.internal.BaseResource$1@128] WARN org.apache.pulsar.client.admin.internal.BaseResource - [http://localhost:32942/admin/v2/persistent/public/default/test-sliding-count-window-PROCESS-input-fgdsfqgy] Failed to perform http put request: javax.ws.rs.ServerErrorException: HTTP 502 Bad Gateway
23:34:51.660 [AsyncHttpClient-223-1:org.apache.pulsar.client.admin.internal.BaseResource$1@128] WARN org.apache.pulsar.client.admin.internal.BaseResource - [http://localhost:32942/admin/v2/persistent/public/default/test-tumbling-count-window-PROCESS-input-cxgbpagj] Failed to perform http put request: javax.ws.rs.ServerErrorException: HTTP 502 Bad Gateway
23:34:51.728 [AsyncHttpClient-229-1:org.apache.pulsar.client.admin.internal.BaseResource$1@128] WARN org.apache.pulsar.client.admin.internal.BaseResource - [http://localhost:32942/admin/v2/persistent/public/default/test-tumbling-count-window-PROCESS-input-saucgobb] Failed to perform http put request: javax.ws.rs.ServerErrorException: HTTP 502 Bad Gateway

Take testWindowFunction for example, it failed at

        try (PulsarAdmin admin = PulsarAdmin.builder().serviceHttpUrl(pulsarCluster.getHttpServiceUrl()).build()) {
            admin.topics().createNonPartitionedTopic(inputTopicName);
            admin.topics().createNonPartitionedTopic(outputTopicName);
        }

@BewareMyPower BewareMyPower force-pushed the bewaremypower/fix-create-existed-topic branch from b2c3751 to 00e65bb Compare February 5, 2021 01:19
@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

1 similar comment
@BewareMyPower
Copy link
Contributor Author

/pulsarbot run-failure-checks

@BewareMyPower
Copy link
Contributor Author

Now all tests passed, PTAL again, @jiazhai @sijie @zymap @codelipenghui

@sijie sijie merged commit 16a2240 into apache:master Feb 8, 2021
@BewareMyPower BewareMyPower deleted the bewaremypower/fix-create-existed-topic branch February 18, 2021 06:59
jiazhai pushed a commit to streamnative/kop that referenced this pull request Mar 3, 2021
Currently KoP created `__consumer_offsets` repeatedly, the `ConflictException` will be thrown after apache/pulsar#9342. Though it doesn't affect the usage because the `ConflictException` is treated as the concurrent issue and will be swallowed, it could cause some unnecessary logs. So this PR remove the redundant partition creation code.

Another issue is that when `__consumer_offsets` already exists, the existed partition will be created because the partition returned by broker contains `persistent://` prefix but the topics in `offsetPartitionSet` don't. This PR use `KopTopic` to avoid the topic name dismatch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants