Skip to content

Increase the rate of topic creation#11148

Closed
suiyuzeng wants to merge 6 commits intoapache:masterfrom
suiyuzeng:master
Closed

Increase the rate of topic creation#11148
suiyuzeng wants to merge 6 commits intoapache:masterfrom
suiyuzeng:master

Conversation

@suiyuzeng
Copy link
Contributor

@suiyuzeng suiyuzeng commented Jun 29, 2021

Motivation

The count of topic created in 1 second is less than 200, when the topic count of the namespace is about 200,000..

Modifications

The reason is that the time cost of topic exist checking is too long when the topic count is large.
In org.apache.pulsar.broker.admin.AdminResource#checkTopicExistsAsync, all the topics would be checked one by one. As the count of topic grows, the time cost will become longer.
Can we use org.apache.pulsar.broker.namespace.NamespaceService#checkTopicExists to check if the topic is exits?

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: ( no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: ( no )

Documentation

Check the box below and label this PR (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    code optimize

  • doc

    (If this PR contains doc changes)

Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! But I think this will break the current behavior. Currently, we do not allow the duplicate topic name with the persistent or non-persistent domain, with this change it will only check the persistent topic duplication.

I think we should add the non-persistent topic duplication check back.

@codelipenghui codelipenghui added this to the 2.9.0 milestone Jun 29, 2021
@codelipenghui codelipenghui added type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages release/2.8.1 labels Jun 29, 2021
@suiyuzeng
Copy link
Contributor Author

Nice catch! But I think this will break the current behavior. Currently, we do not allow the duplicate topic name with the persistent or non-persistent domain, with this change it will only check the persistent topic duplication.

I think we should add the non-persistent topic duplication check back.

Got it. I will update later.

@suiyuzeng
Copy link
Contributor Author

@codelipenghui hi, can you please take a look?


} else {
return pulsar().getNamespaceService().getListOfTopics(topicName.getNamespaceObject(),
CommandGetTopicsOfNamespace.Mode.NON_PERSISTENT)
Copy link
Contributor

@315157973 315157973 Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have multiLayerTopicsMap in BrokerService, we don't need to traverse all topics under the namespace, and then traverse again to determine whether topic exists.
We only need to traverse the bundles, and then directly use map.contains , usually the number of bundles is not too much

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nice catch !
I am happy to see an improvement here

do we have specific test cases around this function ?

@suiyuzeng
Copy link
Contributor Author

@315157973 @eolivelli thanks,i will Optimize the code and add some test about this function.

@315157973
Copy link
Contributor

@suiyuzeng Hello, please go on

@suiyuzeng
Copy link
Contributor Author

@suiyuzeng Hello, please go on

Sorry, a little busy recently~
Find another problem, such as the test case "testCreatePartitionedTopicHavingNonPartitionTopicWithPartitionSuffix". Some non partition topic has the partition suffix. In this case, we can not use the cache to check anymore. There is some checking about this issue in the current code, to ensure that we can not create no no-parititon topic whit -partition-xxx suffix. How about add a config? If a cluster has the no-partition whit the suffix, use the current way to check duplication. If no no-paritition topic whit the suffix, use the cache to check.

@hangc0276
Copy link
Contributor

@suiyuzeng Hello, please go on

Sorry, a little busy recently~
Find another problem, such as the test case "testCreatePartitionedTopicHavingNonPartitionTopicWithPartitionSuffix". Some non partition topic has the partition suffix. In this case, we can not use the cache to check anymore. There is some checking about this issue in the current code, to ensure that we can not create no no-parititon topic whit -partition-xxx suffix. How about add a config? If a cluster has the no-partition whit the suffix, use the current way to check duplication. If no no-paritition topic whit the suffix, use the cache to check.

@suiyuzeng Nice catch!

  1. I am cutting 2.8.1, would you have time to address @315157973 's comments?
  2. for the restriction for reserved key-words for topic name, such as -partition-xx suffix for non-partitioned topic name, we have started a discussion for this issue: https://lists.apache.org/list.html?dev@pulsar.apache.org

@hangc0276
Copy link
Contributor

move to 2.8.2

@315157973
Copy link
Contributor

move to 2.8.3

@github-actions
Copy link

github-actions bot commented Nov 1, 2021

@suiyuzeng:Thanks for your contribution. For this PR, do we need to update docs?
(The PR template contains info about doc, which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)

@Anonymitaet Anonymitaet added doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Nov 25, 2021
@codelipenghui codelipenghui modified the milestones: 2.10.0, 2.11.0 Jan 18, 2022
@michaeljmarshall
Copy link
Member

Removing the release/2.8.3 label as this change will target master and probably won't be cherry picked back to stable branches because it is an optimization.

@codelipenghui codelipenghui modified the milestones: 2.11.0, 2.12.0 Jul 26, 2022
@tisonkun
Copy link
Member

Closed as stale and conflict. Please rebase and resubmit the patch if it's still relevant.

I believe this patch is worth continuing, but holding a stale patch doesn't work. If @suiyuzeng can rebase and resubmit the patch, I'm glad to follow the PR lifecycle. Otherwise, I may pick it up later. Close.

@tisonkun tisonkun closed this Dec 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs type/enhancement The enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants