High CPU load when PulsarClient (Java) is idle #6414

Closed
racorn opened this issue Feb 24, 2020 · 4 comments
Labels: area/client, help wanted, type/bug

Comments

racorn (Contributor) commented Feb 24, 2020

Describe the bug
Running a simple consumer on Windows using the PulsarClient, I noticed considerable CPU usage while the consumer was just idling. Profiling showed that the 'pulsar-timer-x-y' thread is heavy on the CPU. Digging into the source code, we can see that the HashedWheelTimer is set up with a 1 ms tick time:

timer = new HashedWheelTimer(getThreadFactory("pulsar-timer"), 1, TimeUnit.MILLISECONDS);

A simple test with similar values shows that a running HashedWheelTimer can consume CPU equal to 100% of one hyperthread:

    // Requires io.netty.util.HashedWheelTimer, io.netty.util.Timeout,
    // io.netty.util.TimerTask and java.util.concurrent.TimeUnit.
    @Test
    public void testTimerWheel() throws Exception {
        HashedWheelTimer timer = new HashedWheelTimer(1, TimeUnit.MILLISECONDS);

        // Self-rescheduling no-op task: the CPU load comes from the wheel
        // spinning every 1 ms, not from the task itself.
        timer.newTimeout(new TimerTask() {
            @Override
            public void run(final Timeout timeout) throws Exception {
                timer.newTimeout(this, 500, TimeUnit.MILLISECONDS);
            }
        }, 500, TimeUnit.MILLISECONDS);

        Thread.sleep(30_000);
        timer.stop();
    }

My understanding of the HashedWheelTimer is that it is intended for a large number of approximate timeouts, not for millisecond-precision scheduling, but I may be mistaken.

To Reproduce
Run a simple consumer subscribing to a topic with no messages, or run the above simple JUnit test. Observe CPU usage.

Expected behavior
An idle consumer should not incur high CPU load. For example, when running 8 consumers on an i7-8700 (4 cores, 2 threads per core), Task Manager reported 100% CPU usage even though there was no traffic and no messages on the topics.

Discussion
High CPU load is a high price to pay for immediate batch dispatching (which may be the primary reason for the 1 ms tick time). It should be possible to relax immediate batch dispatching in favor of leaving more compute resources available to other tasks.

An alternative is to use a plain old ScheduledExecutorService for batch dispatching/reception; tests suggest it is also lighter on the CPU when millisecond precision is required and the number of tasks is not too high. For example, I could schedule 2000 simple tasks every millisecond (scheduleAtFixedRate) and still have compute to spare on a single thread. When scheduling 10k tasks every millisecond, the thread was working at 100%, but scheduling 10k tasks every 10 milliseconds was OK. So if a client is used to create a huge number of producers/consumers, it might be better to relax the batch timeout so the timeout thread has a chance to actually perform its work. Another observation is that the HashedWheelTimer with a 1 millisecond tick time is not able to run even a single simple task every millisecond, partly due to the HashedWheelTimer's approximate nature, and maybe partly because spinning the wheel competes with task expiration.

Note that when using a ScheduledExecutorService each task might not get exactly the same number of executions; for example, when scheduling 1000 tasks to execute every millisecond for 30 seconds, tasks were executed between 29,986 and 30,012 times.
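
For reference, here is a minimal sketch of the kind of measurement described above. The no-op lambda stands in for a batch-dispatch job, and the counts/periods simply mirror the experiment; this is not Pulsar code.

    // Requires java.util.concurrent.Executors, ScheduledExecutorService and TimeUnit.
    @Test
    public void testScheduledExecutor() throws Exception {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();

        // 2000 placeholder "batch dispatch" tasks, each running every 1 ms on a single thread.
        for (int i = 0; i < 2000; i++) {
            executor.scheduleAtFixedRate(() -> { /* flush batch */ }, 1, 1, TimeUnit.MILLISECONDS);
        }

        Thread.sleep(30_000);   // observe CPU usage while the tasks run
        executor.shutdownNow();
    }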

Some suggestions:

  1. The tick time could be configurable, with a default value of 1 millisecond to preserve current behavior (a minimal sketch follows this list).
  2. Use a default tick time of 100 milliseconds, and create a ScheduledExecutorService in PulsarClientImpl that is exposed the same way as the Timer object for other classes to use. ProducerImpl and ConsumerBase could then use the ScheduledExecutorService to schedule batch dispatching/reception.
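
As a rough illustration of suggestion 1, the timer construction in PulsarClientImpl could read the tick duration from the client configuration instead of hard-coding 1 ms. The accessor name below is hypothetical, not existing API.

    // Hypothetical accessor; today the tick duration is hard-coded to 1 ms.
    long tickDurationMs = conf.getTimerTickTimeMillis();   // default would stay at 1
    timer = new HashedWheelTimer(getThreadFactory("pulsar-timer"), tickDurationMs, TimeUnit.MILLISECONDS);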

Additional context
Observed under Windows 10 64-bit, running Java 11.

jiazhai (Member) commented Feb 28, 2020

Thanks a lot to @racorn for the detailed analysis. Your suggestions sound great. Would you like to provide a fix for this?

racorn (Contributor, Author) commented Feb 28, 2020

Yes, I can try to implement what I suggested by mid next week. Will that be OK with you?

racorn (Contributor, Author) commented Feb 29, 2020

I wish I had seen this earlier: netty/netty#9710, and the fix in netty/netty#9714.

As the Netty version has now been bumped to 4.1.45 on master, the CPU load is no longer an issue on Windows.

Anyway, @jiazhai, I made the suggested implementation in my forked repository; you can have a look to see if it is of any value. It could be an improvement in ProducerImpl, where client.scheduledExecutorService().scheduleAtFixedRate better reflects what you want to achieve than using a Timeout. Also, the keyGeneratorTask could use the ScheduledExecutorService instead of the Netty eventLoopGroup. On the other hand, if you are producing messages for a huge number of topics (10k+), the HashedWheelTimer may perform better. So maybe it is best to leave it as it is.
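
To make the contrast concrete, here is a simplified sketch of the two scheduling patterns. flushBatch(), batchDelayMs and scheduledExecutor are placeholders for illustration, not actual ProducerImpl members.

    // Current pattern: a TimerTask that re-arms itself on the shared HashedWheelTimer.
    timer.newTimeout(new TimerTask() {
        @Override
        public void run(Timeout timeout) {
            flushBatch();   // placeholder for the batch dispatch work
            timer.newTimeout(this, batchDelayMs, TimeUnit.MILLISECONDS);
        }
    }, batchDelayMs, TimeUnit.MILLISECONDS);

    // Alternative pattern: a fixed-rate task on a shared ScheduledExecutorService.
    scheduledExecutor.scheduleAtFixedRate(
            this::flushBatch, batchDelayMs, batchDelayMs, TimeUnit.MILLISECONDS);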

jiazhai (Member) commented Mar 3, 2020

Thanks @racorn for your great help. I agree with you that we can leave it as it is, and I would like to close this issue.
