Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cpu% high #1424

Closed
ssakshi0302 opened this issue Nov 18, 2016 · 24 comments
Closed

cpu% high #1424

ssakshi0302 opened this issue Nov 18, 2016 · 24 comments

Comments

@ssakshi0302
Copy link

ssakshi0302 commented Nov 18, 2016

Hi, too many RXComputationalScheduler threads are getting created and my application's cpu% is going high after using hystrix.

I am using this configuration

"groupName": "recsagservice", "threadPoolCoreSize": 5,
"cirBreakErrorThresPerc" : 50, "cirBreakReqVolThres":80, "cirBreakSleepTimeInMilliSec":72000 .

Please guide me what could be wrong here. I am attaching thread dump.

thread_hystrix.txt

@kennedyoliveira
Copy link
Contributor

The RXComputationalScheduler has nothing to do with Hystrix, it's a RxJava thread, and you will has 1 for each CPU you has in your machine, so you probably has an 8-core machine that's why you get 8 threads.

About your cpu% going high you will need to provide more information about what you are doing because on the thread dump there are only threads waiting, no threads doing particularly nothing and shouldn't consume your cpu as you say.

@ssakshi0302
Copy link
Author

ssakshi0302 commented Nov 18, 2016

Hi,

Everytime my client method is called, I am just creating a new HystrixCommand object and calling its execute method.
I have not called any reset method . This client method frequency is 45k times in an hour.

This is how I am using hystrix:

            Setter group =    Setter.withGroupKey(HystrixCommandGroupKey.Factory.asKey("mygroup")).andCommandKey(HystrixCommandKey.Factory.asKey("myCommand")).andThreadPoolKey(                HystrixThreadPoolKey.Factory.asKey("myGroup")).andThreadPoolPropertiesDefaults(HystrixThreadPoolProperties.Setter().withCoreSize(5)).andCommandPropertiesDefaults(                HystrixCommandProperties.Setter().withCircuitBreakerErrorThresholdPercentage(50).withCircuitBreakerRequestVolumeThreshold(20).withCircuitBreakerSleepWindowInMilliseconds( 72000));

            HystrixExternalCommand myHystrixCommand = new HystrixCommand( group  );             
             myresponse = myHystrixCommand.execute();

From recording its looking like Stack Trace "rx.observers.Subscribers$5.onNext(Object)" is called 550 times in a min which is accounting for 38% of cpu usage.
recording

@kennedyoliveira
Copy link
Contributor

@ssakshi0302 i think you are misinterpreting the percentage value, this % is not the total amount of CPU used overall, it's the total usage from the overall, i think i'm not explaining it clear, i'll try to exemplify:
Let's say your code used 5% of your CPU, and looking at the table there it's saying 38% was spent on this method, so it's 38% from the 5%.

First, check in the General tab and see how the CPU was used, and after if it's really using too much CPU then you could check on the Code section.

@ssakshi0302
Copy link
Author

ssakshi0302 commented Nov 18, 2016

Hi ,the average cpu usage is 89.8%. As shown in the earlier sceenshot ,
rx.internal.operators are accounting for 38% of this 89.8% and also if
wecalculate the total % by rx packages i.e (rx.internal.operators
,rx.subjects, rx.internal.util.unsafe, rx.observers) it will come around
(38.25 + 14.395 + 7.928 + 2.92 ) i.e 63.4% of 89.8%.

@kennedyoliveira
Copy link
Contributor

@ssakshi0302 maybe i'm missing something, but i can't see your cpu average in the screenshot, you should be looking at:

oracle_java_mission_control

@ssakshi0302
Copy link
Author

general_tab
Here it is :)

@kennedyoliveira
Copy link
Contributor

@ssakshi0302 now i see haha, thanks, this looks trick, what are you hystrix command doing? Executing some i/o task or something else?

@ssakshi0302
Copy link
Author

ssakshi0302 commented Nov 18, 2016

Its just a simple microservice call to a server . I am wondering if something is wrong with my metrics which is causing so . I am attaching memory screenshot also. You may find it useful.
memory_usage

allocationbythread
allocation_profile

@ssakshi0302 ssakshi0302 changed the title Too many RXComputationalScheduler threads getting created in hystrix cpu% high Nov 18, 2016
@ssakshi0302
Copy link
Author

@kennedyoliveira : Any clue on this?

@kennedyoliveira
Copy link
Contributor

@ssakshi0302 apologies for the delay, it's look trick to me, i'm not the best person to help with metrics, since i don't know the internals, @mattrjacobs can you help him with his doubt on it?

which hystrix version are you using?

@ssakshi0302
Copy link
Author

Hey I am using version 1.5.3

On Sun, 20 Nov 2016 at 2:01 AM, Kennedy Oliveira notifications@github.com
wrote:

@ssakshi0302 https://github.com/ssakshi0302 apologies for the delay,
it's look trick to me, i'm not the best person to help with metrics, since
i don't know the internals, @mattrjacobs https://github.com/mattrjacobs
can you help him with his doubt on it?

which hystrix version are you using?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1424 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AImAlm3wBeD9BbbVEuhJSxhOLtmKzrdgks5q_1yWgaJpZM4K2s_L
.

@kennedyoliveira
Copy link
Contributor

@ssakshi0302 could you try to update to 1.5.8? I remember some weird problems in an old version, but i'm not sure if was about cpu performance.

@ssakshi0302
Copy link
Author

@kennedyoliveira : Thanks Oliveira , I will change the version but I wanted to understand why this could have occured and what does this stack indicate. Was it due to any flaw in metrics ?

@ssakshi0302
Copy link
Author

ssakshi0302 commented Nov 21, 2016

@mattrjacobs : Hi , What does the thread RxComputaionalScheduler do?I ould not find much in documentation. Can you please help us out here as we are facing production issues

@kennedyoliveira
Copy link
Contributor

@ssakshi0302 i don't know if there are flaw in the metrics, because i don't know how the metrics works inside, i mean the implementation details, matt will help you better with that.

About the RxComputationalScheduler, is a fixed size scheduler with total threads equal as total core cpus on your machine, intended to be used for doing computational tasks, it's from RxJava.

In your screenshot from the allocation tab, could you print the stacktrace all the way up? Because it shows the internal operators calls but not who called it, may give a hint.

@ssakshi0302
Copy link
Author

@kennedyoliveira : Can you mention some email id , so that I can send flight recording itself. It will help you analyse better.

@mattrjacobs
Copy link
Contributor

Rx Computation threads are set up on any JVM which uses RxJava. Since Hystrix uses RxJava, the threads get set up. Within Hystrix they are used for metrics collection/aggregation, which happens in the background. In our internal systems, we've never seen any performance issues from their usage.

In the attached thread dump, none of the RxComputation threads were busy, so that does not indicate over-usage of those threads.

Please attach a FlightRecorder trace to this issue and I can take a look.

@ssakshi0302
Copy link
Author

stack.txt

@mattrjacobs
Copy link
Contributor

That is a stack trace for what is allocation memory to TLABs. From the rest of the discussion, I thought your concern was around CPU usage?

Can you please attach the entire jfr file?

@ssakshi0302
Copy link
Author

Hey, I am not able to attach any jfr file in github, I would be able to
share that through email only. And i sent this memory allocation stack in
reponse to kennedy's comment. And yes, the concern here is cpu usage :)

On Mon, Nov 21, 2016 at 11:46 PM, Matt Jacobs notifications@github.com
wrote:

That is a stack trace for what is allocation memory to TLABs. From the
rest of the discussion, I thought your concern was around CPU usage?

Can you please attach the entire jfr file?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1424 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AImAll8cgnCqXFmpsM8WKFru9E-0iaxQks5rAd_9gaJpZM4K2s_L
.

@ssakshi0302
Copy link
Author

flight_recording_3002371128084_6 (copy).txt

@mattrjacobs : Hi ,
I am attaching jfr file as txt file . Please change the extension to jfr and try

@ssakshi0302
Copy link
Author

@mattrjacobs : Any update ?

@mattrjacobs
Copy link
Contributor

I took a look through the JFR you attached, and it looks as you described. A couple questions:

  1. How many commands are in your system?
  2. How many thread pools are in your system?
  3. Have you changed any configuration from defaults (specifically around metrics)?

From my initial look, all of these code paths look valid, but it's not clear to me why they're taking as much CPU as they are. That hasn't been the case for the production system I work on, nor in the jmh tests that ship with Hystrix.

Could you run the jmh tests and take a flight-recorder trace during these?

@mattrjacobs
Copy link
Contributor

Closing due to inactivity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants