ARTEMIS-1999 Broker uses 100% core's CPU time if msg grouping is used #2203
@clebertsuconic Please check whether the change breaks any other logic, and @michaelandrepearce whether exclusive consumers could be affected by a similar issue (IMO that shouldn't be the case).
Would a test be possible? Without a test I don't know how to validate the change, TBH. At least, were you able to run the whole test suite?
I have already run the entire test suite, which is already full of AMQP and CORE JMS message group tests, AFAIK.
@franz1981 make a test that will exercise the loop. Try to synchronize on the Queue... if the test hangs, it's a bug. Use a timeout tag on the test.
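One way to read the suggestion above, as a sketch only and not the test that was added in this PR: try to enter the queue's monitor from a separate thread and treat a missed deadline as a failure. All names below (`DeliverTimeoutSketch`, `monitorReachable`, `holdMonitor`) are hypothetical, a plain `Object` stands in for the Queue, and a thread that simply keeps the monitor stands in for a deliver loop that never leaves its synchronized block.

```java
import java.util.concurrent.*;

final class DeliverTimeoutSketch {

    // Returns true if a fresh thread can enter the monitor of `lock` within the timeout.
    static boolean monitorReachable(Object lock, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "monitor-probe");
            t.setDaemon(true); // a probe that stays blocked must not keep the JVM alive
            return t;
        });
        Future<?> probe = executor.submit(() -> {
            synchronized (lock) {
                // entering the monitor is the whole check
            }
        });
        try {
            probe.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the monitor was not released in time
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            probe.cancel(true);
            executor.shutdownNow();
        }
    }

    // Hypothetical stand-in for a stuck deliver loop: a daemon thread takes the
    // monitor and keeps it until the returned latch is counted down.
    static CountDownLatch holdMonitor(Object lock) {
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException ignored) {
                    // fall through and release the monitor
                }
            }
        }, "stuck-deliver-loop");
        holder.setDaemon(true);
        holder.start();
        try {
            held.await(); // make sure the monitor is really taken before returning
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return release;
    }

    public static void main(String[] args) {
        Object queueLock = new Object(); // stand-in for the Queue instance

        System.out.println(monitorReachable(queueLock, 1000)); // prints "true"

        CountDownLatch release = holdMonitor(queueLock);
        System.out.println(monitorReachable(queueLock, 200));  // prints "false"
        release.countDown();
    }
}
```

In a JUnit test the same idea is usually expressed with the framework's own timeout support instead of a hand-rolled probe; the explicit version is used here only to keep the sketch free of dependencies.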
@franz1981 for exclusive, yes, I think it would have a similar issue; after all, it follows the same logic as message groups in part. It would be easy to fix: in the same place / if statement, just check for the exclusive flag. (The best way to confirm this would be to extend the test case @clebertsuconic is suggesting to do the same with exclusive.) Btw, your fix assumes a message group has already been assigned. What if all consumers are busy and a message group isn't assigned? It would still spin, I assume. I would probably change your check to test whether groupID is set or not.
@@ -2370,10 +2370,10 @@ private void deliver() {
}
}

if (pos == endPos) {
// Round robin'd all
if (pos == endPos || groupConsumer != null) {
groupConsumer is only set if a message group is already assigned or has been successfully handled. I would change this to check whether groupID is not null.
Nice. I suppose it would be better to raise a separate issue/PR for that, even if I'm tempted to do it in this one: it is indeed fixing a similar but different issue.
If the message group isn't assigned, it will round robin between the consumers until noDeliver == size: in that case it will stop spinning without burning any CPU, because deliverAsync won't be called anymore.
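The behaviour discussed above can be illustrated with a toy model. This is a sketch under stated assumptions, not the broker's `deliver()` code: `DeliverLoopSketch`, `spinCount` and its parameters are hypothetical; every consumer is assumed to be busy (back-pressure); the position is assumed not to advance while a group consumer is in use; and an iteration cap stands in for the real loop being rescheduled over and over.

```java
final class DeliverLoopSketch {

    /**
     * Counts loop iterations until the model gives up, capped at maxIterations.
     *
     * @param size           number of registered consumers (all assumed busy)
     * @param groupAssigned  true if the message's group is already bound to one consumer
     * @param groupAwareExit true to stop as soon as the group consumer was found busy
     */
    static int spinCount(int size, boolean groupAssigned, boolean groupAwareExit, int maxIterations) {
        int pos = 0;
        int endPos = size - 1; // position that closes one full round robin
        int noDelivery = 0;    // failed delivery attempts in the current round
        int iterations = 0;

        while (iterations < maxIterations) {
            iterations++;
            noDelivery++; // the chosen consumer is busy, nothing is delivered

            if (groupAssigned && groupAwareExit) {
                // Only the group consumer may take this message and it is busy,
                // so trying again right away cannot make progress.
                break;
            }
            if (pos == endPos) {
                // Round robin'd all consumers
                if (noDelivery == size) {
                    break; // nobody could take a message, stop trying
                }
                noDelivery = 0;
            }
            if (!groupAssigned) {
                pos++; // assumption: the position moves only without a group consumer
            }
            if (pos >= size) {
                pos = 0;
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(spinCount(3, false, false, 1000)); // prints 3: one full round, then exit
        System.out.println(spinCount(3, true, false, 1000));  // prints 1000: only the cap stops it
        System.out.println(spinCount(3, true, true, 1000));   // prints 1: exits after the busy group consumer
    }
}
```

In this model the ungrouped case stops after one full round because `noDelivery` reaches `size`, which matches the explanation above. The grouped case never reaches `endPos` when more than one consumer is registered, so it only stops once the exit condition takes the group consumer into account.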
@franz1981 once you do the fix for this with a test and it is merged, I'll fix exclusive quickly, as then I can just rip off your hard work :P :P :P Btw, how did you get a clean PR build, without the damn MultiThreadAsynchronousFileTest failing like it has been for most PRs recently?
@michaelandrepearce ahah fine!
That's a nice question indeed... isn't it an intermittent failure? Probably I've been lucky... or unlucky, depending on the point of view :P
@michaelandrepearce would it be too much of a hack to add a -Ptravis profile and add a property to ignore those tests? Those tests run at least daily on my CI and they never fail. It's probably a question of limited resources or the use of hypervisors on a public cloud. We don't actually need those to validate PR builds.
@franz1981 so are you adding a test?
@clebertsuconic before I went on holiday it didn't seem to error as much as it does now. I wonder if some recent merge has destabilized the build, and this test case in particular? I would worry about ignoring it: it's a concurrency test, so it may actually be highlighting an issue that's been introduced by some recent change / merge. Have there been any changes around the journals?
@michaelandrepearce there are no changes around the journal... We already have a profile that ignores a lot of tests, and anyone running the full test suite would still be able to capture regressions. I will take a look at the failures, but if this is indeed environmental we should move them to the ignore list of -Pfast-tests.
@franz1981 please watch the examples. On my private CI an example failed on your branch.
@clebertsuconic I have tried both the message-group and message-group2 examples and they are working: which examples are not working for you?
149dd7f to 226d12f
@franz1981 can you rebase here? What should we do with this?
Just one nit-pick: this test should be in QueueImplTest, away from timing (integration-tests). Can you move it?
The deliver loop won't give up trying to deliver messages when back-pressure kicks in (credits and/or TCP) if msg grouping is used and there are many consumers registered: this change will allow the loop to exit by instructing the logic that the group consumer is the only consumer to check.
@franz1981 I have moved this into 2.6.x, but I had to make the changes manually. Can you check 2.6.x please?