CAMEL-19815: use a ConcurrentHashMap instead of a blanket synchronize#11279

Closed
orpiske wants to merge 2 commits into apache:main from orpiske:use-concurrent-hm

Conversation

@orpiske
Contributor

@orpiske orpiske commented Sep 1, 2023

This should help offset the performance impacts of removing the cached QueueReference on d66469d

4.1 (this patch)

Benchmark                                         Mode  Cnt    Score    Error   Units
SedaRoundTripTest.sendBlocking                   thrpt   10  925.275 ±  1.146  ops/ms
SedaRoundTripTest.sendBlockingWithMultipleTypes  thrpt   10  229.778 ±  0.107  ops/ms
SedaRoundTripTest.sendBlockingWithMultipleTypes   avgt   10    0.004 ±  0.001   ms/op
SedaRoundTripTest.sendBlocking                      ss   10    0.163 ±  0.035   ms/op
SedaRoundTripTest.sendBlockingWithMultipleTypes     ss   10    0.570 ±  0.047   ms/op

4.1 (without the patch)

Benchmark                                         Mode  Cnt    Score    Error   Units
SedaRoundTripTest.sendBlocking                   thrpt   10  915.329 ±  0.320  ops/ms
SedaRoundTripTest.sendBlockingWithMultipleTypes  thrpt   10  225.853 ±  0.459  ops/ms
SedaRoundTripTest.sendBlockingWithMultipleTypes   avgt   10    0.005 ±  0.001   ms/op
SedaRoundTripTest.sendBlocking                      ss   10    0.167 ±  0.025   ms/op
SedaRoundTripTest.sendBlockingWithMultipleTypes     ss   10    0.626 ±  0.222   ms/op

4.0 (baseline)

Benchmark                                         Mode  Cnt    Score    Error   Units
SedaRoundTripTest.sendBlocking                   thrpt   10  922.456 ±  0.462  ops/ms
SedaRoundTripTest.sendBlockingWithMultipleTypes  thrpt   10  242.381 ±  0.107  ops/ms
SedaRoundTripTest.sendBlockingWithMultipleTypes   avgt   10    0.004 ±  0.001   ms/op
SedaRoundTripTest.sendBlocking                      ss   10    0.162 ±  0.022   ms/op
SedaRoundTripTest.sendBlockingWithMultipleTypes     ss   10    0.573 ±  0.044   ms/op

@github-actions
Contributor

github-actions bot commented Sep 1, 2023

🌟 Thank you for your contribution to the Apache Camel project! 🌟

🐫 Maintainers, please note that first-time contributors require manual approval for the GitHub Actions to run.

⚠️ Please note that the changes on this PR may be tested automatically if they change components.

🤖 Use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot.

If necessary Apache Camel Committers may access logs and test results in the job summaries!

ref = new QueueReference(queue, size, multipleConsumers);
QueueReference ref = new QueueReference(queue, size, multipleConsumers);
ref.addReference(endpoint);
getQueues().put(key, ref);
Contributor

Maybe we should consider creating the queue and the queue reference in a computeIfAbsent to prevent concurrent creation for the same key
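
The idea above can be sketched as follows (a minimal, hypothetical example; the class and method names are simplified stand-ins for Camel's actual registry code). `computeIfAbsent` runs the mapping function at most once per key, so two threads resolving the same key can never create two queues:

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: create one queue per key atomically,
// without a method-level synchronized block.
class QueueRegistrySketch {
    private final Map<String, BlockingQueue<Object>> queues = new ConcurrentHashMap<>();

    BlockingQueue<Object> getQueue(String key) {
        // The mapping function runs at most once per key, even under
        // concurrent calls, so duplicate creation cannot happen.
        return queues.computeIfAbsent(key, k -> new LinkedBlockingQueue<>());
    }
}
```

Repeated calls with the same key return the same queue instance, which is the invariant the synchronized block was protecting.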

if (ref.getSize() != null) {
setSize(ref.getSize());
synchronized (this) {
if (queue == null) {
Contributor

Warning, queue is not volatile so this is not thread-safe
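
To illustrate the warning with a generic sketch (not the actual Camel code): in the double-checked locking pattern, the lazily initialized field must be declared `volatile`, otherwise the unsynchronized first read may observe a partially constructed object:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Generic double-checked locking sketch (not the actual Camel code).
class LazyQueueHolder {
    // Without 'volatile', the unsynchronized read in getQueue() could
    // observe a reference to a not-yet-fully-constructed queue.
    private volatile Queue<String> queue;

    Queue<String> getQueue() {
        Queue<String> q = queue;            // first check, no lock
        if (q == null) {
            synchronized (this) {
                q = queue;                  // second check, under the lock
                if (q == null) {
                    q = new ConcurrentLinkedQueue<>();
                    queue = q;              // safe publication via volatile write
                }
            }
        }
        return q;
    }
}
```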

@essobedo
Contributor

essobedo commented Sep 1, 2023

Maybe I misunderstood your results, but I don't see a really significant improvement. Could you please clarify your results a bit?

@orpiske orpiske marked this pull request as draft September 1, 2023 14:54
@orpiske
Contributor Author

orpiske commented Sep 1, 2023

Marking it as a draft as I want to go through @essobedo's comments next week.

@essobedo
Contributor

essobedo commented Sep 1, 2023

Another question: is there a real-life use case where we heavily create queues? Otherwise, I don't really see the need for improving this.

Do you have profiler reports indicating that this part of the code is a hot spot?

@orpiske
Contributor Author

orpiske commented Sep 1, 2023

Maybe I misunderstood your results, but I don't see a really significant improvement. Could you please clarify your results a bit?

The key ones are the top two in each table. They are measured in operations per millisecond. Even though the difference seems small, when spread over the course of a few minutes or hours it results in quite a lot more exchanges going through the core.

@essobedo
Contributor

essobedo commented Sep 1, 2023

You mean that we need to compare 925 ops/ms for 4.1 with 922 ops/ms for 4.0 in case of SedaRoundTripTest.sendBlocking and 229 ops/ms for 4.1 with 242 ops/ms for 4.0 in case of SedaRoundTripTest.sendBlockingWithMultipleTypes, right?

}
return ref;
}

Contributor

Can the 'synchronized' keyword in the 'registerQueue' method also be eliminated? I believe you can utilize 'computeIfAbsent,' as suggested by @essobedo in the comment above.
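
A hedged sketch of what a lock-free registerQueue could look like (simplified, hypothetical stand-ins for Camel's QueueReference and the endpoint argument): queue creation goes inside `computeIfAbsent`, and the reference count uses an `AtomicInteger`, so the increment needs no lock either:

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical stand-in for Camel's queue registry.
class SedaRegistrySketch {
    static final class QueueReference {
        final BlockingQueue<Object> queue;
        private final AtomicInteger refCount = new AtomicInteger();

        QueueReference(BlockingQueue<Object> queue) { this.queue = queue; }
        void addReference() { refCount.incrementAndGet(); }
        int references() { return refCount.get(); }
    }

    private final Map<String, QueueReference> queues = new ConcurrentHashMap<>();

    QueueReference registerQueue(String key) {
        // Creation is atomic per key; no method-level synchronized needed.
        QueueReference ref = queues.computeIfAbsent(
                key, k -> new QueueReference(new LinkedBlockingQueue<>()));
        ref.addReference(); // atomic increment, safe after publication
        return ref;
    }
}
```

Whether the reference count may be incremented outside the lambda depends on the real code's invariants; this only shows that the map update itself does not need `synchronized`.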

@orpiske
Contributor Author

orpiske commented Sep 1, 2023

Another question: is there a real-life use case where we heavily create queues? Otherwise, I don't really see the need for improving this.

The getQueueReference method is in the hot path for the SedaProducer's addToQueue.

Do you have profiler reports indicating that this part of the code is a hot spot?

Absolutely.

[Profiler screenshots: 2023-09-01 at 17:06:51 and at 17:07:03]

@orpiske
Contributor Author

orpiske commented Sep 1, 2023

You mean that we need to compare 925 ops/ms for 4.1 with 922 ops/ms for 4.0 in case of SedaRoundTripTest.sendBlocking and 229 ops/ms for 4.1 with 242 ops/ms for 4.0 in case of SedaRoundTripTest.sendBlockingWithMultipleTypes, right?

Exactly. This tries to bring the code closer to what we had in 4.0, as I had to revert one of the performance improvements we had there.

BTW, answering your other question: "Is there a real-life use case where we heavily create queues".

The answer is yes, we do. Basically, that patch for 4.0 was breaking a scenario for a user who was heavily creating queues.

@essobedo
Contributor

essobedo commented Sep 1, 2023

Interesting. By chance, do you have the HTML version of the reports so that I can zoom in?

@orpiske
Contributor Author

orpiske commented Sep 1, 2023

Obs.: please do feel free to review. I'll go through your review comments next week. For now I'm just answering the non-review questions, as my brain is basically jelly after this week.

@orpiske
Contributor Author

orpiske commented Sep 1, 2023

Interesting. By chance, do you have the HTML version of the reports so that I can zoom in?

Here's one: http://angusyoung.org/arquivos/tmp/profile-camel-19815-with-perf-v6-4.1.0-SNAPSHOT-2023-09-01-043702.html

Obs.: just keep in mind that this is only one of the many scenarios I run (and NOT generated from the same JMH results I pasted; it's just the only one I have readily available). If you want a different one (e.g. uncontended SEDA), drop me a note next week and I can send it to you (or point you to how you can generate your own).

@essobedo
Contributor

essobedo commented Sep 1, 2023

You mean that we need to compare 925 ops/ms for 4.1 with 922 ops/ms for 4.0 in case of SedaRoundTripTest.sendBlocking and 229 ops/ms for 4.1 with 242 ops/ms for 4.0 in case of SedaRoundTripTest.sendBlockingWithMultipleTypes, right?

Exactly. This tries to bring the code closer to what we had in 4.0, as I had to revert one of the performance improvements we had there.

BTW, answering your other question: "Is there a real-life use case where we heavily create queues".

The answer is yes, we do. Basically, that patch for 4.0 was breaking a scenario for a user who was heavily creating queues.

If so, if you look closely, the second case is better with Camel 4.0 and the first one is only slightly better with Camel 4.1, but with a higher error (1.146 vs 0.462), which means that the results are more stable in Camel 4.0.

I'm wondering whether an improvement of less than 1% is significant enough to be considered a real improvement, as the results may vary a lot from one run to another.

BTW, I've just realized that you set forks to 1; maybe you should retry with 3 to get more reliable results?

@orpiske
Contributor Author

orpiske commented Sep 1, 2023

You mean that we need to compare 925 ops/ms for 4.1 with 922 ops/ms for 4.0 in case of SedaRoundTripTest.sendBlocking and 229 ops/ms for 4.1 with 242 ops/ms for 4.0 in case of SedaRoundTripTest.sendBlockingWithMultipleTypes, right?

Exactly. This tries to bring the code closer to what we had in 4.0, as I had to revert one of the performance improvements we had there.
BTW, answering your other question: "Is there a real-life use case where we heavily create queues".
The answer is yes, we do. Basically, that patch for 4.0 was breaking a scenario for a user who was heavily creating queues.

If so, if you look closely, the second case is better with Camel 4.0 and the first one is only slightly better with Camel 4.1, but with a higher error (1.146 vs 0.462), which means that the results are more stable in Camel 4.0.

I'm wondering whether an improvement of less than 1% is significant enough to be considered a real improvement, as the results may vary a lot from one run to another.

BTW, I've just realized that you set forks to 1; maybe you should retry with 3 to get more reliable results?

Normally that would be the case (i.e. if we were comparing results from the same test class). However, these tests already run as forked executions: the iterations for the different versions run separately (the automation runs the tests for one version and then for the other, not on the same JVM).

@essobedo
Contributor

essobedo commented Sep 1, 2023

Interesting. By chance, do you have the HTML version of the reports so that I can zoom in?

Here's one: http://angusyoung.org/arquivos/tmp/profile-camel-19815-with-perf-v6-4.1.0-SNAPSHOT-2023-09-01-043702.html

Obs.: just keep in mind that this is only one of the many scenarios I run (and NOT generated from the same JMH results I pasted; it's just the only one I have readily available). If you want a different one (e.g. uncontended SEDA), drop me a note next week and I can send it to you (or point you to how you can generate your own).

Thx for sharing. Do you have the same kind of report for 4.0 (the baseline)?

@orpiske
Contributor Author

orpiske commented Sep 1, 2023

Interesting. By chance, do you have the HTML version of the reports so that I can zoom in?

Here's one: http://angusyoung.org/arquivos/tmp/profile-camel-19815-with-perf-v6-4.1.0-SNAPSHOT-2023-09-01-043702.html
Obs.: just keep in mind that this is only one of the many scenarios I run (and NOT generated from the same JMH results I pasted; it's just the only one I have readily available). If you want a different one (e.g. uncontended SEDA), drop me a note next week and I can send it to you (or point you to how you can generate your own).

Thx for sharing. Do you have the same kind of report for 4.0 (the baseline)?

Not right now, as I am off work, but on Monday I'll take a look at the review comments you raised (along with the ones from @rhuan080), adjust the code, and regenerate all the data. I'll post the updated report so you guys can look at it.

This should help offset the performance impacts of removing the cached QueueReference on d66469d
…/unboxing

WIP: removing auto-boxing/unboxing
@orpiske
Contributor Author

orpiske commented Sep 5, 2023

Let's put this one on pause for now. I tried with the suggestions, but the performance wasn't better:

4.1.0-SNAPSHOT Updated

SedaRoundTripTest.sendBlocking                   thrpt   10  914.608 ±  0.545  ops/ms
SedaRoundTripTest.sendBlockingWithMultipleTypes  thrpt   10  232.482 ±  0.419  ops/ms

I'll get back to this one with a clearer head in a couple of weeks, after I dispatch other higher priority items.

Edit: tracking this on CAMEL-19830.

@orpiske orpiske closed this Sep 5, 2023
@orpiske orpiske deleted the use-concurrent-hm branch April 10, 2024 14:46