Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CAMEL-19014 - Fixing issue about concurrence on the SimpleLanguage #9332

Closed
wants to merge 10 commits into from

Conversation

rhuan080
Copy link
Contributor

@rhuan080 rhuan080 commented Feb 10, 2023

Signed-off-by: Rhuan Rocha rhuan080@gmail.com

Description

Fixing the issue CAMEL-19014

Target

  • I checked that the commit is targeting the correct branch (note that Camel 3 uses camel-3.x, whereas Camel 4 uses the main branch)

Tracking

  • If this is a large change, bug fix, or code improvement, I checked there is a JIRA issue filed for the change (usually before you start working on it).

Apache Camel coding standards and style

  • I checked that each commit in the pull request has a meaningful subject line and body.
  • I formatted the code using mvn -Pformat,fastinstall install && mvn -Psourcecheck

Signed-off-by: Rhuan Rocha <rhuan080@gmail.com>
@github-actions
Copy link
Contributor

🌟 Thank you for your contribution to the Apache Camel project! 🌟

⚠️ Please note that the changes on this PR may be tested automatically.

If necessary Apache Camel Committers may access logs and test results in the job summaries!

@github-actions github-actions bot added the core label Feb 10, 2023
Copy link
Contributor

@essobedo essobedo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for the PR, but first we need to agree that the use case is worth being fixed, I personally tend to say it doesn't as it will add some overhead to fix a very uncommon use case.

If we decide to fix it, this fix has to be reviewed because it will have a dramatic impact on the performance, especially in the evaluate method which is frequently called, and itself calls evaluate on children's expressions. This fix assumes that evaluating an expression is very fast but in practice, we don't know, indeed if we call a method that is very slow for some reason (IO or CPU), it will block parallel evaluations just to fix once again a non-standard use case.

Signed-off-by: Rhuan Rocha <rhuan080@gmail.com>
@rhuan080
Copy link
Contributor Author

Hi @essobedo, thank you for reviewing the PR.

I have changed the synchronized block to lock. Now the evaluate method tries to get the ReadLock that does not block the thread for multiple ReadLock. The init method is using the WriteLock, thus the evaluate method is blocked only in case of a WriteLock on init method. It avoids the evaluate method to be blocked unnecessarily.

I think we should develop this method as thread-safe, as it is cached and can be used in a multithreaded task (like parallel or some internal multithread task). Thus, to avoid an issue now or in future, I think it should be immutable or thread-safe

@essobedo
Copy link
Contributor

I have changed the synchronized block to lock. Now the evaluate method tries to get the ReadLock that does not block the thread for multiple ReadLock. The init method is using the WriteLock, thus the evaluate method is blocked only in case of a WriteLock on init method. It avoids the evaluate method to be blocked unnecessarily.

It if needs to be fixed, it is not the right way to fix it, you don't even need locks here. If you really want to fix it, you will have to rewrite the code of the concatExpression, to have 2 classes instead of one (for constants only and not only constants). The problem is with the case of the not only constants, for it, you simply need to set the content of col once for all while initializing the instance instead of doing it in the init method. Moreover, you would need to add a unit test to prove that the bug is really fixed, see TDD.

I think we should develop this method as thread-safe, as it is cached and can be used in a multithreaded task (like parallel or some internal multithread task). Thus, to avoid an issue now or in future, I think it should be immutable or thread-safe

This decision doesn't really belong to us, especially if we disagree, I guess that someone like @davsclaus or @oscerd could eventually decide. But, here, according to your stack trace, if we get this problem it is because the expressions are misused such that the threads spend their time to re-initialize the same expression over and over again instead of only once at startup as the framework does, I don't believe that for a production-ready code, we would really want this behavior, so it tends to prove that it is more a misuse than a bug but it is only my point of view, let's see what others think.

@rhuan080
Copy link
Contributor Author

I agree about the TDD, and I can add it here, but the project does not use TDD. The proof is we have a few tests in this module, but I agree we should add unit tests when submitting a PR and I can do that. However, I'm not sure the right way to solve that is by creating 2 classes. My point is the Expression object can be shared between threads and we have a cache of this object. Thus, in my opinion, an object that keeps states and works (or can work) inside a parallel context should be predicted concurrence problems. The best approach for me is to use an immutable object, however, if it is not possible I think we should consider using a lock or synchronized block.

It if needs to be fixed, it is not the right way to fix it, you don't even need locks here. If you really want to fix it, you will have to rewrite the code of the concatExpression, to have 2 classes instead of one (for constants only and not only constants). The problem is with the case of the not only constants, for it, you simply need to set the content of col once for all while initializing the instance instead of doing it in the init method. Moreover, you would need to add a unit test to prove that the bug is really fixed, see TDD.

Yes, I agree! I'm putting my point and analyzing your point. This PR is good to drive this discussion.

This decision doesn't really belong to us, especially if we disagree, I guess that someone like @davsclaus or @oscerd could eventually decide. But, here, according to your stack trace, if we get this problem it is because the expressions are misused such that the threads spend their time to re-initialize the same expression over and over again instead of only once at startup as the framework does, I don't believe that for a production-ready code, we would really want this behavior, so it tends to prove that it is more a misuse than a bug but it is only my point of view, let's see what others think.

@essobedo
Copy link
Contributor

The best approach for me is to use an immutable object, however, if it is not possible I think we should consider using a lock or synchronized block.

Actually, immutability is the idea behind what I proposed

@rhuan080
Copy link
Contributor Author

Understood and I agree. However, I think we don't need 2 classes, but just rewrite the concatExpression to have 1 class immutable. What do you think about it?

In the case of 2 classes, we will need to define when using immutable e when using mutable. For me, it is an unneeded complex.

The best approach for me is to use an immutable object, however, if it is not possible I think we should consider using a lock or synchronized block.

Actually, immutability is the idea behind what I proposed

@rhuan080
Copy link
Contributor Author

Hi @essobedo reviewing the class, I agree with you. This class already knows if the expression is constant or not. Thus, for me you are right! Thanks a lot.

I'll write the unit test and use the approach you said, and we can discuss it again.

@essobedo
Copy link
Contributor

Well, we don't necessarily need 2 classes, but the code will be cleaner and thus easier to maintain, it will then follow the Single-responsibility principle.

Regarding, the unit test, you can simply adapt what the OP provided as it helps to reproduce the bug.

@oscerd
Copy link
Contributor

oscerd commented Feb 14, 2023

From what I read this is really a corner case. I'm not sure we should fix that.

@rhuan080
Copy link
Contributor Author

Hi @oscerd,

Yes, I agree the way this was figured out is a corner case. However the Expression is used inside a parallel and multithreaded context, but it is not immutable nor thread-safe. Thus, I think we can use an immutable implementation to avoid problems. As it is a good practice to parallel processes, I think it is an opportunity to improve it. I agreed with @essobedo about the lock and the tradeoff of this solution with lock, however, using an immutable implementation we don't have this tradeoff and we avoid a problem about concurrency or rance condition.

What do you think about?

@essobedo
Copy link
Contributor

@rhuan080 +1 I agree, if we can end up with a solution based on immutable classes, it is a good tradeoff

@oscerd
Copy link
Contributor

oscerd commented Feb 14, 2023

Using immutable classes sounds good and I'm +1 on that

Signed-off-by: Rhuan Rocha <rhuan080@gmail.com>
@rhuan080
Copy link
Contributor Author

rhuan080 commented Feb 14, 2023

Hi,

I added the unit test. Now, I'll rewrite the concatExpression to be immutable. @essobedo and @oscerd If you have questions or improvements about the unit test, please, tell me :-)

@essobedo
Copy link
Contributor

Hi,

I added the unit test. Now, I'll rewrite the concatExpression to be immutable. @essobedo and @oscerd If you have questions or improvements about the unit test, please, tell me :-)

I don't understand why you don't use the test case provided by the OP?

@rhuan080
Copy link
Contributor Author

rhuan080 commented Feb 14, 2023

The test provided uses components that are not available in this module. Thus, I created the test using what is available inside the camel-support module. I have reproduced the issue with my test.

@essobedo
Copy link
Contributor

The test provided uses components that are not available in this module. Thus, I created the test using what is available inside the camel-support module. I have reproduced the issue with my test.

You can put it in this package https://github.com/apache/camel/tree/main/core/camel-core/src/test/java/org/apache/camel/language

Signed-off-by: Rhuan Rocha <rhuan080@gmail.com>
Signed-off-by: Rhuan Rocha <rhuan080@gmail.com>
@rhuan080
Copy link
Contributor Author

Hi @essobedo!

Added the test according to your suggestion.

Copy link
Contributor

@essobedo essobedo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thx for the uni test 🙏 Some small remarks

…ct to Concate Expression optimized

Signed-off-by: Rhuan Rocha <rhuan080@gmail.com>
@rhuan080
Copy link
Contributor Author

Hi @essobedo, I'll fix the unit tests. Thanks a lot!

I have committed the refractory of the concatExpression and created one method to build the optimized expression (immutable) and the unoptimized expression. Let me know if you have any questions about the solution.

Signed-off-by: Rhuan Rocha <rhuan080@gmail.com>
Copy link
Contributor

@essobedo essobedo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's more or less what I have in mind, but it makes me realize that it is more complex than I thought, see my comments for details


@Override
public void init(CamelContext context) {
boolean constant = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed

Collection<Object> preprocessedExpression = new ArrayList<>(expressions.size());
for (Expression expression : expressions) {
if (expression instanceof ConstantExpressionAdapter) {
expression.init(context);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not realize that for some ConstantExpressionAdapter we had to initialize it first to get its value, I don't know if it is acceptable to call init so early? @oscerd WDYT?

If it is not acceptable, the other way could be to use a volatile boolean field to know if the expression has been initialized or not and use the Double-checked locking approach to ensure that it is initialized once. Even if it adds a locking mechanism it is only during the initialization phase which happens only once per expression when properly used so the overhead is limited.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @essobedo, Looking at the ConstantExpression, I think that calling the init so early is not a problem. Waiting for the @oscerd to confirm it for us.

About the lock, the problem is the evaluate method being called concurrently with the initcalling. Thus, I think the lock should be put at the evaluate method as well. However, we came back to the penalty we said before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the lock, the problem is the evaluate method being called concurrently with the initcalling. Thus, I think the lock should be put at the evaluate method as well. However, we came back to the penalty we said before.

To be thread-safe, we have to consider the state changes and the visibility of those changes. Here only the init method affects the state and the init method is always called before evaluate so applying a double-check locking approach in the init method is good enough as it ensures that the state is changed only once at the very first call and that all threads thanks to a volatile read or synchronized block will always have the right state before calling evaluate. I hope it is clear enough.

@davsclaus
Copy link
Contributor

be careful to not overcomplicate this - the init of services in Camel are used during bootstrap and are called by Camel in sequence (no locking needed).

We should not do special weird code for one corner case with simple language that been in Camel since 1.x.

If you can come up with a better parser (then yeah) but do not sacrifice the init service lifecycle of Camel.

Signed-off-by: Rhuan Rocha <rhuan080@gmail.com>
@rhuan080
Copy link
Contributor Author

Hi @essobedo and @davsclaus. I fixed the init. I think just checking if the collection was initiated solves the question. Furthermore, the solution turns the code cleaner as now we have one implementation to optimized Expression and another one to unoptimized Expression.

for (Expression expression : expressions) {
if (expression instanceof ConstantExpressionAdapter) {
expression.init(context);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's already initialized so it is not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My fault. I'm sorry. Fixed.

Signed-off-by: Rhuan Rocha <rhuan080@gmail.com>
Copy link
Contributor

@essobedo essobedo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 Even if the ExpressionAdapter of concatExpressionOptimized is not thread-safe, I believe that it is a good tradeoff compared to not fixing at all

@rhuan080 rhuan080 closed this Feb 19, 2023
@rhuan080 rhuan080 reopened this Feb 19, 2023
@github-actions
Copy link
Contributor

🌟 Thank you for your contribution to the Apache Camel project! 🌟

⚠️ Please note that the changes on this PR may be tested automatically.

If necessary Apache Camel Committers may access logs and test results in the job summaries!

@rhuan080
Copy link
Contributor Author

rhuan080 commented Feb 19, 2023

I'm sorry for closing this issue. It was just a click error, and the button to close does not have a double check 🤦‍♂️

Thank you @essobedo and @oscerd !

As the @davsclaus confirmed that the init is just called at the Camel bootstrap and the init method is not called inside a parallel process, I think just preventing the collection from being re-initialized in case of calling the init again is sufficient. I will update the issue in Jira with this information soon.

@davsclaus
Copy link
Contributor

what is the status of this? And I think its safer to do this in 3.21 release only, and not 3.20.x to keep it as-is.
This is a corner case and simple language is used a lot in Camel routes.

@essobedo
Copy link
Contributor

essobedo commented Mar 9, 2023

@rhuan080 Would you mind creating a new PR for the branch camel-3.x for Camel 3.21?

@davsclaus
Copy link
Contributor

This PR causes NPE errors on main, so hold until we fixed this or we need to revert and drop this PRs

Copy link
Contributor

@davsclaus davsclaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see prev comments

@davsclaus
Copy link
Contributor

okay the optimized can be even better, if there are only ConstantExpressionAdapter then you can build the result string in init and return its constant value in evaluate

@davsclaus
Copy link
Contributor

Okay this commit fixes the NPEs
d6ca5fc

@davsclaus
Copy link
Contributor

And this commit optimizes when there are only constant values
f2a3cc2

@davsclaus
Copy link
Contributor

So we need a PR for camel-3.x branch and leave 3.20.x as-is.
And in 3.x we need this PR and the 2 commits I linked.

@rhuan080
Copy link
Contributor Author

Hi @davsclaus, ok! I'll send a PR to camel-3.x. I'll update the 3.x to add the commits.

@rhuan080
Copy link
Contributor Author

PR 3.x: #9531

Signed-off-by: Rhuan Rocha <rhuan080@gmail.com>
@davsclaus davsclaus closed this Mar 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants