Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another set of code related warnings fixes #14222

Merged
merged 5 commits into from
May 23, 2024
Merged

Another set of code related warnings fixes #14222

merged 5 commits into from
May 23, 2024

Conversation

gnodet
Copy link
Contributor

@gnodet gnodet commented May 22, 2024

  • Fix visibility issues
  • Fix busy-wait loop issues

Copy link
Contributor

🌟 Thank you for your contribution to the Apache Camel project! 🌟

🤖 CI automation will test this PR automatically.

🐫 Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run

  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot.

  • You can label PRs using build-all, build-dependents, skip-tests and test-dependents to fine-tune the checks executed by this PR.

  • Build and test logs are available in the Summary page. Only Apache Camel committers have access to the summary.

  • ⚠️ Be careful when sharing logs. Review their contents before sharing them publicly.

@gnodet
Copy link
Contributor Author

gnodet commented May 22, 2024

There's another busy-wait loop issue in org.apache.camel.support.Task.
This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the ServicePool where a foreground task is used to wait for a service to be started every 5 milliseconds. I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of java.util.concurrent.Condition that can be waited for.

@gnodet
Copy link
Contributor Author

gnodet commented May 22, 2024

There's another busy-wait loop issue in org.apache.camel.support.Task. This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the ServicePool where a foreground task is used to wait for a service to be started every 5 milliseconds. I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of java.util.concurrent.Condition that can be waited for.

Upon further investigation, the potential problem comes from this commit:
07b9c68
I'm not sure how this is supposed to work: services usually inherit from ServiceSupport class where starting the service is a blocking operation. When the service is created, we call addService with forceStart set to true. My understanding is that returning from this call, the service will either be started, or the start will be deferred, but I don't really see how it can be in the STARTING state. @orpiske ?

@orpiske
Copy link
Contributor

orpiske commented May 22, 2024

There's another busy-wait loop issue in org.apache.camel.support.Task. This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the ServicePool where a foreground task is used to wait for a service to be started every 5 milliseconds. I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of java.util.concurrent.Condition that can be waited for.

Upon further investigation, the potential problem comes from this commit: 07b9c68 I'm not sure how this is supposed to work: services usually inherit from ServiceSupport class where starting the service is a blocking operation. When the service is created, we call addService with forceStart set to true. My understanding is that returning from this call, the service will either be started, or the start will be deferred, but I don't really see how it can be in the STARTING state. @orpiske ?

The key problem I was avoiding on 07b9c68ea2a9a70cf3a3f42c1045fb9fd0acbe6a, was the if (producer instanceof StatefulService) { which caused a type check miss under high load.

The STARTING state check had been there before. Looking at the history of this change, it seems that the STARTING state check is related to CAMEL-14048 and a problem with Netty.

@orpiske
Copy link
Contributor

orpiske commented May 22, 2024

@gnodet it would probably be good to run a full test on the code before merging. I've triggered a build on my own CI to see how it goes.

@gnodet
Copy link
Contributor Author

gnodet commented May 22, 2024

There's another busy-wait loop issue in org.apache.camel.support.Task. This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the ServicePool where a foreground task is used to wait for a service to be started every 5 milliseconds. I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of java.util.concurrent.Condition that can be waited for.

Upon further investigation, the potential problem comes from this commit: 07b9c68 I'm not sure how this is supposed to work: services usually inherit from ServiceSupport class where starting the service is a blocking operation. When the service is created, we call addService with forceStart set to true. My understanding is that returning from this call, the service will either be started, or the start will be deferred, but I don't really see how it can be in the STARTING state. @orpiske ?

The key problem I was avoiding on 07b9c68ea2a9a70cf3a3f42c1045fb9fd0acbe6a, was the if (producer instanceof StatefulService) { which caused a type check miss under high load.

The STARTING state check had been there before. Looking at the history of this change, it seems that the STARTING state check is related to CAMEL-14048 and a problem with Netty.

Thx for the pointers. I've had another look, I think the fix 0f9f879 is wrong and the problem was actually fixed correctly by 4c8c071. This commit makes sure that the double-check locking only see a fully started service.

I'll add a commit to remove that part of the code.

@gnodet
Copy link
Contributor Author

gnodet commented May 22, 2024

@gnodet it would probably be good to run a full test on the code before merging. I've triggered a build on my own CI to see how it goes.

Agreed. The PR template mentions build-all but that's not an existing label, I've seen incremental-build-all however. Is that the correct one ? We need to fix the template btw...

@gnodet
Copy link
Contributor Author

gnodet commented May 22, 2024

There's another busy-wait loop issue in org.apache.camel.support.Task. This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the ServicePool where a foreground task is used to wait for a service to be started every 5 milliseconds. I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of java.util.concurrent.Condition that can be waited for.

Upon further investigation, the potential problem comes from this commit: 07b9c68 I'm not sure how this is supposed to work: services usually inherit from ServiceSupport class where starting the service is a blocking operation. When the service is created, we call addService with forceStart set to true. My understanding is that returning from this call, the service will either be started, or the start will be deferred, but I don't really see how it can be in the STARTING state. @orpiske ?

The key problem I was avoiding on 07b9c68ea2a9a70cf3a3f42c1045fb9fd0acbe6a, was the if (producer instanceof StatefulService) { which caused a type check miss under high load.
The STARTING state check had been there before. Looking at the history of this change, it seems that the STARTING state check is related to CAMEL-14048 and a problem with Netty.

Thx for the pointers. I've had another look, I think the fix 0f9f879 is wrong and the problem was actually fixed correctly by 4c8c071. This commit makes sure that the double-check locking only see a fully started service.

I'll add a commit to remove that part of the code.

Pushed a commit. However I think the ServicePool still has some synchronization issues. We use a BlockingQueue, but we're holding a lock to access it, so that's quite unexpected... That's for another PR though, maybe with the ResequencerEngine concurrency to avoid locking on the object itself.

@orpiske
Copy link
Contributor

orpiske commented May 22, 2024

@gnodet it would probably be good to run a full test on the code before merging. I've triggered a build on my own CI to see how it goes.

Agreed. The PR template mentions build-all but that's not an existing label, I've seen incremental-build-all however. Is that the correct one ? We need to fix the template btw...

I think these labels are outdated. IIRC, we never managed to run full builds on GH because it takes too long and causes the action runner to abort after some time.

For now, the only way to run a full test is by running the tests manually (or, semi automatically, as I do on my environment).

@orpiske
Copy link
Contributor

orpiske commented May 22, 2024

First run (before the last commit pushed) had only 1 failure on org.apache.camel.component.infinispan.remote.InfinispanRemoteProducerIT.replaceAValueByKeyAsyncWithOldValue.

I have a new build running with the latest commit.

Copy link
Contributor

@orpiske orpiske left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Tested on my own CI and there were only 2 failures which are known flakies:

  • org.apache.camel.component.file.FileSortByExpressionTest.testSortFiles
  • org.apache.camel.component.rocketmq.RocketMQRouteIT.testSimpleRoute

The service is always in a started state.  The problem was caused by the field being set (while holding the lock) to the newly created service before it was actually started, so that another thread could see the non started service.
@gnodet gnodet merged commit 5f72cef into apache:main May 23, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants