Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WFLY-6402 EJBs accessible too early (spec violation) #8824

Merged
merged 1 commit into from Jun 3, 2016
Merged

WFLY-6402 EJBs accessible too early (spec violation) #8824

merged 1 commit into from Jun 3, 2016

Conversation

kurobako
Copy link
Contributor

@kurobako kurobako commented Apr 5, 2016

https://issues.jboss.org/browse/WFLY-6402
EJBs accessible too early (spec violation)

now all startup beans in deployment should finish their postconstruct methods before external calls would be allowed to proceed

@wildfly-ci
Copy link

Linux Build 9708 outcome was FAILURE using a merge of df61de1
Summary: Execution timeout; tests passed: 1182, ignored: 125 Build time: 01:40:23

@wildfly-ci
Copy link

Windows Build 4731 outcome was FAILURE using a merge of df61de1
Summary: Execution timeout (new); tests passed: 1182, ignored: 125 Build time: 01:40:27

@bmaxwell
Copy link
Contributor

retest this please!!

@wildfly-ci
Copy link

Linux Build 9779 outcome was FAILURE using a merge of df61de1
Summary: Execution timeout; tests passed: 1182, ignored: 125 Build time: 01:41:30

@wildfly-ci
Copy link

Windows Build 4801 outcome was FAILURE using a merge of df61de1
Summary: Execution timeout; tests passed: 1182, ignored: 125 Build time: 01:41:30

@dmlloyd
Copy link
Member

dmlloyd commented Apr 15, 2016

retest this please

@wildfly-ci
Copy link

Linux Build 9792 outcome was FAILURE using a merge of df61de1
Summary: Execution timeout; tests passed: 1182, ignored: 125 Build time: 01:40:25

@wildfly-ci
Copy link

Windows Build 4814 outcome was FAILURE using a merge of df61de1
Summary: Execution timeout; tests passed: 1182, ignored: 125 Build time: 01:40:32

@dmlloyd
Copy link
Member

dmlloyd commented Apr 15, 2016

Seems like this PR is failing consistently.

@kurobako
Copy link
Contributor Author

kurobako commented Apr 15, 2016

@dmlloyd both failures occur while running SendMessagesTestCase times out:

[15:06:16][org.wildfly:wildfly-ts-integ-basic] Running org.jboss.as.test.integration.ejb.mdb.containerstart.SendMessagesTestCase
[16:36:36][org.wildfly:wildfly-ts-integ-basic] The build Pull Request::Linux #9792 {buildId=96690} has been running for more than 100 minutes. Terminating...

I wonder why test with security manager succeeds, though. Afaik the spec forces us to hold MDB 'calls' as well until startup beans' PostConstruct is done, right? Could it be I miss some aspect of how MDB work in such case?

@dmlloyd
Copy link
Member

dmlloyd commented Apr 15, 2016

I'm not sure, it's pretty strange. Perhaps there's some race condition, and using a security manager upsets the timing to just the perfect degree that the race condition does not occur. This also might explain why the PR passed on the EAP branch.

@@ -51,6 +51,7 @@ protected void handleAnnotations(final DeploymentUnit deploymentUnit, final EEAp
if (data != null) {
if (!data.getClassLevelAnnotations().isEmpty()) {
description.initOnStartup();
description.getModuleDescription().registerStartupBean();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this only handle annotations, and not the deployment descriptor as well?

@stuartwdouglas
Copy link
Contributor

There are a number of fairly fundamental problems here:

  1. It does not actually fulfill the spec, as external requests can still be delivered though other means, e.g. a web request into a Servlet that injects a local interface
  2. If 1) was fixed then any use of another EJB by a Servlet in @startup would immediately result in a deadlock, this would break many existing applications

I think this could be worked around by introducing a ThreadLocal that tracks if an invocation is a startup post construct invocation.

Basically StartupCountDownInterceptor would set this to true, then clear it after the request. If this ThreadLocal was set then the other interceptor would not wait on the latch but just let the request through.

Either way I still don't really like this, it would be better to stop all requests at the relevant endpoints until deployment is complete, much like we already do with graceful shutdown.

@wildfly-ci
Copy link

Linux Build 9505 outcome was FAILURE using a merge of df61de1
Summary: Execution timeout (new); tests passed: 1182, ignored: 125 Build time: 01:40:10

@wildfly-ci
Copy link

Windows Build 4530 outcome was FAILURE using a merge of df61de1
Summary: Execution timeout (new); tests passed: 1182, ignored: 125 Build time: 01:40:20

@kurobako kurobako closed this May 17, 2016
@kurobako kurobako reopened this May 26, 2016
@kurobako
Copy link
Contributor Author

retest this please

@wildfly-ci
Copy link

Windows Build 4689 outcome was FAILURE using a merge of f252600
Summary: Execution timeout; tests passed: 1185, ignored: 125 Build time: 01:40:14

@wildfly-ci
Copy link

Linux Build 9652 outcome was FAILURE using a merge of f252600
Summary: Execution timeout; tests passed: 1185, ignored: 125 Build time: 01:40:19

@kurobako
Copy link
Contributor Author

Build log for failed build contains thread dump showing this:

java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0xcb512288> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22)

(a direct call to CountDownLatch from StartupAwaitInterceptor) which is just not present in the current version of PR. Currently StartupCountdown class is used as an intermediate between these two. Probably CI is having trouble getting latest changes from amended commits or something.

@kurobako
Copy link
Contributor Author

retest this please

@n1hility n1hility added hold PR should not be merged for some reason. ee-review and removed hold PR should not be merged for some reason. labels Jun 2, 2016
@n1hility
Copy link
Member

n1hility commented Jun 2, 2016

I agree that fundamental problem is that we should suspend at the remote endpoint level (e.g. no servlet, no remoting, etc)

@kurobako
Copy link
Contributor Author

kurobako commented Jun 3, 2016

@n1hility doing that would be a proper solution for the problem, and that is what people are going to do for wildfly eventually. Current PR is just a way to fix spec violation until the code handling graceful startup arrives

@stuartwdouglas stuartwdouglas merged commit 95ed36f into wildfly:master Jun 3, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants