Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WFARQ-14 NPE in ServerSetupObserver.handleAfterUndeploy #85

Merged
merged 1 commit into from Dec 16, 2016

Conversation

rhusar
Copy link
Member

@rhusar rhusar commented Nov 23, 2016

Jira
https://issues.jboss.org/browse/WFARQ-14

  • handleOnUndeploy not counting with the possibility of undeploy being fired by arq-core

@rhusar
Copy link
Member Author

rhusar commented Nov 23, 2016

@rachmatowicz Could you please take this for a spin? This fixes the problem that QE reported with 2 test cases.

@rachmatowicz
Copy link

rachmatowicz commented Nov 24, 2016

Hi Rado
Ran your fix against the testsuite. Same problem:

Tests in error:
ClusteredJPA2LCTestCase.org.jboss.as.test.clustering.cluster.jpa2lc.ClusteredJPA2LCTestCase » IndexOutOfBounds

@rhusar
Copy link
Member Author

rhusar commented Nov 24, 2016

Ok, these are different issues. Can you try the other PR? If that one doesnt work either, lets open another Jira.

@rachmatowicz
Copy link

Which is the other PR?

@rhusar
Copy link
Member Author

rhusar commented Nov 24, 2016

@rachmatowicz I sent you an email in the morning... the other pr is #86

@rachmatowicz
Copy link

Isn't the problem here that DistributableTestCase is not calling deploy/undeploy in pairs?

@jamezp
Copy link
Member

jamezp commented Dec 14, 2016

I think this is correct, but before merging I'm attempting to see if I can figure out why it's being called twice on the same container.

@rhusar
Copy link
Member Author

rhusar commented Dec 14, 2016

@jamezp I am thinking, can we discuss this maybe first? Since we have 4 different PR for the same class https://github.com/wildfly/wildfly-arquillian/pulls

One of the problems that I have is that there is no JavaDoc so the expected contract of the class is not clear to me.

@jamezp
Copy link
Member

jamezp commented Dec 14, 2016

@rhusar Yes. A lot of what is happening is not quite clear. This one does seem okay because it's kind of just a "let's be safe". I do think we should try to figure out why it's being triggered twice for the same container though. To me that seems a bit odd.

As I understand this specific class is to handle the @ServerSetup. Before the deployment the setup is ran, after the undeploy the setup is torn down and the AfterClass is used to clean up any remaining setups.

I do kind of wonder if the observed events shouldn't be dealing with deployments in this case. It would be a bit of a change for manual mode tests though. Which now that I think about it, maybe that's why events are fired twice. I think these clustering tests are manual mode so maybe the undeploy is being fired twice.

@jamezp
Copy link
Member

jamezp commented Dec 14, 2016

Okay I think I see this issue and this fix seems fine to me. What it looks like happens is the DistributableTestCase.testGracefulServeOnUndeploy undeploys the application, then after the test class the ClusterAbstractTestCase.afterTestMethod attempts to undeploy it again. I do think an undeploy should be safe to execute multiple times so I'd say this approach is fine.

You can work around this by checking the existence of the deployment before undeploying. I do think this is something we should add to the ArchiveDeployer, e.g. ArchiveDeployer.isDeployed(), as well as safely handle multiple undeploy's for the same deployment.

@rhusar
Copy link
Member Author

rhusar commented Dec 16, 2016

@jamezp I think you are right.

TLDR; since this is affecting our day to day operation I would appreciate merging this, releasing a micro or Beta and update in WF/EAP. But this is still mostly a workaround.

I do think we should try to figure out why it's being triggered twice for the same container though. To me that seems a bit odd.

This is what we are doing every time actually for every test method, see:
https://github.com/wildfly/wildfly/blob/10.1.0.Final/testsuite/integration/clustering/src/test/java/org/jboss/as/test/clustering/cluster/ClusterAbstractTestCase.java#L99
This class doesn't care what was the state of the container (started,stopped,killed) it will always start it and make sure all deployments are undeployed.

Similarly before every test it deploys all the deployments and starts the containers so that ARQ injections work, see https://github.com/wildfly/wildfly/blob/10.1.0.Final/testsuite/integration/clustering/src/test/java/org/jboss/as/test/clustering/cluster/ClusterAbstractTestCase.java#L89

Note that this is not costly, as usually the servers are started so node start will return immediately.

As I understand this specific class is to handle the @serversetup.

Right, but the problem is that in https://github.com/wildfly/wildfly/blob/10.1.0.Final/testsuite/integration/clustering/src/test/java/org/jboss/as/test/clustering/cluster/web/DistributableTestCase.java there is no server setup! This is because the internal maps are leaking references, see fix somewhere along the lines https://github.com/wildfly/wildfly-arquillian/pull/86/files

I don't remember now what exactly are the steps to trigger the leasks, but the "deployed" map is never cleared (nulled) when things are undeployed and the after class doen't clear because setupTasksInForce is empty and afterTestClass() won't clear the maps. When I re-think about that not maybe a fix would be to fix that condition in afterTestClass.

The above PR cause tests like https://github.com/wildfly/wildfly/blob/10.1.0.Final/testsuite/integration/smoke/src/test/java/org/jboss/as/test/smoke/deployment/rar/tests/redeployment/ReDeploymentTestCase.java#L131 to fail, because these expect that the observer logic will be triggered only once on the first deployment -- is that the right contract of this ServerSetup API?

I do think an undeploy should be safe to execute multiple times so I'd say this approach is fine. You can work around this by checking the existence of the deployment before undeploying. I do think this is something we should add to the ArchiveDeployer, e.g. ArchiveDeployer.isDeployed(), as well as safely handle multiple undeploy's for the same deployment.

Right, that's part of the reason, we currently don't know what is deployed. I am not sure how this is implemented, but we would need to make sure that the deployer would be aware of deployments that were deployed before the container is stopped. Since this is in our test cases.

I have opened https://issues.jboss.org/browse/WFARQ-17 for that, see if this gets any traction.

if (count == null) {
// The deployment was already undeployed or never deployed
// AfterUnDeploy and BeforeUnDeploy events are fired by arquillian-core regardless of deployment status
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding some debug logging here just to indicate the container appears to have been removed and is being skipped?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am hoping this won't be necessary if we can nail down the real cause (somewhere along the lines of #86 is attempting).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants