This repository has been archived by the owner on Nov 29, 2018. It is now read-only.

Investigate CI issue #26

Closed
cescoffier opened this issue Jul 18, 2016 · 8 comments

@cescoffier
Member

The JGroups cluster manager hasn't had a successful CI run for quite some time now:
https://vertx.ci.cloudbees.com/view/vert.x-3/job/vert.x3-jgroups/

@fmarinelli
Contributor

If you execute a single test on its own, it succeeds; when you execute all tests at once, the run fails. We should probably check for something that is not cleaned up properly on cluster shutdown.

@cescoffier
Member Author

Is it because of the last JGroups update?

@fmarinelli
Contributor

No, I don't think so, but I'll have to investigate in every direction.

@Sanne
Contributor

Sanne commented Aug 4, 2016

I've sent PR #27 to improve the tests, although it's unlikely to be enough to fix them all, as I'm still investigating some deeper issues.

@fmarinelli
Contributor

I don't want to say that I found it, because I don't understand it yet.
If I remove testSubsRemovedForKilledNode2 from JGroupsClusteredEventbusTest, everything works.
With that test enabled I get an unhandled exception about mypojoencoder2 not being found (probably just a matter of execution order).

@Sanne
Contributor

Sanne commented Aug 5, 2016

Indeed, there are some leaks.

With my PR #27 I've introduced a JUnit rule to identify tests which don't close the JGroups channel, and there were many. I fixed some, but for the others, which I couldn't fix, I had to introduce a "lenient" option that makes the rule forgiving (i.e. it does not fail the test), although it will still nuke the state of the SHARED_LOOPBACK protocol.
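
To illustrate the idea behind such a rule, here is a minimal sketch in plain Java. It is not the code from #27: `FakeChannel` and `ChannelLeakCheck` are hypothetical names standing in for a JGroups channel and the JUnit rule, and the sketch assumes that the check runs after each test, always cleans up whatever was left open, and fails the test only when it is not lenient.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a JGroups channel; it only tracks open/closed state. */
final class FakeChannel implements AutoCloseable {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        open = false;
    }
}

/** Hypothetical leak check, loosely modeled on the rule described above. */
final class ChannelLeakCheck {
    private final boolean lenient;
    private final List<FakeChannel> tracked = new ArrayList<>();

    ChannelLeakCheck(boolean lenient) {
        this.lenient = lenient;
    }

    /** Opens a channel and remembers it so it can be verified later. */
    FakeChannel open() {
        FakeChannel channel = new FakeChannel();
        tracked.add(channel);
        return channel;
    }

    /**
     * Intended to run after each test. Leaked channels are always closed so
     * the next test starts clean; the test fails only in strict mode.
     * Returns the number of channels that had been left open.
     */
    int verifyAfterTest() {
        int leaked = 0;
        for (FakeChannel channel : tracked) {
            if (channel.isOpen()) {
                leaked++;
                channel.close(); // clean up regardless of mode
            }
        }
        tracked.clear();
        if (leaked > 0 && !lenient) {
            throw new AssertionError(leaked + " channel(s) left open by the test");
        }
        return leaked;
    }
}
```

In the lenient mode the leak is still cleaned up, so one leaking test should not poison the next one; it is only the failure that is suppressed.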

This isn't enough though: while I'm now ensuring that the network channel is clean, a JChannel encapsulates other resources which aren't being closed correctly.

Some of the vert.x core tests, such as io.vertx.test.core.ComplexHATest, which are extended by the vertx-jgroups module, invoke a helper method to "kill" the node.
This kill method is, however, implemented as a simple "graceful" shutdown of the ClusterManager, and on top of that the JGroupsClusterManager doesn't close the JChannel if it was passed to the constructor.

I think it is correct for the JGroupsClusterManager not to close the JChannel, as that is not its responsibility, so it looks like this "simulateKill" in vertx-core could use a bit of redesigning.
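
The ownership convention described here can be sketched as follows. This is an illustration under assumptions, not the JGroupsClusterManager source: `StubChannel` and `OwnershipAwareManager` are hypothetical names, and the stub only records whether it has been closed.

```java
/** Hypothetical stand-in for a JGroups JChannel; it only tracks open/closed state. */
final class StubChannel implements AutoCloseable {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        open = false;
    }
}

/** Hypothetical manager that closes the channel only if it created it. */
final class OwnershipAwareManager {
    private final StubChannel channel;
    private final boolean ownsChannel;

    /** The manager creates the channel itself, so it owns it. */
    OwnershipAwareManager() {
        this.channel = new StubChannel();
        this.ownsChannel = true;
    }

    /** The caller supplies the channel, so the caller keeps ownership. */
    OwnershipAwareManager(StubChannel external) {
        this.channel = external;
        this.ownsChannel = false;
    }

    StubChannel channel() {
        return channel;
    }

    /** Graceful shutdown: releases only the resources this manager owns. */
    void leave() {
        if (ownsChannel) {
            channel.close();
        }
    }
}
```

Under this convention, a test that hands its own channel to the manager must also close that channel itself; a "kill" that only calls the graceful shutdown leaves it open.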

Besides, JGroups has tools to kill the connection in a less graceful way, e.g. use the DISCARD protocol to block all message flow first, then tear down the rest.

See:

  • io.vertx.core.impl.VertxInternal.simulateKill()
  • io.vertx.core.impl.HAManager.simulateKill()

In short: yes, there are more leaks which need to be cleaned up.

@luengnat

Is this still not resolved? #27 seems to have been committed already.

@cescoffier
Member Author

@luengnat Nope, it is still failing on the CI.

@vietj closed this as completed Feb 15, 2017