3.5.5 recover gets stuck waiting for cancel and eventually exits with recover error #214

Open
NiclasLindgren opened this issue Dec 2, 2020 · 5 comments

@NiclasLindgren

If you simulate network issues by throwing an IOException in send, you will notice that JmDNS never finalizes its recover state. Instead it calls the delegate to let it know that it couldn't recover. As far as I can tell, this is because the canceller task never finishes (it is cancelled), so you are stuck in Cancel_1 or Cancel_2 (if you had a packet to send on the first cancel).

This leaves JmDNS in a state where you can't restart it: the two Timer/Task threads are left running, so you leak threads if you create a new instance. The current instance has obviously stopped announcing and receiving, since the multicast socket is closed in the recover code.

I need some help figuring out how to correct this, as the state machine isn't obvious. It seems to me that the HostInfo state is dissociated incorrectly during cancel (or perhaps is not moved to the cancelled state when that happens).

@NiclasLindgren

I think the problem is here

        if (!out.isEmpty()) {
            logger.debug("{}.run() JmDNS {} #{}", this.getName(), this.getTaskDescription(), this.getTaskState());
            this.getDns().send(out);

            // Advance the state of objects.
            this.advanceObjectsState(stateObjects);

When send throws an IOException, advanceObjectsState isn't called. Instead, recoverTask is called on the Canceler, which then stops, so HostInfoState never reaches cancelled.

So if the canceller instead does this (changed code):

    protected void recoverTask(Throwable e) {
        if (this.getTaskState().isCanceling()) {
            // Already canceling: move the state forward instead of starting a recover.
            this.getDns().advanceState(this);
        } else {
            this.getDns().recover();
        }
    }

It works, but it doesn't seem right. Perhaps JmDNSImpl should advance the state in

    // We have an IO error so lets try to recover if anything happens lets close it.
    // This should cover the case of the IP address changing under our feet
    if (this.isClosing() || this.isClosed() || this.isCanceling() || this.isCanceled()) {
        return;
    }

Instead of just returning?
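
To make the failure mode described above easier to reason about, here is a toy model of it. This is only a sketch and not JmDNS code: `CancelStateSketch`, `State`, `Sender` and `runCanceler` are hypothetical names, and the state machine is reduced to the cancel steps. It compares a canceller that stops on the first IOException with one that advances the state while canceling.

```java
import java.io.IOException;

public class CancelStateSketch {
    // Simplified stand-in for the cancel portion of the state machine (assumption).
    enum State {
        CANCELING_1, CANCELING_2, CANCELING_3, CANCELED;

        State advance() {
            return this == CANCELED ? CANCELED : values()[ordinal() + 1];
        }

        boolean isCanceling() {
            return this != CANCELED;
        }
    }

    // Hypothetical replacement for the real send path, so a failure can be injected.
    interface Sender {
        void send() throws IOException;
    }

    // Runs the canceller for at most maxTicks timer ticks and returns the final state.
    static State runCanceler(Sender sender, boolean advanceOnError, int maxTicks) {
        State state = State.CANCELING_1;
        for (int tick = 0; tick < maxTicks && state != State.CANCELED; tick++) {
            try {
                sender.send();
                state = state.advance();
            } catch (IOException e) {
                if (advanceOnError && state.isCanceling()) {
                    // Workaround variant: keep moving towards CANCELED.
                    state = state.advance();
                } else {
                    // Observed behaviour: the task stops and the state stays put.
                    break;
                }
            }
        }
        return state;
    }

    public static void main(String[] args) {
        Sender broken = () -> { throw new IOException("fake"); };
        System.out.println(runCanceler(broken, false, 10)); // prints CANCELING_1
        System.out.println(runCanceler(broken, true, 10));  // prints CANCELED
    }
}
```

In the first run the state never leaves CANCELING_1, which matches the stuck recover. Whether the advance belongs in the task or in JmDNSImpl is the open question of this comment.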

@NiclasLindgren

To repro, just put a throw new IOException("fake") in JmDNSImpl.send in place of ms.send(packet).

@NiclasLindgren

Another issue is that when going in and out of hibernate, Linux can remove network interfaces, and you then get the exception "No such device". The only way out of that is to call

            _interfaze = NetworkInterface.getByInetAddress(_address);

again before opening the socket in recover; otherwise you get an exception while recovering and the state machine gets stuck.
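
A minimal sketch of that re-lookup, outside of JmDNS. The class and method names (`InterfaceRefreshSketch`, `findInterface`) are hypothetical; `NetworkInterface.getByInetAddress` is the standard JDK call and returns null when no interface currently has the address, so the result is wrapped in an Optional to force a check before the socket is opened.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Optional;

public class InterfaceRefreshSketch {
    // Looks up the interface that currently owns the address.
    // Empty means the interface is gone (or the lookup itself failed).
    static Optional<NetworkInterface> findInterface(InetAddress address) {
        try {
            return Optional.ofNullable(NetworkInterface.getByInetAddress(address));
        } catch (SocketException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Loopback is used here only so the example runs anywhere.
        InetAddress address = InetAddress.getLoopbackAddress();
        Optional<NetworkInterface> nif = findInterface(address);
        if (nif.isPresent()) {
            System.out.println("would open the socket on " + nif.get().getName());
        } else {
            System.out.println("no interface for " + address + ", not opening the socket");
        }
    }
}
```

The point of the Optional is that a stale or missing interface is handled before the multicast socket is configured, instead of surfacing later as a socket exception during recover.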

@NiclasLindgren

It seems that if you call closeMulticastSocket() on any exception in

public void send(DNSOutgoing out) throws IOException {

before rethrowing the original exception, recover won't get stuck.
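
The close-then-rethrow pattern could look roughly like this. It is a sketch against a hypothetical `Transport` interface rather than the real JmDNSImpl and its multicast socket; `sendOrClose` and `FailingTransport` are made-up names for illustration.

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseOnSendFailureSketch {
    // Hypothetical stand-in for the socket wrapped by send.
    interface Transport extends Closeable {
        void send(byte[] packet) throws IOException;
    }

    // Sends the packet; on failure, closes the transport and rethrows the original exception.
    static void sendOrClose(Transport transport, byte[] packet) throws IOException {
        try {
            transport.send(packet);
        } catch (IOException e) {
            try {
                transport.close();
            } catch (IOException closeError) {
                // Keep the original failure as the primary exception.
                e.addSuppressed(closeError);
            }
            throw e;
        }
    }

    // Test double that always fails to send and records whether it was closed.
    static final class FailingTransport implements Transport {
        boolean closed;

        @Override
        public void send(byte[] packet) throws IOException {
            throw new IOException("fake");
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        FailingTransport transport = new FailingTransport();
        try {
            sendOrClose(transport, new byte[0]);
        } catch (IOException e) {
            System.out.println(e.getMessage() + " closed=" + transport.closed); // prints fake closed=true
        }
    }
}
```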

@NiclasLindgren

It also incorrectly thinks it has recovered when this happens:

[local6.warni] 23:32:46,868 jmdns.impl.JmDNSImpl Creating multicast socket on interface name:eth0 (eth0)
[local6.warni] 23:32:46,872 jmdns.impl.HostInfo Find new interface for address /192.168.3.30
[local6.warni] 23:32:46,873 jmdns.impl.JmDNSImpl Creating multicast socket on new interface null
[local6.warni] 23:32:46,874 jmdns.impl.JmDNSImpl cts-va-20041634.recover() Start services exception
[local6.warni] java.net.SocketException: bad argument for IP_MULTICAST_IF2
[local6.warni] at java.net.AbstractPlainDatagramSocketImpl.setOption(Unknown Source)
[local6.warni] at java.net.MulticastSocket.setNetworkInterface(Unknown Source)
[local6.warni] at javax.jmdns.impl.JmDNSImpl.openMulticastSocket(JmDNSImpl.java:472)
[local6.warni] at javax.jmdns.impl.JmDNSImpl.__recover(JmDNSImpl.java:1883)
[local6.warni] at javax.jmdns.impl.JmDNSImpl$6.run(JmDNSImpl.java:1836)
[local6.warni] 23:32:46,876 jmdns.impl.JmDNSImpl cts-va-20041634.recover() We are back!
