Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Low network connections in 1.4.5, did not have problem in 1.4.3 #137

Closed
arcoso opened this issue Feb 5, 2015 · 5 comments · Fixed by #138
Closed

Low network connections in 1.4.5, did not have problem in 1.4.3 #137

arcoso opened this issue Feb 5, 2015 · 5 comments · Fixed by #138

Comments

@arcoso
Copy link

arcoso commented Feb 5, 2015

Only getting 4 network connections.

I did not have this problem in the previous "stable" version 1.4.3

I hope you please look into it, and thank you for all your work!!

@creativecuriosity
Copy link
Collaborator

Awesome arcoso, thank you for taking the time to get involved :)
Will look into it!

@dooglus
Copy link
Collaborator

dooglus commented Feb 6, 2015

I have two instances of v1.4.5 running:

$ clamd getpeerinfo | grep inbound | sort | uniq -c
     12         "inbound" : false,

$ clamd getpeerinfo | grep inbound | sort | uniq -c
      4         "inbound" : false,

It looks like neither has its p2p port forwarded to allow inbound connections, and so all the connections are outbound. This proves that v1.4.5 is capable of making more than 4 connections.

I'll try forwarding the port and seeing how it goes then.

...

Oh, this is interesting. The one with 12 outbound connections already had its port forwarded. I shut down the one with 4 connections and restarted it with -connect=w.x.y.z and it was able to connect.

The other one then showed:

$ clamd getpeerinfo | grep inbound | sort | uniq -c
     11         "inbound" : false,
      1         "inbound" : true,

I stopped the "-connect=" client, restarted it without arguments, and it didn't reconnect to the one with the 12 peers again.

So perhaps v1.4.5 clients are somehow failing to advertise themselves properly?

12 hours later, I still see no inbound connections even though if a peer tried to connect it would be accepted.

@dooglus
Copy link
Collaborator

dooglus commented Feb 7, 2015

I tried using -connect=... to connect to two different nodes.

When I connect to a node running an old version of the CLAM client, my local log shows:

receive version message: version 60014, blocks=329196, us=<myip>:<randomport>,
  them=<theirip>:31174, peer=<theirip>:31174

but when I connect to a peer running v1.4.5 I see:

receive version message: version 60014, blocks=329196, us=<myip>:<randomport>,
  them=0.0.0.0:31174, peer=<theirip>:31174

I always see them=0.0.0.0 on the peer receiving the connection, but on old versions of the client 'them' was non-zero on the peer initiating the connection.

I wonder if that's significant.

@dooglus
Copy link
Collaborator

dooglus commented Feb 7, 2015

I've hit a bit of a dead end.

It seems like the reason we're not getting incoming connections is because we can't figure out our own external IP address.

If you supply -externalip= then it all works fine.

4d8f0a5 chanced how we discover our external IP address. It's based on bitcoin/bitcoin#5161

According to the guys in #bitcoin-dev, a node behind a NAT router with a port forwarded will eventually start getting incoming connections without us having to manually provide the public IP address.

I'm currently syncing the Bitcoin blockchain to see that actually happen. Then I'll be able to find out how it happens. And then I'll see why it doesn't happen for CLAM...

@dooglus
Copy link
Collaborator

dooglus commented Feb 7, 2015

I think I found the problem.

4d8f0a5 writes:

    pfrom->addrLocal = addrMe;
    if (pfrom->fInbound && addrMe.IsRoutable())
    {
        SeenLocal(addrMe);
    }

then c8f8b50 removes that code completely. There's no mention of SeenLocal() after that commit. This is probably the cause of the problem. That commit is removing 'addrSeenByPeer' and accidentally removes code around 'SeenLocal()', which is unrelated.

Then 6649357 puts the code back in, but uses the old version from before 4d8f0a5, with the assignment to addrLocal inside the condition:

    if (pfrom->fInbound && addrMe.IsRoutable())
    {
        pfrom->addrLocal = addrMe;
        SeenLocal(addrMe);
    }

That means that for outbound connections we don't have a record of what the peer thinks our IP address is, and so IsPeerAddrLocalGood(pnode) in AdvertizeLocal() fails, and so we never get to advertize our IP address, it never gets onto the network, and so nobody attempts to make a connection to us.

The simple fix is to move the:

        pfrom->addrLocal = addrMe;

up 3 lines.

But a more serious problem is understanding how these kinds of errors happen. How are big chucks of code changing, disappearing, then changing back to how they started?

dooglus referenced this issue Feb 7, 2015
… new node if waited longer than MAX_TIME_SINCE_BEST_BLOCK seconds.

Changing target for initial blockchain download to 275000.

This way client will get quickly there and will have fewer number of blocks to obtain using slower methods.
@dooglus dooglus mentioned this issue Feb 7, 2015
dooglus added a commit that referenced this issue Feb 7, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants