
Peer Profiling data does not factor in speed #1199

Open
shakamd opened this issue Jun 23, 2018 · 9 comments

shakamd commented Jun 23, 2018

The i2p Java implementation appears to also track bandwidth used for all past peers when deciding whether a peer is 'fast'. This logic seems to be absent from i2pd: it looks for a random peer that has the eHighBandwidth flag, and if that peer is bad (isBad is true, based on tunnel reply metrics) it just picks any random peer.

Are there any plans to add more logic to peer selection so that 'fast' peers are chosen based on observed bandwidth usage between peers? I'd be willing to take a stab at it, but I'm not sure where to hook in to collect the bandwidth usage metrics to add to the peer profiles.
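
For illustration, a minimal sketch of what bandwidth-aware selection on top of the existing capability flag could look like; the names here are hypothetical, not i2pd's actual API:

```cpp
// Sketch only: prefer peers with a locally observed throughput figure over the
// advertised eHighBandwidth-style capability flag, falling back to any non-bad peer.
#include <cstddef>
#include <random>
#include <string>
#include <vector>

struct PeerProfile
{
    std::string ident;             // router identity hash (placeholder)
    bool highBandwidthCap = false; // advertised high-bandwidth capability
    bool isBad = false;            // derived from tunnel reply metrics
    double peakKBps = 0.0;         // locally observed peak throughput
};

const PeerProfile* SelectFastPeer (const std::vector<PeerProfile>& peers,
                                   double minKBps, std::mt19937& rng)
{
    std::vector<const PeerProfile*> fast, capable, rest;
    for (const auto& p : peers)
    {
        if (p.isBad) continue;
        if (p.peakKBps >= minKBps) fast.push_back (&p);
        else if (p.highBandwidthCap) capable.push_back (&p);
        else rest.push_back (&p);
    }
    auto pick = [&rng](const std::vector<const PeerProfile*>& v) -> const PeerProfile*
    {
        if (v.empty ()) return nullptr;
        std::uniform_int_distribution<std::size_t> d (0, v.size () - 1);
        return v[d (rng)];
    };
    if (auto p = pick (fast)) return p;    // peers proven fast by measurement
    if (auto p = pick (capable)) return p; // peers that merely claim high bandwidth
    return pick (rest);                    // anything else that isn't bad
}
```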

shakamd changed the title from "Profiling data does not factor in speed" to "Peer Profiling data does not factor in speed" on Jun 23, 2018

orignal commented Jun 23, 2018

Look at TunnelPool.cpp where we send the peer test, so you always get the response time for tunnel tests.
https://github.com/PurpleI2P/i2pd/blob/openssl/libi2pd/TunnelPool.cpp#L361
You need to apply this round-trip time to all routers from that tunnel pair.
I was going to implement it eventually, but didn't get to it. The only thing we do in profiling is detect and ignore completely bad routers; we also need to build profiles for floodfills.
If you could take care of the profiling, that would be great.
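
For illustration, a rough sketch of what this could look like: when a tunnel test reply comes back, take the measured round-trip time and record it against every router in the outbound/inbound pair that carried the test. The types and names below are placeholders, not i2pd's actual classes:

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct RouterLatencyProfile
{
    uint64_t numSamples = 0;
    double avgRttMs = 0.0;
    void AddRttSample (double rttMs)
    {
        // incremental mean over all tunnel-test samples for this router
        avgRttMs += (rttMs - avgRttMs) / double (++numSamples);
    }
};

// keyed by router identity (placeholder for a real ident hash)
using ProfileMap = std::map<std::string, RouterLatencyProfile>;

// Called when a tunnel test message sent at 'sentAt' is received back.
void ApplyTunnelTestRtt (ProfileMap& profiles,
                         const std::vector<std::string>& outboundHops,
                         const std::vector<std::string>& inboundHops,
                         std::chrono::steady_clock::time_point sentAt)
{
    using namespace std::chrono;
    double rttMs = duration_cast<milliseconds> (steady_clock::now () - sentAt).count ();
    // the same RTT is attributed to every hop of both tunnels in the tested pair
    for (const auto& ident : outboundHops) profiles[ident].AddRttSample (rttMs);
    for (const auto& ident : inboundHops)  profiles[ident].AddRttSample (rttMs);
}
```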


shakamd commented Jun 23, 2018

Hmm, the latency is good to have, but I was concerned specifically with the bandwidth of each tunnel. In the original implementation there is a calculation of the peak throughput over the fastest one-minute data transfers: https://github.com/i2p/i2p.i2p/blob/2fd0ed1e74e418d322a9a1614a0d16a941ae3ee4/router/java/src/net/i2p/router/peermanager/PeerProfile.java (getPeakTunnel1mThroughputKBps)

What would be a good place to collect this data? i.e. every time any message is sent through a tunnel we perform a bandwidth calculation and save the throughput stats for that peer (so over time we build a local profile of the fastest peers we can pick from), and then future tunnels would use that profiling information.
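
For what it's worth, here is a minimal sketch of such a hook, loosely modeled on getPeakTunnel1mThroughputKBps from the Java router: accumulate bytes moved through a peer's tunnels into one-minute buckets and remember the fastest minute. The class below is only an illustration of where the data could live, not existing i2pd code:

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>

class PeerThroughputProfile
{
public:
    // Call whenever a tunnel message of 'len' bytes passes through this peer.
    void DataTransferred (std::size_t len)
    {
        auto now = std::chrono::steady_clock::now ();
        if (now - m_BucketStart >= std::chrono::minutes (1))
        {
            // close the current one-minute bucket and keep the best one seen
            double kBps = double (m_BucketBytes) / 1024.0 / 60.0;
            m_PeakKBps = std::max (m_PeakKBps, kBps);
            m_BucketBytes = 0;
            m_BucketStart = now;
        }
        m_BucketBytes += len;
    }

    double GetPeak1mThroughputKBps () const { return m_PeakKBps; }

private:
    std::chrono::steady_clock::time_point m_BucketStart = std::chrono::steady_clock::now ();
    uint64_t m_BucketBytes = 0;
    double m_PeakKBps = 0.0;
};
```

Tunnel selection could then rank candidate peers by GetPeak1mThroughputKBps () instead of (or in addition to) the advertised bandwidth caps.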


orignal commented Jun 23, 2018

You can look into NTCPSession.cpp and measure the time between the send request and the callback for each session, which knows the remote peer. For SSU it's similar, but you must measure the Ack time instead.
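
For illustration, a minimal sketch (an assumed helper, not the actual NTCPSession code) of wrapping an asio send completion handler so the send-to-callback time can be recorded per session:

```cpp
#include <boost/asio.hpp>
#include <chrono>
#include <cstddef>
#include <functional>

struct LatencySample { double ms; };

// Wrap a send completion handler so it reports how long the send took
// before invoking the original handler.
template<typename Handler>
auto TimedHandler (Handler h, std::function<void (LatencySample)> report)
{
    auto start = std::chrono::steady_clock::now ();
    return [h, report, start](const boost::system::error_code& ec, std::size_t bytes)
    {
        auto elapsedMs = std::chrono::duration<double, std::milli> (
            std::chrono::steady_clock::now () - start).count ();
        if (!ec) report (LatencySample{ elapsedMs });
        h (ec, bytes);
    };
}
```

The wrapped handler would be passed as the completion handler of the session's async write; the reported samples could then be folded into that peer's profile. For SSU, the sample would come from the Ack handler instead, as noted above.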


shakamd commented Jun 24, 2018

Thanks, I'll take a look at that.

One thing I'm trying to understand is the loss of bandwidth I'm seeing between two nodes using just a 0-hop tunnel. Directly, these nodes can sustain 100mbps to each other, but if I create a client-server tunnel pair using a 0-hop tunnel for inbound/outbound on both sides, the application bandwidth drops to less than 10mbps with lots of packet loss... yet the wire speed still stays high. It seems quite a bit of retransmission is going on at the lower layers (I see the log printing this constantly: Streaming: Duplicate message [n] on sSID=[id]). Any ideas on what can be done to increase this performance? It doesn't seem to be CPU bound; the bandwidth is just being lost in retransmission.


orignal commented Jun 24, 2018

Probably you are talking about a 1-hop tunnel. 0-hops means local, no network involved.
The SSU implementation is still not good.


shakamd commented Jun 24, 2018

By 0-hops I mean there is no other router in between that's in my tunnel (i.e. in netstat I can see the IP of the other machine on either side).

I'm seeing this with NTCP (I disabled SSU).


orignal commented Jun 24, 2018

If you want to see where the latency comes from, look at Transports.cpp at how messages are queued up, and especially at io_service::post. That's a known issue, but I don't see a reason to rewrite that code because the rest of the network is slow anyway.
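
For context, an illustrative sketch (assumed names, not the real Transports.cpp) of the pattern being described: every outgoing message is handed off to the transport thread via io_service::post, so it goes through the io_service's internal queue and locking before it is actually written to a socket:

```cpp
#include <boost/asio.hpp>
#include <iostream>
#include <memory>
#include <string>

class Transports
{
public:
    explicit Transports (boost::asio::io_service& service): m_Service (service) {}

    // The caller only enqueues; the actual send happens later on the thread
    // running io_service::run (). Each post adds queue and mutex overhead.
    void SendMessage (std::shared_ptr<std::string> msg)
    {
        m_Service.post ([this, msg]() { HandleSend (msg); });
    }

private:
    void HandleSend (std::shared_ptr<std::string> msg)
    {
        // a real transport would pick an NTCP/SSU session here and write to it
        std::cout << "sending " << msg->size () << " bytes" << std::endl;
    }

    boost::asio::io_service& m_Service;
};

int main ()
{
    boost::asio::io_service service;
    Transports transports (service);
    transports.SendMessage (std::make_shared<std::string> (std::string (1024, 'x')));
    service.run ();
}
```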


shakamd commented Jun 24, 2018

Hmm, well, like I was saying, the bandwidth being used at the TCP layer stays very high while it is moving slowly at the application layer, which indicates retransmission to me, unless there is a 90% overhead due to I2P in the packets? It doesn't appear to be CPU bound.

I think if the bandwidth is dropping 90% even with inbound.length = 0 and outbound.length = 0 on both sides, then there isn't much hope of having fast links with tunnel lengths >= 1. I'm wondering how much the network is slow due to an actual lack of bandwidth and how much it could be improved just by optimizing some of the code. Honestly, I think the slowness is one of the biggest problems today.


orignal commented Jun 24, 2018

As I said, the slowness comes from the queues and mutexes invoked by io_service::post.
