Pex during handshake #2992

Merged
pompon0 merged 10 commits into main from gprusak-pex on Mar 2, 2026
@pompon0 pompon0 commented Feb 27, 2026

This PR allows nodes to learn more node addresses even if the peer they dial has no capacity for new connections. It works by making the listener node send a PEX batch as part of the handshake (the listener might discard the connection right after the handshake if it decides it does not have capacity for it).
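
A minimal sketch of the flow described above, assuming a simplified model: the `listener` and `handshakeMsg` types, their fields, and the `accept` method are hypothetical names for illustration and are not the PR's actual code. The point it shows is the ordering: the PEX batch is put into the handshake reply before the capacity check, so the dialer learns addresses even when the connection is dropped.

```go
package main

import "fmt"

// handshakeMsg is a hypothetical, simplified stand-in for the real handshake
// message; only the field relevant to this PR is modeled.
type handshakeMsg struct {
	PexAddrs []string // addresses advertised to the dialer
}

// listener is a hypothetical model of a node accepting inbound connections.
type listener struct {
	pexEnabled   bool
	maxConnected int
	connected    int
	known        []string // addresses this node is willing to advertise
}

// accept builds the handshake reply first and only then decides whether to
// keep the connection, so the dialer receives the PEX batch either way.
func (l *listener) accept(maxPexAddrs int) (handshakeMsg, bool) {
	var reply handshakeMsg
	if l.pexEnabled {
		n := min(len(l.known), maxPexAddrs)
		reply.PexAddrs = append([]string(nil), l.known[:n]...)
	}
	if l.connected >= l.maxConnected {
		return reply, false // out of capacity: drop right after the handshake
	}
	l.connected++
	return reply, true
}

func main() {
	l := &listener{
		pexEnabled:   true,
		maxConnected: 1,
		connected:    1, // already full
		known:        []string{"id1@10.0.0.1:26656", "id2@10.0.0.2:26656"},
	}
	reply, kept := l.accept(50)
	fmt.Println(len(reply.PexAddrs), kept) // prints "2 false"
}
```

With PEX disabled in this model, the reply simply carries no addresses, which matches the description's statement that the feature follows the PEX reactor flag.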

This new "pex in handshake" is enabled only if the PEX reactor is enabled on the node. Disabling the PEX reactor prevents a node from learning new addresses and is currently used mainly to prevent a node from connecting to random peers (a misleading, indirect use of the pex flag); this PR maintains these semantics to avoid disruptions. I think we should change these semantics and require people to just set MaxConnected to 0 instead (which is a direct way to say: connect only to persistent peers).

Additionally, a SelfAddress is added to the handshake message. Nodes advertise the addresses of the nodes they are connected to. Until now they could only advertise the addresses of outbound connections (i.e. verified addresses), but with this PR the SelfAddress of inbound connections is included as well (each node declares only its own up-to-date address, so it is fine to gossip it).

Note that SelfAddress could also have been extracted from the pex response (it is always included), but I wanted to make it more explicit that it is special.
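
To illustrate which addresses become advertisable, here is a rough sketch under stated assumptions: the `peer` record and the `advertise` function are hypothetical and much simpler than the real peer manager. It lists dialed addresses of outbound connections first and self-declared addresses of inbound connections after them, which is the ordering the author mentions in the review thread on this PR.

```go
package main

import "fmt"

// peer is a hypothetical, simplified connection record.
type peer struct {
	outbound bool
	dialAddr string // address we dialed (set for outbound connections)
	selfAddr string // address the peer declared in its handshake ("" if none)
}

// advertise lists dialed addresses of outbound connections first, followed
// by the self-declared addresses of inbound connections. Inbound peers that
// declared no address contribute nothing.
func advertise(peers []peer) []string {
	var verified, selfDeclared []string
	for _, p := range peers {
		switch {
		case p.outbound:
			verified = append(verified, p.dialAddr)
		case p.selfAddr != "":
			selfDeclared = append(selfDeclared, p.selfAddr)
		}
	}
	return append(verified, selfDeclared...)
}

func main() {
	peers := []peer{
		{outbound: false, selfAddr: "idA@203.0.113.7:26656"},
		{outbound: true, dialAddr: "idB@198.51.100.2:26656"},
		{outbound: false}, // inbound peer without a public address
	}
	fmt.Println(advertise(peers))
}
```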

github-actions bot commented Feb 27, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
|---|---|---|---|---|
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | Mar 2, 2026, 2:55 PM |

codecov bot commented Feb 27, 2026

Codecov Report

❌ Patch coverage is 83.95062% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.16%. Comparing base (f021f7b) to head (731df5e).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| sei-tendermint/internal/p2p/conv.go | 84.61% | 2 Missing and 2 partials ⚠️ |
| sei-tendermint/internal/p2p/handshake.go | 50.00% | 2 Missing and 2 partials ⚠️ |
| sei-tendermint/internal/p2p/peermanager.go | 85.71% | 1 Missing and 1 partial ⚠️ |
| sei-tendermint/internal/p2p/router.go | 89.47% | 1 Missing and 1 partial ⚠️ |
| sei-tendermint/internal/p2p/giga_router.go | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2992      +/-   ##
==========================================
- Coverage   58.22%   58.16%   -0.06%     
==========================================
  Files        2113     2110       -3     
  Lines      173671   173413     -258     
==========================================
- Hits       101115   100866     -249     
- Misses      63540    63608      +68     
+ Partials     9016     8939      -77     

| Flag | Coverage Δ |
|---|---|
| sei-chain-pr | 69.60% <83.75%> (?) |
| sei-db | 70.41% <ø> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
|---|---|
| sei-tendermint/internal/p2p/address.go | 98.71% <ø> (+7.97%) ⬆️ |
| sei-tendermint/internal/p2p/peermanager_pool.go | 100.00% <ø> (ø) |
| sei-tendermint/internal/p2p/pex/reactor.go | 89.13% <100.00%> (-0.12%) ⬇️ |
| sei-tendermint/internal/p2p/routeroptions.go | 86.36% <ø> (ø) |
| sei-tendermint/internal/p2p/testonly.go | 79.91% <100.00%> (+0.27%) ⬆️ |
| sei-tendermint/internal/p2p/transport.go | 84.09% <100.00%> (+0.36%) ⬆️ |
| sei-tendermint/node/setup.go | 67.50% <100.00%> (+0.11%) ⬆️ |
| sei-tendermint/internal/p2p/giga_router.go | 0.00% <0.00%> (ø) |
| sei-tendermint/internal/p2p/peermanager.go | 88.74% <85.71%> (+4.81%) ⬆️ |
| sei-tendermint/internal/p2p/router.go | 85.35% <89.47%> (+1.59%) ⬆️ |

... and 2 more

... and 75 files with indirect coverage changes

  // NOTE: amplification factor!
  // small request results in up to maxMsgSize response
- maxMsgSize = maxAddressSize * maxGetSelection
+ maxMsgSize = 1000 + maxAddressSize*p2p.MaxPexAddrs

Contributor

How big is the average msg now? Where does the constant 1000 come from?

Contributor Author

Let's say 50 addresses * (42B for NodeID + ~20B for IP/DNS address + 6B of protobuf overhead) ≈ 3.4kB. The extra 1kB is an arbitrary allowance for whatever constant overhead there is. This estimate is not very precise.
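
The estimate in this reply can be reproduced with a few lines of arithmetic. The sketch below uses only the per-address byte counts quoted above; the constant and function names are made up for illustration, and the real values of maxAddressSize and p2p.MaxPexAddrs are not given in this thread, so `maxMsgSize` takes them as parameters rather than assuming them.

```go
package main

import "fmt"

// Per-address byte counts quoted in the review reply. They are rough
// estimates from the thread, not measured values.
const (
	nodeIDBytes   = 42
	hostBytes     = 20 // "~20B for ip/dns address"
	protoOverhead = 6
	addrsPerBatch = 50 // "let's say 50 addresses"
)

// typicalBatchBytes is the reply's back-of-the-envelope size of one batch.
func typicalBatchBytes() int {
	return addrsPerBatch * (nodeIDBytes + hostBytes + protoOverhead)
}

// maxMsgSize mirrors the formula from the diff: a fixed 1000-byte allowance
// plus the worst-case size of every address in the batch.
func maxMsgSize(maxAddressSize, maxPexAddrs int) int {
	return 1000 + maxAddressSize*maxPexAddrs
}

func main() {
	fmt.Printf("typical batch: %d bytes\n", typicalBatchBytes()) // prints "typical batch: 3400 bytes"
}
```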

Encode: func(m *handshakeMsg) *pb.Handshake {
	var selfAddr *string
	if addr, ok := m.SelfAddr.Get(); ok {
		selfAddr = utils.Alloc(addr.String())

Contributor

When would it happen that you can't get selfAddr here?

Contributor Author

A node doesn't have to configure an external (public) address if it doesn't have one.
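
A small sketch of the pattern in the Encode snippet, assuming stand-in helpers: `option`, `some`, and `alloc` below are hypothetical minimal versions written for this example, since the PR's own option type and `utils.Alloc` are not shown in the thread. It maps "no public address configured" to a nil wire field and a configured address to a pointer.

```go
package main

import "fmt"

// option is a hypothetical minimal optional type used only in this sketch.
type option[T any] struct {
	value T
	ok    bool
}

// some wraps a present value.
func some[T any](v T) option[T] { return option[T]{value: v, ok: true} }

// Get returns the value and whether it is present.
func (o option[T]) Get() (T, bool) { return o.value, o.ok }

// alloc returns a pointer to a copy of v.
func alloc[T any](v T) *T { return &v }

// encodeSelfAddr maps an optional address to the nullable wire field:
// nil when the node has no public address, a pointer otherwise.
func encodeSelfAddr(selfAddr option[string]) *string {
	if addr, ok := selfAddr.Get(); ok {
		return alloc(addr)
	}
	return nil
}

func main() {
	fmt.Println(encodeSelfAddr(option[string]{}) == nil) // no public address configured
	fmt.Println(*encodeSelfAddr(some("idA@203.0.113.7:26656")))
}
```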

if p.SelfAddr != nil {
	addr, err := ParseNodeAddress(*p.SelfAddr)
	if err != nil {
		return nil, fmt.Errorf("SelfAddr: %w", err)

Contributor

Would this ever happen during normal operations? DNS failures?

Contributor Author

An adversarial node can send broken data. This is just a proto converter, though: it doesn't assume the proto is valid in any sense other than what the proto message definition specifies.

for i, addrString := range p.PexAddrs {
	addr, err := ParseNodeAddress(addrString)
	if err != nil {
		return nil, fmt.Errorf("PexAddrs[%v]: %w", i, err)

Contributor

If this may happen on a valid address, should we just ignore that one address and keep the others?

Contributor Author

What do you mean? A valid address is parseable. This is not a dynamic property.

Contributor

Ah okay, if this is a static parser this should be safe.
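
The two decoding threads above can be summarized in one sketch, under stated assumptions: `wireHandshake` is a hypothetical stand-in for the generated pb.Handshake type, and `parseAddr` is a deliberately simplified replacement for ParseNodeAddress that accepts only `<id>@<host>:<port>`. Like the parser discussed in the thread, it does no DNS lookups, so whether a string parses depends only on the string itself. The decoder rejects the whole message on the first malformed field, which is the behaviour the snippets show.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// wireHandshake is a hypothetical stand-in for the generated pb.Handshake.
type wireHandshake struct {
	SelfAddr *string // nil when the sender has no public address
	PexAddrs []string
}

// parseAddr is a simplified stand-in for ParseNodeAddress. It checks the
// "<id>@<host>:<port>" shape only and performs no network I/O.
func parseAddr(s string) (string, error) {
	id, hostport, ok := strings.Cut(s, "@")
	if !ok || id == "" {
		return "", fmt.Errorf("missing node ID in %q", s)
	}
	if _, _, err := net.SplitHostPort(hostport); err != nil {
		return "", err
	}
	return s, nil
}

// decode validates every field and fails on the first malformed one,
// naming the offending field in the wrapped error.
func decode(p *wireHandshake) (string, []string, error) {
	self := ""
	if p.SelfAddr != nil {
		addr, err := parseAddr(*p.SelfAddr)
		if err != nil {
			return "", nil, fmt.Errorf("SelfAddr: %w", err)
		}
		self = addr
	}
	pex := make([]string, 0, len(p.PexAddrs))
	for i, s := range p.PexAddrs {
		addr, err := parseAddr(s)
		if err != nil {
			return "", nil, fmt.Errorf("PexAddrs[%v]: %w", i, err)
		}
		pex = append(pex, addr)
	}
	return self, pex, nil
}

func main() {
	self := "idA@203.0.113.7:26656"
	_, pex, err := decode(&wireHandshake{SelfAddr: &self, PexAddrs: []string{"idB@198.51.100.2:26656"}})
	fmt.Println(len(pex), err)

	_, _, err = decode(&wireHandshake{PexAddrs: []string{"idB@198.51.100.2:26656", "garbage"}})
	fmt.Println(err)
}
```

Rejecting the whole message rather than skipping bad entries is reasonable here because, as the thread concludes, an honest sender never produces an unparseable address.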

if err != nil {
	return nil, fmt.Errorf("NodeAuthKey: %w", err)
}
nodeAuthSig, err := ed25519.SignatureFromBytes(p.NodeAuthSig)

Contributor

Can anyone with zero stake send us address updates?

Contributor Author

Yes, node keys do not have stake assigned. They are not validator keys.

  func (r *Router) Advertise(maxAddrs int) []NodeAddress {
- 	return r.peerManager.Advertise(maxAddrs)
+ 	addrs := r.peerManager.Advertise()
+ 	return addrs[:min(len(addrs), maxAddrs)]

Contributor

nit: would we ever want to pick randomly instead of always taking from the front?

Contributor Author (pompon0, Mar 2, 2026)

We might want to. Currently the earlier addresses are those of outbound connections, and they are preferred right now (by a rather weak argument that these are valid because we dialed them, while addresses of inbound connections are self-declared and therefore potentially misconfigured). We can revisit later.

@pompon0 pompon0 enabled auto-merge (squash) March 2, 2026 14:55
@pompon0 pompon0 merged commit 94a7bad into main Mar 2, 2026
35 checks passed
@pompon0 pompon0 deleted the gprusak-pex branch March 2, 2026 15:10

3 participants