Description
Originally raised by @zmanian:
Okay, so it looks like deleting a DNS record can cause a nil pointer dereference in the pex.
https://github.com/tendermint/security/blob/master/p2p/pex/addrbook.go#L299-L311
I think we need to check the input to MarkGood for nilness.
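A minimal sketch of the kind of nil check suggested here. The types below (`NetAddress`, `knownAddress`, `AddrBook`) are simplified, hypothetical stand-ins and not the real `addrbook.go` code; returning a bool is also an assumption made only so the guard is observable.

```go
package main

import "sync"

// NetAddress is a simplified, hypothetical stand-in for the p2p address type.
type NetAddress struct {
	ID   string
	Host string
	Port uint16
}

// knownAddress is a hypothetical per-peer record kept by the book.
type knownAddress struct {
	Addr *NetAddress
	Good bool
}

// AddrBook is a toy address book keyed by peer ID; it is not the real addrBook.
type AddrBook struct {
	mtx        sync.Mutex
	addrLookup map[string]*knownAddress
}

// NewAddrBook returns an empty book.
func NewAddrBook() *AddrBook {
	return &AddrBook{addrLookup: make(map[string]*knownAddress)}
}

// Add records addr in the book. A nil addr is ignored.
func (a *AddrBook) Add(addr *NetAddress) {
	if addr == nil {
		return
	}
	a.mtx.Lock()
	defer a.mtx.Unlock()
	a.addrLookup[addr.ID] = &knownAddress{Addr: addr}
}

// MarkGood flags addr as good and reports whether it did so.
// The nil check comes first, so a nil addr is never dereferenced.
func (a *AddrBook) MarkGood(addr *NetAddress) bool {
	if addr == nil {
		return false
	}
	a.mtx.Lock()
	defer a.mtx.Unlock()
	ka := a.addrLookup[addr.ID]
	if ka == nil {
		// Unknown peer: nothing to mark.
		return false
	}
	ka.Good = true
	return true
}
```

With this guard, `book.MarkGood(nil)` returns false instead of crashing the process. Whether the real method should silently ignore a nil address or log it is a separate design choice.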
Why could this be bad?
It's possible that deleting the DNS record for any of the seed nodes would halt the Hub or at least severely degrade the network.
Rough description of incident we just had at iqlusion:
Preconditions
Validators V1 and V2 were peered with sentry S by IP address (V1, V2 -> S)
Sentry S advertised an external_laddr with a DNS name
Sentry S's DNS name was resolved by validators V1 and V2
Cause
Sentry S's DNS record was deleted
Incident
V1 and V2 both crashed simultaneously with the stack trace below
Response
Restored the DNS record and waited for the NXRECORD result to expire from the DNS cache
Result
V1 and V2 came back once the DNS name resolved again
Mar 30 02:39:03 cosmos01.sjc1.iqint.net cosmoshub[9395]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xd47235]
Mar 30 02:39:03 cosmos01.sjc1.iqint.net cosmoshub[9395]: goroutine 106 [running]:
Mar 30 02:39:03 cosmos01.sjc1.iqint.net cosmoshub[9395]: github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/p2p/pex.(*addrBook).MarkGood(0xc002fc02a0, 0x0)
Mar 30 02:39:03 cosmos01.sjc1.iqint.net cosmoshub[9395]: /builder/home/go/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/p2p/pex/addrbook.go:303 +0x75
Mar 30 02:39:03 cosmos01.sjc1.iqint.net cosmoshub[9395]: github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/p2p.(*Switch).MarkPeerAsGood(0xc0027beff0, 0x128c100, 0xc002fc3b00)
Mar 30 02:39:03 cosmos01.sjc1.iqint.net cosmoshub[9395]: /builder/home/go/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/p2p/switch.go:381 +0x7b
Mar 30 02:39:03 cosmos01.sjc1.iqint.net cosmoshub[9395]: github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusReactor).peerStatsRoutine(0xc0008c2e00)
Mar 30 02:39:03 cosmos01.sjc1.iqint.net cosmoshub[9395]: /builder/home/go/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/consensus/reactor.go:830 +0x36b
Mar 30 02:39:03 cosmos01.sjc1.iqint.net cosmoshub[9395]: created by github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusReactor).OnStart
Mar 30 02:39:03 cosmos01.sjc1.iqint.net cosmoshub[9395]: /builder/home/go/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/consensus/reactor.go:75 +0xff