Skip to content

Commit

Permalink
Merge #21310: zmq test: fix sync-up by matching notification to gener…
Browse files Browse the repository at this point in the history
…ated block

8a8c638 zmq test: fix sync-up by matching notification to generated block (Sebastian Falbesoner)

Pull request description:

  This is a follow-up PR for #21008, fixes #21216.

  In the course of investigating the problem with jnewbery (analyzing the Cirrus log https://cirrus-ci.com/task/4660108304056320), it turned out that the "sync up" procedure of repeatedly generating a block and waiting for a notification with timeout is too brittle in its current form, as the following scenario could happen:

  - generate block A
  - receive notification, timeout happens => repeat procedure
  - generate block B
  - node publishes block A notification
  - receive notification, we receive the one caused by block A (!!!) => sync-up procedure is completed
  - node publishes block B notification
  - the actual test starts
  - on the first notification reception, the one caused by block B is received, rather than the one actually caused by test code => assertion failure

  This change in the PR ensures that after each test block generation, we wait for the notification that is actually caused by that block and ignore others from possibly earlier blocks. The matching is kind of ugly, it assumes that one out of four components in the block is contained in the notification: the block hash, the tx id, the raw block data or the raw transaction data. (Unfortunately we have to support all publisher topics.)

  I'm aware that this is quite a lot of code now only for establishing a robust test setup. OTOH I wouldn't know of a better method right now, suggestions are very welcome.

  Note for potential reviewers: for both reproducing the issue on master branch and verifying on PR branch, one can simply generate two blocks in the sync-up procedure rather than one.

ACKs for top commit:
  MarcoFalke:
    Concept ACK 8a8c638

Tree-SHA512: a2eb78ba06dfd0fda7b1c111b6bbfb5dab4ab08500cc19c7ea02c3239495d5c74cc7d45250a8b3ecc78ca42d97ee6719bf73db8a137839e5e09a3cfcf08ed29e
  • Loading branch information
MarcoFalke committed Mar 2, 2021
2 parents 72e6979 + 8a8c638 commit cfce346
Showing 1 changed file with 34 additions and 8 deletions.
42 changes: 34 additions & 8 deletions test/functional/interface_zmq.py
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,31 @@ def receive_sequence(self):
return (hash, label, mempool_sequence)


class ZMQTestSetupBlock:
"""Helper class for setting up a ZMQ test via the "sync up" procedure.
Generates a block on the specified node on instantiation and provides a
method to check whether a ZMQ notification matches, i.e. the event was
caused by this generated block. Assumes that a notification either contains
the generated block's hash, it's (coinbase) transaction id, the raw block or
raw transaction data.
"""

def __init__(self, node):
self.block_hash = node.generate(1)[0]
coinbase = node.getblock(self.block_hash, 2)['tx'][0]
self.tx_hash = coinbase['txid']
self.raw_tx = coinbase['hex']
self.raw_block = node.getblock(self.block_hash, 0)

def caused_notification(self, notification):
return (
self.block_hash in notification
or self.tx_hash in notification
or self.raw_block in notification
or self.raw_tx in notification
)


class ZMQTest (BitcoinTestFramework):
def set_test_params(self):
self.num_nodes = 2
Expand Down Expand Up @@ -105,17 +130,18 @@ def setup_zmq_test(self, services, *, recv_timeout=60, sync_blocks=True):
# Ensure that all zmq publisher notification interfaces are ready by
# running the following "sync up" procedure:
# 1. Generate a block on the node
# 2. Try to receive a notification on all subscribers
# 3. If all subscribers get a message within the timeout (1 second),
# 2. Try to receive the corresponding notification on all subscribers
# 3. If all subscribers get the message within the timeout (1 second),
# we are done, otherwise repeat starting from step 1
for sub in subscribers:
sub.socket.set(zmq.RCVTIMEO, 1000)
while True:
self.nodes[0].generate(1)
test_block = ZMQTestSetupBlock(self.nodes[0])
recv_failed = False
for sub in subscribers:
try:
sub.receive()
while not test_block.caused_notification(sub.receive().hex()):
self.log.debug("Ignoring sync-up notification for previously generated block.")
except zmq.error.Again:
self.log.debug("Didn't receive sync-up notification, trying again.")
recv_failed = True
Expand Down Expand Up @@ -340,7 +366,7 @@ def test_sequence(self):
block_count = self.nodes[0].getblockcount()
best_hash = self.nodes[0].getbestblockhash()
self.nodes[0].invalidateblock(best_hash)
sleep(2) # Bit of room to make sure transaction things happened
sleep(2) # Bit of room to make sure transaction things happened

# Make sure getrawmempool mempool_sequence results aren't "queued" but immediately reflective
# of the time they were gathered.
Expand Down Expand Up @@ -389,8 +415,8 @@ def test_sequence(self):
assert_equal(label, "A")
# More transactions to be simply mined
for i in range(len(more_tx)):
assert_equal((more_tx[i], "A", mempool_seq), seq.receive_sequence())
mempool_seq += 1
assert_equal((more_tx[i], "A", mempool_seq), seq.receive_sequence())
mempool_seq += 1
# Bumped by rbf
assert_equal((orig_txid, "R", mempool_seq), seq.receive_sequence())
mempool_seq += 1
Expand All @@ -405,7 +431,7 @@ def test_sequence(self):
assert_equal((orig_txid_2, "A", mempool_seq), seq.receive_sequence())
mempool_seq += 1
self.nodes[0].generatetoaddress(1, ADDRESS_BCRT1_UNSPENDABLE)
self.sync_all() # want to make sure we didn't break "consensus" for other tests
self.sync_all() # want to make sure we didn't break "consensus" for other tests

def test_mempool_sync(self):
"""
Expand Down

0 comments on commit cfce346

Please sign in to comment.