Skip to content

Conversation

hodlinator
Copy link
Contributor

@hodlinator hodlinator commented Sep 18, 2025

  • Remove incorrect docstring in busy_wait_for_debug_log().
  • Replace nerfed regex searches with X in Y expressions.
  • Only compute the log string to be printed on failure when we actually fail instead of every 0.05s.
  • As we find each needle (expected message) in the haystack (log output), stop searching for it. If we fail and time out, we will only complain about the needles (expected messages) we didn't find.

Found while developing a new test case in #33336 (comment)

@DrahtBot DrahtBot added the Tests label Sep 18, 2025
@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 18, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33423.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

Copy link
Contributor

@l0rinc l0rinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reducing useless work - did you measure any speedup?
I think we could make the result simpler and more pythonic - it's a test, I find visibly simpler code better here than theoretically faster one that we can't even measure, so I suggested a few changes.
Here's the end-result containing most of my suggestions:

diff --git a/test/functional/test_framework/test_node.py b/test/functional/test_framework/test_node.py
index e28140c17b..ccbde4702c 100755
--- a/test/functional/test_framework/test_node.py
+++ b/test/functional/test_framework/test_node.py
@@ -550,32 +550,31 @@ class TestNode():
         time_end = time.time() + timeout * self.timeout_factor
         prev_size = self.debug_log_size(encoding="utf-8")  # Must use same encoding that is used to read() below
 
-        def print_log(log):
-            return " - " + "\n - ".join(log.splitlines())
-
         yield
 
+        def join_lines(logs):
+            return " - " + "\n - ".join(logs.splitlines())
+
+        remaining = list(expected_msgs)
         while True:
-            found = []
             with open(self.debug_log_path, encoding="utf-8", errors="replace") as dl:
                 dl.seek(prev_size)
                 log = dl.read()
+
             for unexpected_msg in unexpected_msgs:
-                if re.search(re.escape(unexpected_msg), log, flags=re.MULTILINE):
-                    self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" found in log:\n\n{print_log(log)}\n\n')
-            for i, expected_msg in enumerate(expected_msgs):
-                if re.search(re.escape(expected_msg), log, flags=re.MULTILINE):
-                    found.append(i)
+                if unexpected_msg in log:
+                    self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" found in log:\n\n{join_lines(log)}\n\n')
+
+            found = {i for i, m in enumerate(remaining) if m in log}
             # We found remaining expected messages before encountering unexpected messages.
-            if len(found) == len(expected_msgs):
+            if len(found) == len(remaining):
                 return
             # Remove the ones we found so far, so we only complain about remaining ones when timing out.
-            for f in reversed(found):
-                expected_msgs.pop(f)
+            remaining = [m for m in remaining if m not in found]
             if time.time() >= time_end:
                 break
             time.sleep(0.05)
-        self._raise_assertion_error(f'Expected message(s) {expected_msgs!s} not found in log:\n\n{print_log(log)}\n\n')
+        self._raise_assertion_error(f'Expected message(s) {remaining!r} not found in log:\n\n{join_lines(log)}\n\n')
 
     @contextlib.contextmanager
     def busy_wait_for_debug_log(self, expected_msgs, timeout=60):
@@ -589,21 +588,16 @@ class TestNode():
 
         yield
 
+        remaining = list(expected_msgs)
         while True:
-            found = []
             with open(self.debug_log_path, "rb") as dl:
                 dl.seek(prev_size)
                 log = dl.read()
 
-            for i, expected_msg in enumerate(expected_msgs):
-                if expected_msg in log:
-                    found.append(i)
-
-            if len(found) == len(expected_msgs):
+            found = {i for i, m in enumerate(remaining) if m in log}
+            if len(found) == len(remaining):
                 return
-
-            for f in reversed(found):
-                expected_msgs.pop(f)
+            remaining = [m for m in remaining if m not in found]
 
             if time.time() >= time_end:
                 print_log = " - " + "\n - ".join(log.decode("utf8", errors="replace").splitlines())
@@ -612,8 +606,7 @@ class TestNode():
             # No sleep here because we want to detect the message fragment as fast as
             # possible.
 
-        self._raise_assertion_error(
-            f'Expected message(s) {expected_msgs!s} not found in log:\n\n{print_log}\n\n')
+        self._raise_assertion_error(f'Expected message(s) {remaining!r} not found in log:\n\n{print_log}\n\n')
 
     @contextlib.contextmanager
     def wait_for_new_peer(self, timeout=5):

found = False
if found:
for i, expected_msg in enumerate(expected_msgs):
if re.search(re.escape(expected_msg), log, flags=re.MULTILINE):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the reason for regex iteration above and here if we're just escaping it anyway? We're not doing it in busy_wait_for_debug_log
I have added

            found = []
            for i, expected_msg in enumerate(expected_msgs):
                if re.search(re.escape(expected_msg), log, flags=re.MULTILINE):
                    found.append(i)
            found2 = [i for i, m in enumerate(expected_msgs) if m in log]
            assert found == found2, f"Mismatch: {found} != {found2}"

and ran the whole suite and it seems to be the same, maybe we can simplify it.
Note that found2 should likely be a set instead of a list, otherwise comparing the sizes may not make sense. It also helps with the in condition for the remaining values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I thought there was some special property with re.search(..., re.MULTILINE), but digging deeper I think not. Looking at the PR which added it, I found #14024 (comment) which indicates that the original user of the function was passing in a search string with symbols that could be interpreted as a regex but shouldn't be. #13006 which spawned the issue mentioned regexps, so that's probably why it's in this unfortunate nerfed state.

I agree on switch to needle in haystack form instead. Confirmed it to work by running --extended locally.

The inefficiency of print_log was also pointed out but dismissed: #14024 (comment) - hopefully this PR is still acceptable.

@hodlinator hodlinator force-pushed the 2025/09/assert_debug_log branch from 93511bd to f18a3ca Compare September 20, 2025 21:33
Copy link
Contributor Author

@hodlinator hodlinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback!

Thanks for reducing useless work - did you measure any speedup?

Rebased new push on top of #33336 which has a lot of assert_debug_log() (010cf81), the below represents somewhat average runs:

₿ git co 2025/09/assert_debug_log_rebased~3
₿ hyperfine -r 30 -N ./build/test/functional/feature_assumevalid.py
Benchmark 1: ./build/test/functional/feature_assumevalid.py
  Time (mean ± σ):      6.058 s ±  0.049 s    [User: 4.263 s, System: 0.520 s]
  Range (min … max):    5.985 s …  6.157 s    30 runs

₿ git co 2025/09/assert_debug_log_rebased
₿ hyperfine -r 30 -N ./build/test/functional/feature_assumevalid.py
Benchmark 1: ./build/test/functional/feature_assumevalid.py
  Time (mean ± σ):      5.998 s ±  0.045 s    [User: 4.237 s, System: 0.496 s]
  Range (min … max):    5.929 s …  6.109 s    30 runs

So the tendency is a difference of 1% for this test case - a bit underwhelming. Would expect to see more of a difference for longer log files and more expected_msgs per invocation.

found = False
if found:
for i, expected_msg in enumerate(expected_msgs):
if re.search(re.escape(expected_msg), log, flags=re.MULTILINE):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I thought there was some special property with re.search(..., re.MULTILINE), but digging deeper I think not. Looking at the PR which added it, I found #14024 (comment) which indicates that the original user of the function was passing in a search string with symbols that could be interpreted as a regex but shouldn't be. #13006 which spawned the issue mentioned regexps, so that's probably why it's in this unfortunate nerfed state.

I agree on switch to needle in haystack form instead. Confirmed it to work by running --extended locally.

The inefficiency of print_log was also pointed out but dismissed: #14024 (comment) - hopefully this PR is still acceptable.

@hodlinator
Copy link
Contributor Author

CI failure (https://github.com/bitcoin/bitcoin/actions/runs/17885097855/job/50857723211?pr=33423) seems unrelated:

ERROR: failed to build: failed to solve: short read: expected 659078263 bytes but got 90259897: unexpected EOF
Command '['./ci/test/02_run_container.sh']' returned non-zero exit status 1.

@hodlinator hodlinator force-pushed the 2025/09/assert_debug_log branch from f18a3ca to d116c8a Compare September 22, 2025 18:55
@hodlinator
Copy link
Contributor Author

Latest push has somewhat simpler code (no found-list). Tested performance again, similarly to #33423 (review):

₿ git co 2025/09/assert_debug_log_rebased~3
₿ hyperfine -r 30 -N ./build/test/functional/feature_assumevalid.py
Benchmark 1: ./build/test/functional/feature_assumevalid.py
  Time (mean ± σ):      6.034 s ±  0.047 s    [User: 4.260 s, System: 0.488 s]
  Range (min … max):    5.964 s …  6.146 s    30 runs
 
₿ git co 2025/09/assert_debug_log_rebased
₿ hyperfine -r 30 -N ./build/test/functional/feature_assumevalid.py
Benchmark 1: ./build/test/functional/feature_assumevalid.py
  Time (mean ± σ):      5.990 s ±  0.046 s    [User: 4.233 s, System: 0.482 s]
  Range (min … max):    5.881 s …  6.126 s    30 runs

So less than 1% performance improvement in this case. Again, should perform better for other cases.

@hodlinator hodlinator force-pushed the 2025/09/assert_debug_log branch from d116c8a to 6561ac3 Compare September 25, 2025 11:57
Copy link
Contributor

@l0rinc l0rinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message seems off to me now, we're potentially logging lines as missing that we haven't checked.

Comment on lines +565 to +566
self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" '
f'found in log:\n\n{join_log(log)}\n\n')
Copy link
Contributor

@l0rinc l0rinc Sep 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this was always eagerly evaluated since fa3e9f7#diff-7f22a082e3a80ca2f212e36193f87dd80237035afedb7f15b116ef7fa265f3eeR245-R248 - the current approach of only calculating it once when the loop also terminates makes sense

nit: I know it's a long line but I would still prefer this merged to reduce the noise. I'm fine either way:

Suggested change
self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" '
f'found in log:\n\n{join_log(log)}\n\n')
self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" found in log:\n\n{join_log(log)}\n\n')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding nit: I sympathize with the way busy_wait_for_debug_log() on master puts the parameter to _raise_assertion_error() on it's own line, so I would be tempted to do that for all of them - which makes me realize: if only I broke the message in two lines, I could raise up the first half and still keep close to 80 char width - and we're back with the current approach. Left unchanged in latest push.

Comment on lines +577 to +576
self._raise_assertion_error(f'Expected message(s) {remaining_expected!s} '
f'not found in log:\n\n{join_log(log)}\n\n')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's not necessarily true, we have just stopped searching after the first failure, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing this issue!

Latest push inserts

remaining_expected = [e for e in remaining_expected if e not in log]

before the error reporting for it to be correct, meaning we only need to do the full search on failure.

hodlinator and others added 4 commits September 26, 2025 22:27
Left over since fa7b075.

Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
Always escaping the search string makes the use of regular expressions unnecessary.

Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
print_log was recalculated every 0.05s in assert_debug_log(), even during successful circumstances - changed to only be computed upon failure.

Simplified terminology from "(does not) partially match(es)" to "(not) found in" since it seems to reference the first function having used regular expression matching, while it always escaped the search strings (see parent commit). (Simplified grammar also avoids issues with singular/plural "was/were not found").
Only leaves messages we never found in the final assert message of the functions, which is more helpful (remaining_expected).

Avoids repeatedly searching for messages we have already found (pop()).

Stops searching for other expected messages if we already failed finding one. Still need to clean remaining_expected at the end, but *only if we fail*.

Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
@hodlinator hodlinator force-pushed the 2025/09/assert_debug_log branch from 6561ac3 to a1f7623 Compare September 26, 2025 21:16
Copy link
Contributor Author

@hodlinator hodlinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest push fixes bug from previous push (#33423 (comment)), simplifies code (#33423 (comment)) and adds commit to fix docstring (#33423 (comment)).

Compared run-times for all functional tests and wasn't able to find one which clearly benefited from this optimization. So I'm open to just replacing while remaining_expected and remaining_expected[-1] in log: remaining_expected.pop() with remaining_expected = [e for e in remaining_expected if e not in log] if reviewers prefer.

Comment on lines +577 to +576
self._raise_assertion_error(f'Expected message(s) {remaining_expected!s} '
f'not found in log:\n\n{join_log(log)}\n\n')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing this issue!

Latest push inserts

remaining_expected = [e for e in remaining_expected if e not in log]

before the error reporting for it to be correct, meaning we only need to do the full search on failure.

Comment on lines +565 to +566
self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" '
f'found in log:\n\n{join_log(log)}\n\n')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding nit: I sympathize with the way busy_wait_for_debug_log() on master puts the parameter to _raise_assertion_error() on it's own line, so I would be tempted to do that for all of them - which makes me realize: if only I broke the message in two lines, I could raise up the first half and still keep close to 80 char width - and we're back with the current approach. Left unchanged in latest push.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants