
precondition with lambda function slows down RuleBasedStateMachine considerably #3963

Closed
pspacek opened this issue May 2, 2024 · 5 comments
Labels
performance go faster! use less memory!

Comments


pspacek commented May 2, 2024

Hello once again,

and a big thank you for Hypothesis! It's a great tool and has helped me uncover nasty and hard-to-find bugs.

Summary

The @precondition decorator with a trivial lambda function (one integer comparison plus an object dereference) causes a significant slowdown in RuleBasedStateMachine execution.

Versions

  • hypothesis 6.100.2
  • pytest 8.1.2
  • Python 3.12.3 x86_64 on Arch Linux (package version 3.12.3-1)

Measurement on a realistic test case

I'm going to show measurements for my complex (but realistic) state machine, which uses a complex custom strategy to generate inputs. Given all that complexity I would expect the precondition's toll on execution time to be negligible, but that's not the case.

pytest reported runtime:

A] Including five preconditions of the form len(self.model) > 0 and self.iter_generation == self.testcase.generation:

72.39s (0:01:12)

B] Without preconditions (just commented out; they were meant as an optimization to steer the state machine into more interesting states):

6.18 s

syscall profiling

Output of perf trace --summary:

$ sudo perf trace --summary --pid `pgrep pytest`

 Summary of events:

 pytest (64494), 3217578 events, 100.0%

   syscall            calls  errors  total       min       avg       max       stddev
                                     (msec)    (msec)    (msec)    (msec)        (%)
   --------------- --------  ------ -------- --------- --------- ---------     ------
   openat           2842952 2828160 7342084.708     0.000     2.583    73.245      0.03%
   newfstatat         14791      0  2173.974     0.033     0.147    59.981      3.38%
   write             105496      0   170.597     0.001     0.002     0.034      0.27%
   read               14767      0    36.604     0.001     0.002     0.713      1.96%
   close              14791      0    28.622     0.001     0.002     0.021      0.23%
   fstat              14798      0    25.282     0.001     0.002     0.022      0.33%
   ioctl              14794  14791    17.895     0.001     0.001     0.018      0.34%
   lseek              14711      0    15.242     0.001     0.001     0.023      0.49%

In other words, roughly 10% of the runtime was spent in the openat syscall. Huh?

Probable cause

Here's what strace shows during test execution:

openat(AT_FDCWD, "<unknown>", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "<unknown>", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "<unknown>", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "<unknown>", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "<unknown>", O_RDONLY) = -1 ENOENT (No such file or directory)
... and it goes on and on ...

A bit of light debugging with GDB and the python-gdb plugin indicates that these calls originate here:

Traceback (most recent call first):
  File "/usr/lib/python3.12/ast.py", line 52, in parse
    return compile(source, filename, mode, flags,
  File "/usr/lib/python3.12/site-packages/hypothesis/internal/reflection.py", line ?, in extract_lambda_source
    (failed to get frame line number)
  File "/usr/lib/python3.12/site-packages/hypothesis/internal/reflection.py", line 449, in get_pretty_function_description
    return extract_lambda_source(f)
  File "/usr/lib/python3.12/site-packages/hypothesis/stateful.py", line 1009, in is_valid
    where = f"{desc} precondition {get_pretty_function_description(pred)}"
  File "/usr/lib/python3.12/site-packages/hypothesis/stateful.py", line 977, in <genexpr>
    if not any(self.is_valid(rule) for rule in self.rules):
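The traceback shows that a fresh pretty description of the lambda is built on every is_valid call. A stdlib-only sketch of why that is expensive (describe_lambda below is a hypothetical simplification of Hypothesis's extract_lambda_source, not its actual code): each call goes back to the filesystem for the defining source and re-parses it, with nothing cached between calls.

```python
import ast
import inspect

def describe_lambda(fn):
    # Hypothetical stand-in for Hypothesis's lambda source extraction:
    # each call re-reads the defining file and re-parses the source.
    # Nothing is cached, so the cost repeats on every state-machine step.
    try:
        source = inspect.getsource(fn).strip()  # touches the filesystem
        ast.parse(source)                       # full parse, every time
        return source
    except (OSError, TypeError, SyntaxError):
        # Source unavailable (e.g. interactive session); repeated failed
        # file lookups are what show up as ENOENT openat calls above.
        return repr(fn)

pred = lambda self: True
# In the stateful engine this runs once per precondition per step, so the
# expense scales with the number of steps, not the number of lambdas.
description = describe_lambda(pred)
```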

Confirmation

Just to confirm we are on the right track: this cheapskate patch brings the runtimes of the variants with and without preconditions to the same values:

--- a/stateful.py	2024-05-02 13:15:06.751722997 +0200
+++ b/stateful.py	2024-05-02 13:15:51.041613884 +0200
@@ -1006,8 +1006,8 @@
         desc = f"{self.machine.__class__.__qualname__}, rule {rule.function.__name__},"
         for pred in rule.preconditions:
             meets_precond = pred(self.machine)
-            where = f"{desc} precondition {get_pretty_function_description(pred)}"
-            predicates[where]["satisfied" if meets_precond else "unsatisfied"] += 1
+            #where = f"{desc} precondition {get_pretty_function_description(pred)}"
+            #predicates[where]["satisfied" if meets_precond else "unsatisfied"] += 1
             if not meets_precond:
                 return False

Reproducer

Here's a smaller test case which does not actually test anything:
test_reproducer.py.gz

  • With the precondition commented out, runtime is about 1.6 seconds on my machine.
  • With the precondition active, runtime jumps to a ridiculous ~35 seconds.
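The attached reproducer is not inlined above, so here is a hypothetical minimal state machine in the same spirit (this is NOT the author's test_reproducer.py.gz): on affected versions (< 6.100.3), every step re-describes the lambda passed to @precondition.

```python
# Hypothetical minimal reproducer: a state machine whose only notable
# cost should be the trivial lambda precondition.
from hypothesis import settings
from hypothesis.stateful import RuleBasedStateMachine, precondition, rule

class Counter(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.n = 0

    @rule()
    def increment(self):
        self.n += 1

    # On affected versions, this lambda is re-described on every step,
    # triggering source extraction and ast.parse each time.
    @precondition(lambda self: self.n >= 0)
    @rule()
    def noop(self):
        pass

TestCounter = Counter.TestCase
TestCounter.settings = settings(max_examples=50, deadline=None)
```

Running this under pytest with and without the @precondition line should show the same runtime gap as in the attached file.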

pspacek commented May 2, 2024

Another related issue seems to be that ast.parse() is for some reason consuming a lot of memory and keeping it around while the test runs. Here's a Scalene profile produced on the minimal test case above:
profile.json

40 MB does not look like much, but on my realistic test case memory usage climbs to 1 GB within the first minute and keeps growing until the run is terminated by the max_examples limit.

tybug added the performance (go faster! use less memory!) label and removed the bug (something is clearly wrong here) label May 3, 2024

tybug commented May 3, 2024

Wow, thanks for the detailed report! #3966 will address the performance and likely subsequently the memory...though we may still have an underlying leak somewhere.


pspacek commented May 3, 2024

Thanks for the quick turnaround!


tybug commented May 5, 2024

This should be fixed in hypothesis >= 6.100.3, though let us know if performance or memory is still bad!

tybug closed this as completed May 5, 2024

pspacek commented May 6, 2024

I confirm that CPU-time performance is okay on hypothesis 6.100.4. Memory consumption is still suspicious, but I will open a separate issue about it, assuming I'm able to write a sensible bug report.
