More small IR-related improvements #3854
Conversation
```diff
             self._draw_float(forced=clamped)
             result = clamped
         else:
             result = nasty_floats[i - 1]

-        self._write_float(result)
+        self._draw_float(forced=result)
```
wow, this change makes float tests take much longer. `-k float` goes from ~1 minute (master) to ~6 minutes (this branch) for me. The only thing `_draw_float` does that `_write_float` doesn't is start a new example, so I guess that's the cause? Is the extra example causing longer shrinks?
master:

```
===================================================================== slowest 20 durations ======================================================================
41.57s call hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers
3.23s call hypothesis-python/tests/nocover/test_collective_minimization.py::test_can_collectively_minimize[floats(min_value=3.14, max_value=3.14)]
2.19s call hypothesis-python/tests/cover/test_testdecorators.py::test_float_addition_is_associative
1.16s call hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers_when_fractional
```

this branch:

```
60.61s call hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers
53.77s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-5]
53.72s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-10]
53.29s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-7]
53.24s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-9]
53.02s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-2]
53.01s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-4]
52.42s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-6]
39.02s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-8]
3.32s call hypothesis-python/tests/nocover/test_collective_minimization.py::test_can_collectively_minimize[floats(min_value=3.14, max_value=3.14)]
1.88s call hypothesis-python/tests/cover/test_testdecorators.py::test_float_addition_is_associative
1.31s call hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers_when_fractional
1.06s call hypothesis-python/tests/numpy/test_from_dtype.py::test_float_subnormal_generation[32-False]
```
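As a toy model of what "starting a new example" means here (none of these names are Hypothesis's real internals, just an illustration), the `_draw_float` vs `_write_float` difference roughly corresponds to whether the drawn bytes get wrapped in an example span that the shrinker must later consider:

```python
# Toy model (NOT Hypothesis's real ConjectureData) of example boundaries.
# Each start/stop pair records a span over the underlying byte stream;
# more spans means more candidate regions for shrink passes to try.

class ToyData:
    def __init__(self):
        self.buffer = bytearray()
        self.examples = []   # completed (start, end) spans
        self._stack = []     # start offsets of currently-open spans

    def start_example(self):
        self._stack.append(len(self.buffer))

    def stop_example(self):
        start = self._stack.pop()
        self.examples.append((start, len(self.buffer)))

    def draw_bytes(self, bs, *, example=True):
        # example=True mimics a _draw_float-style call (wrapped in a span);
        # example=False mimics a _write_float-style raw write.
        if example:
            self.start_example()
        self.buffer.extend(bs)
        if example:
            self.stop_example()

data = ToyData()
data.draw_bytes(b"\x01\x02")             # wrapped draw: produces a span
data.draw_bytes(b"\x03", example=False)  # raw write: no span recorded
print(data.examples)  # [(0, 2)]
```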
It's certainly plausible that adding an example tag would take longer (via there being more options for what to do, and a slower path being chosen as a result). I don't have much intuition for that though. Comparing `--hypothesis-verbosity=debug` output might let you work out exactly what's happening?

I'd also be fine with adding examples only `if forced is None:`, so long as that doesn't make things flaky... iirc it should be fine so long as the IR sequence is identical.

That's a pretty painful performance regression though, and for our users as well as us, so I'd prefer not to ship until we get it sorted out one way or another.
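A minimal sketch of the suggested guard (toy classes and names, not Hypothesis's actual code): only open an example span for natural draws, so forced replays of the same draw sequence keep an identical span structure:

```python
import random

class ToyData:
    # Stand-in for a ConjectureData-like object; only counts spans.
    def __init__(self):
        self.spans = 0

    def start_example(self):
        self.spans += 1

    def stop_example(self):
        pass

def draw_float(data, forced=None):
    # Only open an example span for natural draws. Replaying the same
    # sequence of forced/natural draws then yields the same structure,
    # which is what keeps this guard from introducing flakiness.
    if forced is None:
        data.start_example()
    value = forced if forced is not None else random.random()
    if forced is None:
        data.stop_example()
    return value

d = ToyData()
draw_float(d)              # natural draw: opens a span
draw_float(d, forced=1.5)  # forced draw: no span added
print(d.spans)  # 1
```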
adding `if forced is None:` works great 👍 and is something we may want to do regardless.

I'm still narrowing this performance regression down (for my own curiosity, partially), but it seems likely that it's localized to the dfa / LStar code, and has little to no impact elsewhere. I think the only thing we use at runtime there is `LEARNED_DFAS`, so I wouldn't expect this to impact anything but our dfa tests. It doesn't seem like a general regression based on the simple addition of a new example.

I'll either narrow down to a root cause or go with a guard on adding examples.
ok, I'm rabbit hole-ing here, so I'm going to leave this dfa investigation for another day.

dfa profiling: some pyinstrument dumps, if they're informative to anyone — flamegraphs (speedscope format) for `_draw_float(forced=...)` and `_write_float(...)`. [attachment links elided]

I'm a little concerned that skipping examples for forced values is going to cause bad shrinks for e.g. third party primitive providers, which realize a buffer entirely through `forced` and wouldn't have the full example structure of the same naturally occurring buffer.
I read things a little closer and I'm fairly certain the `DRAW_FLOAT_LABEL` example in `_draw_float` is an exact duplicate of the one in `draw_float`. Removing this (29ae20d) seems to pass tests with no slowdown. Does this sound like an amicable alternative?

Based on this, I wonder if the performance of our dfa learner blows up on duplicate examples for whatever reason.
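For illustration (a toy, not Hypothesis's code), the duplicate-span shape described above looks like this: nested wrappers that each open an example over exactly the same bytes yield two spans with identical extents, adding structure without adding information:

```python
# Toy illustration of duplicate example spans. An outer draw_float-like
# wrapper and an inner _draw_float-like wrapper each record a span over
# the same 8 bytes, so the span list contains an exact duplicate.

spans = []

def with_span(fn, buffer):
    # Record a (start, end) span around whatever fn writes to buffer.
    start = len(buffer)
    fn(buffer)
    spans.append((start, len(buffer)))

def inner(buffer):
    # _draw_float-like: wraps the raw 8 float bytes in a span.
    with_span(lambda b: b.extend(b"\x00" * 8), buffer)

def outer(buffer):
    # draw_float-like: wraps inner again, over exactly the same bytes.
    with_span(inner, buffer)

buf = bytearray()
outer(buf)
print(spans)  # [(0, 8), (0, 8)] -- an exact duplicate span
```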
Huh, yeah, that seems plausible. Certainly the result seems good, so let's ship it!
I found the docstring for `hypothesis-python/src/hypothesis/internal/conjecture/shrinker.py` (lines 847 to 878 in 775ed91) [...]. Then we conditionally invoke this from [...]. Because we successfully replaced all these blocks, it's worth checking in case we can [...] (hope that helps!)

Clarifying the semantics of [...] I'm taking another look at [...]

pretty happy that I got [...]

(close-and-reopen to get CI unstuck)
The last simple bits I could squeeze out of #3818.

The last remaining test I'd like to migrate off `draw_bits` is `test_avoids_zig_zag_trap`. But I spent a fair amount of time trying to make that work, and wasn't able to get it to cooperate, likely because I don't fully understand when `lower_common_block_offset` fires. Any changes to `draw_bits` I made in the zigzag test caused `lower_common_block_offset` to be passed over as a result of this conditional being false (`hypothesis-python/src/hypothesis/internal/conjecture/shrinker.py`, lines 1011 to 1013 in 775ed91), but I don't really understand why this conditional is there or the circumstances that cause it to be true.
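For what it's worth, here is an illustrative sketch of the idea behind such a pass (my reading, not the actual shrinker code): in a zigzag trap, several blocks must keep their pairwise differences to stay failing, so lowering them one at a time always breaks the invariant, while lowering all of them by a common offset can succeed:

```python
# Sketch of a lower-common-offset shrink pass. `still_fails` is a
# hypothetical test predicate; real shrinkers re-run the test instead.

def lower_common_offset(values, still_fails):
    # Try subtracting progressively smaller common offsets from every
    # value at once, keeping their pairwise differences fixed.
    offset = min(values)
    while offset > 0:
        candidate = [v - offset for v in values]
        if still_fails(candidate):
            return candidate
        offset //= 2
    return values

# A zigzag-style predicate: fails whenever the two values differ by 1,
# so shrinking either value alone can never make progress.
fails = lambda vs: abs(vs[0] - vs[1]) == 1
print(lower_common_offset([1000, 1001], fails))  # [0, 1]
```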