Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More small IR-related improvements #3854

Merged
merged 10 commits into from
Jan 23, 2024

Conversation

tybug
Copy link
Member

@tybug tybug commented Jan 21, 2024

The last simple bits I could squeeze out of #3818.

The last remaining test I'd like to migrate off draw_bits is test_avoids_zig_zag_trap. But I spent a fair amount of time trying to make that work, and wasn't able to get it to cooperate, likely because I don't fully understand when lower_common_block_offset fires. Any changes to draw_bits I made in the zigzag test caused lower_common_block_offset to be passed over as a result of this conditional being false:

if initial_data is self.shrink_target:
self.lower_common_block_offset()
return True

but I don't really understand why this conditional is there or the circumstances that cause it to be true.

@tybug tybug requested a review from Zac-HD as a code owner January 21, 2024 21:00
Comment on lines +1096 to +1101
self._draw_float(forced=clamped)
result = clamped
else:
result = nasty_floats[i - 1]

self._write_float(result)
self._draw_float(forced=result)
Copy link
Member Author

@tybug tybug Jan 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, this change makes float tests take much longer. -k float goes from ~1 minute (master) to ~6 minutes (this branch) for me. The only thing _draw_float does that _write_float doesn't is start a new example, so I guess that's the cause? Is the extra example causing longer shrinks?

master:

===================================================================== slowest 20 durations ======================================================================
41.57s call     hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers
3.23s call     hypothesis-python/tests/nocover/test_collective_minimization.py::test_can_collectively_minimize[floats(min_value=3.14, max_value=3.14)]
2.19s call     hypothesis-python/tests/cover/test_testdecorators.py::test_float_addition_is_associative
1.16s call     hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers_when_fractional

this branch:

60.61s call     hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers
53.77s call     hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-5]
53.72s call     hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-10]
53.29s call     hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-7]
53.24s call     hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-9]
53.02s call     hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-2]
53.01s call     hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-4]
52.42s call     hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-6]
39.02s call     hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-8]
3.32s call     hypothesis-python/tests/nocover/test_collective_minimization.py::test_can_collectively_minimize[floats(min_value=3.14, max_value=3.14)]
1.88s call     hypothesis-python/tests/cover/test_testdecorators.py::test_float_addition_is_associative
1.31s call     hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers_when_fractional
1.06s call     hypothesis-python/tests/numpy/test_from_dtype.py::test_float_subnormal_generation[32-False]

Copy link
Member

@Zac-HD Zac-HD Jan 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's certainly plausible that adding an example tag would take longer (via there being more options for what to do, and a slower path being chosen as a result). I don't have much intuition for that though. Comparing --hypothesis-verbosity=debug might let you work out exactly what's happening?

I'd also be fine with adding examples only if forced is None:, so long as that doesn't make things flaky... iirc it should be fine so long as the IR sequence is identical.

That's a pretty painful performance regression though, and for our users as well as us, so I'd prefer not to ship until we get it sorted out one way or another.

Copy link
Member Author

@tybug tybug Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding if forced is None: works great 👍 and is something we may want to do regardless.

I'm still narrowing this performance regression down (for my own curiosity, partially), but it seems likely that it's localized to the dfa / LStar code, and has little to no impact elsewhere. I think the only thing we use at runtime there is LEARNED_DFAS, so I wouldn't expect this to impact anything but our dfa tests. It doesn't seem like a general regression based on the simple addition of a new example.

I'll either narrow down to a root cause or go with a guard on adding examples.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'm rabbit hole-ing here, so I'm going to leave this dfa investigation for another day.

dfa profiling Some pyinstrument dumps, if they're informative to anyone:

_draw_float(forced=...)

image

_draw_float.html.zip

Flamegraph (speedscope format):

image

run.speedscope.json


_write_float(...)

image

_write_float.html.zip

I'm a little concerned that skipping examples for forced values is going to cause bad shrinks for e.g. third party primitive providers, which realize a buffer entirely through forced and wouldn't have the full example structure of the same naturally occurring buffer.

I read things a little closer and I'm fairly certain the DRAW_FLOAT_LABEL example in _draw_float is an exact duplicate of the one in draw_float. Removing this (29ae20d) seems to pass tests with no slowdown. Does this sound like an amicable alternative?

Based on this, I wonder if the performance of our dfa learner blows up on duplicate example for whatever reason.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, yeah, that seems plausible. Certainly the result seems good, so let's ship it!

@tybug tybug mentioned this pull request Jan 21, 2024
5 tasks
@Zac-HD
Copy link
Member

Zac-HD commented Jan 22, 2024

I found the docstring for lower_common_block_offset reasonably helpful:

"""Sometimes we find ourselves in a situation where changes to one part
of the byte stream unlock changes to other parts. Sometimes this is
good, but sometimes this can cause us to exhibit exponential slow
downs!
e.g. suppose we had the following:
m = draw(integers(min_value=0))
n = draw(integers(min_value=0))
assert abs(m - n) > 1
If this fails then we'll end up with a loop where on each iteration we
reduce each of m and n by 2 - m can't go lower because of n, then n
can't go lower because of m.
This will take us O(m) iterations to complete, which is exponential in
the data size, as we gradually zig zag our way towards zero.
This can only happen if we're failing to reduce the size of the byte
stream: The number of iterations that reduce the length of the byte
stream is bounded by that length.
So what we do is this: We keep track of which blocks are changing, and
then if there's some non-zero common offset to them we try and minimize
them all at once by lowering that offset.
This may not work, and it definitely won't get us out of all possible
exponential slow downs (an example of where it doesn't is where the
shape of the blocks changes as a result of this bouncing behaviour),
but it fails fast when it doesn't work and gets us out of a really
nastily slow case when it does.
"""

Then we conditionally invoke this from try_shrinking_blocks, after trying out our replaced-all-blocks buffer. Note here that that if initial_data is self.shrink_target: means that we succeeded - we can tell that the buffer was a successful shrink because the result is also now our self.shrink_target. (if it was merely equal, maybe our shrink was a noop, but we use is here)

Because we successfully replaced all these blocks, it's worth checking in case we can lower_common_block_offset and skip an exponential-in-size zig-zag where we can lower alternating blocks by a small amount on each attempt.

(hope that helps!)

@tybug
Copy link
Member Author

tybug commented Jan 22, 2024

Clarifying the semantics of if initial_data is self.shrink_target: was particularly helpful for me, thanks!

I'm taking another look at test_avoids_zig_zag_trap, but I'm also happy to get this pull in as-is (if you happen to get to this pull before I reach a conclusion). Any follow-up here should be a test-only change and so doesn't have a release overhead.

@tybug
Copy link
Member Author

tybug commented Jan 22, 2024

pretty happy that I got test_avoids_zig_zag_trap to work here. I think I was running into trouble earlier by using unbounded draws (no max value) in the test definition, which doesn't seem to be in the right representation for this shrink common offset trick to work on it. I've kept the previous bounded draws and things work nicely.

@Zac-HD Zac-HD closed this Jan 23, 2024
@Zac-HD Zac-HD reopened this Jan 23, 2024
@Zac-HD
Copy link
Member

Zac-HD commented Jan 23, 2024

(close-and-reopen to get CI unstuck)

Comment on lines +1096 to +1101
self._draw_float(forced=clamped)
result = clamped
else:
result = nasty_floats[i - 1]

self._write_float(result)
self._draw_float(forced=result)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, yeah, that seems plausible. Certainly the result seems good, so let's ship it!

@Zac-HD Zac-HD merged commit 38eafc7 into HypothesisWorks:master Jan 23, 2024
47 checks passed
@tybug tybug deleted the more-test-changes branch January 23, 2024 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants