More small IR-related improvements #3854

tybug · 2024-01-21T21:00:14Z

The last simple bits I could squeeze out of #3818.

The last remaining test I'd like to migrate off draw_bits is test_avoids_zig_zag_trap. But I spent a fair amount of time trying to make that work, and wasn't able to get it to cooperate, likely because I don't fully understand when lower_common_block_offset fires. Any changes to draw_bits I made in the zigzag test caused lower_common_block_offset to be passed over as a result of this conditional being false:

hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/shrinker.py

Lines 1011 to 1013 in 775ed91

    
           if initial_data is self.shrink_target: 
        
               self.lower_common_block_offset() 
        
               return True

but I don't really understand why this conditional is there or the circumstances that cause it to be true.

tybug · 2024-01-21T21:34:20Z

hypothesis-python/src/hypothesis/internal/conjecture/data.py

+                    self._draw_float(forced=clamped)
                    result = clamped
            else:
                result = nasty_floats[i - 1]

-                self._write_float(result)
+                self._draw_float(forced=result)


wow, this change makes float tests take much longer. -k float goes from ~1 minute (master) to ~6 minutes (this branch) for me. The only thing _draw_float does that _write_float doesn't is start a new example, so I guess that's the cause? Is the extra example causing longer shrinks?

master:

===================================================================== slowest 20 durations ====================================================================== 41.57s call hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers 3.23s call hypothesis-python/tests/nocover/test_collective_minimization.py::test_can_collectively_minimize[floats(min_value=3.14, max_value=3.14)] 2.19s call hypothesis-python/tests/cover/test_testdecorators.py::test_float_addition_is_associative 1.16s call hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers_when_fractional

this branch:

60.61s call hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers 53.77s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-5] 53.72s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-10] 53.29s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-7] 53.24s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-9] 53.02s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-2] 53.01s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-4] 52.42s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-6] 39.02s call hypothesis-python/tests/quality/test_normalization.py::test_common_strategies_normalize_small_values[floats()-8] 3.32s call hypothesis-python/tests/nocover/test_collective_minimization.py::test_can_collectively_minimize[floats(min_value=3.14, max_value=3.14)] 1.88s call hypothesis-python/tests/cover/test_testdecorators.py::test_float_addition_is_associative 1.31s call hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers_when_fractional 1.06s call hypothesis-python/tests/numpy/test_from_dtype.py::test_float_subnormal_generation[32-False]

It's certainly plausible that adding an example tag would take longer (via there being more options for what to do, and a slower path being chosen as a result). I don't have much intuition for that though. Comparing --hypothesis-verbosity=debug might let you work out exactly what's happening?

I'd also be fine with adding examples only if forced is None:, so long as that doesn't make things flaky... iirc it should be fine so long as the IR sequence is identical.

That's a pretty painful performance regression though, and for our users as well as us, so I'd prefer not to ship until we get it sorted out one way or another.

adding if forced is None: works great 👍 and is something we may want to do regardless.

I'm still narrowing this performance regression down (for my own curiosity, partially), but it seems likely that it's localized to the dfa / LStar code, and has little to no impact elsewhere. I think the only thing we use at runtime there is LEARNED_DFAS, so I wouldn't expect this to impact anything but our dfa tests. It doesn't seem like a general regression based on the simple addition of a new example.

I'll either narrow down to a root cause or go with a guard on adding examples.

ok, I'm rabbit hole-ing here, so I'm going to leave this dfa investigation for another day.

dfa profiling
Some pyinstrument dumps, if they're informative to anyone:

_draw_float(forced=...)

_draw_float.html.zip

Flamegraph (speedscope format):

run.speedscope.json

_write_float(...)

_write_float.html.zip

I'm a little concerned that skipping examples for forced values is going to cause bad shrinks for e.g. third party primitive providers, which realize a buffer entirely through forced and wouldn't have the full example structure of the same naturally occurring buffer.

I read things a little closer and I'm fairly certain the DRAW_FLOAT_LABEL example in _draw_float is an exact duplicate of the one in draw_float. Removing this (29ae20d) seems to pass tests with no slowdown. Does this sound like an amicable alternative?

Based on this, I wonder if the performance of our dfa learner blows up on duplicate example for whatever reason.

Huh, yeah, that seems plausible. Certainly the result seems good, so let's ship it!

Zac-HD · 2024-01-22T00:06:53Z

I found the docstring for lower_common_block_offset reasonably helpful:

hypothesis/hypothesis-python/src/hypothesis/internal/conjecture/shrinker.py

Lines 847 to 878 in 775ed91

    
                   """Sometimes we find ourselves in a situation where changes to one part 
        
                   of the byte stream unlock changes to other parts. Sometimes this is 
        
                   good, but sometimes this can cause us to exhibit exponential slow 
        
                   downs! 
        
                   e.g. suppose we had the following: 
        
                   m = draw(integers(min_value=0)) 
        
                   n = draw(integers(min_value=0)) 
        
                   assert abs(m - n) > 1 
        
                   If this fails then we'll end up with a loop where on each iteration we 
        
                   reduce each of m and n by 2 - m can't go lower because of n, then n 
        
                   can't go lower because of m. 
        
                   This will take us O(m) iterations to complete, which is exponential in 
        
                   the data size, as we gradually zig zag our way towards zero. 
        
                   This can only happen if we're failing to reduce the size of the byte 
        
                   stream: The number of iterations that reduce the length of the byte 
        
                   stream is bounded by that length. 
        
                   So what we do is this: We keep track of which blocks are changing, and 
        
                   then if there's some non-zero common offset to them we try and minimize 
        
                   them all at once by lowering that offset. 
        
                   This may not work, and it definitely won't get us out of all possible 
        
                   exponential slow downs (an example of where it doesn't is where the 
        
                   shape of the blocks changes as a result of this bouncing behaviour), 
        
                   but it fails fast when it doesn't work and gets us out of a really 
        
                   nastily slow case when it does. 
        
                   """

Then we conditionally invoke this from try_shrinking_blocks, after trying out our replaced-all-blocks buffer. Note here that that if initial_data is self.shrink_target: means that we succeeded - we can tell that the buffer was a successful shrink because the result is also now our self.shrink_target. (if it was merely equal, maybe our shrink was a noop, but we use is here)

Because we successfully replaced all these blocks, it's worth checking in case we can lower_common_block_offset and skip an exponential-in-size zig-zag where we can lower alternating blocks by a small amount on each attempt.

(hope that helps!)

tybug · 2024-01-22T18:50:15Z

Clarifying the semantics of if initial_data is self.shrink_target: was particularly helpful for me, thanks!

I'm taking another look at test_avoids_zig_zag_trap, but I'm also happy to get this pull in as-is (if you happen to get to this pull before I reach a conclusion). Any follow-up here should be a test-only change and so doesn't have a release overhead.

tybug · 2024-01-22T22:05:28Z

pretty happy that I got test_avoids_zig_zag_trap to work here. I think I was running into trouble earlier by using unbounded draws (no max value) in the test definition, which doesn't seem to be in the right representation for this shrink common offset trick to work on it. I've kept the previous bounded draws and things work nicely.

Zac-HD · 2024-01-23T05:54:14Z

(close-and-reopen to get CI unstuck)

Zac-HD · 2024-01-23T10:24:31Z

hypothesis-python/src/hypothesis/internal/conjecture/data.py

+                    self._draw_float(forced=clamped)
                    result = clamped
            else:
                result = nasty_floats[i - 1]

-                self._write_float(result)
+                self._draw_float(forced=result)


Huh, yeah, that seems plausible. Certainly the result seems good, so let's ship it!

tybug added 6 commits January 21, 2024 15:49

add __eq__ to IntervalSet

6fa8879

more draw_bits migrations

51a8b52

make float_runner more robust to buffer changes

6c91c72

formatting

d5008da

add release notes

403e1c1

add an extra zero byte for 24 bit int draw

4ff4d6d

tybug requested a review from Zac-HD as a code owner January 21, 2024 21:00

use forced= instead of write_float

2d20af0

tybug commented Jan 21, 2024

View reviewed changes

tybug mentioned this pull request Jan 21, 2024

Migrate DataTree to the new IR #3818

Merged

5 tasks

remove redundant example

29ae20d

tybug added 2 commits January 22, 2024 14:19

add cover tests for intervalset

02820f1

migrate buffer semantics in test_avoids_zig_zag_trap

16c0de8

Zac-HD closed this Jan 23, 2024

Zac-HD reopened this Jan 23, 2024

Zac-HD approved these changes Jan 23, 2024

View reviewed changes

Zac-HD merged commit 38eafc7 into HypothesisWorks:master Jan 23, 2024
47 checks passed

tybug deleted the more-test-changes branch January 23, 2024 15:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More small IR-related improvements #3854

More small IR-related improvements #3854

tybug commented Jan 21, 2024

tybug Jan 21, 2024 •

edited

Loading

Zac-HD Jan 21, 2024 •

edited

Loading

tybug Jan 22, 2024 •

edited

Loading

tybug Jan 22, 2024

Zac-HD Jan 23, 2024

Zac-HD commented Jan 22, 2024

tybug commented Jan 22, 2024

tybug commented Jan 22, 2024

Zac-HD commented Jan 23, 2024

Zac-HD Jan 23, 2024

	if initial_data is self.shrink_target:
	self.lower_common_block_offset()
	return True

More small IR-related improvements #3854

More small IR-related improvements #3854

Conversation

tybug commented Jan 21, 2024

tybug Jan 21, 2024 • edited Loading

Choose a reason for hiding this comment

Zac-HD Jan 21, 2024 • edited Loading

Choose a reason for hiding this comment

tybug Jan 22, 2024 • edited Loading

Choose a reason for hiding this comment

tybug Jan 22, 2024

Choose a reason for hiding this comment

Zac-HD Jan 23, 2024

Choose a reason for hiding this comment

Zac-HD commented Jan 22, 2024

tybug commented Jan 22, 2024

tybug commented Jan 22, 2024

Zac-HD commented Jan 23, 2024

Zac-HD Jan 23, 2024

Choose a reason for hiding this comment

tybug Jan 21, 2024 •

edited

Loading

Zac-HD Jan 21, 2024 •

edited

Loading

tybug Jan 22, 2024 •

edited

Loading