migrate Float shrinker to the ir #3899
Conversation
Seeing some slow shrinks locally (…)
leaf = self.ir_tree_leaves[self.ir_tree_leaves_index]
if leaf.ir_type != ir_type or kwargs != leaf.kwargs:
    # Unlike buffers, not every ir tree is a valid choice sequence. If
    # we want to maintain determinism we don't have many options here
    # beyond giving up when a modified tree becomes misaligned.
    #
    # For what it's worth, misaligned buffers — albeit valid — are rather
    # unlikely to be *useful* buffers, so this isn't an enormous
    # downgrade.
    self.mark_overrun()
Shrinking may make modifications that misalign the ir tree. When working with buffers, we still run these all the way to a result, but my guess is they are very unlikely to be useful shrinks.
I don't know if there's anything better we can do for ir trees than giving up, beyond realigning tricks that are research papers in their own right. One idea is passing shrinker.engine.random here and continuing randomly, which might work, but I don't know if the same ir tree prefix running to different buffers will blow up.
By bailing out early we do less work in the test function, which means we could actually be spending proportionally more time on useful shrinks by overrunning. So it might be a wash at the end of the day, but I don't have a good intuition.
(Also, just realizing: this should maybe be mark_invalid rather than mark_overrun.)
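A toy sketch of that "continue randomly" fallback, in case it helps make the trade-off concrete. Everything here (Leaf, draw_value) is a made-up stand-in rather than Hypothesis' real internals; the real shrinker would hook this into mark_overrun()/mark_invalid():

```python
# Toy, self-contained sketch of the "continue randomly" idea discussed above.
# Leaf and draw_value are hypothetical stand-ins, not Hypothesis code.
import random
from dataclasses import dataclass, field


@dataclass
class Leaf:
    ir_type: str
    kwargs: dict = field(default_factory=dict)
    value: object = None


def draw_value(leaves, index, ir_type, kwargs, rng=None):
    leaf = leaves[index]
    if leaf.ir_type != ir_type or kwargs != leaf.kwargs:
        if rng is None:
            # Deterministic mode: give up, like mark_overrun() above.
            raise ValueError("misaligned ir tree")
        # Speculative mode: keep going with a fresh random value
        # (only floats handled here, for brevity).
        if ir_type == "float":
            return rng.uniform(kwargs.get("min_value", 0.0),
                               kwargs.get("max_value", 1.0))
        raise NotImplementedError(ir_type)
    return leaf.value


leaves = [Leaf("float", {"min_value": 0.0, "max_value": 1.0}, 0.25)]
# Aligned draws replay the recorded value; misaligned ones fall back to random.
assert draw_value(leaves, 0, "float", {"min_value": 0.0, "max_value": 1.0}) == 0.25
draw_value(leaves, 0, "float", {"min_value": 0.0, "max_value": 2.0}, rng=random.Random(0))
```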
@DRMacIver would be the world expert on this one 😅
Let's start with the simplest implementation that gets all our tests passing, and then we can profile it and see if there's anything obviously-silly going on. "Continue randomly until not-a-shrink" would make sense to me, although I'm not confident it'd be a win; "continue but try realigning each time we leave an Example" is also salient to me for the explain phase (for shrinking, we'd also abandon if the misaligned Example was > the base).
Good and bad news, mostly good:
I've pushed all my work here so we can see how everything fits together in a single place, but I'm going to bundle a few smaller changes in a split pr shortly. As is typical, this pr got very large very quickly 😄
ex.label == DRAW_FLOAT_LABEL
and len(ex.children) == 2
and ex.children[1].length == 8
leaf = chooser.choose(self.leaves, lambda leaf: leaf.ir_type == "float")
Right now, we call Float.shrink on any float node (plus some short-circuiting behavior in Float). The previous behavior skipped Float.shrink on floats that happened to also be integers:
buf = self.shrink_target.buffer
b = buf[u:v]
f = lex_to_float(int_from_bytes(b))
b2 = int_to_bytes(float_to_lex(f), 8)
if b == b2 or self.consider_new_buffer(buf[:u] + b2 + buf[v:]):
I think the if b == b2 condition there is equivalent to checking is_simple(f), i.e. don't try to float-shrink float nodes if they are already integers. The comments in lexical.py in this commit give some further insight:
# This floating point number can be represented in our simple format.
# So we try converting it to that (which will give the same float, but
# a different encoding of it). If that doesn't work then the float
# value of this doesn't unambiguously give the desired predicate, so
# this approach isn't useful. If it *does* work, then we're now in a
# situation where we don't need it, so either way we return here.
if is_simple(f):
    self.incorporate_float(f)
    return
To be honest, I'm not sold that we don't want to shrink float nodes as floats even if they happen to currently be integers. We mostly defer to the Integer shrinker anyway, and it also seems to make logical sense to keep all the float-shrinking logic in one place instead of jumping around. But I'm probably missing a good historical reason for it being done this way. It's easy to put this condition back either way.
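For context, here is how I read is_simple; the 56-bit threshold and the handling of inf/nan are my assumptions rather than a quote of the library's actual definition. Under this reading, the b == b2 check above succeeds exactly when the float already uses the canonical "simple" encoding:

```python
# Hedged sketch of is_simple (assumed behavior, not copied from Hypothesis):
# a float is "simple" if it is integer-valued and small enough to live in
# the tagged-integer branch of the lexicographic encoding (~56 bits here).
def is_simple(f: float) -> bool:
    try:
        i = int(f)
    except (ValueError, OverflowError):
        return False  # inf and nan can never be simple
    if i != f:
        return False  # non-integer-valued floats are never simple
    return i.bit_length() <= 56


assert is_simple(3.0)
assert not is_simple(0.5)
```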
I think the intuition here is that we consider integer-valued floats simpler than all non-integer-valued floats, and so once we get to an integer value we might as well use the machinery we have for shrinking integers.
You could probably convince me that a different ordering is better, but that's how I understand the status quo.
OK, I think that makes sense to me. I'll add this safeguard back. I was thinking about the case where shrinking an integer-valued float is faster if you allow it to try intermediate float values, but Integer probably gets you close enough to the minimum, and then you can finish it off with a final Float pass.
I'd like to take a detailed look at the shrink ordering at some point. I saw a few weird things, like values in (0, 1) being ordered after almost everything else; assert float_to_lex(sys.float_info.max) < float_to_lex(0.5). But that's for another day!
I think Hypothesis' shrinker is so well tuned that I shouldn't make changes like this without profiling or being very confident in them first. Of course, we may have no choice but to be adventurous when we get to the more complicated passes 🙂
Values between zero and one: not integers, and we order kinda by number of decimal places, so. And the largest finite float is an integer, which makes it "simpler" than 0.5! (I did say I'm not that attached to the ordering)
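An illustrative check of that quirk; float_to_lex is internal, so the import path below is an assumption and may move between Hypothesis versions:

```python
import sys

# Internal API; this import path is an assumption and may change.
from hypothesis.internal.conjecture.floats import float_to_lex

# Values in (0, 1) sort as more complex than almost everything else, so
# even the largest finite float gets a smaller lex key than 0.5.
assert float_to_lex(sys.float_info.max) < float_to_lex(0.5)
```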
Unfortunately ran out of time tonight for the aforementioned split pr, but will get to it tomorrow.
Now that the infra is in place, I'm hopeful we can fly through these shrinker migrations. Only had time for this tonight, but more pulls to come!
ahh, coverage failure - will have to wait for tomorrow.
Looking good! Small comments below, but this doesn't look like it'll suddenly grow on us...
3ca282e is a probably-bug-fix: we weren't considering …
Looks great with some very small comments below, please merge when they're addressed (or not) to your satisfaction!
thanks for the comments, they were all helpful!
Since we've closed #3086, do you want to open an issue to track all the remaining parts of the migrate-to-IR project? #3086 (comment) is not particularly detailed, and I think we could probably break it down further now that we've come this far.
I have also been thinking that we need a better issue for the IR project. I'll write something up.
I had originally tried writing a shrinker_v2.py from scratch... and quickly abandoned it as infeasible. I think my current plan is to convert shrinking passes one by one (where possible, which may or may not be all of them). minimize_floats is one of the more directly applicable ones. Definitely open to feedback on the details of the migration approach here.
There is overlap in the IRTree definition with #3806, though it's not exactly the same. Either is fine to merge before the other; I'll handle conflict resolution whichever way.