Support forcing all primitives #3801

tybug · 2023-11-26T06:38:42Z

Another step towards #3086!

hypothesis-python/src/hypothesis/internal/conjecture/data.py

hypothesis-python/src/hypothesis/internal/intervalsets.py

hypothesis-python/src/hypothesis/internal/conjecture/utils.py

tybug · 2023-11-26T06:49:58Z

hypothesis-python/src/hypothesis/internal/conjecture/utils.py

+        forced_choice = (
+            None
+            if forced is None
+            else next((b, a, a_c) for (b, a, a_c) in self.table if forced in (b, a))


No comment here... it's ugly, but it gets the job done. If I've missed an insight that renders this (and the forced in choice) unnecessary, let me know.
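For readers unfamiliar with the shape of that lookup, here is a minimal standalone sketch of forcing a value out of an alias-style table. It assumes rows of the form (base, alternate, alternate_chance) as in the snippet above; the table contents, the `sample` function, and the rule for choosing the alternate are hypothetical illustrations, not the Sampler code from this PR.

```python
import random
from typing import List, Optional, Tuple

Row = Tuple[int, int, float]  # (base, alternate, alternate_chance)

# Hand-written rows for illustration only; not a table Hypothesis built.
TABLE: List[Row] = [(0, 2, 0.25), (1, 1, 0.0), (2, 0, 0.5)]


def sample(table: List[Row], rng: random.Random, forced: Optional[int] = None) -> int:
    if forced is None:
        # Unforced: pick a row uniformly, then maybe switch to its alternate.
        base, alternate, alternate_chance = rng.choice(table)
        use_alternate = rng.random() < alternate_chance
    else:
        # Forced: take the first row that can produce the forced value,
        # mirroring the next(...) lookup quoted above.
        base, alternate, alternate_chance = next(
            row for row in table if forced in (row[0], row[1])
        )
        # Assumption: prefer the base when both entries match.
        use_alternate = forced != base
    return alternate if use_alternate else base
```

Under these assumptions, `sample(TABLE, random.Random(0), forced=2)` returns 2 regardless of the random state, because the first matching row is selected and the alternate is taken only when the base does not match.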

hypothesis-python/src/hypothesis/internal/conjecture/data.py

Zac-HD · 2023-11-26T09:00:31Z

I've only given this a quick skim, but so far it looks good - and I'm really excited about what it's going to enable!

Let's keep going with the pattern of shipping each individual chunk asap: I think we can get this one in pretty soon, and then have a followup which adds something like ConjectureData.force_from_primitives() to replay from a list - I expect that to be harder than it sounds.

tybug · 2023-11-26T18:39:17Z

Best I can tell, the failing nocover test about large integers is a result of this drive-by change: 3ce6373#diff-46397813b66506dd6aaeb5162b247ecbe2cae12df531a635d1326b70f8eb543cR1215-R1217. Reverting it makes the test pass.

Something strange is going on here, because that draw_boolean call is definitely not returning True with probability p = 7 / 8. See the following patch:

diff --git a/hypothesis-python/src/hypothesis/internal/conjecture/data.py b/hypothesis-python/src/hypothesis/internal/conjecture/data.py
index dfdc14303..2f0f3a4fa 100644
--- a/hypothesis-python/src/hypothesis/internal/conjecture/data.py
+++ b/hypothesis-python/src/hypothesis/internal/conjecture/data.py
@@ -858,7 +858,7 @@ class ConjectureResult:
 BYTE_MASKS = [(1 << n) - 1 for n in range(8)]
 BYTE_MASKS[0] = 255
 
-
+bool_draws = defaultdict(lambda: [0, 0])
 class PrimitiveProvider:
     # This is the low-level interface which would also be implemented
     # by e.g. CrossHair, by an Atheris-hypothesis integration, etc.
@@ -982,6 +982,10 @@ class PrimitiveProvider:
                     self._cd.draw_bits(bits, forced=int(result))
             break
         self._cd.stop_example()
+
+        if forced is None:
+            bool_draws[p][0] += int(result) # successes
+            bool_draws[p][1] += 1 # attempts
         return result
 
     def draw_integer(

and test code:

from hypothesis import *
from hypothesis.strategies import *
from hypothesis.internal.conjecture.data import bool_draws

values = []
@settings(database=None, max_examples=1000)
@given(integers(0, 1e100))
def test(x):
    if 2 <= x <= int(1e100) - 2:  # skip forced-endpoints
        values.append(x)

test()

print(bool_draws)

defaultdict(<function <lambda> at 0x105f32c00>, {0.875: [586, 995], 0.2758620689655171: [17, 75], 0.0: [0, 97], 0.3448275862068966: [169, 306], 0.1034482758620685: [0, 14], 0.1724137931034483: [12, 51], 0.3793103448275863: [9, 29], 0.5172413793103452: [4, 4], 0.06896551724138078: [0, 7], 0.413793103448274: [0, 3]})

which gives an observed probability of 586 / 995 ≈ 0.59 where we expect 0.875.

I'm still looking into this. We can revert the change for this pull request if we need to, but I'd like to figure out why this is occurring.
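As a rough sanity check that a gap this size is not sampling noise, the counts reported above can be compared against the expected probability using the standard error of a binomial proportion. This is a sketch using only the numbers quoted in this comment; it says nothing about the cause of the discrepancy.

```python
import math

# Counts copied from the bool_draws output above for p = 0.875.
successes, attempts = 586, 995
expected_p = 0.875

observed = successes / attempts
# Standard error of a binomial proportion under the expected probability.
std_err = math.sqrt(expected_p * (1 - expected_p) / attempts)
# How many standard errors the observation sits from the expectation.
z_score = (observed - expected_p) / std_err

print(f"observed={observed:.3f} expected={expected_p} z={z_score:.1f}")
```

The observed rate lands well over twenty standard errors below 0.875, so random variation across 995 draws is not a plausible explanation.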

tybug · 2023-12-02T22:40:39Z

Sounds good! I keep forgetting that a bunch of this cruft goes away or is improved once more usages of the IR are in place.

:(

tybug · 2023-12-03T06:25:45Z

To circle back to the failure mentioned in #3801 (comment): this regressed in 9283da3. The desired counterexample is no longer found within 1000 tries, but is found within 5000. Clearly, removing the discards and the final forced draw has had an impact on the data distribution. (If I had to hazard a guess as to why, it would be that generate_novel_prefix now explores a different part of the search space first, but I'm not confident in that assessment.)

I've increased the budget for the test, but I am a bit concerned about it, since I think dtype.kind == "U" just delegates to st.characters(), which means this could have a relatively wide impact if it is an issue.

hypothesis/hypothesis-python/src/hypothesis/extra/numpy.py

Lines 199 to 200 in ff22890

    
           if NP_FIXED_UNICODE and "alphabet" not in kwargs:
               kwargs["alphabet"] = st.characters()

I think this is the final remaining issue, though. Contingent on the above change being amenable, this PR is ready for a final review from my end!

tybug · 2023-12-03T06:34:34Z

hmm, ./build.sh check-conjecture-coverage passes locally for me...I wonder why it failed here? https://github.com/HypothesisWorks/hypothesis/actions/runs/7075174386/job/19257014760?pr=3801

Zac-HD

Looking good!

hypothesis-python/src/hypothesis/internal/conjecture/data.py

Zac-HD · 2023-12-09T07:40:54Z

Looks good - the only missing thing was that we'd spotted some edge cases to handle, without explicitly testing them. I added those myself to speed through review, but it looks like we need to add some additional logic to handle the sign bit on a nan.

(floating point numbers are so much more complicated than people want to think about 😅)
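To illustrate why the sign bit of a nan needs its own handling, here is a small standalone sketch (not the fix from this PR). The helper name `sign_bit` is hypothetical. It shows that ordinary comparisons cannot tell a negative nan from a positive one, while the raw bit pattern or `math.copysign` can.

```python
import math
import struct


def sign_bit(x: float) -> int:
    """Return the IEEE 754 sign bit of a double (1 if set, else 0)."""
    (bits,) = struct.unpack("!Q", struct.pack("!d", x))
    return bits >> 63


# Build nans with an explicit sign rather than relying on float("nan").
pos_nan = math.copysign(float("nan"), 1.0)
neg_nan = math.copysign(float("nan"), -1.0)

# Comparisons involving nan are always False, so `x < 0` cannot see the sign.
print(neg_nan < 0, pos_nan < 0)
# The sign bit is still present in the representation.
print(sign_bit(pos_nan), sign_bit(neg_nan))
# copysign reads the sign bit even when the value is nan.
print(math.copysign(1.0, neg_nan))
```

Any logic that infers the sign from a comparison such as `x < 0` will treat both nans identically, which is one way a forced sign bit can be lost.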

tybug · 2023-12-09T22:40:46Z

thanks for the additional tests 👍. Was a relatively straightforward fix.

Zac-HD · 2023-12-09T23:33:24Z

Woohoo! Really exciting to have this merged 😁

I'm going to try to get a simple test-only PRNG-based backend working as part of #3806, including replay and shrinking support... unclear whether it's feasible in a weekend, but we're that close to Crosshair support!

tybug added 23 commits November 24, 2023 18:03

type IntervalSet more

c09018e

implement forced for many

07115c7

implement forced for draw_string

ed42225

add test for forced many

ae52218

move forcing tests to separate file

e91ad95

linting

3ce6373

implement forcing for Sampler

79a39d3

linting

7e4b1d7

implement forcing for draw_integer, except weights

2c8a41a

add test for string forcing

699d56a

fix index_from_char_in_shrink_order bug

1090d94

fix off by one error

95d76ca

parameterize test by min_size, max_size

0c9d072

use st.data instead of constructing ConjectureData

2eb8fc8

simplify shrink_towards logic by clamping to start

44d824c

fix bug in Sampler forcing

3c385d1

add support for forcing integer weights

4866f5e

standardize forced test names

8e28786

add support for forcing bytes

16179e3

add support for forcing floats

ac232ea

add return typing for intervalsets

d36d285

require equal size in forced draw_bytes

64edc2e

add tests for written forced buffer

0f80e0e

tybug requested review from DRMacIver and Zac-HD as code owners November 26, 2023 06:38

tybug commented Nov 26, 2023

allow forced nans

50bc3f7

revert draw_boolean change

ef5e449

disallow out of bounds forced combinations

724ea85

Zac-HD mentioned this pull request Dec 3, 2023

Add support for crosshair backend #3806

Merged

tybug added 5 commits December 3, 2023 00:24

more correct condition check

0aa30ee

increase max_examples for dtype bug

2014ccf

suppress flaky too_slow health check

7ec7d8b

add release notes

654dc02

increase budget again

5c9c4a6

:(

move assume to correct scope

f74b93f

Zac-HD reviewed Dec 8, 2023

tybug added 2 commits December 7, 2023 23:33

formatting

cc77fb1

simpler forced logic for floats

a52bf7e

tybug force-pushed the forced-primitives branch from cbddf7c to a52bf7e Compare December 8, 2023 04:54

tybug and others added 4 commits December 8, 2023 00:04

fix generating nans when not allowed

9a232f0

fix lte sign comparison

d9cdfe2

simpler forced float draw

76c37eb

Test forced-float edge cases

738b6c2

tybug added 2 commits December 9, 2023 17:39

formatting

f52c198

correct forced_sign_bit handling for nans

0f0ebcc

Zac-HD approved these changes Dec 9, 2023

Zac-HD enabled auto-merge December 9, 2023 22:45

Zac-HD merged commit 512cfed into HypothesisWorks:master Dec 9, 2023
47 checks passed

tybug deleted the forced-primitives branch December 9, 2023 23:37

tybug mentioned this pull request Jan 6, 2024

6.91.2: flaky tests/quality/test_discovery_ability.py::test_can_produce_multi_line_strings #3829

Closed

tybug mentioned this pull request Mar 15, 2024

Migrate our core representation to the typed choice sequence #3921

Open

4 tasks
