Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support forcing all primitives #3801

Merged
merged 47 commits into from
Dec 9, 2023
Merged

Conversation

tybug
Copy link
Member

@tybug tybug commented Nov 26, 2023

Another step towards #3086!

Comment on lines 180 to 183
forced_choice = (
None
if forced is None
else next((b, a, a_c) for (b, a, a_c) in self.table if forced in (b, a))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no comment...ugly, but gets the job done. If I've missed an insight that renders this (and the forced in choice) unnecessary, let me know.

@Zac-HD
Copy link
Member

Zac-HD commented Nov 26, 2023

I've only given this a quick skim, but so far it looks good - and I'm really excited about what it's going to enable!

Let's keep going with the pattern of shipping each individual chunk asap: I think we can get this one in pretty soon, and then have a followup which adds something like ConjectureData.force_from_primitives() to replay from a list - I expect that to be harder than it sounds.

@tybug
Copy link
Member Author

tybug commented Nov 26, 2023

Best I can tell, the failing nocover test about large integers is a result of this driveby change: 3ce6373#diff-46397813b66506dd6aaeb5162b247ecbe2cae12df531a635d1326b70f8eb543cR1215-R1217. Reverting this passes the test.

Something strange is going on here, because that draw_boolean call is definitely not returning True with p = 7 / 8. See following patch:

diff --git a/hypothesis-python/src/hypothesis/internal/conjecture/data.py b/hypothesis-python/src/hypothesis/internal/conjecture/data.py
index dfdc14303..2f0f3a4fa 100644
--- a/hypothesis-python/src/hypothesis/internal/conjecture/data.py
+++ b/hypothesis-python/src/hypothesis/internal/conjecture/data.py
@@ -858,7 +858,7 @@ class ConjectureResult:
 BYTE_MASKS = [(1 << n) - 1 for n in range(8)]
 BYTE_MASKS[0] = 255
 
-
+bool_draws = defaultdict(lambda: [0, 0])
 class PrimitiveProvider:
     # This is the low-level interface which would also be implemented
     # by e.g. CrossHair, by an Atheris-hypothesis integration, etc.
@@ -982,6 +982,10 @@ class PrimitiveProvider:
                     self._cd.draw_bits(bits, forced=int(result))
             break
         self._cd.stop_example()
+
+        if forced is None:
+            bool_draws[p][0] += int(result) # successes
+            bool_draws[p][1] += 1 # attempts
         return result
 
     def draw_integer(

and test code:

from hypothesis import *
from hypothesis.strategies import *
from hypothesis.internal.conjecture.data import bool_draws

values = []
@settings(database=None, max_examples=1000)
@given(integers(0, 1e100))
def test(x):
    if 2 <= x <= int(1e100) - 2:  # skip forced-endpoints
        values.append(x)

test()

print(bool_draws)
defaultdict(<function <lambda> at 0x105f32c00>, {0.875: [586, 995], 0.2758620689655171: [17, 75], 0.0: [0, 97], 0.3448275862068966: [169, 306], 0.1034482758620685: [0, 14], 0.1724137931034483: [12, 51], 0.3793103448275863: [9, 29], 0.5172413793103452: [4, 4], 0.06896551724138078: [0, 7], 0.413793103448274: [0, 3]})

which gives probability 586 / 995 = 0.58 where we expect 0.875.

I'm still looking into this. We can revert if we need for this pull, but I'd like to figure out why this is occurring.

@tybug
Copy link
Member Author

tybug commented Dec 2, 2023

sounds good! I keep forgetting that a bunch of this cruft goes away / is improved after more usages of the IR is in place.

@tybug
Copy link
Member Author

tybug commented Dec 3, 2023

To circle back to the failure here mentioned here #3801 (comment) - this regressed in 9283da3. The desired counterexample is no longer found within 1000 tries, but is found within 5000. Clearly removing the discards and final forced draw has had an impact on data distribution. (If I had to hazard a guess as to why, it would be generate_novel_prefix exploring a different part of the search space first now, but I'm not confident in that assessment).

I've increased the budget for the test, but I am a bit concerned about it, since I think dtype.kind == "U" just delegates to st.characters(), which means this could have a relatively wide impact if it is an issue.

if NP_FIXED_UNICODE and "alphabet" not in kwargs:
kwargs["alphabet"] = st.characters()

I think this is the final remaining issue, though. Contingent on the above change being amenable, this PR is ready for a final review from my end!

@tybug
Copy link
Member Author

tybug commented Dec 3, 2023

hmm, ./build.sh check-conjecture-coverage passes locally for me...I wonder why it failed here? https://github.com/HypothesisWorks/hypothesis/actions/runs/7075174386/job/19257014760?pr=3801

Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

@Zac-HD
Copy link
Member

Zac-HD commented Dec 9, 2023

Looks good - the only missing thing was that we'd spotted some edge cases to handle, without explicitly testing them. I added those myself to speed through review, but it looks like we need to add some additional logic to handle the sign bit on a nan.

(floating point numbers are so much more complicated than people want to think about 😅)

@tybug
Copy link
Member Author

tybug commented Dec 9, 2023

thanks for the additional tests 👍. Was a relatively straightforward fix.

@Zac-HD Zac-HD enabled auto-merge December 9, 2023 22:45
@Zac-HD Zac-HD merged commit 512cfed into HypothesisWorks:master Dec 9, 2023
47 checks passed
@Zac-HD
Copy link
Member

Zac-HD commented Dec 9, 2023

Woohoo! Really exciting to have this merged 😁

I'm going to try to get a simple test-only PRNG-based backend working as part of #3806, including replay and shrinking support... unclear whether it's feasible in a weekend, but we're that close to Crosshair support!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants