
Add note to TypeError when each element to sampled_from is a strategy #3820

Merged

Conversation

@vreuter (Contributor) commented on Dec 22, 2023

Close #3819

@vreuter force-pushed the vr/warn-sampling-stategies-issue3819 branch 2 times, most recently from fb88761 to f647beb on December 22, 2023
@vreuter changed the title from "Warn when argument to sampled_from suggests one_of was intended" to "Warn when argument to sampled_from suggests that one_of was intended" on December 22, 2023
@vreuter force-pushed the vr/warn-sampling-stategies-issue3819 branch 4 times, most recently from 0de010a to 3bbcf9b on December 22, 2023
@Zac-HD (Member) commented on Dec 22, 2023

Hi Vince - thanks so much for contributing!

Unfortunately I don't think that a warning is suitable here, because the user might well want to sample from a list of strategies (e.g. as part of a .flatmap() strategy).

However, we can still do something useful by providing a note or suggestion if and only if the test failed - the implementation gets a bit more complicated, but in return we can reduce the false-alarm rate and avoid annoying users whose tests are actually OK. The basic idea (sketched in code after the list):

  • Instead of emitting a warning, set a boolean flag on the data object (somewhat like this)
  • If the test fails with a TypeError and the message contains the substring SearchStrategy, then add a note to the exception object (we have a backport helper in core.py).
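
A minimal sketch of this two-step idea, assuming hypothetical names such as data._sampled_a_strategy and maybe_add_note (the real implementation lives in Hypothesis internals, e.g. core.py):

    # Step 1: inside SampledFromStrategy.do_draw, record the situation on the
    # data object instead of warning eagerly.
    def do_draw(self, data):
        result = ...  # stand-in: the element actually drawn by the existing logic
        if isinstance(result, SearchStrategy):
            data._sampled_a_strategy = True  # hypothetical attribute name
        return result

    # Step 2: where the failing test's exception is handled, attach a note.
    def maybe_add_note(exc, data):
        if (
            isinstance(exc, TypeError)
            and "SearchStrategy" in str(exc)
            and getattr(data, "_sampled_a_strategy", False)
        ):
            # BaseException.add_note is Python 3.11+; Hypothesis keeps a
            # backport helper for older versions.
            exc.add_note(
                "sampled_from() was given strategies as elements; "
                "did you mean one_of()?"
            )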

vreuter added a commit to vreuter/hypothesis that referenced this pull request on Dec 26, 2023
@vreuter force-pushed the vr/warn-sampling-stategies-issue3819 branch from 81232a2 to a8a5635 on December 26, 2023
vreuter added a commit to vreuter/hypothesis that referenced this pull request on Dec 26, 2023
@vreuter force-pushed the vr/warn-sampling-stategies-issue3819 branch from a8a5635 to 7706606 on December 26, 2023
vreuter added a commit to vreuter/hypothesis that referenced this pull request on Dec 26, 2023
@vreuter force-pushed the vr/warn-sampling-stategies-issue3819 branch from 5472452 to 6993160 on December 27, 2023
@vreuter (Contributor, Author) commented on Dec 27, 2023

However, we can still do something useful by providing a note or suggestion if and only if the test failed - the implementation gets a bit more complicated, but in return we can reduce the false-alarm rate and avoid annoying users whose tests are actually OK.

💯 agree

@Zac-HD I've moved the implementation more in line with this description (though setting a flag on the strategy instance rather than on the data object), narrowing the context in which this API-use suggestion is given and lessening the severity (a note on the exception rather than a warning). WDYT?

@Zac-HD (Member) commented on Dec 27, 2023

Nice work!

Setting a flag on the data instance is going to be more annoying, but I think it's also necessary - in the current state, we'll only add the note if our sampled_from strategy is passed directly to @given(), but we'll miss e.g. st.lists(st.sampled_from(...)) or data.draw(st.sampled_from(...)). Implementation notes (a sketch follows the list):

  • Instead of iterating over the list in SampledFromStrategy.__init__, let's add the check to SampledFromStrategy.do_draw - if isinstance(result, SearchStrategy) and <flag not yet set> and <all options are strategies>:
  • We'll then need to add the note somewhere where we have access to the data object - I'd suggest adding the note to e here
  • Let's test with indirect strategies and with st.data() as noted above
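
A rough sketch of that do_draw-time check, assuming data is the per-test-case object (the flag name here is hypothetical, not the one the PR ultimately used):

    def do_draw(self, data):
        result = ...  # stand-in: the element actually drawn by the existing logic
        if (
            isinstance(result, SearchStrategy)
            and not getattr(data, "_sampled_from_all_strategies", False)
            and all(isinstance(x, SearchStrategy) for x in self.elements)
        ):
            # Flag at most once per test case, and only when *every* element
            # is a strategy - a mixed collection may well be intentional.
            data._sampled_from_all_strategies = True
        return result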

@vreuter (Contributor, Author) commented on Dec 29, 2023

@Zac-HD I've updated the implementation to apply to the indirect and direct sampling cases, rather than just the @given case, and added some corresponding tests. I wound up needing to add a check in StateForActualGivenExecution.run_engine for loss of __notes__ when an error is reproduced, as I didn't see a better way/place to do it. WDYT?

I thought an even safer way would be to concatenate, or take an order-preserving union of, the potentially two collections of errors (from the original and from the reproduction), but I wasn't sure whether there's already a code path that would result in duplication, or whether you envision one.
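
For illustration, an order-preserving, de-duplicating union of two __notes__ lists could look like the following sketch (names hypothetical, not code from this PR):

    def merge_notes(original_notes, reproduced_notes):
        # Keep first-seen order and drop duplicates, so notes from the
        # original failure survive reproduction without being repeated.
        seen = set()
        merged = []
        for note in (original_notes or []) + (reproduced_notes or []):
            if note not in seen:
                seen.add(note)
                merged.append(note)
        return merged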

@Zac-HD (Member) left a comment

Looking good! I particularly like the new tests 😁

See comments below on the message details; I think this will be the last round of review before we merge 🤞

Comment on lines 292 to 297:

    try:
        func_to_call()
    except BaseException as e:
        pytest.fail(f"Expected call to succeed but got error: {e}")
    else:
        assert True
@Zac-HD (Member):

Suggested change:

    - try:
    -     func_to_call()
    - except BaseException as e:
    -     pytest.fail(f"Expected call to succeed but got error: {e}")
    - else:
    -     assert True
    + func_to_call()

OK to fail the test with whatever unexpected exception we got, I think.

@vreuter (Contributor, Author):

Makes sense, I simplified here: ffe03c4

Comment on lines 531 to 536:

    if (
        isinstance(result, SearchStrategy)
        and not hasattr(data, "sampling_is_from_a_collection_of_strategies")
        and all(isinstance(x, SearchStrategy) for x in self.elements)
    ):
        data.sampling_is_from_a_collection_of_strategies = True
@Zac-HD (Member):

Suggested change:

    - if (
    -     isinstance(result, SearchStrategy)
    -     and not hasattr(data, "sampling_is_from_a_collection_of_strategies")
    -     and all(isinstance(x, SearchStrategy) for x in self.elements)
    - ):
    -     data.sampling_is_from_a_collection_of_strategies = True
    + if (
    +     isinstance(result, SearchStrategy)
    +     and all(isinstance(x, SearchStrategy) for x in self.elements)
    + ):
    +     data._sampled_from_strategy_reprs.append(repr(self))

If we collect this list (and initialize it in the __init__ method), we can show the exact repr of the problematic strategy or strategies, and I think that would make the note even more helpful.
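
For illustration only, the note might then be assembled from the collected reprs roughly like this (a hypothetical sketch, not the exact message Hypothesis emits; exc and data are assumed to be in scope):

    # Hypothetical: build the note from reprs collected during drawing.
    reprs = getattr(data, "_sampled_from_strategy_reprs", [])
    if reprs and isinstance(exc, TypeError) and "SearchStrategy" in str(exc):
        exc.add_note(
            "sampled_from was given a collection of strategies: "
            + ", ".join(reprs)
            + ". Was one_of() intended?"
        )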

@vreuter (Contributor, Author):

I insert a repr of the strategies here: 4948d08
I just did this as a single str value rather than a list. Is there a reason to append here (and create the list if it doesn't already exist)?

Comment on lines 1167 to 1172:

    try:
        notes = info._expected_exception.__notes__
    except AttributeError:
        pass
    else:
        e.__notes__ = notes
@Zac-HD (Member):

I think it's probably better to leave this out. While seeing the notes might be helpful, I'm concerned that moving the notes from the old exception might be confusing (especially if the new exception already has notes!).

Did you have a particular case in mind where it would be useful?

@vreuter (Contributor, Author):

Happy to update this another way, but with this removed and no other change, the "positive" tests (where the note should be available) don't pass. It seems to be because the code path where the note is currently added is not walked during reproduction (which calls execute_one directly rather than going through _execute_once_for_engine). Is this acceptable? Am I missing something that should be evident?

@Zac-HD (Member):

That suggests to me that we might be adding the note in the wrong place... but I'm not sure exactly what the solution should be. I'll take a look soon.

@vreuter (Contributor, Author):

I had the same sense but wasn't sure how you'd prefer to alter the note's attachment point, sorry about that. Your recent commit looks good!

vreuter added a commit to vreuter/hypothesis that referenced this pull request on Dec 31, 2023
@vreuter requested a review from Zac-HD on December 31, 2023
@vreuter (Contributor, Author) commented on Jan 3, 2024

@Zac-HD any idea what that memory error, which looks specific to Windows 11, may be? I thought it might be ephemeral, but it seems to be persistent.

@jobh (Contributor) commented on Jan 3, 2024

@Zac-HD any idea what that memory error, which looks specific to Windows 11, may be? I thought it might be ephemeral, but it seems to be persistent.

It looks like the repr has just become extremely long, for some reason, and exhausts available memory.

@vreuter (Contributor, Author) commented on Jan 3, 2024

It looks like the repr has just become extremely long, for some reason, and exhausts available memory.

👌 I'm going to try limiting the number of displayed elements; TY @jobh
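
One way to cap the repr length - a hypothetical sketch under assumed names, not the exact fix in this PR:

    # Show at most `limit` elements in the strategy repr so that enormous
    # sample collections cannot exhaust memory while formatting the note.
    def _limited_repr(elements, limit=5):
        shown = ", ".join(map(repr, elements[:limit]))
        if len(elements) > limit:
            shown += f", ...<{len(elements) - limit} more>"
        return f"sampled_from([{shown}])"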

@vreuter force-pushed the vr/warn-sampling-stategies-issue3819 branch from 4543005 to 699850f on January 3, 2024
@vreuter (Contributor, Author) commented on Jan 3, 2024

The currently failing test is failing because it bails out of the last 2 of the 100 attempts to generate random text: with 47 successes after 98 runs, even 2 more successes would give 47 + (100 - 98) = 49 < 50, the number required under the default success-rate spec of 0.5.

Why is this ("> 1/2 of draws from text() should be multiline") enforced as part of the contract of the strategy? Is that intended and/or reasonable?

Regardless, this test failure reproduces only a small fraction of the time for me locally. Could this contract either be relaxed, or the test rewritten so that the failure reproduces more stably? @Zac-HD @jobh

@vreuter (Contributor, Author) commented on Jan 3, 2024

This failing check (check-quality) does not reproduce locally (macOS) when running build.sh check-quality, even after clearing the cache.

For the record, here's the output of the local run:

============================================ test session starts =============================================
platform darwin -- Python 3.10.13, pytest-7.4.3, pluggy-1.3.0
cachedir: .tox/quality/.pytest_cache
rootdir: ~/code/hypothesis
configfile: pytest.ini
plugins: hypothesis-6.92.1, xdist-3.5.0
12 workers [254 items]
...................................................................................................... [ 40%]
...................................................................................................... [ 80%]
..................................................                                                     [100%]
============================================ slowest 20 durations ============================================
88.20s call     hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers
7.38s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_minimize_multiple_elements_in_silly_large_int_range
5.82s call     hypothesis-python/tests/quality/test_discovery_ability.py::test_can_produce_multi_line_strings
4.28s call     hypothesis-python/tests/quality/test_float_shrinking.py::test_shrinks_downwards_to_integers_when_fractional
3.43s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_minimize_multiple_elements_in_silly_large_int_range_min_is_not_dupe
2.95s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_find_large_union_list
2.93s call     hypothesis-python/tests/quality/test_discovery_ability.py::test_can_produce_large_binary_strings
2.62s call     hypothesis-python/tests/quality/test_deferred_strategies.py::test_non_trivial_json
2.57s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_can_ignore_left_hand_side_of_flatmap
2.26s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_dictionary[OrderedDict]
2.22s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_dictionary[dict]
1.93s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_minimize_long_list
1.84s call     hypothesis-python/tests/quality/test_discovery_ability.py::test_ints_can_occasionally_be_really_large
1.83s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_minimize_sets_of_sets
1.76s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_minimize_mixed_list
1.67s call     hypothesis-python/tests/quality/test_discovery_ability.py::test_can_produce_unstripped_strings
1.58s call     hypothesis-python/tests/quality/test_poisoned_trees.py::test_can_reduce_poison_from_any_subtree[15993493061449915028-10]
1.52s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_minimize_longer_list_of_strings
1.18s call     hypothesis-python/tests/quality/test_shrink_quality.py::test_duplicate_containment

(1 durations < 1s hidden.  Use -vv to show these durations.)
======================================= 254 passed in 95.14s (0:01:35) =======================================

@vreuter changed the title from "Warn when argument to sampled_from suggests that one_of was intended" to "Add note to TypeError when each element to sampled_from is a strategy" on Jan 4, 2024
@tybug (Member) commented on Jan 6, 2024

Regardless, this test failure reproduces only a small fraction of the time for me locally. Could this contract either be relaxed, or the test rewritten so that the failure reproduces more stably?

This is indeed a flaky test and was separately reported in #3829. Let's track it in that issue; you can ignore that test failure for now. I suspect its flakiness is a result of recent changes (not from this PR) and may be indicative of a deeper issue.

@vreuter (Contributor, Author) commented on Jan 7, 2024

Regardless, this test failure reproduces only a small fraction of the time for me locally. Could this contract either be relaxed, or the test rewritten so that the failure reproduces more stably?

This is indeed a flaky test and was separately reported in #3829. Let's track it in that issue; you can ignore that test failure for now. I suspect its flakiness is a result of recent changes (not from this PR) and may be indicative of a deeper issue.

Much appreciated @tybug; sorry I hadn't noticed the other report of that test's flakiness. Sounds good, thanks!

@Zac-HD force-pushed the vr/warn-sampling-stategies-issue3819 branch from 9e67f68 to 82408d9 on January 8, 2024
@Zac-HD (Member) left a comment

Thanks again @vreuter!

@Zac-HD merged commit e2e5b4c into HypothesisWorks:master on Jan 8, 2024 (47 checks passed)
clrpackages pushed a commit to clearlinux-pkgs/pypi-hypothesis that referenced this pull request on Jan 12, 2024
…version 6.92.6

CI on behalf of the Hypothesis team (1):
      Bump hypothesis-python version to 6.92.6 and update changelog

Vince Reuter (16):
      add warning all elements to sampled_from are strategies, suggestive of one_of; close #3819
      try catching expected warning in test contexts
      add explicit stacklevel argument to pass linting
      generate patch release description file
      remove warning-compensatory changes to test suite
      narrow the severity and cases for alternate API use suggestion; close #3819
      remove unused helper method, add param ids, work better with staticmethod
      make implementation valid for more use cases
      add indirect tests
      ensure that err reprod doesn't squash error notes
      add direct draw tests for #3819; condense and narrow assertion logic
      simplify no-error check in test; HypothesisWorks/hypothesis#3820
      more informative note for #3819
      ignore mypy check for new attr
      limit the number of elements shown for API use suggestion; #3819
      locally ignore undefined attribute for mypy

Zac Hatfield-Dodds (2):
      tweak formatting + move note
      remove unused ignore
Linked issue: Warn when passing *strategies* as elements to sampled_from (rather than to one_of)? (#3819)