Remove sub-ir examples #4007

tybug · 2024-05-29T19:12:24Z

Part of #3921.

For instance, we now deduplicate the following perfectly:

st.integers(n, m) with m - n <= 127 (due to integer weights drawing another int...)
st.lists(st.integers())
probably many more

In fact, our deduplication tracking may now be so good that we need to artificially duplicate inputs in order to coax out flaky errors!

Accordingly, this fixes our duplication of timezone_keys in #3932.

from hypothesis import given, settings, strategies as st

seen = []

@given(st.integers(0, 99), st.integers(0, 99))
@settings(database=None, max_examples=100000)
def f(n1, n2):
    seen.append((n1, n2))

f()
# 10000, 10000
print(len(set(seen)), len(seen))

hypothesis-python/src/hypothesis/internal/conjecture/data.py

tybug · 2024-05-29T19:16:58Z

        def start_example(self, i: int, label_index: int) -> None:
-            depth = len(self.example_stack)
-            self.groups[label_index, depth].append(i)
+            key = (self.examples[i].ir_start, self.examples[i].ir_end)
+            self.groups[label_index].add(key)


LazyStrategy causes duplicate example boundaries since, as best I can tell, we wrap all of our core strategies in a lazy strategy. I've worked around that here by making groups unique up to (ir_start, ir_end). The ideal fix is "don't create examples for LazyStrategy", but a naive attempt at that breaks data.draw(label=...) overrides, since the label doesn't get passed to the underlying strategy call. It may be doable with more plumbing.
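A minimal sketch of the workaround described above, using a simplified stand-in rather than the engine's real data structures. The `Span` class, the label strings, and `group_spans` are hypothetical names for illustration only. The assumption is that a wrapper which adds no choices of its own records the same (ir_start, ir_end) pair as the strategy it wraps, so keeping a set of pairs per label collapses the duplicates:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for one recorded example boundary."""

    label: str
    ir_start: int
    ir_end: int


def group_spans(spans):
    """Group spans by label, keeping each (ir_start, ir_end) pair only once."""
    groups = defaultdict(set)
    for span in spans:
        groups[span.label].add((span.ir_start, span.ir_end))
    return dict(groups)


# A lazy wrapper and the strategy it wraps cover the same IR nodes,
# so the same boundary would be recorded twice under the same label.
recorded = [
    Span("integers", 0, 1),  # boundary from the wrapper
    Span("integers", 0, 1),  # boundary from the wrapped strategy
    Span("integers", 1, 2),
    Span("lists", 0, 2),
]
groups = group_spans(recorded)
```

Using a set here trades away the ordering that a list of indices would give, which is acceptable in this sketch because only the distinct spans matter.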

hypothesis-python/src/hypothesis/internal/conjecture/engine.py

tybug · 2024-05-29T19:54:20Z

OK, sorry, I shouldn't have tried to push through both the sub-ir removal and the generate_mutations_from migration in the same pull. I'll split the latter into a dependent pull.

tybug · 2024-05-29T21:10:10Z

...actually, removing just sub-ir examples is flaky, but isn't flaky when we also migrate generate_mutations_from. I'm not even going to look into why. Let's bundle both changes together. Sorry for all the back and forth lol.

tybug · 2024-05-31T21:31:06Z

test_inquisitor_comments_basic_fail_if_either flaked, though I don't know how given that we set derandomize=True.

tybug · 2024-06-01T03:26:36Z

similarly I don't see how test_observability is flaky with a seed 🤔

time for some ci-driven development...

tybug · 2024-06-02T01:28:04Z

test_observability failure reproduces locally with ./build.sh check-coverage (and not when running tests normally), just like CI. I suspect interactions with sys.settrace, since running with coverage sets a tracer.

tybug · 2024-06-13T23:12:28Z

Here's the test_observability culprit. The failing example was found somewhere in 20 < n <= 100 calls and so only reproduced when not running under coverage:

hypothesis/hypothesis-python/tests/common/setup.py

Lines 52 to 58 in 395649a

    settings.register_profile(
        "default",
        settings(
            max_examples=20 if IN_COVERAGE_TESTS else not_set,
            phases=list(Phase),  # Dogfooding the explain phase
        ),
    )

(I don't want to talk about how long I spent tracking this down.) Arguably a determinism footgun, but also clearly good for performance. Possibly we should use max_examples=not_set when @seed is set on tests? But I don't think profiles can express that.
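To illustrate the footgun, here is a toy model, not Hypothesis's engine: a seeded generator produces the same sequence of draws every time, but whether the failing draw is ever reached still depends on the example budget. The `first_failure` function and its "a draw of 0 fails" rule are assumptions made up for this sketch:

```python
import random


def first_failure(seed, max_examples):
    """Return the 1-based call index of the first failing draw, or None.

    Hypothetical model: a draw "fails" when it equals 0 out of range(50).
    """
    rng = random.Random(seed)
    for call in range(1, max_examples + 1):
        if rng.randrange(50) == 0:
            return call
    return None


# With a generous budget the seeded run finds its failure at some call n.
n = first_failure(seed=0, max_examples=10_000)

# The same seed with a budget of exactly n still finds it at call n...
same_seed_enough_budget = first_failure(seed=0, max_examples=n)

# ...but with a budget one smaller, the identical seed reports nothing.
same_seed_small_budget = first_failure(seed=0, max_examples=n - 1)
```

In this model the seed fixes which inputs are tried, while the budget decides how many of them are tried, so a profile that changes the budget can change the outcome of a seeded test.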

Zac-HD

Hmm, the performance hit might not be that big since we started using xdist under coverage - we could try it and see?

hypothesis-python/src/hypothesis/internal/conjecture/data.py

tybug · 2024-06-14T02:22:42Z

seeing about a 50% performance hit locally for ./build.sh check-coverage when removing the max_examples override (112s -> 163s). Probably not worth it since check-coverage is already our longest CI job? 3.12 coverage support can't come fast enough 🙂

I've worked around this locally by explicitly setting max_examples, btw, so we can revisit this later if we want.

tybug · 2024-06-14T02:44:19Z

@jobh I'm going to toss this test_gc_hooks_do_not_cause_unraisable_recursionerror flake over to you, I think (should we just increase max_runs?): https://github.com/HypothesisWorks/hypothesis/actions/runs/9509679886/job/26212997475?pr=4007

jobh · 2024-06-14T06:13:06Z

should we just increase max_runs?

That will not help unfortunately, as there is a crash (segfault?) which bypasses all of that.

When I looked into these crashes previously, I found them to be unreproducible locally; but managed to reproduce on CI/master at the time so I'm not too worried about a regression.

[edit: Unreproducible may be too strong a claim. I've never gotten build.sh working locally, so "unreproducible locally using plain pytest" would be correct.]

After seeing comments in pytest-dev/pytest#3216 I think it would be close to impossible to track down the root cause here. @Zac-HD, is there a place to move the test to where it would run without xdist? Otherwise I think we should just skip the test unconditionally.

Zac-HD · 2024-06-14T21:03:46Z

After seeing comments in pytest-dev/pytest#3216 I think it would be close to impossible to track down the root cause here. @Zac-HD, is there a place to move the test to where it would run without xdist? Otherwise I think we should just skip the test unconditionally.

We have some tests that invoke a new pytest process to run entirely independently, and could do that for this one too. If that doesn't work either for some reason, skipping sounds good to me.
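As a rough sketch of the isolation idea, assuming nothing about the helper the test suite actually uses: running the risky code in a fresh interpreter means a hard crash in the child shows up to the parent only as a nonzero return code. The `run_isolated` helper is a hypothetical name for this example, and a real test would more likely pass `[sys.executable, "-m", "pytest", <test file>]` as the command:

```python
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run `code` in a fresh Python interpreter and return the CompletedProcess."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# A child that exits abruptly, skipping normal interpreter cleanup,
# only takes down its own process.
crashed = run_isolated("import os; os._exit(3)")

# A well-behaved child reports success and its output is captured.
ok = run_isolated("print('fine')")
```

The parent can then assert on `returncode` and the captured output, independently of whatever xdist workers are doing in the main test run.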

Zac-HD

Looks good to me, let's merge as-is and fix the flake in a separate PR 🚀

tybug requested a review from Zac-HD as a code owner May 29, 2024 19:12

remove sub-ir examples

af3e859

tybug force-pushed the remove-sub-ir-examples branch from e158a66 to de854e2 Compare May 29, 2024 19:45

adjust tests

774b366

tybug force-pushed the remove-sub-ir-examples branch from de854e2 to 774b366 Compare May 29, 2024 19:55

tybug changed the title ~~Remove sub-ir examples and migrate generate_mutations_from to the IR~~ Remove sub-ir examples May 29, 2024

tybug added 2 commits May 29, 2024 15:57

add release notes

22d3847

migrate generate_mutations_from to the ir

3f82f86

tybug mentioned this pull request May 31, 2024

Account for time spent in garbage collection #3979

Merged

tybug added 5 commits May 31, 2024 15:46

update tests, fix coverage / flakes

f359b26

update type hints

7b49995

use DataObserver for trial_data

acd1fba

change test_can_produce_large_binary_strings to same parameters as lo…

bea5da1

…ng lists

try fix tests again

874c4a2

try debug test_observability

d41e094

tybug added 4 commits June 13, 2024 19:08

remove unused labels

e406b78

reduce more internal examples

9457c1b

fix test_observability flakiness

1b02707

add start == end todo

f8e5830

Zac-HD reviewed Jun 13, 2024

return sets in mutator_groupsf

efeebb3

draw from strategies with examples to cover missing pareto code

7b41fda

fix type hints

88a18e6

bump py3.13

d8c97fc

Zac-HD approved these changes Jun 14, 2024

Zac-HD merged commit 6c155a4 into HypothesisWorks:master Jun 14, 2024
54 checks passed

tybug deleted the remove-sub-ir-examples branch June 14, 2024 21:06

This was referenced Jun 14, 2024

Migrate our core representation to an IR layer #3921

Open

Ideas for some strategy optimizations #3932

Closed

jobh mentioned this pull request Jun 18, 2024

Test tweaks #4012

Merged

tybug mentioned this pull request Jun 18, 2024

example generation regression between 6.47.0 -> 6.103.1 #4014

Open