Add type annotations to `hypothesis/internal/conjecture/engine.py` #3961

qthequartermasterman · 2024-05-01T03:58:10Z

In an attempt to help with #3074, I have added type annotations to hypothesis/internal/conjecture/engine.py. I have checked these type hints with both mypy and pyright, and verified a few of the trickier-to-statically-determine types by checking their runtime types in the tests.

There are three things that I'm still confused about.

Can Overrun be a valid value for self.interesting_examples?

In all places but this one sample (lines 389 in this PR), all of the keys for self.interesting_examples are ConjectureResult. However, this assignment may possibly be Overrun. Indeed, in one test, test_backend_can_shrink_bytes, this does occur. This seems (to me) to be the only place in the tests where this Overrun assignment occurs. If Overrun is a valid value, then this violates the typing in several other places where self.interesting_examples is accessed, but nothing seemed to ever trip up the asserts I had placed in those places (and since removed). If Overrun is indeed valid, then I will need to update the hints for self.interesting_examples, and I'd like to better understand the guarantees that ensure Overrun is never a value in other places (so the type narrowing asserts are legal). Many, but not all, of those references are in shrink_interesting_examples for example.

            if changed:
                self.save_buffer(data.buffer)
                self.interesting_examples[key] = data.as_result()  # type: ignore
                self.__data_cache.pin(data.buffer)
                self.shrunk_examples.discard(key)

Potentially uninitialized variable trial_data because of a caught PreviouslyUnseenBehavior exception

In line 842, trial_data is defined inside of a try-except block. In the happy path, this variable is always defined. On the otherhand, unless I am misunderstanding something, if a PreviouslyUnseenBehavior is thrown, that variable will not be defined, meaning trial_data.observer.killed and a few other later instances will reference an uninitialized variable. This never happens in the tests, and I've certainly never experienced it while using hypothesis.

Am I misunderstanding the flow of the code here? Is this try-except superfluous? Or is there a genuine error in how the exception is handled?

                try:
                    trial_data = self.new_conjecture_data(
                        prefix=prefix, max_length=max_length
                    )
                    self.tree.simulate_test_function(trial_data)
                    continue
                except PreviouslyUnseenBehaviour:
                    pass

                # If the simulation entered part of the tree that has been killed,
                # we don't want to run this.
                assert isinstance(trial_data.observer, TreeRecordingObserver)
                if trial_data.observer.killed:
                    continue

                ...

What is the proper type for stack in debug_data?

In the debug_data method there is a stack=[[]] on line 587. The expected type for stack is unclear to me. My initial impression based on stack[-1].append(int_from_bytes(data.buffer[ex.start : ex.end])) is that stack should be List[List[int]]. But these conditionals, where node:List[int] give conflicting types.

                if len(node) == 1:
                    stack[-1].extend(node)
                else:
                    stack[-1].append(node)

In the case of length-1 node, the final element of stack (which I thought is a List[int]) gets extended by one more integer. This makes sense to me. The else case, however, would mean that the last element of stack would have a list of ints appended as its final element. I.e. stack[-1] == [1,2,3,..., [1,2,3,4]] This means that stack would have to be List[List[int | List[int]]]. But this type then starts tripping on variance issues.

To further confuse matters, my debugger never seemed to stop on that strange else clause, so I never could actually find the proper runtime type.

What is the correct type here? I'm clearly not understanding what the code is doing to this data structure.

    def debug_data(self, data: ConjectureData | ConjectureResult) -> None:
        if not self.report_debug_info:
            return

        stack: list = [[]]

        def go(ex: Example) -> None:
            if ex.length == 0:
                return
            if len(ex.children) == 0:
                stack[-1].append(int_from_bytes(data.buffer[ex.start : ex.end]))
            else:
                node: list[int] = []
                stack.append(node)

                for v in ex.children:
                    go(v)
                stack.pop()
                if len(node) == 1:
                    stack[-1].extend(node)
                else:
                    stack[-1].append(node)

        go(data.examples[0])
        assert len(stack) == 1

On top of those three confusions I have, mypy also complains about some string literals as keys to a TypedDict (line 247), because it does not consider the concatenation of two literals to be a literal. See the code snippet below. Personally, I feel like this is an error in mypy--pyright does not have this complaint. If we would like mypy to be fully happy, I can either add a # type: ignore, or refactor the code to only use true literals. ~~If we don't care that mypy complains, then no change is needed.~~ (I see now that mypy is enforced in CI. Happy to do whatever the maintainers recommend to satisfy mypy.)

    def _log_phase_statistics(self, phase: Literal["reuse", "generate", "shrink"]) -> Generator[None, None, None]:
        ...
        finally:
            self.statistics[phase + "-phase"] = {  # The key here is the concatenation of two string literals.
                "duration-seconds": time.perf_counter() - start_time,
                "test-cases": list(self.stats_per_test_case),
                "distinct-failures": len(self.interesting_examples),
                "shrinks-successful": self.shrinks,
            }

Doubtless, I've missed or misunderstood some pieces. I am happy to revise and iterate!

tybug · 2024-05-01T04:35:32Z

d5b7136

ah, this one is my fault...I created the release file in the wrong location in #3957, and I think because we were in an interim period we had a correct file on master and CI didn't catch it. We should delete that file since that change has already been rolled into 6.100.2.

hypothesis-python/RELEASE.rst

hypothesis-python/src/hypothesis/internal/conjecture/engine.py

hypothesis-python/src/hypothesis/internal/conjecture/shrinker.py

Zac-HD · 2024-05-01T05:47:32Z

Wow! Thanks so much for contributing, this looks great 😀

Can Overrun be a valid value for self.interesting_examples?

No - these are specifically ConjectureResult objects with .status=Status.INTERESTING, whereas Overrun is a special-case to represent =Status.OVERRUN with lower memory usage.

Potentially uninitialized variable trial_data because of a caught PreviouslyUnseenBehavior exception

Ah, PreviouslyUnseenBehavior can only be raised by simulate_test_function(), so this is safe in practice. However it's also perfectly safe to move the trial_data = self.new_conjecture_data(...) assignment outside the try: block, so let's do that for clarity.

What is the proper type for stack in debug_data?

Looks like Ls: TypeAlias = list["Ls | int"], in that it can be arbitrarily nested. I haven't dug into this but the test might be helpful. We should also have complete branch coverage from the tests/conjecture/ tests, so adding a raise Exception to that line and then running the tests should find some code that exercises it if that'd be helpful.

On top of those three confusions I have, mypy also complains about some string literals as keys to a TypedDict (line 247), because it does not consider the concatenation of two literals to be a literal. See the code snippet below. Personally, I feel like this is an error in mypy--pyright does not have this complaint. If we would like mypy to be fully happy, I can either add a # type: ignore, or refactor the code to only use true literals.

Either of these sound like a good solution to me.

qthequartermasterman · 2024-05-03T02:01:31Z

Sorry for the delay in addressing comments. I've had a busy couple of days.

Can Overrun be a valid value for self.interesting_examples?

No - these are specifically ConjectureResult objects with .status=Status.INTERESTING [...]

Thanks for this clarification. I've added a type: ignore on line 418, because putting a narrowing assert causes one of the unit tests to fail, for some reason. Evidently in test_backend_can_shrink_bytes, Overrun ends up being the result.

Potentially uninitialized variable trial_data because of a caught PreviouslyUnseenBehavior exception

[...] it's also perfectly safe to move the trial_data = self.new_conjecture_data(...) assignment outside the try: block, so let's do that for clarity.

Thanks for this clarification. I've gone ahead and moved the trial_data out of the try block.

What is the proper type for stack in debug_data?

Looks like Ls: TypeAlias = list["Ls | int"], in that it can be arbitrarily nested. [...]

Looks like stack: List[Ls] and node: Ls is the right combination to make the type checker happy, and it seems to match the runtime conditions. Great call out.

On top of those three confusions I have, mypy also complains about some string literals as keys to a TypedDict (line 247), because it does not consider the concatenation of two literals to be a literal. See the code snippet below. Personally, I feel like this is an error in mypy--pyright does not have this complaint. If we would like mypy to be fully happy, I can either add a # type: ignore, or refactor the code to only use true literals.

Either of these sound like a good solution to me.

I went with a type: ignore. Putting a bunch of if-else statements just to construct string literals felt a little too convoluted.

I believe I've addressed all comments at this point. Both mypy and pyright are happy. Happy to make any additional changes, if desired.

qthequartermasterman · 2024-05-03T03:36:15Z

@Zac-HD I don't think I understand what is failing in CI at the moment. Do you have any hints on what needs to be fixed?

tybug · 2024-05-03T03:37:14Z

hypothesis-python/src/hypothesis/internal/conjecture/engine.py

+        self.interesting_examples: Dict[
+            Optional[InterestingOrigin], ConjectureResult
+        ] = {}


I believe this is Dict[InterestingOrigin, ConjectureResult]. data.interesting_origin is optional, but we only add to this dict when it's nonnull.

Typing this as Dict[InterestingOrigin, ConjectureResult], and then adding assert key is not None before writing to this dictionary causes these tests to fail at that assert:

FAILED hypothesis-python/tests/conjecture/test_alt_backend.py::test_backend_can_shrink_bytes - AssertionError FAILED hypothesis-python/tests/conjecture/test_data_tree.py::test_stores_the_tree_flat_until_needed - AssertionError FAILED hypothesis-python/tests/conjecture/test_data_tree.py::test_split_in_the_middle - AssertionError FAILED hypothesis-python/tests/conjecture/test_data_tree.py::test_stores_forced_nodes - AssertionError FAILED hypothesis-python/tests/conjecture/test_data_tree.py::test_correctly_relocates_forced_nodes - AssertionError FAILED hypothesis-python/tests/conjecture/test_data_tree.py::test_is_not_flaky_on_positive_zero_and_negative_zero - AssertionError FAILED hypothesis-python/tests/conjecture/test_data_tree.py::test_low_probabilities_are_still_explored - AssertionError FAILED hypothesis-python/tests/conjecture/test_data_tree.py::test_can_generate_hard_values - AssertionError FAILED hypothesis-python/tests/conjecture/test_data_tree.py::test_can_generate_hard_floats - AssertionError FAILED hypothesis-python/tests/conjecture/test_engine.py::test_can_index_results - AssertionError FAILED hypothesis-python/tests/conjecture/test_engine.py::test_non_cloneable_intervals - AssertionError FAILED hypothesis-python/tests/conjecture/test_engine.py::test_deletable_draws - AssertionError

That's why I had typed it as Optional[InterestingOrigin]. If we're certain the type should be Dict[InterestingOrigin, ConjectureResult], then a type-narrowing assert will cause the tests to fail. We could get around this with a type: ignore on the lines that write to the dictionary, but I'm curious why so many tests fail.

ah, we've been abusing mark_interesting in tests by not specifying an interesting origin. It's undeniably ergonomic but also means the type is more lenient than it needs to be in practice. I think we can leave a note here saying the type is InterestingOrigin everywhere but tests, where it is Optional[InterestingOrigin] for convenience, and leave cleaning this up for later.

Sounds good to me! I left a comment # At runtime, the keys are only ever type InterestingOrigin, but can be None during tests. above both this hint and self.shrunk_examples that you mentioned below.

Should an issue be opened up to track making that change to the tests?

Yeah, I think it's worth it - we can define a simple InterestingOrigin and paste it into all the tests, and then tightening up these annotations will help us work on the engine 🙂

hypothesis-python/src/hypothesis/internal/conjecture/engine.py

Zac-HD · 2024-05-03T08:41:34Z

@Zac-HD I don't think I understand what is failing in CI at the moment. Do you have any hints on what needs to be fixed?

I think some change in this PR probably caused this test to start failing, although I don't know why:

hypothesis/hypothesis-python/tests/cover/test_composite.py

Lines 197 to 206 in 8005910

    
           @pytest.mark.skipif(sys.version_info[:2] == (3, 9), reason="stack depth varies???") 
        
           def test_warns_on_strategy_annotation(): 
        
               with pytest.warns(HypothesisWarning, match="Return-type annotation") as w: 
        
                   @st.composite 
        
                   def my_integers(draw: st.DrawFn) -> st.SearchStrategy[int]: 
        
                       return draw(st.integers()) 
        
               assert len(w.list) == 1 
        
               assert w.list[0].filename == __file__  # check stacklevel points to user code

#3968 is also failing in other PRs, so I'm confident that's not anything in this PR.

hypothesis-python/src/hypothesis/internal/conjecture/engine.py

Zac-HD · 2024-05-03T08:58:42Z

Sorry for the delay in addressing comments. I've had a busy couple of days.

No worries at all, you're not the only one! But more importantly, volunteer projects like Hypothesis are a labor of love and it's really important to me that they don't start to feel like an obligation. If you want to work on it, we're delighted to have you helping out! But if that changes, or if you'd just like a maintainer to finish polishing one PR so you can start on another, let us know and move on with our blessings 🙂

Zac-HD

This all looks great to me - thanks @qthequartermasterman!

The remaining CI failures are all from test_warns_on_strategy_annotation() on Python 3.10 or 3.11 - and we already have a skipif-py39 on that due to the best-guess stack depth here. Probably we just need to add an additional case for the affected Python versions... great if you want to look into that, or I can try fixing it over the weekend.

Zac-HD · 2024-05-09T17:05:58Z

hypothesis-python/src/hypothesis/internal/conjecture/engine.py

+        self.interesting_examples: Dict[
+            Optional[InterestingOrigin], ConjectureResult
+        ] = {}


Yeah, I think it's worth it - we can define a simple InterestingOrigin and paste it into all the tests, and then tightening up these annotations will help us work on the engine 🙂

hypothesis-python/src/hypothesis/internal/conjecture/engine.py

fixes HypothesisWorks#3968, for good this time

qthequartermasterman · 2024-05-13T13:40:42Z

Thanks for finishing this out @Zac-HD !

Zac-HD · 2024-05-13T15:09:17Z

Happy to help! It was a remarkably annoying merge conflict 😅

qthequartermasterman requested a review from Zac-HD as a code owner May 1, 2024 03:58

Zac-HD reviewed May 1, 2024

View reviewed changes

tybug reviewed May 3, 2024

View reviewed changes

Zac-HD mentioned this pull request May 3, 2024

Add tests for failed validation in Pandas extra #3968

Closed

Zac-HD reviewed May 3, 2024

View reviewed changes

hypothesis-python/src/hypothesis/internal/conjecture/engine.py Outdated Show resolved Hide resolved

hypothesis-python/src/hypothesis/internal/conjecture/engine.py Outdated Show resolved Hide resolved

Zac-HD added the internals Stuff that only Hypothesis devs should ever see label May 3, 2024

qthequartermasterman requested review from Zac-HD and tybug May 9, 2024 00:53

Zac-HD reviewed May 9, 2024

View reviewed changes

Zac-HD force-pushed the add-types-to-engine branch from 92bab87 to 3f3377d Compare May 12, 2024 20:50

Zac-HD mentioned this pull request May 13, 2024

Fix stack-depth of warning in @st.composite #3985

Open

Zac-HD and others added 3 commits May 12, 2024 18:28

de-flake coverage

ca4297b

fixes HypothesisWorks#3968, for good this time

initial pass of engine.py types through static analysis

35d655b

Final type-annotation tweaks

8b28c80

Zac-HD force-pushed the add-types-to-engine branch from 032daed to 8b28c80 Compare May 13, 2024 01:28

Zac-HD approved these changes May 13, 2024

View reviewed changes

Zac-HD enabled auto-merge May 13, 2024 01:29

Zac-HD merged commit affb5b8 into HypothesisWorks:master May 13, 2024
52 of 53 checks passed

tybug mentioned this pull request May 13, 2024

Fix overly strict type assertion #3988

Merged

dycw mentioned this pull request May 15, 2024

Hypothesis 6.100.8 introduced a regression in hypothesis.internal.conjecture.engine.ConjectureRunner #3992

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add type annotations to `hypothesis/internal/conjecture/engine.py` #3961

Add type annotations to `hypothesis/internal/conjecture/engine.py` #3961

qthequartermasterman commented May 1, 2024 •

edited

Loading

tybug commented May 1, 2024

Zac-HD commented May 1, 2024

qthequartermasterman commented May 3, 2024 •

edited

Loading

qthequartermasterman commented May 3, 2024

tybug May 3, 2024

qthequartermasterman May 4, 2024

tybug May 4, 2024

qthequartermasterman May 9, 2024

Zac-HD May 9, 2024

Zac-HD commented May 3, 2024

Zac-HD commented May 3, 2024

Zac-HD left a comment

Zac-HD May 9, 2024

qthequartermasterman commented May 13, 2024

Zac-HD commented May 13, 2024

Add type annotations to hypothesis/internal/conjecture/engine.py #3961

Add type annotations to hypothesis/internal/conjecture/engine.py #3961

Conversation

qthequartermasterman commented May 1, 2024 • edited Loading

tybug commented May 1, 2024

Zac-HD commented May 1, 2024

qthequartermasterman commented May 3, 2024 • edited Loading

qthequartermasterman commented May 3, 2024

tybug May 3, 2024

Choose a reason for hiding this comment

qthequartermasterman May 4, 2024

Choose a reason for hiding this comment

tybug May 4, 2024

Choose a reason for hiding this comment

qthequartermasterman May 9, 2024

Choose a reason for hiding this comment

Zac-HD May 9, 2024

Choose a reason for hiding this comment

Zac-HD commented May 3, 2024

Zac-HD commented May 3, 2024

Zac-HD left a comment

Choose a reason for hiding this comment

Zac-HD May 9, 2024

Choose a reason for hiding this comment

qthequartermasterman commented May 13, 2024

Zac-HD commented May 13, 2024

Add type annotations to `hypothesis/internal/conjecture/engine.py` #3961

Add type annotations to `hypothesis/internal/conjecture/engine.py` #3961

qthequartermasterman commented May 1, 2024 •

edited

Loading

qthequartermasterman commented May 3, 2024 •

edited

Loading