Strange HealthCheck filtering failure #3497

Closed
langfield opened this issue Oct 28, 2022 · 4 comments
Labels
performance ("go faster! use less memory!")

Comments

@langfield
See the following MWE.

"""MWE."""
from dataclasses import dataclass
from beartype import beartype
from beartype.typing import List

import hypothesis.strategies as st
from hypothesis import settings
from hypothesis.stateful import (
    RuleBasedStateMachine,
    Bundle,
    rule,
    initialize,
)

# pylint: disable=missing-class-docstring, missing-function-docstring


@beartype
@dataclass
class State:
    xs: List[str]


@st.composite
def elements(draw, states: Bundle) -> st.SearchStrategy[str]:
    state = draw(states)
    pith = "$|".join(state.xs)
    pattern: str = f"^(?!{pith}$)"
    return draw(st.from_regex(pattern))


class Machine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.state = State(xs=[])

    states = Bundle("states")

    @initialize(target=states)
    def init_state(self) -> State:
        return self.state

    @rule(x=elements(states=states))
    @beartype
    def add(self, x: str) -> None:
        self.state.xs.append(x)


Machine.TestCase.settings = settings(max_examples=100)
TestMachine = Machine.TestCase

This test fails with the following output.

(anki) user@computer:~$ pytest test.py
====================================================== test session starts =======================================================
platform linux -- Python 3.9.11, pytest-7.1.1, pluggy-1.0.0
rootdir: /home/mal
plugins: timeout-2.1.0, hypothesis-6.56.3, mock-3.7.0
collected 1 item

test.py F                                                                                                                  [100%]

============================================================ FAILURES ============================================================
______________________________________________________ TestMachine.runTest _______________________________________________________

self = <hypothesis.stateful.Machine.TestCase testMethod=runTest>

    def runTest(self):
>       run_state_machine_as_test(cls)

conda/envs/anki/lib/python3.9/site-packages/hypothesis/stateful.py:400:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
conda/envs/anki/lib/python3.9/site-packages/hypothesis/stateful.py:222: in run_state_machine_as_test
    run_state_machine(state_machine_factory)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

factory = <class 'test.Machine'>

    @settings
>   @given(st.data())
E   hypothesis.errors.FailedHealthCheck: It looks like your strategy is filtering out a lot of data. Health check found 50 filtered examples but only 0 good ones. This will make your tests much slower, and also will probably distort the data generation quite a lot. You should adapt your strategy to filter less. This can also be caused by a low max_leaves parameter in recursive() calls
E   See https://hypothesis.readthedocs.io/en/latest/healthchecks.html for more information about this. If you want to disable just this health check, add HealthCheck.filter_too_much to the suppress_health_check settings for this test.

conda/envs/anki/lib/python3.9/site-packages/hypothesis/stateful.py:107: FailedHealthCheck
----------------------------------------------------------- Hypothesis -----------------------------------------------------------
You can add @seed(225781618033313213706118631490124347638) to this test or run pytest with --hypothesis-seed=225781618033313213706118631490124347638 to reproduce this failure.
==================================================== short test summary info =====================================================
FAILED test.py::TestMachine::runTest - hypothesis.errors.FailedHealthCheck: It looks like your strategy is filtering out a lot ...
======================================================= 1 failed in 0.21s ========================================================

However, when I comment out the Machine.TestCase.settings line, it passes. Since I believe 100 examples is the default anyway, what is going on here?
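(For reference, a sketch of the workaround suggested in the health-check message itself: suppressing just `HealthCheck.filter_too_much` while keeping an explicit example budget. This only silences the check; it does not fix the underlying filtering.)

```python
from hypothesis import HealthCheck, settings

# Workaround sketch: suppress only the filter_too_much health check,
# keeping the explicit max_examples budget from the MWE. The strategy
# will still discard many draws; Hypothesis just won't fail on it.
relaxed = settings(
    max_examples=100,
    suppress_health_check=[HealthCheck.filter_too_much],
)
```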

@langfield (Author)

Also, is there a way to display the reason why a test was marked with Status.INVALID? It seems like it's either that the strategy is empty, or the depth has exceeded some maximum. Can this be printed somehow so you could see which strategy is causing the problem?

@Zac-HD (Member)

Zac-HD commented Oct 29, 2022

Status.INVALID usually just means that you hit an assume(False) or too many retries on a .filter(), and unfortunately this is not introspectable without getting deep into private internal implementation details.
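(A minimal illustration of the first cause, not from the thread: each draw that fails an `assume(...)` is marked invalid and silently retried, just like a rejected `.filter()` predicate.)

```python
import hypothesis.strategies as st
from hypothesis import assume, given, settings

# Each odd draw fails the assume(...) and becomes an invalid example;
# Hypothesis retries with a fresh draw, so the test still passes.
@settings(max_examples=20)
@given(st.integers())
def even_only(n):
    assume(n % 2 == 0)
    assert n % 2 == 0

even_only()
```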

In this case, I think the problem is probably that your regex with negative lookahead isn't turning into a workable strategy?
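(To make the suspicion concrete, here is what the MWE's pattern does as a plain `re` matcher, with hypothetical `xs` values. The pattern is entirely zero-width: it accepts or rejects at position 0 without consuming anything, which gives `st.from_regex()` nothing structural to generate from.)

```python
import re

# The MWE builds this pattern for xs = ["foo", "bar"]:
xs = ["foo", "bar"]
pattern = "^(?!" + "$|".join(xs) + "$)"  # "^(?!foo$|bar$)"

# As a matcher the negative lookahead works: any string not in xs
# produces a zero-width match at position 0, and members of xs fail.
assert re.match(pattern, "baz") is not None
assert re.match(pattern, "foo") is None
```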

There might also be a performance issue where our stateful "swarm testing" randomly disables a subset of rules, and this doesn't correct for having fewer rules - but that doesn't explain why every test case is filtered out.

@Zac-HD Zac-HD added the performance go faster! use less memory! label Oct 29, 2022
@langfield (Author)

langfield commented Oct 30, 2022

@Zac-HD Thanks for the quick reply! I really appreciate it! Any ideas about why this issue disappears when the settings line is commented out?

The st.from_regex() call was indeed causing the issue. I replaced it with a st.text().filter(lambda s: s not in <set>) and it gave me a much better valid-to-invalid ratio. Is there a cleaner way to do this?
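(A sketch of that replacement as a self-contained strategy; the helper name and the excluded set are illustrative, not from the MWE.)

```python
import hypothesis.strategies as st
from hypothesis import given, settings

def excluding(seen: frozenset) -> st.SearchStrategy:
    """Arbitrary text minus the strings already recorded in state.xs."""
    return st.text().filter(lambda s: s not in seen)

# Since st.text() rarely generates any particular short string, the
# filter rejects almost nothing, so the valid-to-invalid ratio stays high.
@settings(max_examples=25)
@given(excluding(frozenset({"foo", "bar"})))
def never_yields_seen(s):
    assert s not in {"foo", "bar"}
```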

Status.INVALID usually just means that you hit an assume(False) or too many retries on a .filter(), and unfortunately this is not introspectable without getting deep into private internal implementation details.

What if I just print some stuff in here?

if strategy.is_empty:
    self.mark_invalid()
if self.depth >= MAX_DEPTH:
    self.mark_invalid()

@Zac-HD (Member)

Zac-HD commented Jun 4, 2023

I think the depth issue has been substantially improved in #3654, while the other part is basically the same problem as #434 and unfortunately we don't have any great ideas on how to solve it beyond maybe "lots of observability engineering for PBT" which is an ongoing research topic.

@Zac-HD Zac-HD closed this as completed Jun 4, 2023