'Unsatisfiable' error when generating a dataframe from rows and columns without a fill value #2678

Closed
snthibaud opened this issue Nov 27, 2020 · 4 comments · Fixed by #2691
Labels
legibility make errors helpful and Hypothesis grokable

Comments

snthibaud commented Nov 27, 2020

According to the documentation, absent values should be filled based on the column dtype, but that does not happen. I created the following minimal example of the problem. Any ideas?

from hypothesis.extra.pandas import data_frames, indexes, column
from hypothesis.strategies import sampled_from, text

print(
    data_frames(
        columns=[column(c, text(), dtype=str) for c in ['a', 'b']],
        rows=sampled_from([{'a': 'x'}]),
        index=indexes(dtype=int, min_size=1)
    ).example()
)

The error is:

  File ".../python3.8/site-packages/hypothesis/core.py", line 763, in run_engine
    raise Unsatisfiable(
hypothesis.errors.Unsatisfiable: Unable to satisfy assumptions of hypothesis example_generating_inner_function.

Zac-HD (Member) commented Nov 27, 2020

Hey @snthibaud - that's a great bug report 😄

It looks like the problem is in providing both rows and columns - if you provide both, they are validated against each other. If you just use columns, the following example works:

from hypothesis.extra.pandas import data_frames, indexes, column
from hypothesis import given, strategies as st


@given(
    data_frames(
        columns=[
            column("a", st.just("x"), dtype=str),  # note: more specific elements
            column("b", st.text(), dtype=str),
        ],
        index=indexes(dtype=int, min_size=1),
    )
)
def test(df):
    pass

We should try to provide a more specific error message for this case though!

Zac-HD added the legibility (make errors helpful and Hypothesis grokable) label Nov 27, 2020

snthibaud (Author) commented Nov 27, 2020

Hello @Zac-HD ,

Thank you! What exactly is meant by 'validated' here?

I read this phrase in the same section:

Any values missing from the generated rows will be provided using the column’s fill.

So I assumed that column 'a' would hold the specified string 'x' and column 'b' would be filled with some other text (provided by text()).

I actually need a DataFrame whose columns depend on each other, and I cannot specify every column for every row.
Does my bug report still stand? 😇
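
The quoted sentence about missing values can be pictured with a small, library-free sketch. This is only a simplified model of the documented behaviour, not Hypothesis's actual implementation: `complete_row` and the zero-argument "fill" callables are hypothetical stand-ins for a column's fill strategy.

```python
import random

# Simplified model (an assumption, not Hypothesis internals): each column
# has a "fill", represented here as a zero-argument callable.
rng = random.Random(0)


def complete_row(row, fills):
    """Keep the values present in `row`; use the column's fill for the rest."""
    completed = {}
    for name, make_fill in fills.items():
        if name in row:
            completed[name] = row[name]    # value supplied by the generated row
        else:
            completed[name] = make_fill()  # value supplied by the column's fill
    return completed


fills = {
    "a": lambda: rng.choice(["p", "q"]),          # not used below: 'a' is in the row
    "b": lambda: rng.choice(["", "foo", "bar"]),  # used: 'b' is missing from the row
}

completed = complete_row({"a": "x"}, fills)
print(completed)
```

In this model the row `{'a': 'x'}` keeps `'x'` for column 'a', and column 'b' receives whatever its fill produces, which is the behaviour expected from the documentation.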

snthibaud (Author) commented Nov 27, 2020

When quoting that line I noticed the word 'fill', and it turns out to be a keyword argument of the 'column' function!
With this subtle change, it works as I intended:

from hypothesis.extra.pandas import data_frames, indexes, column
from hypothesis.strategies import sampled_from, text

print(
    data_frames(
        columns=[column(c, fill=text(), dtype=str) for c in ['a', 'b']],
        rows=sampled_from([{'a': 'x'}]),
        index=indexes(dtype=int, min_size=1)).example())

Thanks for your help! 😄

Zac-HD (Member) commented Nov 27, 2020

Aha, that would do it! We can definitely provide a better error message though, so I'm keeping the issue open for that 😁
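
As a rough illustration of what a more helpful error could look like, here is a library-free sketch. Everything in it is hypothetical: `MissingFillError`, `complete_row`, and the message wording are invented for this example and are not the names or text used by Hypothesis. The idea is simply to fail with a message that names the column and the incomplete row, instead of a generic "unsatisfiable" failure.

```python
class MissingFillError(ValueError):
    """Hypothetical error: a row lacks a value and the column has no fill."""


def complete_row(row, columns):
    """Complete `row` using each column's fill, or fail with a specific message.

    `columns` maps a column name to a zero-argument fill callable, or to None
    when no fill is available (an assumption made for this sketch).
    """
    completed = {}
    for name, fill in columns.items():
        if name in row:
            completed[name] = row[name]
        elif fill is None:
            # Name the column and the row so the user can see what to fix.
            raise MissingFillError(
                f"column {name!r} has no fill, so row {row!r} cannot be completed"
            )
        else:
            completed[name] = fill()
    return completed


try:
    complete_row({"a": "x"}, {"a": None, "b": None})
except MissingFillError as err:
    message = str(err)
    print(message)
```

With a fill supplied for column 'b' (for example `{"a": None, "b": lambda: ""}`), the same call returns a completed row instead of raising.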

Zac-HD changed the title from "'Unsatisfiable' error when using 'str' columns in Pandas DataFrame" to "'Unsatisfiable' error when generating a dataframe from rows and columns without a fill value" Nov 28, 2020