Wrong data generation using dataclasses, Sequence and from_type #2257

Closed
mvaled opened this issue Dec 4, 2019 · 5 comments · Fixed by #2269
Labels
bug

Comments

@mvaled (Contributor) commented Dec 4, 2019

Hypothesis's from_type sometimes generates bytes for a dataclass attribute of type Sequence[X]:

>>> from dataclasses import dataclass
>>> from typing import Sequence
>>> from hypothesis import strategies
>>> @dataclass
... class Item:
...     name: str
...     

>>> @dataclass
... class Row:
...     name: str
...     items: Sequence[Item]
...     

>>> strategies.from_type(Row).example()   # <--- Wrong data type
Row(name='', items=b'\x00')

>>> strategies.from_type(Row).example()
Row(name='\x13', items=[])

>>> strategies.from_type(Row).example()
Row(name='', items=[Item(name='')])
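
One way to see why the first example is wrong is to check each element against the annotation. The sketch below uses only the standard library; `bad_sequence_fields` is a hypothetical helper (not part of Hypothesis), and it assumes the element type is a plain class.

```python
import collections.abc
from dataclasses import dataclass, fields
from typing import Sequence, get_args, get_origin


@dataclass
class Item:
    name: str


@dataclass
class Row:
    name: str
    items: Sequence[Item]


def bad_sequence_fields(obj):
    """Return names of Sequence[X] fields whose elements are not all X.

    Hypothetical helper for illustration; only handles a plain class as X.
    """
    bad = []
    for f in fields(obj):
        # f.type holds the annotation object, e.g. Sequence[Item]
        if get_origin(f.type) is collections.abc.Sequence:
            (elem_type,) = get_args(f.type)
            value = getattr(obj, f.name)
            if not all(isinstance(v, elem_type) for v in value):
                bad.append(f.name)
    return bad


wrong = Row(name="", items=b"\x00")          # the value reported above
right = Row(name="", items=[Item(name="")])  # a value that matches the annotation

print(bad_sequence_fields(wrong))
print(bad_sequence_fields(right))
```

Iterating over `b"\x00"` yields the integer `0`, which is not an `Item`, so the `items` field is flagged. Note that an empty `bytes` value would pass this element-wise check, so it is only a rough diagnostic.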

@mvaled (Contributor, Author) commented Dec 4, 2019

I'm using Hypothesis 4.50.6, but the issue is also reproducible on master at commit 89d494d.

@mvaled (Contributor, Author) commented Dec 4, 2019

The issue seems to be in the implementation of from_typing_type:

>>> from_typing_type(Sequence[Item])
one_of(binary(), lists(elements=builds(Item)))

which includes binary().
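
A toy model of how such a union could arise, assuming the lookup collects every registered type that passes a subclass check against the requested abstract type. `TOY_REGISTRY` and `candidate_strategies` are hypothetical names, and the values are plain strings rather than real Hypothesis strategies; the actual lookup in Hypothesis may differ.

```python
import collections.abc

# Hypothetical registry mapping concrete types to strategy descriptions.
# The values are plain strings, not real Hypothesis strategy objects.
TOY_REGISTRY = {
    bytes: "binary()",
    list: "lists(...)",
    int: "integers()",
    dict: "dictionaries(...)",
}


def candidate_strategies(abstract_type):
    """Collect every registered entry whose type is a subclass of abstract_type."""
    return [
        description
        for registered_type, description in TOY_REGISTRY.items()
        if issubclass(registered_type, abstract_type)
    ]


print(candidate_strategies(collections.abc.Sequence))
```

Because `bytes` is registered as a virtual subclass of `collections.abc.Sequence`, it is collected alongside `list`, regardless of what the elements are supposed to be.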

@Zac-HD added the bug label Dec 4, 2019

@mvaled (Contributor, Author) commented Dec 4, 2019

@Zac-HD

I found the underlying cause is that:

>>> try_issubclass(typing.ByteString, Sequence[Item])
True
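
A minimal sketch of why a check like this can return True. It assumes the helper strips the type parameter before calling `issubclass`, since a plain `issubclass` against a subscripted generic raises `TypeError`. `issubclass_ignoring_params` is a hypothetical stand-in, not the Hypothesis implementation, and it uses `bytes` rather than `typing.ByteString`.

```python
from typing import Sequence, get_origin


class Item:
    pass


def issubclass_ignoring_params(thing, superclass):
    """Subclass check that drops generic parameters first (hypothetical helper)."""
    thing = get_origin(thing) or thing
    superclass = get_origin(superclass) or superclass
    try:
        return issubclass(thing, superclass)
    except TypeError:
        return False


# A direct check against a subscripted generic is rejected by the typing module.
try:
    issubclass(bytes, Sequence[Item])
    direct_check_raised = False
except TypeError:
    direct_check_raised = True

print(direct_check_raised)
print(issubclass_ignoring_params(bytes, Sequence[Item]))
```

Once `Sequence[Item]` is reduced to `collections.abc.Sequence`, the element type `Item` no longer takes part in the comparison, so `bytes` qualifies.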

@mvaled (Contributor, Author) commented Dec 4, 2019

I think typing.ByteString should not be a subclass of Sequence. There are other things to consider:

>>> from_typing_type(Sequence)
one_of(binary(), lists(elements=text()))

One may argue that if binary() is valid, so is text(), and the result should be one_of(binary(), text(), lists(elements=text())).


Python has no type for a single character or a single byte, so it's difficult to express Sequence[Byte] or Sequence[Char] and to make ByteString a subclass of the former and str a subclass of the latter.
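
To make the element-type point concrete: iterating over `bytes` yields `int` values and iterating over `str` yields one-character `str` values. The sketch below shows that, plus one possible guard that admits `bytes` only when the requested element type is `int` or is unspecified. `bytes_is_acceptable` is a hypothetical name, and this is not necessarily how the issue was resolved.

```python
import collections.abc
from typing import Sequence, get_args, get_origin


class Item:
    pass


# Elements of bytes are ints; elements of str are (length-1) strs.
bytes_elements_are_ints = all(isinstance(x, int) for x in b"ab")
str_elements_are_strs = all(isinstance(c, str) for c in "ab")


def bytes_is_acceptable(annotation):
    """Hypothetical guard: allow bytes for Sequence or Sequence[int] only."""
    if get_origin(annotation) is not collections.abc.Sequence:
        return False
    args = get_args(annotation)
    # Assumption: a bare Sequence places no constraint on its elements.
    return not args or args[0] is int


print(bytes_is_acceptable(Sequence[int]))
print(bytes_is_acceptable(Sequence[Item]))
```

Under this guard `Sequence[Item]` would no longer admit `bytes`, while a bare `Sequence` still would; whether a bare `Sequence` should also admit `str` is the open question raised above.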

mvaled added a commit to merchise/hypothesis-python that referenced this issue Dec 4, 2019

@mvaled (Contributor, Author) commented Dec 4, 2019

@Zac-HD

I've created PR #2258, but I feel it deserves a test. However, I'm not sure whether it's OK to depend on dataclasses for tests.
