hypothesis.extra.pandas.column, errors out when elements = lists and unique = True #3144
Comments
|
Have no idea myself sorry, here's just a reproducer for folks.
>>> from hypothesis import strategies as st
>>> from hypothesis.extra import pandas as pdst
>>> col1 = pdst.column(elements=st.lists(st.integers()), dtype=list, unique=False)
>>> pdst.data_frames([col1]).example()
0
0 []
>>> col2 = pdst.column(elements=st.lists(st.integers()), dtype=list, unique=True)
>>> pdst.data_frames([col2]).example()
TypeError: unhashable type: 'numpy.ndarray' |
|
Well, the problem is just that we track a You get the same error from |
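To make the failure mode concrete, here is a minimal stdlib-only sketch of a "have we seen this value yet?" check backed by a set. It is not Hypothesis code: `is_new` is a hypothetical helper, and a plain `list` stands in for the `numpy.ndarray` named in the report. The point is only that a set membership test has to hash the value, so any unhashable element fails before uniqueness is ever compared.

```python
def is_new(value, seen):
    """Record `value` in `seen` and report whether it was new.

    Hypothetical helper for illustration; the membership test hashes `value`.
    """
    if value not in seen:
        seen.add(value)
        return True
    return False


seen = set()

# Tuples are hashable, so tracking them works.
first_tuple = is_new((1, 2), seen)
repeat_tuple = is_new((1, 2), seen)

# Lists are not hashable, so the membership test itself raises TypeError.
try:
    is_new([1, 2], seen)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```
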
|
I had labeled this a bug because I think that Even if the user has insight into the fact that we require "hashability" in order to draw unique elements, and used |
|
The underlying problem here is that there's no such thing as |
When defining a hypothesis.extra.pandas.column with list elements and unique=True,
we got this error:
...
/usr/local/lib/python3.8/dist-packages/hypothesis/internal/conjecture/data.py:884: in draw
    return strategy.do_draw(self)
/usr/local/lib/python3.8/dist-packages/hypothesis/strategies/_internal/lazy.py:168: in do_draw
    return data.draw(self.wrapped_strategy)
/usr/local/lib/python3.8/dist-packages/hypothesis/internal/conjecture/data.py:884: in draw
    return strategy.do_draw(self)
/usr/local/lib/python3.8/dist-packages/hypothesis/strategies/_internal/lazy.py:168: in do_draw
    return data.draw(self.wrapped_strategy)
/usr/local/lib/python3.8/dist-packages/hypothesis/internal/conjecture/data.py:884: in draw
    return strategy.do_draw(self)
/usr/local/lib/python3.8/dist-packages/hypothesis/strategies/_internal/core.py:1393: in do_draw
    return self.definition(data.draw, *self.args, **self.kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

draw = <bound method ConjectureData.draw of ConjectureData(VALID, 1961 bytes, frozen)>

    @st.composite
    def just_draw_columns(draw):
        index = draw(index_strategy)
        local_index_strategy = st.just(index)

        data = OrderedDict((c.name, None) for c in rewritten_columns)

        # Depending on how the columns are going to be generated we group
        # them differently to get better shrinking. For columns with fill
        # enabled, the elements can be shrunk independently of the size,
        # so we can just shrink by shrinking the index then shrinking the
        # length and are generally much more free to move data around.

        # For columns with no filling the problem is harder, and drawing
        # them like that would result in rows being very far apart from
        # each other in the underlying data stream, which gets in the way
        # of shrinking. So what we do is reorder and draw those columns
        # row wise, so that the values of each row are next to each other.
        # This makes life easier for the shrinker when deleting blocks of
        # data.
        columns_without_fill = [c for c in rewritten_columns if c.fill.is_empty]

        if columns_without_fill:
            for c in columns_without_fill:
                data[c.name] = pandas.Series(
                    np.zeros(shape=len(index), dtype=c.dtype), index=index
                )
            seen = {c.name: set() for c in columns_without_fill if c.unique}

            for i in range(len(index)):
                for c in columns_without_fill:
                    if c.unique:
                        for _ in range(5):
                            value = draw(c.elements)
>                           if value not in seen[c.name]:
E                           TypeError: unhashable type: 'numpy.ndarray'

/usr/local/lib/python3.8/dist-packages/hypothesis/extra/pandas/impl.py:578: TypeError

But it works fine, when we set column unique=False:

column(name='group_array', elements=lists(integers(min_value=-2147483648, max_value=2147483647), max_size=5, min_size=5, unique=True), dtype=list, fill=None, unique=False)
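For anyone who needs unique list-valued rows in the meantime, one possible direction is to deduplicate on a hashable key derived from each value rather than on the value itself. The sketch below is plain Python, not Hypothesis or pandas API: `hashable_key` and `keep_unique` are hypothetical names, and it assumes the lists contain only hashable leaves such as integers.

```python
def hashable_key(value):
    """Turn (possibly nested) lists into tuples so they can live in a set.

    Assumption: every non-list leaf is already hashable (e.g. int, str).
    """
    if isinstance(value, (list, tuple)):
        return tuple(hashable_key(item) for item in value)
    return value


def keep_unique(candidates, key=hashable_key):
    """Keep the first occurrence of each candidate, compared by `key`."""
    seen = set()
    unique_values = []
    for value in candidates:
        marker = key(value)
        if marker not in seen:  # safe: `marker` is hashable
            seen.add(marker)
            unique_values.append(value)  # the original list is preserved
    return unique_values


rows = keep_unique([[1, 2], [3], [1, 2], []])
```

Here `rows` keeps `[1, 2]`, `[3]` and `[]` as ordinary lists; only the bookkeeping uses tuples.
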