ExtensionDtype should be hashable #22476
So, a safe and low-tech option is to have EA authors declare the meaningful attributes of their ExtensionDtype. For PeriodDtype, this is freq. For CategorcialDtype its the categories and order.
class PeriodDtype(ExtensionDtype): _attributes = ['freq'] def __eq__(self, other): if isinstance(other, compat.string_types): try: other = self._construct_from_string(other) except TypeError: return False if isinstance(other, type(self)): return all( getattr(self, attr) == getattr(other, attr) for attr in self._attributes ) return False def __hash__(self): return hash(tuple(getattr(self, attr) for attr in self._attributes))
That requires a couple things from all the
CategoricalDtype.categories fails on both those counts, so the default implementation would work for it, but that's fine. I think that's a less common case.
The downside is a tiny bit of duplication. We could do fancier things with metaclasses probably, but that seems like overkill.
referenced this issue
Oct 4, 2018
I guess I missed this somehow - my attempt at a SetArray (#22382) has been dormant for some time while other issues were more pressing. In any case, having hashability as a general requirement makes it impossible to implement an extension array for sets, unless there is a way to circumvent that requirement somehow.
While implementation is not around the corner, I'd like for this to be a possibility further down the line. In the same spirit, @jreback commented on my first attempt at sets (without using an extension array) in #21547 (comment):
EDIT: If need be, I guess the backend could use an array of
I haven't looked at the details, but what about a SetDtype isn't hashable? Keep in mind that this is for dtypes, not arrays (which can't be hashable, since they're mutable).…
On Tue, Nov 20, 2018 at 10:38 AM h-vetinari ***@***.***> wrote: I guess I missed this somehow - my attempt at a SetArray (#22382 <#22382>) has been dormant for some time while other issues were more pressing. In any case, having hashability as a general requirement makes it impossible to implement an extension array for sets, unless there is a way to circumvent that requirement somehow. — You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub <#22476 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ABQHIqO22peQdrViEjLvQAgfp8sgw4bPks5uxC_ugaJpZM4WItuD> .
Hmm yes that is ambiguous. I'll rephrase it when I go through the docs before the release.…
On Tue, Nov 20, 2018 at 10:47 AM h-vetinari ***@***.***> wrote: @TomAugspurger <https://github.com/TomAugspurger> Thanks for the quick response. This was my misunderstanding - I read "Pandas now requires that extension dtypes be hashable" in https://pandas-docs.github.io/pandas-docs-travis/whatsnew/v0.24.0.html#extensiontype-changes, and got the wrong idea. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#22476 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ABQHInu-_BMJDaXG0N-GzizpjjM-izlPks5uxDI2gaJpZM4WItuD> .