New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH/CLN: Fix DataType hashing #1172

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
2 participants
@cpcloud
Member

cpcloud commented Oct 16, 2017

Before this PR, every datatype's hash function was hash(type(self)), which
means collisions galore (triggering eq comparison when using complex types
as dict keys). This PR provides a hash function based on __slots__ giving
every type a unique hash based on type(self), its slots (if any) + the
nullable field from the DataType parent class.

@cpcloud cpcloud force-pushed the cpcloud:cleanup-hashing branch 6 times, most recently from d92802d to 89bcb86 Oct 16, 2017

@cpcloud cpcloud self-assigned this Oct 22, 2017

@cpcloud cpcloud added this to the 0.12 milestone Oct 22, 2017

@cpcloud cpcloud added the enhancement label Oct 22, 2017

@cpcloud cpcloud force-pushed the cpcloud:cleanup-hashing branch 4 times, most recently from c9a67d6 to 45f7ec7 Oct 22, 2017

@cpcloud

This comment has been minimized.

Member

cpcloud commented Oct 24, 2017

@wesm can you review this when you get a chance?

@cpcloud cpcloud requested a review from wesm Oct 24, 2017

@wesm

wesm approved these changes Oct 24, 2017

Looks reasonable to me. Thanks for doing this!

self._name = name
if not isinstance(schema, Schema):
raise TypeError(
'schema argument to HasSchem class must be a Schema instance'

This comment has been minimized.

@wesm

wesm Oct 24, 2017

Member

Minor typo

This comment has been minimized.

@cpcloud

cpcloud Oct 24, 2017

Member

Thanks.

@@ -390,78 +415,79 @@ def valid_literal(self, value):
class Int8(Integer):
__slots__ = ()

This comment has been minimized.

@wesm

wesm Oct 24, 2017

Member

Do __slots__ inherit in any reasonable way? (I assume not)

This comment has been minimized.

@cpcloud

cpcloud Oct 24, 2017

Member

That's right. Stated here: https://docs.python.org/3.6/reference/datamodel.html#notes-on-using-slots

The action of a __slots__ declaration is limited to the class where it is defined. As a result, subclasses will have a __dict__ unless they also define __slots__ (which must only contain names of any additional slots).

@cpcloud

This comment has been minimized.

Member

cpcloud commented Oct 24, 2017

Merging on green.

@cpcloud cpcloud force-pushed the cpcloud:cleanup-hashing branch 3 times, most recently from 92ca85b to daea4ca Oct 24, 2017

@cpcloud cpcloud force-pushed the cpcloud:cleanup-hashing branch from daea4ca to 3189e43 Oct 27, 2017

@cpcloud cpcloud closed this in bb9ef9e Oct 28, 2017

@cpcloud cpcloud deleted the cpcloud:cleanup-hashing branch Oct 28, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment