Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-640: [Python] Implement __hash__ and equality for Array scalar values Arrow scalar values #1765

Closed
wants to merge 5 commits into from
@@ -73,6 +73,8 @@ cdef class ArrayValue(Scalar):
raise NotImplementedError(
"Cannot compare Arrow values that don't support as_py()")

def __hash__(self):

This comment has been minimized.

Copy link
@pitrou

pitrou Mar 20, 2018

Contributor

Ok, I think we misunderstood each other.
When you asked about restricting yourself to integers, I thought you meant about writing a fast path that avoids calling as_py(). If there is no fast path, then there's no need to check for ints or any other types.

This comment has been minimized.

Copy link
@AlexHagerman

AlexHagerman Mar 20, 2018

Author Contributor

Sorry about that. I've switch it to hash() using as_py() without any type checks. Per the JIRA ticket can we consider a separate ticket for a fast path custom hash on ints?

This comment has been minimized.

Copy link
@pitrou

pitrou Mar 21, 2018

Contributor

Per the JIRA ticket can we consider a separate ticket for a fast path custom hash on ints?

Definitely.

This comment has been minimized.

Copy link
@AlexHagerman

AlexHagerman Mar 22, 2018

Author Contributor

Ticket created https://issues.apache.org/jira/browse/ARROW-2339. Anything else you would suggest for this PR?

return hash(self.as_py())

cdef class BooleanValue(ArrayValue):

@@ -171,3 +171,30 @@ def test_dictionary(self):
categorical.categories)
for i, c in enumerate(values):
assert v[i].as_py() == c

def test_int_hash(self):
# ARROW-640
int_arr = pa.array([1, 1, 2, 1])
assert hash(int_arr[0]) == hash(1)

def test_float_hash(self):
# ARROW-640
float_arr = pa.array([1.4, 1.2, 2.5, 1.8])
assert hash(float_arr[0]) == hash(1.4)

def test_string_hash(self):
# ARROW-640
str_arr = pa.array(["foo", "bar"])
assert hash(str_arr[1]) == hash("bar")

def test_bytes_hash(self):
# ARROW-640
byte_arr = pa.array([b'foo', None, b'bar'])
assert hash(byte_arr[2]) == hash(b"bar")

def test_array_to_set(self):
# ARROW-640
arr = pa.array([1, 1, 2, 1])
set_from_array = set(arr)
assert isinstance(set_from_array, set)
assert set_from_array == {1, 2}
ProTip! Use n and p to navigate between commits in a pull request.
You can’t perform that action at this time.