Commit

hash_string() should not depend on python's internal unicode representation, also fixes explosion/sense2vec#5 for py2
henningpeters committed Mar 6, 2016
1 parent 7adbd7a commit b740f20
Showing 1 changed file with 2 additions and 4 deletions.
6 changes: 2 additions & 4 deletions spacy/strings.pyx
@@ -23,10 +23,8 @@ import ujson as json


 cpdef hash_t hash_string(unicode string) except 0:
-    # This has to be like this for
-    chars = <char*>PyUnicode_AS_DATA(string)
-    size = PyUnicode_GET_DATA_SIZE(string)
-    return hash64(chars, size, 1)
+    chars = string.encode('utf8')
+    return hash64(<char*>chars, len(chars), 1)


 cdef unicode _decode(const Utf8Str* string):
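
The change above swaps the interpreter's raw unicode buffer for an explicit UTF-8 encoding before hashing. A minimal Python sketch of why that matters follows. It rests on assumptions: `fnv1a_64` is a hypothetical stand-in hash (not spaCy's `hash64`), and the `utf-16-le` / `utf-32-le` encodings only approximate what a narrow or wide Python 2 build might have exposed through its internal buffer.

```python
# Hypothetical stand-in hash (64-bit FNV-1a). It is NOT spaCy's hash64; it is
# used only to show that a byte-level hash depends entirely on its input bytes.
def fnv1a_64(data: bytes) -> int:
    h = 0xCBF29CE484222325  # FNV-1a 64-bit offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF  # multiply by FNV prime, keep 64 bits
    return h


text = "caf\u00e9"

# Assumed approximations of the internal buffers the old code would have hashed.
narrow_build = text.encode("utf-16-le")  # 2 bytes per code unit
wide_build = text.encode("utf-32-le")    # 4 bytes per code point

# The new code always hashes the UTF-8 encoding, whatever the build.
utf8 = text.encode("utf8")

# The "internal" buffers differ in content and length, so a byte-level hash
# computed from them would differ between interpreter builds.
print(len(narrow_build), len(wide_build), len(utf8))
print(narrow_build == wide_build)

# Hashing the UTF-8 bytes gives one value for the same text everywhere.
print(fnv1a_64(utf8) == fnv1a_64("caf\u00e9".encode("utf8")))
```

Encoding first costs an extra bytes object per call, but the hash then depends only on the text itself, so values stored by one interpreter build can be looked up from another.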
