Accented Latin characters in hashtags break entity extraction #16

Closed
sagar opened this Issue · 0 comments

2 participants

@sagar

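# Python 2 session; assumes Extractor has already been imported from the library.
text = "El caso #Bárcenas en el New York Times http://goo.gl/e5Mio #MarcaEspaña"
extractor = Extractor(text)
hts = extractor.extract_hashtags_with_indices(False)
print hts
# Output: both hashtags are cut off at the first accented character
# (expected u'Bárcenas' and u'MarcaEspaña').
[{'indices': (8, 10), 'hashtag': u'B'}, {'indices': (60, 70), 'hashtag': u'MarcaEspa'}]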
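
For comparison, here is a minimal sketch of Unicode-aware hashtag matching in Python 3. It is not the library's implementation: the pattern and the function name `find_hashtags_with_indices` are hypothetical, and it only illustrates that a character class covering Unicode word characters keeps `á` and `ñ` inside the tag. The second index reported above (60, where the `#` sits at character offset 59) also suggests the input was handled as a UTF-8 byte string, so passing a unicode string may matter as well; that is an inference from the output, not something confirmed here.

```python
import re

# Hypothetical pattern, not the library's own. In Python 3, \w in a str
# pattern matches Unicode word characters, so accented letters are included.
# The lookbehind avoids matching a '#' that sits in the middle of a word.
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")


def find_hashtags_with_indices(text):
    """Return hashtags with (start, end) character offsets, '#' included."""
    return [
        {"indices": (m.start(), m.end()), "hashtag": m.group(1)}
        for m in HASHTAG_RE.finditer(text)
    ]


if __name__ == "__main__":
    text = "El caso #Bárcenas en el New York Times http://goo.gl/e5Mio #MarcaEspaña"
    for tag in find_hashtags_with_indices(text):
        print(tag)
```

With this sketch both tags come back whole, and the offsets are character positions rather than byte positions.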

@dryan referenced this issue in Upgrade to 2.0 #17 (merged, 6 of 6 tasks complete)

@dryan closed this