jwktl hangs on russian orthography #72

Open
michael-newsrx opened this issue Aug 15, 2019 · 2 comments

michael-newsrx commented Aug 15, 2019

I'm using the English version, with the latest Wiktionary dump downloaded and parsed.

I'm using Gradle with compile 'com.github.dkpro:dkpro-jwktl:56499bdaab' to obtain the latest snapshot.

I'm analyzing text from various sources, and my test data includes some Russian (I presume) text. The call wkt.getEntriesForWord("Статтю", true); hangs as if it were in an infinite loop.

I was expecting an empty list of entries, not an application hang.

Example term: Статтю

chmeyer (Member) commented Aug 19, 2019

It's not really an infinite loop, but it is definitely unexpected behavior. As a quick fix, you can remove the boolean parameter (i.e., use wkt.getEntriesForWord("Статтю"); instead). Normalization of titles is not supported for non-Latin alphabets, which causes this issue for other entries as well, e.g., Russian ones. I'll see if I can solve the underlying issue in one of the later versions. Please report back whether removing the normalization parameter helps.
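
If you still want normalization for Latin-script words, one possible workaround is to decide per word whether to request it. The sketch below is only an illustration, not part of JWKTL: the class NormalizationGuard and the helper isLatinOnly are hypothetical names, and it assumes that "non-Latin" can be approximated by checking the Unicode script of each letter.

```java
import java.lang.Character.UnicodeScript;

// Hypothetical helper, not part of JWKTL: decides whether a word is safe to
// look up with title normalization enabled.
final class NormalizationGuard {

    private NormalizationGuard() {
    }

    /**
     * Returns true only if every letter in the word belongs to the Latin
     * script. Non-letter characters (digits, hyphens, spaces) are ignored,
     * so a string without any letters also returns true.
     */
    static boolean isLatinOnly(String word) {
        return word.codePoints()
                .filter(Character::isLetter)
                .allMatch(cp -> UnicodeScript.of(cp) == UnicodeScript.LATIN);
    }

    public static void main(String[] args) {
        for (String word : new String[] {"article", "Статтю"}) {
            boolean normalize = isLatinOnly(word);
            // Assumed call site, using the variable name from this issue:
            //   wkt.getEntriesForWord(word, normalize);
            System.out.println(word + " -> normalize=" + normalize);
        }
    }
}
```

With this guard, a Cyrillic term such as Статтю would be looked up without normalization, while Latin-script words keep the normalized lookup. Words that mix scripts are treated as non-Latin.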

chmeyer added this to the JWKTL 1.2.0 milestone on Aug 19, 2019
chmeyer self-assigned this on Aug 19, 2019

michael-newsrx (Author) commented Aug 19, 2019 via email
