POS tags are not correct #2
I'm not sure if this is a problem with the way we are using Hazm or something else. The service calls tagger.tag(بگو) and gets back: @elijahjcooke any thoughts?
So the problem is that Hazm is not tokenizing the text correctly. Hazm should break the text into sentences and then into words, but for some reason it is breaking the words into individual characters instead.
Arethusa currently only sends single words to the parser, not entire sentences.
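One possible explanation for the character-level split, offered as an assumption rather than a confirmed diagnosis: if the tagger expects a list of word tokens and is handed a bare string, Python iterates over that string one character at a time. The sketch below shows the behavior in plain Python. `to_tokens` is a hypothetical helper, and a whitespace split stands in for a real tokenizer such as Hazm's.

```python
def to_tokens(text):
    """Return a list of word tokens.

    Hypothetical helper: a plain whitespace split stands in for a real
    tokenizer. A bare string is never passed through unchanged, so a
    consumer that iterates over its input sees words, not characters.
    """
    if isinstance(text, str):
        return text.split()
    return list(text)


word = "بگو"

# Iterating over a str yields its characters one by one.
chars = list(word)
print(len(chars))

# Wrapping the word as a token list keeps it intact.
tokens = to_tokens(word)
print(len(tokens))
```

If this is indeed the cause, passing `to_tokens(word)` (or simply `[word]`) to the tagger instead of the raw string would be the change to try.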
I haven't tried it with multiple sentences, as it makes the treebanking
On Wed, Feb 10, 2016 at 2:22 PM, Bridget Almas notifications@github.com
OK, then I might know a fix for the problem. @balmas, will Arethusa be automatically updated if I change the code on GitHub?
Thanks! I'll deploy tomorrow!
Ah, sorry, I misunderstood the question here. The morphology service API will not be automatically updated, but I'm happy to deploy for testing when you're ready.
For some words, the service doesn't return any POS:
http://services.perseids.org/pysvc/morphologyservice/analysis/word?word=%D9%84%D8%B7%D9%81&lang=per&engine=hazm
For others, it gives the wrong POS:
http://services.perseids.org/pysvc/morphologyservice/analysis/word?word=%D8%A8%DA%AF%D9%88&lang=per&engine=hazm
It is tagged as a noun, although بگو is a verb.
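For anyone reproducing these requests, here is a minimal sketch of how the query strings above can be built and decoded. The endpoint and parameter names are taken from the two URLs; `analysis_url` and `word_from_url` are hypothetical helper names, and the script only builds and parses URLs without calling the service.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint copied from the URLs in this issue.
BASE = "http://services.perseids.org/pysvc/morphologyservice/analysis/word"


def analysis_url(word, lang="per", engine="hazm"):
    """Build a request URL; urlencode percent-encodes the word as UTF-8."""
    query = urlencode({"word": word, "lang": lang, "engine": engine})
    return BASE + "?" + query


def word_from_url(url):
    """Recover the original word from a request URL."""
    return parse_qs(urlsplit(url).query)["word"][0]


url = analysis_url("بگو")
print(url)
print(word_from_url(url))
```

Decoding the first URL the same way shows that the word with no result is لطف.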