күлә instead of күләм #18

Closed
mansayk opened this issue Jan 16, 2019 · 7 comments
Labels
question Further information is requested

Comments

@mansayk

mansayk commented Jan 16, 2019

^күләмдәге/күлә<n><sg><sg><px1sg><loc><subst><nom>$

@mansayk added the "invalid" label (This doesn't seem right) on Jan 16, 2019
@jonorthwash

@mansayk, what is the context for this word, and what command did you use to get the analysis?

I believe both analyses are possible, and the transducer should return both. If you're also running it through the disambiguator (tagger), you need to provide some context for it to perform accurately; with no surrounding text, it essentially just guesses.

@jonorthwash added the "question" label (Further information is requested) and removed the "invalid" label (This doesn't seem right) on Jan 16, 2019
@mansayk

mansayk commented Jan 16, 2019

echo "күләмдәге" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^күләмдәге/күлә<n><sg><sg><px1sg><loc><subst><nom>$

I don't remember the context of that exact case, but for example, it might be:
"Бик зур күләмдәге эш башкарылган!" ("A lot of work has been done.")
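
As noted earlier in the thread, the transducer is expected to return every possible reading, and the disambiguation steps then choose among them. To inspect the readings separately, a minimal sketch in shell follows. It assumes the standard Apertium stream format, `^surface/reading1/reading2$`, and does not handle escaped slashes; `list_readings` is a hypothetical helper name, not part of Apertium.

```shell
# Split one Apertium stream-format lexical unit, ^surface/reading1/reading2$,
# into one reading per line. list_readings is a hypothetical helper name.
list_readings() {
    # Drop the leading ^ and the trailing $, drop the surface form up to the
    # first slash, then put each remaining reading on its own line.
    printf '%s\n' "$1" | sed -e 's|^\^||' -e 's|\$$||' -e 's|^[^/]*/||' | tr '/' '\n'
}

# The single reading reported in this issue:
list_readings '^күләмдәге/күлә<n><sg><sg><px1sg><loc><subst><nom>$'

# To see what the analyser alone returns, the same pipeline could be run
# without the two cg-proc steps (needs the compiled apertium-tat data; not run here):
#   echo "күләмдәге" | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | apertium-retxt
```

If the analyser-only output contains more than one reading, the disambiguation steps are what narrow it down, and a full sentence such as the one above gives them context to work with.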

@mansayk

mansayk commented Jan 16, 2019

Also, there is no word "күлә" in the Tatar language, so I think it should not be returned at all.

@jonorthwash

@IlnarSelimcan, do you know why күлә is in the transducer?

@mansayk

mansayk commented Jan 28, 2019

I marked this word as Use/Arch, but it makes no difference:

echo 'күләмдәге' | apertium-destxt -n | lt-proc -z -w 'apertium-tat/tat.automorf.bin' | cg-proc -z 'apertium-tat/tat.rlx.bin' | cg-proc -z -w -1 'apertium-tat/dev/mansur.bin' | apertium-retxt
^күләмдәге/күлә<n><sg><sg><px1sg><loc><subst><nom>$

Does this Use/Arch tag work? How can I remove stems with this tag from the analysis?

@mansayk

mansayk commented Jan 28, 2019

I just saw your message in another issue about Use/Arch not being implemented yet. So my previous question has an answer now.

@mansayk

mansayk commented Feb 15, 2019

I disabled this word temporarily.
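
One way to confirm that a stem no longer shows up after a change like this is to search the pipeline output for readings with that exact lemma. The sketch below is only an illustration: `has_lemma` is a hypothetical helper, and it assumes the lemma is followed directly by its first tag, as in the output quoted in this thread.

```shell
# Succeeds (exit status 0) if any reading in the given stream-format output
# has exactly the given lemma. has_lemma is a hypothetical helper name.
has_lemma() {
    # A reading starts after "/" and its lemma ends where the first tag "<" begins.
    printf '%s\n' "$2" | grep -qF -- "/$1<"
}

# Output quoted earlier in this thread, before the stem was disabled:
output='^күләмдәге/күлә<n><sg><sg><px1sg><loc><subst><nom>$'

if has_lemma 'күлә' "$output"; then
    echo "stem күлә is still being returned"
else
    echo "stem күлә is no longer returned"
fi
```

Because the pattern includes the surrounding `/` and `<`, a longer lemma such as күләм does not count as a match for күлә.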

@mansayk closed this as completed on Feb 15, 2019