Irregular inflections are still incorrect #802
Comments
Unfortunately, I do not have the time to examine the irregularities myself. Could you please post the verb which is not inflected properly, along with its base1, base2, base3, base4, base5, base-te and base-ta forms? Thanks! |
I don't think I'm communicating this effectively. It is not an issue of individual verbs; it's an issue of entire classes of verbs/adjectives being handled incorrectly. You don't need to reinvent the wheel, nor try to fix every irregular individually. EDICT already has a conjugator that works well and covers most non-archaic examples: http://edrdg.org/~smg/cgi-bin/hgweb-jmdictdb.cgi/file/tip/python/conj.py?style=gitweb Note the above files need to be saved and opened to display the characters correctly. |
Thanks, but that's precisely it: I unfortunately lack the time and skills to grok Python scripts and figure out how to run them. I also can't use them on Android; I'd need to convert them to Java first. Besides, Aedict already has an inflector which I'd like to use (it's not public though). The best way for you to provide me with data is to state the following here:
|
I see. I don't have the time to chug through all this right now, and I'm not sure when I will. Does this help? I've extracted the relevant explanation of the algorithm from the Python code, see info.txt. I took the conjugation CSV and substituted in all the descriptors so they don't need to be looked up, see conjo.xlsx. Even EDICT is missing a bunch of archaics and some things like v5uru, but it's a start. The spreadsheet contains plenty of conjugations I'm sure Aedict already gets right. The ones that need to be looked at more carefully are: |
Thanks, that CSV file is pretty nifty. I'll revisit those specials one by one and I'll let you know. |
Can you please explain how exactly adj-ix should be inflected differently from adj-i? I don't see any difference in conjo.xlsx. Could you give an example, such as how かっこいい should be inflected? |
v1-s: Ichidan verb, kureru special class: くれる. The difference from v1 is that form5 does not end with ろ, so it's くれ! |
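To make the v1 versus v1-s difference concrete, here is a minimal Python sketch. It assumes that "form5" in the comment above refers to the plain imperative; the function name `plain_imperative` is hypothetical and is not taken from Aedict or from conj.py.

```python
def plain_imperative(word: str, pos: str) -> str:
    """Plain imperative of an ichidan verb, selected by JMdict POS code.

    Hypothetical helper for illustration only.
    """
    if pos not in ("v1", "v1-s"):
        raise ValueError(f"unsupported POS code: {pos}")
    if not word.endswith("る"):
        raise ValueError(f"not an ichidan dictionary form: {word}")
    stem = word[:-1]  # drop the final る
    # v1-s (the kureru class) uses the bare stem; regular v1 appends ろ.
    return stem if pos == "v1-s" else stem + "ろ"


print(plain_imperative("くれる", "v1-s"))  # くれ
print(plain_imperative("たべる", "v1"))    # たべろ
```

Keying on the POS code rather than on the word itself means compounds tagged v1-s pick up the same rule without a per-word exception.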
Cool! Check over the algorithm notes at the bottom of info.txt. I'll walk through adj-ix non-past negative plain with かっこいい:
3a) Skip because euphr is not null. The only ones that currently have these extra steps are adj-ix, vk, vs-s, and vs-i. So basically いい, くる, and する, but processing by the POS code is necessary because each of these is effectively an entire class of verbs with multiple dictionary entries for compounds that conjugate like them. |
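As a rough illustration of the expected result (not the step-by-step algorithm from info.txt), the Python sketch below shows how the plain non-past negative differs between adj-i and adj-ix. The function name is hypothetical, and it only handles kana spellings ending in いい; kanji spellings ending in 良い are deliberately left out.

```python
def plain_negative(adjective: str, pos: str) -> str:
    """Plain non-past negative of an i-adjective, selected by JMdict POS code.

    Hypothetical helper for illustration only.
    """
    if pos == "adj-i":
        if not adjective.endswith("い"):
            raise ValueError(f"not an i-adjective: {adjective}")
        # Regular i-adjectives: drop い, add くない.
        return adjective[:-1] + "くない"
    if pos == "adj-ix":
        if not adjective.endswith("いい"):
            raise ValueError(f"expected a kana form ending in いい: {adjective}")
        # The いい class inflects on the よい stem: the final いい becomes よ.
        return adjective[:-2] + "よくない"
    raise ValueError(f"unsupported POS code: {pos}")


print(plain_negative("かっこいい", "adj-ix"))  # かっこよくない
print(plain_negative("かっこいい", "adj-i"))   # かっこいくない (incorrect form)
```

The second call shows what goes wrong when かっこいい is treated as a plain adj-i, which is why the POS code has to drive the choice of rule.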
Thanks! v5k-s fixed. Now for this pesky adj-ix |
Fixed adj-ix |
Fixed v5r-i |
v5u-s done |
|
With this I believe that all inflections are in place. Fixed in Aedict 3.44; please reopen if any inflections are off. |
3.44 isn't on the Play Store yet, but just preempting:
I'm trying to find WWWJDIC's actual conjugation code, because clearly it can't be using just the sheet I posted. So far I've come up dry. I'll try emailing the developer. |
Thanks for letting me know. You're right with Regarding |
vs-s: Fixed in Aedict 3.45 |
For future reference, I've attached the corrected spreadsheet. The JMDict developers have been notified and fixed the issue with vs-s. |
See #758
The solution that closed the above bug was to special-case 行く, いい, etc. That's not a good solution, because it fails to catch extensions like もっていく and かっこいい.
A better solution would be for the inflection panel to take the POS code as an input and correctly handle the irregular classes such as vs, vk, adj-ix, v5k-s, etc. The full list of JMDict POS codes should be considered, since there are many obscure irregulars: http://www.edrdg.org/wwwjdic/wwwjdicinf.html#code_tag.
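A minimal Python sketch of the POS-driven approach, using the te-form of v5k versus v5k-s as the example. The table `TE_FORM_RULES` and the function `te_form` are hypothetical names for illustration; they are not Aedict's inflector or the conj.py implementation, and a real table would cover every POS code and every form.

```python
# Hypothetical rule table: POS code -> (dictionary ending, te-form ending).
TE_FORM_RULES = {
    "v5k": ("く", "いて"),    # regular godan -ku verbs: 書く -> 書いて
    "v5k-s": ("く", "って"),  # iku/yuku special class: 行く -> 行って
}


def te_form(word: str, pos: str) -> str:
    """Build the te-form from the dictionary form and its JMdict POS code."""
    if pos not in TE_FORM_RULES:
        raise ValueError(f"unsupported POS code: {pos}")
    ending, replacement = TE_FORM_RULES[pos]
    if not word.endswith(ending):
        raise ValueError(f"{word} does not end in {ending}")
    return word[: -len(ending)] + replacement


print(te_form("行く", "v5k-s"))        # 行って
print(te_form("もっていく", "v5k-s"))  # もっていって
print(te_form("書く", "v5k"))          # 書いて
```

Because the rule is selected by POS code, a compound such as もっていく is handled by the same entry as 行く, with no per-word special case.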