Investigate Japanese board #36
Comments
For a Japanese board, I would imagine each tile being a kana. Kana are syllabic characters that, like letters, make up words. There are two ways of writing the same kana: hiragana and katakana. There is already a word game that involves linking kana (Shiritori) and uses a similar mechanism. I am not a Japanese speaker, but I definitely see a Japanese board being feasible.
Disclaimer: I do not speak Japanese as my mother tongue. I'm learning Japanese as a hobby and can understand/read it to some degree. But I would be interested to see Lexica with Japanese words. I agree with @Riotism. IMHO it would be possible and reasonable to use only kana. The two syllabaries (hiragana and katakana) are used for different purposes but are otherwise interchangeable, so you could use just one of them. To get a Japanese word list, I think you would have to start with a good dictionary and get the readings (kana) for all words, because most Japanese words are written in a script called kanji, which are logograms like Chinese characters. Then you could convert all kana into one of the syllabaries and use the result as the word list. Currently the dictionary most free apps use is JMdict/EDICT. It can be used under the terms of the Creative Commons licence. I wrote a simple Python script to get the dictionary and create a usable word list from it. Here is the result: https://gist.github.com/wichmann/7912e0f7694ad8fdbd584b94b2e792f0.
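To illustrate the "convert all kana into one of the syllabaries" step, here is a minimal sketch. It is not the script from the gist; the function name `katakana_to_hiragana` is made up for this example. It relies on the fact that the katakana ァ..ヶ (U+30A1..U+30F6) sit exactly 0x60 code points above their hiragana counterparts in Unicode. Anything outside that range, including the long-vowel mark ー, is left untouched.

```python
# Sketch only: fold katakana into hiragana via the fixed Unicode offset.
KATAKANA_START = 0x30A1  # ァ
KATAKANA_END = 0x30F6    # ヶ
OFFSET = 0x60            # distance between a katakana and its hiragana


def katakana_to_hiragana(text):
    """Return text with katakana replaced by the corresponding hiragana."""
    return "".join(
        chr(ord(ch) - OFFSET) if KATAKANA_START <= ord(ch) <= KATAKANA_END else ch
        for ch in text
    )
```

Applied to every reading taken from the dictionary, this would leave a list that uses a single syllabary, apart from characters such as ー that have no hiragana counterpart at that offset.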
Oh, that is great, thanks so much @wichmann! I've taken your word list, and it does indeed work successfully (working on my fork on a branch called
I will try and prepare a release with it to get further feedback, but before that I'll quickly test:
@wichmann - Do you mind if I include your script in the Also, would you be able to provide any feedback on the letter scores I've taken from Wikipedia and added here?
I've taken the "small letters" and put them next to what looked (to my naive English-reading eyes) like the larger versions of the same letters, giving them the same score: Commit message above explains further.
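The idea of giving a small kana the same score as its full-size form could be sketched as follows. This is only an illustration, not the code from the commit: `SMALL_TO_FULL` and `score_for` are hypothetical names, and the score table is supplied by the caller rather than taken from any real source.

```python
# Sketch only: look up a small kana's score under its full-size form.
SMALL_TO_FULL = str.maketrans(
    "ぁぃぅぇぉっゃゅょゎ",  # small hiragana
    "あいうえおつやゆよわ",  # their full-size counterparts
)


def score_for(kana, scores):
    """Return the score of a kana, treating small kana as full-size ones.

    `scores` is a caller-supplied dict mapping full-size kana to points.
    """
    return scores[kana.translate(SMALL_TO_FULL)]
```

With this approach the score table only needs one entry per full-size kana, so the small forms cannot drift out of sync with them.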
Now I've dealt with the diacritics in this commit: Only a few more characters left:
Any feedback for these?
FYI, I'm guessing that the idea in #71 will also be appropriate here, based on the Wikipedia article about Scrabble letters and how they seem to be somewhat normalized (with regard to diacritics). If so, it will probably have to wait until I or someone else is able to implement the necessary changes to the guts of Lexica and how it stores word lists internally.
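One possible way to normalize kana with regard to diacritics, sketched with Python's standard `unicodedata` module. This is an assumption about what such normalization could look like, not how Lexica does it; `strip_voicing_marks` is a hypothetical name. Unicode canonical decomposition (NFD) splits a voiced kana such as が into its base か plus a combining mark, which can then be dropped.

```python
import unicodedata

# Combining marks produced when voiced/semi-voiced kana are decomposed.
DAKUTEN = "\u3099"     # combining voiced sound mark
HANDAKUTEN = "\u309a"  # combining semi-voiced sound mark


def strip_voicing_marks(text):
    """Return text with dakuten/handakuten removed from kana."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(
        ch for ch in decomposed if ch not in (DAKUTEN, HANDAKUTEN)
    )
    # Recompose anything else that NFD may have split apart.
    return unicodedata.normalize("NFC", stripped)
```

Whether voiced and unvoiced kana should really share a tile is a game-design question this sketch does not answer.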
@pserwylo - Thanks for all your work. Of course you can include my script. As for the license, GPLv3+ is fine by me. As for the characters left: "ゐ" and "ゑ" are obsolete hiragana which are not used today, only in old texts. "〜" represents a Japanese tilde; IMHO it is never used in words, only for ranges or special purposes. All words with these three characters can be eliminated from the word list, as there are only a few of them. "を" is used as a grammatical marker ("particle") and in loan words, but usually not in Japanese dictionary words. Mostly, it is present in the word list because the list contains phrases where it serves as a particle. "が" and "ぎ" are just versions of "か" and "き" with diacritics. "ー" is the symbol for a long vowel, almost never used with hiragana, only with katakana. My script tries to convert all words to hiragana and wrongly leaves these characters in. In hiragana the symbol should be replaced by the vowel which it represents. Maybe there is a better way to make the conversion in the script?!
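A rough sketch of the clean-up described above: drop words containing "ゐ", "ゑ" or "〜", and replace "ー" with the vowel of the kana before it. All names here (`VOWEL_OF`, `expand_long_vowels`, `clean_words`) are made up for this example, and the rule is a simplifying assumption: it just repeats the preceding vowel, whereas native spelling often writes a long "o" with う and a long "e" with い, which this does not attempt.

```python
# Sketch only: hiragana grouped by the vowel they end in.
ROWS = {
    "あ": "あかさたなはまやらわがざだばぱぁゃゎ",
    "い": "いきしちにひみりぎじぢびぴぃ",
    "う": "うくすつぬふむゆるぐずづぶぷぅゅゔ",
    "え": "えけせてねへめれげぜでべぺぇ",
    "お": "おこそとのほもよろごぞどぼぽぉょを",
}
VOWEL_OF = {kana: vowel for vowel, row in ROWS.items() for kana in row}

# Characters whose words are simply removed from the list.
EXCLUDED = set("ゐゑ〜")


def expand_long_vowels(word):
    """Replace each ー with the vowel of the preceding kana, if known."""
    out = []
    for ch in word:
        if ch == "ー" and out and out[-1] in VOWEL_OF:
            out.append(VOWEL_OF[out[-1]])
        else:
            out.append(ch)  # a leading or unresolvable ー is kept as-is
    return "".join(out)


def clean_words(words):
    """Drop words with excluded characters, then expand long vowels."""
    return [
        expand_long_vowels(w) for w in words if not EXCLUDED.intersection(w)
    ]
```

A "ー" with no usable kana before it is left in place, so such words would still need to be filtered or reviewed by hand.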
Closing as a Japanese dictionary has existed for some time. If there are any issues with it, we can always open new issues.
Pretty much exactly the same as #35 (but that is for Chinese). The Japanese UI translation was one of the first to be contributed, so I want to also have a Japanese version of the board, but I don't have enough knowledge of the language to figure out if it is meaningful or how to go about implementing it.