Show word frequency information in the JMDict entry detail screen #539
Comments
Hi Martin, I played around a bit with the word and kanji frequencies computed by Tatsuhiko Matsushita (http://www17408ui.sakura.ne.jp/tatsum/English_top_Tatsu.html) to see how such information could be useful. Here is an example to make this concrete. I found "ぜいたく" in a text and looked up its meaning, "luxury", using Aedict. The hiragana form is underlined, but it also has a kanji form, "贅沢", which is underlined as well.
My current targets are 1000 kanji and 10000 words, with a core target of 500 kanji and 5000 words. So I could safely memorize ぜいたく directly and add 沢 to my list of kanji to learn later. In conclusion, it would be nice to be able to set two frequency levels for words, used to show them in three shades (black, dark grey, light grey) in all word screens. The same would apply to kanji in the kanji-specific screens. Another use would be to start with a list of kanji ordered by frequency: take one, show its most frequent words, then learn those words and the most frequent kanji in them, with the sample sentences used to help memorize words and kanji at the same time. Andre |
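To illustrate the two-threshold idea, here is a minimal sketch in Python. It is not Aedict code: the function name `shade_for_rank`, the default limits (5000 and 10000, taken from the targets mentioned above), and the choice that the most frequent words are shown in black are all assumptions made for the example.

```python
from typing import Optional


def shade_for_rank(rank: Optional[int],
                   core_limit: int = 5000,
                   target_limit: int = 10000) -> str:
    """Pick one of three shades from a 1-based frequency rank.

    Hypothetical helper: rank=None means the word is not in the
    frequency list at all and is treated like a rare word.
    """
    if rank is not None and rank <= core_limit:
        return "black"       # inside the core target
    if rank is not None and rank <= target_limit:
        return "dark grey"   # inside the wider target
    return "light grey"      # beyond both targets, or unlisted


if __name__ == "__main__":
    for r in (1, 5000, 5001, 10000, 10001, None):
        print(r, shade_for_rank(r))
```

The same function could be reused for kanji by passing kanji-specific limits, for example `shade_for_rank(rank, core_limit=500, target_limit=1000)`.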
Regarding the kanji frequency list: I am currently using this list of 1000 most frequent kanjis. The problem is that I cannot remember where I got this list ;) Please use it as-is ;) 日 is the most frequent kanji, followed by 一, etc. 日一国会人年大十二本中長出三同時政事自行社見月分議後前民生連五発間対上部東者党地合市業内相方四定今回新場金員九入選立開手米力学問高代明実円関決子動京全目表戦経通外最言氏現理調体化田当八六約主題下首意法不来作性的要用制治度務強気小七成期公持野協取都和統以機平総加山思家話世受区領多県続進正安設保改数記院女初北午指権心界支第産結百派点教報済書府活原先共得解名交資予川向際査勝面委告軍文反元重近千考判認画海参売利組知案道信策集在件団別物側任引使求所次水半品昨論計死官増係感特情投示変打男基私各始島直両朝革価式確村提運終挙果西勢減台広容必応演電歳住争談能無再位置企真流格有疑口過局少放税検藤町常校料沢裁状工建語球営空職証土与急止送援供可役構木割聞身費付施切由説転食比難防補車優夫研収断井何南石足違消境神番規術護展態導鮮備宅害配副算視条幹独警宮究育席輸訪楽起万着乗店述残想線率病農州武声質念待試族象銀域助労例衛然早張映限親額監環験追審商葉義伝働形景落欧担好退準賞訴辺造英被株頭技低毎医復仕去姿味負閣韓渡失移差衆個門写評課末守若脳極種美岡影命含福蔵量望松非撃佐核観察整段横融型白深字答夜製票況音申様財港識注呼渉達良響阪帰針専推谷古候史天階程満敗管値歌買突兵接請器士光討路悪科攻崎督授催細効図週積丸他及湾録処省旧室憲太橋歩離岸客風紙激否周師摘材登系批郎母易健黒火戸速存花春飛殺央券赤号単盟座青破編捜竹除完降超責並療従右修捕隊危採織森競拡故館振給屋介読弁根色友苦就迎走販園具左異歴辞将秋因献厳馬愛幅休維富浜父遺彼般未塁貿講邦舞林装諸夏素亡劇河遣航抗冷模雄適婦鉄寄益込顔緊類児余禁印逆王返標換久短油妻暴輪占宣背昭廃植熱宿薬伊江清習険頼僚覚吉盛船倍均億途圧芸許皇臨踏駅署抜壊債便伸留罪停興爆陸玉源儀波創障継筋狙帯延羽努固闘精則葬乱避普散司康測豊洋静善逮婚厚喜齢囲卒迫略承浮惑崩順紀聴脱旅絶級幸岩練押軽倒了庁博城患締等救執層版老令角絡損房募曲撤裏払削密庭徒措仏績築貨志混載昇池陣我勤為血遅抑幕居染温雑招奈季困星傷永択秀著徴誌庫弾償刊像功拠香欠更秘拒刑坂刻底賛塚致抱繰服犯尾描布恐寺鈴盤息宇項喪伴遠養懸戻街巨震願絵希越契掲躍棄欲痛触邸依籍汚縮還枚属笑互複慮郵束仲栄札枠似夕恵板列露沖探逃借緩節需骨射傾届曜遊迷夢巻購揮君燃充雨閉緒跡包駐貢鹿弱却端賃折紹獲郡併草徹飲貴埼衝焦奪雇災浦暮替析預焼簡譲称肉納樹挑章臓律誘紛貸至宗促慎控 |
Regarding the word frequency list, I am currently using Michiel Kamermans' word occurrence data (http://ftp.monash.edu.au/pub/nihongo/00INDEX.html). The Mainichi Shimbun frequency list is not as good, as it reflects only a specific part of the Japanese language. |
A kanji commonality information item has been added in Aedict 3.37: just click the (i) button next to the kanji to show the commonality information. |
On 2015-09-18 08:04, Martin Vysny wrote:
|
On 2015-09-18 07:54, Martin Vysny wrote:
|
Yes, every word-based search should automatically be sorted, most frequent words first. This includes the Kanji Detail screen's "WORDS" tab. |
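As a rough illustration of "most frequent words first", the Python sketch below sorts search results by a 1-based frequency rank and puts entries without a rank at the end. The `Entry` shape, the function name, and the sample ranks are hypothetical and are not taken from Aedict or from any real frequency list.

```python
from typing import List, Optional, Tuple

# (word, rank) pairs; rank is 1-based, None means "not in the frequency list".
Entry = Tuple[str, Optional[int]]


def sort_most_frequent_first(entries: List[Entry]) -> List[Entry]:
    """Return entries ordered by ascending rank, unranked entries last.

    sorted() is stable, so entries with equal keys keep their input order.
    """
    return sorted(
        entries,
        key=lambda e: (e[1] is None, e[1] if e[1] is not None else 0),
    )


if __name__ == "__main__":
    # Ranks below are invented purely for the demonstration.
    results = [("書斎", 9000), ("母", 300), ("贅沢", None), ("日", 1)]
    for word, rank in sort_most_frequent_first(results):
        print(word, rank)
```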
Perfect. -------- Original message -------- Max. 6 stars: 6 stars means most common, 0 stars least common. Matsushita: 6 stars are roughly index 1..3333, 5 stars are roughly index 3334..6666, etc. Please let me know if this is okay. |
Changed the star pattern a bit: Matsushita 6 stars roughly correspond to index 1..4000, 5 stars correspond to 4001..8000, ..., 1 star corresponds to index 16000..21000, and zero stars mean that the entry is not present in the Matsushita list at all. |
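The star scheme can be sketched as a simple banding function. The Python below is only an approximation under stated assumptions: it uses uniform bands of 4000 entries and clamps everything past the last band to 1 star, whereas the comment above describes the boundaries as rough (the lowest band is given as 16000..21000), so the exact cut-offs used in Aedict may differ. The name `stars_for_rank` and its parameters are made up for the example.

```python
from typing import Optional


def stars_for_rank(rank: Optional[int],
                   band_width: int = 4000,
                   max_stars: int = 6) -> int:
    """Convert a 1-based frequency index into 0..max_stars stars.

    Assumption: uniform bands of band_width entries; any listed entry
    gets at least 1 star, and an unlisted entry (None) gets 0 stars.
    """
    if rank is None:
        return 0
    if rank < 1:
        raise ValueError("rank is 1-based")
    band = (rank - 1) // band_width  # 0 for the most frequent band
    return max(1, max_stars - band)


if __name__ == "__main__":
    for r in (1, 4000, 4001, 8000, 8001, 20001, None):
        print(r, stars_for_rank(r))
```

Passing `band_width=3333` gives the earlier proposal, where 6 stars cover roughly index 1..3333 and 5 stars roughly 3334..6666.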
Fixed in Aedict 3.37 |
@mvysny Hi Martin, I'm interested in purchasing Aedict3, but I would like to easily see the word frequencies from Matsushita (I want to add only common words to Anki). The current screenshots on Google Play don't show the frequency stars or numbers for words. Can you confirm that the frequencies are still being shown in the app? |
@denyeo: Hi denyeo, I apologize, the screenshots are quite old ;) Yes, the Matsushita+Kamermans occurrence index (the number of stars) is shown in the newest Aedict. Just make sure that you have the newest dictionaries installed, and you should be good to go. I have added an Aedict screenshot showing the Matsushita occurrence index; it may take a couple of hours to appear on Google Play. |
@mvysny Thanks for adding the screenshot! It shows a frequency for 母 (a single kanji), so I'd like to confirm that:
(Matsushita in fact has a 60,000 word frequency list, available at http://tatsuma2010.web.fc2.com/ Vocabulary Database for Reading Japanese (for Teachers) Ver. 1.0, which you might consider using in future for lower frequencies such as 20,000-30,000. But this is a small and unessential thing.) Appreciate it! |
Yes, the Matsushita frequency list is in fact VDLJ-GL 1.0 taken from http://www17408ui.sakura.ne.jp/tatsum/English_top_Tatsu.html Sure, the frequency list applies to words, not to kanjis per se. Yet, the screenshot of a single hon kanji could in fact be confusing, so I updated the screenshot to show the word 書斎. Attaching screenshot, I have also updated the screenshot at Google Play. |
@mvysny You're the best. I've bought the app. Hope many others do! |
Thanks man for your support, I hope Aedict will serve you well ;) |
Hi Martin,
A problem I face when trying to read in a new language is getting access to easy but interesting texts that are well integrated with a dictionary, a grammar and a learning tool (including word frequencies). Aedict is going in the right direction, but it uses phrases only, not full texts. There are also no word frequencies, and the kanji frequencies are too general and not well integrated into the learning tools.
As a source of good texts I am using News Web Easy (http://www3.nhk.or.jp/news/easy/index.html). I also recently discovered Kanji Web Easy (http://www.kanjiwebeasy.com/), which helps to focus on what is really useful to learn (at least for News Web Easy).
Would it be possible to integrate these tools in Aedict (hyperlinks might be enough), offering a complete learning environment for all levels? I am pretty sure Sebastien from Kanji Web Easy would be open to a collaboration. I have no idea about the people behind News Web Easy (http://www.aovill.com/).
Of course other computed frequencies available on the web could be used.
Andre