USAGE:
- copy Anki's collection.anki2 file into findvocab folder
- run python findvocab.py [W<f>|m<n>|w<k>|r<k>|a<n>|Ar|kr|ko|nk|jo]
- W<f>: use file <f> as word list, defaults to "word_list"
- m<n>: set minimum length for words to <n>, defaults to 0
- w<k>: with additional kanji <k>
- r<k>: require words to include kanji <k>
- a<n>: allow words to include <n> unknown kanji, defaults to 0
- Ar: require words to include at least one unknown kanji
- kr: kanji required, words w/o any kanji are omitted
- ko: kanji only, words w/ at least one non kanji character are omitted
- nk: no kanji, words w/ at least one kanji are omitted
- jo: jouyou kanji only, words w/ non jouyou kanji are omitted
examples:
(1) python3 findvocab.py
(-) python3 findvocab.py kr m2
(2) python3 findvocab.py Ar a3 m2 jo
(-) python3 findvocab.py w戻係犯 ko m2 Wmy_own_list
- a dialog will appear, choose your kanji deck
- view output file
FUNCTION:
(1) the script will determine words that you're already able to read/write but don't appear in your anki deck
(2) the script will determine words that use between one and three jouyou kanji, have a minimum length of two characters and don't appear in your anki deck
WORD LISTS:
(a) word_list based on http://pomax.nihongoresources.com/index.php?entry=1222520260
(b) word_list_small from http://www.offbeatband.com/2010/12/the-most-commonly-used-japanese-words-by-frequency/ which is based on (b)