Determines words that you're already able to read/write but that don't appear in your Anki deck
.gitignore
COPYING
README
findvocab.py
jouyou
word_list
word_list_small

README

USAGE:

- copy Anki's collection.anki2 file into the findvocab folder
- run python3 findvocab.py [W<f>|m<n>|w<k>|r<k>|a<n>|Ar|kr|ko|nk|jo]

    - W<f>: use file <f> as word list, defaults to "word_list"
    - m<n>: set minimum length for words to <n>, defaults to 0
    - w<k>: treat the additional kanji <k> as known
    - r<k>: require words to include kanji <k>
    - a<n>: allow words to include <n> unknown kanji, defaults to 0
    - Ar: require words to include at least one unknown kanji
    - kr: kanji required, words w/o any kanji are omitted
    - ko: kanji only, words w/ at least one non kanji character are omitted
    - nk: no kanji, words w/ at least one kanji are omitted
    - jo: jouyou kanji only, words w/ non jouyou kanji are omitted

    examples:
        (1) python3 findvocab.py
        (-) python3 findvocab.py kr m2
        (2) python3 findvocab.py Ar a3 m2 jo
        (-) python3 findvocab.py w戻係犯 ko m2 Wmy_own_list

- a dialog will appear; choose your kanji deck
- view the output file
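
A minimal sketch of how option strings like the ones above could be parsed. This is not taken from findvocab.py: the names Options, FLAGS and parse_options and all field names are hypothetical, and the sketch assumes each option is passed as a separate, case-sensitive argument.

```python
from dataclasses import dataclass

# Two-letter switches that take no value (hypothetical field names).
FLAGS = {
    "Ar": "require_unknown",
    "kr": "kanji_required",
    "ko": "kanji_only",
    "nk": "no_kanji",
    "jo": "jouyou_only",
}


@dataclass
class Options:
    word_list: str = "word_list"   # W<f>
    min_length: int = 0            # m<n>
    extra_kanji: str = ""          # w<k>
    required_kanji: str = ""       # r<k>
    allowed_unknown: int = 0       # a<n>
    require_unknown: bool = False  # Ar
    kanji_required: bool = False   # kr
    kanji_only: bool = False       # ko
    no_kanji: bool = False         # nk
    jouyou_only: bool = False      # jo


def parse_options(args):
    """Turn a list such as ["Ar", "a3", "m2", "jo"] into an Options object."""
    opts = Options()
    for arg in args:
        if arg in FLAGS:
            # Exact two-letter switches are checked first so that "Ar"
            # is never mistaken for a value-carrying option.
            setattr(opts, FLAGS[arg], True)
        elif arg.startswith("W"):
            opts.word_list = arg[1:]
        elif arg.startswith("m"):
            opts.min_length = int(arg[1:])
        elif arg.startswith("w"):
            opts.extra_kanji += arg[1:]
        elif arg.startswith("r"):
            opts.required_kanji += arg[1:]
        elif arg.startswith("a"):
            opts.allowed_unknown = int(arg[1:])
        else:
            raise ValueError(f"unknown option: {arg!r}")
    return opts
```

With this sketch, parse_options(["Ar", "a3", "m2", "jo"]) sets allowed_unknown to 3 and min_length to 2, and turns on require_unknown and jouyou_only; everything else keeps the defaults listed above.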


FUNCTION:

(1) the script determines words that you're already able to read/write (no unknown kanji are allowed by default) but that don't appear in your Anki deck
(2) the script determines words that contain between one and three unknown kanji, use only jouyou kanji, have a minimum length of two characters, and don't appear in your Anki deck
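
The filtering described in (1) and (2) can be sketched roughly as follows. This is an illustration under assumptions, not the code in findvocab.py: the function names, the parameters known, in_deck and jouyou, the choice to count distinct unknown kanji, and the use of the basic CJK Unified Ideographs range (U+4E00 to U+9FFF) as the definition of "kanji" are all hypothetical.

```python
def is_kanji(ch):
    # Assumption: a character counts as kanji if it lies in the basic
    # CJK Unified Ideographs block.
    return "\u4e00" <= ch <= "\u9fff"


def keep_word(word, known, *, min_length=0, allowed_unknown=0,
              require_unknown=False, jouyou=None):
    """Return True if `word` passes the m<n>, a<n>, Ar and jo style checks."""
    if len(word) < min_length:
        return False
    kanji = {ch for ch in word if is_kanji(ch)}
    unknown = kanji - set(known)          # distinct kanji not yet learned
    if len(unknown) > allowed_unknown:
        return False
    if require_unknown and not unknown:
        return False
    if jouyou is not None and not kanji <= set(jouyou):
        return False
    return True


def find_vocab(words, known, in_deck, **options):
    """Keep words that pass the filters and are not already in the deck."""
    return [w for w in words
            if w not in in_deck and keep_word(w, known, **options)]


if __name__ == "__main__":
    words = ["日本", "本日", "英語", "ひらがな"]
    known = "日本語"      # kanji taken from the kanji deck (made-up sample)
    in_deck = {"日本"}    # words already in the deck (made-up sample)
    # Case (1): defaults, only words without unknown kanji.
    print(find_vocab(words, known, in_deck))
    # Similar to case (2), without the jouyou check.
    print(find_vocab(words, known, in_deck, min_length=2,
                     allowed_unknown=3, require_unknown=True))
```

The sample data is invented for the illustration. With the defaults, words without any kanji also pass, since they contain no unknown kanji.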


WORD LISTS:

(a) word_list is based on http://pomax.nihongoresources.com/index.php?entry=1222520260
(b) word_list_small is from http://www.offbeatband.com/2010/12/the-most-commonly-used-japanese-words-by-frequency/ which is based on (a)