weibeld/anki-japanese

Anki Japanese

Anki deck with sub-decks for learning Japanese:

  1. Vocabulary
    1. vocab-read
    2. vocab-speak
    3. vocab-write
  2. Kanji
    • TODO: kanji to keyword (keyword: Heisig and possibly own)
    • TODO: keyword to kanji
    • TODO: kanji to kaki-kata
    • Radicals (not priority)
      • Radical to keyword
      • Keyword to radical
      • Kanji to radical (both 214 and 69)
  3. Grammar
    • TODO: grammar item to explanation

Export and import

Export:

  1. Anki > File > Export...
    • Export format: Anki Deck Package (.apkg)
    • Include: japanese deck
    • Uncheck Include scheduling information and check Include media
  2. Click Export... and save as japanese.apkg

Note: the above has to be done from the main Anki window (small window showing all decks), not from the Browse window (which shows all notes, note types, etc.).

Import:

  1. Import deck
    • Anki > File > Import...
    • Select japanese.apkg
    • Follow import dialog
  2. Install kanji data files
    cp collection.media/* ~/Library/Application\ Support/Anki2/User\ 1/collection.media

Note: the above will add the deck to the current collection.
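The copy step can be wrapped in a small check that the data files actually arrived. This is only a sketch: the function name install_kanji_data is hypothetical, and the default destination assumes the macOS profile path used in the command above.

```shell
# Hypothetical helper: copy the kanji data files into the Anki media folder
# and report how many _japanese_*.json files are present afterwards.
install_kanji_data() {
  src=${1:-collection.media}
  dest=${2:-"$HOME/Library/Application Support/Anki2/User 1/collection.media"}
  [ -d "$dest" ] || { echo "media folder not found: $dest" >&2; return 1; }
  cp "$src"/* "$dest"/ || return 1
  # Count the data files that are now in the media folder.
  count=$(find "$dest" -maxdepth 1 -name '_japanese_*.json' | wc -l)
  echo "installed: $((count)) kanji data files in $dest"
}
```

Calling install_kanji_data without arguments copies from ./collection.media into the default profile; pass a source and a destination folder for other profiles or platforms.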

JavaScript debugging

  1. Make sure the AnkiWebView Inspector add-on is installed
  2. In the card preview window, right-click any element and select Inspect

Custom font

Kanji stroke order font v4.004 is included in the japanese.apkg file and thus will be installed in the local Anki copy when the deck is imported (this font will also be automatically synced to AnkiDroid).

To manually install a custom font into Anki, proceed as follows:

  1. Download the font as a .ttf file
  2. Move the .ttf file into the Anki media folder, prepending an underscore to the file name:
    • For example:
      mv KanjiStrokeOrders_v4.004.ttf ~/Library/Application\ Support/Anki2/User\ 1/collection.media/_KanjiStrokeOrders_v4.004.ttf
      

      Note: the underscore causes the file to be ignored by the Anki Check Media function.

  3. Declare the font in the card template CSS:
    @font-face {
      font-family: MyName;
      src: url("_KanjiStrokeOrders_v4.004.ttf");
    }

    Note: MyName may be any arbitrary name.

  4. Use the font in the card template CSS:
    font-family: MyName;
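Step 2 can also be scripted. The following is a minimal sketch, not part of this repository: the function name install_anki_font is hypothetical, the default media folder assumes the macOS path from the example above, and it uses cp instead of mv so that the downloaded file is kept.

```shell
# Hypothetical helper: copy a .ttf file into the Anki media folder under a
# name that starts with an underscore, as described in step 2.
install_anki_font() {
  font=$1
  media=${2:-"$HOME/Library/Application Support/Anki2/User 1/collection.media"}
  [ -f "$font" ] || { echo "font file not found: $font" >&2; return 1; }
  name=$(basename "$font")
  # Add the underscore only if the file name does not already start with one.
  case "$name" in
    _*) ;;
    *) name="_$name" ;;
  esac
  cp "$font" "$media/$name" || return 1
  # Print the name to use in the src: url("...") line of the @font-face rule.
  echo "$name"
}
```

For example, install_anki_font KanjiStrokeOrders_v4.004.ttf prints _KanjiStrokeOrders_v4.004.ttf, which is the file name referenced in step 3.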

Kanji data

The kanji data in collection.media has been sourced from kanjiapi.dev.

The format of the full data file is as follows:

{
  "kanjis": {
    "日": {...},
    "本": {...},
    ...
  },
}

Note: the {...} objects correspond to the data returned by the kanjiapi.dev /kanji/ API endpoint.

The kanji data files in collection.media have been obtained from the full data file via the following processing commands.

Transforming into list:

cat kanjiapi_full.json | jq '.kanjis | to_entries' >data.json

Note: this transforms the kanjiapi.dev data into a JSON list of objects with "key" and "value" fields, where "key" is the kanji and "value" is the API entry for that kanji. This makes the further processing simpler.
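To see what to_entries does, the transformation can be tried on a tiny sample. The sample below is made up for illustration (the "grade" values are placeholders, not real kanjiapi.dev data), and the function name demo_to_entries is hypothetical.

```shell
# Made-up two-kanji sample in the same shape as the full data file.
# The "grade" values are placeholders, not real kanjiapi.dev data.
demo_to_entries() {
  printf '%s' '{"kanjis":{"日":{"grade":1},"本":{"grade":1}}}' |
    jq -c '.kanjis | to_entries'
}
```

Running demo_to_entries prints [{"key":"日","value":{"grade":1}},{"key":"本","value":{"grade":1}}], i.e. one object per kanji with the kanji in "key" and its entry in "value".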

Counting entries:

cat data.json | jq length

Note: at the time of this writing, there were 13,108 entries in the data.

Listing CJK Compatibility code block entries:

cat data.json | jq '[ .[] | select(.value.unihan_cjk_compatibility_variant) ]'

Note: the above command lists all entries with the unihan_cjk_compatibility_variant field. At the time of this writing, 75 of the 13,108 entries have this field. Entries with this field are code points in the CJK Compatibility code block. These kanjis already have a corresponding "real" kanji in the data, and many text processors regard them as duplicates of these "real" kanjis (see kanjiapi.dev documentation). Therefore, it's best to filter out all the entries with the unihan_cjk_compatibility_variant field.

Filtering out CJK Compatibility code block entries:

cat data.json | jq '[ .[] | select(.value.unihan_cjk_compatibility_variant == null) ]' >data-clean.json

Note: the above creates a cleaned data set which does not contain any entries with the unihan_cjk_compatibility_variant field. At the time of this writing, 13,033 entries remain in this cleaned data set.
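The counts quoted in these notes should add up: 75 filtered entries plus 13,033 remaining entries give the 13,108 total. A minimal sketch for checking this in one pass, assuming the to_entries-style list from the earlier step; the function name count_compat is hypothetical:

```shell
# Hypothetical helper: print the total number of entries, the number of
# CJK Compatibility entries, and the number of remaining entries for a
# to_entries-style list such as data.json.
count_compat() {
  jq -r '
    length as $total
    | ([ .[] | select(.value.unihan_cjk_compatibility_variant) ] | length) as $compat
    | ([ .[] | select(.value.unihan_cjk_compatibility_variant == null) ] | length) as $clean
    | "total=\($total) compat=\($compat) clean=\($clean)"
  ' "$1"
}
```

Usage: count_compat data.json. With the numbers stated above, the expected line would be total=13108 compat=75 clean=13033.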

Splitting into files:

cat data-clean.json | jq -r '.[] | "\(.key)=\(.value)"' |
  while IFS='=' read -r key value; do
    # read -r and printf keep backslash escapes in the JSON intact
    printf '%s\n' "$value" >"_japanese_${key}.json"
  done

Note: the above creates a separate JSON file for each entry. The file name is _japanese_<kanji>.json (e.g. _japanese_日.json) and the content is the kanjiapi.dev data for the corresponding kanji.
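After splitting, it may be worth confirming that there is exactly one valid JSON file per entry. The following is a sketch under the file naming described above; the function name verify_split is hypothetical, and it starts one jq process per file, so it is slow on the full data set.

```shell
# Hypothetical helper: check that a directory holds one valid JSON file per
# entry of the cleaned data set.
verify_split() {
  clean=$1
  dir=${2:-.}
  expected=$(jq length "$clean") || return 1
  actual=0
  for f in "$dir"/_japanese_*.json; do
    [ -e "$f" ] || continue   # the glob matched nothing
    # jq empty parses the file and prints nothing; it fails on invalid JSON.
    jq empty "$f" 2>/dev/null || { echo "invalid JSON: $f" >&2; return 1; }
    actual=$((actual + 1))
  done
  [ "$actual" -eq "$expected" ] || {
    echo "expected $expected files, found $actual" >&2
    return 1
  }
  echo "ok: $actual files"
}
```

Usage: verify_split data-clean.json . when run from the directory that contains the generated files.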

Notes

Collection Package (.colpkg) vs. Deck Package (.apkg)

See documentation:

  • A Collection Package exports all decks
  • A Deck Package exports either all decks or only a single deck
  • A Collection Package exports all media in the media folder, even files that are not used by any card
  • A Deck Package exports only the media that is used by the cards of the exported decks
  • When importing a Collection Package, all existing Anki content is replaced with the content of the Collection Package
  • When importing a Deck Package, the contained deck(s) is/are added to the existing collection
  • The idea of a Collection Package is to export and import the entire Anki content (e.g. for sharing or backup)
    • In the case of sharing, a Collection Package may, for example, be imported into a separate profile in Anki
  • The idea of a Deck Package is to export individual decks, mainly for sharing
    • A Deck Package can be imported into an existing collection

Media folder

Some files in the Anki media folder will be automatically included in the .apkg file when exporting the deck. This includes the custom font file and the JavaScript file.

Dev log

2025-03-19

  • Media
    • Prepend _ to the kanjiapi.dev JSON files in the media folder to prevent Anki from listing them as "unused files" in the Check Media window (see documentation)
    • To distinguish the media of this project in the media folder from media from other decks:
      • Subfolders in the media folder are not allowed (the Check Media window shows a corresponding message when subfolders are present)
      • Solution: add common prefix to all media of this project (currently _japanese_*)
  • Sharing strategy
    • Parent deck japanese with all specific decks (vocab-read, vocab-write, etc.) as sub-decks
    • Export parent deck as Deck Package (.apkg). This is the usual way for sharing decks on Shared Decks (see documentation).
    • kanjiapi.dev data files are regarded by Anki as "unused" (because they are accessed by JavaScript and not referenced directly in the note fields). Therefore, potential users have to install these data files separately in the media folder.
      • The custom font seems to be included in the Deck Package (.apkg) even though it, too, is referenced only from CSS and not from the note fields

2025-03-18

  • Regarding incorporation of kanjiapi.dev (https://kanjiapi.dev/) data in cards
    • Cannot make JavaScript web requests from cards (sandboxed environment)
    • Complete kanjiapi.dev data is 98 MB in size and contains 13,108 kanjis
    • Envisioned solution: split kanjiapi.dev data into separate files (one file for each kanji), place these files in Anki's media folder, then access the relevant files from the cards.

2025-03-02
