This directory contains a vector dataset of Kanji characters used in the machine learning experiment described in this blog post. It is derived from KanjiVG, an educational open source project for teaching people Kanji.
We have converted and simplified the original .svg files so that each stroke is represented as a short polyline sequence, and stored the data in the stroke-3 format for training sketch-rnn. The train/validation/test split sizes are 10000/600/500 respectively.
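
As a rough illustration of the stroke-3 format, the sketch below turns one sequence back into absolute-coordinate polylines. It assumes the usual sketch-rnn convention that each row is `(dx, dy, pen_lifted)`, where `pen_lifted == 1` marks the last point of a stroke. The helper names `load_split` and `stroke3_to_polylines` are hypothetical, and the `train`/`valid`/`test` archive keys are an assumption carried over from the QuickDraw `.npz` files rather than something stated here.

```python
def load_split(path, split="train"):
    """Load one split from a sketch-rnn style .npz file.

    Assumption: the archive holds 'train', 'valid' and 'test' keys, each an
    object array of (N_i, 3) stroke-3 sequences, as in the QuickDraw files.
    """
    import numpy as np  # imported here so the converter below has no dependencies

    with np.load(path, encoding="latin1", allow_pickle=True) as data:
        return list(data[split])


def stroke3_to_polylines(seq):
    """Convert stroke-3 rows (dx, dy, pen_lifted) into absolute polylines.

    Each row moves the pen by (dx, dy); pen_lifted == 1 closes the current
    stroke. The pen is assumed to start at the origin.
    """
    x, y = 0.0, 0.0
    polylines, current = [], []
    for dx, dy, pen_lifted in seq:
        x += float(dx)
        y += float(dy)
        current.append((x, y))
        if int(pen_lifted) == 1:
            polylines.append(current)
            current = []
    if current:  # keep a trailing stroke that was never closed
        polylines.append(current)
    return polylines


# A hand-written two-stroke example (not taken from the dataset).
sample = [(1, 0, 0), (0, 1, 1), (2, 2, 0), (1, 0, 1)]
print(stroke3_to_polylines(sample))
# [[(1.0, 0.0), (1.0, 1.0)], [(3.0, 3.0), (4.0, 3.0)]]
```

With a downloaded file, something like `stroke3_to_polylines(load_split("kanji.rdp25.npz")[0])` should give the strokes of the first training character, provided the key names match the assumption above.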
| File | | Description |
|---|---|---|
| short_kanji.npz | 88 | original dataset used in mentioned blog post |
| kanji.rdp25.npz (recommended) | 133 | rebuilt dataset using RDP line simplification with epsilon=0.25 |
| kanji.rdp100.npz | 80 | RDP with epsilon=1.0 |
| kanji.rdp200.npz | 70 | RDP with epsilon=2.0 (same as QuickDraw dataset) |
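
For readers unfamiliar with RDP (Ramer-Douglas-Peucker) line simplification, here is a minimal pure-Python sketch of the textbook algorithm. It is not necessarily the code used to build these files, and the function names are hypothetical. `epsilon` is taken to be in the same units as the point coordinates, which is an assumption about how the values in the table were applied.

```python
import math


def _perpendicular_distance(p, a, b):
    """Distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:  # a and b coincide, so fall back to point distance
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / length


def rdp(points, epsilon):
    """Simplify a polyline with the Ramer-Douglas-Peucker algorithm.

    Keeps the endpoints, then recursively keeps the interior point that lies
    farthest from the chord whenever that distance exceeds epsilon.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _perpendicular_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist > epsilon:
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right  # drop the duplicated split point
    return [first, last]


# A hand-written polyline (not taken from the dataset).
stroke = [(0, 0), (1, 0.1), (2, 0), (3, 3)]
print(rdp(stroke, 0.25))  # [(0, 0), (2, 0), (3, 3)]
print(rdp(stroke, 2.0))   # [(0, 0), (3, 3)]
```

A larger epsilon discards more points, which is why the higher-epsilon files contain shorter polylines.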
KanjiVG is Copyright © 2009-2015 Ulrich Apel
This dataset and all files in this directory are licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC BY-SA 3.0).