

KanjiVG Dataset


This directory contains a vector dataset of Kanji characters used in the machine learning experiment described in this blog post. It is derived from KanjiVG, an educational open-source project for teaching people Kanji.

We converted the original .svg files, simplifying each stroke into a short polyline sequence, and stored the data in the stroke-3 format for training sketch-rnn. The train/validation/test split sizes are 10000/600/500 respectively.
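To make the stroke-3 layout concrete, here is a minimal sketch of converting polylines to stroke-3 rows and back. Each row holds `(dx, dy, pen_lifted)`: the offset from the previous point and a flag that is 1 when the pen lifts after that point. The function names are hypothetical, and the choice to measure the first offset from the origin is an assumption; the actual conversion script used for this dataset may differ.

```python
import numpy as np


def polylines_to_stroke3(polylines):
    """Convert a list of polylines (lists of (x, y) points) to stroke-3 rows.

    Assumption: the first offset is measured from the origin (0, 0).
    """
    rows = []
    prev_x, prev_y = 0.0, 0.0
    for line in polylines:
        for i, (x, y) in enumerate(line):
            # The pen lifts after the last point of each polyline.
            pen_lifted = 1 if i == len(line) - 1 else 0
            rows.append([x - prev_x, y - prev_y, pen_lifted])
            prev_x, prev_y = x, y
    return np.array(rows, dtype=np.float32)


def stroke3_to_polylines(strokes):
    """Rebuild absolute-coordinate polylines from stroke-3 rows."""
    polylines, current = [], []
    x, y = 0.0, 0.0
    for dx, dy, pen_lifted in strokes:
        x += dx
        y += dy
        current.append((float(x), float(y)))
        if pen_lifted == 1:
            polylines.append(current)
            current = []
    if current:  # keep a trailing stroke that never lifted the pen
        polylines.append(current)
    return polylines


example = [[(0, 0), (3, 4)], [(5, 5), (6, 7), (8, 8)]]
strokes = polylines_to_stroke3(example)
```

Here `strokes` has one row per point, and `stroke3_to_polylines(strokes)` recovers the two original polylines.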


| filename | max length | description |
| --- | --- | --- |
| short_kanji.npz | 88 | original dataset used in mentioned blog post |
| kanji.rdp25.npz (recommended) | 133 | rebuilt dataset using RDP line simplification with epsilon=0.25 |
| kanji.rdp100.npz | 80 | RDP with epsilon=1.0 |
| kanji.rdp200.npz | 70 | RDP with epsilon=2.0 (same as QuickDraw dataset) |
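For readers unfamiliar with RDP (Ramer-Douglas-Peucker) simplification, the following is a minimal illustrative sketch, not the code used to build these files. It drops every point that lies within `epsilon` of the chord joining a polyline's endpoints and recurses on the point farthest from that chord otherwise. The function names are hypothetical, and measuring distance to the segment rather than to the infinite line is an assumption of this variant.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def rdp(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i
    if max_dist > epsilon:
        # Keep the farthest point and simplify both halves around it.
        left = rdp(points[:index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right
    return [points[0], points[-1]]


wiggle = [(0, 0), (1, 0.1), (2, 0), (3, 0.1), (4, 0)]
coarse = rdp(wiggle, 0.25)
fine = rdp(wiggle, 0.05)
```

A larger `epsilon` discards more points, which is consistent with the shorter maximum sequence lengths listed for the higher-epsilon files in the table above.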


KanjiVG is Copyright © 2009-2015 Ulrich Apel

Creative Commons License
This dataset and all files in this directory are licensed under an Attribution-ShareAlike 3.0 Unported License (CC BY-SA 3.0).