Thomas Deselaers and Copybara-Service Initial commit.
PiperOrigin-RevId: 297120965
Latest commit 96ec3ae Feb 25, 2020



The Didi dataset: Digital Ink Diagram data

This repository contains a Colab notebook that demonstrates how to work with the digital ink data in the Didi dataset.

The dataset contains digital ink drawings of diagrams with dynamic drawing information. The dataset aims to foster research in interactive graphical symbolic understanding. The dataset was obtained using a prompted data collection effort.

Download the Didi dataset

We provide the raw data in NDJSON format and the prompts in PNG, DOT, and xdot formats.
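Because NDJSON stores one JSON object per line, the raw files can be streamed record by record without loading everything into memory. The following is a minimal sketch using only the Python standard library; the helper name `read_ndjson` and the file name in the usage comment are hypothetical, and the fields inside each record are not assumed here.

```python
import json
from typing import Any, Dict, Iterator


def read_ndjson(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of an NDJSON file."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines between records
                yield json.loads(line)


# Hypothetical usage: inspect which fields the first record carries.
# for record in read_ndjson("diagrams.ndjson"):  # file name is an assumption
#     print(sorted(record.keys()))
#     break
```

Printing the keys of the first record is a quick way to check the actual schema before relying on any field name.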

The dataset and details about its construction and use are described in this arXiv paper: The Didi dataset: Digital Ink Diagram data.

Visualizing and converting the data

We provide a Colab notebook that demonstrates how to read and visualize the data. It also provides functions to convert the data to TFRecord files for easy use in TensorFlow.
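A common preprocessing step before serializing ink for training is to flatten the strokes into a single point sequence with a pen-up flag. The sketch below illustrates that idea only and is not the notebook's conversion code. It assumes each stroke is stored as three parallel lists `[xs, ys, ts]`; that layout and the name `flatten_strokes` are assumptions, so check the notebook for the actual format.

```python
from typing import List, Sequence, Tuple

# Assumed stroke layout (not confirmed by this README): [xs, ys, ts]
Stroke = Sequence[Sequence[float]]
Point = Tuple[float, float, float, int]


def flatten_strokes(strokes: Sequence[Stroke]) -> List[Point]:
    """Turn a list of strokes into (x, y, t, pen_up) points.

    pen_up is 1 on the last point of each stroke and 0 elsewhere.
    """
    points: List[Point] = []
    for xs, ys, ts in strokes:
        if not (len(xs) == len(ys) == len(ts)):
            raise ValueError("x, y and t lists of a stroke must have equal length")
        last = len(xs) - 1
        for i, (x, y, t) in enumerate(zip(xs, ys, ts)):
            points.append((x, y, t, 1 if i == last else 0))
    return points
```

The flat sequence keeps the dynamic drawing information (timestamps and stroke boundaries) while giving each drawing a simple rectangular shape that is easy to serialize.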

Training and evaluating a model

First, download the Didi dataset. You can either download the raw data or use our demo Colab to convert the data into TFRecord files.

Our paper gives more information about a potential train/validation/test split of the data.


The data is licensed by Google LLC under the CC BY 4.0 license. The code is released under the Apache 2.0 license.
