Unown-MNIST is a dataset of artificially augmented unown Pokémon images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image associated with a label from one of 28 classes. After so many years, MNIST has become kind of boring. While Unown-MNIST is certainly too easy to be taken seriously as a benchmarking dataset, it can at least serve as an amusing sanity check for new algorithms.
Here's an example of what the data looks like:
The dataset is distributed as a set of uint8 serialized numpy arrays:
File | Size | Array dimension | Description |
---|---|---|---|
X_train.npy | 45M | (60000, 28, 28) | Training images |
X_test.npy | 7.5M | (10000, 28, 28) | Test images |
Y_train.npy | 469K | (60000,) | Training labels |
Y_test.npy | 79K | (10000,) | Test labels |
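As a quick way to check a download against the table, the sketch below computes the raw payload size implied by a shape and dtype. The helper `payload_bytes` is a hypothetical name, and the comparison assumes the sizes above are given in binary units (KiB/MiB) and that the `.npy` header adds only a small overhead. Under those assumptions the image files line up with one byte per pixel, while the label files are closer to what eight bytes per label would give, so it may be worth checking `Y_train.dtype` after loading rather than assuming uint8 for the labels.

```python
import numpy as np

def payload_bytes(shape, dtype):
    """Raw array payload in bytes (the .npy header adds a little on top)."""
    return int(np.prod(shape)) * np.dtype(dtype).itemsize

# Shapes are taken from the table; the dtypes tried here are assumptions.
candidates = [
    ("X_train.npy", (60000, 28, 28), np.uint8),
    ("X_test.npy", (10000, 28, 28), np.uint8),
    ("Y_train.npy", (60000,), np.uint8),
    ("Y_train.npy", (60000,), np.int64),
    ("Y_test.npy", (10000,), np.uint8),
    ("Y_test.npy", (10000,), np.int64),
]

for name, shape, dtype in candidates:
    size_kib = payload_bytes(shape, dtype) / 1024
    print(f"{name:12s} {np.dtype(dtype).name:6s} {size_kib:10.1f} KiB")
```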
Once you have downloaded the files, you can load them in Python like this:
```python
import numpy as np

X_train, Y_train = np.load("X_train.npy"), np.load("Y_train.npy")
X_test, Y_test = np.load("X_test.npy"), np.load("Y_test.npy")
```
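Once the arrays are loaded, most models will want float inputs and, often, one-hot labels. Here is a minimal sketch of that preprocessing, not part of the dataset itself. The helpers `to_float_images` and `one_hot` are hypothetical names, and the code assumes the labels are integers in the range 0 to 27. It runs on a small synthetic stand-in so it works without the files; swap in `X_train` and `Y_train` to use the real data.

```python
import numpy as np

NUM_CLASSES = 28  # number of classes stated in the dataset description

def to_float_images(X):
    """Scale uint8 pixels to float32 in [0, 1] and flatten each image."""
    return X.astype(np.float32).reshape(len(X), -1) / 255.0

def one_hot(y, num_classes=NUM_CLASSES):
    """One-hot encode integer labels (assumed to lie in 0..num_classes-1)."""
    return np.eye(num_classes, dtype=np.float32)[np.asarray(y, dtype=np.int64)]

# Synthetic stand-in with the same per-example layout as the real arrays.
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(8, 28, 28), dtype=np.uint8)
Y_demo = rng.integers(0, NUM_CLASSES, size=8)

X_flat = to_float_images(X_demo)
Y_hot = one_hot(Y_demo)
print(X_flat.shape, Y_hot.shape)  # (8, 784) (8, 28)
```

Flattening to 784 features suits simple baselines; for a convolutional model you would likely keep the 28x28 layout and add a channel axis instead.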
To play around with the augmentation steps used to generate the dataset, or to generate your own version, have a look at the notebook dataset_generation.ipynb.
I prepared this dataset to train a GAN+CPPN model, so I could create some cool unown transitions and new species: