lopeLH/unown-mnist

Unown-MNIST

Unown-MNIST is a dataset of artificially augmented Unown Pokémon images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image associated with a label from one of 28 classes. After so many years, MNIST has become kind of boring. While Unown-MNIST is certainly too easy to be taken seriously as a benchmark dataset, it can at least serve as an amusing sanity check for new algorithms.

Here's an example of what the data looks like:

Get the data

The dataset is distributed as a set of serialized uint8 NumPy arrays:

| File | Size | Array dimensions | Description |
| --- | --- | --- | --- |
| X_train.npy | 45M | (60000, 28, 28) | Training images |
| X_test.npy | 7.5M | (10000, 28, 28) | Test images |
| Y_train.npy | 469K | (60000,) | Training labels |
| Y_test.npy | 79K | (10000,) | Test labels |

Once you have downloaded the files, you can load them in Python like this:

import numpy as np

X_train, Y_train = np.load("X_train.npy"), np.load("Y_train.npy")
X_test, Y_test = np.load("X_test.npy"), np.load("Y_test.npy")
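Before training on the arrays, it can help to confirm that they match the layout listed in the table and to rescale the pixels. The following is a minimal sketch, not part of the repository: the helper names `check_split` and `to_unit_range` are hypothetical, and since the label encoding is not documented here, the check only verifies that no more than 28 distinct labels appear.

```python
import numpy as np

IMAGE_SHAPE = (28, 28)  # per the table above
NUM_CLASSES = 28        # per the dataset description


def check_split(images, labels, expected_count):
    """Raise ValueError if a split does not match the documented layout."""
    if images.shape != (expected_count, *IMAGE_SHAPE):
        raise ValueError(f"unexpected image shape: {images.shape}")
    if labels.shape != (expected_count,):
        raise ValueError(f"unexpected label shape: {labels.shape}")
    if images.dtype != np.uint8:
        raise ValueError(f"unexpected image dtype: {images.dtype}")
    # Assumption: the exact label values are unknown, so only the
    # number of distinct labels is compared with the class count.
    n_classes = np.unique(labels).size
    if n_classes > NUM_CLASSES:
        raise ValueError(
            f"found {n_classes} distinct labels, expected at most {NUM_CLASSES}"
        )


def to_unit_range(images):
    """Convert uint8 pixels to float32 values in [0, 1]."""
    return images.astype(np.float32) / 255.0
```

With the arrays loaded as above, `check_split(X_train, Y_train, 60000)` and `check_split(X_test, Y_test, 10000)` should pass silently, and `to_unit_range(X_train)` gives a float32 array of the same shape that most training code can consume directly.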

To play around with the augmentation steps used to generate the dataset, or to generate your own, have a look at the notebook dataset_generation.ipynb.

Okay, but why?

I prepared this dataset to train a GAN+CPPN model, so I could create some cool Unown transitions and new species:
