Can I get the new data you used? #1

Closed
peternara opened this issue Jun 5, 2018 · 2 comments

@peternara

Can I get the new data (including the training format) you used?
Thanks. :)

@symoon11
Owner

symoon11 commented Jun 6, 2018

I uploaded a training dataset (4000 images) and a test dataset (1000 images).
Each dataset consists of 3 gz files (image, label-digit, label-color) and has the same format as the ordinary MNIST dataset.
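
Since the files are gzip-compressed, they can also be unpacked from Python rather than with a command-line tool. The following is only a minimal sketch using the standard `gzip` and `shutil` modules; the helper name `gunzip` is hypothetical, and the archive name in the usage note is an assumption based on the file names used in this thread.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dst=None):
    """Decompress a .gz file and return the path of the output file."""
    src = Path(src)
    if dst is None:
        if src.suffix != '.gz':
            raise ValueError('pass dst explicitly when src does not end in .gz')
        # Default output name: the input name without the trailing ".gz".
        dst = src.with_suffix('')
    dst = Path(dst)
    # Stream the decompressed bytes to disk without loading them all at once.
    with gzip.open(src, 'rb') as f_in, open(dst, 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst
```

For example, `gunzip('test-images-ubyte.gz')` would write `test-images-ubyte` next to the archive, assuming the archive is named that way.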

To use the datasets:

  1. Unzip the datasets.
  2. Each file contains raw bytes, so you have to convert them to integers. Here is an example of decoding the datasets.

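import numpy as np
from struct import unpack

NUM_IMAGES = 1000  # the test set has 1000 images; use 4000 for the training set

# Open the unzipped files in binary mode; `with` closes them when done.
with open('test-images-ubyte', 'rb') as images, \
     open('test-label-digit-ubyte', 'rb') as digits, \
     open('test-label-color-ubyte', 'rb') as colors:
    for i in range(NUM_IMAGES):
        image_byte = images.read(28 * 28 * 3)  # one 28x28 image with 3 channels
        digit_byte = digits.read(1)            # one digit label (1 byte)
        color_byte = colors.read(3)            # one color label (3 bytes)

        # Decode every byte as an unsigned 8-bit integer ('B').
        image = np.reshape(unpack(len(image_byte) * 'B', image_byte), [28, 28, 3])
        digit = unpack('B', digit_byte)[0]  # [0] turns the 1-tuple into a scalar
        color = unpack(len(color_byte) * 'B', color_byte)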

Then you get an image together with its corresponding digit and color.
Note that digit is a scalar value, not a one-hot vector.
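
If you prefer to decode everything at once, the same byte layout can be read with `np.frombuffer`. This is only a sketch under the same assumption as the loop above, namely that each file is a plain run of bytes starting at offset 0 (28 * 28 * 3 bytes per image, 1 per digit, 3 per color). The function names `decode` and `one_hot` are hypothetical, and `one_hot` is included only for models that expect one-hot digit labels.

```python
import numpy as np


def decode(image_bytes, digit_bytes, color_bytes):
    """Decode raw byte strings into uint8 arrays, one entry per sample."""
    images = np.frombuffer(image_bytes, dtype=np.uint8).reshape(-1, 28, 28, 3)
    digits = np.frombuffer(digit_bytes, dtype=np.uint8)
    colors = np.frombuffer(color_bytes, dtype=np.uint8).reshape(-1, 3)
    if not (len(images) == len(digits) == len(colors)):
        raise ValueError('the three files disagree on the number of samples')
    return images, digits, colors


def one_hot(digits, num_classes=10):
    """Turn scalar digit labels into one-hot rows."""
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype=np.uint8)[digits]
```

The byte strings can come from `open(path, 'rb').read()` on each unzipped file. If the sizes do not divide evenly, `reshape` raises an error, which would suggest the files do not follow the layout assumed here.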

If there is any problem with the datasets, feel free to contact me.
Thanks.

@peternara
Author

peternara commented Jun 8, 2018

@drillermoon I'll try it!! Thanks :) I also appreciate the code you have released.
