The tfrecords file is 8 times larger than raw image data #9675
Comments
Because you use Int64List.
@ppwwyyxx, but he uses BytesList for the image, not Int64List.
I think we should reopen the issue.
@ppwwyyxx, they are probably not JPEG anymore when stored in the tfrecord. I edited my code in the previous post.
Then it will definitely be several times larger, and the factor depends on how well JPEG works on your images. The factor is 5.x on the whole of ImageNet, btw (reference).
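To make the factor concrete, here is a minimal sketch of the arithmetic. The decoded size depends only on the image dimensions and the element type, not on the content. The 256x256 RGB dimensions and the 24 KiB JPEG size are assumptions chosen for illustration; the issue does not state them, although 256x256x3 bytes happens to equal the 192 KB reported.

```python
def decoded_size_bytes(height, width, channels=3, bytes_per_value=1):
    """Size of the pixel array after decoding, with no compression applied."""
    return height * width * channels * bytes_per_value

# Hypothetical 256x256 RGB image with one byte per value (assumed dimensions).
raw = decoded_size_bytes(256, 256)
print(raw, raw / 1024)  # 196608 bytes, i.e. 192.0 KiB

# Hypothetical size of the same image as a JPEG file (illustrative only).
jpeg_size = 24 * 1024
print(raw / jpeg_size)  # 8.0

# The same pixels held as 64-bit integers take 8 bytes per value instead of 1.
print(decoded_size_bytes(256, 256, bytes_per_value=8))  # 1572864
```

With real data the ratio varies per image, because JPEG output size depends on the content while the decoded size does not.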
Is it because the tfrecords file stores decompressed images?
TFRecord stores bytes so you can do any encoding you want.
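A rough sketch of that idea, using only the standard library: the record payload is just a byte string, so the writer can store an encoded form and the reader decodes it. Here `zlib` is a stand-in for JPEG (an assumption, so the example runs without TensorFlow or an image library), and `encode`/`decode` are hypothetical helper names. In TensorFlow the payload would go into a `tf.train.BytesList` feature.

```python
import zlib

def encode(pixels: bytes) -> bytes:
    """Compress decoded pixel bytes (zlib stands in for JPEG here)."""
    return zlib.compress(pixels, 9)

def decode(payload: bytes) -> bytes:
    """Recover the pixel bytes from a stored payload."""
    return zlib.decompress(payload)

# Synthetic 64x64 RGB image: a horizontal gradient, repeated on every row.
width, height = 64, 64
pixels = bytes(
    (x * 4) % 256
    for _ in range(height)
    for x in range(width)
    for _ in range(3)
)

payload = encode(pixels)          # this is what would be written to the record
assert decode(payload) == pixels  # the reader gets the original pixels back

print(len(pixels))                 # 12288 bytes of decoded pixels
print(len(payload) < len(pixels))  # True: the stored payload is smaller
```

Unlike zlib, JPEG is lossy, so a real JPEG round trip would not return byte-identical pixels.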
@ppwwyyxx This solution really reduces the size. But when use
I find the solution
For anyone who is confused about how serialization of digital images works, this is a pretty wonderful explanation, from the ground up, of "why the size of TFRecords might be larger" than the original image: https://planspace.org/20170403-images_and_tfrecords/
I tried to write a tfrecords file, but the file is larger than the raw data.
But when I write this 'example_string' to a tfrecords file, the file size becomes 192 KB. I can't understand why the tfrecords file is several times larger than 'example_string' and the raw image data.