Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

serialization fails for double values when format = 'ascii' #3

Closed
rlowrance opened this issue Feb 4, 2014 · 9 comments · Fixed by #32
Closed

serialization fails for double values when format = 'ascii' #3

rlowrance opened this issue Feb 4, 2014 · 9 comments · Fixed by #32

Comments

@rlowrance
Copy link

When serializing an object using format == 'ascii', the values of number objects are truncated to 6 significant digits. Thus the loaded value can differ from the saved value.

Example program to illustrate this:

require 'torch'

local x = 913062344
local file = 'test_torch_save_load.data'
local format = 'ascii'

torch.save(file, x, format)
local y = torch.load(file, format)

print(x, y)
assert(x == y)
print('ok')

@soumith
Copy link
Member

soumith commented Feb 4, 2014

wow, that's a big catch.

@soumith
Copy link
Member

soumith commented Feb 4, 2014

Bug is from THDiskFile.c line 308ish

Default print and scan precision of fscanf and fprint is 6 digits (actually there's no standard I think).

It's better if we use %a for fprintf and fscanf so that the results will be exact.

@soumith
Copy link
Member

soumith commented Feb 4, 2014

but I cant send in a simple pull request to fix this, because it would break existing saved files.
@andresy needs to create a new version for the serialization format.

@koraykv
Copy link
Member

koraykv commented Feb 27, 2014

I don't think ascii is appropriate for conserving precision. This could cause the size of exported files to explode. Isn't it better to use binary format if you want to conserve full precision? It is faster to read/write and much smaller file size.

@soumith
Copy link
Member

soumith commented Feb 27, 2014

Ascii is needed for cross-platform compatibility (see other issues wrt
torch's big-endian/little-endian incompatibility). Ascii can have exact
precision with the %a fprintf formatter without taking up too much space.

On Thu, Feb 27, 2014 at 4:07 PM, koray kavukcuoglu <notifications@github.com

wrote:

I don't think ascii is appropriate for conserving precision. This could
cause the size of exported files to explode. Isn't it better to use binary
format if you want to conserve full precision? It is faster to read/write
and much smaller file size.

Reply to this email directly or view it on GitHubhttps://github.com//issues/3#issuecomment-36229316
.

@andresy
Copy link
Member

andresy commented Mar 20, 2014

Note sure what to do here.

  • I could augment the default precision with %g (or switch to %f) (but how much precision would that be?)
  • I could add a method to set the default precision (infinite would be with %a)

@soumith
Copy link
Member

soumith commented Mar 20, 2014

%a would be ideal but it is not C89. So maybe increasing the precision a bit makes sense.

@soumith
Copy link
Member

soumith commented Mar 20, 2014

%f by default would be 6 digits I think unless you specify explicit number of digits to pick up for decimal and fractional parta

@clementfarabet
Copy link
Member

Increasing the precision is probably enough at this point. Maybe twice as much as we have now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants