Fragility of serialize() for digest() use? #95
Comments
I think you just demonstrated (quite nicely) that reordering in Both are base R functions I use as is. There is nothing wrong with the
In a nutshell we now have: No more, no less.
So, would you say that it's better if use cases such as
This is not the I have sympathy for your concern but you are talking to the wrong entity. Either take it up with R Core for base R (though I doubt they promised reordering would lead to identical serialization) or with the
Sorry I didn't phrase my question very well. I just wanted to ask that since "It depends" and "TBD" are valid answers to that question. Anyways, thanks for your time (and the super quick replies).
First a little background: I use `testthat` and its `expect_known_hash()` (which in turn uses `digest::digest()`) for a package that I'm developing. I noticed that my laptop (Mac) and our CI (Linux) were returning different hashes for one object, which was causing my tests to fail.

As part of debugging, I used `dput()` to inspect the objects and, not surprisingly, there were some differences in floating point values. I figured that was probably the cause (there is pull request r-lib/testthat#822 that references your digest() vs. sha1() vignette), but decided to go a bit further anyway.

I saved the objects as rds files, transferred them to the same machine, and loaded them into the same R session. To my surprise, they now passed both `all.equal()` and `identical()` but still had different hashes with `digest::digest()`.

Looking at the output of `dput()` for the two objects, they were otherwise identical (including the floating point values), but the attributes `class` and `row.names` were in a different order (the objects are `data.frame`s). The output of `serialize()` is also different, as is to be expected given the different `digest::digest()` hashes. I don't know what determines their order in the output of `dput()`, nor whether the same underlying reason is at play with `serialize()`. But whatever the cause, doesn't this behavior of `serialize()` seem a bit fragile for `digest::digest()`? I know the package has been around for quite some time, so I guess this must be some kind of rare edge case. Anyways, `digest::sha1()` gives identical hashes.

Below is a simple reprex that shows this behavior:
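The reprex code itself is not reproduced in this copy of the thread. As a rough sketch of the effect described above (not the original reprex), the following builds two data frames that differ only in the order in which `class` and `row.names` are set, mirroring the two `dput()` outputs. The objects `a` and `b` and their contents are hypothetical stand-ins; the `digest` calls run only if that package is installed.

```r
# Hypothetical stand-in objects; the data from the original report is not shown.
# Same data frame, with `class` and `row.names` set in opposite orders.
a <- structure(list(x = c(1.5, 2.5, 3.5)),
               class = "data.frame", row.names = c(NA, -3L))
b <- structure(list(x = c(1.5, 2.5, 3.5)),
               row.names = c(NA, -3L), class = "data.frame")

# Attribute order as stored on each object.
attr_order_a <- names(attributes(a))
attr_order_b <- names(attributes(b))

# identical() treats attributes as an unordered set by default
# (attrib.as.set = TRUE); the strict variant also compares their order.
same_default <- identical(a, b)
same_strict  <- identical(a, b, attrib.as.set = FALSE)

# serialize() writes attributes in their stored order, so the raw bytes
# (and anything hashed from them) can differ between a and b.
same_bytes <- identical(serialize(a, NULL), serialize(b, NULL))

print(list(attr_order_a = attr_order_a, attr_order_b = attr_order_b,
           same_default = same_default, same_strict = same_strict,
           same_bytes = same_bytes))

# Optional comparison of the two hashing functions discussed in this issue.
if (requireNamespace("digest", quietly = TRUE)) {
  print(c(digest_equal = digest::digest(a) == digest::digest(b),
          sha1_equal   = digest::sha1(a) == digest::sha1(b)))
}
```

Comparing `same_default` with `same_bytes` shows the mismatch in question: the default equality check ignores attribute order, while the serialized bytes do not.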
Created on 2019-01-17 by the reprex package (v0.2.1)