Add control of track_times to allow reproducible hickling #13

Merged
merged 2 commits into from
Jan 6, 2015

Conversation

ebenolson
Contributor

Because h5py includes the creation time by default when creating datasets, saving the same data twice will not produce identical files.

As discussed in h5py/h5py#225, this is an issue if you want to store files in version control or do quick comparisons by hashing (in my case, I wanted to verify that my pseudorandom data generator was functioning properly).
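For readers unfamiliar with the underlying behaviour, here is a minimal sketch using h5py directly rather than hickle. It writes the same array to two fresh files with the `track_times` keyword of `create_dataset` and hashes each file. The helper names (`write_dataset`, `md5_of_file`, `digests_for_two_writes`) and the dataset name `"data"` are made up for illustration and are not part of this pull request.

```python
import hashlib
import os
import tempfile

import h5py
import numpy as np


def write_dataset(path, array, track_times):
    """Write `array` to a new HDF5 file, with or without creation timestamps."""
    with h5py.File(path, "w") as f:
        # track_times controls whether HDF5 stores timestamps for this dataset.
        f.create_dataset("data", data=array, track_times=track_times)


def md5_of_file(path):
    """Return the hex md5 digest of the file's raw bytes."""
    with open(path, "rb") as fh:
        return hashlib.md5(fh.read()).hexdigest()


def digests_for_two_writes(array, track_times):
    """Write `array` to two separate files and return both md5 digests."""
    digests = []
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("first.h5", "second.h5"):
            path = os.path.join(tmp, name)
            write_dataset(path, array, track_times)
            digests.append(md5_of_file(path))
    return digests


if __name__ == "__main__":
    first, second = digests_for_two_writes(np.arange(10), track_times=False)
    print("identical files:", first == second)
```

With `track_times=True` the two digests may still match if both writes land within the same second, so a fair comparison in that mode needs a pause between the writes.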

telegraphic added a commit that referenced this pull request Jan 6, 2015
Add control of track_times to allow reproducible hickling
@telegraphic telegraphic merged commit 320c25b into telegraphic:master Jan 6, 2015
@telegraphic
Owner

Hi @ebenolson

Thanks for the pull request. That does sound like useful functionality so I've merged your changes :)

Could you please add a simple test in test_hickle.py to show exactly what this does (and so that we don't accidentally break it in the future)? How do you do your comparisons? With an md5 hash, I'm guessing?

@ebenolson
Contributor Author

Thanks, hickle is fantastic but this quirk of HDF5 had me chasing my tail for a while yesterday. Would be nice if it could be set at the file level, but as far as I can tell that's not an option.

I added code to the test file that wraps all the tests and re-runs each twice with track_times disabled, checking that the md5 is equal. It's a bit convoluted because the timestamp granularity is 1 second, and I didn't want to slow it down too much.
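The check described above can be sketched roughly as follows. This is not the code added to test_hickle.py; it is a standalone illustration using only the standard library, and `written_twice_matches` and its `write_fn` and `delay` parameters are hypothetical names. The optional delay is there because of the one-second timestamp granularity: a writer that embeds timestamps can only be reliably caught if the two writes fall in different seconds.

```python
import hashlib
import os
import tempfile
import time


def md5_of_file(path, chunk_size=65536):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def written_twice_matches(write_fn, delay=0.0):
    """Call write_fn(path) on two fresh paths and compare the md5 digests.

    write_fn is any callable that writes a file to the given path.
    delay is the pause in seconds between the two writes.
    """
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first.out")
        second = os.path.join(tmp, "second.out")
        write_fn(first)
        time.sleep(delay)
        write_fn(second)
        return md5_of_file(first) == md5_of_file(second)
```

With hickle, `write_fn` could be something like `lambda path: hickle.dump(data, path, track_times=False)`, assuming `dump` accepts the keyword this pull request adds. Passing a `delay` of just over one second is only needed when checking that timestamps do change the output; with `track_times` disabled the two writes can run back to back, which keeps the test fast.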


2 participants