Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use archivist.hash.function to select desired hash algorithm #321

Merged
merged 4 commits into from
Dec 31, 2017
Merged

Use archivist.hash.function to select desired hash algorithm #321

merged 4 commits into from
Dec 31, 2017

Conversation

zzawadz
Copy link
Contributor

@zzawadz zzawadz commented Dec 11, 2017

I was wondering how to add the possibility to select other hashing algorithm without affecting the package structure.

In this pull request there's my proposition - just create simple wrapper on the digest, and use the environmental variable archivist.hash.function to specify desired algorithm.

adigest <- function(x) digest(x, algo = getOption("archivist.hash.function", default = "md5"))

What do you think about this solution?

@pbiecek
Copy link
Owner

pbiecek commented Dec 11, 2017

Thanks, very interesting and elegant solution.

Is the archivist.hash.function part of the .ArchivistEnv environment?
This is the place where we keep archivist options/defaults.
(see https://github.com/pbiecek/archivist/blob/master/R/zzz.R)

Also it would be nice to have somewhere an example of how to set other hashing function.
Maybe as an example to the archivist::aoptions()?

@zzawadz
Copy link
Contributor Author

zzawadz commented Dec 12, 2017

Is the archivist.hash.function part of the .ArchivistEnv environment?

Not now, but I'll move it there.

Also it would be nice to have somewhere an example of how to set other hashing function.
Maybe as an example to the archivist::aoptions()?

Definitely:)

@pbiecek
Copy link
Owner

pbiecek commented Dec 12, 2017

Great,
I've heard that some people are complaining about md5 so this will help a lot.
I just wonder if the hash function shall be stored in the metadata for an artifact.

That being said,
right now the hash is used as a primary key in the archivist meta-data database.
The length of hash for "sha512" is 4x longer than for "md5" (128 chars vs 32 chars)
I need to check if this will work smoothly with SQLite and PostgreSQL backends

@zzawadz
Copy link
Contributor Author

zzawadz commented Dec 22, 2017

I added one simple test and moved the option to archivist aoptions. I'll try to find some time to explore the package and write more tests for different hasing functions.

@pbiecek
Copy link
Owner

pbiecek commented Dec 31, 2017

Thanks, very valuable commit.
Fixes #323

@pbiecek pbiecek merged commit ba2d5b1 into pbiecek:master Dec 31, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants