Speed up re-signing via per-file hash #83
Comments
Do you mean per-file versioning? You have access to a git log, which can be interpreted as a versioning system: you know whether a file changed across commits, and the commit id serves as the version. Regarding 1., we could encode the file in a binary format that is not human readable. Regarding 2., could we not sign the file each time it's changed?
No, just repository versioning.
Correct. I thought the huggingface API did not support git versioning and it could only download at head. I was wrong.
+1, I had a similar idea. I think that should work. @McPatate do you think we need this optimization at all? The motivation was to improve signing speed because we thought that model versioning was not supported by the huggingface API, but that assumption was wrong. Do you think it'd be enough to:
If users version their models, they'd only need to sign when the model actually changes, which means they need to re-sign large files and cannot use the per-file hash anyway... Wdyt?
Closing in favor of #111
Certain model repositories, such as those on huggingface, currently do not support versioning. This means that signing multiple times with small diffs can be expensive. It may be useful to have a format listing per-file hashes, in order to speed up re-hashing. Example:
This file would be kept under the model folder or in another configuration folder (probably the latter?). When signing, the caller would give us a list of files that need to be re-computed (from git diff), and our library would only re-hash the necessary files. We'd also pass back the recomputed file to the caller, so that they can store it where they want. The user would need a git hook that does all this for them.
This file could be the final file we sign, or a helper file that is only stored on a dev machine to speed up signing when making changes to the repo. (probably the latter?)
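The re-hashing flow described above could look roughly like the following. This is only a sketch under assumptions: the names `hash_file`, `update_manifest` and `manifest_digest` are hypothetical, SHA-256 and a JSON-style path-to-hash mapping are assumed, and the list of changed paths is expected to come from the caller (for example from `git diff --name-only` in a hook).

```python
import hashlib
import json
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def update_manifest(model_dir: Path, manifest: dict[str, str],
                    changed: list[str]) -> dict[str, str]:
    """Re-hash only the paths listed in `changed` (relative to model_dir).

    Entries for unchanged files are reused as-is; entries for files
    that no longer exist are dropped. The input manifest is not mutated.
    """
    updated = dict(manifest)
    for rel in changed:
        path = model_dir / rel
        if path.is_file():
            updated[rel] = hash_file(path)
        else:
            updated.pop(rel, None)
    return updated


def manifest_digest(manifest: dict[str, str]) -> str:
    """Hash a canonical serialization of the manifest (assumed to be what gets signed)."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The updated manifest is returned to the caller rather than written to disk, so they can store it where they want. Note that this trusts the caller's list of changed files: a file that changed but is not listed keeps its stale hash.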
The main danger (?) is that:
@McPatate any thoughts?