Investigate using git for tracking sparse updates and git smudge to apply them. #80
This is cool. A small note that it's not just dense updates that we'd stop on: one could imagine other update types (e.g. "randomly set the values by drawing from a normal distribution with seed N" or "set all the values to 1") which are not dense per se (i.e. they don't involve storing explicit parameter values) but do involve setting all the parameter values while ignoring the previous ones. It's probably best to distinguish between updates that are truly updates (i.e. they rely on modifying the previous state) and ones that aren't, and just look for the first instance of the latter kind of update. As an aside, I think it's informative to think about a from-scratch training run: ideally the first commit would just say "add these parameter groups and initialize them in this way". I had something else to say but I forgot; maybe I will think of it another time.
Yeah, we definitely want to support stopping at other update types. I think a recursive solution would handle that. A "true update" would look up the previous update type in git and call the …
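To make the distinction concrete, here is a minimal sketch of the recursion described above: walk the update history newest-to-oldest, stop at the first update that replaces the value outright (dense, seeded-random init, constant), and apply each incremental update on the way back up. The update-record schema, type names, and field names here are all hypothetical, not the project's actual format.

```python
# Hypothetical sketch: resolve a parameter value from a list of update
# records (oldest first), recursing until the first *replacing* update.
import random

# Update types that set every value while ignoring the previous state;
# the recursion stops at the first of these it finds.
REPLACING = {"dense", "random-init", "constant"}

def materialize(update):
    """Turn a replacing update record into a concrete list of values."""
    if update["type"] == "dense":
        return list(update["values"])
    if update["type"] == "random-init":
        # "draw from a normal distribution with seed N"
        rng = random.Random(update["seed"])
        return [rng.gauss(0.0, 1.0) for _ in range(update["size"])]
    if update["type"] == "constant":
        return [update["value"]] * update["size"]
    raise ValueError(f"not a replacing update: {update['type']}")

def resolve(history, i=-1):
    """Resolve the parameter value as of history[i] (newest last)."""
    update = history[i]
    if update["type"] in REPLACING:
        return materialize(update)
    # A "true" update relies on the previous state: recurse first,
    # then apply this update on top of the result.
    prev = resolve(history, i - 1)
    if update["type"] == "sparse":
        out = list(prev)
        for idx, val in update["entries"]:
            out[idx] = val
        return out
    raise ValueError(f"unknown update type: {update['type']}")

history = [
    {"type": "constant", "size": 4, "value": 1.0},
    {"type": "sparse", "entries": [(2, 5.0)]},
]
print(resolve(history))  # → [1.0, 1.0, 5.0, 1.0]
```

The same structure handles a from-scratch training run: the first record would be the initializer, and every later record a true update.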
#84 was very close to implementing this approach; however, it seems like it is not possible to do this correctly when using git to time travel (e.g. `git checkout`).

The gist of it is that the code that looks back through the git history to build the parameter needs to know where in the history to start looking. In something like a checkout, we only know the commit we are currently at; within the smudge filter there isn't a way to know what commit we are going to. So basically the result is that whenever we time travel, we end up with the smudged model checkpoint of where we were, not where we wanted to be. Running …

I talked with @nkandpa2 about this issue and neither of us found a way to fix it. Thus we took a lot of the ideas about how this implementation of updates worked and applied them to a file-system-based method of tracking and applying updates in #92.

I'm closing this as we don't think the git approach will work, but I'll leave the branch with the implementation on my fork as it may be useful to revisit in the future.
Re-opening this discussion. I can't remember if we talked about this solution, but why wouldn't it work to store the hash of HEAD in the metadata file at clean time? For example:
Now say I run …

Are there any issues with this solution?
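A minimal sketch of the clean-time idea suggested above: the clean filter records the current HEAD in the metadata it writes, so a later smudge can start its history walk from the recorded commit rather than from wherever HEAD happens to point after a checkout. The JSON layout and the `last_known_commit` key are assumptions for illustration, not the project's actual metadata format.

```python
# Hypothetical sketch: record HEAD at clean time so smudge knows where
# in history to start looking, even after time travel.
import json
import subprocess

def current_head() -> str:
    """Hash of the commit currently checked out."""
    return subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

def clean(param_blob: bytes) -> bytes:
    """Clean filter: emit a metadata pointer that remembers HEAD.

    param_blob (the real parameter bytes) would normally be hashed and
    stored; that part is elided here.
    """
    metadata = {"last_known_commit": current_head()}
    return json.dumps(metadata, indent=2).encode()

def smudge(pointer: bytes):
    """Smudge filter: start the backwards walk from the recorded commit.

    Listing history from the recorded commit, not from HEAD, is the point:
    the reconstruction anchors on the checkpoint the pointer was written
    against. Here we just return the candidate commits to walk.
    """
    metadata = json.loads(pointer)
    start = metadata["last_known_commit"]
    commits = subprocess.run(
        ["git", "rev-list", start],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    return commits
```

One caveat worth checking: the pointer file itself is versioned, so after a checkout the smudged pointer carries the HEAD that was current when that version of the pointer was committed, which is exactly the anchoring behavior wanted here.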
We talked about this solution and it seems like it will work. This branch has some tools for getting files from the git history, which should help in the multi-pointer PR too. We can get this up and running once the multi-pointer branch is working with dense updates.

One of the main questions to explore is whether we will be able to track the last update directly or whether we will need to iterate through history to find it, but either way it will work.

In the original git-tracks-updates implementation I occasionally had times where it was slow to re-build indices on something like a checkout. In the new format, the only file getting indexed is the main metadata file (not each parameter file), so it should be faster.

One question this does bring up is our tree-processing algorithms. Currently we essentially process the parameter tree depth first, where each parameter is processed individually (which might involve moving backwards through the git history), and that could cause repeated work. It might be more efficient to collect all parameters that have changed in a batch and then go back in time once, updating each parameter as appropriate. But before a large refactor like that we should 1) test that it is actually an issue and 2) check if memoization of our "get file from git history" function fixes any issue there is.
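The memoization option mentioned in point 2 is cheap to try, since a (commit, path) pair is immutable in git history and is therefore safe to cache indefinitely. A sketch, with a hypothetical helper name standing in for the project's actual "get file from git history" function:

```python
# Hypothetical sketch: memoize history lookups so the depth-first,
# per-parameter walk doesn't re-run `git show` for blobs it has
# already fetched.
import functools
import subprocess

@functools.lru_cache(maxsize=None)
def get_file_at_commit(commit: str, path: str) -> bytes:
    """Fetch a file's raw bytes at a given commit via `git show`.

    Safe to cache: the content addressed by (commit, path) can never
    change, so stale entries are impossible.
    """
    return subprocess.run(
        ["git", "show", f"{commit}:{path}"],
        capture_output=True, check=True,
    ).stdout
```

If repeated work across parameters turns out to be the bottleneck, this turns every duplicate lookup into a dictionary hit without any refactor of the tree-processing code.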
Closed by #114 |
Instead of tracking/applying sparse updates manually (for example, storing them in a different directory), can we just check in sparse updates and then move backwards through the git history to build the real value (i.e. apply the updates)?
I have written this recursive smudge where, when you smudge a file, it is transformed to include the content at each point in the history where it changed (and the commit the change happened at).
Note, we can't run something like `git checkout ${COMMIT}` from inside a smudge, but we can run things like `git show` and `git rev-list`.

We can apply this same idea to parameters. Reading in a sparse update will recurse backwards through history until it hits a dense update. Once the dense update is reached (it just returns the value), each sparse update (read from git) will be applied as we move back up the stack.
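The recursion described above can be sketched end to end using only the commands that do work inside a smudge (`git rev-list` to enumerate history, `git show` to read blobs). The on-disk update format (small JSON records) and the function names are assumptions for illustration, not the project's actual checkpoint format.

```python
# Hypothetical sketch: rebuild a parameter by recursing backwards
# through git history until a dense update is found, then applying
# each sparse update on the way back up the stack.
import json
import subprocess

def show(commit: str, path: str) -> bytes:
    """Read a file's bytes at a commit; `git show` works inside a smudge."""
    return subprocess.run(
        ["git", "show", f"{commit}:{path}"],
        capture_output=True, check=True,
    ).stdout

def commits_touching(path: str, start: str = "HEAD"):
    """Commits that changed `path`, newest first."""
    out = subprocess.run(
        ["git", "rev-list", start, "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()

def read_parameter(path: str, commits=None):
    if commits is None:
        commits = commits_touching(path)
    head, *rest = commits
    update = json.loads(show(head, path))
    if update["type"] == "dense":
        # Base case: a dense update fully specifies the value.
        return list(update["values"])
    # Recursive case: build the previous value first, then apply
    # this sparse update on top of it as the stack unwinds.
    value = list(read_parameter(path, rest))
    for idx, val in update["entries"]:
        value[idx] = val
    return value
```

With the per-parameter recursion in place, the depth of the walk is bounded by how far back the last dense update is, which is why stopping at the first "replacing" update matters for performance.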
The main open questions are:

- Can `tensorstore` read a tensor when the binary blob (and the metadata file) are byte sequences from `git show`?
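The bytes-producing half of that question is straightforward to demonstrate: `git show` hands back the raw blob as an in-memory bytes object without touching the working tree. Whether `tensorstore` can consume such bytes directly is the open part; the snippet below uses the stdlib `struct` module as a stand-in reader, and the raw-float32 file layout is purely illustrative.

```python
# Hypothetical sketch: fetch a binary blob from history as bytes and
# parse it in memory. A stand-in for the real tensorstore read, which
# is the open question.
import struct
import subprocess

def blob_bytes(commit: str, path: str) -> bytes:
    """Raw file contents at a commit, as an in-memory bytes object."""
    return subprocess.run(
        ["git", "show", f"{commit}:{path}"],
        capture_output=True, check=True,
    ).stdout

def parse_float32s(blob: bytes):
    """Stand-in reader: interpret the blob as little-endian float32s."""
    n = len(blob) // 4
    return list(struct.unpack(f"<{n}f", blob[: n * 4]))
```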