Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make git diff prettier #35

Closed
craffel opened this issue Oct 13, 2022 · 5 comments
Closed

Make git diff prettier #35

craffel opened this issue Oct 13, 2022 · 5 comments
Assignees
Milestone

Comments

@craffel
Copy link
Contributor

craffel commented Oct 13, 2022

Involves making a custom difftool

@craffel craffel added this to the git proof-of-concept milestone Oct 13, 2022
@craffel
Copy link
Contributor Author

craffel commented Nov 2, 2022

Depends on #46

@nkandpa2
Copy link
Collaborator

Took a first cut at this. Here's a sample for some feedback:
Screen Shot 2022-11-28 at 12 21 22 AM

@craffel I think you said this depends on #46 because we want the diff tool to give some idea of what intermediate changes were applied between two versions of a checkpoint (e.g., between commit A and commit B there were 2 sparse updates and one dense update). As per my current understanding this will not be possible with the information that git provides external diff tools. The interface git enforces for diff tools is that they take the following parameters: diff-tool path old-file old-hash old-mode new-file new-hash new-mode. Since there's no information about what commit the two versions come from, we won't be able to display the intermediate changes applied.

@blester125
Copy link
Collaborator

I think with the way I setup the parameter updates on the FS we can actually figure out the updates applied.

Basically each update to a parameter is stored in a sub-dir that is named with the hash of the parameters for that update. So if we follow to metadata backpointers from .../params/updates/${new-file["${param_name}"]["hash"]} (or the parameter metadata file which I think has dup information rn) until we hit old-file["${param_name}"]["hash"] we should be able to tell what updates were applied.

I think this is dependent on the diff direction through? Like we need the on-disk .git_theta/... dir to be the one for the most recent model. If the on-disk value is older commit we can't do this.

A demo FS dir can be found here https://github.com/r-three/git-theta/blob/3e8566a83b6ef1e5cc5a24dc41ad7e0d73b8a0d3/git_theta/updates/base.py

@nkandpa2
Copy link
Collaborator

Yeah, just took a look at the updates PR and this seems feasible for diff-ing versions that do not have a dense update between them (since the updates folder is purged upon a dense update). I guess I'll wait on implementing this feature until after #92 gets merged.

@blester125 blester125 linked a pull request Jan 30, 2023 that will close this issue
@craffel craffel modified the milestones: paper, first release Feb 3, 2023
@blester125
Copy link
Collaborator

Finished in #207

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants