Skip to content

Conversation

@stephan-gh
Copy link
Contributor

I was looking the the git format-patch man page a while ago and thought these extra command line parameters might help to reduce the ugly random diff when regenerating patches.

See commit description for details.
Regenerated patches with these changes: https://github.com/Minecrell/Paper/commit/410eaf85a776aaf3476aff12725897fab7ea768e

Eventually, some parts of cleanupPatches could be removed after this change.

Spottedleaf pushed a commit to Tuinity/Tuinity that referenced this pull request Jul 17, 2019
Copy link
Member

@aikar aikar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No on the zero commit. causes huge problems when doing upstream updates (git uses that data to help apply patches when the line numbers no longer line up), but the other things are nice.

@stephan-gh
Copy link
Contributor Author

No on the zero commit. causes huge problems when doing upstream updates (git uses that data to help apply patches when the line numbers no longer line up), but the other things are nice.

As far as I know, only the index aaa..bbb lines are used for the --3way merge.

The commit hash in From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 is the original commit the patch was generated from.

It is not very helpful when applying the patch. The reason is that only the person that generated the patch will eventually still have that commit hash in their local repository. Every time the patch is applied a new commit hash is generated.

I'm quite sure that commit hash is simply ignored when applying the patch...

@aikar
Copy link
Member

aikar commented Jul 17, 2019

I thought the same too, and git surprised me. git uses the hashes to build a cross reference lookup "This patch 0003 references a non existent commit ID, but I have record of a previous patch 0001 using this ID that ended up being applied as this other ID, so use this ID as the base instead"

though i don't know how much thats tied to commit vs file hashes, but i don't see it as worth the risk, as you don't know how much it bites you until the next upstream update and stuff fails to apply that would have otherwise.

@stephan-gh
Copy link
Contributor Author

I can imagine that it does a lot of magic with the index lines, but I still doubt that also applies to the commit hash in the From line. I had a glance at the Git source code before opening this pull request and I was not able to find any use for the commit hash. The only exception was some internal use together with git rebase, but that's not used when invoking git am directly.

The --zero-commit option was really the original reason for me to open this PR, since that is the part that changes literally every time a patch is regenerated. In my opinion there is not much risk but I could be wrong. So at the end it's up to you: If you think it's too risky then I'll remove it. Or maybe you could just give it a try?

@zachbr
Copy link
Contributor

zachbr commented Jul 20, 2019

I'm not convinced that first From ... Mon Sep 17 00:00:00 2001 line hash is used anywhere. When we did our version of this we zero'd significantly more than just that. I'm not sure we should draw any comparisons from that experience.

@zachbr
Copy link
Contributor

zachbr commented Jul 24, 2019

I'll try to run through some scenarios later this week (oh god I'm giving out time frames again) to confirm that this shouldn't cause any glaring issues.

@zachbr zachbr self-assigned this Jul 24, 2019
@zachbr zachbr self-requested a review August 1, 2019 00:38
@Spottedleaf
Copy link
Member

I've had local issues specifically with --no-signature and patch rebuilding failing to generate new changes. Removing the flag fixes the issue (re-adding the flag again gives the same issue).

I will be looking into taking this issue to git over the next week or so, but decided to comment here to let anyone know who is using these flags to perhaps be aware of issues and a solution.

@stephan-gh
Copy link
Contributor Author

stephan-gh commented Nov 4, 2019

@Spottedleaf That doesn't really make sense. I'm not sure if I understand the issue you have, but if your issue is that new changes don't show up in the regenerated patches you might want to check if the cleanupPatches function is now reverting changes because it expects the Git version to be there.

@aikar
Copy link
Member

aikar commented Nov 4, 2019

While From may not be needed, as we've demonstrated, the rest are absolutely required, so we're talking about simply 1 set of hashes per file.

Not worth the complexity over a single extra diff section especially if the solution provided, Leaf has mentioned has problems.

@zachbr
Copy link
Contributor

zachbr commented Nov 4, 2019

What complexity? No arm chair quarterbacking in this. Either you have issues or you don’t.

Leafs issue remains an open question. No changes required at this time.

@stephan-gh
Copy link
Contributor Author

@aikar
There is still --full-index, which reduces diffs where one of the index hashes changes in length depending on the number of objects you have in your (local) repository.

Like here: df98489#diff-1ec14d830e42ae37bf8a5d4b7d83d5cfL8

@zachbr
Copy link
Contributor

zachbr commented Nov 4, 2019

@Spottedleaf is this just some generic issue with rebuilds on these patches or is there some specific case you’ve narrowed it too?

Would like to see if we can get some reproduction on it and go from there.

@aikar
Copy link
Member

aikar commented Nov 4, 2019

Oh yes that flag is good, I had been meaning to look into why that problem was happening ages ago.

@aikar
Copy link
Member

aikar commented Nov 4, 2019

@zachbr the issue stems that if someone builds on an older repo that has more stale/temp objects in it, and that git actually encounters a short hash conflict, it adds 1 more character to the shorthash to overcome the conflict.

So people on fresh repos would nearly always result in a standard shorthash, but occasionally older checkouts might have an object conflict resulting in other patch files getting changed in a way that cleanupPatches can't filter out.

@Spottedleaf
Copy link
Member

Cleanup patches is reverting the patches

@Spottedleaf
Copy link
Member

Spottedleaf commented Nov 4, 2019

The specific issue was fixed here:
Tuinity/Tuinity@60bd62e#diff-1dc4086bc209a34819532fe08eb11b4cR1567

@stephan-gh
Copy link
Contributor Author

With this PR, pretty much none of cleanupPatches is really needed, so it just needs to be modified to handle only the remaining cases where useless diff is still generated, if there are any. I didn't do it because that method is a bit tooo magic for me.

@aikar
Copy link
Member

aikar commented Nov 5, 2019

edit: nevermind, I swear I thought I saw zero-commit erased index lines too, but I see it doesn't now. (got confused on some actual new file adds where source = 00000, thinking that was all index lines)

Ok, does look like that's fine then.

@aikar
Copy link
Member

aikar commented Nov 5, 2019

I think cleanup will still be needed, as otherwise a change of a lower file will result in the index lines of every subsequent patch changing, generating more noise instead of reducing it.

@Spottedleaf
Copy link
Member

Spottedleaf commented Jan 23, 2020

I ran into some issues a while ago with patches again not rebuilding correctly - decided to cut cleanup patches entirely and I've not run into further issues.

@zachbr zachbr removed their assignment May 3, 2020
@aikar aikar changed the base branch from ver/1.14 to master May 3, 2020 01:56
Add --zero-commit --full-index --no-signature to git format-patch
to reduce the diff when doing minor changes in patches.

 --zero-commit: Print 00000000... commit hash in first "From" line
   of the patch. This commit hash is useless because it changes
   every time the patch is applied.

 --full-index: Print full SHA-1 hash on "index aaaa..bbbb" lines.
   This makes the lines much longer, but avoids random changes to
   add/remove a character based on the number of objects that exist
   in the local repository.

 --no-signature: Omit Git version from generated patches.
@aikar aikar changed the title scripts/rebuildPatches.sh: Reduce diff when regenerating patches Speed up rebuilding patches and reduce diff May 6, 2020
@aikar aikar merged commit 5c0bfff into PaperMC:master May 6, 2020
@stephan-gh stephan-gh deleted the format-patch-clean branch May 6, 2020 10:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants