Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rebase vs. git pull vs. git merge #56

Open
ryekerjh opened this issue Jan 12, 2019 · 24 comments
Open

Rebase vs. git pull vs. git merge #56

ryekerjh opened this issue Jan 12, 2019 · 24 comments
Assignees

Comments

@ryekerjh
Copy link
Contributor

@TomAlperin @mwallert @dunlavy @carlosvargas Please do a quick write up on the benefits of each of these and your suggestions on which to use at Shift3

@carlosvargas
Copy link
Contributor

carlosvargas commented Jan 14, 2019

Not sure if you're asking about git pull (which is really git fetch && git merge or git pull --rebase or rebasing PRs or merging PRs, but they're all kinda connected, so I'll comment about each one lol

When updating my local, I always prefer to rebase on top of any remote changes. I don't like the merge commit that get's created with git pull on the feature branch since it "pollutes" my branch history with changes that were not mine. It also seems more appropriate for my feature to be built off the latest changes on the remote branch instead of merging them multiple times through the branch lifecycle. In addition, it gives me the opportunity to reword or squash any "WIP" or "small fix" commits which makes the change history cleaner for when PR time comes in.

However, when merging back into develop, I prefer to merge (with a different message than the GH default of Merge pull request #) in order to keep track of the commit history of my branch. If the changes are minimal, a squash or rebase is fine, but most of the time, I prefer a regular commit merge.

When merging to master though, a squash sometimes makes more sense since each commit is a "prod ready" change. So I should be able to go to any commit and have a buildable and functional product. In this case, we don't really care about any history (that still lives in develop), we just want the deployable changes. It also makes it faster to do a git bisect on master to find which feature broke what functionality. We can then fix it or jump into develop and keep doing git bisect on the feature's history (most of the time this is overkill).

So I guess, my answer is.....use all when appropriate? lol

@ryekerjh
Copy link
Contributor Author

@carlosvargas Thanks so much for this! This is exactly what I was hoping for! if @dunlavy or @TomAlperin have any additional suggestions, I'd love to hear them!

@ryekerjh ryekerjh assigned ryekerjh, TomAlperin and mwallert and unassigned Fleury14 and ryekerjh Jun 5, 2019
@ryekerjh
Copy link
Contributor Author

@TomAlperin I'll give you till Monday to respond, but I'd like to get this written up as a standard for the shop. @Shift3/pr-team what do ya'll think? ^^^

@stephengtuggy
Copy link
Contributor

This link should tell you all you need to know.

I have personally experienced projects that I worked on, getting seriously messed up because I didn't know better than to use git rebase. My conclusion was never to use rebase again. It's just too dangerous.

As for cleaning up commit history? If you ask me, git commit history should be regarded as a source of truth, not as something to prettify after the fact. The only time you should alter git commit history is to get rid of some huge file or sensitive info that never should have been checked in in the first place. For that purpose, use a tool such as this. Don't use git rebase, except if bfg or a similar tool instructs you to. Which it doesn't, AFAIK.

@ryekerjh
Copy link
Contributor Author

@stephengtuggy That is an interesting opinion. What kinds of catastrophic things happened? We've had a few unfortunate overwrites on IFG but nothing deadly. @DropsOfSerenity @mwallert What do you guys have to say about this ^^^?

@stephengtuggy
Copy link
Contributor

@ryekerjh The project I was working on ended up with branch A that appeared to be based on branch B, but had additional code changes, with no record of those changes. It was very confusing, and eventually became just about impossible to fix.

@justincohan
Copy link

In my experience rebasing can be more difficult and can lead to mistakes when you have to manually merge changes. But I don't think it's necessary to require someone to use one way over another.

@ryekerjh
Copy link
Contributor Author

In my experience rebasing can be more difficult and can lead to mistakes when you have to manually merge changes. But I don't think it's necessary to require someone to use one way over another.

I have certainly seen this in the projects I have used Rebase on. I definitely don't want to "pick" one way over another, but I do think a doc explaining the strengths vs. weaknesses of each method and how we want each to be performed @Shift3 would be useful, no?

@ryekerjh The project I was working on ended up with branch A that appeared to be based on branch B, but had additional code changes, with no record of those changes. It was very confusing, and eventually became just about impossible to fix.

AHHHH!! 😱 That would be good to document in the rebase docs for this. Do you know how that discrepancy arose? Or is it just one of those black-box rebase things?

@DropsOfSerenity
Copy link

DropsOfSerenity commented Aug 30, 2019

Theres a couple things being conflated here. @carlosvargas already covered a good amount of the good stuff. I'll try to be brief.

  • The whole point of commit history is to be useful in the future, it's not necessarily to show every step you as the developer took, but more to provide information to your future self or other developers.
  • Maintaining bundled and clean commit history allows us to use git bisect to track down issues in fewer steps and easier, than if each commit represents each atomic developer commit.
  • Feature branches should always be squashed into a single commit to aid this purpose.
  • When on a feature branch, rebase should always be used to maintain commit history parity with develop. This ensures you are building commit history on top of the history, not interspersed between.
  • When merging a feature branch into develop, always squash and commit and rewrite the commit message to represent the scope of work your branch represents. Basically the commit message should be the PR message.

@DropsOfSerenity
Copy link

DropsOfSerenity commented Aug 30, 2019

Additionally rebase when learned avoids many issues. Merges become easier and less error prone when your branch is consistently rebased. Reverting commits and bisecting to find bugs becomes easier when your commit history isn't every single typo fix or indentation change you've ever made.

The flip side of having issues with rebase, is spaghetti commit history, constant --no-ff merges that ruin any chance of tracking down changes, and merges that make you wanna tear your hair out as you try to disentangle Merged develop from develop commits over your own branch and other such nonsense.

@DropsOfSerenity
Copy link

DropsOfSerenity commented Aug 30, 2019

Lastly, I understand rebase can be hard, but most mistakes in rebase come from a misunderstanding or fear of how git and git commit history works. I think we could tackle that problem by either finding or making some nice and easy to understand learning resources to help clarify the concepts.

@ryekerjh
Copy link
Contributor Author

@DropsOfSerenity Thanks for that thorough response! My greatest fear/misunderstanding is how git chooses to create those funky merges that @stephengtuggy and I have seen before. If I knew the why/how, it would probably help prevent it in the future.

@stephengtuggy
Copy link
Contributor

@DropsOfSerenity I can see the advantages of what you are describing. If I could be sure that git rebase wouldn't mess up the repos I worked on the way I described, then I would certainly consider using it.

rebase is probably less dangerous on feature branches that only one person is working on. Squashing PR's might be OK too. We can discuss that further. I just wouldn't want to rebase on the development or master branches, etc.

@ryekerjh One of the distinguishing features of the project i worked on that got messed up, was that it had an upstream as well as an origin. That might make a difference.

@stephengtuggy
Copy link
Contributor

@DropsOfSerenity BTW, you also made a very good point that the overall purpose of git history is strictly utilitarian. And that a literal snapshot of every single atomic change you made is not necessary to achieve that.

Guys, what if we were to create a test repo or two? First see if we can reproduce the "funky merge" problem, then try to find the root cause and how to address it?

@coreyshuman
Copy link
Member

Feature branches should always be squashed into a single commit to aid this purpose.

I suspect @Karvel will have opinions on this, if you want to chime in.

When merging a feature branch into develop, always squash and commit and rewrite the commit message to represent the scope of work your branch represents. Basically the commit message should be the PR message.

Github PRs already provide the ability to isolate, review, and roll back a given feature. @DropsOfSerenity what are the advantages of the git bisect option as opposed to using Github's built in tools?

During PR review, I often flip through the individual commits. There is useful information to be gleamed from those, especially now that most of us are following the karma commit standard. Actually on that thought, how would a PRs react to rebasing after it was created? Wouldn't that mess up it's ability to label comments outdated / show new changes? Does anyone have personal experience with that?

@Karvel
Copy link
Contributor

Karvel commented Apr 9, 2020

I don't think this standard (either rebase vs merge or squash vs not squash) needs to be enforced shop-wide. I think this can be per-project. The preference here is tribal and people are disinclined to change. Within a project it makes sense to keep a standard consistent.

I personally find a ton of value in not squashing commit history because I also do extremely atomic commits, which tell a story of how I built or fixed something. I have been able to reference this history in multiple projects before (e.g. looking for commits for when I added IE11 compatibility). None of our projects are especially large, so I would rather retain information.

@stephengtuggy
Copy link
Contributor

Hey guys 👋 . Hope you are all doing well.

I can still see this standards-and-practices repo. Does that mean that I am still welcome to participate in discussions like this one?

@FlavioShift3
Copy link
Contributor

FlavioShift3 commented Apr 10, 2020 via email

@carlosvargas
Copy link
Contributor

@coreyshuman a git bisect isn't really useful when doing a PR. Its use case is for when you are debugging something and are trying to find when a bug was introduced. You can try to do it by walking through the code, but most of the time it is easier to do that when you know which commit was the culprit. In addition, you also have the context of how the bug was introduced, maybe it was a side effect of a bigger feature and changing that affects other things, or it's just a simple bug.

This is why I think doing a squash on the master branch is beneficial. You're doing a bisect on a small subset of commits, which can allow you to find which deploy caused the issue. After that, you can jump to develop where you have the full history and continue the bisect there if needed, to get full context.

@stephengtuggy you're more than welcome to participate. I know it was briefly mentioned a while ago, that part of the reason why we have the S&P as a public repo, is to allow outside contributors to participate :)

@mwallert
Copy link
Contributor

mwallert commented Apr 16, 2020

This issue has been open for a long time and I was assigned sorry! 😅

When I first started using Git, I felt like preserving the entire commit history was the most important thing. It is very important don't get me wrong, I just don't like having random commits plugging up the dev branch when things are merged in. It's hard to track the history of a branch when every merge is 1-10+ commits.

On top of this, at times when I am cleaning up a pr or debugging something, I will commit things like chore(-console.log) or chore(signup component): fixes linter errors which does us no good when looking back on a base branch, or using bisect (which I don't use yet but I want to start!). Doing an interactive rebase, lets me squash everything down to one useful commit message, and leave the other commit as comments (or remove commented commits like the ones above). That way the history of the branch, is just a series of the most important changes that represent features/fixes. To me this is way more useful (again purely OPINION here) when looking back on a a branch like develop's history.

Reading this back its sounds a little aggro but I mean it in the most sincere way. I have really loved rebase, ever since @DropsOfSerenity showed me how to do it, but I still consider myself a rookie and am learning new benefits with using it all the time. I like the clean and linear history, but I agree with @Karvel that it should be project based, and I will also acknowledge that rebasing can go really wrong if not done right/carefully!

TLDR:
Merge vs Rebasing

This article goes pretty in-depth about the benefits and drawbacks of rebase vs merge.

@coreyshuman
Copy link
Member

@carlosvargas You are beginning to sway me. I have some additional questions:

  • If master is constantly squashed while dev is left unfolded, does that cause any weirdness or difficulties next time you want to merge dev into master?
  • How do Github PRs react to rebasing after it was created? Wouldn't that mess up it's ability to label comments outdated / show new changes?

It sounds like we should do a workshop or at least a S&P talk on the subject. Any takers for hosting this talk? This topic should probably start as a best practice, with pros and cons on when and how to use rebasing and squashing. At that point we could extend our onboarding and internal training to cover git flow in depth.

@stephengtuggy
Copy link
Contributor

I just hope that the appropriate distinction is being made here between upstream rebasing and other rebasing cases. Other than that, I don't know that I have much input to offer at this point.

@michaelachrisco
Copy link
Contributor

michaelachrisco commented May 7, 2020

Came into this late, but if anyone would like to see an actual project that uses the whole Feature branches should always be squashed into a single commit to aid this purpose.: https://github.com/westernmilling/gman-services/commits/master

It appears they are still maintaining the same standard. Its been around since 2015ish so you can see the evolution if you care to look.

At first, we were doing it by hand with git rebase but then it got complicated when more than two developers were working on branches. After Github started supporting the squash and merge action, it became easier and made looking at master/dev branches a whole lot easier.

git rebase HEAD~(N) where N is the number of commits to get back to development is still memorized in my head and probably will never go away. Also git reflog is your friend if squashes/merges/etc... go wrong, but thats another topic for another day.

@amerikan
Copy link
Contributor

amerikan commented Sep 28, 2021

git pull = git fetch && git merge
git pull --rebase = git fetch && git rebase if you're going for a more linear history. Your git log --oneline --graph will look prettier. 😄

Squashing: git merge --squash. IMO useful when the feature branch is clutter with redundant commits.

When using rebase you gotta be careful since you're literally re-writing history.

With Github PR's web UI you can do most of them:

Screen Shot 2021-09-28 at 10 52 11 AM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests