-
Notifications
You must be signed in to change notification settings - Fork 3
/
git_rebase.qmd
289 lines (229 loc) · 12 KB
/
git_rebase.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
# Rebase for synchronizing work {#sec-rebase}
> Thanks for your contribution; could you rebase this against main? Thanks!
That is also a common request from maintainers, and one that can cause confusion for new contributors.
When one is working on a problem, others may be working in parallel. And their parallel work may be finished before one's own work is. Thus the starting point for one person's work (branching point) can "go stale" making it harder to integrate.
While `git merge` and resolving syntax level conflicts can resolve some of this, it is often easier to understand and review work if it is presented as changes against an updated starting point.
As a concrete example imagine a project to build a dashboard.
Imagine that you fork the repo in January to implement a new type of visualization (let's say a pie chart). You work on this during January and February, finally nailing it down at the start of March. Meanwhile, though, others in the project have spent February introducing a whole new way of accessing databases.
By the time you make a pull request at the start of March things have changed a lot since you branched in January.
--January-|------February--------------|--March
__pie_chart_branch___________
/ \
main--|--|-----------------------|----|-------
\ /
\__new_database_____/
:::{.column-body-outset}
```{mermaid}
---
config:
gitGraph:
parallelCommits: true
mainBranchOrder: 1
---
gitGraph
commit
commit
commit type: HIGHLIGHT
branch pie_chart_branch order: 0
commit
commit
checkout main
commit
branch new_database order: 2
commit
commit
checkout main
merge new_database
commit
checkout pie_chart_branch
commit
commit
commit
checkout main
merge pie_chart_branch
```
:::
If you submit a PR without updating, the maintainers will likely ask you to update your branch to make it work with the new database system.
First thing to do is to update your local repository with the changes from upstream.
```sh
git pull upstream main
```
Then you could try two options:
1. Either merge `main` into `pie_chart_branch` yourself (this is what we did in earlier exercises)
2. Rebase `pie_chart_branch` on main
Option 1 is possible, but often merging your work involves touching parts of the system you don't know what much about and is better left to the core developers. In addition, merging in this way leaves merge commit messages and some projects really don't like those because they make the history harder to read. It also leaves all of your "intermediate" commits in the history (things like "fix typo" and "forgot comma"). Some find that too much information.
Option 2 is generally preferred, since it focuses on clear communication via PRs that are easier to read and review. Often this just means a single commit showing the differences between the most updated work and the work that you want to contribute. For those viewing the history, it will be just the same as if `pie_chart_branch` was created in late February and you did all the work very quickly!
Option 2 is called `rebase`. We specify a new starting point, and git gives us a little UI in which we can choose which commits to include, which to "squash", and we can even reorder commits.
<!-- git clone https://github.com/jameshowison/320d_rebase_example.git -->
<!-- cd 320d_rebase_example -->
<!-- echo "first commit" >> README.md -->
<!-- git add README.md -->
<!-- git commit -m "first commit" -->
<!-- echo "second commit" >> README.md -->
<!-- git add README.md -->
<!-- git commit -m "second commit" -->
<!-- git checkout -b pie_chart_branch -->
<!-- echo "pie_chart_one" >> pie_chart.md -->
<!-- git add pie_chart.md -->
<!-- git commit -m "pie chart 1" -->
<!-- echo "pie_chart_two" >> pie_chart.md -->
<!-- git add pie_chart.md -->
<!-- git commit -m "pie chart 2" -->
<!-- git checkout main -->
<!-- echo "database 1 on main" >> database.md -->
<!-- git add database.md -->
<!-- git commit -m "database 1" -->
<!-- echo "database 1 on main" >> database.md -->
<!-- git add database.md -->
<!-- git commit -m "database 2" -->
<!-- git checkout pie_chart_branch -->
<!-- echo "pie_chart_three" >> pie_chart.md -->
<!-- git add pie_chart.md -->
<!-- git commit -m "pie chart 3" -->
<!-- git push -->
<!-- git checkout main -->
<!-- git push -->
We have an example set up. First we will clone it, change directory into the repo, and then switch to the pie_chart_branch.
```sh
cd ~
git clone https://github.com/jameshowison/320d_rebase_example.git
cd 320d_rebase_example
git checkout pie_chart_branch
```
Then, to make git happy, be sure to set your user and email (this happens because DataCamp resets these variables from session to session):
```
git config --global user.name "John Doe"
git config --global user.email johndoe@example.com
```
Our git viz command (See @sec-gitviz to set up) shows us the situation:
```sh
$ git viz
* 2b6a2cf (origin/main, origin/HEAD, main) database 2
* 70c14e7 database 1
| * 246e49e (HEAD -> pie_chart_branch, origin/pie_chart_branch) pie chart 3
| * 4c12962 pie chart 2
| * 123593a pie chart 1
|/
* ccd6067 second commit
* 512525e first commit
* 20b7fb5 Initial commit
```
We have two branches (main and pie_chart_branch). pie_chart_branch branched off from main at "second commit", and since then more work has been added to main (the database related commits). We also have three separate commits for `pie_chart_branch`.
So, in a prettier diagram, the situation looks like this:
```{mermaid}
---
config:
gitGraph:
parallelCommits: true
---
gitGraph
commit
commit
commit type: HIGHLIGHT
branch pie_chart_branch
commit
commit
checkout main
commit
commit
checkout pie_chart_branch
commit
```
But we want it to look as though pie_chart_branch came off the tip of main, like this:
```{mermaid}
---
config:
gitGraph:
parallelCommits: true
---
gitGraph
commit
commit
commit type: HIGHLIGHT
commit
commit
branch pie_chart_branch
commit
commit
commit
```
To accomplish this we can use a git command called `rebase`. We say that we will "rebase" the pie_chart_branch against `main`.
```sh
git switch pie_chart_branch
git rebase -i main
```
This puts us into a textual UI, with guidance at the bottom. We can use the arrow keys to move around. In this UI we are editing a text file (so we are using the Nano editor). Each line is a commit, the lines are instructions and are executed in order.
![](git_rebase_files/rebase_no_squash.png)
We use <kbd>Ctrl</kbd> + <kbd>O</kbd> to save (aka "write-Out") and <kbd>Ctrl</kbd>+<kbd>X</kbd> to exit the editor. We should see a count of commits being rebased and then a message about a successful rebase. Using `git viz` will then show us:
```
* 16fa532 (HEAD -> pie_chart_branch) pie chart 3
* bf585ba pie chart 2
* a2591db pie chart 1
* 3417133 (origin/main, origin/HEAD, main) database 2
* 32c1b1c database 1
| * 83f6236 (origin/pie_chart_branch) pie chart 3
| * 6780d4e pie chart 2
| * 54b2ecb pie chart 1
|/
* 52bd3fd second commit
* 0aefb6a first commit
* 9ad07d3 Initial commit
```
This is a little hard to read. This is because the `origin/` branches are shwoing (and those are up on the GitHub server and not yet changed). Look closely at the `pie_chart_branch` and down the left column. We now see that `pie chart 1` commit comes directly after the two commits on main, just as though the work started there.
To get this to also be reflected on the server we would have to do a "force push" `git push -f` which makes the server reflect the local repo. Note that this is usual for rebase on branches, but one should be very hesitant with any force operations on main or a branch shared with others. (Since you are not the owners of the repo, you won't be able to do a force push here). After the force push, things are neater:
```
* 16fa532 (HEAD -> pie_chart_branch, origin/pie_chart_branch) pie chart 3
* bf585ba pie chart 2
* a2591db pie chart 1
* 3417133 (origin/main, origin/HEAD, main) database 2
* 32c1b1c database 1
* 52bd3fd second commit
* 0aefb6a first commit
* 9ad07d3 Initial commit
```
You might be thinking, didn't we do pretty much the same thing with `cherrypick`? And yes, in this situation `rebase` [is very similar to cherrypick](https://stackoverflow.com/questions/11835948/git-cherry-pick-vs-rebase).
Rebase does offer some additional opportunities though. Notice that the textual UI shows more options than just 'pick'. One otion is 'squash' which takes changes from multiple commits and presents them in a single commit. This can make a git history easier for teammates to understand.
## Adding squash
We want to _squash_ those three commits down to one commit, and (like before) we want that commit to represent changes _as though_ we had branched off main after the `database 2` commit.
First, remove the directory we have been working in and reset things with a new clone.
```sh
cd ..
rm -rf 320d_rebase_example
git clone https://github.com/jameshowison/320d_rebase_example.git
cd 320d_rebase_example
git switch pie_chart_branch
```
Just as before we use:
```
git rebase -i main
```
This puts us back into a textual UI, with guidance at the bottom. This time instead of leaving all three as 'pick' we change two of them to squash. We do this by editing the file so that two of the lines show `s` instead of `pick`.
![](git_rebase_files/squash_commits.png)
After we use <kbd>Ctrl</kbd> + O to save (aka "write-Out") and <kbd>Ctrl</kbd>+X to exit the editor, git completes the squash by including all the changes in a single commit. We are then returned to the editor to create a new commit message.
![](git_rebase_files/squash_commit_message.png)
Finally, we can use git viz to see the new situation.
```sh
* 32f895e (HEAD -> pie_chart_branch) Here I have tidied up the history to make it easier to read. Pie chart draws pies.
* 2b6a2cf (origin/main, origin/HEAD, main) database 2
* 70c14e7 database 1
| * 246e49e (origin/pie_chart_branch) pie chart 3
| * 4c12962 pie chart 2
| * 123593a pie chart 1
|/
* ccd6067 second commit
* 512525e first commit
* 20b7fb5 Initial commit
```
Ignore the remote branch (origin/pie_chart_branch), and focus only on the local branch (HEAD -> pie_chart_branch). Instead of all three commits, we now only see a single commit (c367b8a).
Note that in these examples we did not encounter any conflicts (because edits were all on different files). In reality, though, when commits are moved around, rebased, and squashed, sometimes those operations result in conflicts (just as happens with merge). In that case git add the conflict symbols (`<<<<<<` and `==========` and `>>>>>>>>>`) which all need to be removed, then the files added and finally we run `git rebase --continue`. Git does provide useful messages through that process with guidance on the likely next steps.
# Rebase Exercise
In groups of three, a Maintainer and two contributors, begin by setting up the collaboration network (a new repo, forks, clones and setting upstream). Then work together to implement these steps:
1. Maintainer makes two commits to the main branch, contributors use `git pull upstream main` to synchronize.
2. Each contributor creates a feature branch, and adds two commits to a file with their name.
3. Contributors create PRs that the Maintainer sees on the upstream/sahred repository.
4. Maintainer can accept one, merging to main.
5. Maintainer then asks the other contributor to rebase their contribution.
6. The contributor doing the rebase first does a `git pull upstream main` to get the new work onto the main branch in their local git.
7. Then the contributor uses rebase and squash (as above) and uses `git push -f` to get their fork syncrhonized with their rebase.
8. Maintainer should then check the PR and they will find only a single commit (the two previous ones squashed together), now looking as though it branched off main after the first contributor's work. Maintainer can then accept the rebased PR.
9. Both contributors should sychronize (`git pull upstream main` and then git push to sync their forks.)