/
chap3.tex
267 lines (195 loc) · 16.9 KB
/
chap3.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
% chap3.tex - Week 3
\cleardoublepage
%\phantomsection
\chapter{Week 3}
\section{Day 1 - ``But how do I see what's going on?''}
\subsection{Logging in Git}
Perhaps the best feature of a version control system is the level of accountability that it offers if set up correctly. What do we mean by this? People often mistake the word \textbf{accountability} for the word \textbf{blame}. This is not true at all. Accountability is key in understanding the events that led up to a particular bug being introduced, or a situation occuring. How this is dealt with, is a up to the management teams, but accountability should not be something that is revered, it should be something that is looked upon as a tool to help define the cause of a problem.
By its very nature, a version control system is also a logging system. Every time we committed something into the repository in the last chapter, we supplied a log message. In fact, if we don't supply a commit message, let us see what happens.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
john@akira:~/coderepo$ git status
# On branch master
# Untracked files:
# (use "git add <file>..." to include in what will be
committed)
#
# my_third_committed_file
nothing added to commit but untracked files present (use "git
add" to track)
john@akira:~/coderepo$ git add my_third_committed_file
john@akira:~/coderepo$ git commit -a -m ''
Aborting commit due to empty commit message.
john@akira:~/coderepo$
\end{Verbatim}
So, Git will actually not allow you to commit with a blank message. This is actually fantastic news, as people are far less likely to write a useless message than they are a blank one. It is very important that when using a version control system you write in a useful commit message. If you fixed a bug, say so. If you added a new function, why not put that in too. When someone wants to find out what a certain commit was for, or even when you come back to the project six months later and realise you've forgotten everything, log messages are crucial in piecing back together a history of development.
\begin{trenches}
``So John, I've been committing and all that,'' started Rob, ``but how do I see the history of what I have done.''
``It's really pretty simple,'' replied John, ``But it really depends on what you want to know.''
Rob placed his thumb and forefinger onto his chin. ``Well, for now, I just want to see a list of all of my commits.''
``That one's the simplest of all.''
\end{trenches}
At its simplest, \texttt{git log} will give an output of all of the commits that have been applied to the current branch. Depending on what type of machine you are using it on, the output from \texttt{git log} will be navigatable, usually using the up and down arrows, with 'q' used to quit. Let's have a quick look at the output of our test repository and see what the log messages look like.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
john@akira:~/coderepo$ git log
commit 6ca160c7226731bf80973fc5bc81f6b9beda7795
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Mon Feb 21 20:59:32 2011 +0000
Finished adding initial files
commit e86ddea25341a75275d316d8ca545aa7c73e97b3
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Mon Feb 21 20:06:57 2011 +0000
Made a few changes to first and second files
commit 88206926cb60aed53d21ede69f9ca5b7c69cb983
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Sat Feb 19 09:23:47 2011 +0000
My First Ever Commit
john@akira:~/coderepo$
\end{Verbatim}
The \texttt{git log} command shows us a chronological list of all of the commits to the repository and also gives us several more important pieces of information. In total there are four pieces of information displayed by default.
\begin{itemize}
\item \textbf{commit} - This is the SHA-1 hash of the commit object that is stored inside the repository. You can find more information about this in the \emph{What's inside the Git repository?} section \emph{Week 2}. This is how we refer to the commit. If someone asked you in what commit you \emph{Made a few changes to first and second files}, you could reply that you did that in commit e86dd. As explained earlier, it is good to remember that you don't need to remember or type out the whole \textbf{e86ddea25341a75275d316d8ca545aa7c73e97b3}, only the first part is required. Generally, the first five characters will do.
\item \textbf{Author} - This is the name and email address of the author of the commit. When we begin to look at merging, you will see that the author of a commit, is not necessarily the \emph{committer} of the commit. If you want to find out more about how to set these options, see the breakout box in this Week, called \emph{Changing your identity}.
\item \textbf{Date} - The date is simply the date at which the commit was created. Again, note that when we start looking at merging, the date will be the date the commit was created, not the date it was merged into the repository.
\item \textbf{Commit Message} - This is the log message that was added along with the commit when it was created. Hopefully you can now see how important it is to create useful and meaningful messages in here.
\end{itemize}
\begin{framed}
\subsubsection{Note on Commit IDs}
Please note, that if you are following the commits and changes on your local computer, you may not and probably will not have the same commit IDs as are presented in this book. You are advised to use them here to follow what is happening, but to substitue them with your own values, when you start working with the rest of this chapter.
\end{framed}
\begin{framed}
\subsubsection{Changing your identity}
Particularly when working with other people, or when publishing your repository to a public location, it's a good idea to make sure people know who you are and how to get in contact with you. Every time you make a commit to a repository, Git gives the opportunity to take note of who posted the commit. When you first install Git, it probably won't have the correct information in there for you, so it's important that you take the time to set this up.
To set up your name and email address, we need to modify the \texttt{gitconfig} again.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
$ git config --global user.name "John Haskins"
$ git config --global user.email
"john.haskins@tamagoyakiinc.koala"
\end{Verbatim}
That's it. Now by default, Git will use this setting whenever you commit to a repository, unless you override it by locally modifying the repository's \texttt{.gitconfig}.
\end{framed}
\section{Day 2 - ``But I need more information''}
\subsection{Digging a little deeper}
\begin{trenches}
``I know John, and next time I will make a note of it, but right now, I'd really like to know where this file got changed,'' Klaus pointed at the piece of paper containing a print out, ``specifically when this function was introduced.''
John smiled. His hands danced over the keyboard as he finished compiling an email. ``And you've no idea when this was added at all?'' he asked.
``No, sorry John, I don't.'' He pondered, ``I guess I could write a script to untar all the versions we've created in the last week and search through them.'' He sighed, ``Can't the wonderful Git help us out here?''
A head popped up over the cubicle wall. ``You wanna find out when a function was introduced to a file?'' It was Rob. ``After John showed me the basics, I went and read up on it a little. Git has some really powerful searching within the log tool.''
``Well come on then,'' blurted Klaus, ``Don't keep me hanging on.''
A chime of the popular 1966 hit sprang out in the office.
Klaus pulled a hand down over his face, ``Oh don't you all start!''
\end{trenches}
Git can actually do some rather powerful searching to assist a developer in their daily tasks. It would have been useful if the particular item that was being searched for had been included in the log, but sometimes, things either get missed, or there are just too many changes introduced in one commit to list them all.
In these instances, the \texttt{git log -S<string>} command comes to our aid. This command will search through the commits in a repository and will return a list of commits which introduced or removed a specific string into the repository. First of all, let's run this against our test repository.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
john@akira:~/coderepo$ git log -SChange1
commit e86ddea25341a75275d316d8ca545aa7c73e97b3
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Mon Feb 21 20:06:57 2011 +0000
Made a few changes to first and second files
john@akira:~/coderepo$
\end{Verbatim}
You can see that \texttt{git log} has shown us the commit that instantiated the change. As you can imagine, when using a large code base, this tool can be invaluable. It allows us to pinpoint a specific moment when a certain string of text entered the repository. When running this against a very large repository, this could take a long time, and so the ability to shrink the search scope down will result in a much faster result. To do this we can append a path to our previous command.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
john@akira:~/coderepo$ git log -SChange2 my_first_committed_file
john@akira:~/coderepo$ git log -SChange2 my_second_committed_file
commit 6ca160c7226731bf80973fc5bc81f6b9beda7795
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Mon Feb 21 20:59:32 2011 +0000
Finished adding initial files
john@akira:~/coderepo$
\end{Verbatim}
If you remember from our committing back in Week 2, we added the string \texttt{Change2} to the second file but not the first. So the first time we run this command, it fails, as we are searching against \texttt{my\_first\_committed\_file}. The second time we run it, we are searching against \texttt{my\_second\_committed\_file} and this is where we see a result. Commit 6ca16 contains the commit we are looking for.
\section{Day 3 - ``So what actually changed?''}
\subsection{Doing the diff dance}
Knowing what the committer thinks they committed is brilliant. However, sometimes it's just not enough. The reason for this is stated fairly precisely in the first sentance of this paragraph, so let us add a little formatting to being out the real meaning. Knowing what the committer \emph{thinks} they committed is brilliant. By looking at the commit message we only know as much as the committer wants us to. If they are the helpful sort, this will probably all that we need, most of the time. On the other hand there is always the situation where you'd like to know a little more about what was actually placed into the repository.
The \texttt{git diff} command can show us exactly that. For more infomation about diff in general, see the diff breakout box in this chapter. Think of a diff as an easy way of looking at the differences between two files, surrounded by a little context. This can often be enhanced by a visual diff viewer, but for now, let's stick with our simple text based \texttt{git diff}.
If we want to find out what the changes are between our current commit and one of the previous ones, we can write a command like the one below. Notice that below, \textbf{e86dd} refers to the second commit that we made to the repository.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
john@akira:~/coderepo$ git diff e86dd
diff --git a/my_second_committed_file b/my_second_committed_file
index 3ad4cc3..095b9cd 100644
--- a/my_second_committed_file
+++ b/my_second_committed_file
@@ -1 +1,2 @@
Change1
+Change2
john@akira:~/coderepo$
\end{Verbatim}
What this is telling us, is that between \textbf{e86dd} and our current commit \textbf{6ca16}, we added the line \emph{Change2} to the file \texttt{my\_second\_committed\_file}. We can see this by the preceeding \texttt{+} on the line \texttt{Change2}. Let's make a few changes to our repository and see how the diffs look. We're actually going to make a few changes to the files using a text editor so that you can't see what we've done. Then, hopefully, when we run the \texttt{git diff} you'll be able to see clearly what has happened.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
john@akira:~/coderepo$ git log HEAD~1..HEAD
commit fa65f06cc62291bb0cd47aef9e05953d6655fc8e
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Tue Mar 1 21:17:57 2011 +0000
Messed with a few files
john@akira:~/coderepo$
\end{Verbatim}
The command \texttt{git log HEAD~1..HEAD} tells Git to show us the git log for all commits between \texttt{HEAD~1} and \texttt{HEAD}. The notation used here is something new to us, but seeing as HEAD points to the most current commit, HEAD~1 points to the commit previous to HEAD. In this way, we are telling Git to show us only the most recently commit.
As it turns out, John Haskins didn't really create a very meaningful log message. \emph{Messed with a few files} is pretty unhelpful in the grandscheme of things. So let's be thankful that this isn't Tamagoyaki Inc's core repository and take a look at what actually happened in the commit \textbf{fa65f}.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
john@akira:~/coderepo$ git diff HEAD~1..HEAD
diff --git a/my_second_committed_file b/my_second_committed_file
index 095b9cd..c9887f8 100644
--- a/my_second_committed_file
+++ b/my_second_committed_file
@@ -1,2 +1 @@
-Change1
-Change2
+Changed this file completely
diff --git a/my_third_committed_file b/my_third_committed_file
new file mode 100644
index 0000000..5d27866
--- /dev/null
+++ b/my_third_committed_file
@@ -0,0 +1 @@
+Addition to the line
john@akira:~/coderepo$
\end{Verbatim}
As you can see, we have several things going on here, so let's take each of them in isolation and see what is going on. We are going to disect the diff to see what each section means.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
diff --git a/my_second_committed_file b/my_second_committed_file
\end{Verbatim}
This first line tells us that we are dealing with \texttt{my\_second\_committed\_file}. This is showing that we are comparing the first revision, or a, against the second revision, b.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
index 095b9cd..c9887f8 100644
\end{Verbatim}
This second line actually tells us the beginning of the object IDs, as they are stored in the repository. Note that these IDs are not the commit IDs, but the actual blob IDs that Git uses to refer to the file. For more information on this, checkout the emph{Object's living in harmony} breakout box.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
--- a/my_second_committed_file
+++ b/my_second_committed_file
\end{Verbatim}
The next few lines are telling us which is the original file, and which is the new file, so we can use this as a reference.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
@@ -1,2 +1 @@
-Change1
-Change2
+Changed this file completely
\end{Verbatim}
The next bunch of lines are generally referred to as a hunk. The hunk has two important pieces of information. Section \texttt{-1,2} tells us that in the original file, we are looking at the original file (\texttt{-}), that the starting line where the change takes place is line 1 (\texttt{1}) and that the hunk applies to two lines (\texttt{2}). The next section tells us that in the new file, the change takes place as line 1, and because the comma and remaining number are omitted, we can infer that the hunk applies to only 1 line.
The next three lines show what happened. Strings \texttt{Change1} and \texttt{Change2} were deleted from the file, whereas \texttt{Changed this file completely} was added to the file.
Looking at the next diff segment, we can see it applies to a different file. Essentially this hunk is no different to the last, the only interesting portion is shown below.
\begin{Verbatim}[frame=single,fontsize=\relsize{-3}]
new file mode 100644
index 0000000..5d27866
--- /dev/null
+++ b/my_third_committed_file
\end{Verbatim}
This shows us that \texttt{my\_third\_committed\_file} is actually a new file. Notice the \texttt{/dev/null} and the \texttt{0000000} object ID, indiciating that there was no original file.
\clearpage
\section{Summary - John's Notes}
\subsection{Commands}
\texttt{git log} - Return a navigatable list of commits to a repository
\texttt{git log -S<string>} - Show all commits that either introduced or removed a particular string from the repository
\texttt{git log -S<string> <path>} - Show all commits that either introduced or removed a particular string from the repository, but restrict the search to a specific path
\texttt{git log HEAD~1..HEAD} - Show all commits between HEAD~1 and HEAD, essentially the last commit
\texttt{git diff HEAD~1..HEAD} - Show the actual differences between HEAD~1 and HEAD
\subsection{Terminology}
\textbf{Branch} - A way of working on the same set of code in parallel without modifications overlapping
\textbf{Diff} - Shows the actual differences between files
\textbf{Hunk} - A section of a diff output
%git log -p
%DIFFs
%Patching
%git show master:<path>
%git diff -- stylesheet.css master:stylesheet.css
%git checkout -b <branch>