# Making Changes In Git
  
Next, you’ll examine how Git stores data, learn essential commands to compare files and repositories at different times, and understand the process for restoring earlier versions of files in your data projects.

## Resources
  
**Notebook Syntax**
  
<span style='color:#7393B3'>NOTE:</span>  
- Denotes additional information deemed to be *contextually* important
- Colored in blue, HEX #7393B3
  
<span style='color:#E74C3C'>WARNING:</span>  
- Significant information that is *functionally* critical  
- Colored in red, HEX #E74C3C
  
---
  
**Links**
  
[Git commands](https://git-scm.com/docs)  
  
---
  
**Notable Functions**
  
<table>
  <tr>
    <th>Index</th>
    <th>Command</th>
    <th>Usage</th>
  </tr>
  <tr>
    <td>1</td>
    <td>git init</td>
    <td>Initialize a new Git repository in the current directory.</td>
  </tr>
  <tr>
    <td>2</td>
    <td>git clone &lt;repository URL&gt;</td>
    <td>Clone a remote Git repository to your local machine.</td>
  </tr>
  <tr>
    <td>3</td>
    <td>git add &lt;file&gt;</td>
    <td>Add changes in a specific file to the staging area.</td>
  </tr>
  <tr>
    <td>4</td>
    <td>git commit -m "&lt;message&gt;"</td>
    <td>Commit the changes in the staging area with a descriptive message.</td>
  </tr>
  <tr>
    <td>5</td>
    <td>git status</td>
    <td>Check the status of your working directory and staging area.</td>
  </tr>
  <tr>
    <td>6</td>
    <td>git log</td>
    <td>View the commit history of the current branch.</td>
  </tr>
  <tr>
    <td>7</td>
    <td>git pull</td>
    <td>Fetch changes from a remote repository and merge them into the current branch.</td>
  </tr>
  <tr>
    <td>8</td>
    <td>git push</td>
    <td>Push local changes to a remote repository.</td>
  </tr>
  <tr>
    <td>9</td>
    <td>git branch</td>
    <td>List all branches in the repository.</td>
  </tr>
  <tr>
    <td>10</td>
    <td>git checkout &lt;branch name&gt;</td>
    <td>Switch to a different branch.</td>
  </tr>
  <tr>
    <td>11</td>
    <td>git merge &lt;branch name&gt;</td>
    <td>Merge changes from one branch into the current branch.</td>
  </tr>
  <tr>
    <td>12</td>
    <td>git fetch</td>
    <td>Fetch changes from a remote repository without merging.</td>
  </tr>
  <tr>
    <td>13</td>
    <td>git diff &lt;file-name&gt;</td>
    <td>Show unstaged changes in a specific file (working directory versus the staging area).</td>
  </tr>
  <tr>
    <td>14</td>
    <td>git reset &lt;file&gt;</td>
    <td>Unstage changes in a specific file.</td>
  </tr>
  <tr>
    <td>15</td>
    <td>git remote -v</td>
    <td>List all remote repositories and their URLs.</td>
  </tr>
  <tr>
    <td>16</td>
    <td>git rm &lt;file&gt;</td>
    <td>Delete a file from both the working directory and the Git repository.</td>
  </tr>
  <tr>
    <td>17</td>
    <td>git stash</td>
    <td>Temporarily save changes that are not ready to be committed.</td>
  </tr>
  <tr>
    <td>18</td>
    <td>git checkout -b &lt;new branch name&gt;</td>
    <td>Create a new branch and switch to it.</td>
  </tr>
  <tr>
    <td>19</td>
    <td>git remote add &lt;remote name&gt; &lt;repository URL&gt;</td>
    <td>Add a new remote repository.</td>
  </tr>
  <tr>
    <td>20</td>
    <td>git push -u &lt;remote name&gt; &lt;branch name&gt;</td>
    <td>Push a local branch to a remote repository and set up tracking.</td>
  </tr>
  <tr>
    <td>21</td>
    <td>git rebase &lt;branch name&gt;</td>
    <td>Reapply commits on top of another branch.</td>
  </tr>
  <tr>
    <td>22</td>
    <td>git tag &lt;tag name&gt;</td>
    <td>Create a new tag for a specific commit.</td>
  </tr>
  <tr>
    <td>23</td>
    <td>git remote remove &lt;remote name&gt;</td>
    <td>Remove a remote repository from your Git configuration.</td>
  </tr>
  <tr>
    <td>24</td>
    <td>git log --oneline</td>
    <td>View the commit history in a concise one-line format.</td>
  </tr>
  <tr>
    <td>25</td>
    <td>git clean -df</td>
    <td>Remove untracked files and directories from the working directory.</td>
  </tr>
  <tr>
    <td>26</td>
    <td>git cherry-pick &lt;commit&gt;</td>
    <td>Apply changes from a specific commit to the current branch.</td>
  </tr>
  <tr>
    <td>27</td>
    <td>git remote prune &lt;remote name&gt;</td>
    <td>Remove remote branches that no longer exist on the remote repository.</td>
  </tr>
  <tr>
    <td>28</td>
    <td>git fetch --all</td>
    <td>Fetch changes from all remote repositories.</td>
  </tr>
  <tr>
    <td>29</td>
    <td>git stash pop</td>
    <td>Apply the most recently stashed changes and remove them from the stash.</td>
  </tr>
  <tr>
    <td>30</td>
    <td>git remote show &lt;remote name&gt;</td>
    <td>Show information about a specific remote repository.</td>
  </tr>
  <tr>
    <td>31</td>
    <td>git blame &lt;file&gt;</td>
    <td>Display the commit history and authorship of each line in a file.</td>
  </tr>
  <tr>
    <td>32</td>
    <td>git log --graph</td>
    <td>Visualize the commit history as a graph with branching and merging.</td>
  </tr>
  <tr>
    <td>33</td>
    <td>git reset --hard HEAD</td>
    <td>Reset the current branch to the last committed state (use with caution).</td>
  </tr>
  <tr>
    <td>34</td>
    <td>git log --oneline</td>
    <td>View the commit history in a concise one-line format.</td>
  </tr>
  <tr>
    <td>35</td>
    <td>git bisect start</td>
    <td>Start a binary search to find a specific commit that introduced a bug.</td>
  </tr>
  <tr>
    <td>36</td>
    <td>git reflog</td>
    <td>View a log of all Git references (branches, HEAD, etc.) and their history.</td>
  </tr>
  <tr>
    <td>37</td>
    <td>git tag -a &lt;tag name&gt; -m "&lt;message&gt;"</td>
    <td>Create an annotated (with a message) tag for a specific commit.</td>
  </tr>
  <tr>
    <td>38</td>
    <td>git log --author=&lt;author&gt;</td>
    <td>View the commit history filtered by a specific author.</td>
  </tr>
  <tr>
    <td>39</td>
    <td>git clean -n</td>
    <td>Preview the untracked files and directories that would be removed by `git clean`.</td>
  </tr>
  <tr>
    <td>40</td>
    <td>git stash list</td>
    <td>List all stashes in the repository.</td>
  </tr>
  <tr>
    <td>41</td>
    <td>git log --grep=&lt;pattern&gt;</td>
    <td>View the commit history filtered by a commit message pattern.</td>
  </tr>
  <tr>
    <td>42</td>
    <td>git log -p &lt;file&gt;</td>
    <td>View the commit history with the changes introduced in a specific file.</td>
  </tr>
  <tr>
    <td>43</td>
    <td>git commit --amend</td>
    <td>Amend the last commit with new changes or a new commit message.</td>
  </tr>
  <tr>
    <td>44</td>
    <td>git blame -L &lt;start&gt;,&lt;end&gt; &lt;file&gt;</td>
    <td>Display the commit history and authorship of lines within a specific range in a file.</td>
  </tr>
  <tr>
    <td>45</td>
    <td>git log --since=&lt;date&gt;</td>
    <td>View the commit history since a specific date.</td>
  </tr>
  <tr>
    <td>46</td>
    <td>git stash apply &lt;stash&gt;</td>
    <td>Apply changes from a specific stash without removing it from the stash list.</td>
  </tr>
  <tr>
    <td>47</td>
    <td>git checkout &lt;commit&gt;</td>
    <td>Switch to a specific commit, creating a detached HEAD state.</td>
  </tr>
  <tr>
    <td>48</td>
    <td>git reset &lt;commit&gt;</td>
    <td>Move the branch pointer to a specific commit and unstage the changes made since then, keeping them in the working directory (mixed reset, the default).</td>
  </tr>
  <tr>
    <td>49</td>
    <td>git log --follow &lt;file&gt;</td>
    <td>View the commit history of a file, even if it was renamed.</td>
  </tr>
  <tr>
    <td>50</td>
    <td>git mv &lt;old file&gt; &lt;new file&gt;</td>
    <td>Rename a file while automatically staging the change.</td>
  </tr>
  <tr>
    <td>51</td>
    <td>git stash drop &lt;stash&gt;</td>
    <td>Delete a specific stash from the stash list.</td>
  </tr>
  <tr>
    <td>52</td>
    <td>git remote rename &lt;old name&gt; &lt;new name&gt;</td>
    <td>Rename a remote repository in your Git configuration.</td>
  </tr>
  <tr>
    <td>53</td>
    <td>git log --stat</td>
    <td>View the commit history with a summary of file changes for each commit.</td>
  </tr>
  <tr>
    <td>54</td>
    <td>git rebase -i &lt;commit&gt;</td>
    <td>Interactively rebase commits starting from a specific commit.</td>
  </tr>
  <tr>
    <td>55</td>
    <td>git pull --rebase</td>
    <td>Fetch changes from a remote repository and rebase your local branch instead of merging.</td>
  </tr>
  <tr>
    <td>56</td>
    <td>git log --graph --oneline</td>
    <td>Visualize the commit history as a compact graph with one-line commit messages.</td>
  </tr>
  <tr>
    <td>57</td>
    <td>git remote prune origin --dry-run</td>
    <td>Preview the remote branches that would be pruned without actually deleting them.</td>
  </tr>
  <tr>
    <td>58</td>
    <td>git reflog show &lt;branch&gt;</td>
    <td>Show the reflog for a specific branch.</td>
  </tr>
  <tr>
    <td>59</td>
    <td>git clean -dfx</td>
    <td>Remove untracked files and directories, including ignored ones, from the working directory.</td>
  </tr>
  <tr>
    <td>60</td>
    <td>git cherry-pick --abort</td>
    <td>Abort the current cherry-pick operation.</td>
  </tr>
  <tr>
    <td>61</td>
    <td>git --version</td>
    <td>Show the installed Git version.</td>
  </tr>
  <tr>
    <td>62</td>
    <td>git add .</td>
    <td>Add all changes in the current directory to the staging area.</td>
  </tr>
  <tr>
    <td>63</td>
    <td>git diff -r HEAD &lt;file-name&gt;</td>
    <td>Show the differences between the working directory (staged and unstaged changes) and the last commit (HEAD) for a specific file.</td>
  </tr>
  <tr>
    <td>64</td>
    <td>git diff -r HEAD</td>
    <td>Show the differences between the working directory (staged and unstaged changes) and the last commit (HEAD) for all files.</td>
  </tr>
</table>


  
---
  
**Language and Library Information**  
  
CLI (Command Line Interface)
  
---
  
**Miscellaneous Notes**
  
NaN

## Storing data with Git
  
Time to explore the commit structure in detail. We'll examine how Git stores data and the process for viewing this information.
  
**The commit structure**
  
Git stores data through commits, which have three parts. The first is the commit itself, which contains metadata such as the author, commit message, and time of the commit. The second part is a tree, which tracks the names and locations of the files in the repo when that commit happened. The third part is the blob, short for binary large object: for each file listed in the tree, there is a blob containing a compressed snapshot of the file's contents when the commit happened. A blob may contain data of any kind.
  
<center><img src='../_images/storing-data-with-git.png' alt='img' width='740'></center>
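
The three parts can be inspected directly with `git cat-file`. The following is a minimal sketch, assuming a throwaway repository created with `mktemp`; the author identity and file contents are made up for illustration.

```sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "# Mental Health in Tech Survey" > report.md
git add report.md
git commit -q -m "Add report"

# The commit object: metadata plus a pointer to one tree
git cat-file -p HEAD

# The tree object: file names mapped to blob hashes
git cat-file -p 'HEAD^{tree}'

# The blob object: the file contents at commit time
git cat-file -p HEAD:report.md

# Ask Git for the type of each object
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
blob_type=$(git cat-file -t HEAD:report.md)
echo "$commit_type / $tree_type / $blob_type"
```

The last line prints `commit / tree / blob`, matching the three parts described above.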
  
**Visualizing the commit structure**
  
Here, we visualize three commits to our repo to see these three individual components.
  
<center><img src='../_images/storing-data-with-git1.png' alt='img' width='740'></center>
  
**Visualizing the commit structure**
  
In the first commit, we can see a unique identifier ending in 65. This identifier is known as a hash, which we will discuss later in this section. In the tree, we see that two files were modified: report.md and mental-health-survey.csv. The blobs show a snapshot of what the files contained at that time.
  
<center><img src='../_images/storing-data-with-git2.png' alt='img' width='740'></center>
  
**Visualizing the commit structure**
  
In the second commit, there are three files in the tree, but only two were modified - mental-health-survey.csv and a newly created summary-statistics.csv. Therefore, the tree links report.md to the blob from the previous commit, as it wasn't modified in this commit.
  
<center><img src='../_images/storing-data-with-git3.png' alt='img' width='740'></center>
  
**Visualizing the commit structure**
  
In the third and most recent commit, report.md and mental-health-survey.csv are modified, with updated blobs created and linked to the tree. The summary statistics file wasn't changed, so the tree links to the blob in the second commit.
  
<center><img src='../_images/storing-data-with-git4.png' alt='img' width='740'></center>
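
Blob reuse for unchanged files can be checked by comparing blob hashes across commits. This is a rough sketch in a throwaway repository; the file names follow the example above, while the file contents and author identity are invented.

```sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "draft" > report.md
echo "age,gender" > mental-health-survey.csv
git add .
git commit -q -m "Add report and survey data"

# The second commit changes only the survey file
echo "25,F" >> mental-health-survey.csv
git add mental-health-survey.csv
git commit -q -m "Add survey row"

# Blob hash of each file in the previous and the current commit
report_old=$(git rev-parse HEAD~1:report.md)
report_new=$(git rev-parse HEAD:report.md)
survey_old=$(git rev-parse HEAD~1:mental-health-survey.csv)
survey_new=$(git rev-parse HEAD:mental-health-survey.csv)

echo "report.md: $report_old -> $report_new"
echo "survey:    $survey_old -> $survey_new"
```

The two hashes for report.md are identical because both trees point to the same blob, while the survey file gets a new blob in the second commit.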
  
**Git log**
  
We can view commit information using the `git log` command, which will display all commits made to the repo in reverse chronological order, starting with the most recent. It will show the commit hash, author, date, and commit message. If the output doesn't fit in the terminal window, there will be a colon at the end of the output, indicating there are more commits. We can move through the history by pressing the space bar. When we want to exit the log, we can press `q` to return to the terminal.
  
<center><img src='../_images/storing-data-with-git5.png' alt='img' width='740'></center>
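
To see the ordering of `git log` for ourselves, we can build a small history and print it. This is only a sketch in a temporary repository; the commit messages and author identity are placeholders.

```sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

for n in 1 2 3; do
  echo "line $n" >> report.md
  git add report.md
  git commit -q -m "Commit $n"
done

# --no-pager prints everything at once instead of opening the pager
git --no-pager log

# One subject line per commit, in the same order as the full log
subjects=$(git --no-pager log --format=%s)
newest=$(echo "$subjects" | head -n 1)
oldest=$(echo "$subjects" | tail -n 1)
echo "first entry: $newest, last entry: $oldest"
```

The first entry is `Commit 3` and the last is `Commit 1`, so the most recent commit appears at the top.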
  
**Git hash**
  
Earlier, we mentioned a unique identifier called a hash. This is a 40-character string of numbers and letters (hexadecimal digits), like the one shown below. It is called a hash because Git produces it using a hash function (SHA-1 by default), which deterministically maps content to a fixed-length string. Hashes enable Git to share data efficiently between repos. If two files are the same, their hashes will be the same. Therefore, Git can tell what information needs to be saved in which location by comparing hashes rather than entire files.
  
<center><img src='../_images/storing-data-with-git6.png' alt='img' width='740'></center>
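
The claim that identical content yields identical hashes can be tried with `git hash-object`, which computes the hash Git would assign to a blob. A minimal sketch, with made-up input strings:

```sh
set -e
# Reading from stdin, so no files or repository are needed
hash_a=$(printf 'age,gender\n' | git hash-object --stdin)
hash_b=$(printf 'age,gender\n' | git hash-object --stdin)
hash_c=$(printf 'gender,age\n' | git hash-object --stdin)

echo "a: $hash_a"
echo "b: $hash_b"
echo "c: $hash_c"
echo "length of a: ${#hash_a} characters"
```

Here `hash_a` and `hash_b` match because the content is the same, while `hash_c` differs because the content changed.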
  
**Finding a particular commit**
  
Suppose we have observed issues since a particular date. We need to see what changed in a commit on that day to cause the issues. First we'll need to identify which commit caused the issue, so we run `git log` to view all commits and note the hashes for commits on the day in question. We only need to note the first six-to-eight characters of the hash. We can then run `git show` followed by the start of the hash for each commit in turn, until we find the commit we are looking for.
  
<center><img src='../_images/storing-data-with-git7.png' alt='img' width='740'></center>
  
**Git show output**
  
The output of `git show` displays the log entry for that commit at the top, followed by a diff showing the changes that commit made relative to its parent commit. At the bottom, we see that the added line appears to have data in the wrong order, with gender in the first column instead of the second. This looks like the source of the issue!
  
<center><img src='../_images/storing-data-with-git8.png' alt='img' width='740'></center>
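
The log-then-show workflow can be sketched as follows. The repository, the data rows, and the author identity are invented for illustration; only the commands themselves come from the text above.

```sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "age,gender" > mental-health-survey.csv
git add .
git commit -q -m "Add survey header"

# A row with the columns in the wrong order
echo "F,25" >> mental-health-survey.csv
git add .
git commit -q -m "Add first participant"

# List commits compactly to pick out the one to inspect
git --no-pager log --oneline

# Abbreviate the newest commit's hash to eight characters
short_hash=$(git rev-parse --short=8 HEAD)

# Show the log entry followed by the diff for that commit
shown=$(git --no-pager show "$short_hash")
echo "$shown"
```

In the diff, lines added by the commit start with `+`, so the suspicious row shows up as `+F,25`.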
  
**Let's practice!**
  
Now let's use the commands we have learned to navigate the Git commit structure!

### Interpreting the commit structure
  
The commit structure in Git is complex, but understanding how it works is essential for navigating storage and accessing specific versions of files.
  
The image displays three commits. What is the commit hash for the last updated version of `report.md`?
  
<center><img src='../_images/git-tree-structure.jpg' alt='img' width='850'></center>
  
---
  
Possible Answers
  
- [ ] bac5332a
- [ ] a845edcb
- [x] ebe93178
  
Correct commit interpretation! Two other files were changed in the last commit, so the second commit represents the most recent version of `report.md`.

### Viewing a repository's history
  
Recall that every commit has a unique identifier called a hash.
  
Git has a command you can use to display all commits made to a repo, along with the hash, author, and time of the commit.
  
Using the console, run a command to find the hash of the commit that added a summary report file.
  
---
  
Possible answers
  
- [ ] 7f71eade
- [ ] 1182c282
- [ ] 36b761e4
- [x] e39ecc89
  
Solution
  
```sh
git log
```
  
Awesome work—the `git log` command is very useful for tracking changes at a high level before diving deeper into more specific version control tasks!

### Viewing a specific commit
  
A common workflow with Git is to view all commits, then inspect what changed in a specific commit.
  
You are located in `mh_survey`.
  
In this exercise, you will look into the commit history for `report.md`.
  
---
  
1. Use a command to display the repo's history.
2. Use the hash from the second most recent commit to display the changes made to `report.md` in that commit.

```sh
git log
# The hash of the second most recent commit in the exercise starts with 36b761
git show 36b761
```

It's common to refer to the log and then perform a follow-up command to find a specific piece of information! Did you notice that one line was added to `report.md` in this particular commit?

## Viewing changes
  
We've seen how Git stores data as part of the commit process. Now we'll look at additional methods for comparing commits.
  
**The HEAD shortcut**
  
Recall that we previously used the `git diff -r HEAD` command to compare the current versions of files with the last commit. We can add a tilde and a number to `HEAD` to pick an earlier commit to compare against. For example, `HEAD~1` refers to the second most recent commit, while `HEAD~2` refers to the commit before that. Note that there must be no spaces before or after the tilde, or the command won't work.
  
<center><img src='../_images/viewing-changes-in-git.png' alt='img' width='740'></center>
  
**From commit hash to HEAD**
  
This diagram shows how `HEAD` maps to the commit. The last commit is referenced with `HEAD`, the one before with `HEAD~1`, and the one before that with `HEAD~2`.
  
<center><img src='../_images/viewing-changes-in-git1.png' alt='img' width='740'></center>
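
The mapping from `HEAD~n` to commit hashes can be confirmed with `git rev-parse`, which resolves a reference to its full hash. This is a small sketch in a temporary repository with placeholder commits.

```sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

for n in 1 2 3; do
  echo "line $n" >> report.md
  git add report.md
  git commit -q -m "Commit $n"
done

# Full hashes as listed by git log, one per line
hashes=$(git --no-pager log --format=%H)
second=$(echo "$hashes" | sed -n '2p')
third=$(echo "$hashes" | sed -n '3p')

# Resolve the HEAD~n references to hashes
head1=$(git rev-parse HEAD~1)
head2=$(git rev-parse HEAD~2)

echo "HEAD~1 -> $head1"
echo "HEAD~2 -> $head2"
```

`HEAD~1` resolves to the second hash in the log and `HEAD~2` to the third, so either form can be passed to commands such as `git show` or `git diff`.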
  
**Using HEAD with git show**
  
We can use a `HEAD` reference with `git show` to look at the changes made in a specific commit. Here, we run `git show HEAD~3` to look at the fourth most recent commit. The output shows the commit hash, author, date, log message, and diff. We can see that three lines were added to the report.md file.
  
<center><img src='../_images/viewing-changes-in-git2.png' alt='img' width='740'></center>
  
**What changed between two commits?**
  
We know that `git show` can be used to look at changes made in a particular commit, but what if we need to see changes between two commits? In this case we use the `git diff` command.
  
<center><img src='../_images/viewing-changes-in-git3.png' alt='img' width='740'></center>
  
**What changed between two commits?**
  
To see the difference between the fourth and third most recent commits, we can use `git diff` with their commit hashes, or we can use `HEAD` with a tilde and the numbers associated with those commits: `git diff HEAD~3 HEAD~2`.
  
<center><img src='../_images/viewing-changes-in-git4.png' alt='img' width='740'></center>
  
**What changed between two commits?**
  
The output shows that a fourth line was added at the end of the report.md file in the more recent of the two commits, with the contents of the line shown at the bottom of the output.
  
<center><img src='../_images/viewing-changes-in-git5.png' alt='img' width='740'></center>
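
As a rough illustration, the two ways of naming the commits give the same diff. The repository and file contents below are made up; `--name-only` is a standard `git diff` option that lists only the changed file names.

```sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

printf 'line 1\nline 2\nline 3\n' > report.md
git add report.md
git commit -q -m "Add three lines"

echo "line 4" >> report.md
git add report.md
git commit -q -m "Add a fourth line"

old=$(git rev-parse HEAD~1)
new=$(git rev-parse HEAD)

# The same comparison written two ways: by hash and by HEAD~n
by_hash=$(git --no-pager diff "$old" "$new")
by_head=$(git --no-pager diff HEAD~1 HEAD)
echo "$by_head"

# Only the names of the files that differ
changed=$(git --no-pager diff --name-only HEAD~1 HEAD)
echo "changed files: $changed"
```

The older commit goes first, so lines added in the newer commit are marked with `+`.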
  
**Changes per document by line**
  
Suppose we want to see who made the last change to each line of a file, and when the change took place. We can use the `git annotate` command followed by the filename. Here, we have used `git annotate` to see the changes made to report.md. In each line of the output there are five elements: the first eight characters of the hash, the author (Rep Loop), the time of the commit, the line number, and the contents of the line. This may not seem particularly useful in this instance, but if we are working as part of a large team where multiple people are editing a file, then `git annotate` is a quick way to see who added a specific line of content and when.
  
<center><img src='../_images/viewing-changes-in-git6.png' alt='img' width='740'></center>
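
A small sketch of `git annotate` on a file built up over two commits follows. The repository, contents, and author are invented, and the counting step assumes the tab-separated layout of the annotate output, with the abbreviated hash in the first field.

```sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

printf 'line 1\nline 2\n' > report.md
git add report.md
git commit -q -m "Add two lines"

echo "line 3" >> report.md
git add report.md
git commit -q -m "Add a third line"

# One output row per line of the file
git --no-pager annotate report.md

# Count the rows, then the distinct commits in the first field
rows=$(git --no-pager annotate report.md | wc -l | tr -d ' ')
commits=$(git --no-pager annotate report.md | cut -f1 | sort -u | wc -l | tr -d ' ')
echo "$rows lines from $commits commits"
```

The file has three lines that trace back to two commits: the first two lines share one hash and the third line has another.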
  
**Summary**
  
We've covered several commands, so let's briefly recap what each one does and when to use it. `git show` can be used with `HEAD` and a tilde to show what changed in a specific commit. `git diff hash1 hash2` will display changes between two commits, as will `git diff` with two different `HEAD` references, such as `git diff HEAD~3 HEAD~2`. Lastly, `git annotate file` will show line-by-line changes to a file and its associated metadata.
  
<center><img src='../_images/viewing-changes-in-git7.png' alt='img' width='740'></center>
  
**Let's practice!**
  
Now it's your turn to practice the various methods for comparing changes!

### Comparing to the second most recent commit
  
Being able to look at what happened in a specific commit is useful to check how files have changed over time.
  
Use an appropriate command to find out what changed in the second most recent commit.
  
Choose the answer reflecting what changes occurred.
  
---
  
Possible answers
  
- [ ] mental_health_survey.csv had three lines added.
- [ ] report.md had one line added.
- [x] mental_health_survey.csv had 47 lines added.
- [ ] report.md had three lines added.
  
Solution
  
```sh
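# Optional: list the commits to confirm which one is second most recent
git log
# Show the log entry and diff for the second most recent commit
git show HEAD~1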
```
  
Awesome work—combining `git show` with a commit hash or `HEAD` plus the required `~` is a great way to see what happened in a specific commit!

### Comparing commits
  
Not only can Git be used to check what changed in a specific commit, it also allows you to compare changes between commits!
  
Which files were modified between the fourth most recent and second most recent commits?
  
---
  
Possible answers
  
- [ ] report.md
- [ ] mental_health_survey.csv
- [x] report.md and mental_health_survey.csv
- [ ] report.md, summary_statistics.csv, and mental_health_survey.csv
  
Solution
  
```sh
git diff HEAD~3 HEAD~1
```
  
Yes! By using `git diff` you get an output showing that both of these files have been changed between these two commits!

### Who changed what?
  
Sometimes you need to see more detail than the commands you've used previously can provide.
  
Your task is to use an appropriate command to show changes such as author, change made, time of change, and commit hash, for report.md.
  
---
  
1. Display line-by-line changes and associated metadata for report.md.

```sh
git annotate report.md
```

```sh
$ git annotate report.md
e39ecc89        (  Rep Loop     2022-08-05 09:58:20 +0000       1)# Mental Health in Tech Survey
e39ecc89        (  Rep Loop     2022-08-05 09:58:20 +0000       2)TODO: write executive summary.
e39ecc89        (  Rep Loop     2022-08-05 09:58:20 +0000       3)TODO: include link to raw data.
36b761e4        (  Rep Loop     2022-08-05 09:58:21 +0000       4)TODO: remember to cite funding sources!
$
```
  
Awesome annotation skills! Notice that three lines were added in one commit and one more line added in a separate commit.