# Introduction to Git

<div style="text-align: right"> Wooyong Park, Yonsei University </div>
<div style="text-align: right"> Material from Datacamp </div>

## 1. Introduction to Version Control

* `pwd`: returns the current working directory
* `ls`: returns the files inside the current directory
* `cd [dirname]`: changes the directory to the specified directory name

### 1.1. Editing a File

* `nano [filename]`: opens a file in the nano editor to delete, add, or change its contents
* `echo`: can be used to create or edit a file
  * **Create** - `echo [content] > [filename]` (overwrites the file if it already exists)
  * **Edit** - `echo [content] >> [filename]` (appends to the end of the file)
* `Ctrl + O` & `Ctrl + X`: the nano shortcuts to save the file (`Ctrl + O`) and exit (`Ctrl + X`)
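
The two forms of `echo` redirection can be tried in a throwaway directory. This is a minimal sketch; the file name `notes.txt` and the temporary directory are assumptions made for illustration.

```shell
# Work in a temporary directory so no existing files are touched
workdir=$(mktemp -d)
cd "$workdir"

# ">" creates notes.txt (or overwrites it if it already exists)
echo "first line" > notes.txt

# ">>" appends to the end of the existing file
echo "second line" >> notes.txt

# Print the result: two lines, in the order they were written
cat notes.txt
```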

### 1.2. Checking Git Version

* `git --version`: returns which version of Git is installed

### 1.3. Saving Files

### 1.4. Repository (Git Project)

A Git project consists of two parts: (1) the files and directories we create and edit, (2) the Git storage with the name `.git`.

The combination of these two parts is called a repository, often referred to as a repo.

* **Staging**: Putting files in the staging area is like placing a letter in an envelope.
* **Committing**: Making a commit is like putting the envelope in a mailbox. After the commit, you can't make any further changes to it.

* `ls -a`: shows the hidden files in the current directory, including the `.git` directory

#### 1.4.1. Staging
* `git add [filename]`: adds a single file to the staging area
* `git add .`: adds all files in the current directory to the staging area

#### 1.4.2. Committing
* `git commit -m "log message here"`: `git commit` makes a commit and the `-m` flag adds a *log message* for the commit.

#### 1.4.3. Checking Status
* `git status`: tells us which files are in the staging area, and which files have changes that aren't in the staging area yet.
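
The staging, committing, and status commands above fit together as follows. This is a sketch in a temporary repo; the file name `report.md` and the user name and email are placeholders, not values from the course.

```shell
# Create a scratch repository with a placeholder identity
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "draft" > report.md
git status --short          # ?? report.md  (untracked, not staged)

git add report.md           # put the letter in the envelope
git status --short          # A  report.md  (staged)

git commit -q -m "Add report draft"   # put the envelope in the mailbox
git status --short          # prints nothing: no pending changes
```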

### 1.5. Comparing files

#### 1.5.1. Unstaged Version vs Committed Version

* `git diff [filename]`: compares the unstaged version of a file to the staged version, or to the last commit if nothing is staged

 The line with the two `@@` symbols tells us the location of the changes.\
 The line that starts with `-` symbol tells us the line that is erased in the unstaged version.\
 The line that starts with `+` symbol tells us what is added.

#### 1.5.2. Staged Version vs Committed Version

* `git diff HEAD [filename]`: compares the current version of a file, staged and unstaged changes together, to the last commit

`HEAD` is a shortcut for the most recent commit on the current branch.\
The course writes this command as `git diff -r HEAD [filename]`; the `-r` flag is not needed here, and it does not select a revision.

Note that the command above does not isolate the staged changes. To compare only the staged version to the last commit, use the following instead.

* `git diff --staged`: compares the staged version to the last commit.

#### 1.5.3. Multiple Staged Files vs Committed Version

* `git diff HEAD`: compares the current version of *all the files in the repo* to the last commit; use `git diff --staged` to compare only what is staged
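
A small experiment makes it easier to see what each form of `git diff` compares. The sketch below assumes a scratch repo and a hypothetical file `data.txt` with one committed line, one staged line, and one unstaged line.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > data.txt
git add data.txt
git commit -q -m "Add data"

echo "two" >> data.txt
git add data.txt              # "two" is now staged
echo "three" >> data.txt      # "three" stays unstaged

git diff data.txt             # working directory vs staging area: shows +three
git diff --staged data.txt    # staging area vs last commit: shows +two
git diff HEAD data.txt        # working directory vs last commit: shows +two and +three
```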

### 1.6. Storing data with Git

Git commits have three parts:

* **Commit**: contains the metadata
* **Tree**: tracks the names and locations in the repo
* **Blob**: binary large object (a compressed snapshot of a file's contents, which can be data of any kind)

* `git log`: displays all the commits made to the repo in reverse chronological order, starting with the most recent.

* Press `space` to show older commits.

* Press `q` to quit the log and return to the terminal.

#### 1.6.1. Git Hash

A hash is a unique identifier that enables Git to share data efficiently between repos. If two files are the same, their hashes will be the same. Therefore, Git can tell what information needs to be saved in which location by comparing hashes.

To find a particular commit, we first run `git log`. After that, we copy the first 6-8 characters of the hash and run `git show [hash 6-8 first characters]` to see the content of that particular commit.

* `git log`: opens the commit log
* `git show [hash 6-8 first characters]`: shows the details of the commit with the specified hash
* `git diff [hash1] [hash2]`: compares the two commits with the specified hashes
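
As a sketch of this lookup, the commands below record the short hash of each commit with `git rev-parse --short` instead of copying it from the log by hand. The file `todo.txt` and its contents are made up for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "A) Write report." > todo.txt
git add todo.txt
git commit -q -m "Add first task"
first=$(git rev-parse --short HEAD)     # abbreviated hash of the first commit

echo "B) Submit report." >> todo.txt
git commit -q -am "Add second task"
second=$(git rev-parse --short HEAD)    # abbreviated hash of the second commit

git log --oneline               # one line per commit: short hash and message
git show "$second"              # metadata and changes of the second commit
git diff "$first" "$second"     # what changed between the two commits
```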

### 1.7. Viewing Changes

#### 1.7.1. The `HEAD` shortcut

Use `~` to pick a specific commit to compare versions.

* `HEAD~1`: the second most recent commit
* `HEAD~2`: the third most recent commit

**NOTE**: do not put spaces before or after the tilde `~`

* `git show HEAD~3`: shows the details of the fourth most recent commit
* `git diff HEAD~2 HEAD~1`: compares the third most recent commit and the second most recent commit
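
To check which commit each `HEAD~n` points to, one can print only the commit messages. The sketch assumes three commits whose messages are simply `first`, `second`, and `third`.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Make three commits; each one appends a line to log.txt
for n in first second third; do
  echo "$n" >> log.txt
  git add log.txt
  git commit -q -m "$n"
done

git show --no-patch --format=%s HEAD      # third  (most recent)
git show --no-patch --format=%s HEAD~1    # second
git show --no-patch --format=%s HEAD~2    # first
git diff HEAD~2 HEAD~1                    # shows the line added by "second"
```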

#### 1.7.2. Changes per document by line

* `git annotate [filename]`: displays, for each line of the file, the commit that last changed it (hash, author, time, line number, line content)

### 1.8. Undoing changes before Commit

#### 1.8.1. Unstaging a File
* `git reset HEAD [filename]`: unstages a single file from the staging area
* `git reset HEAD`: unstages all files from the staging area

Why do we need `HEAD` when unstaging files from the staging area? With `git reset HEAD`, we instruct Git to make the staging area match the state of the last commit. Thus, we name `HEAD` so that the last buffer zone (i.e. the staging area) is reset to the last commit, while the edits in the working directory are kept.

#### 1.8.2. Undoing changes to an unstaged file
* `git checkout -- [filename]`: reverts the unstaged changes to a file, restoring the committed version
* `git checkout .`: reverts the unstaged changes to all files in the current directory and any subdirectories, restoring the committed versions

`checkout` means switching to a different version (by default, the last commit).

#### 1.8.3. The sequence of unstaging, undoing changes, making changes, restaging, recommitting

```
git reset HEAD                   # unstage everything
git checkout .                   # discard the unstaged changes
nano [filename]                  # make the new changes
git add .                        # restage
git commit -m "log message"      # recommit
```
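
The undo part of this sequence can be checked end to end in a scratch repo. In this sketch the file `todo.txt` and its contents are hypothetical, and the edit is made with `echo` rather than `nano` so the commands run without interaction.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "original" > todo.txt
git add todo.txt
git commit -q -m "Add todo"

echo "mistake" >> todo.txt       # an edit we regret
git add todo.txt                 # ...and have already staged

git reset -q HEAD todo.txt       # unstage: the edit stays in the working directory
git checkout -- todo.txt         # discard the edit itself

cat todo.txt                     # original
git status --short               # prints nothing: back to the committed state
```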

### 1.9. Restoring and reverting

#### 1.9.1. Customizing the log output
If the project is large, `git log` alone displays an excessive number of commits. Customizing the log output keeps it manageable.

* `git log -3`: restricts the output to the three most recent commits
* `git log -3 [filename]`: displays the three most recent commits of the specified file
* `git log --since='Month Day Year'`: displays only the commits made since the specified date
* `git log --since='M D Y' --until='M D Y'`: displays only the commits made between the specified dates
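
A quick way to see the effect of these options is to build a short history and filter it. The five commit messages and the file names below are made up for the sketch, and `--format=%s` is added only to keep the output to one message per line.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Five commits to notes.txt; the last one also adds other.txt
for n in 1 2 3 4 5; do
  echo "entry $n" >> notes.txt
  if [ "$n" = "5" ]; then echo "extra" > other.txt; fi
  git add .
  git commit -q -m "commit $n"
done

git log -3 --format=%s                      # commit 5, commit 4, commit 3
git log -1 --format=%s other.txt            # commit 5 (the only commit touching other.txt)
git log --since='Jan 1 2000' --format=%s    # all five: every commit is newer than that date
```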

#### 1.9.2. Restoring an old version of a file
* `git checkout -- [filename]`: restores the file to its version in the last commit, discarding unstaged changes
* `git checkout [hash 6-8] [filename]`: restores the file to its version in the specified commit
* `git checkout HEAD~1 [filename]`: restores the file to its version in the second most recent commit
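
The sketch below restores an older version of a single file. The file name `report.md` and its two versions are assumptions for the example. Note that restoring from a specific commit also stages the restored version.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "version 1" > report.md
git add report.md
git commit -q -m "First version"

echo "version 2" > report.md
git commit -q -am "Second version"

# Bring back the file as it was one commit before HEAD
git checkout HEAD~1 report.md

cat report.md            # version 1
git status --short       # M  report.md  (the restored version is already staged)
```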

#### 1.9.3. Restoring a repo to a previous state
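* `git checkout --`: with no path given, leaves the files as they are; add a path such as `.` to restore files to the last commit
* `git checkout [hash 6-8]`: switches the whole repo to its state at the specified commit
* `git checkout HEAD~1`: switches the whole repo to its state at the second most recent commit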

#### 1.9.4. Cleaning a repository
* `git clean -n`: lists the untracked files that would be deleted, without deleting them
* `git clean -f`: deletes the files that are not being tracked
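
Since `git clean -f` cannot be undone, a dry run first is a sensible habit. The sketch assumes one tracked file (`kept.txt`) and one untracked file (`scratch.tmp`), both hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "tracked" > kept.txt
git add kept.txt
git commit -q -m "Add kept.txt"

echo "temporary" > scratch.tmp   # never added, so Git does not track it

git clean -n     # dry run: prints "Would remove scratch.tmp"
git clean -f     # actually deletes scratch.tmp
ls               # kept.txt
```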

## 2. Git Workflows

### 2.1. Configuring Git

* `git config --list`: displays the settings that are currently set, together with their values

Git has three levels of settings:

1. `--local`: settings for one specific project
2. `--global`: settings for all of the current user's projects
3. `--system`: settings for every user on the computer

An example of `git config --list`:

```
(base) Tommysui-MacBookPro:~ tommy$ git config --list
credential.helper=osxkeychain
init.defaultbranch=main
filter.lfs.clean=git-lfs clean -- %f
filter.lfs.smudge=git-lfs smudge -- %f
filter.lfs.process=git-lfs filter-process
filter.lfs.required=true
user.name=wyeconomics
user.email=tommypark822@gmail.com
```

#### 2.1.1. Changing the Settings
* `git config --global [setting] [value]`: changes the particular setting to the specified value

#### 2.1.2. Using an alias
By setting `alias.[aliasname]` with `git config`, one can create an alias for a Git command. This is typically used to shorten a command.

* `git config --global alias.[aliasname] [command]`: creates a global alias for the specified command

The following is a nice example.
```
git config --global alias.unstage 'reset HEAD'
```

Then entering `git unstage` would be equivalent to `git reset HEAD`.
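
To try the alias without touching the global configuration, the sketch below sets it at the local level of a scratch repo (no `--global`); that choice, and the file name `plan.txt`, are assumptions for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Local alias: stored in this repo's .git/config only
git config alias.unstage 'reset HEAD'

echo "draft" > plan.txt
git add plan.txt
git commit -q -m "Add plan"

echo "more" >> plan.txt
git add plan.txt              # stage the change
git unstage plan.txt          # same as: git reset HEAD plan.txt

git status --short            #  M plan.txt  (modified, no longer staged)
git config --get alias.unstage   # reset HEAD
```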

#### 2.1.3. Tracking aliases
Git tracks aliases by storing them in a `.gitconfig` file. One can list them, along with the other global settings, by calling `git config --global --list`.

#### 2.1.4. Ignoring specific files
We can ignore certain files by creating a file called `.gitignore`.
* `nano .gitignore`: opens the `.gitignore` file for editing, creating it if it does not exist.

Using the `.gitignore` file, we can specify which files should be ignored. For example, if we add `*.log` to the `.gitignore` file, Git will ignore any file ending with `.log`.
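
The `*.log` pattern can be verified with `git status` and `git check-ignore`. In this sketch the file `debug.log` is a made-up example, and `.gitignore` is written with `echo` instead of `nano` so the commands run unattended.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "*.log" > .gitignore       # ignore every file ending with .log
echo "noise" > debug.log        # matches the pattern

git status --short              # ?? .gitignore   (debug.log is not listed)
git check-ignore -v debug.log   # shows which .gitignore line matched the file
```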

### 2.2. Branches

Git uses **branches** to systematically track multiple versions of files.\
Each branch is a separate line of commits, so work committed on one branch does not affect the others until the branches are merged.\
Suppose we want to merge two branches. Then,

* `source`: is the branch we want to merge *from*
* `destination`: is the branch we want to merge *to*

* `git branch`: displays what branches exist for the project (the one with the asterisk `*` is the branch the user is currently on)
* `git checkout -b [branchname]`: creates a new branch and moves to it
* `git diff branch1 branch2`: displays the differences between branches
* `git checkout [branchname]`: moves us to the specified *existing* branch
* `git checkout [destination]` followed by `git merge [source]`: merges the source branch into the destination branch; `git merge` always merges into the branch that is currently checked out

When working on projects, developing across different components is common. Suppose we want to test some new ideas, but we don't want to change our existing code until we have confirmed that they work. This is why switching branches is helpful: it allows us to keep making progress on several tasks concurrently.

One can think of the `main` branch as the **ground truth**. Each branch exists for a specific task, and once the task is complete and its result is confirmed to work, it is merged into the `main` branch.
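
The branch workflow described above can be sketched as follows. The branch name `experiment` and the file names are hypothetical, and the name of the default branch is read from Git rather than assumed to be `main`.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "Add stable code"
main_branch=$(git symbolic-ref --short HEAD)   # name of the default branch

git checkout -q -b experiment        # create a branch for the new idea
echo "new idea" > idea.txt
git add idea.txt
git commit -q -m "Try a new idea"

git checkout -q "$main_branch"       # move to the destination branch
ls                                   # app.txt  (the idea is not here yet)
git merge -q experiment              # merge the source branch into it
ls                                   # app.txt  idea.txt
```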

### 2.3. Handling Conflict

#### 2.3.1. Git conflicts
If the same lines of a file are edited in two different branches, merging them will cause a conflict. After the merge attempt, we can locate the source of the conflict by opening the file with `nano [filename]`.

```
<<<<<<< HEAD
=======
A) Write report.
B) Submit report.
>>>>>>> [branch2]
C) Submit expenses.
```

* `<<<<<<< HEAD`: indicates that the lines between it and `=======` contain the file's contents in the latest commit of the *current branch*.
* `=======`: marks the center of the conflict and separates the two versions. In the example, it sits directly below the first line, which indicates that the current branch has no content at this location and that the lines beneath it (A and B) come from the other branch. If there is content both above and below the equal signs, the two branches have different content on the same lines.
* `>>>>>>>`: marks the end of the conflict and names the second branch.
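
To see these markers appear, one can provoke a conflict on purpose. In the sketch below, both branches reword the same line of a hypothetical `todo.txt`; the conflict is then resolved by writing the final content and committing. In practice the file would be edited by hand in `nano`.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "A) Write report." > todo.txt
git add todo.txt
git commit -q -m "Add task"
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b branch2
echo "A) Draft report." > todo.txt           # one wording on branch2
git commit -q -am "Reword on branch2"

git checkout -q "$main_branch"
echo "A) Write the report." > todo.txt       # another wording on the main branch
git commit -q -am "Reword on main branch"

# The merge stops because both branches changed the same line
git merge branch2 || echo "merge stopped: conflict in todo.txt"
cat todo.txt                                 # shows the conflict markers

# Resolve: keep one version, remove the markers, then stage and commit
echo "A) Write the report." > todo.txt
git add todo.txt
git commit -q -m "Merge branch2, keeping the main branch wording"
```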

#### 2.3.2. How do we avoid conflicts?

* Prevention is better than cure.
* Use each branch for a specific task.
* Avoid editing a file in multiple branches.

### 2.4. Erasing Git from a project

#### 2.4.1. Removing Git
* `rm -rf .git`: removes Git from the project

#### 2.4.2. Verifying Git is removed
* `git status`: verifies whether Git is removed; if it is, Git reports a fatal error saying that the directory is not a Git repository

## 3. Collaborating with Git

### 3.1. Local Repositories

#### 3.1.1. Creating repos
* `git init [reponame]`: creates a new repository
* `git init`: converts an existing project(directory) into a Git repository

#### 3.1.2. Caution for nested repositories
A nested repo is a Git repo inside another Git repo. There will be two `.git` directories in the project, so Git wouldn't be able to identify which one to update.

### 3.2. Pulling remotes
* **Local Repos**: are repos stored on the computer
* **Remote Repos**: are repos stored elsewhere, which help us collaborate with colleagues (mostly through **GitHub**)

#### 3.2.1. Cloning a repo
Cloning is copying existing repos, local or remote, to the local computer's current working directory.

* `git clone [local repo path]`: clones a local repo
* `git clone [local repo path] [new name]`: clones a local repo and gives new name
* `git clone [URL]`: clones a remote repo

Whenever we clone a repo, Git stores a remote tag in the new repo's configuration to track where the original was. If we are in a repo, we can check the names of its remotes through `git remote` or `git remote -v`.
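
Cloning a local repo is enough to see the remote being recorded. The directory names `original` and `copy` are placeholders chosen for this sketch.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# Build a small repo to clone from
git init -q original
cd original
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Project" > README.md
git add README.md
git commit -q -m "Add README"
cd ..

git clone -q original copy    # clone the local repo under a new name
cd copy
git remote                    # origin
git remote -v                 # origin plus the path it was cloned from (fetch and push)
```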

#### 3.2.2. Naming a remote
When cloning, Git automatically names the remote `origin`.\
We can also name a remote ourselves, much like writing `\label{tag}` in LaTeX to create a label.

* `git remote add [name] [URL]`: adds a remote with the specified name for the given URL

#### 3.2.3. Gathering from a remote
* `git fetch [remote name] [branch]`: downloads the specified branch from the remote without changing the local files; the result is available as `[remote name]/[branch]`
* `git merge [remote name]/[branch]`: merges the fetched remote branch into the current local branch

`git pull` is equivalent to running `git fetch` followed by `git merge`.
* `git pull [remote name] [branch]`: fetches the specified branch from the remote and merges it into the current local branch

It is important to save locally before pulling from a remote. If a change made in the local repo is neither staged nor committed when we try to pull from a remote, Git tells us that local changes would be overwritten and aborts the pull.
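
Fetching and merging can be rehearsed without a server by using a bare local repo as the remote. Everything named here (`seed`, `remote.git`, `alice`, `bob`, `notes.txt`, the identity) is hypothetical and only sets up the scenario.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
# Placeholder identity for all commits made in this sketch
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# A seed repo, a bare copy acting as the remote, and two clones of it
git init -q seed
cd seed
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial notes"
branch=$(git symbolic-ref --short HEAD)
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git alice
git clone -q remote.git bob

# A colleague publishes a new commit
cd alice
echo "v2" >> notes.txt
git commit -q -am "Update notes"
git push -q origin "$branch"

# We fetch it, then merge it into our local branch
cd ../bob
git fetch -q origin "$branch"     # local files are still unchanged here
git merge -q "origin/$branch"     # now notes.txt contains the new line
cat notes.txt
```

Running `git pull origin "$branch"` in `bob` would have performed the last two Git commands in one step.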

### 3.3. Pushing to a remote

#### 3.3.1. Pushing to a remote
After saving changes locally, one can push the new commits to a remote.
* `git push [remote name] [local branch]`: pushes the commits on the local branch to the specified remote

#### 3.3.2. Push/pull workflow
Pulling from the remote comes before pushing to it. If the remote contains commits that the local repo lacks, Git rejects the push, and the remote's changes may conflict with ours. Thus, start the workflow by pulling from the remote.
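
The sketch below shows what can happen when the pull is skipped: the push is rejected until the remote's commits are pulled in. It reuses a bare local repo as a stand-in remote; all names are hypothetical. The `--no-rebase` and `--no-edit` options are added so the pull merges without asking questions; they are choices made for this example, not part of the course.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

git init -q seed
cd seed
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial notes"
branch=$(git symbolic-ref --short HEAD)
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git alice
git clone -q remote.git bob

# A colleague pushes first
cd alice
echo "from alice" > alice.txt
git add alice.txt
git commit -q -m "Add alice.txt"
git push -q origin "$branch"

# We commit without pulling, so our history lacks the colleague's commit
cd ../bob
echo "from bob" > bob.txt
git add bob.txt
git commit -q -m "Add bob.txt"
git push -q origin "$branch" || echo "push rejected: pull first"

# Pull (fetch + merge), then push again
git pull -q --no-rebase --no-edit origin "$branch"
git push -q origin "$branch"
```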