# Introducing the `git` version control system

* A [version control](https://en.wikipedia.org/wiki/Version_control) system helps keep track of changes in software source code.
* With a version control system, trying out risky changes becomes easier, because an earlier working state can be restored.
* As of the late 2010s, [`git`](https://en.wikipedia.org/wiki/Git) is one of the [available version control software packages](https://en.wikipedia.org/wiki/List_of_version_control_software).
* Linus Torvalds created `git` in 2005 to maintain the Linux kernel.
* `git` is an [open source](https://github.com/git/git) distributed version control system. A repository may have remote versions and local versions that are (practically) identical.



[![Git Data Transport Commands](https://images.osteele.com/2008/git-transport.png)](https://blog.osteele.com/2008/05/my-git-workflow/)

[[ref0](https://git-scm.com/book/en/v2), [ref1](https://github.com/progit)]

| command | expected behavior | example |
|:-------:|:-----------------:|:-------:|
| `init` | initialize a git repository | `git init` |
| `clone` | clone a git repository | `git clone <repo url>`<br>`git clone file://<path>` |
| `log` | list the commit history | `git log`<br>`git log --help`<br>`git log --stat`<br>`git log --oneline --graph --all` |
| `status` | current status of a git repository | `git status` |
| `diff` | show changes since the last commit and/or staging | `git diff`<br>`git diff HEAD`<br>`git diff HEAD^` |
| `config` | list or adjust configuration | `git config --list`<br>`git config --global --unset credential.helper` |
| `config user.name` | specify the user's name  | `git config user.name <your name>` |
| `config user.email` | specify the user's email address  | `git config user.email <your email>` |
| `remote` | manage remote repositories | `git remote add origin <remote repo>` |
| `add` | stage some change to commit | `git add <path to a changed file>`<br>`git add -p` |
| `commit` | record the staged changes as a new commit | `git commit`<br>`git commit -m <message>` |
| `push` | upload the changes to a remote repository | `git push`<br>`git push -u origin <branch name>` |
| `checkout` | switch the workspace to a certain commit or branch, or restore a file | `git checkout <commit hash>`<br>`git checkout -b <new branch>`<br>`git checkout -- <file to undo>` |
| [`branch`](https://git-scm.com/docs/git-branch) | manage branches | `git branch`<br>`git branch -r` |
| `blame` | show which commit last changed each line of a file | `git blame <file path>`|
| [`rebase`](https://git-scm.com/docs/git-rebase) | move current branch on top of another branch | `git rebase <branch>`<br>`git rebase -i <commit>` |
| [`merge`](https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging) | merge another branch to the current branch | `git merge --no-ff <other branch>`|
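
To see how a few of the commands in the table fit together, the following is a minimal sketch of one stage-and-commit cycle. It assumes a throwaway directory created with `mktemp`; the file name `notes.txt` and the identity `ABC` / `abc@example.com` are placeholders, not part of any real project.

```shell
set -e  # stop at the first failing command

# Work in a throwaway directory so no existing repository is touched.
demo=$(mktemp -d)
cd "$demo"

git init -q                               # create an empty repository
git config user.name "ABC"                # placeholder identity, stored in this repository only
git config user.email "abc@example.com"

echo "first line" > notes.txt
git status --short                        # notes.txt is listed as untracked (??)
git add notes.txt                         # stage the new file
git commit -q -m "Add notes.txt"          # record the staged change

echo "second line" >> notes.txt
git --no-pager diff                       # unstaged change since the last commit
git add notes.txt
git commit -q -m "Extend notes.txt"

git --no-pager log --oneline              # two commits, newest first
```

Because `git config` is used without `--global`, the placeholder identity applies to this demo repository only.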



# Practice

1. Go to the github website and log in.
1. Go to a repository that interests you.<br>As an example, this page uses Wes McKinney's [Python for Data Analysis](https://github.com/wesm/pydata-book).<br>
Its repository address is : https://github.com/wesm/pydata-book
1. Let's try cloning the repository.<br>
`git clone https://github.com/wesm/pydata-book`<br>
1. Now try `cd pydata-book` and `ls -a` commands.
<br>Note if a folder `.git` is visible.
1. Enter `pwd` to check the current full path.<br>
Let's assume the folder is : `/home/user/Documents/pydata-book`
1. `git remote` would list the names of the available remote repositories.
1. `git remote get-url origin` would show the link to the `origin` repository.<br>If developers contribute to the [Python for Data Analysis](https://github.com/wesm/pydata-book), you would be able to update this repository using `git pull origin`.
1. If your name and email address are "ABC" and abc@naver.com respectively, enter `git config user.name ABC` and `git config user.email abc@naver.com`.
1. `git config --list` would show configurations of this repository.
1. Try `echo "test" > test.txt` to create a sample text file.
1. `git status` would show:
<br>The current branch 
<br>Sync with the branch of the remote repository
<br>One file that `git` is not tracking
1. Enter `git add test.txt`
1. `git status` would show:
<br>Branch and sync information would not change
<br>One file added to the stage (or index) to be committed
1. Enter `git commit -m "Added test.txt"`<br>
`git` would show messages.
1. Check `git status`.
1. Now `git log --stat` would show the hash value, date & time, your name & email, commit message, and the changed files of each commit.
1. Open the new file using an editor : `vi test.txt` or `nano test.txt`.
1. Add one more line, save, and exit the editor.
1. `git status` would show one file is changed.
1. `git diff` would show the changes in the files.
1. `git add test.txt` and `git commit -m "Changed test.txt"` would commit the file.
1. Check `git status` and `git log --stat`.
1. `git log --oneline --graph --all` would show the commit tree.
1. Enter `cd ..` to move up one level.
1. Enter `git clone /home/user/Documents/pydata-book temp`.<br>`git` would clone another repository into the folder `temp`. (This example just shows that cloning a local repository is possible.)
1. Enter `cd temp` and `git log`.
1. Enter `git remote`.
1. Enter `git remote get-url origin`. You would be able to see the remote repository location.

1. Try `git config --list`
1. Try `git remote add upstream https://github.com/wesm/pydata-book`
1. Now try `git remote` and/or `git remote get-url upstream`.<br>`git pull upstream <branch name>` would update this local repository with the commits of that upstream branch.
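
The later steps of the practice (cloning from a local path and adding a second remote) can also be tried without network access. The sketch below is one possible offline variant: a small local repository named `project` stands in for the GitHub repository, and all paths, file contents, and the identity `ABC` / `abc@example.com` are placeholders.

```shell
set -e  # stop at the first failing command

base=$(mktemp -d)

# Stand-in for the original project (replaces the GitHub URL in this sketch).
git init -q "$base/project"
cd "$base/project"
git config user.name "ABC"
git config user.email "abc@example.com"
echo "test" > test.txt
git add test.txt
git commit -q -m "Added test.txt"

# Clone through a local path, as in the practice step with `temp`.
cd "$base"
git clone -q "$base/project" temp
cd temp
git remote                      # prints: origin
git remote get-url origin       # prints the path of the source repository

# Register the same project a second time under the name `upstream`.
git remote add upstream "$base/project"
git remote get-url upstream

# A new commit appears in the original project ...
cd "$base/project"
echo "more" >> test.txt
git commit -q -am "Changed test.txt"
branch=$(git symbolic-ref --short HEAD)   # name of the default branch

# ... and the clone picks it up from `upstream`.
cd "$base/temp"
git pull -q upstream "$branch"
git --no-pager log --oneline
```

The branch name is read with `git symbolic-ref` rather than hard-coded, since the default branch name depends on the git version and configuration.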



## Creating a `github` account



[![github](https://avatars1.githubusercontent.com/u/9919?s=200&v=4)](https://www.github.com)

* [`github`](https://www.github.com) is one of the remote [repository hosting services](https://en.wikipedia.org/wiki/Comparison_of_source_code_hosting_facilities#Version_control_systems) for `git`.
* [`dev.naver.com`](https://dev.naver.com) used to provide such a service until a few years ago.
* `github` also has an [education](https://education.github.com) service.
* Creating an account may require verifying your email address.



* A free user account can create an unlimited number of public repositories.
* Usually a github repository address has the following form:<br>`https://github.com/<github id>/<repository name>(.git)`<br>
ex : [`https://github.com/tensorflow/tensorflow.git`](https://github.com/tensorflow/tensorflow)
* A user can `fork` a public repository.<br>ex : `https://github.com/<github id>/tensorflow.git`<br>A fork is roughly equivalent to a clone on github, under your own id, that keeps a link to the original repository.
* If you plan to use only one user account for a specific repository, the following command is possible.<br>`git remote add <remote name> https://<github id>@github.com/<github id>/<repository name>(.git)`

* With an academic email address and a school ID card image, an instructor (or a [student](https://education.github.com/pack)) may upgrade to an education account, which makes it possible to create private repositories.
* Depending on the situation, an instructor may create an organization on github; then a repository address may have the following form :<br>`https://(<github id>@)github.com/<organization id>/<repository name>(.git)`
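
As an illustration of the address forms above, the sketch below registers both the plain form and the form with an embedded user id as remotes of an empty throwaway repository. The id `abc`, the repository name `my-project`, and the remote name `origin-with-id` are hypothetical; `git remote add` only records the address and does not contact the server.

```shell
set -e  # stop at the first failing command

# Hypothetical values; substitute your own github id and repository name.
github_id="abc"
repo_name="my-project"

demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"

# Plain HTTPS address of the repository.
git remote add origin "https://github.com/$github_id/$repo_name.git"

# Same repository with the user id embedded in the address.
git remote add origin-with-id "https://$github_id@github.com/$github_id/$repo_name.git"

git remote -v    # lists both remotes with their fetch and push addresses
```

With the id embedded, git already knows which account to authenticate when the remote is actually used.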



### Authentication

* To prevent unauthorized changes to the source code, a remote repository may require authentication with an id and password.
* To improve productivity during frequent pushes, `git` can use a credential helper.
* A credential helper stores the authentication information, possibly after encrypting it.
* The following command shows the current (global) credential helper:<br>`git config (--global) credential.helper`
* However, stored credentials are sensitive, so please use a helper with caution.
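
As a minimal sketch of configuring a helper, the example below selects git's `cache` helper for a single throwaway repository. The one-hour timeout is an arbitrary choice for illustration, and whether a particular helper is available depends on the platform and the git installation.

```shell
set -e  # stop at the first failing command

demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"

# Keep credentials in memory for one hour (3600 seconds).
# Without --global, the setting applies to this repository only.
git config credential.helper 'cache --timeout=3600'

# Show the value stored in this repository's configuration.
git config --local credential.helper
```

`git config --unset credential.helper` would remove the setting again.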



## Creating branches and switching between them

* Assume you want to test a *radical* new feature; if successful, it would be a great new addition.
* However, you want the existing code base intact until success is certain.
* Then you can start a new branch.<br>Only when the new feature is successful, you would merge into the existing code base.

[![git branch](https://git-scm.com/book/en/v2/images/advance-master.png)](https://git-scm.com/book/en/v2/Git-Branching-Branches-in-a-Nutshell)

1. `git branch (--list)` would list branches of the local repository.<br>`git branch -r` to list remote branches.
1. `git branch <new branch name>` would start a new branch.
1. `git checkout <new branch name>` would switch to the new branch.
1. `git checkout -b <new branch name>` would do both steps above.
1. From now on, this new branch would accumulate commits.
1. After a few commits, when `git status` shows no uncommitted changes, try `git checkout <previous branch name>`.<br>Then check the files that you changed in the previous steps.
1. And then try `git checkout <new branch name>` again. What happened to your changes?
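
The steps above can be sketched in a throwaway repository as follows. The branch name `radical-feature`, the file names, and the identity `ABC` / `abc@example.com` are placeholders chosen for this illustration.

```shell
set -e  # stop at the first failing command

demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
git config user.name "ABC"
git config user.email "abc@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "Initial commit"
previous=$(git symbolic-ref --short HEAD)   # name of the starting branch

git checkout -q -b radical-feature          # create the new branch and switch to it
echo "experiment" > feature.txt
git add feature.txt
git commit -q -m "Try a radical feature"

git checkout -q "$previous"                 # back on the starting branch
ls                                          # feature.txt is absent here

git checkout -q radical-feature             # return to the new branch
ls                                          # feature.txt is back
git --no-pager branch --list                # both branches; * marks the current one
```

The committed file is not lost when switching away; it belongs to the commits of `radical-feature` and reappears when that branch is checked out again.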

## Synchronizing after fork or distribution

* When you click on the `fork` button of a repository, you can duplicate it so that you can make changes.
* However, the developers may continue to work on the original (or *upstream*) repository, fixing bugs and adding more features.
* At some point, you may want to update your duplicate repository.
* [Github](https://help.github.com/articles/syncing-a-fork/) describes the procedure to synchronize a fork repository with the upstream repository.

1. If not done yet, clone your remote fork repository to a local repository.
1. `git remote` will list names of remote repositories. Let's assume `origin` points to your fork repository.
1. Add the *upstream* repository address as `upstream`. <br>`git remote add upstream <upstream repository address>`
1. `git fetch upstream` would download updates from the upstream repository. However, this alone would not change your workspace yet.
1. Try `git log --oneline --graph --all`.  This would show you all the histories of local and remote branches.
1. Choose one of the local branches that you want to update.  Let's assume its name is `first_branch`.
1. Try `git rebase upstream/first_branch first_branch`. This would switch to `first_branch` and replay its commits on top of `upstream/first_branch`, so that your local branch contains the upstream commits made after the fork.<br>Depending on the situation, conflicts may occur; then we should manually [resolve](https://git-scm.com/book/en/v2/Git-Tools-Advanced-Merging) them.
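1. Now run `git push origin first_branch` to apply the new commits to the remote fork repository.<br>If the rebase rewrote commits that were already pushed to the fork, `git` would reject a plain push; `git push --force-with-lease origin first_branch` would then be needed, so please use it with caution.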
1. Repeat from `git log --oneline --graph --all` for all branches of interest.
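
The procedure above can be rehearsed offline before trying it on a real fork. The sketch below is a simulation under stated assumptions: two local bare repositories, `upstream.git` and `fork.git`, play the roles of the upstream repository and your fork on the server, and the directory names, file contents, and identities are placeholders.

```shell
set -e  # stop at the first failing command

base=$(mktemp -d)
cd "$base"

# 1. A stand-in upstream project with one commit.
git init -q seed
cd seed
git config user.name "Upstream Dev"
git config user.email "dev@example.com"
echo "v1" > README.txt
git add README.txt
git commit -q -m "Initial commit"
branch=$(git symbolic-ref --short HEAD)   # name of the default branch
cd "$base"
git clone -q --bare seed upstream.git     # plays the role of the upstream repository

# 2. "Fork" it: a second bare copy plays the role of your fork.
git clone -q --bare upstream.git fork.git

# 3. Clone the fork; `origin` now points to fork.git.
git clone -q fork.git local
cd local
git config user.name "ABC"
git config user.email "abc@example.com"
git remote add upstream "$base/upstream.git"

# 4. Meanwhile, upstream gains a commit.
cd "$base/seed"
echo "v2" >> README.txt
git commit -q -am "Upstream fix"
git push -q "$base/upstream.git" "$branch"

# 5. Fetch, rebase the local branch onto upstream, and publish to the fork.
cd "$base/local"
git fetch -q upstream
git rebase -q "upstream/$branch" "$branch"
git push -q origin "$branch"
git --no-pager log --oneline --graph --all
```

In this simulation the local branch has no commits of its own, so the rebase simply moves it forward to the upstream commit and the push to the fork needs no force option.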

## Travis-CI and Continuous Integration

* In short, if you have an open source software project, [Travis-CI](https://www.travis-ci.org) can build it, run its tests, and report the results as specified.
* Please refer to the [Travis-CI documentation](https://docs.travis-ci.com/) for details.

## Exercises

### 00 : Your first commit

1. Clone your repository to your PC
1. Configure your name and email address
1. Make a new plain-text file
<br>What would you want to write to the file?
1. Add the new file
1. Commit the changes to your local repository with an appropriate message
1. Push your changes to your repository

### 01 : Sync with upstream

1. Clone your repository to your PC
1. Add the upstream repository URL as a remote
1. Fetch from the upstream repository
<br>Try `git log --oneline --all --graph`
1. Merge with the upstream branch
<br>How can we use `git merge` command?
<br>Did you see a message of "CONFLICT"?
1. Push your changes to your repository

### 02* : ipynb

* This is an optional assignment
* Please use a separate file : .txt, .ipynb, or .md

1. Propose a possible procedure to version-control .ipynb files
1. Propose a possible procedure to resolve conflict of an .ipynb file